
At GigaOM’s recent Structure 08 event, Meebo co-founder and engineering chief, Sandy Jen, joined a panel to talk about scaling your computing infrastructure for explosive growth. Jen also spoke with Found|READ, this time to offer founders tips on how to overcome what she calls the internal scaling challenge: hiring.
Meebo launched in September 2005, when it unveiled the first Ajax application that allowed users to access several instant messaging clients (AIM, Jabber, Google Talk, etc.) from its home page. Back then, Jen and co-founders Seth Sternberg and Elaine Wherry were bootstrapping, even using personal credit cards to lease the three servers they needed in order to launch. With no money left over for marketing, they went guerrilla.
“Digg had started about six months earlier, so we said, ‘Let’s just Digg ourselves,’” Jen recalled. “We wrote a quick description of Meebo — ‘Web IM: AIM! Yahoo!; No downloads; draggable windows! It’s free!’— and went to bed. The next morning we had 600 Diggs, and our servers were overloaded.”
Three years later, Meebo has raised $37.5 million in venture capital, has all sorts of new products (and servers), gets 30 million unique visitors a month, and faces its toughest scaling challenge yet: “The No. 1 thing we worry about is hiring,” said Jen. To keep up with user demand, Meebo must grow to 50 employees from its current 30 by 2009, a 67 percent increase.
In a fast-growing startup, maintaining your core values is crucial. “But how do you hire and keep your small team culture? It’s really hard,” Jen told us. “In the beginning it’s easy to ask your friends and people you trust for names. But eventually you’ll tap out your networks. Then where do you look for talent?”
In order to uncover new recruits — and not just the very talented people, but the right people — for her company, Jen has developed a few tricks:
1. Go to industry events. You want to hire people who are interested in the same things that you’re interested in. That means reaching out to people who attend the same events that you do. Once you’ve seen the same person at four or five events, make your move.
2. Keep track of smart comments in blogs and forums. Pay attention to the people who are commenting smartly on the stories you’re reading — especially if they’re doing so frequently. This is an indicator of their engagement and passion.
3. Look for people through your extra-curricular activities. You want people interested in your technology, but the right cultural fit means finding people who share your other values, too. A good indicator of shared values is a shared extra-curricular activity. Do you rock climb? Play ultimate Frisbee? (Jen does.) Common fun offers opportunities for bonding, which can be a great way to find new staff.
4. Go outside your geographic circle. There’s a lot of talent in the world. One of the first things Meebo did was commission its graphic design from a guy in Italy, whose work they found on an art web site. They hired him on a trial basis; today he’s Meebo’s Agent Icon.
5. Leverage contract arrangements. As Jen acknowledged, getting H-1B visas is a long process and a pain in the butt. But they’re worth it. If you find someone you want on your team, get them in the door, excited about your company and under contract as soon as possible. Meebo usually has six or seven people working under contract at any time.
6. Commit and be generous. Really talented people rarely advertise themselves, at least not as much as we’d like them to. You must court them. There is a lot of competition, so this could mean being flexible with hours or remote work options. And once you decide to hire someone, you have to welcome them with open arms.
7. Fire fast. When someone isn’t working out, have them leave quickly. In three years, two people have left Meebo — one left in three weeks, the other, in a few months. But a bad fit will contaminate your culture. You can’t afford that.
(Photo credit: Lea Suzuki, San Francisco Chronicle.)
For more on how Jen manages Meebo’s infrastructure, check out her interview with Om, below.



A few minutes after she delivered a speech at our Structure 08 conference in San Francisco, I caught up with Microsoft’s corporate VP of global foundation services, Debra Chrapaty, for a video chat. I think a more appropriate title for her would be Mr. Softie’s Internet Infrastructure Czar. I found her very knowledgeable, engaging and open with her opinions. “We have some new innovations up our sleeve that are going to knock the socks off anything anyone is doing, including our friends down south,” she told me. She didn’t name Google, of course, but we all know who she was talking about.
Her candor was one of the reasons I decided to share the video with you guys. The common theme of the conversation: Microsoft is spending liberally to build out its Internet infrastructure, including upgrading its backbone network and scaling out its data center infrastructure by adding new technologies.
When I asked her exactly how much Microsoft was spending on it, she dodged the question, saying just that it was a big number. This much we do know: Two years ago, the company was spending close to $2 billion on its infrastructure; it has since undertaken the development of six data centers, with parts of two networks already online.
| Adding 10,000 servers a month. |
| New data centers being planned or under construction are the equivalent of more than 15 U.S. football fields of data center space. |
| Plans a cut of 30% to 40% in data-center power costs company-wide over the next two years. |
| Current network backbone runs at about 100 gigabits per second, but Microsoft plans to bump it to 500 gigabits per second soon. I think this could be big for Level 3, a longtime partner of Microsoft. |
| Building out its own CDN (Edge) network: 99 nodes on a 100 gigabit per second backbone. |
| For Microsoft, total data grows ten times every three years. In the near future the data will approach hundreds of petabytes. This includes data from all of its online services. |
| Source: Microsoft, GigaOM |
| When complete, it will consume 48 megawatts of energy. Microsoft can tap up to 72 MW of energy coming from hydro power. Microsoft is paying about 1.8 cents per kilowatt, but that will rise to between 2.6 and 2.9 cents per kilowatt as more capacity goes online. Two data centers in this location. |
| It will be 447,000 square feet on 44 acres. Microsoft is building two data centers here. |
| First Windows Live data center outside the U.S. |
| The first floor of this facility will be made up entirely of containers and will house Microsoft search. |
| Source: Microsoft |
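To put the "ten times every three years" data-growth figure above in yearly terms, here is a minimal Python sketch. It assumes growth compounds at a constant rate, which the source does not state, and the function name `annual_growth_factor` is made up for this illustration.

```python
def annual_growth_factor(total_factor, years):
    """Constant yearly multiplier that compounds to total_factor over `years`.

    Assumes smooth compound growth, which is a simplification.
    """
    return total_factor ** (1.0 / years)


# "Total data grows ten times every three years"
yearly = annual_growth_factor(10, 3)
print(f"Data grows about {yearly:.2f}x per year")        # about 2.15x per year
print(f"That is roughly {(yearly - 1) * 100:.0f}% annual growth")

# The planned backbone upgrade, 100 to 500 gigabits per second
print(f"Backbone upgrade: {500 / 100:.0f}x capacity")
```

Under that assumption, data volume slightly more than doubles each year.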
Watch the video to get the full low-down, but if you’re in a hurry, here are some highlights, including her quotes from our conversation.

Oooh, our first panel of the morning, Working the Clouds: NextGen Infrastructure for New Entrepreneurs. We’ve got a six-person lineup to give us their perspectives, and our own Alistair Croll to throw them questions. The lineup includes:
Here are some notes:
Alistair: When we are moving to clouds, are we selling our souls? Should we be happy with our cloud overlords?
AT&T Joe: I have a prediction that is not surprising. There will be a proprietary stack and there will be an open business model based on the cloud that will leverage standards, commoditization, price-compression, and differential vs dynamic pricing.
GigaSpaces’ Geva: There’s room for both models. People are talking about specialized clouds.
Joyent’s Jason: The question is, is it about selling your soul? You can’t leave. Until Google open sources Big Table . . .
Alistair: Fluffy clouds was coined here at Structure 08 (chuckles).
XCalibre’s Tony: Google’s Big Table is basically a lock in.
Google’s Christophe: I claim that Google is possibly a little bit ahead as far as Big Table. But people can build a better mouse trap and come in and compete. It’s a developer preview, but the theory is that the API is open and not locked down.
Joyent’s Jason: There’s been a lot published on what an open, loving cloud should do. We should give people real assurances that the cloud is a good place to be.
Google’s Christophe: When we publish something on Big Table it is not to say that it’s a lock-in, but it’s our attempt to say that this is something that worked for us.
XCalibre’s Tony: Why not open Big Table completely if you know better than everyone else?
Google’s Christophe: Big Table is a compromise that is scalable. There’s nothing about the API that says you have to do this with Big Table.
XCalibre’s Tony: One of the big problems that companies need to figure out is licensing.
AT&T Joe: The dirty secret of the cloud is that companies need to figure out the licensing models, or it will be forced to fold. The whole idea will drop dead of its own weight.
Alistair: There are two theories: put everything in the cloud and put more out at the edge. What’s the setup for Google?
Google’s Christophe: We have geographically distributed clusters. A lot of services are replicated. We want to make sure that if you trust us with your email, the most current copy might be in a datacenter closer to you.
AT&T Joe: “Architecture 3.0.” We don’t have all the nuts and bolts that will work it out. But an optimal mix of edge and keeping core information in the data center.
Alistair: Can we have our old toys? Can we put all our toys in the cloud environment?
Joyent’s Jason: Absolutely. If we want people to be on the cloud, we have to make sure that this occurs.
GigaSpaces’ Geva: Or make it look like our old toys. Use all of the APIs.
Google’s Christophe: It’s important to find a compromise. When we deal with enterprise we can’t tell them to move into the cloud and do it differently. But five years from now it’s important for people to build apps that can serve millions of users, and then people have to think about building apps differently. We need to provide tools for people to build apps for the cloud.
GigaSpaces’ Geva: Vogels said earlier all you need is a credit card, but that’s not what big corporations want. They want help with services.
Rackspace’s Lew: There’s also a big problem with marketing. Cloud means something new every day.

A few months ago, at the insistence of some of my team members, we switched from Zimbra to Google Apps for our own domain offering. The argument was that Google’s quality of service, thanks to its great infrastructure, would be great and that we could also use Google Docs for easy sharing of information.
And since everyone on the team (barring me) loved the Gmail interface, I acquiesced to their wishes. I shouldn’t have. These all looked like great ideas in theory, but six months later we are frustrated by the schizophrenic nature of the Google Mail system and are experiencing all sorts of problems with it.
Messages are either not getting sent or being lost by the system. Several people have complained about not receiving replies, even after we have sent them back. The mail system hangs up and you keep waiting for something to happen. And don’t get me started on the sub-par IMAP features of the system. How is one supposed to run a business on such an unreliable platform? The integration of Google’s services remains a distant dream, reminding us of the limitation of its competence beyond search and advertising.
And in order to get support we would need to upgrade to a premium version of the service that could set us back by $1,200 a year. I don’t mind paying, but if their recent track record is any indication, I might as well light a cigar with 12 $100 bills. For the longest time I thought we were the only ones with this problem. Apparently not. At least two other friends with startups who are using Google Apps for their domain are finding similar issues.
That is why I feel terrible for those 1.5 million Australian school kids who will now have to use Gmail instead of Exchange/Outlook. In some ways it’s like jumping from fire into the frying pan (being heated in that fire.)
Are you experiencing those issues as well? Go ahead and kvetch in our comments!

If you are a start-up targeting the mobile industry, then you are well aware of the slow-moving ways of incumbents, equipment makers and, of course, handset makers. Those ways seem all the more glacial when you come from the opposite end of the spectrum, Silicon Valley.
Google, the Mountain View, Calif.-based search engine that is making a big mobile push via its Android Mobile Platform, is learning the realities of mobile business the hard way. A report in the WSJ suggests that the company is experiencing delays to its so-called launch, which is now slated for the fourth quarter of 2008. (Somewhere in Cupertino, Calif., Apple’s Steve Jobs is having a good laugh!)
“This is where the pain happens,” Andy Rubin, Google’s director of mobile platforms, told the WSJ. “We are very, very close.” He was talking about adding features and other changes requested by carrier partners. I think this is why Jobs was smart in being tyrannical and ignoring carrier requests when it came to software. Google apparently can’t afford to ignore partner requests.
Here are the relevant and interesting facts from the WSJ article:
Again, as I said earlier - whimsical wishes of carriers, endless customization, software delays and of course, executive reshuffling - these are facts of life for mobile start-ups. Welcome to the club, Google.

The 10 hours of video uploaded every minute to YouTube could be a problem for Google’s infrastructure. Video files are fat and people don’t want to wait long once they press play, which means keeping them requires a trade-off between fast access and cheap storage. A range of companies are trying to address these sorts of storage problems through compression, caching and even Flash memory in the data center.
But since you can’t cache everything, the recent study from TubeMogul, which shows that online videos get the most views in the first three days (with the peak demand occurring on Day Three), can help set caching policies. Dropping a video from the cache after 11 days would mean only half of the video’s viewers would be tormented with a slightly slower load time.
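As a rough illustration of the age-based policy described above, here is a minimal Python sketch. The `AgeBasedCache` class and its method names are hypothetical, and the 11-day default simply mirrors the figure quoted from the study; this is not how YouTube or any particular CDN actually manages its cache.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


class AgeBasedCache:
    """Toy cache that drops a video once it is older than max_age_days.

    Hypothetical illustration only; names and defaults are assumptions.
    """

    def __init__(self, max_age_days=11, clock=time.time):
        self.max_age_seconds = max_age_days * SECONDS_PER_DAY
        self.clock = clock  # injectable so the policy can be exercised in tests
        self._entries = {}  # video_id -> (published_at, data)

    def put(self, video_id, data, published_at=None):
        """Cache a video, recording when it was published (defaults to now)."""
        if published_at is None:
            published_at = self.clock()
        self._entries[video_id] = (published_at, data)

    def get(self, video_id):
        """Return cached data, or None on a miss.

        On a miss the caller would fall back to slower, cheaper storage.
        """
        entry = self._entries.get(video_id)
        if entry is None:
            return None
        published_at, data = entry
        if self.clock() - published_at > self.max_age_seconds:
            del self._entries[video_id]  # past the age cutoff: evict
            return None
        return data
```

A real policy would likely also weigh popularity and file size, but the age cutoff alone captures the trade-off the study points to.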

The Earth2Tech Team grabbed some camera time with Google.org’s director for climate change and energy initiatives, Dan Reicher, at Google’s conference in Washington DC on plug-in vehicles. Reicher told them that Google will make investments in green cars this summer through its plug-in hybrid program, RechargeIT. That’s a big step beyond the four plug-in hybrids that currently constitute Google’s entire RechargeIT program. . . . For the full video check out Earth2Tech.

Powerset, which implements semantic search, recently released a public beta based on the limited data set of Wikipedia. But while there is no question that Powerset has some interesting and valuable semantic search technology — many of their demo queries produce meaningful summary pages and reference pages with information extracted from Wikipedia content — there are other semantic search engines that produce equally meaningful and relevant results.
In this post, we compare Powerset results with those of a demo implementation from one such search engine, Cognition Technologies. And we compare them both with the current gold standard in web search, Google (again, limited to the Wikipedia data set).
Example 1: Powerset
There are some classes of queries in which Powerset shines, such as whenever the query involves extracting concepts or aggregation of data from a given data set.
For example, check out the beautifully presented results for the following queries that extract key information the user is looking for and provide it in summary format:
“military intelligence”
“teams in the NFL”
Example 2: Cognition Technologies
On the other hand, there are other types of queries — especially where hardcore semantic parsing is involved — where the Powerset algorithms get confused, and Cognition gives better results:
“rare wildlife of the Amazon”
“football players who went to jail”
Example 3: Google
There are still queries (especially when semantic parsing is not involved) in which Google results are much better than either Powerset or Cognition:
“helicopter carrier Iwo Jima class”
Here, surprisingly, Google has the best results. Powerset has related results, Cognition gets totally confused, but Google nails it!
Disambiguation
One area where both Powerset and Cognition improve on Google is the disambiguation of query terms. This is always a significant issue for search engines; for example, when a user types in the keyword Java, does she mean the island, the programming language, or the coffee?
Google has recently tried some experiments in this area, but these new search engines go one better.
When Powerset sees an ambiguous topic, it uses tabs to provide both sets of results:
Cognition handles it in a different way, by letting the user select from among different semantic meanings for each term:
User Impact
For most common searches, Google search works just fine. We’ve all gotten used to the ubiquitous “keyword-ese,” currently the universal language of web search. With Google’s unlimited resources, comprehensive index and formidable prowess in finding relevant results using the PageRank algorithm, it’s going to be difficult for any other search engine to match those results. Users may have to work just a little bit harder for unusual queries or specialized searches, but most users will accept that trade-off in return for using their familiar and beloved search engine. Indeed, the word Google has come to represent web search in the same way that the word Xerox had once come to symbolize the process of photocopying.
Future Competition
So what can Powerset (and Cognition) do to gain traction and capture users?
In their recent book, “The Innovator’s Solution,” Clayton Christensen and Michael Raynor discuss how upstart companies challenging market leaders and entrenched incumbents can position new technologies for a reasonable chance of success. One approach that they believe is guaranteed to fail is when these smaller upstarts try to make evolutionary improvements to get and stay ahead of the major players.
Instead, they suggest shaping the new technology into a disruptive innovation, along either of the following two major axes:
1. New-market strategy: Leveraging the innovation to attract users who do not typically participate in using the product or service, and thus growing the market as a whole.
2. Low-end strategy: If there are price-sensitive, over-served users who would be willing to trade some of the advanced functionality in return for a lower price point, then the smaller players have an opportunity to enter the market — that is, if they can figure out a way to make a profit.
In other words, the new players entering the market have to find profitable business opportunities in segments of the market that are not attractive to market leaders.
Using this model, it is apparent that a strategy of challenging Google head-on for control of the mainstream web search market has little hope of success, regardless of the new technologies or search innovations that are applied. Google would have no choice but to fight back with everything it’s got to catch up to or leapfrog this “better search” alternative.
Similarly, since Google search is free for users, there is really no viable low-end strategy, no way to outdo the existing search leader by offering a lower price point.
What about non-participant users? Practically everyone online already uses a web search engine (with Google being the overwhelming favorite). However, Google search follows a specific, consistent set of guidelines: simplicity of UI, speed of response, and relevance based on incoming links. These design parameters take top priority over all other considerations.
By challenging these assumptions, we can discover new use cases in search that are underserved (or not served at all) by Google. Some examples include:
1. UI Simplicity: Google’s minimal UI is trivially simple to use and ideal for a one-size-fits-all model, but it may be less than optimal for complex semantic searches. As Alex Iskold points out in his recent article on the myth and reality of semantic search, a richer user interface would allow power users to express semantically-rich search queries and get back better results. Notably, Powerset and Cognition excel at these types of queries.
2. Speed: For some types of advanced searches, users might be willing to wait, perhaps even as long as a day, in order to get back semantically complex results. Imagine a software agent that acts as a virtual search assistant - once the user specifies a query with multiple levels of complexity and dependency, the agent goes off and returns the next day with a list of possible results/options. Queries that require the coordination of complex tasks fall into this category, such as planning a trip that requires coordinating air travel, hotel and car, and minimizing the cost of the whole trip while taking some additional factors into consideration.
3. Relevance: Although all the mainstream search engines use similar criteria to evaluate relevance (mainly, the evidence of incoming links), other relevance algorithms are certainly feasible and may work better for certain classes of queries. Social relevance is an obvious example; reputable premium content is another.
This post is in no way meant to discredit Powerset — they’re in early beta and are doing a fine job of building semantic search. Instead, the examples above clearly demonstrate that the jury is still out on semantic search; other search engines are also contenders in this space, and the race is far from won.
Nitin Karandikar writes about Web 2.0, Internet search and semantic web on his blog, Software Abstractions

A recent report from ABI Research highlights the rise of mobile Linux, estimating that 23 percent of the world’s smartphones will have a Linux operating system by 2013. It appears that much of that growth will come at the expense of Nokia’s Symbian, and that LiMo and Android will be the main beneficiaries. What the report doesn’t note is that last year ABI predicted that 31 percent of smartphones will have Linux by 2012.
Either there’s something to explain the change in numbers, or we should perhaps take our analyst reports with a grain of salt. However, Linux is undoubtedly moving fast: 15 handsets were launched earlier this year with LiMo, and after several demos and prototypes, anticipation for the Android is running high. But the jury is still out on which framework will win out with carriers and application developers.
LiMo has the backing of NEC, Motorola and Samsung as well as SK Telecom and Verizon. Android, through the Open Handset Alliance, has T-Mobile, NTT DoCoMo, China Telecom, Telefonica, Google and several others. The stated goal behind both efforts is to eliminate some of the costs associated with developing mobile applications for multiple operating systems by using open source. It’s a laudable goal, but the fight between the two for market share demonstrates how hard it will be to lower costs, as developers will still have to build for multiple platforms.
photo courtesy of the LiMo Foundation and NTT DoCoMo

In thinking about the desktop/web hybrid platforms that have launched or are about to be launched, I’ve decided that even if last year they were overhyped, this year we’re going to see real adoption and applications. But that presents an interesting problem for developers and eventually, for users. The vast array of options and functionalities not only makes the web experience different for different users, but it makes developing sites more complicated, much like the rise of different browsers and the proliferation of Flash has in the past.
I’ve written about MySpace using Google Gears for email, but apparently WordPress is going to take advantage of Gears in its next version, too. Twhirl uses Adobe Air to bring Twitter to the desktop and a fun program called Snackr pulls random bits from your RSS feeds to stream across the desktop. We’re still waiting for Prism from Mozilla, and yesterday Yahoo launched BrowserPlus. Again, the sheer number of these presents its own set of problems.
I have copies of Air, Gears and BrowserPlus on my machine, and each has its pros and cons. Air essentially brings the browser offline, while BrowserPlus runs outside of the browser to make your desktop an extension of the web. Gears runs inside the browser, making Firefox even more unstable, but does make my web browsing faster. (Getting it to work with Gmail is my top request, mind you.)
It’s my job to play around with these sites, but I can’t imagine the average user wanting to download three or four different programs in order to optimize their browsing experience. I still get irritated about upgrading Flash. As for developers wanting to take advantage of extending web functionality, deciding which platform to use will be an exercise in decision-making. Do they go with a platform that has more downloads, or better features? Do they integrate with several platforms if the feature sets are similar, or hope that users download multiple programs? These are similar to the questions they had to ask when designing for Explorer, Firefox or Netscape.
Skylar Woodward, a software engineer at Yahoo who helped build the BrowserPlus program, thinks eventually some of the code behind these efforts will be opened up to the community, making it easier for developers to implement multiple platforms on their sites. In the meantime, he champions the idea of “graceful degradation.” In that scenario, a user can see the site without downloading a platform; the user just might miss out on a few nifty features in the process.
So for those of you too lazy to click through on those installs, welcome to the gracefully degraded Internet.
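For developers wondering what "graceful degradation" might look like in practice, here is a minimal Python sketch of the idea: start from a baseline that always works and layer on extras only for platforms the user already has. The platform keys and feature names are hypothetical placeholders, not the actual capabilities or APIs of Gears, BrowserPlus or Air.

```python
def available_features(installed_platforms):
    """Return the features a site can offer given what the user has installed.

    The baseline page needs no download; each optional platform only adds
    to it. All names below are made-up placeholders for illustration.
    """
    features = ["basic_page"]  # always available, nothing to install
    optional = {
        "gears": "offline_storage",
        "browserplus": "desktop_drag_and_drop",
        "air": "desktop_notifications",
    }
    for platform, feature in optional.items():
        if platform in installed_platforms:
            features.append(feature)
    return features


print(available_features(set()))        # the gracefully degraded site
print(available_features({"gears"}))    # baseline plus one extra
```

The point of the design is that a missing platform never breaks the site; it only shortens the feature list.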

The first day of Google I/O seemed like a coming out party for Google App Engine, the company’s competitive threat to Amazon AWS. For one, the registrations were thrown open to everyone, and for another, two new APIs were released: the image manipulation API, and (more interesting to web app hosting in general), the memcache API. Now the memcache API was something I expected from Amazon a long time ago, but perhaps they don’t use it themselves as much so it’s not in AWS.
With Yahoo in limbo and Microsoft missing in action on the Internet, Google is making a huge play for developer mindshare. As Microsoft and Sun both demonstrated very effectively, focusing on getting developers excited and making them happy is the key to the success of a platform. Google I/O appears to be Google’s big play for developers. And so far it seems to be working.
Google App Engine (GAE) comes with a webapp framework that’s derived from Django, but you can host your own, including CherryPy, Pylons and web.py, all of which are Python-based. No other language is planned at this time. C++ had AT&T, Java has Sun, and Python now appears to have Google behind it, so expect a lot more Python development activity in the global coding trenches. And a lot more Python books being sold.
The big difference between GAE and Amazon AWS seems to be that GAE commoditizes the application hosting layer while AWS commoditizes the hardware and network hosting layer. With GAE you don’t get to choose the web application stack. You provide the UI and the logic; Google provides the scalable datastore and the application hosting and analytics.
There are a number of characteristics about GAE that serve to give me flashbacks. First is the fact that all apps are hosted as CGI apps. I’m sure Google has a reason for this, but it seems so “early Internet.” Then there’s the fact that Google has created its own query language, GQL (pronounced JeeQuel). Facebook has FQL (how is that pronounced, anyway?), so what’s next, YQL? It’s like we’ve reverted to the late 80s, when all the database companies mangled the SQL standard just enough so that data was bound to their databases in strategic lock-in. This story doesn’t end well for users.
Finally, the GAE Datastore appears to have a native hierarchical structure with parent-child relationships between entities exposed to the programmer in GQL. This harks back to the hierarchical databases that preceded SQL and relational databases. Hierarchical database application code was viewed as impossible to maintain because your data model leaked into your application. The power of SQL was supposed to be that it was declarative, that you didn’t have to know how the data structures were implemented. The current iteration of the GAE Datastore seems to require a lot of premeditated syntax design on the part of the developer. It reminds me of how queries performed differently in Oracle depending on the order in which columns appeared in the query. I hope, however, that this is a passing phase and that we soon see a better abstraction in GAE.
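To make the "data model leaks into your application" complaint concrete, here is a toy sketch in Python. It is not the Datastore API. The `Entity` class, the `descendants` helper and the sample data are all assumptions for illustration. The point is that the calling code has to know where in the tree an entity lives before it can ask for it.

```python
class Entity:
    """Toy entity with an optional parent (hypothetical, not the GAE Datastore API)."""

    def __init__(self, kind, name, parent=None):
        self.kind = kind
        self.name = name
        self.parent = parent

    def path(self):
        """Return the ancestor path from the root down to this entity."""
        prefix = self.parent.path() if self.parent else ()
        return prefix + ((self.kind, self.name),)


def descendants(entities, ancestor):
    """Return the entities that sit strictly below `ancestor` in the hierarchy."""
    root = ancestor.path()
    return [
        e for e in entities
        if len(e.path()) > len(root) and e.path()[:len(root)] == root
    ]


blog = Entity("Blog", "gigaom")
post = Entity("Post", "google-io", parent=blog)
comment = Entity("Comment", "c1", parent=post)
other = Entity("Blog", "other")

# The caller must already know that comments hang off posts, which hang off blogs.
for entity in descendants([blog, post, comment, other], blog):
    print(entity.path())
```

If the hierarchy changes later, say comments move directly under the blog, every query written against the old path has to change with it. That is the maintenance problem the relational model was meant to avoid.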
In contrast, SimpleDB and CouchDB focus on tuples and dispense with the SQL baggage; they’re surprisingly forward-looking compared to the data models of Facebook and Google. While I’m not questioning whether or not the Google Datastore will scale as promised, some of the choices in how these facilities have been exposed to programmers are curious and have rough edges.
A comment about Ruby: While Steve Yegge’s recent article seemed to suggest that it was hard to promote new languages within Google, I spotted some signs at I/O that Ruby might not be shut out of the picture. OK, just two signs. For one, there is a talk scheduled that mentions Ruby in the title. For another, at one of the talks the speaker mentioned a device called “Radish,” described as a 20-percent time project yielding a device that operates on indoor solar energy and is used to update and monitor the meeting room schedules wirelessly. Apparently the data pushed through this device is managed by a Ruby app. Yes, there’s a Ruby app running inside Google. He didn’t say “Rails,” just “Ruby,” so please don’t scream all at once, OK?
Other than that, the conference was Google Gears, HTML5, lots of JavaScript/AJAX and, of course, Android, Android, Android. There’s even a company selling a 12-hour crash course in Android to prepare developers for the October release. Today, who knows? Maybe an Android Google Gears app that downloads the Internet and beams it to your desktop via Wi-Fu tubes?

Yesterday, I read a post on Google’s blog about their focus on improving search quality. Today, I read a press release from Microsoft in which it said its Live Search product will be used to give “cash back” to those who use it to find and buy things. Innovation vs. buying your way into the market…in my book, that kinda speaks for itself.
Microsoft’s “Live Search cashback” site…promises to pay back a portion of the purchase price — ranging from about 2 percent to more than 30 percent — to people who use it to find designated products and buy them online from participating retailers…including the online sites of large retailers such as Barnes & Noble, Sears, Home Depot, J&R Electronics, Office Depot and others. [via]
Instead of jumping to conclusions, I decided to make a list of my thoughts on this, many of which the folks at Microsoft are not going to like.
Final thought: Microsoft’s traditional strategy of “We will charge less and crush the competition” really doesn’t cut it anymore. How long do you think merchant partners are going to stick around and waste their resources if they can’t make money? This is not some PC-maker-schmuck they have in a headlock. Take a look at all the other new technologies where Microsoft hasn’t been able to dominate — this is a sad reflection on that trend.

How to visualize the colossal amount of data surrounding climate change? Al Gore squeezed a lot of info into 100 minutes and a PowerPoint presentation, but the next step needs to be dynamic, interactive and malleable. With that in mind, two government research groups out of the UK have released climate change-related data using Google Earth Outreach. Earth2Tech has the full story on their efforts, as well as a how-to for viewing the data.

Mobile browsing has clearly moved beyond 9-to-5 users and made inroads among the happy hour set. A recent survey by Opera showed about 40 percent (and about 60 percent in the United States, South Africa and Indonesia) of Opera Mini users visit social networking sites when surfing on a mobile. For those unfamiliar with the Opera Mini browser, it allows a user to see an entire web page and zoom in on desired content as long as they have Java on the phone.
The survey also shows the top 10 sites surfers visited in each country. The U.S. list begins with MySpace and ends with eBay. In between socializing and shopping is more socializing through Hi5 and Facebook, as well as search via Google, Microsoft Live and Yahoo. It looks like even if we aren’t using our phones for talking, we’re still using them to connect, and to settle bar bets: Wikipedia holds the No. 8 slot in the United States. As the chart below shows, if users have an easy way to access the web on their mobiles, they will. Carriers and device makers, take note!

Technology buzzwords come and go…virtualization, green, SaaS…and after sitting through the Google Friend Connect announcement, reading about Facebook’s Connect service and writing about last week’s MySpace Data Availability launch, “open” appears to be just the latest. But open is one of those words whose definition can be spun into a variety of meanings.
While Facebook isn’t yet releasing much detail on its efforts and may completely surprise me, Google’s Friend Connect program today highlights how open standards such as OpenID, OAuth and OpenSocial can be used to create a platform that’s pretty closed. The service, which will launch tonight and only expects to have between 12 and 24 sites participating while it’s in preview mode over the next few months, will allow site publishers to put some code on their sites. If a user visits a site with the appropriate code, she can get access, via an IFrame, to applications built in OpenSocial. A user can also share her activities on a participating site with her contacts, as well as through her news feeds on participating social networking sites.
Last week, I pointed out that MySpace’s Data Availability efforts were welcome in that they expand the number of sites on which a user can use her MySpace data, but that MySpace still had a lock on the user data since it hosted and determined who could display that data by approving site partners. If MySpace’s efforts were three steps forward in opening up user profiles, then Google’s Friend Connect represents two steps back.
The use of the IFrame means that site owners have no way to change or work with user data; they can only display it. MySpace doesn’t allow sites to store user data on anyone’s servers other than its own, but it does allow that data to be used directly in the outside site. For more differences among the three services, please check out the chart below.
While none of these services are entirely open yet (and may never be, given security and data abuse problems), the trend toward a more social web is clear. With broadband more prevalent than ever and voice fading as the primary means of communicating with people who aren’t in the room, enabling a truly open social web is the next big step in communication. But in order for that to happen, the user needs to be able to reach across walled gardens and gain granular control over what he or she shares and with whom.
There’s open source (really open in that anyone with knowledge can participate in how the code evolves), open standards (open only in that anyone can participate using a pre-defined version of the standard), and open APIs (open in that anyone can take the pre-defined standard and build something for a closed platform such as Facebook). Knowing this, the efforts by these three companies to open up a user’s data on a social network (their social graph, if you will) fall somewhere between an open platform and an open standard.
| Facebook Connect | Google Friend Connect | MySpace Data Availability |
| --- | --- | --- |
| unknown, but Facebook API is likely | OAuth, OpenID, OpenSocial | OAuth |
| basic profile information, profile picture, friends, photos, events, groups | Applications built with OpenSocial, contacts, activities on participating sites published back to a news feed | Profiles, friends, photos and videos |
| unknown | Web site owners must apply to Google and be accepted | Web site owners must agree to MySpace terms and conditions, but MySpace will allow anyone who doesn’t abuse the user data to participate |
| will launch within a few weeks | First 12-24 sites will go live in the next few days and the rest of the web will take a few more months | Launched on May 8 and adding more partners within the next few weeks |
| unannounced | Plaxo, Orkut, Hi5 and Facebook | Yahoo!, Twitter, eBay and Photobucket |
| unannounced | On Google servers and displayed only via an iFrame | On MySpace Servers, but can be displayed however the participating site wishes |
| A user’s privacy settings will follow him around the web | Users opt in to Friend Connect and can limit their profile sharing to existing contacts only; a user can elect on which sites he wants to share his activities, can also instantly change privacy settings across all participating sites | Users can control their privacy settings (right now, only which sites get access to their data) on a central page. Partner sites must accept changes in real time and sharing profile data is an opt-in service |

It has been a long time coming, but Powerset, a San Francisco-based contextual-semantic search engine, has finally launched. I urge you to try it out, for this is quite an impressive search effort, despite the fact that it is currently limited to searching Wikipedia along with some supplementary results from Metaweb’s Freebase. I think it has made Wikipedia much easier to use. I like how you can do more topic-based searches and get a holistic view of the information you’re looking for. Danny Sullivan over on Search Engine Land has an elaborate and fantastic in-depth review of Powerset, and that frankly obviates the need for any other review.
That said, Powerset faces an uphill climb, especially when it comes to consumer mindshare. I think Google has become so synonymous with search that it is virtually impossible for a newcomer to establish a toehold. Powerset’s approach is different, and its tactic of applying its technology to specific content repositories such as Wikipedia is smart. But will they (web searchers) come and use Powerset?
At our recent GigaOM PM event, Chad Walters, director of engineering, search and platform at Powerset, gave a talk about how his company was using Hadoop and other clever technologies to meet its immense infrastructure needs. Here are some bits from OStatic’s live blog coverage of the event:
Powerset applies deep natural language processing (based on technology licensed from Xerox PARC), which means the company needs 100 times more processing horsepower than simple keyword searching and indexing. Powerset uses a distributed database system called HBase in tandem with Coral, its Document Processing System. Coral uses Hadoop as its job control machine. Powerset uses 92 eight-core machines to do processing.

What used to be the purview of corporate and business development departments is now being taken over by venture capital. Consider the fund to foster Facebook apps, the iFund to jump-start the iPhone app revolution, or the rumored $150 million fund to give BlackBerry apps a boost. The increasing number of platform funds doesn’t ensure success. Remember the Java Fund, or the RSS Fund?
The news of the BlackBerry Fund was first reported by VentureBeat, but that post has been taken down, so I am not sure whether this is even happening. If it is indeed true, then it is clear that the iPhone has delivered a swift kick in the pants to the Canadian company and is getting it to innovate faster. Still, I don’t think an investment vehicle is the answer. Many developers I have talked to complain about the challenges of working with Research In Motion (RIM).
If Team BlackBerry is looking to encourage development for its platform, then it should make it easier for folks to develop for that platform. One hairball that comes with this so-called BlackBerry Fund: can a company that takes an investment from Research In Motion develop apps for the iPhone or Google’s Android?
Simon Brocklehurst does a great job of deconstructing the BlackBerry and iFunds, and I encourage you to read his analysis. “All the opportunities, though, probably need Apple and RIM to deliver significant growth in device sales, from where they are now,” he writes, in what is clearly an understatement. Brocklehurst points out that there are a whole lot of other platforms, and that developers are going to gravitate toward the largest market opportunities.
In comparison to the BlackBerry Fund and the iFund, I like the approach taken by Google to foster an apps ecosystem for its Android platform. Instead of taking equity in exchange for funding, Google is basically giving prize money to the winners of a developer contest. Fifty round-one winners get $25,000 and go on to the next level. According to a Google Android blog post, the names of the winners are going to be announced shortly. Of course, I have been talking to other Android developers and will write about them some time soon.

The New York Times reports that Microsoft is going to raise its bid for Yahoo by a few dollars. They quote an unnamed source, but seem to be pretty confident that Microsoft is upping the ante a bit.
Microsoft, which had threatened to abandon its bid, has increased its offer “by several dollars,” this person said. The merger talks represent an enormous breakthrough following weeks of behind-the-scenes discussions without any progress. Exact terms being discussed could not be learned.
So much for all the posturing from them this week, and the petulance displayed by Yahoo. As they say, if the price is right…
Kara Swisher says that Yahoo is looking for a cushion because it thinks the email monopoly that would follow a Microhoo deal is going to cause problems.
That’s because Microsoft and Yahoo completely dominate all mail on the Internet. According to the most recent ComScore figures, for example, Yahoo has 256.2 million users, while Microsoft has 254.6 million. Google is a distant third with 91.6 million users and AOL has about half that at 48.9 million.
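To put the quoted ComScore figures in perspective, here is a quick back-of-the-envelope calculation in Python. It only counts the four providers named in the quote and ignores people who have accounts at more than one of them, so treat the share as a rough indication, not a market-share figure.

```python
# User counts in millions, as quoted from ComScore above.
users_millions = {"Yahoo": 256.2, "Microsoft": 254.6, "Google": 91.6, "AOL": 48.9}

# Combined Yahoo and Microsoft user base if the two were merged.
combined = users_millions["Yahoo"] + users_millions["Microsoft"]

# Total across only the four providers listed (an assumption: others are ignored).
total_listed = sum(users_millions.values())

share = combined / total_listed
print("Combined Yahoo + Microsoft: %.1f million" % combined)
print("Share of the four listed providers: %.1f%%" % (share * 100))
```

That works out to roughly 510.8 million users, or about 78 percent of the four-provider total, more than five times Google's count.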

Updated: We all know there’s no love lost between Hulu, the Hollywood-backed online video service, and Google-owned YouTube. The two companies have sniped at each other. For instance, at the NAB trade show, Hulu was trash-talking YouTube. Jason Kilar, the CEO of Hulu, said that you can’t make money by posting unauthorized and copyrighted videos, with a YouTube page behind him.
Hating your rival is part of the game, which is why it’s hard to ignore the irony of a Hulu Channel on YouTube. YES! What you just read is right. The