The battle over Yahoo’s search business as witnessed over the last few days seems both ridiculous and petty. And it takes the attention away from what is Yahoo’s true value: a media aggregation platform. Yahoo is the place a lot of people — some 400 million — visit to get their news, sports scores and email. I have always liked that business, and yesterday I experienced, first-hand, the enormous strength of Yahoo.
A story by Judi Sohn, who edits WebWorkerDaily, one of our growing portfolio of blogs, was featured on the home page of Yahoo last night. The story got voted up via Yahoo’s Buzz, a service akin to Digg, except much more powerful.
In a few hours, the story about what to expect when switching from a BlackBerry to an iPhone was viewed over 200,000 times and attracted over 350 comments. Now that’s a lot of traffic — but more importantly, a gigantic amount of engagement displayed by Yahoo visitors. The traffic sent our way by Yahoo was many times the traffic we get from, say, Digg or StumbleUpon.
At the risk of repeating myself, Yahoo’s core business now is “audience.” The company, instead of trying to out-Google Google, needs to beat itself by figuring out new ways to keep the audience growing. The first step is, of course, acknowledging that it is a content company. The next one: figuring out new engagement and audience-grabbing ways.

The latest skirmish in the ongoing Viacom v. YouTube billion-dollar lawsuit battle is over how YouTube employees used their own site. It’s been a nutty couple of weeks for the high-profile case. First a federal judge ordered YouTube to hand over its user data to Viacom. Then Google asked to have user identifying information stripped out. Viacom denied it ever asked for that data (it did) and then said it didn’t want user information after all. Still with us? Now the news is that Viacom wants YouTube employee user information. If the media conglomerate can show that employees were aware of or uploaded copyrighted material to the site, YouTube could lose its protection under the DMCA. [Full Story on NewTeeVee]

Today saw the first of at least two Congressional hearings concerning managing privacy on the web in relation to online advertising. The hearing today involved executives from Google, Facebook, Microsoft and startup NebuAd as well as the Federal Trade Commission and two public policy groups. For a complete listing, check out the hearing, although it clocks in at about two hours and is very, very repetitive.
Everyone present agreed that advertising on the web is not bad because it allows for all this wonderful free content; they similarly concurred that consumers are both uncomfortable with some of the data collection that occurs online and want information on how they can control that information. After that, though, there was little common ground to be found over issues including self-regulation, the way NebuAd tracks Internet usage for advertising, and how long personally identifiable data is stored on Microsoft’s and Google’s servers.
I was saddened by the FTC’s unwillingness to put forth any meaningful regulations or guidelines related to behavioral advertising, or to really even get back into the conversation. I was also (and here’s where the hate mail will start) impressed with Bob Dykes of NebuAd and his defense of that firm’s privacy technology (yes, I know NebuAd said just yesterday that it would adjust the technology to make it more palatable).
While much of what the firm says must be taken at its word (at least until the audit Dykes promised back in May is completed), surfing habits are harder to pinpoint to an identifiable consumer using NebuAd’s technology than search engine data. That isn’t a ringing endorsement, but it’s something. A larger fear about NebuAd’s technology that wasn’t addressed in the hearing is how the startup secures the data from its ISP partners. I trust my ISP very little, and now even less since they’re close to being granted immunity from legal protests related to them sharing phone calls with the government.
And the threat of government prying was by far the most interesting aspect of the entire hearing. In an age of government surveillance, the personal data such as that collected by Google, NebuAd or even my ISP is frightening. If the government chooses, it can find my web surfing information — perhaps stripped of context, but not of my identity. In the worst-case scenario, my searches could end up as evidence against me before a dozen of my peers in a municipal or federal court.
Think I’m crazy, or maybe have something to hide? I would point you to the brush with the government writer Lawrence Wright had researching an article on the Middle East (bottom of page 11), or the fate of those caught in RIAA’s nets. For those less concerned with government interference and more focused on protecting their online privacy, the next Commerce Committee hearing on this topic will focus on ISPs. I can’t wait.

The other virtual shoe finally dropped today: after a year and a half of rumors, Google (GOOG) now brings us Lively, a web-driven mini-virtual world. As it turns out, it is not a contiguous, immersive, fully user-created metaverse like Second Life (so it’s not really a direct competitor) but a series of virtual world chatrooms more akin to IMVU. (However, IMVU has a virtual economy of user-created content, while Lively does not, at least not yet.)
On first glance, Lively seems too similar to several existing (and very large) MMOs, making it an also-ran without a key market distinguisher to be truly compelling (besides being from Google). You can stream YouTube videos in these rooms and embed rooms on websites, and it’s got appealing cartoon visuals and a fairly intuitive interface, but that’s true of numerous online worlds already out there.
Of course Google is the Net’s dominant force, but then, that probably won’t matter to the tens of millions already happy in existing virtual worlds. Without some special magic that I’m not seeing as yet, it could easily wind up being a virtual world version of Google Video, easily eclipsed by the YouTube-level dominance of Habbo Hotel/Club Penguin/Gaia Online/etc.
Of course, all this doesn’t answer the most salient question: why would a search engine company create a virtual world in the first place? Does it even fit into their larger plans? As Mel Guymon, Google’s Head of 3D Operations, suggests to Virtual World News, the real takeaway is to validate a growing market for this space. “We’re basically saying this is a real space and everyone is doing this.” Sounds like the 800 lbs. gorilla is just saying, “Me too.”
Lively image credit: Metaverse analyst Dusan Writer, who has some interesting thoughts.

The New York Times today finally got around to noticing that when web sites go down, people are increasingly likely to get mad and generally react the way I might if I drove to my favorite bar and found it closed for a private party. I might be miffed and share a few choice words with members of my party before deciding on a new locale. However, when we write blogs or tweets (if Twitter is up), the inconvenience and our subsequent vitriol are archived forever and transmitted around the world rather than just to our friends. And because millions of other people want to go to that same bar, the chorus of curses grows quickly.
We’ve written about how hard it is to achieve the 99.999 percent uptime championed by the telecommunications industry, but suffice it to say there are a ton of moving parts involved in keeping a site visible to the end users; the list begins with the network architecture and ends with the internet connection of a consumer in Austin. Along the way there are software upgrades, server shortages, DNS issues, cut cables, corporate firewalls, carriers throttling traffic and infected machines.
The Times notes that downtime is more than just inconvenient: As more data is stored online and cloud computing becomes more prevalent for businesses, it’s less like a bar closing for a night than a bank closing for a day. But it will never be possible to keep all sites across the entire web up 99.999 percent of the time. Knowing that, architecting for failure, and more services such as downforeveryoneorjustme.com (I would really love a more memorable name for this site) and helpful 404 pages would be appreciated.
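To put the “five nines” figure in perspective, here is a rough sketch of the arithmetic in Python. It shows how little downtime 99.999 percent allows per year, and how quickly availability erodes when a request has to pass through several moving parts in a row. The function names and the per-part availability figures are made up for illustration, and the chain calculation assumes the parts fail independently, which real systems only approximate.

```python
from typing import Iterable

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability: float) -> float:
    """Minutes per year a site can be down and still meet the availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def chain_availability(parts: Iterable[float]) -> float:
    """Availability of a chain in which every part must be up at the same time.

    Assumes the parts fail independently (a simplifying assumption).
    """
    combined = 1.0
    for availability in parts:
        combined *= availability
    return combined


if __name__ == "__main__":
    # Five nines leaves a little over five minutes of downtime per year.
    print(f"five nines: {allowed_downtime_minutes(0.99999):.2f} minutes per year")

    # Hypothetical chain: five parts (network, DNS, servers, carrier, last mile),
    # each one up 99.9 percent of the time.
    chain = [0.999] * 5
    end_to_end = chain_availability(chain)
    print(f"end-to-end availability: {end_to_end:.5f}")
    print(f"end-to-end downtime: {allowed_downtime_minutes(end_to_end) / 60:.1f} hours per year")
```

The point of the second calculation is that the end-to-end figure is always worse than the weakest single part, which is one way to see why architecting for failure matters more than chasing a perfect number.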

Speculation that Google is working with French ISPs to build out a Gallic WiMax network has folks at Fierce Wireless wondering if Google may push open broadband overseas by investing in WiMax deployments. They point to a report in a French paper that says Google has teamed up with Iliad-owned ISP Free (Om loves these guys) and Bollore Group’s Bollore Telecom, which own licenses to the WiMax spectrum in France. Google has already invested in the nationwide U.S. WiMax effort Clearwire, has plenty of interest in making sure there’s open broadband in important markets, and has money to spend. Allons-y!

A few minutes after she delivered a speech at our Structure 08 conference in San Francisco, I caught up with Microsoft’s corporate VP of global foundation services, Debra Chrapaty, for a video chat. I think a more appropriate title for her would be Mr. Softie’s Internet Infrastructure Czar. I found her very knowledgeable, engaging and open with her opinions. “We have some new innovations up our sleeve that are going to knock the socks off anything anyone is doing, including our friends down south,” she told me. She didn’t name Google, of course, but we all know who she was talking about.
Her candor was one of the reasons I decided to share the video with you guys. The common theme of the conversation: Microsoft is spending liberally to build out its Internet infrastructure, including upgrading its backbone network and scaling out its data center infrastructure by adding new technologies.
When I asked her exactly how much Microsoft was spending on it, she dodged the question, saying just that it was a big number. This much we do know: Two years ago, the company was spending close to $2 billion on its infrastructure; it has since undertaken the development of six data centers, with parts of two networks already online.
Watch the video to get the full low-down, but if you’re in a hurry, here are some highlights, including her quotes from our conversation.
| Adding 10,000 servers a month. |
| New data centers being planned or under construction are the equivalent of more than 15 U.S. football fields of data center space. |
| Plans to cut data center power costs by 30% to 40% company-wide over the next two years. |
| Current network backbone runs at about 100 gigabits per second, but Microsoft plans to bump it to 500 gigabits per second soon. I think this could be big for Level 3, a longtime partner of Microsoft. |
| Building out its own CDN (Edge) network: 99 nodes on a 100-gigabit-per-second backbone. |
| For Microsoft, total data grows ten times every three years and will soon approach hundreds of petabytes. This includes data from all of its online services. |
| Source: Microsoft, GigaOM |
| When complete, it will consume 48 megawatts of energy. Microsoft can tap up to 72 MW of energy coming from hydro power. Microsoft is paying about 1.8 cents per kilowatt, but that will rise to between 2.6 and 2.9 cents per kilowatt as more capacity goes online. There are two data centers in this location. |
| It will be 447,000 square feet on 44 acres. Microsoft is building two data centers here. |
| The first Windows Live data center outside the U.S. |
| The first floor of this facility will be made up entirely of containers and will house Microsoft search. |
| Source: Microsoft |
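One of the figures quoted in this post, total data growing tenfold every three years, is easier to grasp as a yearly rate. The sketch below works that out, assuming growth compounds at a constant rate (which the source does not state). The function names are mine, and the starting size used in the example is a placeholder, not a Microsoft number.

```python
import math


def annual_growth_factor(total_factor: float, years: float) -> float:
    """Constant yearly multiplier that compounds to total_factor over the given years."""
    return total_factor ** (1.0 / years)


def doubling_time_years(total_factor: float, years: float) -> float:
    """Years needed for the data to double at that same constant rate."""
    return years * math.log(2.0) / math.log(total_factor)


def projected_size(start: float, total_factor: float, years: float, elapsed: float) -> float:
    """Size after `elapsed` years, starting from `start` (any unit)."""
    return start * total_factor ** (elapsed / years)


if __name__ == "__main__":
    factor = annual_growth_factor(10.0, 3.0)
    months = doubling_time_years(10.0, 3.0) * 12
    print(f"yearly multiplier: {factor:.2f}x")
    print(f"doubling time: {months:.1f} months")

    # Placeholder starting point of 10 petabytes, purely for illustration.
    for year in range(4):
        print(year, f"{projected_size(10.0, 10.0, 3.0, year):.1f} PB")
```

Tenfold in three years works out to a bit more than doubling every year, with a doubling time of just under 11 months.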

The 10 hours of video uploaded every minute to YouTube could be a problem for Google’s infrastructure. Video files are fat and people don’t want to wait long once they press play, which means storing them requires a trade-off between fast access and cheap storage. A range of companies are trying to address these sorts of storage problems through compression, caching and even flash memory in the data center.
But since you can’t cache everything, the recent study from TubeMogul, which shows that online videos get the most views in the first three days (with the peak demand occurring on Day Three), can help set caching policies. Dropping a video from the cache after 11 days would mean only half of the video’s viewers would be tormented with a slightly slower load time.
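As a rough sketch of how view-age data could feed a caching policy, the Python below takes a video’s daily view counts and finds the shortest time to keep it in the fast cache so that no more than a chosen share of views arrive after eviction. The daily counts are invented for illustration (they are not TubeMogul’s data), and the function names are hypothetical.

```python
from typing import Sequence


def miss_fraction(daily_views: Sequence[int], ttl_days: int) -> float:
    """Share of all views that arrive after the video has left the cache.

    daily_views[0] is day one; the video stays cached for the first ttl_days days.
    """
    total = sum(daily_views)
    if total == 0:
        return 0.0
    return sum(daily_views[ttl_days:]) / total


def shortest_ttl(daily_views: Sequence[int], max_miss: float) -> int:
    """Smallest number of cached days that keeps the miss share at or below max_miss."""
    for ttl in range(len(daily_views) + 1):
        if miss_fraction(daily_views, ttl) <= max_miss:
            return ttl
    return len(daily_views)


if __name__ == "__main__":
    # Made-up view curve that peaks on day three and then tails off.
    views = [100, 200, 300, 150, 100, 50, 50, 30, 10, 10]
    for target in (0.5, 0.12):
        days = shortest_ttl(views, target)
        print(f"keep {days} days to stay under a {target:.0%} miss share")
```

A real policy would also weigh file size and storage cost, but the same cumulative-views calculation is the starting point.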

Update: It isn’t quite Black Thursday, but it is still a day Yahooligans are not going to forget for a while. First Yahoo announced that its deal with Microsoft is off. Microsoft responded with a note of disappointment. Yahoo’s stock tanked, down $2.63 a share, or 10 percent, for the day. The continuous slide in Yahoo stock, now some $10 a share less than what Microsoft offered, assures that Yahoo has lost almost all of its friends on Wall Street. The greed gremlins like Carl Icahn are only going to increase their attacks on the beleaguered Internet company, and would like to have Jerry Yang’s head.
And if that was not enough, the exodus of executives — Usama Fayyad, Yahoo’s chief data officer; Jeff Weiner, executive VP (network division); and Jeremy Zawodny — continues, indicating that in this battle to save itself, the commander-in-chief lost the support of his key lieutenants. (Apparently, a hiring freeze has gone into effect as well.)
Yahoo also announced that it is going to use Google for search and contextual advertising.
The deal is expected to add $800 million in revenue and between $250 million and $450 million in operating cash flow in the first year. As a comparison, Google signed a deal with MySpace for $900 million in 2007 that ends in 2010.
I think this is yet another critical blunder by a company that lost its way three years ago when then-CEO Terry Semel lost interest in the company, putting it on a path of mediocrity. Of course, as one of my gurus once said, in hindsight, everyone is an idiot (or a genius).
And while that might assuage the short-term concerns Wall Streeters have, the company is shooting itself in the face with this deal. It’s almost like knowing your spouse is going to divorce you while you’re standing in the aisle, waiting for the priest. This is akin to Chrysler going to Toyota with its hat in its hand, asking Toyota to sell it engines for its cars. I bring this up mostly because on their blog, Google writes:
Toyota sells its hybrid technology to General Motors, even though they are the number one and number two car manufacturers globally. Canon provides laser printer engines for HP, despite also competing in the broader laser printer market.
Did Google doyens check on GM’s performance lately? Or their hybrid sales record? Or, for that matter, Canon’s printer market share? Oy vey! Where is Business 2.0’s “101 Dumbest Moments in Business” list when you need it? In my opinion, with this deal, Yahoo has publicly acknowledged that Google is superior to them when it comes to search and contextual advertising.
More importantly, the Google-Yahoo agreement is most definitely going to be investigated by the Department of Justice. (Senator Kohl, chairman of the Senate Antitrust Subcommittee, issued a statement saying that they are going to be looking at this deal very, very carefully.) One attorney very familiar with antitrust law pointed out that there is a reason Google and Yahoo announced the deal for the U.S. and Canada: such a deal will almost never pass muster in Europe. On this side of the Atlantic, he pointed out, the language of the agreement is designed to feign innocence.
Yahoo! will be able to complement its own advertising program with Google’s advertising technology. Yahoo can use Google’s advertising technology on as many or as few of its search results and content pages as it chooses. This non-exclusive agreement allows Yahoo! to enter into similar agreements with other advertising providers. [Google Press Release]
How stupid do they think the government investigators are to fall for this drivel? Even if Yahoo enters into agreements with others, Google is going to win. I mean, if Google’s past performance is any indicator, then as a company they enjoy superior technology and offer better inventory for online advertisers. That is precisely the reason why they are a leader, and that is why Yahoo cut a deal with them in the first place.
Anyway, from the looks of it, the U.S. government investigation is going to entangle Yahoo in underwater weeds of uncertainty. Google, on the other hand, will be victorious even in defeat: it will have frozen Yahoo into inaction for a while. Upon thinking about this further, I realize that it also buys Yahoo some time: It throws Microsoft, Icahn et al. off its trail for another three months while the government investigates.
Yahoo’s best hope now is that someone (News Corp., AT&T, eBay, Microsoft, or even AOL) wants to buy it, maybe at a valuation that is much lower than what Microsoft was ready to pay the first time. The sad part of this whole thing is that Yahoo was once a great company that had great products, and that made news by launching great products. Jerry Yang was once one of Silicon Valley’s wonderboys, and now he is helping his ship run aground.

Yahoo is following in the footsteps of other technology giants by launching its own desktop/web hybrid application called BrowserPlus. The plug-in is only available as a sneak peek with a few demos, such as using BrowserPlus to drag-and-drop photos from your desktop into Flickr. Yahoo’s product clearly isn’t ready for prime time yet, but perhaps the excitement around MySpace taking advantage of Google Gears to improve its email capabilities galvanized them. Yahoo’s vision certainly looks more compelling than mere offline access to online apps, so I’m eager for more functionality.

Technology buzzwords come and go…virtualization, green, SaaS…and after sitting through the Google Friend Connect announcement, reading about Facebook’s Connect service and writing about last week’s MySpace Data Availability launch, “open” appears to be just the latest. But open is one of those words whose definition can be spun into a variety of meanings.
While Facebook isn’t yet releasing much detail on its efforts and may completely surprise me, Google’s Friend Connect program today highlights how open standards such as OpenID, OAuth and OpenSocial can be used to create a platform that’s pretty closed. The service, which will launch tonight and only expects to have between 12 and 24 sites participating while it’s in preview mode over the next few months, will allow site publishers to put some code on their sites. If a user visits a site with the appropriate code, she can get access, via an IFrame, to applications built in OpenSocial. A user can also share her activities on a participating site with her contacts, as well as through her news feeds on participating social networking sites.
Last week, I pointed out that MySpace’s Data Availability efforts were welcome in that they expand the number of sites on which a user can use her MySpace data, but that MySpace still had a lock on the user data since it hosted and determined who could display that data by approving site partners. If MySpace’s efforts were three steps forward in opening up user profiles, then Google’s Friend Connect represents two steps back.
The use of the IFrame means that site owners have no way to change or work with user data; they can only display it. MySpace doesn’t allow sites to store user data on anyone’s servers other than its own, but it does allow that data to be used directly in the outside site. For more differences among the three services, please check out the chart below.
While none of these services are entirely open yet — and may never be, given security and data abuse problems — the trend toward a more social web is clear. With broadband more prevalent than ever and voice fading as the primary means of communicating with people who aren’t in the room, enabling a truly open social web is the next big step in communication. But in order for that to happen, the user needs to be able to reach across walled gardens and gain granular control as to what he or she shares and with whom.
There’s open source (really open in that anyone with knowledge can participate in how the code evolves), open standards (open only in that anyone can participate using a pre-defined version of the standard), and open APIs (open in that anyone can take the pre-defined standard and build something for a closed platform such as Facebook). Knowing this, the efforts by these three companies to open up a user’s data on a social network (their social graph, if you will) fall somewhere between an open platform and an open standard.
| Facebook Connect | Google Friend Connect | MySpace Data Availability |
| unknown, but Facebook API is likely | OAuth, OpenID, OpenSocial | OAuth |
| basic profile information, profile picture, friends, photos, events, groups | Applications built with OpenSocial, contacts, activities on participating sites published back to a news feed | Profiles, friends, photos and videos |
| unknown | Web site owners must apply to Google and be accepted | Web site owners must agree to MySpace terms and conditions, but MySpace will allow anyone who doesn’t abuse the user data to participate |
| will launch within a few weeks | First 12-24 sites will go live in the next few days and the rest of the web will take a few more months | Launched on May 8 and adding more partners within the next few weeks |
| unannounced | Plaxo, Orkut, Hi5 and Facebook | Yahoo!, Twitter, eBay and Photobucket |
| unannounced | On Google servers and displayed only via an iFrame | On MySpace Servers, but can be displayed however the participating site wishes |
| A user’s privacy settings will follow him around the web | Users opt in to Friend Connect and can limit their profile sharing to existing contacts only; a user can elect on which sites he wants to share his activities, can also instantly change privacy settings across all participating sites | Users can control their privacy settings (right now, only which sites get access to their data) on a central page. Partner sites must accept changes in real time and sharing profile data is an opt-in service |

It has been a long time coming, but Powerset, a San Francisco-based contextual-semantic search engine, has finally launched. I urge you to try it out, for this is quite an impressive search effort, despite the fact it is currently limited to searching Wikipedia along with some supplementary results from Metaweb’s Freebase. I think it has made Wikipedia much easier to use. I like how you can do more topic-based searches and get a holistic view of the information you’re looking for. Danny Sullivan over on Search Engine Land has an elaborate and fantastic in-depth review of Powerset, and that frankly obviates the need for any other review.
That said, Powerset faces an uphill climb, especially when it comes to consumer mindshare. I think Google has become so synonymous with search that it is virtually impossible for a newcomer to establish a toehold. Powerset’s approach is different, and its tactic of applying its technology to specific content repositories such as Wikipedia is smart. But will they (web searchers) come and use Powerset?
At our recent GigaOM PM event, Chad Walters, director of engineering, search and platform at Powerset, gave a talk about how his company was using Hadoop and other clever technologies to meet its immense infrastructure needs. Here are some bits from OStatic’s live blog coverage of the event:
Powerset applies deep natural language processing (based on technology licensed from Xerox PARC), which means the company needs 100 times more processing horsepower than simple keyword searching and indexing would. Powerset uses a distributed database system called HBase in tandem with Coral, its Document Processing System. Coral uses Hadoop as its job control machine. Powerset uses 92 eight-core machines to do processing.

Google isn’t evil and it isn’t being beaten down by the recession or fewer click-throughs on its ads. At least that’s the message CEO Eric Schmidt tried to convey during an interview with Maria Bartiromo that will air on CNBC after the close of markets today.
The grown-up Googler sat down with the Money Honey for a frank talk about Google’s most recent earnings, its plans to move into more enterprise applications, its hopes for monetizing YouTube, mobile phones, and even its participation in the 700MHz auction. Oh, and how it’s still trying to avoid being evil.
The good news is that Google is still focused first and foremost on advertising, in particular on gaining as much share of that market as possible. The bad news is that click-through rates for search advertising in the U.S. are down, which prompted investors to shear some 40 percent off the firm’s market cap from the beginning of the year through mid-March, when numbers showing poor U.S. click-through data surfaced. Schmidt argues that the lowered click-throughs will actually result in higher revenue because the quality of the ad viewer is higher: there is less casual clicking. He also points out that recessions drive advertisers to spend money on ad formats such as search because they can tell how effective those ads are.
Recessions do tend to drive advertisers toward measurable campaigns, and Google will likely benefit. However, the decreased click-through rates are something that not just Google’s investors are watching, but many in the startup community as well. Are consumers becoming more leery of search ads and tuning them out, or are they really getting better at determining which ads are relevant to them, resulting in fewer, but higher-quality, clicks? The answer to that question could determine the success of many of the web-based consumer service providers hoping to make money on new ad formats, and on Google AdWords.
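Schmidt’s claim that fewer clicks can still mean more revenue is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below assumes a deliberately simple model in which search ad revenue equals impressions times click-through rate times the average price paid per click. That model, the function names and every number in the example are my assumptions for illustration, not Google’s figures.

```python
def search_ad_revenue(impressions: int, ctr: float, price_per_click: float) -> float:
    """Revenue under the simple model: impressions x click-through rate x price per click."""
    return impressions * ctr * price_per_click


def breakeven_price_increase(ctr_drop: float) -> float:
    """Relative rise in price per click needed to offset a relative drop in click-through rate."""
    if not 0.0 <= ctr_drop < 1.0:
        raise ValueError("ctr_drop must be at least 0 and below 1")
    return 1.0 / (1.0 - ctr_drop) - 1.0


if __name__ == "__main__":
    # Hypothetical baseline: one million impressions, 2% click-through, 50 cents a click.
    before = search_ad_revenue(1_000_000, 0.02, 0.50)

    # Click-through falls 10 percent; how much more must each click be worth?
    needed = breakeven_price_increase(0.10)
    after = search_ad_revenue(1_000_000, 0.02 * 0.90, 0.50 * (1.0 + needed))

    print(f"price per click must rise {needed:.1%} to break even")
    print(f"revenue before: {before:.2f}, after: {after:.2f}")
```

Under this model, a 10 percent drop in click-through needs a price rise of about 11 percent just to stand still, so the argument only holds if the remaining clicks are worth meaningfully more to advertisers.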
