SearchEngineWatch joins the link counting fray

Danny Sullivan is skeptical about the accuracy of the result counts from Google and Yahoo that Tristan Louis used in two studies. Those studies concluded that Yahoo has better coverage of blogs than Google, which in turn has better coverage than Technorati. Danny posted an email conversation with Tristan about his study. The lines of argument are a little hard to follow, but it’s well worth reading because it illuminates how difficult it is to get a handle on the search giants’ index size, and especially their blog coverage.

Danny, from his exchange with Tristan:

Also, Google did say “of about” with the numbers it reports. That’s not an accident. They’re saying that this is an estimate. But no disagreement with me. If you put up a count, it would be nice if the count was as accurate as possible. Google’s have come under question.

Hmm. From what I’ve seen in Tristan’s data and my own testing, it’s Yahoo’s counts that ought to come under question, specifically for link: queries.

Danny to Tristan again:

The link: command is completely different than the site: command. The link command tells you nothing about the size of the index. As for a confirmation that all links aren’t reported, this past blog post from SEW gives you confirmation and this page on Google mentions links are only a sampling of what Google knows although this other Google page fails to make this clear.

link: and site: are very different, that’s true enough. And maybe the link command doesn’t tell you much about the size of an index, but if link collection methods are similar between Yahoo and Google (and why wouldn’t they be, it’s a relatively easy part of the whole game), then the counts ought to be similar. But they’re not, not by a long shot.

By the way, a big thanks to Tristan for posting his studies and kicking off this discussion. Most of us don’t take the time to do analysis of that depth to support our opinions, let alone post the entire method and dataset so others can reproduce it, shoot holes in it, or go off on tangents from it.

(I stumbled onto Danny’s post via John Battelle)
