The Google Bloggers have reported on the State of the Google Union in a post titled Helping computers understand language. I’m a big fan of Google, but in this case (as in a previous post titled How I Know that Search Engines Haven’t Mastered Semantics), I have to play devil’s advocate and disagree.

It’s not that they don’t understand semantics. I think they do a fine job of interpreting queries and suggesting alternative meanings, but they do an awful job of organizing results for synonyms and related terms in a uniform manner. My evidence suggests that Google can understand terms that are actually synonymous, but not the implied relationships that exist only in human language.

In their example, they cite the relationship they glean between photos and pictures as applied in two queries, photos developed with coffee and pictures developed with coffee. The results jibe for me, but then again, if I just search developed with coffee, I get the same results once again. One could infer from this that Google is not actually understanding anything, but that they’ve cherry-picked a site that happens to have a great presence for a shorter phrase.

It’s not a stretch to say that film developed with coffee is synonymous with photos developed with coffee, but for this query the results are different. Imagine my surprise when a thesaurus showed me that “film” is not necessarily a synonym of “photo”.

Perhaps that’s why Google didn’t give me the same result they favoured for the #1 position in the other three queries. One could also infer that the site, optimized for “photo” and “picture”, didn’t have the same optimization for “film”. My conclusion, unscientific as it is, is that while Google can use a thesaurus as well as anybody, possibly better, they’re no closer to understanding natural language.
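For anyone who wants to repeat this comparison a little less by eye, here is a minimal Python sketch of the idea: take the ranked results for each query variant, measure how much the top results overlap, and check whether the #1 result is the same. The function names and the result lists are my own placeholders, not real search data, and the sketch assumes you have collected the ranked URLs for each query yourself.

```python
from itertools import combinations


def top_k_overlap(results_a, results_b, k=10):
    """Jaccard overlap of the first k URLs in two ranked result lists."""
    set_a, set_b = set(results_a[:k]), set(results_b[:k])
    if not set_a and not set_b:
        return 1.0  # two empty lists are trivially identical
    return len(set_a & set_b) / len(set_a | set_b)


def compare_queries(results_by_query, k=10):
    """Compare every pair of queries: (overlap score, same #1 result?)."""
    report = {}
    for q1, q2 in combinations(results_by_query, 2):
        r1, r2 = results_by_query[q1], results_by_query[q2]
        same_top = bool(r1) and bool(r2) and r1[0] == r2[0]
        report[(q1, q2)] = (top_k_overlap(r1, r2, k), same_top)
    return report


# Placeholder data only: these are NOT real search results.
results_by_query = {
    "photos developed with coffee": ["site-a.example", "site-b.example", "site-c.example"],
    "pictures developed with coffee": ["site-a.example", "site-b.example", "site-c.example"],
    "film developed with coffee": ["site-d.example", "site-b.example", "site-e.example"],
}

for (q1, q2), (overlap, same_top) in compare_queries(results_by_query).items():
    print(f"{q1!r} vs {q2!r}: overlap={overlap:.2f}, same #1: {same_top}")
```

A high overlap between two variants only shows that the rankings agree. As with the shorter developed with coffee query, it can't tell you whether the engine understood the synonym or simply matched the rest of the phrase.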

This entry was written by Sean Enns, posted on February 3, 2010 at 3:19 pm, filed under Search Engine Optimization and tagged Google, Search, SEO.