Google Blogoscoped

Google Auto-Corrects Some Queries

Tony Ruscoe [PersonRank 10]

Tuesday, January 16, 2007
17 years ago, 5,276 views

They've done this type of fuzzy match with plurals for a long time. For example, searching for [service] or [services] will return and highlight results containing both words. Until now, I hadn't noticed them doing it with other kinds of words.

However, I wouldn't say it's really auto-correcting queries, since searching for e.g. [christine aguilera] returns different results from [christina aguilera]. If it were truly auto-correcting, the two queries would return the same results. I guess it must still somehow be including results for the actual query entered.

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

Tony, compare:
http://www.google.com/search?q=oper+labs
with
http://www.google.com/search?q=opera+labs

They're almost identical.

Whereas in Yahoo:
http://search.yahoo.com/search?p=oper+labs
vs
http://search.yahoo.com/search?p=opera+labs

Tony Ruscoe [PersonRank 10]

17 years ago #

Exactly. I suspect they're *almost* identical because not many pages would be returned for [oper labs] under normal circumstances, so there's much less noise. Search for [oper] on its own and you'll also receive results for [opera]. But search for [opera] and the results are obviously very different. The reason your result sets are so similar is that not many pages contain both the words [oper] and [labs].

I'm thinking this is just an extension of Google's Word variations (stemming) feature. From their "Basics of Search" page:

<< Google now uses stemming technology. Thus, when appropriate, it will search not only for your search terms, but also for words that are similar to some or all of those terms. If you search for pet lemur dietary needs, Google will also search for pet lemur diet needs, and other related variations of your terms. Any variants of your terms that were searched for will be highlighted in the snippet of text accompanying each result. >>

From: http://www.google.co.uk/help/basics.html
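
To make the quoted description concrete, here is a minimal sketch of stemming-based query expansion. Google's actual stemmer is not described in this thread, so the suffix list, the `crude_stem` and `expand_query` names, and the sample vocabulary are all assumptions made up for illustration.

```python
# Toy illustration only: the suffix list is an assumption, not Google's algorithm.
SUFFIXES = ("ary", "ing", "s")


def crude_stem(word):
    """Strip one common suffix; a stand-in for a real stemmer."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand_query(query, vocabulary):
    """Map each query term to the vocabulary words that share its crude stem."""
    expanded = {}
    for term in query.split():
        stem = crude_stem(term)
        expanded[term] = sorted(w for w in vocabulary if crude_stem(w) == stem)
    return expanded


# Hypothetical vocabulary standing in for the words found in an index.
vocabulary = {"pet", "lemur", "diet", "dietary", "need", "needs", "opera"}
print(expand_query("pet lemur dietary needs", vocabulary))
```

With this toy vocabulary, [dietary] also matches "diet" and [needs] also matches "need", which is the behaviour the help page describes.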

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

<<not many pages would be returned for [oper labs] under normal circumstances>>

I tried the query at google.ro and I got 1.030.000 results. This is a very big number.

[oper labs -opera] returns 967.000 results.

http://www.google.ro/search?hl=ro&q=oper+labs

Tony Ruscoe [PersonRank 10]

17 years ago #

Fair enough. I get different numbers though. Either way, let's just say for the sake of argument that Google is searching for [oper labs] and [opera labs] and then adding the two result sets together. If the pages containing the word [opera] have more relevance, your results above would appear to be very similar.
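
A rough sketch of that "for the sake of argument" model, assuming each query yields a set of pages with relevance scores. The `merge_results` name, the placeholder page names, and the scores are invented purely for illustration; nothing here reflects how Google actually ranks.

```python
def merge_results(*result_sets):
    """Union several {page: score} dicts and rank pages by their best score."""
    merged = {}
    for results in result_sets:
        for page, score in results.items():
            merged[page] = max(score, merged.get(page, 0.0))
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(merged, key=lambda page: (-merged[page], page))


# Hypothetical scores: pages matching [opera labs] are assumed more relevant.
oper_labs = {"oper-a.example": 0.2, "oper-b.example": 0.1}
opera_labs = {"opera-labs.example": 0.9, "opera-blog.example": 0.8}

print(merge_results(oper_labs, opera_labs)[:2])
print(merge_results(opera_labs)[:2])
```

Under these assumptions the top of the merged list for [oper labs] is the same as the top of the list for [opera labs], even though the full sets differ.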

BTW, using the + operator, you can specify that you want to use the *exact* word in your query:

[+oper +labs] = 509,000 results
http://www.google.com/search?q=%2boper+labs

[+opera +labs] = 1,540,000 results
http://www.google.com/search?q=%2Bopera+%2Blabs

[oper +labs] = 1,690,000 results
http://www.google.com/search?q=oper+%2Blabs

[oper labs] = 12,200,000 results
http://www.google.com/search?q=oper+labs
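
For anyone building these URLs by hand: a literal + has to be percent-encoded as %2B, because an unencoded + in a query string stands for a space. A small sketch using Python's standard library; the `require_exact` and `search_url` helper names are made up here, and the + operator itself is taken from the behaviour described in this post.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.google.com/search"


def require_exact(query):
    """Prefix every term with '+', the exact-match operator described above."""
    return " ".join("+" + term.lstrip("+") for term in query.split())


def search_url(query):
    """Build a search URL; urlencode turns '+' into %2B and spaces into '+'."""
    return BASE_URL + "?" + urlencode({"q": query})


print(search_url(require_exact("opera labs")))
# http://www.google.com/search?q=%2Bopera+%2Blabs
```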

Tony Ruscoe [PersonRank 10]

17 years ago #

Here's a good one: [operation system]
http://www.google.com/search?q=operation+system

It says: << Did you mean: operating system >>

And then it goes on to return results including "OS", "operation", "operating", "system" and "systems" anyway!

What's weird is [operation system] returns 156,000,000 results and [+operation +system] returns 209,000,000 (i.e. even more) results.

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

Hey, Tony, I know about this one. I get a lot of hits from this search query:
google operation system

Brian MIngus [PersonRank 10]

17 years ago #

I noticed this recently. In order to search for the exact word, I just had to wrap it in quotes. So if you have a multi-word query, just do this:

"this" "is" "my" "search" "query"

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

Or use + in front of each keyword.
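
Quoting every word by hand gets tedious for longer queries. A tiny helper can do it; this is only a sketch, and the `quote_each_word` name is hypothetical.

```python
def quote_each_word(query):
    """Wrap every whitespace-separated term in double quotes."""
    return " ".join('"' + term.strip('"') + '"' for term in query.split())


print(quote_each_word("this is my search query"))
# "this" "is" "my" "search" "query"
```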

Tony Ruscoe [PersonRank 10]

17 years ago #

If people are noticing this more often recently, maybe Google's word stemming feature has just become more intelligent and is trying to match more words?

(It was originally launched in 2003:
http://blogoscoped.com/archive/2003_12_01_index.html#107028708399659095)

Splasho [PersonRank 10]

17 years ago #

I think there's a niche for an old-fashioned non-fuzzy search engine which would let you search for exactly what you were looking for. For some searches, especially stuff to do with programming, I get annoyed by the fuzziness because it returns too many irrelevant results.

Milly [PersonRank 10]

17 years ago #

"This behavior seems to be new."

I don't know how *much* of it is new. Searching [40tude dialog ifilter] shows [filter] as also being highlighted, without any "Did you mean: filter" message.

But that was happening more than a year ago, too: http://blogoscoped.com/forum/13158.html

garcon [PersonRank 0]

17 years ago #

It seems like "stemming" (word variations). If you search for "jirka bush" in Czech, you get results with the word "jiří" in bold.

http://www.google.com/search?hl=cs&q=jirka+bush&lr=lang_cs

Andreas Bovens [PersonRank 3]

17 years ago #

Noticed this yesterday when searching for [kanji ime] http://www.google.com/search?q=kanji%20ime

The first, second and several other results on that page all refer to kanji and *time* related queries, which is obviously NOT what I was looking for (IME stands for input method editor). :-/

So, nothing against fuzzy matches, but I don't like that Google is doing this with 3 and 4 letter words (actually, variations on the first letter of a word are probably irrelevant 99% of the time).
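
The rule of thumb I have in mind could be sketched like this. It is only an illustration of the preference above, not anything Google is known to do; the `allow_variant` name and the five-letter threshold are assumptions.

```python
def allow_variant(term, variant, min_length=5):
    """Decide whether a fuzzy variant of a query term should be accepted.

    Short terms (under min_length letters) must match exactly, and a
    variant may never change the first letter of the term.
    """
    term, variant = term.lower(), variant.lower()
    if term == variant:
        return True
    if len(term) < min_length:
        return False
    return term[0] == variant[0]


print(allow_variant("ime", "time"))          # False: term is too short
print(allow_variant("ifilter", "filter"))    # False: first letter differs
print(allow_variant("service", "services"))  # True
```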
