Google Blogoscoped

Forum

Most Popular Words 2006

Splasho [PersonRank 10]

Friday, April 21, 2006
12 years ago, 8,524 views

That is fascinating, great work.

Some really unexpected gainers.

Rick [PersonRank 0]

12 years ago #

Great list, but where's "human-animal hybrids"?

/pd [PersonRank 10]

12 years ago #

I think what's fascinating is that the word "privacy" has hit the top 30 words. That by itself says that society at large is inquisitive about privacy!

Philipp Lenssen [PersonRank 10]

12 years ago #

Isn't there some kind of requirement in the US to put a "privacy" link in the footer of websites? Or is it just common practice?

IvyMike [PersonRank 0]

12 years ago #

I can't stop wondering about the sudden popularity of the word "pus".

Elias KAI [PersonRank 10]

12 years ago #

Hahaha, I did this some years ago for single letters.
Guess who's first for the letter A,
or G,
or M,
or W.

Seth Finkelstein [PersonRank 10]

12 years ago #

Philipp: There's no requirement, but many sites have a "Privacy Policy", which accounts for the link. It's just common practice.

Joe Le Merou [PersonRank 0]

12 years ago #

Very useful, thank you!

Mike Valstar [PersonRank 0]

12 years ago #

Well, "unmask" HAS to be because of Gentoo.

anthony goh [PersonRank 0]

12 years ago #

That's kind of an interesting study, but I really would have liked it if buzzwords and web-related words had been filtered out, so we could see more of how language and usage of the web are changing.

E.g. furl (computery stuff) and pvssycat (dolls) are kind of no-brainers as big gainers. I wonder if instead you could track phrases like

"worried about" or "really busy this week" or "felt really good about xyz"

to get a changing picture of how the whole world is feeling, and what concerns them. Just an idea.

Thanks for doing this project though. Good work!

anthony

/pd [PersonRank 10]

12 years ago #

This story has hit Digg and has about 534 Diggs :)

dirtyfratboy submitted the post!

digg.com/technology/Google_s_M ...

Marco Polo [PersonRank 0]

12 years ago #

I think the top gainers are skewed because of a change in how Google handles accented characters.

Many of the words in the top gainer list are French and all of those have in common that they are supposed to be accented, which is quite a coincidence...

soignée, motorisé, déterminer, hospitalisé, bénéfice, dérangé, congé, confrère, détente, métier, clôture, soupçon, député, inquiétude, transistorisé, habitué, évacuée, fête, ménage, exécutant, déjection, civilisé, corvée, gâteau, mélangé, démodé, inéluctable, soirée, individualisé.

I recognized a few Spanish words like señor and señora, which are incidentally also accented.

It seems that Google had problems with accented characters in some character sets in 2003, and those have been fixed since then. I remember running into that kind of problem back then, with some pages not being found because Google didn't recognize the accented characters as written.
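
One way to check the accent hypothesis above against the word list would be to split the words into accented and unaccented groups and compare their typical growth factors. This is only a minimal Python sketch: it assumes rows of (word, 2003 count, 2006 count), the function names are made up for illustration, and the sample counts are hypothetical, not taken from the CSV file.

```python
import unicodedata
from statistics import median


def has_accent(word: str) -> bool:
    """True if any character decomposes into a base letter plus a combining mark."""
    decomposed = unicodedata.normalize("NFD", word)
    return any(unicodedata.combining(ch) for ch in decomposed)


def growth_by_group(rows):
    """Return (median growth factor of accented words, median of all other words)."""
    accented, plain = [], []
    for word, count_2003, count_2006 in rows:
        if count_2003 <= 0:
            continue  # a growth factor is undefined without a 2003 baseline
        factor = count_2006 / count_2003
        (accented if has_accent(word) else plain).append(factor)
    return median(accented), median(plain)


# Hypothetical counts, for illustration only.
sample = [
    ("fête", 1_000, 900_000),
    ("señor", 2_000, 1_000_000),
    ("privacy", 1_000_000, 40_000_000),
    ("monitor", 500_000, 25_000_000),
]
accented_growth, plain_growth = growth_by_group(sample)
```

If the accented group's median growth were far above the rest on the real data, that would support the idea that a change in accent handling, rather than real usage, drives those gainers.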

nice job though [PersonRank 0]

12 years ago #

Interesting, but quite useless, I think.

What information lies behind the fact that the word "a" is the most frequently found word? Everyone uses an "a" and an "of" and an "and" and so on.

Restricting the list to nouns, adjectives and names would have made this evaluation far more meaningful.

Robo [PersonRank 0]

12 years ago #

Interesting stuff, but what about "www" with 25.270.000.000 pages?

Philipp Lenssen [PersonRank 10]

12 years ago #

I used words from a dictionary of a little over 27,000 words, but there are no company names included, no "www" and such. The full word list is contained in the CSV file (which you can open with e.g. Excel) at the end of the article. I too think it would be very interesting to do this with a list of e.g. company names, names of celebrities and such. I might do that in the future!

Marco [PersonRank 0]

12 years ago #

To allow a realistic view of the top gainers the list should also show the absolute number of searches in 2006 for these keywords, not just the factor by which the absolute number grew.

For example, if "furl" increased from 1 search in 2003 to 6155 searches in 2006 (example figures) it gives a different picture of the importance of this keyword. Also a list of the top gainers sorted by the gained absolute number of searches would be very interesting.
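
The difference between the two orderings suggested above can be sketched in a few lines of Python. This is an illustration under assumptions: the rows are (word, 2003 count, 2006 count), the "furl" figures are the example figures from the comment, and the other counts are hypothetical.

```python
def rank_gainers(rows, top=3):
    """Rank words by growth factor and by absolute gain.

    rows: iterable of (word, count_2003, count_2006).
    Returns (words sorted by factor, words sorted by absolute gain).
    """
    usable = [r for r in rows if r[1] > 0]  # skip words with no 2003 baseline
    by_factor = sorted(usable, key=lambda r: r[2] / r[1], reverse=True)[:top]
    by_absolute = sorted(usable, key=lambda r: r[2] - r[1], reverse=True)[:top]
    return [r[0] for r in by_factor], [r[0] for r in by_absolute]


sample = [
    ("furl", 1, 6155),                       # example figures from the comment
    ("privacy", 1_000_000, 40_000_000),      # hypothetical
    ("a", 3_000_000_000, 25_000_000_000),    # hypothetical
]
by_factor, by_absolute = rank_gainers(sample)
```

With these figures "furl" leads the factor ranking but comes last by absolute gain, which is the distortion the comment describes.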

Steve [PersonRank 0]

12 years ago #

Quite funny, as I did this "Google battle" together with a friend of mine last year. At that time "a" won the contest, just ahead of "www"; right now it is the other way around... so the internet is now the most important thing... on the internet... ;)

Lars Kasper [PersonRank 1]

12 years ago #

Someone at Spiegel Online likes your blog: spiegel.de/netzwelt/netzkultur ...

4wd media [PersonRank 0]

12 years ago #

nice work, but how long did it take?!? :) If you want to try what Steve wrote about, just check googlebattle.com; there you can battle two words against each other... have fun!

Philipp Lenssen [PersonRank 10]

12 years ago #

> nice work, but how long did it take?!?

A couple of days. :)

By the way, I think it's easy to misunderstand that the post discusses popular searches (a la Google Zeitgeist, or AdWords keyword tools)... it doesn't, it only discusses how often a word appears on the web.

deogen [PersonRank 0]

12 years ago #

[moved]

Hi!
If you enter "www" in Google, you will get 25.270.000.000 results (25th April 2006)!
"a" just gives you 25.110.000.000 results ...

Regards,

Matze

Ludwik Trammer [PersonRank 10]

12 years ago #

And if you enter inurl:www you will get about 14,630,000,000 results. So this doesn't count, because most of the "www"s are in URLs, not in body text...

Holger [PersonRank 0]

12 years ago #

Fantastic and exciting.

The only thing missing is the list of words that decreased, if there are any.
That would be interesting, as would categorizing the list of words that increased.

Stacy Reed [PersonRank 0]

12 years ago #

Such an interesting compilation! Thanks for doing the research!

Karen Jones [PersonRank 0]

12 years ago #

Surely some of these words are mis-spelt? E.g.: ameba>amoeba, corelate>correlate, itern>intern, graphical>graphic, milgramme>milligram. Would using the correct spellings change the rankings? Or are there really words spelt as in the list?

Murilo Silveira [PersonRank 0]

12 years ago #

It is curious to see that the Portuguese word "proximo" is on the list. It is actually written "próximo", with an accent on the syllable "pro" to mark the stress. In Portuguese, "próximo" means "near" in English, or the next item in a list or the next person in a queue, in which case it is used with the article "o" ("the"): "o próximo".

Wen Won [PersonRank 0]

12 years ago #

Why is there no more discussion about Marco Polo's comment? It would be quite a coincidence for all of those to be accented words. His list again:

soignée, motorisé, déterminer, hospitalisé, bénéfice, dérangé, congé, confrère, détente, métier, clôture, soupçon, député, inquiétude, transistorisé, habitué, évacuée, fête, ménage, exécutant, déjection, civilisé, corvée, gâteau, mélangé, démodé, inéluctable, soirée, individualisé.

I find this fascinating stuff, and an accurate take on the info would be great; however, this run of numbers does seem to be thrown off by the accent issue.

If the data mining is to be of use, then issues like this need to be addressed.

Philipp Lenssen [PersonRank 10]

12 years ago #

I used the same Google API script to collect the results in 2003 and in 2006, but it's very likely that Google changed how accents are processed... so the "increase" data may be skewed in this regard, I don't really know.

iZeitgeist [PersonRank 10]

12 years ago #

This search shows every result in bold:
(25,270,000,000 results as well, must be the global index size)

inurl:** site:**
google.com/search?hs=0sE&h ...

iZeitgeist [PersonRank 10]

12 years ago #

I just discovered this:

"Sorry, Google does not serve more than 1000 results for any query. (You asked for results starting from 1001.)".

google.com/search?q=google& ...

Jay Amin [PersonRank 10]

12 years ago #

google.com/search?hl=en&lr ...

How did you get that?

iZeitgeist [PersonRank 10]

12 years ago #

For example, go to the 2nd page of results and change the start parameter in the URL to:

"start=1001"

Jay Amin [PersonRank 10]

12 years ago #

Oh... got it. Why does it say that?

David Palfrey [PersonRank 1]

12 years ago #

There's another curious feature of the results. Graphing 2006 results against 2003 results on log-log scales shows two almost perfectly separable clusters. Regressing the results as a whole (eliminating a few words whose columns were garbled when I imported them into Excel), we have y=44.326*x -3E06. The smaller cluster (I haven't yet looked to isolate it and find what words it contains) has values for y (i.e. 2006 counts) which are around 10 times less than what one would expect. Can you shed any light on this?

David Palfrey [PersonRank 1]

12 years ago #

Looking at the anomalous results, they seem to be those which are in the alphabetical range MOUTHWASH-PRIDE. These results (approx. 10% of the total) appear to have been overestimated in 2003 or underestimated in 2006.
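
A rough way to isolate such a cluster is to fit a line in log-log space and list the words whose 2006 count falls far below the fitted value. Note the swap: the regression quoted above was on the raw counts, while this Python sketch fits the logarithms, which is an assumption made here to match the log-log graph. The function names, the threshold factor and the sample counts are all hypothetical.

```python
import math


def fit_loglog(rows):
    """Least-squares fit of log10(count_2006) = slope * log10(count_2003) + intercept."""
    xs = [math.log10(c03) for _, c03, _ in rows]
    ys = [math.log10(c06) for _, _, c06 in rows]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def low_outliers(rows, slope, intercept, factor=5.0):
    """Words whose 2006 count is more than `factor` times below the fitted value."""
    threshold = math.log10(factor)
    return [
        word
        for word, c03, c06 in rows
        if slope * math.log10(c03) + intercept - math.log10(c06) > threshold
    ]


# Hypothetical counts: five words grow 40-fold, one grows only 4-fold.
sample = [
    ("apple", 1_000, 40_000),
    ("house", 10_000, 400_000),
    ("river", 100_000, 4_000_000),
    ("table", 1_000_000, 40_000_000),
    ("water", 10_000_000, 400_000_000),
    ("mouthwash", 100_000, 400_000),
]
slope, intercept = fit_loglog(sample)
outliers = low_outliers(sample, slope, intercept)
```

On the real word list, sorting the flagged words alphabetically would show whether they really fall into one contiguous range.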
