The He/ She Ratio
Ionut Alex. Chitu | Monday, April 30, 2007 17 years ago • 10,421 views |
Usage of "he" vs "she" on googlesystem.blogspot.com/ (79%/ 21%). Most of the "she" is in the comments. |
JoeW | 17 years ago # |
"Usage of "he" vs "she" on www.myspace.com (15%/ 85%)" myspace is 85% "she". |
/pd | 17 years ago # |
Usage of "he" vs "she" on peterdawson.typepad.com (84%/ 16%) |
/pd | 17 years ago # |
Wow... TP lives up to his advocacy for women!!
Usage of "he" vs "she" on tompeters.com (51%/ 49%) |
Ludwik Trammer | 17 years ago # |
Pretty interesting idea, but you have to keep site categories in mind when interpreting the results. For example, political sites have more "he" because there are currently many more male politicians. On the other hand, porn sites in general have more "she", because the biggest group using them is heterosexual males (again...). Less obvious examples are of course much more interesting. But the thing to remember is that a service with a big representation of one gender's form isn't FOR this gender or PRO this gender or SUPPORTING this gender. It's only ABOUT this gender (as the porn sites example shows...). |
/pd | 17 years ago # |
Good Point Ludwik..but what about dating sites.. ?? for example
Usage of "he" vs "she" on www.plentyoffish.com (47%/ 53%)
so would this mean that there are more females looking to date ?
|
Philipp Lenssen | 17 years ago # |
> so would this mean that there are more females looking to date ?
Not necessarily... women on a dating site might write, "I'm looking for a man. He should be above 30 ... " etc... |
Ludwik Trammer | 17 years ago # |
> Usage of "he" vs "she" on www.plentyoffish.com (47%/ 53%)
The difference is very low. I got 51%/49% for plentyoffish.com (without "www").
> so would this mean that there are more females looking to date ?
My thinking about this is the opposite of your thinking ;) When someone places an ad on such a site, she/he writes about herself/himself in the first person ("I"), but in the third person (sometimes second person) about a potential partner ("He should be sweet and caring", "She has to be pretty"). So for every statistical woman placing her ad there is more "he", and for every man more "she". |
alek | 17 years ago # |
How 'bout seeing if the percentages are similar/consistent if you compare "him/her" |
mukthar | 17 years ago # |
Usage of "he" vs "she" on google.com (65%/ 35%) |
Roger Browne | 17 years ago # |
Uclue.com is 35%/65% which I certainly would not have expected.
Answers.google.com is 68%/32%. |
Ionut Alex. Chitu | 17 years ago # |
Usage of "he" vs "she" on en.wikipedia.org (69%/ 31%)
Usage of "he" vs "she" on blogspot.com (65%/ 35%)
Usage of "he" vs "she" on youtube.com (69%/ 31%)
Usage of "he" vs "she" on flickr.com (57%/ 43%) |
Rebecca Kelley | 17 years ago # |
This study seems a bit ridiculous to me. When writing, it's customary to use "he" as the default gender in order to avoid sentences riddled with "he/she" (e.g. "A user on your site is more likely to convert if he/she thinks that the content is relevant to him/her"). Look at the Spanish language--when referring to multiple genders (el chico, la chica), both together are given the "los" definite article, which is masculine (los chicos).
I'm not saying that men aren't more highly represented than women; however, I think it's important to keep in mind basic grammar and writing rules. |
Ludwik Trammer | 17 years ago # |
It is interesting to compare this result with the average ratio for the Internet as a whole, which is 65%/35%. That means, for example, that blogspot.com is exactly average, and that on flickr.com there is more content about women than on the average site. |
Jennifer Hitchcock | 17 years ago # |
Wow!
I am 91% girly. (It won't let me put my domain name because it has a number in it. ladylike four is the name though.)
I took some test somewhere else though that said I had a strong masculine side. These predictors and indicators are funny. |
Ludwik Trammer | 17 years ago # |
> This study seems a bit ridiculous to me.
It's not ridiculous. You just have to keep all those factors in mind. I believe my previous post, which compares results with an average, should help avoid the problem that you mentioned. You could even introduce a new ratio based on that. For example flickr.com has 62% more female content than average, and YouTube 6% more male. |
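A minimal sketch of one way such an "against the average" ratio could be computed. The formula (a site's share divided by the web-wide share, minus one) is an assumption, not something stated in the thread; the percentages are the ones quoted above. With this formula YouTube comes out about 6% more "he" than average, which matches the comment, while flickr.com comes out about 23% more "she", not 62%, so that figure may rest on a different formula or different inputs.

```python
# Hypothetical helper: how far a site's share sits above or below a baseline share.
def relative_to_baseline(site_pct: float, baseline_pct: float) -> float:
    """Return the percent by which site_pct exceeds baseline_pct (negative if below)."""
    if baseline_pct <= 0:
        raise ValueError("baseline share must be positive")
    return (site_pct / baseline_pct - 1.0) * 100.0


# Figures quoted in the thread: web-wide average and two sites.
WEB_AVERAGE = {"he": 65.0, "she": 35.0}
SITES = {
    "flickr.com": {"he": 57.0, "she": 43.0},
    "youtube.com": {"he": 69.0, "she": 31.0},
}


def lean(site: str) -> dict:
    """Relative deviation from the web average for both words on one site."""
    shares = SITES[site]
    return {word: relative_to_baseline(shares[word], WEB_AVERAGE[word])
            for word in ("he", "she")}


if __name__ == "__main__":
    for name in SITES:
        result = lean(name)
        print(f"{name}: he {result['he']:+.1f}%, she {result['she']:+.1f}% vs. average")
```

Dividing by the baseline rather than subtracting it keeps the comparison relative, so a site on a skewed baseline is not automatically read as skewed itself.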
Ludwik Trammer | 17 years ago # |
> I took some test somewhere else though that said I had a strong masculine side.
That's correct. Your content is 91% ABOUT girls. And you know WHO is interested in girls...? ;) |
dan | 17 years ago # |
but....
is it really the opposite?
would a guy write HE on his site...he'd write "me"...and when writing about a woman would write she...so it could be that we need to flip the data.
www.vibrantorange.com |
Ramibotros | 17 years ago # |
Interesting:
Usage of "he" vs "she" on *.blogspot.com (47%/ 53%)
Usage of "he" vs "she" on *.* (66%/ 34%)
|
Ramibotros | 17 years ago # |
As for music:
Usage of "he" vs "she" on last.fm (44%/ 56%)
Also:
Usage of "he" vs "she" on imdb.com (58%/ 42%)
Usage of "he" vs "she" on groups.google.com (73%/ 27%) |
George R | 17 years ago # |
The following results were all taken from i.p. address 72.14.207.107 over a short period of time on 4/30/2007.
+804000 site:cnn.com
1740000 site:cnn.com the
1550000 site cnn.com -he -she
1220000 site:cnn.com he OR she
1220000 site:cnn.com he
+336000 site:cnn.com she
+883000 site:cnn.com he -she
++65100 site:cnn.com she -he
1660000 site:cnn.com -he
2380000 site:cnn.com -she
|
George R | 17 years ago # |
Your ratios only add to 100%. Many pages contain both "he" and "she". Such pages should count twice, resulting in a total greater than 100%.
In any event, the "counts" that Google provides are of questionable value. See my previous comment. |
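To make both points concrete, here is a rough sketch using the cnn.com counts quoted in this thread. The helper names are made up for illustration. If the counts were exact, the number of pages containing both words would come out the same whichever way it is derived; and if each word's pages are measured against the pages containing either word, the two shares can add up to more than 100%.

```python
# Page counts for site:cnn.com as reported in the thread (4/30/2007).
CNN_COUNTS = {
    "he": 1_220_000,
    "she": 336_000,
    "he OR she": 1_220_000,
    "he -she": 883_000,
    "she -he": 65_100,
}


def implied_both(counts: dict) -> dict:
    """Derive the number of pages containing both words in three independent ways."""
    return {
        "he minus (he -she)": counts["he"] - counts["he -she"],
        "she minus (she -he)": counts["she"] - counts["she -he"],
        "inclusion-exclusion": counts["he"] + counts["she"] - counts["he OR she"],
    }


def is_consistent(counts: dict, tolerance: float = 0.01) -> bool:
    """True if the three estimates agree within a relative tolerance."""
    estimates = list(implied_both(counts).values())
    spread = max(estimates) - min(estimates)
    return spread <= tolerance * max(estimates)


def shares_of_union(counts: dict) -> dict:
    """Share of pages with either word that contain each word; may sum past 100%."""
    union = counts["he OR she"]
    return {word: 100.0 * counts[word] / union for word in ("he", "she")}


if __name__ == "__main__":
    for label, value in implied_both(CNN_COUNTS).items():
        print(f"{label}: {value}")
    print("consistent:", is_consistent(CNN_COUNTS))
    print("shares of union:", shares_of_union(CNN_COUNTS))
```

The three estimates disagree by tens of thousands of pages, which is one way to see why the reported counts are better treated as rough estimates than as exact figures.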
John Honeck | 17 years ago # |
Hopefully Philipp doesn't rank for He:She after this post, which is a different subject altogether.
Since it's been well established that Googlebot is indeed a she, does this come into indexing decisions? |
Philipp Lenssen | 17 years ago # |
> This study seems a bit ridiculous to me. When writing, it's customary to use "he" as the default gender in order to avoid sentences riddled with "he/she"
Ludwik hit the nail on the head when he said: "It's not ridiculous. You just have to keep all those factors in mind." The ratio I presented makes no attempt at any specific interpretation ("more 'he' means xyz" or "50% 'she' means foobar"). And there are many ways to interpret these results. For instance, Ludwik proposes to level the values according to the average of the web at large. On the other hand, someone else may consider the web at large to be biased, so they'd consider this leveling to be unfair, etc. (and some may consider "he" defaulting to "neutral" to be biased grammar itself, proposing e.g. a balanced he/she alternation) – again, many different interpretations for the he/ she ratios. |
Stubbe | 17 years ago # |
This is such a great tool! I'm proud to say that thestubbes.com is 74% compliant. I hope you don't mind me posting the image on my site (with a link to this fabulous tool of course) |
Andrew | 17 years ago # |
The experiment is flawed: it shows the number of PAGES containing the word 'he' or 'she' but doesn't take into account how frequently the words appear on each page. So a page with 'he' appearing 100 times is given the same weight as a page with 'she' appearing once. |
Ludwik Trammer | 17 years ago # |
> it shows the number of PAGES containing the word 'he' or 'she' but doesn't take into account how frequently the words appear on each page.
Yes, and this could be considered both a good and a bad thing. One article with a lot of "she"s shouldn't change the whole service's stats gathered from articles from many years. Maybe the best solution would be to count "he" and "she" in every single article separately and then give the point to the gender that had more mentions in the given article. And then count the points.
But Philipp's solution is better just because it's easy and clean. Philipp is showing us that we can gather many interesting stats in a simple way (do you remember his chess stats gathered from Google?). It's less about the stats and more about the process. |
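The per-article scoring proposed above could look roughly like the sketch below. It assumes you already have the article texts as plain strings (the tool discussed in this thread does not fetch them), and the function names and the tie rule (a tied article gives no point) are choices made for illustration.

```python
import re
from collections import Counter

# Whole-word, case-insensitive match, so "the" or "sheet" are not counted.
WORD_RE = re.compile(r"\b(he|she)\b", re.IGNORECASE)


def article_vote(text: str):
    """Return 'he' or 'she' for whichever word appears more often, or None on a tie."""
    tally = Counter(match.group(1).lower() for match in WORD_RE.finditer(text))
    if tally["he"] > tally["she"]:
        return "he"
    if tally["she"] > tally["he"]:
        return "she"
    return None  # tie, or neither word present: no point awarded


def site_points(articles) -> Counter:
    """Give one point per article to the winning word and total the points."""
    points = Counter()
    for text in articles:
        winner = article_vote(text)
        if winner is not None:
            points[winner] += 1
    return points


if __name__ == "__main__":
    sample = [
        "She said she would come. He agreed.",  # she: 2, he: 1
        "He left the theatre early.",           # he: 1
        "They met at noon.",                    # neither word
    ]
    print(site_points(sample))
```

Because each article contributes at most one point, a single article that repeats "she" a hundred times weighs the same as one that uses it twice, which is the property the comment asks for.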
Andrew | 17 years ago # |
Linguists have been gathering stats (about word frequencies, etc) using this process for quite a while but the results are notoriously unreliable. See, for example, http://aixtal.blogspot.com/2005/02/web-googles-missing-pages-mystery.html |
Rob Balder | 17 years ago # |
I'm also puzzled as to the significance (if any) of this number for a given site. Livejournal.com, for example, keeps statistics on their users who indicate gender (most of us) and shows about a 2:1 female:male user ratio.
Male: 1724500 (32.8%)
Female: 3533289 (67.2%)
Unspecified: 2089647
Yet the he/she for Livejournal is (66%/ 34%).
The person who did this comparison above for MySpace shows an even more extreme disparity between user demographics and ...what, speech focus? What is this measuring? I'm having a hard time seeing what this number could be used for, even in sweeping and general terms. |
Andrew | 17 years ago # |
^ The simple answer: males don't only use 'he' in their writing and females don't only use 'she' in their writing! |
Ludwik Trammer | 17 years ago # |
> Male: 1724500 (32.8%)
> Female: 3533289 (67.2%)
> Yet the he/she for Livejournal is (66%/ 34%).
And isn't that fascinating? I find this and similar comparisons of different data extremely interesting. For example, in this case there is almost exactly the same percentage of posts with "he" as female bloggers, and the same percentage of posts with "she" as male bloggers. It would be really interesting to compare data from more sources. |
Kevin T. Keith | 17 years ago # |
I find that many valid URLs return only an error message – even some that get a lot of traffic. What gives? |
Philipp Lenssen | 17 years ago # |
Kevin, got some examples? Basically all I'm doing is concatenating a "site:" command with the URL entered, and then checking the Google page count... |
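A rough sketch of the approach described above, not the tool's actual code. It only builds the two queries and turns two page counts into percentages; reading the counts off the results page is left out, so the counts are passed in by hand. The function names are hypothetical, and the percentage formula (each count divided by the sum of both) is an assumption that fits the observation in this thread that the two figures always add to 100%.

```python
from urllib.parse import quote_plus


def build_query(site: str, word: str) -> str:
    """Concatenate a "site:" command with the entered site and the word to look for."""
    return f"site:{site} {word}"


def search_url(query: str) -> str:
    """Google search URL for a query, using the q parameter seen in links in this thread."""
    return "http://www.google.com/search?q=" + quote_plus(query)


def he_she_ratio(he_pages: int, she_pages: int):
    """Turn two page counts into whole percentages that add up to 100 (assumed formula)."""
    total = he_pages + she_pages
    if total == 0:
        raise ValueError("no pages found for either word")
    he_pct = round(100 * he_pages / total)
    return he_pct, 100 - he_pct


if __name__ == "__main__":
    for word in ("he", "she"):
        print(search_url(build_query("cnn.com", word)))
    # Counts for site:cnn.com as reported earlier in the thread.
    he_pct, she_pct = he_she_ratio(1_220_000, 336_000)
    print(f'Usage of "he" vs "she" on cnn.com ({he_pct}%/ {she_pct}%)')
```

Because the result depends entirely on the two reported page counts, a site for which the search reports no count at all has no ratio to show.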
Stephen Tordoff | 17 years ago # |
Now this is interesting, and somewhat related to this:
"She invented"
http://www.google.co.uk/search?hl=en&q=%22she+invented%22&btnG=Google+Search&meta= |
Stephen Tordoff | 17 years ago # |
Or worse, http://www.google.com/search?hl=en&q=what+have+women+invented%3F&btnG=Search |
TOMHTML | 17 years ago # |
Found via Digg, of course ;-) |
Stephen Tordoff | 17 years ago # |
The second one was, but I already knew about the first before reading the Digg article. |
Amy Forza | 17 years ago # |
Women Inventors A-Z....Look it up |
James Xuan | 17 years ago # |
"What have women invented" is not as good as your first one, Stephen |
ssvbhalla | 17 years ago # |
she invented |