Google Blogoscoped

Can a Site Guess a User's Gender With JavaScript?

David H [PersonRank 1]

Wednesday, March 18, 2009
15 years ago, 5,795 views

Philipp, the link in the post is wrong.

A B [PersonRank 0]

15 years ago #

http://www.mikeonads.com/2008/07/13/using-your-browser-url-history-estimate-gender/

Philipp Lenssen [PersonRank 10]

15 years ago #

Funny. Thanks A B, post updated!

David, thanks, I fixed the link.

samblackjack [PersonRank 1]

15 years ago #

Interesting idea. I wonder if there are some sites which are already doing something similar, maybe not based on gender, but other things. Google is already displaying ads based on the content of what's on the screen – search terms, email content, blog content, etc. I wouldn't be completely surprised if Google was changing what those ads show based on other sites visited by the same user, since they can track other sites you have visited already.

Avrohom Eliezer Friedman (AEF) [PersonRank 10]

15 years ago #

Arrrrrg.

I'm a female – scary

(Or there is a 51% shot of me being female. I guess I just like girly sites. Even scarier!)

James Xuan [PersonRank 10]

15 years ago #

I was 100% Male, despite visiting Jezebel...

noname [PersonRank 4]

15 years ago #

The problem is that there is no typical men's or typical women's site. E.g., we have a big men's magazine here, but around 40% of its readers are women; they simply want to read what men want.

noname [PersonRank 4]

15 years ago #

P.S.: It could work if I could see not just one but all referrers of the same session. But this is not possible (for security reasons).

noname [PersonRank 4]

15 years ago #

Oops, sorry, it took too much time to run. Now I understand: they are testing the visited state of links to some known sites. Then OK, it can work.

Sterling [PersonRank 0]

15 years ago #

What I found interesting about the link A B gave was that, at least in my case, it got my gender correct through the "geek factor". IOW, many of the sites it sniffed out of my history were developer-oriented (sourceforge, php.net, etc.), and presumably from there the well-known male bias among programmers nailed me.

I'd be interested to see the results for less technical users.

noname [PersonRank 4]

15 years ago #

Hmm, I got a big speed improvement: delete sites with a male-to-female ratio near 1 (like youtube.com).
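
The speed-up makes sense if the estimate multiplies per-site male-to-female ratios together (an assumption about the script, not something confirmed in this thread): a ratio of 1 leaves the product unchanged, so checking that site costs time and tells you nothing. A minimal sketch of the pruning step follows; the site names other than youtube.com, all ratio values, and the `tolerance` parameter are made up for illustration.

```python
import math

# Hypothetical male-to-female audience ratios; the numbers are illustrative only.
SITE_RATIOS = {
    "youtube.com": 1.0,
    "example-dev-site.test": 2.5,
    "example-parenting-site.test": 0.4,
    "example-news-site.test": 1.05,
}


def prune_uninformative(ratios, tolerance=0.1):
    """Drop sites whose ratio is close to 1.

    Closeness is measured on a log scale so that 2.0 and 0.5 count as
    equally informative. A site is kept only if |log(ratio)| > tolerance.
    """
    return {
        site: ratio
        for site, ratio in ratios.items()
        if abs(math.log(ratio)) > tolerance
    }


pruned = prune_uninformative(SITE_RATIOS)
```

With these made-up numbers only the two strongly skewed sites survive, so the script would have half as many links to test.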

Trey [PersonRank 0]

15 years ago #

http://gwap.com (Games With a Purpose... the reCAPTCHA and Google Image Labeler people) has a test on its homepage (or at least did) where the user picks which of a few photos they like best, and the site gives the probability of the user's gender. It seems a bit elementary (girls click on babies, guys click on cars) but can definitely be refined over time.

Pawankumar Nathani [PersonRank 0]

15 years ago #

The algorithm is a very good one and could be very useful for displaying ads.

The question of whether the person is male or female is irrelevant. The main question is whether we are able to determine the preferences of the user.

If we are convinced that the user prefers certain kinds of sites and the ads are displayed accordingly, then whoever that person is (he or she), the ads are rightly targeted.

The same logic applies to other things like age, etc.

The only risk: what if this kind of information is used by a pervert or misused by anyone? That "what if" is a dangerous one.

Veky [PersonRank 10]

15 years ago #

> On that note, would it even be technically possible (and wanted?) to stop this hack from working in future browsers, that is, without breaking many completely unrelated scripts? <

Of course. The easy way to see that it would be possible is that those scripts continue to work after the cache and browsing history are cleared. For example, those JavaScript functions could just assume all links are unvisited (return the style of unvisited links), and all would be fine. But as you say, it's probably not a big problem, especially since you have to have an explicit list of sites you want to check against.

> Sometimes, two users may share the same browser, hence their URL histories may be mixed up. On the other hand, this could also mean that no confidence threshold would be crossed by the algorithm, so it may be OK after all. <

Probably. For example, my girlfriend and I use the same browser, and the above script says we are collectively 54% male. I guess I use the computer a bit more than her. ;-)

> Is it even possible to find a sufficient list of “typically male/ typically female” websites? Is there even such a thing as a “typically male” site? <

Probably not. But Bayesian analysis can be easily applied to this... there is always a probability that a random user of the given site is male, which can be entered into Bayes' formula; it doesn't have to be close to 0 or 1.
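
To make the Bayes point concrete, here is a minimal sketch, not the actual script's code. It assumes a naive-Bayes combination: visits to different sites are treated as independent given gender, and each site contributes the probability that one of its random visitors is male. The function name, the 0.5 prior, and the sample probabilities are all made up for illustration.

```python
def posterior_male(p_male_given_site, prior_male=0.5):
    """Estimate P(male | visited all the given sites), naive-Bayes style.

    p_male_given_site: iterable of P(male | visits site), each strictly
    between 0 and 1. Visits are assumed independent given gender.
    """
    prior_odds = prior_male / (1 - prior_male)
    odds = prior_odds
    for p in p_male_given_site:
        # By Bayes' formula, the likelihood ratio
        # P(visit | male) / P(visit | female) equals the site's
        # posterior odds divided by the prior odds.
        odds *= (p / (1 - p)) / prior_odds
    return odds / (1 + odds)


# Three mildly male-skewed sites, none of them "typically male".
estimate = posterior_male([0.6, 0.7, 0.55])
```

Here `estimate` comes out at about 0.81, even though no single site is above 0.7: several weak signals add up, which is why the per-site probabilities don't have to be close to 0 or 1.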

> Would the algo only identify the “soap opera clichee male/ female” (loves football, beer, fast cars, must be male!)? <

Cliches are much more accurate when they are averaged across many diverse branches of culture. I'm male and I don't like football, beer, or fast cars, but I like programming, maths and Wikipedia browsing. A few cliches are enough. :-)

> would it be enough to just make your algorithm’s goal to identify one gender – say, you’ll only try identifying male gender based on typical male sites – and then you simply assume if you don’t sufficiently identify that gender, you’ll assume the person is of the other gender... <

The historical perspective says no. Every time we tried to proclaim a "default" gender (the other being the "non-default" one), a few hundred years later we would realize our mistake. ;-P

> ... on a side-note: could the same algorithm approach also be used to guess a person’s age (user visiting a lot of kids sites?), their tastes and preferences, etc.? <

Theoretically (although, guessing among 2 discrete possibilities is much easier than guessing on a continuum). But as Pawankumar Nathani said above, all you're doing in fact is guessing their tastes and preferences, at least projected to web browsing.

Which brings us to an interesting conclusion: the purpose you want to use a user's gender for will probably benefit more from "tastes and preferences" than from the actual gender of that user. For example, if someone is male but has tastes and web preferences similar to a woman's, he'd probably be happier if your script "detected" (addressed) him as female. We are increasingly living in a society where gender is a matter of choice... what's in a person's 23rd chromosome pair doesn't matter much.

Travis Harris [PersonRank 10]

15 years ago #

Not funny!

Likelihood of you being FEMALE is 96%
Likelihood of you being MALE is 4%

This is on my laptop, so I am the primary user!

Vassil Hristov [PersonRank 1]

15 years ago #

Damn – 67% probability that I'm a girl! Argh! :D
