Google Blogoscoped

Wednesday, March 18, 2009

Can a Site Guess a User’s Gender With JavaScript?

Update: Hah. The whole thing was already thought up, and implemented as a sample, some time ago. [Thanks A B!]

My friend Nikolai and I were brainstorming a site we’ve planned, and the topic of adjusting the site’s content based on gender came up. In this case, knowing the gender would be a bit helpful but not crucial (not crucial enough to bother asking users to provide the info themselves), and getting it 100% right would not be necessary either. (For the site we discussed, we also wouldn’t store any user data such as names or email addresses, so changing the site’s content based on gender would be a temporary, per-session thing.) So we were wondering whether the following approach would be technically feasible (note that we have not yet judged whether it should be done even if it could be; we were only brainstorming):

  1. Identify a couple of hundred or so websites which are typically visited by men, and then identify a couple of hundred sites typically visited by women. For instance, one could check the Alexa top sites for some inspiration, or enter such things as “top woman’s portals” into Google. One may also be able to simply get a long list of popular websites and then ask the people at Mechanical Turk whether or not they’ve visited a particular site, and then ask them if they’re male or female.
  2. With that list in hand, one could use the JavaScript history hack which works in many common browsers. The basis of this hack is that it allows you, given a list of URLs, to check whether the user visited any of the URLs before. This can be done because you can use a hidden layer which outputs the links, with a different (CSS) style assigned to visited vs unvisited links, and then, via JavaScript, go through the links to check their applied styling to see whether the URL shows as visited. You can check the sample I published here a while ago.
  3. Now for every “typically male” site the user visited you could add a point, and for every “typically female” site you could subtract a point (or subtract for men and add for women, as you prefer). Then if you end up with a number crossing a certain confidence threshold (say, more than 50 points plus or more than 50 points minus) you could guess that the user is a man or a woman, respectively, and submit that data via a form, Ajax or what-not. (If no confidence threshold is reached, your site takes a neutral stand in terms of whatever you wanted to change based on gender.)
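
To make steps 2 and 3 a bit more concrete, here is a rough sketch of how they might fit together. It is only an illustration under assumptions: the function names, the `probe` container id, and the `rgb(255, 0, 0)` marker color are all made up for this example, and the style check relies on a stylesheet rule like `#probe a:visited { color: rgb(255, 0, 0); }` that you would have to supply yourself. Note also that current browsers deliberately report the unvisited style to scripts, so the check is passed in as a function and the scoring part can be tried without it.

```javascript
// Step 2 (browser only): the style-based history check described above.
// Assumes a hidden <div id="probe"> and a CSS rule
//   #probe a:visited { color: rgb(255, 0, 0); }
// Both the id and the color are hypothetical choices for this sketch.
function isVisitedViaStyle(url) {
  const probe = document.getElementById("probe");
  const link = document.createElement("a");
  link.href = url;
  probe.appendChild(link);
  // Read the style the browser actually applied to the link.
  const color = window.getComputedStyle(link).color;
  probe.removeChild(link);
  return color === "rgb(255, 0, 0)";
}

// Step 3: +1 for every visited "typically male" site,
// -1 for every visited "typically female" site.
// isVisited is any function (url) => boolean, e.g. isVisitedViaStyle.
function scoreVisitor(maleSites, femaleSites, isVisited) {
  let score = 0;
  for (const url of maleSites) {
    if (isVisited(url)) score += 1;
  }
  for (const url of femaleSites) {
    if (isVisited(url)) score -= 1;
  }
  return score;
}

// Only guess when the score is strictly beyond the confidence threshold;
// otherwise stay neutral.
function guessGender(score, threshold) {
  if (score > threshold) return "male";
  if (score < -threshold) return "female";
  return "neutral";
}
```

In a page, one would call something like `guessGender(scoreVisitor(maleSites, femaleSites, isVisitedViaStyle), 50)` with the two site lists from step 1, and then send the result along via a form or Ajax. Keeping the visited check as a parameter means the scoring and threshold logic can be tested with a plain stand-in function.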

Now, here are some things to consider:

Thoughts and comments... ?



