Google Blogoscoped

Friday, December 2, 2005

Connecting People Via Wikipedia and Term Extraction

Matt Biddulph found a very cool way to create a meaningful “people network” via the Yahoo API. In a nutshell, here’s how it works:

  1. You have a flat list of names, like Margaret Thatcher, Roy Jenkins, David Marshall, Tony Blair, Winston Churchill etc., and you want to know which of these people are connected to each other.
  2. For every name, you search Yahoo for the top Wikipedia page; for Margaret Thatcher, the query would be “margaret thatcher” (you can use the Yahoo REST API for this).
  3. You take the text from the resulting Wikipedia page and run Yahoo’s Term Extraction API on it.
  4. You’ll end up with a list of extracted terms, like baroness thatcher, woman, tony blair, political philosophy. Some of these map back to the original list of names... and voilà, you’ve got the connection!
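
To make the matching in step 4 concrete, here’s a minimal Python sketch. It’s not Matt’s code, and it doesn’t call any real Yahoo endpoint: fetch_page_text and extract_terms are hypothetical stand-ins you would have to wire up to a search and a term extraction service yourself. The demo stubs them out, reusing the example terms from above for Margaret Thatcher; the empty term lists for the other names are placeholders, not real results.

```python
from typing import Callable, Dict, Iterable, List, Set


def find_connections(
    names: Iterable[str],
    fetch_page_text: Callable[[str], str],      # hypothetical: name -> page text
    extract_terms: Callable[[str], List[str]],  # hypothetical: text -> key terms
) -> Dict[str, Set[str]]:
    """Map each name to the other listed names found among its page's terms."""
    names = list(names)
    # Compare in lowercase so "Tony Blair" matches the term "tony blair".
    by_key = {name.lower(): name for name in names}
    connections: Dict[str, Set[str]] = {}
    for name in names:
        text = fetch_page_text(name)                      # step 2: get the page
        terms = {t.lower() for t in extract_terms(text)}  # step 3: extract terms
        # Step 4: keep only the terms that map back to the original list.
        connections[name] = {
            by_key[t] for t in terms if t in by_key and by_key[t] != name
        }
    return connections


# Demo with stubs instead of live API calls.
stub_terms = {
    "Margaret Thatcher": [
        "baroness thatcher", "woman", "tony blair", "political philosophy",
    ],
    "Roy Jenkins": [],  # placeholder, not a real extraction result
    "Tony Blair": [],   # placeholder, not a real extraction result
}

result = find_connections(
    ["Margaret Thatcher", "Roy Jenkins", "Tony Blair"],
    fetch_page_text=lambda name: name,                   # stub: "text" is the name
    extract_terms=lambda text: stub_terms.get(text, []),
)
print(result["Margaret Thatcher"])  # {'Tony Blair'}
```

Note that the link is directed: Thatcher’s page mentions Blair, which doesn’t automatically mean Blair’s page mentions Thatcher. If you want an undirected network, you’d add the reverse edge yourself.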

I assume this works with things other than people, too... like movie titles, TV shows, band names and so on. Another approach would be to calculate the Googleshare to find the relation between any two things; I wonder if the two approaches produce networks of similar structure.
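
For comparison, here’s a rough sketch of the Googleshare idea, assuming the usual description of it: the number of search results mentioning both things, divided by the number of results mentioning the first thing alone. The hit_count function is a hypothetical stand-in for whatever search API you use, and the counts in the demo are invented purely to show the arithmetic.

```python
from typing import Callable


def googleshare(subject: str, other: str, hit_count: Callable[[str], int]) -> float:
    """Share of the subject's search results that also mention the other term."""
    base = hit_count(f'"{subject}"')
    if base == 0:
        return 0.0  # no results for the subject, so no share to speak of
    both = hit_count(f'"{subject}" "{other}"')
    return both / base


# Invented counts, for illustration only.
fake_counts = {
    '"margaret thatcher"': 200,
    '"margaret thatcher" "tony blair"': 50,
}

share = googleshare(
    "margaret thatcher", "tony blair",
    hit_count=lambda query: fake_counts.get(query, 0),
)
print(share)  # 0.25
```

Like the term extraction approach, this measure is asymmetric: the share of A’s results that mention B is generally not the share of B’s results that mention A.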

[Via Yahoo.]

