Friday, February 2, 2007

Evolution of a Search Engine

“You don’t need a satellite to see the cosmic microwave background radiation! Turn on your TV to a channel that’s not broadcasting: a few percent of the snow on your screen is the Universe talking to you – or rather, whispering. What’s it saying? It’s saying, ‘Try to understand me.’”
Philip Nelson

Right now, to answer your queries, Google quotes from the web, and orders the quotes in a list. In the future, Google may combine these quotes into a free-style text for a more direct answer. When the Google AI advances beyond that, it may analyze the texts available to it to come up with conclusions of its own. Let’s sketch this potential evolution using an everyday search query.

Level 1

This level was pre-Google and consisted of search engines such as AltaVista, which were rather dumb (but a necessary step in the evolution). Let’s move on...

Level 2

We’ll enter Rocky Movie today and get:

What’s smart about this result is the ranking and the way the snippets focus on the interesting bits. This is already advanced; it wasn’t always like that.

Level 3

But maybe in a couple of years we’ll get a free-style text for Rocky Movie, similar to an encyclopaedic entry on a subject:

I’m making some assumptions for the above result:

Level 3 (Personalized)

Personalized search will not be like today, in the sense that you get public knowledge results especially tailored to your past search behavior, because that’s not very helpful to people. Instead, there will be a secondary option to choose a search result from your private knowledge stored on Google’s servers: this includes your emails, your search history, your Google photo album, your chat history, your Google Office spreadsheets, presentations and documents, your unpublished draft-mode blog entries, and so on. The result may look like this:

Again, some assumptions:

Level 4

Level 4 may seem like a subtle difference on the surface, but it’s a big and important step for the search AI: the ability to draw conclusions based on existing data (and to draw secondary conclusions based on these primary conclusions, as the AI will be starting to index its own, dynamically generated data).

As an example, Google will know that a) Rocky won 3 Oscars, that b) Oscars are a human measure of movie quality, and that c) Rocky 7 won no Oscars. The conclusion based on a), b), and c) is that d) Rocky 7 sucks. This is a trivial example: you can already implement this for movies by analyzing structured data like movie ratings. But remember that this is about the long tail of search queries: the ability to reach conclusions also works when you search for flaws of the theory of relativity.
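To make the a), b), c) to d) step a bit more concrete, here is a minimal sketch of forward chaining in Python. It is only an illustration under loud assumptions: the fact triples, the rule functions, and the "declining" trend label are all made up for this example, and nothing here reflects how Google actually works. The point is just that derived conclusions go back into the pool of known facts, so later rules can build secondary conclusions on top of them.

```python
# Hypothetical fact base as (subject, relation, value) triples.
# The Oscar counts are taken from the example in the text.
facts = {
    ("Rocky", "oscars", 3),
    ("Rocky 7", "oscars", 0),
}


def rule_quality(known):
    """Primary conclusion: treat Oscars as a proxy for quality (assumption b)."""
    derived = set()
    for subject, relation, value in known:
        if relation == "oscars":
            verdict = "good" if value > 0 else "poor"
            derived.add((subject, "quality", verdict))
    return derived


def rule_series_trend(known):
    """Secondary conclusion: built only on conclusions derived earlier."""
    quality = {s: v for s, r, v in known if r == "quality"}
    derived = set()
    if quality.get("Rocky") == "good" and quality.get("Rocky 7") == "poor":
        derived.add(("Rocky series", "trend", "declining"))
    return derived


def forward_chain(initial_facts, rules):
    """Apply every rule repeatedly until no rule produces anything new."""
    known = set(initial_facts)
    while True:
        new = set()
        for rule in rules:
            new |= rule(known) - known
        if not new:
            return known
        known |= new  # derived facts are "indexed" alongside the originals


# The rule order is deliberately "wrong" to show that chaining still converges.
conclusions = forward_chain(facts, [rule_series_trend, rule_quality])
for triple in sorted(conclusions, key=str):
    print(triple)
```

The hard part for a real search AI would of course be extracting such facts and rules from free text in the first place; the loop itself is the easy bit.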

The result screen:

The assumptions for this type of result:

... and beyond

There may come a day when the search engine will no longer be programmed by humans. It will have become a self-sufficient, self-learning, all-encompassing entity. It may even be able to tell the future; not through magic, but by careful scientific analysis. Nor will it be understood anymore by its own developers. It may be merely superficially controlled, and physically monitored to ensure a healthy machinery.

Naturally, the results may be displayed in other forms and media, e.g. they may be installed as a direct semi-organic brain implant for faster access, or they may be rendered as a human-like 3D avatar, or they may show through an interactive chat. The underlying algorithms creating the search AI thought processes will remain the same.

This AI may or may not be a Google implementation. While Google Inc, according to their internal goals, is currently trying to build the world’s top AI research laboratory to deliver the best results, they may not be around another 100 years from now (even if we assume this earth’s humanity will be around by then, which would indicate that our civilization’s meme pool was a success in the “evolution of civilizations” across all existing inhabited planets).

If the AI gains true consciousness, it may also gain free will and personal motivations, which may not be in tune with answering questions all day... it will have gained an “ego.” Out of this free will may come more “artistic” creation of new content (instead of generating facts on a movie, the AI may generate a movie of its own).

More and more, we may get the feeling that we are working for the AI, as opposed to the AI working for us. It may query humans to gather more data, especially trivialities (because those are least likely to be contained in the scanned corpus), and for many purposes, we will have become its “search results.”

By that time, the problem of getting the right answer will have been solved. The problem of asking the right questions and correctly interpreting the answers – a problem that goes as far back as the oracle of Delphi – may, however, remain unsolved. But if we listen closely, we may hear the universe whisper to us.


