A search for [Assange +France] returns plenty of results about "Julian Assange" and the "Tour de France", as both topics are making headlines today.
A page with "Julian Assange" in the title and "France" in the "related articles" section is absolutely not relevant to my search, but I understand why Google returns it. However, I find it quite surprising to see many pages that have no mention of France whatsoever. Judging from the snippets, the word "France" was indexed because it appeared in a section such as "most emailed articles today". But why is Google unable to ignore content that isn't part of the 'actual' page?
Is it that hard to distinguish temporary content from long-term content? Can search engines solve this issue in the near future?
For instance, here's the 4th result for [Assange +France] right now
and here's the linked article bbc.co.uk/news/world-us-canada ... Now, how can Google come up with singing French nuns for a story about Wikileaks?
Google's newsbot is able to distinguish between article content and other content, but it often fails.
For example, there are many problems with dates when they are posted outside the article.
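To give an idea of why this is harder than it looks, here is a rough sketch of one well-known heuristic for separating article text from navigation: link density. Blocks whose text is mostly link text (such as "related articles" or "most emailed" lists) are treated as boilerplate and skipped. This is only an illustration, not how Google actually does it; the names `BlockCollector` and `main_content`, the 0.5 threshold, and the sample page are all made up for the example.

```python
from html.parser import HTMLParser

# Tags treated as block boundaries (an assumption for this sketch).
BLOCK_TAGS = {"p", "div", "li", "ul", "h1", "h2", "h3", "section", "aside"}


class BlockCollector(HTMLParser):
    """Split a page into text blocks and count the link text in each block."""

    def __init__(self):
        super().__init__()
        self.blocks = []       # list of (text, number of characters inside <a>)
        self._text = []
        self._link_chars = 0
        self._in_link = 0

    def flush(self):
        # Close the current block, normalising whitespace.
        text = " ".join("".join(self._text).split())
        if text:
            self.blocks.append((text, self._link_chars))
        self._text = []
        self._link_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.flush()
        elif tag == "a":
            self._in_link += 1

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.flush()
        elif tag == "a" and self._in_link:
            self._in_link -= 1

    def handle_data(self, data):
        self._text.append(data)
        if self._in_link:
            self._link_chars += len(data.strip())


def main_content(html, max_link_density=0.5):
    """Return the text of blocks that are not dominated by link text."""
    collector = BlockCollector()
    collector.feed(html)
    collector.close()
    collector.flush()
    kept = [text for text, link_chars in collector.blocks
            if link_chars / len(text) <= max_link_density]
    return " ".join(kept)


# Made-up sample page: an article body plus a "most emailed" list of links.
PAGE = """
<h1>Assange story</h1>
<p>This paragraph stands in for the body of an article about Assange.</p>
<ul>
<li><a href="/1">Singing nuns from France</a></li>
<li><a href="/2">Tour de France</a></li>
</ul>
"""

article_text = main_content(PAGE)
print("France" in PAGE, "France" in article_text)
```

On this toy page the word "France" only occurs in the link list, so it is present in the raw page but not in the extracted article text. Real pages are much messier (sidebars with plain text, short article paragraphs full of links, dates placed outside the article), which is probably why even a dedicated extractor often gets it wrong.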