Google Blogoscoped

TinEye is open to all, and much better than before

Roger Browne [PersonRank 10]

Wednesday, March 25, 2009
15 years ago | 5,563 views

Image search engine TinEye is now fully open (no invitation needed). I searched for the NASA moon photo of astronaut and flag, and was amazed that TinEye could match all kinds of variations:
http://quezi.com/5524

I'd love to know what kind of indexing they use. It's robust in the face of cropping, colour changes, skewing, editing, captioning, etc. Most impressive!

Philipp posted about TinEye in June 2008
http://blogoscoped.com/archive/2008-06-10-n27.html
but back then you needed an invitation, and the matching wasn't so good, and there weren't enough images in their index. It's really usable now.

The above comment was made in the forum before this was blogged.

Juha-Matti Laurio [PersonRank 10]

15 years ago #

Placing the service's URL here too: http://tineye.com/

James Xuan [PersonRank 10]

15 years ago #

Oh Quaero!

cMET [PersonRank 0]

15 years ago #

Ahmm.. it has been open for over a month:

http://blog.ideeinc.com/2009/02/05/new-releases-more-feedback/

:P

Roger Browne [PersonRank 10]

15 years ago #

@cMET: It's not hot breaking news, but it's still an update on the previous mention here.

@Philipp, your "worst match" link (in the blog post) is broken. TinEye will only give you a permalink if you are logged in and click on "History". All other links to search results expire after an hour.

Philipp Lenssen [PersonRank 10]

15 years ago #

Thanks Roger, I removed that link.

Ianf [PersonRank 10]

15 years ago #

I have a vague memory of this once being called IDEE Picture Search Engine or similar, with no basic explanation on the site of how it works. I realized from the start it could hardly be done by brute force alone – they had to be extracting some unique signature, perhaps several but still in chunks of manageable size, out of every picture, and searching primarily for other traces of these. Earlier, Philipp wrote:

" [...] At the core of their site is a so-called image fingerprinting algorithm. It works not only with identical images, but also pics that contain just a part of another image. "
http://blogoscoped.com/archive/2008-06-10-n27.html

which sounds plausible enough, but doesn't enlighten us as to how that fingerprinting is extracted/achieved(?). Earlier commenters thought it could be done the usual throw-servers-at-a-problem way, and as such ought to be done by Google, too. Martin Sochnacki (Wanted) sort-of concurred:

" [ Google has] other projects with higher priority, than throwing tens of thousands servers just to launch the image search with pattern recognition. "
http://blogoscoped.com/forum/133527.html#id133629

I, for one, do not think it can be done by "mere" massive parallel crunching of image fingerprints (and, besides, maintaining server farms of Google's size without a viable business model doesn't wash). TinEye therefore has to be doing something new and quite unexpected in a search engine context, which is why the method is unpublished, likely to remain that way, and most probably originally a chance discovery, or some "programming serendipity."

Those of you closer to "the know"... any SPECULATIVE idea as to what that secret algorithm might be doing? Hashed n-dim matrices, anyone?

Roger Browne [PersonRank 10]

15 years ago #

@Ianf: Yes, their technology must be very clever, and their results are way better than those of any of their competitors.

Their fingerprinting is resistant to:

* colour variation (because B&W images match colour or tinted images)
* size changes (bigger and smaller images match no problem)
* overlays (you can chuck loads of text over the image and it still matches)
* cropping ("astronaut+flag" matches "astronaut" and also matches "flag")
* collages (combining with other images)

Obviously somehow they must vastly reduce the data for each image to something that can be indexed and compared.

I considered these possibilities: indexing a very coarse wavelet compression, indexing a very coarse fractal compression, indexing some of the artifacts generated during JPEG compression, indexing some unconventional attribute of the image (such as the ratio of fine to rough texture).

All of these could be made to work if it weren't for the problem of matching partial images against full images. This leads me to think that they must be doing shape detection and matching individual parts of each image.
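To make the whole-image idea concrete, here is a minimal sketch of a coarse signature. It is a simple grid-average hash standing in for the wavelet or fractal variants mentioned above, and it is not TinEye's method. The names `coarse_signature` and `hamming` are made up for illustration, and the image is assumed to be a plain 2D list of greyscale values at least `grid` pixels in each dimension.

```python
def coarse_signature(pixels, grid=4):
    """Reduce a greyscale image (2D list) to grid*grid bits.

    Each bit says whether one grid cell is brighter than the
    average of all cells. Hypothetical helper, illustration only.
    """
    height, width = len(pixels), len(pixels[0])
    means = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            cell = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(cell) / len(cell))
    overall = sum(means) / len(means)
    return tuple(1 if m > overall else 0 for m in means)


def hamming(a, b):
    """Number of positions where two signatures differ."""
    return sum(x != y for x, y in zip(a, b))


# 8x8 test image: a bright square in the top-left corner.
small = [[200 if x < 4 and y < 4 else 20 for x in range(8)] for y in range(8)]
# Same picture at twice the size (each pixel duplicated).
big = [[small[y // 2][x // 2] for x in range(16)] for y in range(16)]
# Same picture, uniformly brightened.
brighter = [[v + 30 for v in row] for row in small]
# Left half only: the grid cells no longer line up with the original.
cropped = [row[:4] for row in small]

print(hamming(coarse_signature(small), coarse_signature(big)))
print(hamming(coarse_signature(small), coarse_signature(brighter)))
print(hamming(coarse_signature(small), coarse_signature(cropped)))
```

In this toy case the resized and brightened copies give the same signature as the original, while the cropped copy does not, which is the partial-image problem described above.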

Here is some speculation:

Suppose the outlines of the main shapes of each image are traced. Then, a few simple numerical values are calculated and indexed for each outline (e.g. tilt, width-height ratio, circumference/area ratio). That's quite promising. It could match cropped or scaled images provided at least one shape remains in common.

This algorithm wouldn't be able to match rotated or skewed images, but I don't see any of those among the search results.
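The outline values speculated about above could be computed roughly as follows. This is only a sketch of the guess, not anything known about TinEye. The function name `outline_descriptors` is hypothetical, an outline is assumed to be already traced as a list of (x, y) vertices with non-zero height and area, and the circumference/area ratio is replaced by perimeter squared over area (an assumption, made so the value does not change when the image is resized).

```python
import math


def outline_descriptors(points):
    """Describe a closed outline given as ordered (x, y) vertices.

    Returns width/height ratio, a scale-free perimeter-to-area
    measure, and the tilt of the main axis in radians.
    Hypothetical helper, illustration only.
    """
    n = len(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)

    # Shoelace formula for area, summed edge lengths for perimeter.
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(twice_area) / 2.0

    # Tilt from the second moments of the vertices.
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    tilt = 0.5 * math.atan2(2 * sxy, sxx - syy)

    return {
        "aspect": width / height,
        "compactness": perimeter ** 2 / (4 * math.pi * area),
        "tilt": tilt,
    }


rect = [(0, 0), (4, 0), (4, 2), (0, 2)]
# The same rectangle, three times larger and moved elsewhere.
moved = [(3 * x + 10, 3 * y - 5) for x, y in rect]
# The same rectangle rotated by 90 degrees.
rotated = [(0, 0), (0, 4), (-2, 4), (-2, 0)]

print(outline_descriptors(rect))
print(outline_descriptors(moved))
print(outline_descriptors(rotated))
```

The resized and shifted rectangle yields the same three values as the original, while the rotated one gets a different width-height ratio, consistent with the limitation noted above.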

And it would explain something that had been puzzling me: how can TinEye offer a "worst results first" search? You couldn't do this with Google search results, because they can't get to the "worst" result without calculating all the better ones first. But with traced shapes it's easy. Start looking for images which match one shape. For each matching image, try for matches on some of the other shapes. If you're looking for "worst matches", you've found one if only one shape matches. If you're looking for "best matches", you've found one if all of the shapes match.
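A small sketch of that best-first/worst-first idea, again purely speculative: it assumes each image has been reduced to a handful of shape keys (here plain strings standing in for quantised outline values), and the names `build_index` and `rank_matches` are invented for the example.

```python
from collections import defaultdict


def build_index(images):
    """Map each shape key to the set of image ids containing it."""
    index = defaultdict(set)
    for image_id, shape_keys in images.items():
        for key in shape_keys:
            index[key].add(image_id)
    return index


def rank_matches(index, query_keys, worst_first=False):
    """Rank images by how many of the query's shapes they share.

    Images sharing no shape with the query never appear, so
    "worst" means "fewest shapes in common, but at least one".
    """
    counts = defaultdict(int)
    for key in set(query_keys):
        for image_id in index.get(key, ()):
            counts[image_id] += 1
    sign = 1 if worst_first else -1
    return sorted(counts.items(), key=lambda item: (sign * item[1], item[0]))


# Hypothetical shape keys for a few indexed images.
images = {
    "full_photo": ["astronaut", "flag", "lander"],
    "flag_crop": ["flag"],
    "collage": ["astronaut", "flag", "tree"],
    "unrelated": ["cat"],
}
index = build_index(images)
query = ["astronaut", "flag", "lander"]

print(rank_matches(index, query))
print(rank_matches(index, query, worst_first=True))
```

Both orderings come out of the same lookup: the only difference is whether images sharing many shapes or few shapes are listed first.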

Of course there's still lots of engineering needed to get good reproducible outline traces, but I think it's doable this way.

David Sarokin [PersonRank 1]

15 years ago #

Turns out you can use TinEye to find your cartoon avatar's identical twin:

http://www.ehow.com/how_4875324_avatar-twin.html

Not earth-shaking news, but still, it's kind of funny.
