Google Blogoscoped

New Google Adobe Team Up for Flash

beussery [PersonRank 10]

Tuesday, July 1, 2008
16 years ago, 5,848 views

Big announcement here:
http://googleblog.blogspot.com/2008/06/google-learns-to-crawl-flash.html

beussery [PersonRank 10]

16 years ago #

"Now that we've launched our Flash indexing algorithm, web designers can expect improved visibility of their published Flash content, and you can expect to see better search results and snippets."

beussery [PersonRank 10]

16 years ago #

More here:
http://googlewebmastercentral.blogspot.com/2008/06/improved-flash-indexing.html

& here:

http://www.adobe.com/aboutadobe/pressroom/pressreleases/200806/070108AdobeRichMediaSearch.html

* Miss Universe [PersonRank 7]

16 years ago #

[moved from "Adobe Flash (SWF) Files NOW Searchable by Google & Yahoo by New Tech"]

http://www.adobe.com/aboutadobe/pressroom/pressreleases/200806/070108AdobeRichMediaSearch.html

the company is teaming up with search industry leaders to dramatically improve search results of dynamic Web content and rich Internet applications (RIAs). Adobe is providing optimized Adobe® Flash® Player technology to Google and Yahoo! to enhance search engine indexing of the Flash file format (SWF) and uncover information that is currently undiscoverable by search engines. This will provide more relevant automatic search rankings of the millions of RIAs and other dynamic content that run in Adobe Flash Player....

The openly published SWF specification describes the file format used to deliver rich applications and interactive content via Adobe Flash Player, which is installed on more than 98 percent of Internet-connected computers. Although search engines already index static text and links within SWF files, RIAs and dynamic Web content have been generally difficult to fully expose to search engines because of their changing states — a problem also inherent in other RIA technologies.

* Miss Universe [PersonRank 7]

16 years ago #

[moved]

http://googlewebmastercentral.blogspot.com/2008/06/improved-flash-indexing.html

Q: Which Flash files can Google better index now?
We've improved our ability to index textual content in SWF files of all kinds. This includes Flash "gadgets" such as buttons or menus, self-contained Flash websites, and everything in between.

Q: What content can Google better index from these Flash files?
All of the text that users can see as they interact with your Flash file. If your website contains Flash, the textual content in your Flash files can be used when Google generates a snippet for your website. Also, the words that appear in your Flash files can be used to match query terms in Google searches.

In addition to finding and indexing the textual content in Flash files, we're also discovering URLs that appear in Flash files, and feeding them into our crawling pipeline—just like we do with URLs that appear in non-Flash webpages. For example, if your Flash application contains links to pages inside your website, Google may now be better able to discover and crawl more of your website.
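The FAQ gives no implementation details for this URL discovery step, so the following is only a minimal sketch of the general idea: scan the text strings harvested from a file for URLs and append unseen ones to a crawl queue. The regular expression, the function name `enqueue_discovered_urls`, and the `frontier`/`seen` structures are all assumptions made for illustration, not Google's pipeline.

```python
import re
from collections import deque

# Assumed, deliberately simple URL pattern; real crawlers use far more
# careful parsing and normalization.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def enqueue_discovered_urls(text_strings, frontier, seen):
    """Append each not-yet-seen URL found in text_strings to the frontier."""
    for text in text_strings:
        for url in URL_PATTERN.findall(text):
            if url not in seen:
                seen.add(url)
                frontier.append(url)
    return frontier


# Hypothetical strings extracted from a Flash file.
harvested = [
    "Visit http://example.com/products now",
    "See http://example.com/products and http://example.com/about",
]
frontier = enqueue_discovered_urls(harvested, deque(), set())
print(list(frontier))
```

The `seen` set is what keeps a link that appears in several places from being queued more than once.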

Q: What about non-textual content, such as images?
At present, we are only discovering and indexing textual content in Flash files. If your Flash files only include images, we will not recognize or index any text that may appear in those images. Similarly, we do not generate any anchor text for Flash buttons which target some URL, but which have no associated text.

Also note that we do not index FLV files, such as the videos that play on YouTube, because these files contain no text elements.

Q: How does Google "see" the contents of a Flash file?
We've developed an algorithm that explores Flash files in the same way that a person would, by clicking buttons, entering input, and so on. Our algorithm remembers all of the text that it encounters along the way, and that content is then available to be indexed. We can't tell you all of the proprietary details, but we can tell you that the algorithm's effectiveness was improved by utilizing Adobe's new Searchable SWF library.
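Google says the details are proprietary, so what follows is not its algorithm. It is a minimal sketch of the general technique the answer describes, assuming a Flash movie can be modeled as a set of states, each with some visible text and some clickable actions that lead to other states. The `MOVIE` dictionary and the function `harvest_text` are hypothetical and exist only for this illustration.

```python
from collections import deque

# Hypothetical toy model of a Flash movie: every state has visible text
# and a mapping from button labels to the state each button leads to.
MOVIE = {
    "home": {
        "text": "Welcome to Acme",
        "actions": {"products": "products", "about": "about"},
    },
    "products": {"text": "Widgets and gadgets", "actions": {"back": "home"}},
    "about": {
        "text": "Founded in 1999",
        "actions": {"back": "home", "contact": "contact"},
    },
    "contact": {"text": "Email us", "actions": {}},
}


def harvest_text(movie, start):
    """Breadth-first walk over reachable states, remembering visible text."""
    seen = {start}
    queue = deque([start])
    collected = []
    while queue:
        state = queue.popleft()
        collected.append(movie[state]["text"])
        # "Click" every button and queue any state not visited before.
        for target in movie[state]["actions"].values():
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return collected


print(harvest_text(MOVIE, "home"))
```

The sketch only covers button clicks. Simulating typed input, which the answer also mentions, would need some source of candidate strings, and the FAQ does not say where those come from.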

Q: What do I need to do to get Google to index the text in my Flash files?
Basically, you don't need to do anything. The improvements that we have made do not require any special action on the part of web designers or webmasters. If you have Flash content on your website, we will automatically begin to index it, up to the limits of our current technical ability (see next question).

http://www.techcrunch.com/2008/06/30/once-nearly-invisible-to-search-engines-flash-files-can-now-be-found-and-indexed/

TechCrunch readers chime in

TOMHTML [PersonRank 10]

16 years ago #

Yes, Googlebot was ALREADY able to read Flash files. They just improved their bot a bit.

The above 6 comments were made in the forum before this was blogged.

Colin Colehour [PersonRank 10]

16 years ago #

This is being posted everywhere now:
http://blogs.zdnet.com/BTL/?p=9224

Anyone find it interesting that they are working with Google and Yahoo but not Microsoft? Do they not care about Live.com searching of Flash files?

Tadeusz Szewczyk [PersonRank 10]

16 years ago #

So it seems there remain more problems than solutions...

Ianf [PersonRank 10]

16 years ago #

I know ?why? they do it, but still cannot get over the "... bother" part. Flash/AIR objects excel at showing pseudo-dynamic pictorial and/or audio content --which Google neither can, nor intends to index-- flossed with varying lengths of textual matter. Which is the segment Google can index, along with, no doubt, internal adm. flotsam. And yet there is a common, humanly-grokkable mechanism for indexing such impenetrable/ walled binary or otherwise encoded gardens that are Flash and the like – inclusion by the poster of a descriptive title arg. within the calling wrapper. That's too much of a bother? – goodbye Google-presence! (tune by Elton John, travesty of lyrics by Bernie Taupin).

As for the nitty-gritty, this part in particular caught my eye (EMPHASIS MINE):

» Q: How does Google "see" the contents of a Flash file?

A: We've developed an algorithm that explores Flash files in the same way that a person would, by clicking buttons, ENTERING INPUT, and so on. Our algorithm remembers all of the text that it encounters along the way, and that content is then available to be indexed. «

Where does Google get the initial, contextual text to enter as input, I wonder. Unless the algorithm throws every combination of inside-harvested strings at its Flash-parser, that can but come from the very search terms that threw up the Flash object to begin with, OR, what's more probable, already were associated with the same file as results of previous, yet to this query unconnected, searches. Somehow that last bothers me a lot, dunno why.

_____ [PersonRank 1]

16 years ago #

I just hope Google allows searches to throw out results that originate in Flash. Or better yet, make it an opt-in search option.

Ianf [PersonRank 10]

16 years ago #

Not much hope in that unless you discount the default "leading-minus" deselector. Thanks to the culture of tagging, today's Google search results are already choking with YouTube returns. That, at least, is easy to filter out by its unique depository name. But Flash files are from anywhere, and, if served explicitly, need not even carry their semaphore suffix. So weeding them out by default will not be so simple a matter. Neither is offering a preselected option to restrict the return to only certain doc types (say the textual ones – .txt, .html, .pdf and .doc) really in the interest of Google....

Ianf [PersonRank 10]

16 years ago #

On an unrelated-but-related level, I *VEHEMENTLY* object to being subjected to so pedestrian a test of me human worth as that, which the heartless Blogorobot just spewed out at me:

"Small bot/ human check, what's 1 + 1?"

What does it take me for, that I couldn't put two and two together?! And what if I don't feel like answering in decimal, or even numerical terms?

Roger Browne [PersonRank 10]

16 years ago #

@Ianf: OK, instead you can answer "what's the twelfth root of 4096" (the answer will be the same, but you will not feel so pedestrian).
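For anyone who wants to check the arithmetic: 2 raised to the 12th power is 4096, so the twelfth root of 4096 is 2, the same as 1 + 1. A small sketch that confirms it with exact integer arithmetic follows; the helper name `integer_root` is made up for this example.

```python
def integer_root(value, degree):
    """Return the exact non-negative integer root if one exists, else None."""
    # Floating point gives a starting guess; the candidates around it are
    # then verified exactly with integer exponentiation.
    candidate = round(value ** (1 / degree))
    for guess in (candidate - 1, candidate, candidate + 1):
        if guess >= 0 and guess ** degree == value:
            return guess
    return None


assert integer_root(4096, 12) == 1 + 1
```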

Ianf [PersonRank 10]

16 years ago #

Roger, why so obsessed with numbers, and then such simplistic addition as sole test of humankind? You know as well as I do that it won't wash in the end. Were I bent on spamming this forum, it'd take me, oh I don't know, the better part of a coffee-break to write a superior robot to defeat Philipp's robot. If we are to be quizzed at all, we need better, and more intriguing, quizzes, with fuzzy-logick range of passable, preferably verbal, answers to verbose questions of meta-physical nature. That, and an instant-onliney procedure of appeal to some higher, independent arbitrator, in case our attempts to pass have been rejected.

milivella [PersonRank 10]

16 years ago #

Interesting article about the meaning of the Adobe-Google (and Yahoo) partnership:
http://remiel.info/post/40766424/on-googles-web-the-user-is-1-google-is-0
