Google Blogoscoped

- Page II -

Tuesday, May 6, 2008

Google’s Head of PR Leaves for Facebook

Elliot Schrage has left Google to join Facebook, BoomTown reports. Google on their management information page explains Elliot is or was “Vice President, Global Communications & Public Affairs” and “responsible for the company’s public-facing communications, including media relations, policy strategy and stakeholder outreach, as well as internal communications.” At Google Press Day 2006, Elliot said, “The Google Story is getting more complicated and complex everyday.” Also in 2006, Elliot testified before the US House of Representatives about Google’s censorship in China.

BoomTown quotes from an internal memo in which Facebook’s CEO, Mark Zuckerberg, welcomes Elliot:

... Elliot Schrage will be joining our management team as VP Communications and Public Policy. In this role, he will be responsible for developing the key messages we want people to understand about our products, our business and the growing global importance of social networking and what we do. The goal here is to help people understand how the internet can strengthen people’s relationships. Elliot will direct our efforts to work with users, media, governments and other entities around the world to ensure that Facebook’s policies are transparent, responsive, effective and are recognized as being those things.

BoomTown writes that sources say Schrage was interested in Facebook “because it was a company poised for explosive growth, much like Google in its early days.”

[Thanks Juha-Matti Laurio! Image based on Google photo.]

Google the "Hefner Mansion of the 21st Century"

Jon Carroll of the San Francisco Chronicle today writes about his experience visiting the Google headquarters, and he’s reminded of another building:

Some years ago, I spent a year taking meetings at the Playboy mansion, both the one in Chicago (RIP) and the one in Los Angeles. It had a lot of amenities – I was never more than 15 feet from food and extremely good red wine, and the chairs were comfortable enough for the all-nighters we routinely pulled. There were scantily clad women about, too, and occasionally I passed one in the halls. It was a lot better than a regular office.

I think maybe Google is the Hefner mansion of the 21st century. It too rises from a fantasy – what if you had all the information in the universe at your fingertips? – and it too has sensual amenities to ease the workload. Since the whole idea of Google, it would seem, is that no one should ever have to leave work, one does wonder how it handles courtship rituals and related activities. I imagine they are handled in some officially unofficial way, though, because the Google environment is very carefully thought through.

[Thanks Brinke!]

The Google Era of Computing

Sridhar Vembu, the CEO of Zoho, an online office suite competing with Google Docs, regularly sends out interesting thoughts from the Zoho blog for others to republish. Below is a partial (partly snipped) reposting of his latest musing; you can find the full text (titled “IBM, Microsoft & Google Eras of Computing”) from May 2nd over there.

By now it is conventional wisdom to say that there was an IBM Era of computing, then a Microsoft Era, and now we are in the Google Era. In this post, I will explain why Microsoft was not the “next IBM” and why Google is not the “next Microsoft” – there are significant qualitative differences among them, quite apart from their status as the dominant, era-defining players.

The original IBM mainframe era (in contrast to today’s IBM) was one of the highly closed systems. IBM was not just the dominant player of the era, IBM was pretty much the entire ecosystem. There just wasn’t a lot of room for third parties to play in. Third parties were marginalized companies surviving on IBM’s sufferance or professional services companies (like EDS) or were providers of cheap replacement parts, which felt vaguely dirty, borderline legal (consider today’s third party print cartridge situation as an analogy).

In contrast to IBM, Microsoft was far more open, which indeed was the original reason for their success. Microsoft unleashed what I would call the semi-open era of computing. The acronym ISV (independent software vendor) came into its own during the Microsoft era. Indeed, Microsoft encouraged ISVs and provided fairly good support – up to a point. The defining test for Microsoft was Netscape, the most prominent ISV that got on the wrong side of Microsoft. Microsoft failed the test by winning; their victory over Netscape forever established their reputation in the industry, a reputation that finds its echo in Yahoo’s cultural resistance to being assimilated.

Now, the present Google era. Google has the genetic and cultural advantage of being born in an open source world, with a business model that is aligned with rather than antagonistic to open source. It reflects in how they conduct their ecosystem initiatives. Google Gears comes with one of the most liberal open source licenses (BSD license), and we at Zoho particularly appreciate the support provided by Google’s open source teams. In our extensive interaction with them, we could tell how they truly get the value of openness. That openness is going to be the underpinning of the Google era of computing – I hope they never forget that!

Google Dictionary Tool for China?

The Chinese Google news blog DWGoogle reports that Google is about to release a dictionary tool in China in some kind of cooperation with Kingsoft. This tool (pictured) is an adaption of Kingsoft’s existing popular tool PowerWord, and it allows the look-up of words as before but also brings new features added by Google.

According to Wikipedia, PowerWord* “is a collection of Chinese, English and bilingual dictionaries and supporting proprietary software, published on CD-ROM in China by Kingsoft, which claims to have 20 million users including 50,000 organisations.” A text to speech tool can be triggered for words by clicking on a loudspeaker icon. DWGoogle writes that Google’s adaption of this Windows desktop application will be added to Google Pack. Perhaps Google PowerWord will also be useful to English users with an interest in Chinese content, as, according to Wikipedia, the user interface of at least the existing PowerWord can be set to either English or Chinese.

If you speak Chinese and can add more details of what DWGoogle talks about or related information, it would be helpful!

[Screenshot by DWGoogle.]

*PowerWord is written 金山词霸 in simplified Chinese, 金山詞霸 in traditional Chinese, and “Jīn shān cí bà” in Pinyin.

Post to Google Reader

You can now add your messages to Google Reader to be seen by your Google friends. To post a note, log in to Reader and click Your stuff -> Notes to the left. This will open a box where you can type in a note with any thought, Friendfeed/ Twitter-style. To find out who exactly your friends are, check the Friends tab at the Settings page for an overview (or click the Manage Friends links).

Next to the notes box, there’s also a new bookmarklet to share items you find on the web. Just drag and drop the link to your bookmarks and you can activate it when you stumble upon something; you can also add a custom note to items, Google explains in a blog post.

Another new feature is that the layout of your Shared Items page can now be customized to *drumroll* display one of three graphics on top (titled ice cream, ninjas and sea).

Google Reader tries to be a lot of things all at once, these days, but I think the handling – especially of the social features, which are magically interrelated with Gmail/ Google Chat – can feel a little cluttered and confusing. But I’m not a Google Reader user to begin with (I’m mostly using Friendfeed, which offers opt-in subscriptions independent of a friendship status)... what do you all think?

[Via Friendfeed. Hat tip to Leon Santiago!]

Update: Ryan points out that you can share blog posts and change the text of them – without Google Reader disclosing this when other readers see the item, which might lead them to believe they’re reading the original text by the blog (e.g. as it will say “from [blog name]” below the title). Ouch.


Note the upper-case bit that reads “This is a test to see if I can alter ...”

[Thanks Ryan! Screenshot by Ryan.]

Monday, May 5, 2008

Unicode Most Popular Encoding

Google released a chart which says that Unicode encoding – which can be used for a variety of languages, not just the smaller set supported by ASCII and others – has become the most popular encoding on the web since December 2007. While their search index – which this was based on – may not represent all pages out there, it might be the next best approximation. Google, who announced they’ve begun to support the latest version of Unicode now, say “We have long used Unicode as the internal format for all the text we search: any other encoding is first converted to Unicode for processing.” [Via Waxy.]
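To illustrate the "convert everything to Unicode first" idea from the quote, here is a minimal Python sketch. It is not Google's pipeline; the function names and the sample strings are assumptions made up for this example, and it presumes the page's encoding has already been declared or detected correctly.

```python
def to_unicode(raw: bytes, declared_encoding: str) -> str:
    """Decode bytes in a declared legacy encoding into a Unicode string."""
    return raw.decode(declared_encoding)


def to_utf8(raw: bytes, declared_encoding: str) -> bytes:
    """Re-encode legacy bytes as UTF-8 by going through Unicode."""
    return to_unicode(raw, declared_encoding).encode("utf-8")


# Hypothetical "pages": the same kind of text stored in different legacy encodings.
legacy_pages = {
    "latin-1": "café".encode("latin-1"),
    "gbk": "家乐福".encode("gbk"),
    "shift_jis": "検索".encode("shift_jis"),
}

# After decoding, every page is plain Unicode text and can be processed the same way.
normalized = {enc: to_unicode(raw, enc) for enc, raw in legacy_pages.items()}
```

Once everything is a Unicode string, the indexing code no longer needs to care which encoding a page originally used.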

Friday, May 2, 2008

Carrefour Censorship on Google.cn Continues

Three days ago I sent a first email to Google in regards to the Carrefour query censorship we see on Google.cn – searches for 家乐福 (French hypermarket chain Carrefour) are turning up no results (see the end of this post for an update). I’ve also sent follow-up emails to Google press US and Germany and asked employees of the company. Google press support is keeping quiet on this topic though, making no comment so far (perhaps they’re busy or need more time, and are overwhelmed by requests).

However, in the meantime one good source told me the censorship is indeed by Google, and not due to some direct technical interception. If the message is by any chance deceptive, this person may have been deceived too, though. Until Google gives out an official word on this, it’s hard to tell for sure. But here’s a recap of the scope of this censorship (note that sometimes, different data centers return slightly different result counts; also note that some numbers overlap so they can’t be added up at all times, e.g. a blog post is also a web page):

Result count estimates for 家乐福:

Search type    | Google.cn (Google China) | Google.com, Chinese language
Web pages      | 0                        | 7,160,000
Images         | 0                        | 242,000
News articles  | 0                        | 10,692
Videos         | 0                        | 2,397
Blog postings  | 0                        | 977,110
Books          | 0                        | 602

I had problems testing some services, like Google Maps. Google also has services to which there is no direct US counterpart – like Google Dao Hang and Rebang – which are completely censored too for this query. Other search engines in China show censored results too (according to some reports, even the Chinese Carrefour homepage was down for days due to being hacked). Variants of the query in question, like writing [家乐福 test], do not result in the near-blank page but show results (though they may still return partly censored results due to the ongoing domain censorship, which becomes active when you search for e.g. [家乐福 democracy]).

If Google did not do this censorship themselves, then a question that arises is how they feel about their brand being kidnapped by interceptors. If Google indeed censored this term based on some government instruction, then I’m curious about the following things, to begin with:

Google states they’re doing the China censorship as a compromise to give a better offering to Chinese users. But it seems only if we have more transparency on the issue can we make up our own minds and evaluate their moves. Back in 2006, Google’s director of research and former director of search quality, Peter Norvig, said:

And sure they [Chinese] want to know about democracy and Falun Gong and so on, but really they want to know about their day-to-day information. And they want to know about things like outbreaks of bird flu and so on.

And so we’re giving them that and we think that’s the most important. (...)

Some of the people will want to query about democracy, but most of them just want to know about their pop stars.

Incidentally, not only was [bird flu] partly censored in 2006 and still is partly censored in Google today (to follow up on the English query of Peter’s example, though many Chinese queries are censored too), but it may also be that the current more complete censorship of Carrefour was picked precisely because it was a query of such frequent and recent interest among some Chinese.

Update: The censorship of this word on Google.cn has now been lifted (at least in the services I checked). The ban is also gone on Baidu, though Yahoo.cn still shows nothing. [Thanks Orz86, Hong Xiaowan and Aguyinchina!]

Adding a Watermark to a Google Docs Document

Google recently added a CSS (Cascading Style Sheets) editor to their Google Docs document editor. To edit the stylesheet, create a new document or open an existing one and pick Edit -> Edit CSS from the menu. Google put up a Docs-specific CSS tutorial to guide you through some of the details.

For instance, if you want to create a kind of watermark – in the sense of an image printed somewhere on the page that is not selectable by left-clicking on it – you can follow these steps:

  1. Create a new document which will hold your image resources. You need this as you can't reference external images in Google's CSS (like those on Flickr, or your own server; perhaps Google feared security issues with external files during this release). Click New -> Document in the Google Docs explorer and name the document something like Docs Image Resources.

  2. Upload necessary graphics into this container document. I'm uploading this blog's logo, for instance. Now switch to the HTML source of the page by clicking Edit -> Edit HTML in the Docs editor menu. Look for the image you added (I'm sub-titling every image so that it will be easy to find in the source) and copy its relative image URL to the clipboard, like this one:
    File?id=ajfjf92tmnhd_920x5c3dbct_b

  3. Create a new document (or open an existing one) where you will add the watermark by referencing the file you uploaded. Switch to the CSS editor – Edit -> Edit CSS – and enter the following CSS (replace the image URL with the one you copied yourself):

    body {
      margin: 100px;
      background-image: url('File?id=ajfjf92t...etc...');
      background-repeat: no-repeat;
      background-position: 90% 50px;
    }

    This will add a right-positioned logo, as shown in the screenshot.

You can now create a copy of this document as a kind of template whenever you're creating a new document, as the CSS will be copied with it. And that's the big problem about this approach, too; until the day Google allows you to reference a stylesheet that is somehow saved in a repository external to the copied document, you can't change the CSS for all documents easily. You'd have to go to every single CSS you copied and make the changes, which would be time consuming. You can't even easily change the watermark image you copied because when you upload a changed pic to your repository document, it will generate a new file identifier, and you'd have to manually replace it in all CSS definitions of your docs (though maybe someone of you knows a workaround for this).
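If you do end up replacing a file identifier by hand in many stylesheets, a small script can at least do the search-and-replace on CSS text you paste out of the editor. The following is only a sketch under that assumption; the function name and both identifiers are hypothetical, and the post does not describe any way to push the result back into Docs automatically.

```python
import re


def swap_image_id(css: str, old_id: str, new_id: str) -> str:
    """Replace one Docs file identifier inside url(File?id=...) references."""
    pattern = re.compile(
        r"(url\(\s*['\"]?File\?id=)" + re.escape(old_id) + r"(['\"]?\s*\))"
    )
    # A function is used as the replacement so new_id is inserted literally.
    return pattern.sub(lambda m: m.group(1) + new_id + m.group(2), css)


# Hypothetical stylesheet text copied from Edit -> Edit CSS.
css = """body {
  margin: 100px;
  background-image: url('File?id=old_example_id');
  background-repeat: no-repeat;
  background-position: 90% 50px;
}"""

updated = swap_image_id(css, "old_example_id", "new_example_id")
```

You would still have to paste the updated CSS back into every document yourself, so this only shortens the tedious part rather than removing it.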

Until the day Google does offer better CSS handling, stylesheet editing for docs may be the next best thing to using inline styles right in the HTML, but it's still cumbersome. It works well if each of your documents has a unique styling, but it does not work great when you have multiple documents and you'd like to apply (and perhaps later redesign) the same layout for all of them.

Thursday, May 1, 2008

Mariusz Gasiewski, Polish Google Evangelist

Janusz Nowak told me about Polish Google employee Mariusz Gąsiewski. “[H]e is great evangelist of Google products in Poland. He is seen as one of the best AdWords and web analytics specialists in Poland ... and he loves to share knowledge about this.” Among other things, Mariusz wrote a couple of free Polish ebooks to introduce people to Google Analytics, Janusz tells me, and creates website templates for small businesses. Mariusz also often writes at the official Google Poland blog, he says.

I asked Mariusz a couple of questions, like what his job is at Google. Mariusz replies he’s been working at Google since July 2007. “Within Google I work in Sales department, working with AdWords. I am Account Strategist ... Basically I help our customers to use the potential of Google AdWords to drive their business successfully. Sometimes I provide them with training and orientation on Google products such as Google Analytics, Website Optimizer, Conversion Tracking.” Mariusz says he writes his blog in his spare time after work. “I like to share my knowledge about Google products and search engine marketing. I try to create free materials which could help Polish people to drive successful business in Internet. [I] try to improve and share my knowledge in web analytics, search engine marketing, usability.”

[Thanks Janusz!]

Income Levels Of All Italians Posted Online

From Reuters:

Italians were surprised, and in some cases outraged, on Wednesday to discover their income levels were available for public viewing on an Internet site [agenziaentrate.gov.it].

As part of a crack-down on tax evasion, the outgoing centre-left government made public every citizen’s declared taxable income on the state’s tax website

The data was then taken down later on Wednesday after a complaint from privacy watchdogs and attacks by politicians. Reuters quotes tax minister Vincenzo Visco as saying, “It’s all about transparency and democracy. I don’t see the problem.”

[Via Spiegel.]

Update: Ema comments, “Just to add some info. In Italy, by the law (and it was from a lot of years) you can go to a local tax office and ask for other people tax declaration. ... They are public in our country. So the only new thing is that they are published online, not a big difference as long anyone can already take this data and publish everywere (some times newspaper did this in the past and none complain about).” [Thanks Ema!]

The Value of a Google #1 Ranking

Search Engine Optimizer Aaron Wall put up an extensive, multi-angled analysis of the worth of a top Google ranking.
[Thanks Aaron! Disclosure: Aaron advertised his SEO Book here about a couple of years ago.]

CNBC Interview With Google’s Eric Schmidt

Google boss Eric Schmidt was interviewed by Maria Bartiromo of CNBC. Some excerpts from the longer transcript:

On innovation and the next big thing

Eric Schmidt: I’ve always thought that the scariest piece of innovation is knowledge understanding and language translation. I don’t understand how it works, but to watch a computer – literally watch it – read something in English, dissect what it’s about, translate it into a language that I don’t speak (...) And it isn’t magic, it’s just very good computer science, very good artificial intelligence, very good physics.

On problems entering Asian markets

Eric Schmidt: In China, of course, there’s all the issues of regulation and censorship. We delayed our entry for good reasons*, and as a result we’re not number one there.

Eric adds that Google had some problems with some of the languages, like Thai, where he says as the language does not have word breaks, “developing the technology to do that right and then search and index against it took [Google] a little while longer.”

On Google’s biggest challenges

Eric Schmidt: I think it’s internal. It’s the ability to manage the creative process, deal with the complexity in what is a relatively large company, in terms of people, who’s doing what. We have 50 development centers all around the world, people in different time zones, “Are you doing that? Are you doing that? Do I work with you? How do I check in my code?”

On long-term goals

Eric Schmidt: We’re really focused on this huge opportunity before us, which is automating the trillion-dollar industry that is advertising. We won’t get all of that, for sure, but we should be able to get a significant part of that over the lifetime, certainly of my service to the company. And our goal is to build this into an institution that lasts for many, many years

Eric remarks that Google’s highest priority is the end user or end user happiness, which is about whether or not people are happy with Google search results. Eric thinks Google pays for this by improving their ad services. Also, Eric says:

Our next big play is in this applications phase, where we think people spend a lot of time online with information, and we can help them, whether it’s their e-mail, which is an easy one to understand, but what about their personal data?** What about their spreadsheets and their calendar, keeping it all there? (...) If we do that right, they can do it on mobile phones as well as at home, in their office and on a Mac and on a PC

Eric says they want to do the same thing for corporate customers who Google “will have for 20 or 30 or 40 years as they build into [Google’s] model.”

[Via Techmeme.]

*Compare to press day 2007, “Our China traffic and China business is booming right now, so it looks like the strategy is working. My only regret is we should have done it much sooner. But it just took this long process to figure out what the right sort of ’Google-value based’ answer was.”

**Does Eric imply email is not personal data?

Wednesday, April 30, 2008

Google Apps Hacks Is Out

A couple of days ago I received the first couple of copies of Google Apps Hacks* from O’Reilly and today, the book is fully live on Amazon, too! It’s very exciting for me, as this book project spanned about a year, with about half of that spent in preparation, and the other half in writing the book (in Google Docs).

Google Apps Hacks features tips and tricks revolving not around search but around the “Google office” consisting of such programs as Gmail, Google documents, spreadsheets, presentations, Google Maps, SketchUp, Picasa, Blogger, Google Calendar, iGoogle, and many more. There are different difficulty levels, ranging from quick tips easily applied, over to spreadsheet formulas or stylesheet hacking or downloading special plug-ins or creating gadgets or maps, towards programming tips. O’Reilly on their site offers a table of contents, a sample chapter and more – and they’re also giving away the book as a prize to one of you.

So, I really hope many of you will be able to put the tips to good use and find the book helpful, and am curious about feedback.

*During early writing the book was called Google Office Hacks.

[Thanks to everyone who helped in writing the book (mentioned in more detail in the beginning of the book), and a special hat tip to my editor Brian Jepson, as well as those who provided tips for the book through pointers or through having great blogs about Google, like Ionut Alex. Chitu’s!]

Artist Themes for iGoogle

Google in the US is currently promoting a new “artist themes” section for iGoogle via a special logo and text below the search box. iGoogle is the name of Google’s personalized homepage, and as it allows skinning with special background graphics and so on, this directory offers works by artists like Jeff Koons, Coldplay, Robert Mankoff, Dolce & Gabbana, Akira Isogawa, Anne Geddes and more who Google says collaborated with them. (Other countries also have interesting iGoogle themes, like an Astroboy one for Japan.)

[Hat tip to Jérôme Flipo, Brinke Guthrie, Ionut Alex. Chitu and Colin Colehour!]

Update: As TomHTML and Colin point out in the comments, the special logo was showing in other countries too. Colin writes, “I have seen it available in the following countries USA, Finland, France, UK, Korea, Japan, Australia and probably many more locales that I haven’t tested yet.” [Thanks Colin and Tom!]

Google Ocean?

Google is planning to map not only the sky and land masses (Google Mars, Moon, Earth, Google Maps and so on), but – according to a report by News.com – also aims to map the oceans:

The company has assembled an advisory group of oceanography experts, and in December invited researchers from institutions around the world to the Mountain View, Calif., Googleplex. There, they discussed plans for creating a 3D oceanographic map, according to sources familiar with the matter.

The tool – for now called Google Ocean, the sources say, though that name could change – is expected to be similar to other 3D online mapping applications. People will be able to see the underwater topography, called bathymetry; search for particular spots or attractions; and navigate through the digital environment by zooming and panning.

[Thanks WebSonic.nl!]

Tuesday, April 29, 2008

DomainTools’ Paid Links

Google recently revived their domain information onebox. Enter e.g. whois google.com to find a special link referring you to domain info from domaintools.com. But, as Beussery.com found out, DomainTools is selling ads on their site which use a non-“nofollowed” anchor link wrapped around image ads, with the link pointing to e.g. a vpslink.com sub-page and the alt text reading e.g. “Cheap VPS Hosting”. And this may well be against Google’s own webmaster guidelines, which disallow such paid links unless they come with a “human readable disclosure” (like the “nofollow” value for links).
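For anyone wanting to check their own ad templates for this, a rough Python sketch using only the standard library could look like the following. The sample markup and the .example URLs are made up for illustration and are not DomainTools' actual HTML; the class and function names are likewise hypothetical.

```python
from html.parser import HTMLParser


class LinkAuditor(HTMLParser):
    """Sorts the href of every <a> tag by whether rel contains "nofollow"."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rel can hold several space-separated tokens, e.g. "nofollow external".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def audit_links(html):
    auditor = LinkAuditor()
    auditor.feed(html)
    return auditor.followed, auditor.nofollowed


# Hypothetical ad markup: one plain image link, one with rel="nofollow".
sample = """
<a href="http://vpslink.example/"><img src="ad.png" alt="Cheap VPS Hosting"></a>
<a href="http://sponsor.example/" rel="nofollow"><img src="ad2.png" alt="Sponsor"></a>
"""

followed, nofollowed = audit_links(sample)
```

Here `followed` would list the links that pass PageRank in the sense discussed above; whether any given one is actually a paid link is something the script cannot know.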

However, this may also just be a simple mishap and not necessarily bad intention on DomainTools’ side, or the advertiser’s side. For one thing, many text link ad schemes don’t use image content, and they also don’t often use target URLs like “http://vpslink.com/?utm_source=domaintools&utm_medi...”, as that may dilute the goal of gaining PageRank. Or, put the other way around, it looks like this kind of ad spot could be sold at a similar price – currently $10,000/ month – even if DomainTools added that nofollow value to their template. Especially now that they’re in such close neighborhood to Google (though I might be wrong).

In any case, this still looks like something DomainTools may want to fix, and Google would have an interest too in them fixing it... as it would be quite weird if Google would need to ban a site in organic results and then pick it as preferred onebox target above the same organic results.

[Thanks Beussery!]

Update: DomainTools now fixed their collision with Google’s webmaster guidelines by adjusting the HTML; instead of direct links, ads are now channeled through a /go/ redirect on their server, and their robots.txt disallows Googlebot indexing that directory. [Thanks Matt Cutts!]
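The redirect-plus-robots.txt approach can be checked with Python's standard robots.txt parser. The rules below are an assumption about what such a file might contain (the post only says the /go/ directory is disallowed for Googlebot), and the example.com URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; the real file may name Googlebot explicitly
# or contain further rules.
rules = """\
User-agent: *
Disallow: /go/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Ad links routed through /go/ are off limits to crawlers ...
can_fetch_ad = parser.can_fetch("Googlebot", "http://example.com/go/some-ad")
# ... while regular pages stay crawlable.
can_fetch_page = parser.can_fetch("Googlebot", "http://example.com/whois/example.com")
```

Since a crawler honoring these rules never follows the redirect, the ad target gets no link credit from the page, which is the same effect the nofollow value is meant to have.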

Google China Self-Censors Carrefour

Many Chinese are currently reportedly angered at France for an attack against a wheelchair-bound Olympic torch bearer that took place in France, as well as Paris awarding honorary citizenship to the current Dalai Lama of the Tibetan Buddhists... so some people in China wanted to boycott French hypermarket chain Carrefour. And now, a search for 家乐福 (Carrefour) in Google.cn does not show any of the 6+ million results it shows in Google.com – instead, Google China returns a nearly blank page with a single message that (if an automatic translation is to be believed) roughly means, “You cannot access information of this search result, please return to google.cn for other information.” While censorship of selected domains is common for Google China, keyword-based censorship is more rarely seen. Plus, this specific filter covers not only web search but other services, like news or video search, as well.

If (as is likely) this censorship was ordered by the Chinese government – finding a helping implementor in Google Inc, as well as other search engines, like Chinese Baidu – it may perhaps not so much aim to censor Carrefour... but to quiet down evolving mass protests against Carrefour. A little patriotism may be wanted by Chinese authorities, but angered masses getting together acting in unity may be too much for the government, as this may be a risk to their own system which normally tries to prevent such demonstrations.

Google in China removed other kinds of information before, too. In Google Maps, satellite imagery is missing; in Chinese web search, whole domains are blacklisted (including those of news organizations, or human rights watch organizations); in Google Book Search China, foreign publishers are missing; in Google News for China, Google removed several government-unfriendly sources. Google also agreed to censor information in other countries, like Germany and France, often referring to local laws and policies, sometimes with and sometimes without disclosures printed on the specific search result (e.g. in Germany they didn’t always show it but now we have reason to believe they always show the disclosure for web results, whereas in Google News China or Chinese book search no such censorship disclosure is visible in search results; even in web search, where there is a disclosure printed at the end of results, the disclosure will not tell users just what it is that’s missing).

[Thanks Www!]

Update: Ludwik Trammer, who is very surprised about this censorship, comments, “There is alway a slight chance Chinese Government did that without consulting Google via some automatic or semi-automatic system they established.” I’m currently investigating whether such a thing is possible and applied here; if you have more information in regards to this bit, please comment. In the past, messages from Google with the Google logo showing were indeed by Google. If Google replies with anything official or more information comes in I’ll update. Pinging Garett Rogers of the Googling Google blog, Garett says he wonders if it’s possible “to figure out if the government is intercepting the request and doing what they want to the response” (something which often happens from within China, nonetheless, but even then usually in the form of some non-branded connection error message, as far as I know). [Thanks Ludwik and Garett!]

Google VisualRank

Google researchers at a web conference in Beijing announced they are working on some kind of PageRank specifically aimed at images. Called VisualRank, the technology was so far only applied to a smaller test set of images, as apparently applying it to all images Google indexed would be too computing-intensive (even arguably the world’s largest super-computer can’t do everything imaginable yet). According to the New York Times yesterday, VisualRank is an algorithm “for blending image-recognition software methods with techniques for weighting and ranking images that look most similar,” and in Google’s internal scoring tests it achieved far higher quality results.

If I understand the gist of the research paper [PDF] right, then it seems the core of Google’s VisualRank algo consists of not only looking at textual cues in regards to images, but also image content itself. After identifying the most authoritative set of picture candidates for a given query, Google then improves the ranking of images found to be sharing the most visual characteristics with the group at large, by creating a similarity network (which also would understand e.g. imagery shown from different perspectives, to a certain extent). Center node images or those images containing large resolution versions would then be determined to be the most relevant. In 1000 sample queries – taken from the top Google Product search queries – 762 VisualRank results were tested to be more relevant than Google’s old approach, with 202 equal quality results and only 70 results that were worse.
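To make the "PageRank on a similarity network" idea a bit more tangible, here is a toy Python sketch as I understand the general technique. It is not Google's code: the similarity numbers are invented, the function name is hypothetical, and the damping value of 0.85 is simply the figure conventionally used for PageRank, assumed here for illustration.

```python
def visual_rank(similarity, damping=0.85, iterations=50):
    """PageRank-style power iteration over an image similarity matrix.

    similarity[i][j] is an assumed visual similarity between images i and j.
    """
    n = len(similarity)
    # Column sums, so each image hands out its score in proportion to similarity.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            received = sum(
                similarity[i][j] / col_sums[j] * scores[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            new_scores.append(damping * received + (1 - damping) / n)
        scores = new_scores
    return scores


# Made-up similarities: images 0-2 look alike, image 3 is an outlier.
similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.2],
    [0.1, 0.1, 0.2, 0.0],
]

scores = visual_rank(similarity)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Images that resemble many other candidates collect score from all of them, so the outlier ends up at the bottom of `ranking`. The expensive part in practice would be computing the similarities in the first place, which this sketch just takes as given.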

(On a side-note, I wonder what motivates Google to publicize this information, as it could tip off their competition? Are they only being nice, potentially attracting more good researchers, or is there more to it?)

As far as I can tell, the paper does not yet indicate that Google is any further in specific image recognition, e.g. figuring out that the image found on the web contains, say, a vase of flowers without looking at textual descriptions in the vicinity of the embedded pic. Four years ago, Google co-founder Sergey Brin said, “I don’t think that in the near future we’re going to have a service that takes a picture, and the computer decides, oh, that’s an elephant, so we search for an elephant. That seems funny to us. We should be able to do it.” Google does have face recognition features for Google Images, though; it’s found in the advanced image search dialog and works very well, and you can also use it on your own site, if you have one with indexed images, by searching Google Images for site:yourdomain.com with the face search activated.

[Thanks Miss Universe, Manoj Nahar and David Mulder! Image by Google from the paper by Yushi Jing and Shumeet Baluja.]

Monday, April 28, 2008

5 Years

On a Monday, the 28th of April 2003, this blog saw its first post. So happy birthday! Like then, today I’m sitting in an internet cafe on vacation (back then in Malaysia, but now in the Netherlands). That first post, by the way, was not about Google, but just consisted of a couple of pointers to movie trailers. However, preceding the blog post there were a couple of Google and search-related articles at the technology section of www.outer-court.com (consequently, this blog originally started out at blog.outer-court.com).

In 2003, I was using Google’s Blogger.com to FTP-transfer posts over to this site. In 2003, uploading an image from the internet cafe to the server of “Googlosophy Blogoscoped” (as this place was briefly called) often meant installing a downloaded FTP client by trying to find some way to get to Windows Explorer... by using the “Save” dialog of Internet Explorer and locating the Explorer EXE, as usually you weren’t allowed to even right-click the desktop, let alone expand a Start menu.

Google changed a lot in the meantime, and the Google-covering news scene also changed incredibly. Not only does Google now have dozens of blogs themselves as they have so many more products, but there are also many daily tech blogs which cover Google and search, as well as a couple of Google-specific news blogs (in 2003, the top Google news blogs were Google Village and the Google Weblog, but at the time they weren’t really as frequent as you might have wished for). Today, if you’re blogging an important piece of news on Google a couple of hours after it broke, you’ll probably be the 50th or so news source to cover it; often, it also only takes an hour or so for a news item to be on the frontpage of a social news site like Digg. To get something first, you need to be really quick, or do original research. If mere quickness fails, then you can take the time to provide analysis and commentary instead. Then it can help if you take more time and connect bits and pieces. For me it’s great to take as much time as I like on a blog post as I’ve been working for this blog and some other sites full time since 2005.

Something else changed with Blogoscoped since 2003; these days, there’s a great forum here where many of us hang out, cover news, and write our commentary on the world of Google or technology. Originally the forum was added because Blogger didn’t provide a comment functionality for posts which were FTP-transferred. But then the forum as another location to break news turned out to be really great, thanks to all of you participating. And not only do so many people help crunch Google news in the forum, but there’s also a great amount of tips sent in via email. These days, there’s even a co-editor – Tony Ruscoe – who helps out.

I’m looking forward to seeing how this place evolves. There’s been some visible change in the last years (like the PageRank going from 0 to 7 and then 8 for a while, to now be back to 6 – or the several redesigns) as well as a lot of less visible change behind the scenes (like the style of research involved to create posts, which changed around every year or so as new approaches and sources became meaningful – e.g. Friendfeed replacing Digg, reading the forum replacing visiting the frontpage of a technology news site and so on). Keeping it dynamic means keeping it fun. Thanks to everyone for hanging around, reading or writing – I hope you’re having fun too and gain information and perspective from the discussions here.

Friday, April 25, 2008

Google Docs Updates: CSS Editing, Saved Searches and More

Google Docs received a couple of updates. For instance, you can now create custom views onto your documents by saving advanced searches, which will then be displayed in the folder pane under a “Saved searches” label*. To save a search, click “Show search options” in Google Docs and perform a search; afterwards, click on “Save this search” and give the search a custom name.

Google also announced they rolled out their offline functionality for Google Spreadsheets and Presentations, not just the Google documents word processor. Also, you can now edit a document stylesheet; in the Docs document file menu, click Edit -> Edit CSS. While you were previously able to insert a restricted amount of inline styles, you can now more easily apply global styles, like a margin for the “body” element:

body { margin: 100px; }

The linked help entry on this feature was not live at the moment; it may be possible Google is restricting the kind of Cascading StyleSheet properties you can enter.

As another added feature, Google’s Presentations tool now has speaker notes which will be visible in presentation or print mode, Google says. Also, Google announced that you are now able to embed videos into Presentations; however, I was not able to access the help entry on this subject, and also did not see the “Insert video” button in the Presentation editor’s top toolbar, as suggested by Google’s help.

[Thanks Hebbet!]

*This feature was discovered through a video by Google a while ago.

Update: The “Insert video” button now appeared here in Google Presentations. When you click it, a dialog pops up letting you search through YouTube. You can then check the box next to the video and hit the Insert button. Other video providers outside Google-owned YouTube are not offered; by continuing their product cross-integrations, Google is not only further fleshing out their “office suite” but also strengthening their overall hold on users. (There’s also a chance they’re still working on integrating other video providers, but that supporting YouTube was most easily implemented for this release.)

Once published, others can play back the video right in the presentation via a custom player (a player which lacks a pause button, oddly enough... once played, videos can only be stopped, which immediately rewinds).

Scientology’s Google Ads

Scientology seems to be heavily advertising with Google these days. I spotted the below ad on YouTube:

And just now at a blog, I saw the following animated Flash AdSense ad – which was not disclosed as an ad, which is partly to blame on Google (Google would fare better either disclosing all ads as such or none, as the latter would more easily allow webmasters to always attach their own disclosure):

These ads link to Scientology.org, like this target page.

Just like Google stays out of guarding over specific organic search results, they likely generally stay out of monitoring ads or enforcing their own beliefs, except for what they say they disallow via their AdWords policy... which does not explicitly seem to mention any rule against the kind of ads shown by Scientology. Such a stance may allow Google to act more neutral overall.

Sometimes, Google is also actively reaching out to advertisers. The following bit posted by Lauren Turner on one of Google’s ad promotion blogs last year was more revealing than Google might have wished for in retrospect (and Google ended that particular blog not long thereafter):

[C]ompanies come to us hoping we can help them better manage their reputations through “Get the Facts” or issue management campaigns. Your brand or corporate site may already have these informational assets, but can users easily find them?

We can place text ads, video ads, and rich media ads in paid search results or in relevant websites within our ever-expanding content network. Whatever the problem, Google can act as a platform for educating the public and promoting your message.

Scientology on the other hand may well have found this advertising opportunity on their own, as they know Google well. Some years ago, they managed to get certain Scientology-critical pages on the domain Xenu.net kicked out of Google’s results by filing a copyright infringement notice with Google (do a search for site:xenu.net and scroll down to see Google’s notice). Many people at the time reportedly reacted by googlebombing Xenu up into a good spot – even today it’s in the top 10 for a search for scientology. Not all reputation issues can be managed with budgets alone.

Diverse Google Results Are Good

Google results which are diverse – approaching your query from different angles – are good, and those which aren’t diverse can be a barrier to finding what you’re after.

Take for instance the query [google blog], which I’ve been monitoring every now and then for some years now. It’s ambiguous input which can mean mainly any of the three things: 1) the user wants to see the official blog by Google Inc., 2) the user heard about Google’s blogging platform and wants to find Blogger.com, or 3) the user is looking for an independent blog covering Google. There may be even more cases than this.

Some search engines, upon entering an ambiguous query, return related search suggestions. Google does so too, at the end of the results; they link to searches for [unofficial google blog], [google employee blog], [create blog] and more. But the best thing for the search engine is to cover all possible answers among the top 10 results, so that the searcher has the chance to lazily jump to any of the results without further query refinement.

In the search for [google blog] (see screenshot), what we’re seeing at the time is roughly the following result – I’m coloring the different types:

1. Official Google Blog
2. ... same blog as above ...
3. Google’s blog Search
4. Google’s blogging platform
5. Independent blog about Google
6. ... another independent blog about Google ...
7. Another official Google blog
8. ... another official Google blog ...
9. ... another official Google blog ...
10. ... another official Google blog ...

This is an excellent example of satisfying the different search cases, as there’s a lot of variety in result types. However, over the past there were times when all the official Google blogs would start to push independent Google blogs (like this one) further down the page. As Google has so many blogs by now, it can be thought of as a powerful link network helping its member blogs (not consciously, perhaps; they also generously link out to independent blogs). This network effect seems to have been mostly corrected in the latest results for this search, though. (There are still problems with this result – one independent blog shown in the result is not updated anymore, yet there are more interesting new independent blogs about Google which the result does not feature... perhaps we can think of this as a link heritage problem, which may favor old stuff over new stuff.)

In other searches, the network effect is still strong, and sometimes arguably too strong. Take a search for [amsterdam hotel], for instance. I would think in this case, searchers are mainly looking to rent a room in Amsterdam for an upcoming trip. What the Google top 10 now shows are 9 hotel booking sites with offers for Amsterdam, as well as one hotel which is called “Hotel Amsterdam” in position 7. This is another good result, because while not very diverse, it’s likely diverse enough. Here’s a color chart for the search – we can assume that all but one player in this result set are in it for the commission money of referring searchers to a specific hotel:

1. General hotel booking site
2. General hotel booking site
3. General hotel booking site
4. General hotel booking site
5. General hotel booking site
6. General hotel booking site
7. Specific hotel by that name
8. General hotel booking site
9. General hotel booking site
10. General hotel booking site

However, once you’ve gone to a specific booking site and found a hotel which you might want to book, you also might want to find out more about the particular hotel. Often, the hotel’s homepage will show galleries and detailed contact info and more, data which the booking site may not have. But these single homepages will frequently not be as heavily search engine optimized and networked as the larger, more general hotel booking sites, which also have gotten their entries for individual hotels indexed in Google!

Consequently, the search for a particular hotel – [marnix hotel amsterdam] – will show this homogeneous result set (sometimes with photos and good reviews but not always):

1. General hotel booking site entry for Marnix Hotel
2. General hotel booking site entry for Marnix Hotel
3. General hotel booking site entry for Marnix Hotel
4. General hotel booking site entry for Marnix Hotel
5. General hotel booking site entry for Marnix Hotel
6. General hotel booking site entry for Marnix Hotel
7. General hotel booking site entry for Marnix Hotel
8. General hotel booking site entry for Marnix Hotel
9. General hotel booking site entry for Marnix Hotel
10. General hotel booking site entry for Marnix Hotel

People usually don’t look beyond the top 10 results, but the actual hotel’s official homepage, MarnixHotel.nl, is not to be found in results 11 to 20 either. What you can do now is search Google for [site:hotelmarnix.nl] or [inurl:marnixhotel] and similar queries, but this is only a little better than going straight to the browser address bar to try your luck. (You may also head over to Google Maps.) In other words, the Google result when querying for a specific hotel is not diverse enough to be as useful as possible, because Google favors larger networks.
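For what it’s worth, the three color charts above can be boiled down to two rough numbers each: how many different result types show up in the top 10, and how much of the top 10 the most common type takes up. The Python sketch below does just that; the type labels are my own shorthand for the chart entries, and diversity_summary is a hypothetical helper, not anything Google is known to compute.

```python
from collections import Counter

# Result types in top-10 order, transcribed from the three charts above.
google_blog = (
    ["official"] * 2
    + ["blog search", "blogging platform"]
    + ["independent"] * 2
    + ["official"] * 4
)
amsterdam_hotel = ["booking site"] * 6 + ["specific hotel"] + ["booking site"] * 3
marnix_hotel = ["booking site"] * 10


def diversity_summary(result_types):
    """Return (number of distinct types, share of the most common type)."""
    counts = Counter(result_types)
    top_share = counts.most_common(1)[0][1] / len(result_types)
    return len(counts), top_share


for name, results in [
    ("google blog", google_blog),
    ("amsterdam hotel", amsterdam_hotel),
    ("marnix hotel amsterdam", marnix_hotel),
]:
    print(name, diversity_summary(results))
```

This gives 4 types with a 60% top share for [google blog], 2 types with 90% for [amsterdam hotel], and a single type with 100% for [marnix hotel amsterdam]. A measure this crude says nothing about whether the mix fits what searchers want, but it does show how the hotel name query collapses to one result type.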

In this specific case, one may think it would be easy to come up with a working approach adjustment; say, why doesn’t Google favor sites which have the words “marnix” and “hotel” right in the second-level domain? On the other hand, what may work for this query may not work for another query, and Google normally favors scalable solutions. (And in fact, when searching for the more specific [marnix hotel homepage] – a stronger indicator you’re after the official thing – the domain MarnixHotel.com will be shown, but it’s apparently a domain reserved by the HotelsArea.com network.)
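The naive domain name adjustment could look something like the following Python sketch. It is purely illustrative: query_words_in_domain is a made-up helper, the example.com URL is a stand-in for a booking site entry, and the way the second-level domain is picked out is a simplification that breaks for suffixes like .co.uk.

```python
from urllib.parse import urlparse


def query_words_in_domain(query, url):
    """Check whether every query word appears in the URL's second-level domain."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    # Naive assumption: the second-level domain is the label right before
    # the last one, e.g. "marnixhotel" in "www.marnixhotel.nl".
    second_level = labels[-2] if len(labels) >= 2 else host
    return all(word in second_level for word in query.lower().split())


official = query_words_in_domain("marnix hotel", "http://www.marnixhotel.nl/")
booking = query_words_in_domain("marnix hotel", "http://www.example.com/marnix-hotel")
```

Here official comes out True and booking comes out False. Note that the same check would also pass for MarnixHotel.com, which hints at why such a rule is easy to game and may not scale.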

For diversity, Google may analyze links and then look at larger site clusters which are still distant to each other (while trying to penalize clusters which are a little too large and interconnected, perhaps, as that may be a spam farm)... but it may be very hard to identify a fitting but not well-backlinked site. I know (or rather, I guess) that MarnixHotel.nl is the official site, but I figured this out by looking at the domain name, the site’s layout, the introductory text and imagery and so on.

So the problem remains: heavily optimized, backlinked and well-networked sites are doing better than non-optimized island pages. Most hotel or small business owners may not have the same resources to invest in optimization, or they may simply not realize there’s need for optimization, or in the case of hotel booking networks, they may be perfectly happy if those other sites handle the booking. Only the Google searcher misses out on the site they were potentially looking for due to a lack of result diversity.*

*However, while diverse results are good, this doesn’t mean ambiguous search queries should always result in an equal share of all different types of results. And sometimes, cases of ambiguity could be construed which may not exist as such in real life due to actual, common search patterns. Take the query [apple], for instance, which could mean you’re looking for information about apple the fruit, or Apple the company. However, the Google top 10 result at this time is filled with Mac-related information exclusively; only in result 11 (which, again, most people won’t see, and rather adjust their search query to search again) will there be a pointer to Wikipedia’s entry on the fruit. While this is a bit of lack of diversity, and one might think Apple-the-fruit may fare well in something like position 9 or 10, I would also think that many people looking for apple-the-fruit would enter more specific queries than that, such as [apple recipes] or [apple images], so the result is not all that bad.

Thursday, April 24, 2008

The Leave Me Alone Box (Video)

[Video by Michael. Via Reddit.]

Google Germany’s Girl’s Day Logo and More

Today is Girl’s Day (“Mädchen-Zukunftstag”) in Germany – and potentially other countries – and Google.de is showing a special logo with a pigtailed kid writing math formulas on a green board. Girl’s Day is intended to get more girls into jobs which more males usually get into. Today is also Boy’s Day (“Neue Wege für Jungs”) intended to get more boys interested in jobs predominantly taken by women.

In other news, Google has started a doodle competition for Germany, Austria and Switzerland (note you need to e.g. switch your browser language to German to see this). Everyone between 5 and 18 years old can participate; the theme is footballsoccerwhatyoucallit and Google says the logo will be displayed during the day of the European (men’s) championship final on June 29th, among other prizes to be won. A couple of designs are already online in Picasa.

[Thanks Hebbet!]

Google Hands Over Profiles of Suspected Pedophiles

On April 9th this year, AFP wrote that Brazilian authorities ordered Google to hand over 3000+ profiles “containing suspected pedophile material” on Google’s social network Orkut. However, the authorities then complained that Google “refuses to identify users,” AFP wrote. Now, that has changed, as a new AFP article indicates:

Google on Wednesday handed over data stored by suspected pedophiles on its Orkut social networking site to Brazilian authorities, ceding to pressure to lift its confidentiality duty to its users, officials said.

According to AFP, the Brazilian senate commission was “looking into allegations that illegal images of minors were posted in restricted-access photo albums on the site.” The article goes on to mention that Orkut in Brazil is more popular than Facebook or MySpace. It also says that a member of the senate commission that ordered the profiles hand-over said this was the first time Google had accepted to disclose data. Also, AFP writes that a Brazil state prosecutor said that “90 percent of the 56,000 pedophilia allegations received in the past few years related to Orkut.”

[Thanks Pd and Juha-Matti Laurio!]

Google Finance Redesign

Google’s Finance site, originally launched in 2006, received a design overhaul. Pictured above are two older designs as well as the new one. The current design fluently resizes its elements like market summary, tabbed top stories, and portfolios, but also partly breaks some of its tabbed content in browser window sizes below 1024 pixels width. Kirby Witmer in the forum comments, “Takes a bit to get used to the new layout.”, whereas blogger Ionut Alex. Chitu says, “Too busy, the font size is too small.”

[Thanks Kirby, Ionut and Manoj Nahar!]

Wednesday, April 23, 2008

Unusual Google Domain Names

Pingdom compiled a list of domains which they say are mostly owned by Google. Included in the list (probably often for reasons of having obtained a domain from a typosquatter or someone else who Google thought abused their trademark... as opposed to Google aiming to run that service themselves):

[Thanks Pingdom, Tony and Isaac Sunyer!]

Tuesday, April 22, 2008

Gmail Inbox vs Archive

It seems there are two different approaches to managing your mails in Gmail. One is to keep everything in the inbox; all mails will stack up, and those you still need to react to have the unread status (making them visible by their white background, and searchable using is:unread). The other approach is to immediately archive all mails you’ve dealt with so that you have a clean inbox. (In both approaches you can additionally star messages you’d like to get back to later on, though it’s also easy to star too much and then never get back to it.)

The two approaches are different, with each having slight pros and cons in different areas, but both achieve similar effects in organizing your mails. However, some in a forum thread say that keeping your inbox rather clean (and letting things stack up in the archive instead) will help speed up Gmail. I switched to the “archive” approach yesterday to give it a try for some time, though I can’t draw a clear conclusion on what’s faster – it does seem a little faster, a little less lagged during start-up, but I’m not sure. (And it may also depend on the number of unread messages still in your inbox.)

If you want to switch from an inbox to an archive approach but you want to keep your unread mails in the inbox, here’s what you can do. In the inbox, click All, and then click the Select all... link. Hit the Archive button (this may process a while afterwards). Now do a Gmail search for is:unread, and in the results, click All and hit the Move to Inbox button. From now on, when you get new mails and have read and reacted to them (where necessary), you need to remember the extra step of hitting the Archive button on them, though.

[Thanks Drew, everyone!]

Monday, April 21, 2008

An iGoogle Preview With Friends Features, Expandable Views

The Google personalized homepage iGoogle is now available in a preview version for developers to play around with new features. You can log-in via the new iGoogle developer homepage by clicking on Getting Started. In the sandbox, you will be able to see a couple of new things:

Google’s preview image of the updates gadget:

Just like the release of OpenSocial, the new iGoogle sandbox right now feels more like a flaky alpha experiment for brave-of-heart developers rather than something useful. There are broken links in tutorials, character encoding issues, JavaScript bugs and more. But in the long run, perhaps Google is aiming to clone functionality provided by social tools like Facebook or Friendfeed. While other social services start with the network – profiles, defining friendships and so on – and then put applications on top, Google seems to go the other route by stacking the social network on top of their existing apps. This in turn causes confusion in some areas, like when it comes to defining who your “Google friends” are in the first place.

[Thanks Hebbet!]

Friday, April 18, 2008

The Return of Google’s WHOIS OneBox
By Tony Ruscoe

Last month, Philipp compiled a list of lost Google features, at the top of which was a reference to the WHOIS OneBox feature that Google added for a brief period back in January 2004. Well, that OneBox is now back:

By entering a search phrase like [whois google.com] you can quickly see when the domain’s WHOIS record was created and when the domain will expire above the search results. (Somewhat ironically, it doesn’t actually tell you to whom the domain belongs though.)

Google implies it’s getting this information from the Domain Tools website, which is linked to from the OneBox result title too. However, it seems the data being displayed is not coming live from the WHOIS record, so Google must be caching the information somewhere.

It has been suggested in the past that Google may use this WHOIS data as part of its ranking algorithms, possibly giving weight to those domains which are registered for longer periods.

[Hat tip to Matt Cutts!]

Google’s Special China Page and Its Helper Cartoon

Maybe Google’s “Cliply” was only an easter egg (see previous post), but Google does have a help avatar on a homepage variant of theirs in China. The cartoon figure is called 小谷, which can be loosely translated to “small valley” or “little Goo” (as “xiǎo gǔ” reuses a part of Google’s Chinese name). Little Goo gives tips on how to use Google search, e.g. how to bring up weather or calculator onebox results. You can click the [x] in the thought bubble to make Goo go away... but the helper returns next time you load the page. Perhaps Lil’ Goo and Clippy are distant cousins or something.

What’s this special Chinese homepage anyway? We mentioned it before, but it’s not Google’s main page when accessing google.cn in China (that honor goes to another page). Rather, I have reason to believe the page is set to be the default homepage in some internet cafes in China as part of a certain deal Google struck. One Chinese blogger who likes to remain anonymous tells me Google might not have paid enough though, as sometimes games sites like ztgame.com are default in net bars. Internet cafe operators only have a low basic salary but have a share in the profits, the source says, indicating that while Google.cn is a student, Ztgame is godfather.

[Thanks Ken Wong and Tony!]

Google Android Challenge Deadline Reached
By Yannick Stucki

Last fall Google created quite some buzz with the announcement of the mobile device platform Android and the $10m Android Developer Challenge. The deadline has been reached and the official Android Developers Blog announced that they’ve received 1,788 entries for the competition. A significant part of the buzz was caused by the huge prize money, and that certainly showed Google’s determination to enter the mobile market.

Naturally, the question whether throwing $10 million was worth it comes up. Of course, to answer this question we’d have to check out all the apps in order to judge their quality, but since there was a lot of money to win we can expect that there will be at least a few hundred very good/useful/cool apps.

Even if we can’t judge the applications yet, we can still do the math: Google paid an average $5,600 per app. This might sound like a lot of money, but think about how much more it would have cost to hire people and fully pay them to create all those apps. $5,600 is not really much for an app: after all, adapting to Android, implementing and debugging takes quite some time. Furthermore, you couldn’t just hire a team for $10m and tell them: “code 1000 apps for Android!” Not just because of the workload, but creative and different ideas are also crucial; not that a hired team wouldn’t be creative, but the large number of third party developers can do even more diverse and creative stuff.
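For those who like to check the math, here it is as a few lines of Python. The only inputs are the two figures mentioned above, and the calculation assumes the whole $10m is spread evenly over every entry, which is of course a simplification since a contest doesn’t pay all entries alike.

```python
prize_pool = 10_000_000  # the $10m Android Developer Challenge
entries = 1_788          # entries reported by the Android Developers Blog

# Average cost per submitted app if the full pool is divided over all entries.
average_per_app = prize_pool / entries

print(round(average_per_app))      # 5593
print(round(average_per_app, -2))  # 5600.0
```

So the exact average is a little under $5,600, which is where the rounded figure comes from.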

Lastly, besides creating a lot of buzz and apps, the developer challenge also brought hundreds of outside developers to the Android platform to form a community, even before the first phones have arrived. So I’d certainly answer the above question – whether it was worth the money – with a yes.

Google’s Udi Manber Interviewed

Popular Mechanics interviewed Udi Manber, a Google vice president of engineering responsible for “core search,” as Google says. Udi argues search and ranking results is “about getting lots of signals and putting them all together,” adding that signals from people are the best signals. “Our goal is very simple,” Udi says, “We want to return to the user the answer that they need.” Asked about Google’s separation of organic result and ads on result pages, he answers:

When we decide to launch something, we have a weekly meeting where all those things come together and we look at all the evaluations and we make decisions – revenues and any effects on ads do not come into those meetings. We don’t even know what the effects are. We make the decisions solely based on how good it is for search, how good it is for users. The ads are on a different part of the page, and the ad people, I assume, do the same kind of thing and try to improve the ads.

Udi Manber also says “At Google we do not manually change results.” Google’s Matt Cutts at his blog adds, “That’s the right answer for a general/Popular Mechanics audience. For the nitpicking search junkies that read here, I’ll just add that we are willing to take manual action on a small number of issues like webspam and removals for legal reasons.” Removals for legal reasons include e.g. the removal of human rights watch or news sites in China. It’s also worth noting that even automated algorithms are manually created in the beginning and then evaluated by humans, and as such, the algos may have bias in certain directions.

[Thanks Juha-Matti Laurio! Image source photo by Google.]

Google Street View Cars Spotted in Italy

It looks like Google is preparing to bring their panorama photos as part of Google Maps to Europe. Above photo was taken by DNAKiller in Rome, who posted it along with other photos in the forum saying, “This morning, on my motorbike ... What a strange car, seemes like a google one ... and it was!” The icon on the car is the same as the Google Street View icon. Above the car, a pole is erected containing the cameras. TomHTML comments on another photo set from Milan, saying that the license plate is Italian too.

At the moment, Google Street View is a feature only showing for some US cities (like Chicago, New York, San Francisco and others). It is expected to come to Australia though, and with these recent developments, likely other parts of the world too. In some countries, Google may face stronger barriers in terms of privacy regulations guiding what people are allowed to photograph (e.g. in Germany, you are not allowed to just snap a photo of someone and publish it; you may be allowed to do so if the person is part of a crowd and not the center of attention, but with Google Street View, it’s hard to judge whether photos depict a crowd or individual persons, as both views are possible depending on where you zoom to. Many traditional laws seem to not yet reflect technological and cultural developments such as crowd intelligence applied to large aggregated & interactive data sets).

[Thanks Hebbet and DNAKiller!]

Another Google Video Redesign

Google Video redesigned their US homepage again; above, you can see the past designs of Google Video leading to the current one*. The service started out as a video-free search service for stills and captions from TV, then became a video storage & sales site, and since last year, is a meta video search engine. The latest design recommends a couple of “hot videos,” which play in a left-side player if you click on them. “The hot video list is compiled by looking at a variety of signals including videos that most shared, viewed and blogged about,” Google says. Most of the video recommendations at the moment are from Google-owned YouTube.

Below the hot videos, another section lists videos “Featured on AOL” probably put up due to the alliance between Google and AOL; this is typical of how portal-style sites usually emphasize what their partners have to offer as opposed to what the user may really look for.

The Google Video player page has also been reorganized. It looks a bit more organized and application-like now, letting you expand and collapse its elements, and using tabs or arrow-button paging to organize comments and other features.

When you play videos from other sites, Google continues to wrap them in a frame, but Google’s part of the frame is now positioned to the left instead of the top, perhaps to not push down videos too much on the page:

Google’s frame can be closed with an X to its top right, but that won’t remove the frameset itself, so the video URL (e.g. when shared with friends) will still point to Google; you can click “original context” at the bottom of Google’s frame though to remove their frameset completely**.

I’m curious how this frame benefits Google users in the first place though. At Google images, where a similar frame is used, it serves a purpose because the target image is often buried deep down in the page, making it harder to find (clicking on the thumbnail at the top will lead directly to the image then). Usually no such obstacle is found on video result pages, though, where the video is visible immediately – perhaps making the first beneficiary of that Google frame Google Inc itself, as it makes users stay longer on their site. What’s more, Google’s “share” and “related videos” features offered in their frame are usually redundant as the target video site already offers these. (In 2004, Google co-founder Larry Page told Playboy, “We want you to come to Google and quickly find what you want. Then we’re happy to send you to the other sites. In fact, that’s the point. The portal strategy tries to own all of the information.” The 4 years that passed since then are a long time on the web.)

Google Video’s search result pages changed in this redesign, too. By default you will see a thumbnail to the left and a snippet and link to the right, ordered in a list view. But now you can toggle to a grid view as well as a TV view on top. The TV view splits the page into a list to the left and a player widget to the middle right. Clicking on a result in the TV view will dynamically embed the other video sharing site’s video. While this also keeps users longer at Google Video than at other sites, this time the feature makes sense for users, as no features (like related videos, or sharing) are mirrored on the page. Video sharing sites will likely continue to add more commercials of their own right into the embedded films to make money, independent of which site the user is on. However, the end result of the TV view is not always quite as fluent as zapping through TV channels, partly because the external site (like tudou.com) can be very slow to load.

[Thanks Jilm!]

*Not all homepage design iterations are included in that image.

**Note this frameset is also not included when you click on video search results from Google’s main web search engine, which continues to be a video search engine as well, as part of Google’s “universal search” approach. Universal search aims to let people search through all kinds of content without having to visit the specialized search homepage.

Google News Quotes Finder

Google News added a quotes feature, they announced. Search for the name of a person, say Marissa Mayer, and above the news results you may find a gray speech bubble box showing a quote and its source. Click on the linked name below the speech bubble, and you’ll be dropped to a page listing numerous such speech bubbles containing quotes (including the ability to search through the quotes). Like this quote by Google’s Marissa Mayer: “By gaining all those users you’ll eventually gain the attention of advertisers.”

Google News is slow to roll out new features but when they do they often look solid. This latest addition seems like it could be an incredibly useful feature. However, I wasn’t able to get this to work with the Google News archive (just the quotes from the time frame of a month), which would make it even more useful. I’m also unsure how many quotes it does find, e.g. whether it will find just quotes directly disclosed by phrases like “... Ms Mayer said,” or whether Google also locates quotes from Q&A style interviews, or those indirectly disclosed as “... she said” (the latter style was not found in a brief test I did looking for something Marissa said, though that test isn’t conclusive).

[Thanks Chuck!]

Update: As Matt Cutts points out, even (at least some) quotes of the “he said/she said” type are found. [Thanks Matt!]

Mechanical Turk Service + Experiments

Dolores Labs offers to help you put Amazon’s Mechanical Turk to work for your needs, and they set up a couple of interesting crowd intelligence experiments on their blog. (One of their experiments reminded me of the CHI image sorting I discussed here before we had Mechanical Turk; another of their experiments used my Cover Browser as image source, a nice surprise!) Mechanical Turk, as you may know, is Amazon’s structured, programmable and paid approach to applying crowd intelligence to all kinds of tasks.

Thursday, April 17, 2008

Google Website Optimizer Announcements
By Tony Ruscoe

Google Website Optimizer (which was originally launched as a closed beta test in October 2006) has finally dropped its “Beta” label and been made available as a standalone product, Google announced yesterday.

Previously only available as a feature within Google AdWords, anyone can now use it on its own “to increase the value of your existing websites and traffic without spending a cent.”

The Website Optimizer Blog Team have also launched The Official Google Website Optimizer Blog to accompany the product.

Also announced by the Google Analytics Blog is that Urchin has graduated out of beta too. Urchin is similar to Google Analytics – it allows you to report on your website traffic – except it needs to be hosted on your own servers rather than Google’s.

[Thanks Mbegin!]

Google’s Tips for Moving Your Website
By Tony Ruscoe

The Google Webmaster Central Blog has a post about best practices to follow when moving your site from one domain to another without losing your Google rankings. Here are the main points:

You might also be interested in reading New Design and Move to Blogoscoped.com, Telling Google Your Domain Moved and How This Blog’s Move Went With Google, all of which document how this blog moved to a new domain last year.

Tuesday, April 15, 2008

On Leaving Google

Someone who claims to have worked as a software engineer for the Google AdWords Report Center wrote a blog post detailing why he quit Google. The post mentions a great many benefits and joys of working at Google, but then also goes into problems like agile start-up mentality vs big company settings. Digital Hobbit also writes about the problems of perhaps being assigned to a task you’re not excited about, and then having to learn to understand Google’s programming framework (I split the single paragraph into two):

[I]t is unlikely to initially be able to work in an area that one is passionate about. Many of the Google products are exciting, but unfortunately I was unable to be passionate about my particular product area. That is not to say that there weren’t any interesting aspects about it, and I do have a lot of respect for the team I worked with. Overall this is less of a problem later, as it is generally encouraged to switch projects every 1-2 years, but this first year makes a big difference, particularly for experienced engineers that have a good understanding of what kind of things they enjoy working on (or perhaps more importantly, don’t enjoy working on) or what kind of environments are a good match. I feel that the hiring process should be improved to better take this into consideration, although this is admittedly a difficult logistical problem at Google’s scale.

Another scale-related problem: Due to the sheer size of the code base and the vast number of Google-specific tools and frameworks, it also takes a very long time to learn how to actually become productive at Google, which can be frustrating at times.

[Via Reddit.]

Monday, April 14, 2008

Search Engine Market Shares in Russia

ComScore released an overview of search engine usage in Russia. According to the study, which excludes traffic from internet cafes and mobile phones as well as users below the age of 15, Yandex is the leading search property with 47.4% of the share of searches, “Google sites” are second at 31.2%, and Rambler Media is third at 9.7%. Note that these figures may not be representative of all internet users; ComScore recruits participants for their stats by a package including security software and sweepstakes prizes, an offer which e.g. may or may not be as attractive to tech geeks.

I wonder how good the quality of search results at Yandex.ru really is. A search for [google blogoscoped], for instance, does not return this blog in the top 10. It may be that its strength is the Russian web and that the English web is ranked less well; if anyone knows more, please comment.

[Thanks Yakov and Jamie!]

Google Data Center Locations (and a Sidenote on Supercomputers)

Google hosts what might be the world’s biggest supercomputer owned by a single company*; rather than a single machine, it’s a dispersed network made of smaller machines, though. Now Pingdom (a neat paid service that can alert you when your site is down) put together a map of Google data centers based on approximate information from the unofficial Google Data Center FAQ. Pingdom writes, “If you include data centers that are under construction, Google has 19 locations in the US where they operate data centers, 12 in Europe, one in Russia, one in South America, and three in Asia. Not all of the locations are dedicated Google data centers, since they sometimes lease space in other companies’ data centers.” But as the unofficial FAQ disclaims in regards to the number of data centers, “Nobody knows for sure, and the company isn’t saying.”

*The world’s biggest supercomputer owned by no particular single entity, on the other hand, might be the web itself – the global consciousness, if you will. It’s made up of us humans and our thoughts, but if we want to approach it in technical terms (which is just one of the many ways to see it), all of its nodes are individual sites crunching information day in and out. A blog and forum like this, for instance, is crunching information related to the subject “Google,” but there are other nodes covering politics, art, technology in general, sports and entertainment, and so on. Each individual node can reprocess the output of another more specialized node for a given subject. As individual task outputs are cached under permalinks, the solving of new tasks is sped up; e.g. to integrate a bit of another topic into a blog post I can jump over to a Wikipedia entry to find the cached “preprocessing” of hundreds of other people, all of whom in turn might have used information from all over the web.
        Some nodes are filters or prisms to the online world (e.g. a link blog or feed reader); some nodes are aggregation meeting points for users (e.g. a forum or wiki); some nodes deliver input from the real world (e.g. a webcam or a traditional news site); some nodes are corrective nodes to other nodes (e.g. a watch blog, or the comments in a blog); some nodes determine ranking of importance (e.g. Google); some nodes are gateways to channel “processing power” – our thoughts applied to a particular subject –, determining which subjects are worth crunching (e.g. Reddit or Digg). Over time, the system at large may reject nodes that do not interact (like nodes causing censorship, nodes behind paywalls, nodes that don’t reintegrate diverse feedback due to a conflict of interest), and emphasize such nodes that do (like nodes using free licensing, open comments, embeddable widgets, or standard registration systems).
        But in all this, different nodes in the system may have opposing information points; if the cumulative balance of these information points is about equal to each other, then you won’t be able to find an easy answer – if you request the result to the question “when was Cary Grant born?” via a search engine, you will get a rather plain answer, whereas a question like “who is the ideal candidate for my country’s presidency?” might yield ambiguity. As this network of opinions is so loosely coupled by design, no instance can single-handedly resolve such ambiguity, or – at the moment – truly tell us which questions are worth asking in the first place.

[Thanks George R and Pingdom!]

Orkut Mobile, More

Orkut released a mobile version of their homepage at m.orkut.com (you can view it on the desktop too), as the Inside Orkut blog reports. Contained within the mobile site is a reduced version of the default Orkut, including features like Scraps, Updates from friends, an alphabetic list of your friends, your profile and more.

 

In other recent Orkut news, AFP reports that Brazilian authorities ordered Google to hand over 3,000+ user profiles containing suspected pedophile material. “Federal authorities have complained that Internet giant Google refuses to identify users who post criminal material on the social-networking website Orkut,” AFP writes.

[Thanks Inside Orkut and Juha-Matti Laurio!]

Third-Party Gmail Redesign Style

The Firefox extension Stylish allows you to easily load homemade stylesheets in your browser, to be applied to specific websites. These custom stylesheets are then able to change a couple of colors or font sizes... or deliver a full-featured redesign of an application. One such complete redesign is offered for Gmail by Globex Designs. Called Gmail Redesigned*, it reformats the whole display into a darker, more beveled design, leaving almost no design element (not even the loading message) as it was, as the screenshot shows.

*To install the program, first install Stylish, restart, and then install Gmail Redesigned. To uninstall it, in the Firefox menu click on Tools -> Add-ons -> Stylish -> Options, and delete the Gmail Redesigned item (or completely uninstall Stylish).

[Thanks Evgueni!]

YouTube Was Temporarily Blocked in Indonesia

On April 2nd, CNSNews reported that Indonesian authorities asked YouTube to take down the movie Fitna by Geert Wilders. Google’s YouTube rejected this, arguing YouTube allows people “to express themselves and to communicate with a global audience.” They said:

The diversity of the world in which we live – spanning the vast dimensions of ethnicity, religion, nationality, language, political opinion, gender, and sexual orientation, to name a few – means that some of the beliefs and views of some individuals may offend others.

After Google’s rejection of the take-down, on April 8th news came in that YouTube at large, as well as Google Video (among other, non-Google owned sites), were now blocked in Indonesia. The information ministry said the order “would stay in place until the Web sites remove the film,” AP at Brisbane Times writes. This blocking was not a singular incident; YouTube was also recently blocked in China due to videos covering the Tibet issue, and had been blocked in Thailand and other countries before.

The Indonesian ban was met with protest by many Indonesians though. According to CNet, one Indonesian blogger wrote, “Indonesia has entered a dark digital age ... I wonder why the Government made the stupid decision. The 17-minute video was rubbish, and I am not interested in viewing it at all.”

And after three more days, the nationwide ban was lifted again. The Information and Communication Minister Muhammad Nuh said, “I openly ask the public’s forgiveness for the inconvenience caused over the past few days by the blocking of sites,” adding that it was “a consequence of a process designed to protect the state.”

[Thanks Pd and Mohammad!]

Show Off Your Google App Engine Apps

SanderQD in the forum started an interesting thread: show off your Google App Engine apps*. If you created something at appspot.com, please add your pointer and a description of the program. If you have a link to the source to share for others to take a look and learn, even better. [Thanks SanderQD!]

*Also see the App Engine homepage.

Friday, April 11, 2008

Collaborative Google Gadgets

A couple of iGoogle gadgets are now shareable, Google announced. This is similar to functionality already available in Google Docs, where you can also collaborate with others on documents if you want to. A list of all shareable gadgets – like sticky notes, or crossword puzzles – can be found in the Google Directory.

As a gadget developer, to activate this feature you apparently need to include the line <optional feature="shareable-prefs"/> in your gadget’s ModulePrefs section.

Concerns Over a Future Google Street View Australia

Anthony Klan at The Australian writes:

Google Australia is expected within months to launch an application that will publish highly detailed, street-level photos of much of Australia, in a move that has drawn strong criticism from privacy advocates.

Google’s picture-snapping cars have been cruising Australia’s suburbs since late last year, with pictures of thousands of homes expected to be uploaded to the internet with Street View’s launch.

While Google has defended the project, the internet company baulked when The Weekend Australian requested the personal details and addresses of the group’s key figures to allow the paper’s photographers to take pictures of their homes. “Providing those details would be completely inappropriate,” said Google spokesman Rob Shilkin.

The article then goes on to share several details about Google managers and the places they live in, obtained via searches and public documents. CNet was once boycotted by Google press for googling details about CEO Eric Schmidt, so who knows if Google will stop talking to The Australian now...

On the web, when you don’t want your website to be indexed, you can put up a robots.txt file. In the real world, there is no such standard yet to defend against Google’s car fleet taking snapshots of houses... unless perhaps a high fence, and general trespassing laws. Google’s Street View interface does have a feature though where you can request a take down of specific material after it went live. And the head of Google Northern Europe, Philipp Schindler, once explained: “In the cases where we found out it’s necessary to introduce special privacy protections, we reacted prior to launch. For instance, you won’t find images of accommodations for the homeless, or abortion clinics.”

[Thanks Pd!]

Update: The Sydney Morning Herald has more, including quotes by Google’s Marissa Mayer who said when it comes to protecting privacy, Google was at the mercy of “flaws of the real world and human error.” Marissa said, “If the road isn’t very clearly marked as a private road, or if the driver misses a sign, there will be occasional places where we make a mistake.” Marissa also said Google was developing technology to blur faces and car number plates but that they weren’t sure if this would be ready for the Australian launch. (Valleywag’s comment: “if it’s anything like the algorithms used to detect copyright infringement on YouTube, we’ll be living in the Matrix before it’s done.”)

I’m curious how this will play out over the world; maybe technology will become more privacy-sensitive, or maybe culture will become less privacy-oriented, or maybe new laws start limiting mass aggregation of public data. In Europe, a Street View service as it is now may already face legal challenges, as in some countries here there are more restrictions on republishing pics taken in public.

[Thanks TomHTML!]

Googlebot Submitting Forms to Find More Pages

The deep web is a term referring to all the kinds of pages that are live on the web, but not indexed in search engines for some reason or other. For instance, traditionally search engines mostly follow links in HTML, but from what we know they don’t understand JavaScript (yet) or submit forms. Now Google announced that they have started to experiment with submitting forms for some “high quality sites” by entering words picked from the site into the form’s text boxes, and by selecting different radio buttons, select boxes or checkboxes. Then when the Googlebot determines that web pages found in the results to that submission are valid, “interesting” and unique, they may add them to their search results index.

Google notes that they only do this form submission for “GET” forms. A form using GET results in a parametrized URL like example.com/show?foo=bar. The guidelines for webmasters are that a GET request should never actually change data on the server, such as triggering a user registration; for such things, webmasters should use POST, which the Googlebot will not submit. Google also notes that they “omit any forms that have a password input or that use terms commonly associated with personal information such as logins, userids, contacts, etc.” Plus, Google says that pages they find will not reduce the PageRank of other pages on the site.
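To make the GET part a bit more concrete, here is a minimal Python sketch of how a form’s fields turn into parametrized URLs like example.com/show?foo=bar, and how a crawler might try different values for each field. This is only an illustration and not Google’s actual logic; the function name candidate_urls, the field names and the sample values are all made up.

```python
from itertools import product
from urllib.parse import urlencode


def candidate_urls(action, method, fields):
    """Return the URLs a crawler could request for a GET form.

    `fields` maps each input name to the list of values to try
    (hypothetical data; a real crawler would pick these itself).
    """
    if method.upper() != "GET":
        # POST forms may change data on the server, so they are skipped.
        return []
    names = list(fields)
    urls = []
    # Try every combination of the candidate values.
    for combo in product(*(fields[name] for name in names)):
        query = urlencode(list(zip(names, combo)))
        urls.append(f"{action}?{query}")
    return urls


urls = candidate_urls(
    "http://example.com/show",
    "get",
    {"foo": ["bar", "baz"], "sort": ["asc"]},
)
for url in urls:
    print(url)
```

Each printed URL can be fetched like any ordinary link, which is why a GET form is safe to crawl as long as the site follows the guideline that GET never changes anything on the server.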

With this move, Google digs a bit deeper than before which may result in more relevant results for searchers, and a smaller “deep web.” And if webmasters misconfigure their scripts or robots.txt files so their site goes against net standards, it may also result in a bit of new confusion for some. On the other hand, this move by Google also has the potential to help webmasters who have such misconfigurations, especially those who aren’t very knowledgeable about web accessibility or SEO, and who don’t put up crawlable links to all their sub-pages (and in reverse, if Googlebot continues to be smarter about what it crawls, in the long run some web developers may also see less incentive to remove small inaccessibilities on their site).

[Thanks Miss Universe! Sketch drawn by MMOArt.]

Update: A correction; Google does parse some JavaScript to find URLs, as TomHTML told me. Google’s Matt Cutts confirmed to me “Google has the ability to scan JS to discover some very clearly provided links.” Whether or not Google parses more complicated JS, Matt leaves open. He says, “Primarily we use it as a way to discover new links to possibly crawl.” [Thanks Tom and Matt!]

Thursday, April 10, 2008

Tip: How to Reorder iGoogle Tabs
By Tony Ruscoe

One thing that iGoogle doesn’t currently allow you to do is move your tabs around to change the order of them. A quick Google search returns hundreds of pages where iGoogle users are asking how to do this. Until this feature is officially added, here’s how you can do it yourself...

Yesterday, Google added a feature to iGoogle which allows you to export and import your settings.

Exporting your settings to your computer downloads an XML file which contains information about all of your tabs, gadgets and theme settings. (You can see what yours looks like here.)

If you’re familiar with XML, it should be quite obvious what you need to do. If not, simply follow these steps:

  1. Go to your iGoogle settings page, scroll down to the Export / Import section and click the “Export” button. (You’re going to edit this downloaded XML file, so I recommend making a copy just in case things go wrong!)
  2. Open the XML file in a text editor, such as Notepad, and look for the sections which start with <Tab title=. There should be a section like this for each of your iGoogle tabs. (If there are any you don’t recognize, they could be used by iGoogle for Mobile.)
  3. Find the section which corresponds to the tab you want to move and cut everything between <Tab title= and the next occurrence of </Tab> (including the tags themselves). Paste it either before or after the <Tab> section that you want it to appear next to.
  4. Now go back to your iGoogle settings page, browse for the file you’ve just edited and click the “Upload” button. Do not click the “Save” button!
  5. Once you see the “Import completed.” message below the upload field, go back to your iGoogle home page and you should see your tabs have changed order!

(If things didn’t go to plan, find the copy of your XML file and upload that, which should restore everything to the same as it was before.)
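If you’d rather not cut and paste by hand, the tab move from step 3 can also be sketched in a few lines of Python. Please treat this as a rough sketch only: the sample XML below is a simplified stand-in (the Settings root and the Gadget children are invented for illustration; only the <Tab title=...> sections come from the real export), and the helper name move_tab is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for an exported settings file.
SAMPLE = """<Settings>
  <Tab title="Home"><Gadget name="weather"/></Tab>
  <Tab title="News"><Gadget name="headlines"/></Tab>
  <Tab title="Fun"><Gadget name="comics"/></Tab>
</Settings>"""


def move_tab(root, title, new_position):
    """Move the <Tab> called `title` to `new_position` among its sibling tabs."""
    for parent in root.iter():
        tabs = parent.findall("Tab")
        match = next((t for t in tabs if t.get("title") == title), None)
        if match is None:
            continue
        tabs.remove(match)
        parent.remove(match)
        if new_position < len(tabs):
            # Insert right before the tab that currently holds that position.
            index = list(parent).index(tabs[new_position])
        elif tabs:
            # Past the end: insert right after the last remaining tab.
            index = list(parent).index(tabs[-1]) + 1
        else:
            index = 0
        parent.insert(index, match)
        return True
    return False


root = ET.fromstring(SAMPLE)
move_tab(root, "Fun", 0)
tab_order = [tab.get("title") for tab in root.findall("Tab")]
print(tab_order)  # ['Fun', 'Home', 'News']
```

For a real export you would load the file with ET.parse(), call move_tab() on tree.getroot() and save the result with tree.write(); as with the manual steps, keep a copy of the original file first.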

[Hat tip to Colin Colehour!]

Google 3D Warehouse Integrates Street View

The Google Maps Street View feature has now been added as a tab to Google’s 3D Warehouse, in case the model includes geo information. The Warehouse directory offers 3D models for Google SketchUp users and for integration into Google Earth. (These days, any other site can integrate Street View too, via the API.) Frank Taylor of the Google Earth Blog, who sent this in, says “this new tab is an excellent example of Google using their own tools in a smart way”.

In other recent Street View news, AP reported that Google was sued by a couple who claim Google’s panorama pics “violated their privacy, devalued their property and caused them mental suffering” (see the forum discussion).

[Thanks Frank and Brinke!]

Google China Shows Olympic Torch Run

Google Maps China has put up a mini site showing the location of the Olympic torch that’s currently being carried over the globe in preparation for the China Olympics. The mini site also links to Google Image search, mobile search, Google Earth, language tools, and coverage ov