Google Blogoscoped

Wednesday, April 20, 2005

Steven Pemberton and XHTML 2

Yesterday I had the chance to meet Steven Pemberton, who gave an enlightening talk on XHTML 2 and XForms in Sankt Augustin, Germany. Steven is the chair of the HTML and Forms Working Groups at the W3C, the World Wide Web consortium which defines the cross-browser standards used by web developers. In the 80s, he developed a browser-like system which included extensible markup, stylesheets, vector graphics, client-side scripting, and “everything you would recognise as the web now (though it didn’t run over TCP/IP).”

As for XHTML 2, which is not finalized yet, we learned it will be a lot more streamlined than the relatively backwards-compatible XHTML 1. Many of the specific tags, like the “img” element, are replaced by attributes – such as this (the element’s content now doubles as the “alt” text):

<p src="images/some-image.gif">Our sales increased by 4 percent this year.</p>

Also, there are now suggestions which would embed RSS and similar formats right in the HTML, as I’ve suggested before here. The role attribute could attach an “rss:title” to a heading. Headings, by the way, are now simply called “h”, as their level is calculated from the number of “section” elements surrounding them. Very nice indeed.
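A document outline in XHTML 2 might then look roughly like this (a sketch based on the working draft – the “rss:title” role and the content are my own invented example):

<section>
  <h role="rss:title">Our Sales</h>
  <section>
    <h>Quarterly Figures</h>
    <p>Sales increased by 4 percent this year.</p>
  </section>
</section>

The inner “h” counts as a second-level heading simply because it sits inside two nested “section” elements – no more juggling h1 through h6 by hand.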

XForms, in the meantime, adds another abstraction layer (which makes it a more useful, but also more complicated, alternative to common HTML forms): a data/intent model. This model takes the values out of the form, determines their data structure, and attaches new XML fragments in Ajax/XMLHTTP fashion. The user agent – a browser like Opera – needs to implement some XPath, and all is settled. What comes as a great bonus is true media independence; the same XForm could be accessed via a speech interface as well as a screen application. (You could order a pizza using your mobile phone or your Windows mouse.) XForms is actually taking off faster than expected, Steven Pemberton said. As for the other recommendations, Steven remarked that the web crowd is expected to take around 5 years to adopt the standards.
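To give an idea of that separation, a minimal XForms fragment could look something like this (a hedged sketch only – the instance data and the pizza-order scenario are invented, though model, instance, submission and input are core XForms elements):

<xforms:model>
  <xforms:instance>
    <order><size/><topping/></order>
  </xforms:instance>
  <xforms:submission id="order-submit" action="http://example.com/order" method="post"/>
</xforms:model>
...
<xforms:input ref="size">
  <xforms:label>Pizza size</xforms:label>
</xforms:input>
<xforms:submit submission="order-submit">
  <xforms:label>Order</xforms:label>
</xforms:submit>

The form controls reference the data purely via XPath (the “ref” attribute), so a voice browser could render the very same form as spoken prompts while a desktop browser shows text fields.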

Steven also talked about RDF, the biggest competitor to the lower-case semantic web Google is introducing. Google, after all, manages to extract content from “lazy everyday web writing” with no attached meta-data (this is an address, this document is about Tony Blair, and so on – just think of how Google AutoLink recognizes addresses, the power of Google Q&A, or of how Google attaches keywords to a page which are found only in links pointing to that page).

During the first and second break I bombarded Steven with my questions, especially on some outstanding Google issues. The rel="nofollow" attribute actually convinced Steven to write a long letter to Google Inc, a W3C member. He criticizes that they didn’t ask the consortium for advice on link spam and potential wordings of HTML extensions. Nofollow is one of those attribute values that, had someone spent 30 seconds instead of 15 thinking about it, would have turned out much better. While Google certainly reached some short-term goals with their suggested implementation, and while they also complied with existing robots.txt and meta-tag traditions, they didn’t think about good web standards; ideally, elements and attributes describe document structures and relationships only, allowing for a multitude of “context-sensitive” implementations. Therefore, as Steven added, “nofollow” (a verb, after all, with a single use-case) would have fared better had it been called “unrelated.”
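In markup, the difference would be a single word (the second line uses the hypothetical “unrelated” value from Steven’s suggestion; the URL is invented):

<a href="http://example.com/" rel="nofollow">comment link</a>
<a href="http://example.com/" rel="unrelated">comment link</a>

The first spells out one crawler instruction; the second describes a relationship – “this link is unrelated to my page’s content” – and leaves it to each user agent, not just Google’s crawler, to decide what to do with that information.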

Thanks to Lars Kasper for the photo from the tutorial – he has more available.

