Google Blogoscoped

Wednesday, January 19, 2005

Nofollow Just for Spam?

Some people suggest using nofollow exclusively for links that users control from the outside. This would include guestbooks, forum signatures, blog comments, referrer listings, or a blog's trackback function. Possibly, this also includes wikis. However, there are apparently other needs for nofollow links, and they may include links you create yourself on your own web space (like your blog).

For example, from every post here I link to the discussion forum, and lately I've added a parameter to that link to automatically fill out the subject line of a new thread in the forum. So the main URL for the forum looks like this:

A link from a specific post, on the other hand, looks like this:

Because I do not want search engines to think those are pointers to unique pages, I'm inserting a "noindex, nofollow, noarchive" meta-tag into the target HTML whenever the post_id parameter is set. Still, the problem of link weight may remain: instead of my blog homepage having about 50 unique links, it now has 100 or more. In other words, I may want to simply subtract these URLs from the link-weight calculation Google performs; this would keep the page leaner, and other links would receive more weight (specifically, I want to put "weight" on links I discuss in my blog posts, because they are the most important ones).
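To make the meta-tag rule concrete, here is a minimal Python sketch of the decision described above: emit the robots meta-tag only when the requested URL carries a post_id parameter. The function name `robots_meta_for` and the `forum.example` URLs are made up for illustration; this is not the forum's actual code, just one way the check might look.

```python
from urllib.parse import urlparse, parse_qs

# The meta-tag described in the post, written out as an HTML string.
ROBOTS_META = '<meta name="robots" content="noindex, nofollow, noarchive">'


def robots_meta_for(url: str) -> str:
    """Return the robots meta-tag if the URL has a post_id parameter, else ''.

    Hypothetical helper: the real forum script is not shown in the post.
    """
    # keep_blank_values=True so that "?post_id=" still counts as "set".
    query = parse_qs(urlparse(url).query, keep_blank_values=True)
    return ROBOTS_META if "post_id" in query else ""


if __name__ == "__main__":
    # Placeholder URLs, not the real forum addresses.
    print(repr(robots_meta_for("http://forum.example/")))
    print(repr(robots_meta_for("http://forum.example/?post_id=123")))
```

Note that this only keeps the parameterized pages out of the index; it does nothing about the extra links on the linking page, which is the remaining problem discussed above.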

Thinking ahead with semantic correctness

As is true for most other HTML elements, we should always try to think of elements and attributes in terms of what they mean as opposed to what effect they have. Why this separation? Because a single meaning can have a multitude of different specific effects. An <h1> element may be rendered big and bold; it may be used to give more weight to the keywords contained within it; it may be attached to a style-sheet; a text-to-speech engine may read it louder; it may be used to automatically create a table-of-contents; and so on.

To cover all the possible element or attribute uses (some of which are yet unknown because they need to be invented in future tools) we'd have to make a big mess out of HTML – like <div class="heading"><big><loud> <b><important>My Title</important></b> </big></div>.
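As a rough illustration of one meaning serving several effects, the Python sketch below reads the same <h1> markup once and hands it to two unrelated consumers: a table-of-contents builder and a keyword extractor. The class and function names are invented for this example, and both "tools" are deliberately simplistic stand-ins for whatever a real browser, search engine, or speech engine would do.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text content of every <h1> element in a document."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._in_h1 = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
            self._buffer = []

    def handle_data(self, data):
        # Text inside nested tags (e.g. <b>) still belongs to the heading.
        if self._in_h1:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            self.headings.append("".join(self._buffer).strip())
            self._in_h1 = False


def collect_headings(html):
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return collector.headings


def table_of_contents(html):
    """Effect 1: a numbered table of contents built from the headings."""
    return [f"{i}. {text}" for i, text in enumerate(collect_headings(html), 1)]


def heading_keywords(html):
    """Effect 2: the set of lowercase words a tool might weight more heavily."""
    return {word.lower() for text in collect_headings(html) for word in text.split()}


if __name__ == "__main__":
    page = "<h1>My Title</h1><p>Some text.</p><h1>Second <b>Part</b></h1>"
    print(table_of_contents(page))
    print(sorted(heading_keywords(page)))
```

Neither consumer needed the markup to say "build a list from me" or "weight me"; both simply relied on the element meaning "heading".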

Naming conventions: avoid verbs

In this context, having an attribute value be a verb is actually suboptimal; the verb describes one possible effect. It's a call to action, but it fails to adapt to multiple situations in need of a different action. For these reasons HTML recommendations are careful to avoid verbs (events like "onclick" are a noteworthy exception). Why doesn't everyone else keep to this separation of meaning and effect? Well, for one thing, it's harder to think in abstractions; saying "this is a headline, and a headline is big and bold" is more abstract than saying "this is big and bold." Just the same, saying "don't follow this link" is easier than e.g. "this link has been posted from the outside and is not yet approved, and unapproved links should not be followed."

Of course, the "nofollow" value has been around before and is used in meta-tags of an HTML file's <head>-section, so it's good we re-use this value. Still, we need to be careful not to over-simplify this new "HTML standard" and say: "Basically, nofollow is a way to tell Google and others 'This is spam'." It is true we have to start guessing now at just what this means, because "nofollow" is the symptom of an undefined cause. But I would rephrase the explanation to simply say: "Nofollow is a way to tell Google and others to completely ignore that this is a link." It's more to the point and opens up a larger array of effects for the new nofollow. As a first additional use, it can now be applied to link weight too, as a way to restructure parts of your internal linking.
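To picture what "completely ignore that this is a link" could mean for a tool, here is a small Python sketch that splits a page's links into those that count and those marked rel="nofollow". The names `LinkCollector` and `split_links` and the example URLs are hypothetical, and nothing here reflects how Google actually computes link weight; it only shows the filtering step a link-counting tool might apply.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Separate <a href> links into counted links and nofollow links."""

    def __init__(self):
        super().__init__()
        self.counted = []
        self.ignored = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel holds space-separated tokens; a missing rel becomes an empty list.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.ignored.append(href)
        else:
            self.counted.append(href)


def split_links(html):
    """Return (counted, ignored) lists of hrefs found in the HTML."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.counted, collector.ignored


if __name__ == "__main__":
    # Placeholder markup: one link discussed in a post, one forum link.
    page = (
        '<a href="http://site.example/article">An article I discuss</a> '
        '<a href="http://forum.example/?post_id=123" rel="nofollow">Discuss</a>'
    )
    counted, ignored = split_links(page)
    print(len(counted), "links counted;", len(ignored), "ignored")
```

With the forum links filtered out this way, only the links discussed in the posts would remain in whatever calculation follows.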



