Google Blogoscoped

Monday, April 18, 2005

Linking Strategies

The Search Engine Journal features a detailed discussion of how Google might interpret links based on Google’s link evaluation patent. Jim Hedger writes:

“Over the past week, SEOs and SEMs [Search Engine Optimizers and Search Engine Marketers] have noted some significant changes in the search engine results delivered by Google. Google appears to be actively cleaning its listings by targeting sites using suspicious link-building techniques. A couple of well-known search engine marketing sites have vanished from Google results under keyword phrases they dominated just last week.

The sudden disappearance of these sites, along with a notable difference in search results under other highly competitive phrases has led many in the SEO/SEM industry to conclude Google has implemented some of the spam-link busting filters outlined in their 63-point patent document published two weeks ago.”

Of course, links are still the most important ranking indicator for Google. Working on them is commonly called “off-page” optimization, as opposed to on-page optimization such as meaningful titles and headlines. Because this is so well known, Google needs to defend against link-farms or similar schemes which artificially boost pages. Artificially ranked result pages for a given search query are annoying to the user and can be considered “search engine spam.” A search engine has a strong commercial interest in fighting such spam because it needs to stay popular, just as spammers have a strong commercial interest in targeting the most popular search engine.

What does Google do to defend itself? In short, Google is looking at how real, non-optimized links tend to appear. The more artificial a link appears, the more likely it is to be spam. Google can never tell for sure, but they can try to balance things out. As with a spam filter, catching everything would mean getting rid of some of the good guys as well; being too lax in order to spare the good guys, on the other hand, risks letting in too much spam. Spammy search results could mean the sudden death of a search engine – people may leave for another search engine and never come back.

What makes a link “real” and what “unreal” (thus spammy)? For example – and this is not to say Google implements it, because outsiders cannot know – a search engine could lower the rank of pages which link to pages that link back to them in return. This can be a link exchange among friends, a link farm of one person or group, or two bloggers in a discussion. Determining who’s still on the good side would come down to looking at overall linking behavior. Surely, a blogger wouldn’t link only to sites which contain a backlink; a real blog also links to plenty of pages that never link back, which makes for a more balanced linking behavior.
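To make the idea of “balanced linking behavior” a bit more concrete, here is a minimal sketch in Python. It is purely my own illustration and not anything Google is known to run: the function name reciprocal_ratio, the toy link graph, and the idea of using a simple ratio are all assumptions. It computes which share of a page’s outgoing links point to pages that link back.

```python
def reciprocal_ratio(links, page):
    """Share of a page's outgoing links whose target links back to the page.

    `links` maps each page to the set of pages it links to.
    This is an illustrative measure only, not a known Google signal.
    """
    outbound = links.get(page, set())
    if not outbound:
        return 0.0
    reciprocal = {target for target in outbound
                  if page in links.get(target, set())}
    return len(reciprocal) / len(outbound)


# Hypothetical toy graph: a blog with one link-exchanging friend,
# and a small farm whose pages only link to each other.
links = {
    "blog": {"news", "wiki", "friend"},
    "friend": {"blog"},
    "farm1": {"farm2", "farm3"},
    "farm2": {"farm1"},
    "farm3": {"farm1"},
}

blog_ratio = reciprocal_ratio(links, "blog")    # one of three links is returned
farm_ratio = reciprocal_ratio(links, "farm1")   # every link is returned
```

A blogger who trades a link with a friend still ends up with a low ratio, while the farm page sits at the maximum. Where a search engine would draw the line, if it uses anything like this at all, is anyone’s guess.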

Mechanisms such as trackback and referrer listings (the function of a blog to automatically include links to backlinking pages) would be a potentially poisonous ingredient in this mix, because they lean towards the side of “incestuous” link-farms. The recent rel="nofollow" attribute may be a good practice in this case, as it ensures the search engine turns a blind eye to this link.
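For blog software that prints trackback or referrer links automatically, adding the attribute is a small change. The following is only a sketch under my own assumptions: the helper name render_trackback_link is hypothetical, and your blog tool will have its own templating.

```python
from html import escape


def render_trackback_link(url, title):
    """Render an automatically generated backlink with rel="nofollow".

    Hypothetical helper for illustration; both values are HTML-escaped
    because trackback data comes from other people's servers.
    """
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(url, quote=True), escape(title)
    )


html_link = render_trackback_link("http://example.com/?a=1&b=2", "A & B")
```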

Another example: a search engine may also decrease the value of a page if its links appear only within a closed set of pages. This is your basic link-farm – it is closely connected in itself, but practically unconnected to the rest of the web (sites like Slashdot wouldn’t link to a link-farm). On my own sites, even though I don’t think I risk getting the Google-axe, I avoid inter-linking; if my site A is unrelated to the topic of my site B, they don’t need to be connected just because I have the power to put my links on them. Jim Hedger on such interconnected sites writes:

“If you or someone you know has been engaged in a link-building plan that relies on link trading between multiple sites that don’t actually relate to or do business with each other, you might want to take a few hours to examine your link-building strategies.”
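The “closed set” idea can be sketched in a few lines of Python as well. Again, this is my own illustration with assumed names (external_inlink_share, the toy graph), not a description of Google’s method: for a group of pages, it measures which share of the links pointing into the group comes from outside of it.

```python
def external_inlink_share(links, group):
    """Share of links into `group` that originate outside the group.

    `links` maps each page to the set of pages it links to.
    A value near 0 means the group mostly links to itself.
    Illustrative only; not a known Google signal.
    """
    internal = 0
    external = 0
    for source, targets in links.items():
        for target in targets:
            if target in group:
                if source in group:
                    internal += 1
                else:
                    external += 1
    total = internal + external
    return external / total if total else 0.0


# Hypothetical toy graph: three farm pages linking only to each other,
# and one blog that receives a link from an unrelated site.
links = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "slashdot": {"blog_x"},
    "blog_x": {"slashdot"},
}

farm_share = external_inlink_share(links, {"a", "b", "c"})  # no outside links
blog_share = external_inlink_share(links, {"blog_x"})       # only outside links
```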

Lately, Google also seems to be looking closely at the date of pages, the date of links, and the relation of these two dates. Or so it seems from their patent. The dynamics in which new links appear and disappear in turn reflect upon a page’s value. A page that grows its links over the course of a year may be a real community effort, with real backlinks. A page getting 10,000 backlinks in a single day, and then none for a year, may hint at a link-farm effort with no real community behind it. It’s hard to tell exactly what Google is doing, but look at what they write:

“[0039] Consider the example of a document with an inception date of yesterday that is referenced by 10 back links. This document may be scored higher by search engine 125 than a document with an inception date of 10 years ago that is referenced by 100 back links because the rate of link growth for the former is relatively higher than the latter. While a spiky rate of growth in the number of back links may be a factor used by search engine 125 to score documents, it may also signal an attempt to spam search engine 125. Accordingly, in this situation, search engine 125 may actually lower the score of a document(s) to reduce the effect of spamming.”


“[0077] The dates that links appear can also be used to detect “spam,” where owners of documents or their colleagues create links to their own document for the purpose of boosting the score assigned by a search engine. A typical, “legitimate” document attracts back links slowly.”
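The patent does not say how a “spiky” rate of growth would be measured. One naive way to picture it, and this is entirely my assumption, is to ask which share of all backlinks first appeared on the single busiest day. The function name peak_day_share and the sample dates below are made up for illustration.

```python
from collections import Counter
from datetime import date, timedelta


def peak_day_share(link_dates):
    """Share of all backlinks that first appeared on the busiest single day.

    `link_dates` holds one date per backlink (the day it was first seen).
    Illustrative only; the patent does not specify any such formula.
    """
    if not link_dates:
        return 0.0
    per_day = Counter(link_dates)
    return max(per_day.values()) / len(link_dates)


# One new backlink per week for a year: slow, steady growth.
steady = [date(2004, 1, 1) + timedelta(days=7 * i) for i in range(52)]

# 10,000 backlinks on a single day and nothing afterwards.
burst = [date(2004, 6, 1)] * 10000

steady_share = peak_day_share(steady)
burst_share = peak_day_share(burst)
```

The steady page ends up with a tiny share and the burst page with the maximum. A real system would have to be more forgiving, since a page that gets mentioned on a popular site can collect a legitimate spike too.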

Link text is important as well. It may be unnatural if a site only gets backlinks containing the same link text. Why? Because different people would naturally use some variation in their texts. Only a spam farm would try to artificially boost their ranking for a phrase by repeating it over and over. Paid links may employ the same approach.
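How much variation is “natural” is not something outsiders can know, but the basic measurement is easy to sketch. The Python below is an assumption-laden illustration (the name top_anchor_share and the sample texts are mine): it reports which share of a page’s backlinks use the single most common link text, ignoring case and surrounding whitespace.

```python
from collections import Counter


def top_anchor_share(anchor_texts):
    """Share of backlinks that use the single most common link text.

    Texts are compared case-insensitively with surrounding whitespace
    removed. Illustrative only; not a known Google signal.
    """
    normalized = [text.strip().lower() for text in anchor_texts if text.strip()]
    if not normalized:
        return 0.0
    counts = Counter(normalized)
    return counts.most_common(1)[0][1] / len(normalized)


# Hypothetical samples: varied texts from real readers versus
# the same key-phrase repeated by a link-farm.
natural = ["Google Blogoscoped", "this post", "a weblog", "blogoscoped.com", "here"]
farmed = ["cheap widgets"] * 9 + ["Cheap Widgets "]

natural_share = top_anchor_share(natural)
farmed_share = top_anchor_share(farmed)
```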

Of course, a counter-measure is relatively easy for the case of too-static anchor text: simply randomize your backlink text to more or less (but not exactly) target your key-phrase. Link-farmers may always be able to take counter-measures, to which Google reacts in turn. This can be considered a costly game of good timing and intuition. On the other hand, there’s one methodology with which you’re almost always on the safe side: simply do not think about Google, avoid trying to boost your rankings, and rely on great content alone. Great content will always attract real links, and you don’t even need to figure out what makes them “real.” Google has a lot of PhDs doing this job for you.


