Thursday, September 13, 2007
Google Analytics Was Partly Showing Wrong Absolute Visitors
Google’s web statistics service, Analytics, was partly showing the wrong value for “Absolute Unique Visitors,” Google said in a statement. The bug is fixed now, Google says, and it was restricted to the Absolute Unique Visitors details report page; the value shown on the Visitors overview page was not affected. On the details page, Google previously made the error of adding up the daily unique visitors. The sum of those daily values, however, is not the Absolute Unique Visitors value, because the sum counts a repeat visitor once for every day they show up. A sample calculation I made for one of my sites showed the wrong value to be off by a factor of around 1.6 – the old Analytics would have shown around 95,000 non-existing visitors on the Absolute Unique Visitors details page for the last month!
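To make the difference concrete, here is a minimal Python sketch of the two ways of counting. The dates and visitor names are made up for illustration, and the two function names are my own; this is not how Analytics is implemented, just the arithmetic behind the bug as described above.

```python
# Hypothetical data: for each day, the set of visitor IDs (e.g. cookie IDs) seen.
daily_visitors = {
    "2007-09-01": {"alice", "bob", "carol"},
    "2007-09-02": {"alice", "bob"},
    "2007-09-03": {"alice", "dave"},
}

def summed_daily_uniques(days):
    """The buggy approach: add up each day's unique visitor count."""
    return sum(len(ids) for ids in days.values())

def absolute_uniques(days):
    """The intended value: distinct visitors over the whole time frame."""
    return len(set().union(*days.values()))

wrong = summed_daily_uniques(daily_visitors)  # 3 + 2 + 2 = 7
right = absolute_uniques(daily_visitors)      # alice, bob, carol, dave = 4
print(wrong, right, wrong / right)            # prints: 7 4 1.75
```

The more loyal your visitors are, the further the two numbers drift apart, since every returning visitor is counted again on each day they come back.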
This case illustrates that even Google gets math wrong sometimes, but the confusion among web statistics providers and those who read the stats doesn’t stop with this bug fix, unfortunately. It’s helpful to clarify some terminology:
- A hit is any file on the server that is requested in the browser. This value is mostly useless to track a site’s success because it also includes image files, CSS files and so on.
- A page view groups all hits into a single request, and is thus a much more useful value. Its basic meaning is mostly “someone (or something, if you include bots) requested an HTML page”. (There is already some ambiguity with this term; e.g. do you include RSS as a page view? What about Ajax or Flash applications, which usually don’t reload the HTML but just request e.g. an XML file from the server?)
- A visit, on the other hand, groups all page views of a session into a single count. Usually, half an hour of inactivity is used to “time out” a session, meaning that if the same IP visits in the morning and in the evening but not in between, it counts as two visits (also sometimes called “unique visits,” even though it’s potentially the same person). However, the methodology starts to vary strongly with the term “visit.” My server provider 1and1, for instance, calls this session a “visitor,” not a visit, in their stats interface*. Google Analytics, on the other hand, never calls it a visitor.
It’s a bit like asking “how many cars drove through this street today?” You can just add 1 for each car you see passing by. But you can also exclude cars you’ve already seen, in case someone is driving back home after work on the same day, passing through the same street.
- A visitor is supposed to group all visits by the same person into a single count over a specific time frame. As mentioned above, there is confusion between the two terms. That’s why Google Analytics uses the term “Absolute Unique Visitor,” so that it isn’t confused with the “visitor” value of certain other packages; the absolute unique visitor count is usually much lower.
As an example, say your site gets 10,000 visitors every day, and they all loyally visit once a day for 5 minutes. In a month of 30 days, you would have 300,000 visits. According to some statistics packages, you would also have 300,000 “visitors,” though that is a slightly imprecise interpretation. According to Google Analytics, however (aforementioned bug excluded!), you would have 10,000 absolute unique visitors in that month... because Google is sending out a cookie to know who’s been on the site before in that month. But many statistics packages working straight from the log files do not have the luxury of this cookie, so they need to interpret the IP. Since IPs are often handed out dynamically by providers, and some proxies put different users behind the same IP, this value is vaguer, hence the 30-minute session time-out as a sort of middle ground.
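Here is a rough Python sketch of how visits and visitors could be counted from a list of page views using such a time-out. Everything in it is an assumption for illustration: the `count_visits` function, the visitor keys (which could be a cookie ID or an IP, depending on the package) and the choice that a gap of more than 30 minutes starts a new visit. Real packages differ in these details, which is exactly the problem.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity time-out

def count_visits(page_views, timeout=SESSION_TIMEOUT):
    """page_views: iterable of (visitor_key, timestamp) pairs.

    Returns (visits, unique_visitors) for the whole time frame.
    """
    last_seen = {}  # visitor_key -> timestamp of their latest page view
    visits = 0
    for key, ts in sorted(page_views, key=lambda pv: pv[1]):
        previous = last_seen.get(key)
        # First page view ever, or inactive for longer than the time-out:
        # this page view opens a new visit.
        if previous is None or ts - previous > timeout:
            visits += 1
        last_seen[key] = ts
    return visits, len(last_seen)

# Same visitor in the morning and evening, plus one other visitor.
sample = [
    ("10.0.0.1", datetime(2007, 9, 13, 9, 0)),
    ("10.0.0.1", datetime(2007, 9, 13, 9, 5)),   # same visit, 5 minutes later
    ("10.0.0.1", datetime(2007, 9, 13, 20, 0)),  # new visit in the evening
    ("10.0.0.2", datetime(2007, 9, 13, 12, 0)),
]
print(count_visits(sample))  # prints: (3, 2)

# Scaled-down version of the example: 10 loyal visitors, once a day, 30 days.
loyal = [
    (f"visitor-{v}", datetime(2007, 9, day, 12, 0))
    for day in range(1, 31)
    for v in range(10)
]
print(count_visits(loyal))  # prints: (300, 10)
```

The second result is the 300,000 versus 10,000 gap from the example above at a smaller scale: 300 visits, but only 10 visitors.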
Now, which value you should look at or hand out depends on what you want to measure: “success,” branding, ad views? Or the chance a given person clicks on an ad? Or the chance your server goes down due to traffic? Or something else entirely?
Visits in Google Analytics are one of the best all-purpose measurements, and then you can also look at the average time spent on the site (an important value, for instance, to a video site which may show commercials alongside the video at specific intervals), and at page views. The Absolute Unique Visitor number, on the other hand, won’t tell you anything about visitor loyalty; to this number, 10,000 people visiting on Monday only is the same as 10,000 loyal people visiting every day from Monday to Friday. That number is also especially hard to compare with the numbers of some other stats packages, and it certainly won’t tell you about server traffic. It’s an interesting value, but handle it with care, and know that you can never sum up the values of the individual time frames to get the overall absolute unique visitors number (the error Google themselves made).
In any case, when you compare values from different statistics providers, make sure you compare the same thing – and if you’re asked about your statistics, also make sure the person who asks doesn’t compare your numbers with numbers from other statistics packages, unless you can be sure the two stats “mean” the same thing... especially when it comes to the terms “visit” and “visitor.” And as the above news of Google’s bug fix shows, sometimes you even need to make sure you’re not comparing numbers from a recent version of the statistics software with those from an older version of it.