Death of the web server log analyzer

Ever since I set up my first web server, somewhere around the year 2000, I’ve been running tools to analyse my logs daily and generate visit statistics such as referrers, browser versions, screen resolution, and other useful data. I have tried several different tools since then, but the one I’ve been using for most of that time is AWStats.

I’ve usually preferred that kind of tool over analytics systems like, say, Google Analytics, simply because it’s as close to the metal as you can be when analyzing web access. It doesn’t require JavaScript support on the client side, or any setup on your web pages. It therefore detects all visits to a website, regardless of the client’s technology or the page visited (it includes access to links that don’t exist, and detects common 404 URLs, for example).

Today I was having a look at the statistics of one of my websites to try and make sense of how many mobile users I had, and their operating system breakdown. This is what I had for OS under my (newly updated) AWStats:

OS breakdown by AWStats

This is not very helpful; it doesn’t list any mobile OS (it is probably folding them under their original desktop counterparts, or under “Unknown”).

The browser statistics fare a little bit better:

Browser breakdown by AWStats

But it is still a little awkward to read, and the “IPhone (PDA/Phone browser)” statistics don’t seem very accurate either.

I had a look at Google Analytics instead, and this is what I found for OS statistics:

OS breakdown by Google Analytics

And for browsers:

Browser breakdown by Google Analytics

Although the date range used in the statistics above is different, the lists seem more accurate (at least for what Google Analytics has access to, which is my main WordPress install), and they show all modern OS and browser variants correctly. And, of course, I also have advanced parameters I can add to my lists (like OS version, or even device types) that are not pictured here.

What happened?

It would be unfair of me to say AWStats is just a bad, or outdated, product. It is still a leading log analysis tool. But the truth is that its results don’t come close to the ones produced by a snippet-requiring, JavaScript-component-based tool like Google Analytics.

Despite being closer to the metal, it seems to me that log analysis tools are at a disadvantage simply because they don’t have as much access to the client machine as a JavaScript-based solution does. Maybe what’s at fault is their reliance on nearly useless user agent strings. While I do believe AWStats could do better in that regard (I believe it’s detecting a lot of false browser matches), it’s also true that log analyzers can’t run a small piece of JavaScript for better browser type and version detection, which, I’m assuming, Google Analytics does for increased accuracy.
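To illustrate why user agent strings are so easy to misread, here is a minimal Python sketch. It is not how AWStats actually works (I haven’t checked its matching rules); the function names and rule lists are hypothetical. It only shows how a matcher that looks for desktop tokens first would fold mobile visits under desktop systems, since a typical iPhone user agent contains “Mac OS X” and a typical Android one contains “Linux”.

```python
import re

# Two commonly seen user agent strings, used here only as sample input.
IPHONE_UA = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) "
    "AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 "
    "Mobile/10A5376e Safari/8536.25"
)
ANDROID_UA = (
    "Mozilla/5.0 (Linux; Android 4.0.4; Galaxy Nexus Build/IMM76B) "
    "AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.133 "
    "Mobile Safari/535.19"
)


def naive_classify_os(user_agent):
    """Hypothetical desktop-first matcher: looks only for desktop tokens."""
    for name in ("Windows", "Mac OS X", "Linux"):
        if name in user_agent:
            return name
    return "Unknown"


# Ordered rules: mobile tokens are checked before desktop ones, because
# mobile user agents also carry desktop tokens.
OS_RULES = [
    ("iOS", re.compile(r"iPhone|iPad|iPod")),
    ("Android", re.compile(r"Android")),
    ("Windows", re.compile(r"Windows NT")),
    ("Mac OS X", re.compile(r"Mac OS X|Macintosh")),
    ("Linux", re.compile(r"Linux|X11")),
]


def classify_os(user_agent):
    """Return the first OS whose pattern matches, or "Unknown"."""
    for name, pattern in OS_RULES:
        if pattern.search(user_agent):
            return name
    return "Unknown"


if __name__ == "__main__":
    for ua in (IPHONE_UA, ANDROID_UA):
        print(naive_classify_os(ua), "vs", classify_os(ua))
```

Even the ordered version is only as good as its rule list, which has to be kept up to date by hand. That is the kind of maintenance a log analyzer depends on.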

Log analysis tools still have their place for detecting, say, 404s or direct links to raw assets (like images) coming from third-party domains. But for page access and audience analysis, it seems their days are long gone. They died a while ago, but I suppose I only noticed it now.
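For the cases where logs are still useful, a short script goes a long way. The sketch below is only an illustration and makes some assumptions: it expects the Apache/Nginx “combined” log format, the `summarize` function and the extension list are hypothetical names of my own, and the sample lines are made up. It counts 404 paths and image requests whose referrer is a third-party domain.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumed input: the "combined" access log format.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical list of extensions treated as raw image assets.
ASSET_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif")


def summarize(lines, own_domain):
    """Count 404 paths and image hits referred from other domains."""
    not_found = Counter()
    hotlinks = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        path = match["path"]
        if match["status"] == "404":
            not_found[path] += 1
        # An empty or "-" referrer has no hostname and is ignored.
        ref_host = urlparse(match["referrer"]).hostname
        is_asset = path.lower().split("?")[0].endswith(ASSET_EXTENSIONS)
        is_external = (
            ref_host is not None
            and ref_host != own_domain
            and not ref_host.endswith("." + own_domain)
        )
        if is_asset and is_external:
            hotlinks[(ref_host, path)] += 1
    return not_found, hotlinks


# Made-up sample lines, for illustration only.
SAMPLE = [
    '203.0.113.5 - - [10/Oct/2012:13:55:36 -0700] "GET /images/photo.jpg HTTP/1.1" '
    '200 2326 "http://forum.example.net/thread/1" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2012:13:56:01 -0700] "GET /old-page HTTP/1.1" '
    '404 512 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2012:13:56:07 -0700] "GET /images/photo.jpg HTTP/1.1" '
    '200 2326 "http://www.example.com/post" "Mozilla/5.0"',
]

if __name__ == "__main__":
    missing, linked = summarize(SAMPLE, "example.com")
    print(missing.most_common(10))
    print(linked.most_common(10))
```

None of this needs anything from the client side, which is why this sort of report remains a good fit for plain log analysis.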

  • It should not be that hard to figure out how to extract an operating system report from your local HTTP GET log.

    • Zeh

      That’s what AWStats does.

  • Don

    My hosted accounts for clients come with AWStats and I still send them PDFs of the monthly results, as they look more impressive than GA or StatCounter. Shame.

  • Art

    I too dislike putting JavaScript on my pages. My favorite log file analyzer is Serlog. You might give it a try.