Analytics by Log Files

In my Web Analytics class, we’re beginning to analyze Apache log files to extract analytics data. Today, I pulled down a raw access log from this site to see what I could learn. I also have AWStats running to build reports on server access. As I’ve been digging through my access log, I’ve noticed that comment spammers account for a large portion of the requests to my server.

I have found that comment spambots hit a page on my blog, scrape it for the comment form, and then post spam comments to the form’s target. According to AWStats, close to 50% of the requests to my site come from unknown operating systems. This leads me to believe that about 50% of my access log data is pollution from spambots.
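To check that hunch against the raw log rather than the AWStats summary, something like the following Python sketch could count how many requests are POSTs to the comment form. It assumes the log is in Apache's "combined" format, and the `COMMENT_PATH` value and the function names are hypothetical placeholders, not anything taken from my actual setup.

```python
import re
from collections import Counter

# Apache "combined" log format. This is an assumption; a server may be
# configured with a different LogFormat, in which case the pattern must change.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical comment-form target; replace with the path the form really posts to.
COMMENT_PATH = "/wp-comments-post.php"


def summarize(lines):
    """Count parsed requests, unparsed lines, and POSTs to the comment form."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            # Lines that do not fit the assumed format are tallied, not dropped silently.
            counts["unparsed"] += 1
            continue
        counts["total"] += 1
        # The request field looks like: METHOD PATH PROTOCOL
        parts = match.group("request").split()
        if len(parts) >= 2 and parts[0] == "POST" and parts[1].startswith(COMMENT_PATH):
            counts["comment_posts"] += 1
    return counts


def comment_post_share(counts):
    """Fraction of parsed requests that were POSTs to the comment form."""
    total = counts["total"]
    return counts["comment_posts"] / total if total else 0.0
```

Calling `summarize(open("access.log"))` (the file name is a placeholder) and passing the result to `comment_post_share` gives a rough lower bound on spambot traffic, since it only counts the final POST and not the page fetch that precedes it.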

Luckily, spambots don’t usually download and execute my Google Analytics JavaScript the way a normal browser does, so the Google Analytics data is cleaner.
