Lies, damn lies and Web statistics
A brief introduction to using and abusing web statistics
Paul Smith, ILRT
July 2006
Overview
• Some key terms explained
• Log Analysers
• What you can find out
• What you can’t find out
• Cache for questions
• Trackers / counters
• Further reading
Key terms
• Log file
• IP address
• Hit
• Visitor / visit / user session
• Page request / view
• Referrer
• Cache server
• Proxy server
Log Analysers
• A few examples:
  • from Google Directory listing
• What we use in ILRT:
  • UNIX scripts / tools
  • Analog [www.analog.cx]
  • AWStats [awstats.sourceforge.net]
  • WebTrends [www.webtrends.com]
What you can find out
• Number of requests made to your server
• When they were made
• Which files were asked for
• Which host asked you for them
What you can find out (cont’d)
• What people told you their browsers were
• What the referring pages were
You should be aware, though, that:
• Many browsers deliberately lie about what they are
• Users can configure the browser name
• Some people use "anonymisers", which deliberately send false browser names and referrers
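The fields listed on the two slides above can be read straight out of a single log line. The following is a minimal sketch, assuming the server writes the common NCSA "combined" log format; the sample line, the host address and the helper name parse_line are invented for illustration and are not taken from any real log.

```python
import re
from datetime import datetime

# Assumed layout (NCSA combined format):
# host ident user [time] "request" status bytes "referrer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # When the request was made
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    # Which file was asked for (second token of the request line)
    parts = record["request"].split()
    record["path"] = parts[1] if len(parts) >= 2 else None
    return record

# Hypothetical sample line, not from a real server
SAMPLE = ('192.0.2.10 - - [10/Jul/2006:14:03:21 +0100] "GET /index.html HTTP/1.1" '
          '200 5120 "http://www.example.org/links.html" "Mozilla/4.0 (compatible; MSIE 6.0)"')

record = parse_line(SAMPLE)
print(record["host"], record["time"], record["path"])
print(record["referrer"], "|", record["agent"])
```

Note that the referrer and agent values are simply whatever the client chose to send, which is why they cannot be trusted.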
What you can’t do
• You can't tell the identity of your users
• You can't tell how many visitors you've had
• You can't tell how many visits you've had
• Cookies don't solve these problems
• You can't follow a person's path through your site
What you can’t do (cont’d)
• You often can't tell where users entered your site, or where they linked to you from
• You can't tell how they left your site, or where they went next
• You can't tell how long people spent reading each page
• You can't tell how long people spent on your site
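One way to see why reading time cannot be recovered: the log holds only request timestamps, so the best available figure is the gap between consecutive requests from the same host. The sketch below uses invented timestamps and a hypothetical helper, gaps_between_requests, and assumes (unsafely) that all requests from one host come from one person.

```python
from datetime import datetime

def gaps_between_requests(requests):
    """requests: list of (datetime, path) from one host, in time order.
    Returns (path, seconds until the next request), or None for the last path."""
    result = []
    for i, (when, path) in enumerate(requests):
        if i + 1 < len(requests):
            gap = (requests[i + 1][0] - when).total_seconds()
        else:
            gap = None  # nothing follows, so the log says nothing about this page
        result.append((path, gap))
    return result

# Hypothetical requests from a single host
SAMPLE_REQUESTS = [
    (datetime(2006, 7, 10, 14, 0, 0), "/index.html"),
    (datetime(2006, 7, 10, 14, 0, 40), "/about.html"),
    (datetime(2006, 7, 10, 14, 25, 40), "/contact.html"),
]

for path, gap in gaps_between_requests(SAMPLE_REQUESTS):
    print(path, gap)
```

The 25-minute gap after /about.html might be careful reading, a coffee break, or a different person behind the same proxy, and the last page has no gap at all, so neither time per page nor time on site can be known.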
Cache for questions
• Caching proxy servers are the main problem:
  • if users get your pages from a local cache server, you will never know
  • many users can connect to your server using the same cache/proxy server
  • one user can appear to connect from many different hosts (e.g. AOL)
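The three effects above can be illustrated with a toy model. Everything in this sketch is made up: the user names, the host names and the "served from cache" flag represent ground truth that a real server never sees, and summarise is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical ground truth: (user, host as it would be logged, served from a local cache?)
TRAFFIC = [
    ("alice", "proxy.example.net", False),
    ("bob", "proxy.example.net", False),
    ("carol", "proxy.example.net", True),   # never reaches the server
    ("dave", "cache-1.example.com", False),
    ("dave", "cache-2.example.com", False),
    ("dave", "cache-3.example.com", False),
]

def summarise(traffic):
    """Return (requests missed by the log, users per logged host, hosts per user)."""
    logged = [(user, host) for user, host, cached in traffic if not cached]
    users_per_host = defaultdict(set)
    hosts_per_user = defaultdict(set)
    for user, host in logged:
        users_per_host[host].add(user)
        hosts_per_user[user].add(host)
    missed = len(traffic) - len(logged)
    return missed, dict(users_per_host), dict(hosts_per_user)

missed, users_per_host, hosts_per_user = summarise(TRAFFIC)
print("requests the log never saw:", missed)
print("users behind proxy.example.net:", len(users_per_host["proxy.example.net"]))
print("hosts dave appeared as:", len(hosts_per_user["dave"]))
```

In this model the log shows four distinct hosts, yet that number matches neither the users who reached the server nor the users who read the pages, so counting hosts is not counting visitors.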
Trackers / Counters
• A more recent innovation:
  • Code embedded in each of your web pages
  • Makes a call directly to the host data server
  • Can reveal more detail (screen size, screen colours, originating host name, etc.)
• Examples:
  • SiteStat: [www.nedstat.co.uk/]
  • Google Analytics: [analytics.google.com]
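As a rough illustration of the idea, the embedded code typically requests a small resource from the tracking service and packs the extra detail into the URL. The sketch below is not the mechanism of SiteStat or Google Analytics; the URL, the parameter names (page, screen, depth) and both helper functions are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_beacon_url(base, page, screen, colour_depth):
    """What the embedded page code might request (parameter names are invented)."""
    query = urlencode({"page": page, "screen": screen, "depth": colour_depth})
    return f"{base}?{query}"

def read_beacon(url):
    """What the data server might recover from that request."""
    params = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in params.items()}

# Hypothetical tracking host and values
url = build_beacon_url("http://stats.example.com/hit.gif", "/index.html", "1024x768", 24)
print(url)
print(read_beacon(url))
```

Because the call goes to the tracker's server on every page view, it is not hidden by a cache that serves your own pages, but it still records only what the browser chooses to report.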
Further reading
• How the Web works by Stephen Turner
• Interpreting WWW Statistics by Doug Linder
• Measuring Web Site Usage: Log File Analysis by Susan Haigh and Janette Megarity
• Who Goes There? Measuring Library Web Site Usage by Kathleen Bauer
• Why Web Usage Statistics are (Worse Than) Meaningless by Jeff Goldberg