PerlMonks  

Re^3: web statistics

by dsheroh (Parson)
on Mar 04, 2008 at 17:00 UTC (#671932)


in reply to Re^2: web statistics
in thread web statistics

Define "absolutely NOT accurate". Barring server misconfiguration, disk/filesystem failure, or deliberate tampering with the logs, I don't see any way that the web server logs can be any less than absolute in their accuracy regarding which pages were served, at what time, and to which IP addresses.

Referrers and user agents are the only things I can think of off the top of my head which go into my logs and are susceptible to spoofing by users [1], and those shouldn't significantly affect the accuracy of log-based analysis.
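
To make the fields under discussion concrete, here is a minimal sketch that pulls the IP address, timestamp, request, referrer and user agent out of one log line. It assumes the server writes the Apache-style "combined" log format; the sample line and the `parse_line` helper are made up for illustration, and a differently configured server would need a different pattern.

```python
import re
from datetime import datetime

# Pattern for the Apache-style "combined" log format (an assumption;
# adjust it to whatever LogFormat your server actually uses).
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = COMBINED.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    # Timestamps look like 04/Mar/2008:17:00:00 +0000
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    return rec

# Hypothetical sample line, not taken from any real log.
sample = ('192.0.2.10 - - [04/Mar/2008:17:00:00 +0000] '
          '"GET /index.html HTTP/1.1" 200 5120 '
          '"http://example.com/" "Mozilla/5.0"')

rec = parse_line(sample)
print(rec["ip"], rec["time"].isoformat(), rec["request"], rec["status"])
```

Of these fields, the referrer and user agent are copied from request headers the client supplies, which is why they are the ones open to spoofing.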

Absolutely agreed that it's all in the interpretation, though.

[1] OK, technically users could spoof their IP address as well, but that's a relatively sophisticated technique and they're not going to be able to see the returned page if they do it, so I'm comfortable with ignoring them for these purposes.


Re^4: web statistics
by gloryhack (Deacon) on Mar 05, 2008 at 21:57 UTC
    Okay...

    "absolutely NOT accurate" is intended in this case to mean that HTTP server logs do not accurately communicate the specific information (e.g. "Unique visitors and number of hits per day") that the presumably pointy-haired boss has asked advait to provide.

    Your HTTP server doesn't serve every page view, as there are caching proxy servers out there on the internets. Your HTTP server doesn't know my Firefox browser with HTTP_USER_AGENT suppressed is a web browser with a human behind it -- it could just as easily be a bot and is likely to be categorized as such by most heuristic methods. Your HTTP server doesn't know if hits from the TOR network are initiated by one user or one hundred users. In short, your HTTP server knows only what is requested, how it is requested, when it is requested, and by which IP address. It doesn't know unique visitors, (human vision) page views, or any of that other stuff that makes marketdroids drool.
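
    The gap between what the log records and what was asked for can be sketched as follows. This is only an illustration: `daily_counts` is a hypothetical helper, the records are made-up (day, IP) pairs, and it assumes the log lines have already been parsed. It counts hits per day and distinct IP addresses per day, which is the closest a plain log gets to "unique visitors".

```python
from collections import defaultdict

def daily_counts(records):
    """records: iterable of (day, ip) pairs.

    Returns {day: (hits, distinct_ips)}.
    """
    hits = defaultdict(int)
    ips = defaultdict(set)
    for day, ip in records:
        hits[day] += 1
        ips[day].add(ip)
    return {day: (hits[day], len(ips[day])) for day in hits}

# Made-up data for illustration only.
records = [
    ("2008-03-05", "192.0.2.10"),
    ("2008-03-05", "192.0.2.10"),
    ("2008-03-05", "198.51.100.7"),
    ("2008-03-06", "192.0.2.10"),
]

for day, (hits, distinct) in sorted(daily_counts(records).items()):
    print(day, hits, distinct)
```

    The distinct-IP figure is a proxy, not a visitor count: one address may stand for many people behind a proxy or Tor exit, one person may show up under several addresses, and page views served from a cache never reach the log at all.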

    Thus, in context, my caveat stands.
