PerlMonks  

Re: Using binary search to get the last 15 minutes of httpd access log

by Anonymous Monk
on Aug 03, 2012 at 22:50 UTC ( [id://985363] )


in reply to Using binary search to get the last 15 minutes of httpd access log

Aww, "to hell with it..." just read the file from the start. You'd spend more time futzing around with trying to find your place than you'd spend just reading every record and throwing out the ones you don't want. Sometimes, "brute force" is EXACTLY what the doctor ordered...
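
For illustration, the brute-force approach might look something like the sketch below. It is in Python rather than Perl, and it assumes the usual Apache timestamp format, e.g. [03/Aug/2012:22:50:00 +0000]. The names TS_RE, parse_ts and recent_lines are made up for this example.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed format: the bracketed timestamp of the Apache common/combined
# log format, e.g. [03/Aug/2012:22:50:00 +0000]
TS_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def parse_ts(line):
    """Return the timezone-aware timestamp of a log line, or None if absent."""
    m = TS_RE.search(line)
    if m is None:
        return None
    return datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S %z")


def recent_lines(lines, now, window=timedelta(minutes=15)):
    """Read every record and throw out the ones older than the window."""
    cutoff = now - window
    for line in lines:
        ts = parse_ts(line)
        if ts is not None and ts >= cutoff:
            yield line
```

Usage would be along the lines of recent_lines(open("access.log"), datetime.now(timezone.utc)), where the file name is only a placeholder.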

Replies are listed 'Best First'.
Re^2: Using binary search to get the last 15 minutes of httpd access log
by Old_Gray_Bear (Bishop) on Aug 04, 2012 at 15:42 UTC
    I had to solve the same problem (for Apache logs, too) a few years back. Brute force is fine for a small log, but the logs I was parsing were growing at a gigabyte or more per minute. (We rolled logs every 100 GB or 30 minutes, whichever came first.)

    Pseudo code:

    set the current size of the log (end point)
    seek to the mid-position (size/2, begin point)
    read forward from the begin-point until a timestamp is found
    if the timestamp is within 5 minutes of the current time
        process sequentially to the end of the log and exit
    else
        reset the begin and end points and try again

    This gimmick ran (most of the time) in under 500 milliseconds, and gave us enough information. The Perl implementation was fast enough (most times) that we never got around to implementing it in C. You can run into problems with slow-growing logs (what happens if there is only one line in the file?) and humongous lines (again, only one line in the file, and it's 55 MB long!). We got around it by fiat: if something goes sour, quit and retry in 30 seconds. (Yahoo, Instant Messenger, three to four terabytes of logs per day....)
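
The pseudo code above can be sketched roughly as follows. This is Python, not the original Perl, and it is a stricter variant: instead of stopping at the first probe that lands inside the window, it keeps bisecting byte offsets until it finds the first line at or after the cutoff. It assumes timestamps are in ascending order and use the standard Apache format; find_start, tail_since and parse_ts are hypothetical names.

```python
import os
import re
from datetime import datetime, timedelta, timezone

# Assumed format: [03/Aug/2012:22:50:00 +0000], matched against raw bytes.
TS_RE = re.compile(rb"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def parse_ts(line):
    """Return the timezone-aware timestamp of a log line, or None if absent."""
    m = TS_RE.search(line)
    if m is None:
        return None
    return datetime.strptime(m.group(1).decode("ascii"), "%d/%b/%Y:%H:%M:%S %z")


def find_start(f, cutoff):
    """Byte offset of the first line whose timestamp is >= cutoff."""
    f.seek(0, os.SEEK_END)
    lo, hi = 0, f.tell()  # begin point and end point
    while lo < hi:
        mid = (lo + hi) // 2
        if mid == 0:
            f.seek(0)
        else:
            # Back up one byte and discard through the next newline, so we
            # land on a line start (and keep mid if it already is one).
            f.seek(mid - 1)
            f.readline()
        # Read forward until a timestamp is found (or we hit end of file).
        ts = None
        while ts is None:
            line = f.readline()
            if not line:
                break
            ts = parse_ts(line)
        if ts is not None and ts < cutoff:
            lo = f.tell()  # too old: the answer lies after this line
        else:
            hi = mid       # recent enough (or EOF): look earlier
    return lo


def tail_since(path, now, window=timedelta(minutes=15)):
    """Return the raw lines logged within `window` before `now`."""
    with open(path, "rb") as f:
        f.seek(find_start(f, now - window))
        return f.readlines()
```

Each probe reads at most a line or two, so the number of seeks grows with the logarithm of the file size. The quit-and-retry policy for pathological files is left out of this sketch.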

    ----
    I Go Back to Sleep, Now.

    OGB
