I have written some code to test whether Tie::File and a binary (?:chop|search), as suggested here, would actually work. First I wrote a script

    use strict;
    use warnings;

    use DateMunge;

    my @messages = (
        qq{The lunatics have taken over the asylum\n},
        qq{Chickens have got into the server\n},
        qq{All your data is gone\n},
        qq{The disk drive just wants you to know it is fine\n},
        qq{Random message\n},
        qq{This message is intentionally blank\n});

    my $start = 1150000000;

    for (1 .. 29301) {
        my $message = $messages[rand @messages];
        print dateStr(qq{%O.%T $message}, $start);
        unless ($_ % 23) {
            for (1 .. int(rand 10) + 10) {
                $message = $messages[rand @messages];
                print dateStr(qq{%O.%T $message}, $start);
            }
        }
        $start += int rand 50;
    }

to generate a fictional log file of 47,000+ lines (about 2.5MB). DateMunge::dateStr is something I wrote years ago, before I had access to CPAN or knew about POSIX::strftime. I then wrote the code to test the solution. It seems to work fairly well, taking about 5 seconds to find the correct line in the log when running on a SPARC Ultra 30 with a 300MHz CPU. Currently the threshold is hard-coded in the script, but it could easily be changed to a command-line argument. Here's the code

    use strict;
    use warnings;

    use Tie::File;
    use Fcntl q{O_RDONLY};

    my $logFile = q{spw586856.log};
    tie my @logLines, q{Tie::File}, $logFile,
        mode => O_RDONLY, autochomp => 0
        or die qq{tie: $logFile: $!\n};

    my $threshhold    = q{2006-06-19.11:47:25};
    my $threshholdIdx = -1;
    my $firstIdx      = 0;
    my $lastIdx       = $#logLines;

    # Refuse thresholds that fall outside the dates covered by the log.
    if ($threshhold lt getDate($logLines[0])) {
        die qq{Threshhold date before range in $logFile\n};
    }
    elsif ($threshhold gt getDate($logLines[-1])) {
        die qq{Threshhold date after range in $logFile\n};
    }

    BIN_CHOP: while (1) {
        if ($threshhold eq getDate($logLines[$firstIdx])) {
            $threshholdIdx = $firstIdx;
            last BIN_CHOP;
        }

        # Adjacent lines left: the later one is the first at or after
        # the threshold.
        my $idxDiff = $lastIdx - $firstIdx;
        if ($idxDiff < 2) {
            $threshholdIdx = $lastIdx;
            last BIN_CHOP;
        }

        my $midIdx = $firstIdx + int($idxDiff / 2);
        if ($threshhold eq getDate($logLines[$midIdx])) {
            # Step left to the first of any run of lines with the same
            # date, stopping at a different date or the start of the file.
            STEP_LEFT: while ($midIdx > 0) {
                last STEP_LEFT
                    if $threshhold ne getDate($logLines[$midIdx - 1]);
                $midIdx --;
            }
            $threshholdIdx = $midIdx;
            last BIN_CHOP;
        }
        if ($threshhold lt getDate($logLines[$midIdx])) {
            $lastIdx = $midIdx;
            next BIN_CHOP;
        }
        if ($threshhold gt getDate($logLines[$midIdx])) {
            $firstIdx = $midIdx;
            next BIN_CHOP;
        }
        die qq{Internal error, how did we get here?\n};
    }

    die qq{Binary chop did not find threshhold\n}
        if $threshholdIdx == -1;

    print
        qq{Threshhold : $threshhold\n},
        qq{Line No. : @{[$threshholdIdx + 1]}\n},
        qq{Log msg. : $logLines[$threshholdIdx]},
        qq{Prev. msg. : $logLines[$threshholdIdx - 1]\n};

    print
        qq{Lines from threshhold onwards\n\n},
        @logLines[$threshholdIdx .. $#logLines];

    # Return the leading non-whitespace field of a log line, i.e. its date.
    sub getDate {
        my $line = shift;
        my ($date) = $line =~ m{^(\S+)};
        return $date;
    }

and here's some output

    $ ls -l spw586856*
    -rwxr-xr-x 1 jgillman og5a    1895 Dec 1 11:11 spw586856
    -rw-r--r-- 1 jgillman og5a 2507626 Dec 1 10:58 spw586856.log
    -rwxr-xr-x 1 jgillman og5a     696 Dec 1 10:56 spw586856makeData
    $ wc spw586856.log
    47712 332470 2507626 spw586856.log
    $ head -10 spw586856.log
    2006-06-11.05:26:40 Random message
    2006-06-11.05:27:13 This message is intentionally blank
    2006-06-11.05:28:00 Chickens have got into the server
    2006-06-11.05:28:33 The lunatics have taken over the asylum
    2006-06-11.05:29:16 Chickens have got into the server
    2006-06-11.05:29:53 The lunatics have taken over the asylum
    2006-06-11.05:30:34 The disk drive just wants you to know it is fine
    2006-06-11.05:30:52 The lunatics have taken over the asylum
    2006-06-11.05:31:32 All your data is gone
    2006-06-11.05:31:52 All your data is gone
    $ tail -10 spw586856.log
    2006-06-19.11:51:40 Chickens have got into the server
    2006-06-19.11:51:56 Random message
    2006-06-19.11:52:44 Random message
    2006-06-19.11:52:50 All your data is gone
    2006-06-19.11:52:58 This message is intentionally blank
    2006-06-19.11:53:44 The lunatics have taken over the asylum
    2006-06-19.11:54:07 This message is intentionally blank
    2006-06-19.11:54:35 The disk drive just wants you to know it is fine
    2006-06-19.11:54:51 Chickens have got into the server
    2006-06-19.11:55:33 Chickens have got into the server
    $ time spw586856
    Threshhold : 2006-06-19.11:47:25
    Line No. : 47694
    Log msg. : 2006-06-19.11:47:27 This message is intentionally blank
    Prev. msg. : 2006-06-19.11:47:20 Chickens have got into the server

    Lines from threshhold onwards

    2006-06-19.11:47:27 This message is intentionally blank
    2006-06-19.11:47:47 The disk drive just wants you to know it is fine
    2006-06-19.11:48:36 The lunatics have taken over the asylum
    2006-06-19.11:49:12 All your data is gone
    2006-06-19.11:49:28 This message is intentionally blank
    2006-06-19.11:50:01 Chickens have got into the server
    2006-06-19.11:50:10 This message is intentionally blank
    2006-06-19.11:50:38 Chickens have got into the server
    2006-06-19.11:51:24 Random message
    2006-06-19.11:51:40 Chickens have got into the server
    2006-06-19.11:51:56 Random message
    2006-06-19.11:52:44 Random message
    2006-06-19.11:52:50 All your data is gone
    2006-06-19.11:52:58 This message is intentionally blank
    2006-06-19.11:53:44 The lunatics have taken over the asylum
    2006-06-19.11:54:07 This message is intentionally blank
    2006-06-19.11:54:35 The disk drive just wants you to know it is fine
    2006-06-19.11:54:51 Chickens have got into the server
    2006-06-19.11:55:33 Chickens have got into the server

    real    0m5.28s
    user    0m4.86s
    sys     0m0.34s
    $
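For anyone without DateMunge, the leading date field seen in the log above could probably be produced with POSIX::strftime instead. This is only a sketch: log_stamp is a made-up name, and the format string is my guess at what %O.%T expands to, judged from the sample lines rather than from the DateMunge source.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical stand-in for DateMunge::dateStr: format an epoch time in the
# style of the log's leading field. gmtime keeps the result independent of
# the machine's time zone; localtime would give local wall-clock stamps.
sub log_stamp {
    my ($epoch) = @_;
    return strftime(q{%Y-%m-%d.%H:%M:%S}, gmtime($epoch));
}

# Same start value as the generator script.
print log_stamp(1150000000), qq{ Random message\n};
```

With gmtime this prints a stamp of 2006-06-11.04:26:40. The first line of the sample log is one hour later, which would be consistent with the generator using local time, although that is an inference from the output.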

I chose a threshold near the end of the file to demonstrate the printing of all lines without scrolling to death, but the search seems to work on all thresholds that I tested. I hope this is of interest.
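As a rough sketch of the command-line idea, the threshold could be read from @ARGV and the chop written as a conventional lower-bound search. This is a variant, not the script as posted: first_at_or_after is a hypothetical name, the usage message is made up, and a three-line array copied from the sample log stands in for the tied file so that the snippet runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: return the index of the first line whose leading
# date field is ge $threshold, or -1 if every line is earlier.
sub first_at_or_after {
    my ($lines, $threshold) = @_;
    my ($lo, $hi) = (0, scalar @$lines);    # half-open range [lo, hi)
    while ($lo < $hi) {
        my $mid = $lo + int(($hi - $lo) / 2);
        my ($date) = $lines->[$mid] =~ m{^(\S+)};
        if ($date lt $threshold) { $lo = $mid + 1 }    # answer is right of mid
        else                     { $hi = $mid }        # mid may be the answer
    }
    return $lo < @$lines ? $lo : -1;
}

# Take the threshold from the command line, falling back to a sample value.
my $threshold = @ARGV ? $ARGV[0] : q{2006-06-19.11:47:25};
die qq{Usage: $0 YYYY-MM-DD.HH:MM:SS\n}
    unless $threshold =~ m{\A\d{4}-\d\d-\d\d[.]\d\d:\d\d:\d\d\z};

# A few lines copied from the sample log stand in for the tied file.
my @sample = (
    qq{2006-06-19.11:47:20 Chickens have got into the server\n},
    qq{2006-06-19.11:47:27 This message is intentionally blank\n},
    qq{2006-06-19.11:47:47 The disk drive just wants you to know it is fine\n},
);

my $idx = first_at_or_after(\@sample, $threshold);
my $out = $idx < 0 ? qq{No line at or after $threshold\n} : $sample[$idx];
print $out;
```

Because the sub only indexes the array and asks for its size, a reference to the Tie::File array should work in place of \@sample. Unlike the script above, a threshold earlier than the first line simply yields index 0 rather than dying, and repeated dates need no separate step-left pass.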

Cheers,

JohnGG


In reply to Re: print log file by johngg
in thread print log file by xiaoyafeng
