Re: grab 'n' lines from a file above and below a /match/
by Aristotle (Chancellor) on Sep 16, 2004 at 22:39 UTC
|
Well, you'll have to store the lines. And it's going to be tricky to correctly handle matches where the context overlaps, i.e. where one match follows fewer than 2n lines after the previous one.
The easiest thing to do is use the toolbox: egrep -C n c9391b56-b174-441b-921c-7d63 GWSvc.log
Update: the following should work and handle all edge cases:
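my @backlog;
my $to_print = 0;
my $context_size = 10;   # total context: half before the match, half after
while(<>) {
    # a match (re)starts the countdown: before-context, the match, after-context
    $to_print = 1 + $context_size if /c9391b56-b174-441b-921c-7d63/;
    push @backlog, $_;
    if( $to_print ) {
        print shift @backlog;
        --$to_print;
    }
    elsif( @backlog > $context_size / 2 ) {
        shift @backlog;   # keep only the before-context
    }
}
# flush what is still owed when the file ends inside the after-context
print shift @backlog while @backlog and $to_print--;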
Makeshifts last the longest.
| [reply] [d/l] |
|
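my $context = 4;
my @buffer = ('') x $context;
my $print_to = 0;
my $match = qr/42/;
while(<DATA>) {
    if ( m/$match/ ) {
        print @buffer;                   # before-context that was not printed yet
        $print_to = $. + $context;
        @buffer = ('') x $context;
    }
    if ( $. <= $print_to ) {
        print;                           # the match itself or its after-context
    }
    else {
        push @buffer,$_; shift @buffer;  # only buffer lines that were not printed
    }
}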
| [reply] [d/l] |
|
I have the gateway running at full logging level; that's the only time I am going to be seeing these params (this particular one is actually a SIP subscribe request ID, additional parts of which I have removed). I will probably be running into such entries once in 100 lines, so I am not really worried about the overlap part of your answer.
As for egrep: it's Windows, so it's not available to me. Please help.
| [reply] |
|
Hey, on a lighter note: I now realize that I was operating on $_ while trying to play with $. instead. What I don't understand is where I would actually use $. in practice. Are there any typical situations where it might come in handy? You might find this to be a newbie question, but I am still learning Perl and am just curious. Thanks again.
| [reply] |
|
$. is just the number of the last line read from the last accessed filehandle. It's useful any time you want to know the line number. It does nothing beyond that; in particular, writing to it has no effect at all, other than that its value changes.
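A minimal sketch of one typical use: prefixing every matching line with its line number, much like grep -n does. The helper name numbered_matches and the in-memory filehandle are assumptions made so the example runs on its own; with a real log you would open the file instead.

```perl
use strict;
use warnings;

# Hypothetical helper: return "LINENO: text" for every line of $fh matching $rx.
sub numbered_matches {
    my ($fh, $rx) = @_;
    my @hits;
    while (my $line = <$fh>) {
        # $. holds the number of the line just read from $fh
        push @hits, "$.: $line" if $line =~ $rx;
    }
    return @hits;
}

# An in-memory filehandle stands in for the log file.
my $text = "alpha\nbeta 42\ngamma\ndelta 42\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";
print numbered_matches($fh, qr/42/);
close $fh;
```

Reading $. is all that is needed here; the counter is maintained by the read operation itself.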
Makeshifts last the longest.
| [reply] |
|
Hey, this works too!
Thanks a lot.
| [reply] |
Re: grab 'n' lines from a file above and below a /match/
by Zaxo (Archbishop) on Sep 16, 2004 at 22:47 UTC
|
You're trying to do that by manipulating $., but that won't read more lines. Since you want to keep lines from before the match, you'll need to buffer them. Here's one way,
my $n = 10; # , say
my @lines;
{
    local $_;
    while (<LOG>) {
        push @lines, $_;
        if (/c9391b56-b174-441b-921c-7d63/) {
            # push @lines, (<LOG>)[0..$n-1] and last;
            # better,
            while (<LOG>) {
                push @lines, $_;
                last if @lines > 2*$n;
            }
            last;
        }
        else {
            shift @lines while @lines > $n;
        }
    }
}
Update: improved the code to not read the rest of the file after a match is found.
| [reply] [d/l] |
|
I don't see any output using it. I just tried it against a smaller text file and that doesn't give any output either.
| [reply] |
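Zaxo's snippet only collects the window in @lines; it never prints it, which would explain the missing output. A minimal sketch of the same idea with the print added follows. The sub name first_match_window and the in-memory filehandle are assumptions for illustration, and the trailing context is counted as exactly $n lines rather than by total window size.

```perl
use strict;
use warnings;

# Hypothetical helper: return up to $n lines before the first match,
# the matching line, and up to $n lines after it.
sub first_match_window {
    my ($fh, $rx, $n) = @_;
    my @lines;
    while (my $line = <$fh>) {
        push @lines, $line;
        if ($line =~ $rx) {
            # read the trailing context, then stop scanning
            for (1 .. $n) {
                my $next = <$fh>;
                last unless defined $next;
                push @lines, $next;
            }
            return @lines;
        }
        shift @lines while @lines > $n;   # keep at most $n lines of history
    }
    return;   # no match: nothing to show
}

# An in-memory filehandle stands in for the log file.
my $log = join '', map { "line $_\n" } 1 .. 10;
open my $fh, '<', \$log or die "cannot open in-memory file: $!";
print first_match_window($fh, qr/line 5$/, 2);   # the collected lines still have to be printed
close $fh;
```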
Re: grab 'n' lines from a file above and below a /match/
by borisz (Canon) on Sep 16, 2004 at 22:47 UTC
|
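Untested. On Unix, grep can do exactly this.
my $MAX = 5;
my @lines;
local $_;
OUT: while ( defined ( $_ = <LOG> ) ){
    push @lines, $_;
    shift @lines if @lines > $MAX + 1;   # $MAX lines of history plus the current line
    if ( /bla bla/ ) {
        print @lines;                    # the current line is already in @lines
        @lines = ();                     # don't print these lines again on a later match
        for ( 1 .. $MAX ) {
            last OUT unless defined ( $_ = <LOG> );
            print;
        }
    }
}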
| [reply] [d/l] |
|
Thanks a ton Boris :) works as advertised !!!
| [reply] |
Re: grab 'n' lines from a file above and below a /match/
by been42 (Curate) on Sep 17, 2004 at 03:50 UTC
|
I know that I'm showing up a little bit late to the party, but wouldn't Tie::File work for this?
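use strict;
use Tie::File;
# some variables get set up here since we're using strict (wink)
tie @lines, 'Tie::File', 'GWSvc.log', memory=>$some_small_number
    or die "tie failed: $!";
for ($i=0; $i<=$#lines; $i++) {
    # match the current record, not $_
    if ($lines[$i] =~ /c9391b56-b174-441b-921c-7d63/) {
        for ($j=$i-5; $j <= $i+5; $j++) {
            next if $j < 0 || $j > $#lines;   # stay inside the file
            print "$lines[$j]\n";             # Tie::File strips the line endings
        }
    }
}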
I'm sure there are a million ways to make it look cleaner, but I'm also very sleepy right now. This seems like it would solve the problem, though. I'm really a big fan of Tie::File after having been 'corrected' on my non-usage of it not too long ago. Now I find uses for it everywhere. | [reply] [d/l] |
Re: grab 'n' lines from a file above and below a /match/
by mrpeabody (Friar) on Sep 17, 2004 at 04:35 UTC
|
Obligatory Tie::File solution. I haven't done any benchmarks, but I would guess it's as fast as the other Perl solutions while being less memory-intensive.
As others have said, /bin/grep is the way to go here.
#!/usr/bin/perl
use strict;
use warnings;
use Tie::File;
use Fcntl 'O_RDONLY';
my $DEBUG = 0;
my $text = qr/c9391b56-b174-441b-921c-7d63/;
my $file = 'GWSvc.log';
my $context = 3;
sub dprint { print @_ if $DEBUG }
my @lines;
tie @lines, 'Tie::File', $file, mode => O_RDONLY
    or die "tie failed: $!";
for (my $i = 0; $i <= $#lines; $i++) {
    dprint "SCAN: line $i\n";
    if ($lines[$i] =~ /$text/) {
        dprint "MATCH at line $i\n";
        my $start = $i - $context;
        if ($start < 0) {
            $start = 0;
        }
        my $end = $i + $context;
        $end = $#lines if $end > $#lines;   # don't run past the end of the file
        for my $j ($start .. $end) {
            dprint "$j: ";
            print "$lines[$j]\n";
        }
        print "\n";
        $i += $context;
    }
}
| [reply] [d/l] |
|
It's actually slower and more memory intensive than any of the other solutions. Tie::File internally keeps a list of byte offsets for all the lines, and it needs a lot of additional overhead that is supposed to optimize writes, which you never make any use of.
Your code also doesn't get the edge cases right: if there's a match within less than $context lines of the previous, it will be missed.
You gave me an idea with regards to memory consumption, though:
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw( :seek );
my $rx = qr/c9391b56-b174-441b-921c-7d63/;
my $to_print = 0;
my $context = 10;
my @offs = ( 0 ) x ( 1 + $context );
while(<>) {
    my $context_start = shift @offs;
    my $here = tell ARGV;
    push @offs, $here;
    if( /$rx/ ) {
        if( not $to_print ) {
            my $length = $here - $context_start;
            seek ARGV, $context_start, SEEK_SET;
            read ARGV, $_, $length;
        }
        $to_print = 1 + $context;
    }
    --$to_print, print if $to_print;
}
This only needs to keep 1 + $context offsets in memory.
Update: fixed bugs. It was ( 0 ) x $context which gave one too few lines of before-context and $here - $context_start + length which of course ate too much input — but that wasn't obvious with my test data. Oopsie.
Makeshifts last the longest.
| [reply] [d/l] |
|
It's actually slower and more memory intensive than any of the other solutions. Tie::File internally keeps a list of byte offsets for all the lines, and it needs a lot of additional overhead that is supposed to optimize writes, which you never make any use of.
Oops. Guessed wrong, then.
Your code also doesn't get the edge cases right: if there's a match within less than $context lines of the previous, it will be missed.
That was intentional, and it depends on your definition of "missed". That hit will be printed with the context of the previous hit. Changing the behavior would just require removing the line:
$i += $context;
| [reply] [d/l] |
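One way to make the overlap behaviour discussed above explicit is to compute the line ranges first and merge those that overlap or touch, so that every line is printed exactly once and no match is skipped. This is a sketch rather than code from the thread; the sub name context_ranges and its arguments are illustrative assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: given 0-based indices of matching lines, the amount of
# context, and the index of the last line, return merged [start, end] ranges.
sub context_ranges {
    my ($hits, $context, $last) = @_;
    my @ranges;
    for my $i (sort { $a <=> $b } @$hits) {
        my $start = $i - $context < 0     ? 0     : $i - $context;
        my $end   = $i + $context > $last ? $last : $i + $context;
        if (@ranges and $start <= $ranges[-1][1] + 1) {
            # overlaps or touches the previous range: extend it
            $ranges[-1][1] = $end if $end > $ranges[-1][1];
        }
        else {
            push @ranges, [ $start, $end ];
        }
    }
    return @ranges;
}

# Sample data standing in for the log: matches on lines 3, 5 and 15.
my @lines = map { "line $_" } 0 .. 19;
my @hits  = grep { $lines[$_] =~ /line (?:3|5|15)$/ } 0 .. $#lines;
for my $r (context_ranges(\@hits, 2, $#lines)) {
    print "$lines[$_]\n" for $r->[0] .. $r->[1];
    print "--\n";
}
```

With the sample data, the matches on lines 3 and 5 share one block of output, while the match on line 15 gets its own.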
Re: grab 'n' lines from a file above and below a /match/
by TedPride (Priest) on Sep 17, 2004 at 07:48 UTC
|
The solution is to store the last 5 lines visited in an array, and also have a variable tracking how many more lines beyond the current one have to be output:
$x = 5; # Number of lines to print above and below.
open (LOG, "GWSvc.log") || die "Unable to get a handle to the file: $!\n";
$after = 0;
while (<LOG>) {
    if ($after) {
        print $_; $after--;
    }
    else {
        push (@lines, $_);
        if ($#lines > $x) { shift(@lines); }
    }
    if (/c9391b56-b174-441b-921c-7d63/) {
        # defined() so that a line consisting of just "0" does not end the loop early
        print $line while defined($line = shift(@lines));
        $after = $x;
    }
}
| [reply] [d/l] |
|
No offense, but s/The solution/A solution/. There is more than one way..
| [reply] |
Re: grab 'n' lines from a file above and below a /match/
by cosimo (Hermit) on Sep 17, 2004 at 06:52 UTC
|
| [reply] |
Re: grab 'n' lines from a file above and below a /match/
by DrHyde (Prior) on Sep 17, 2004 at 15:49 UTC
|
I have a rather stupid question
There are no stupid questions, only stupid ways of asking them, and you didn't do that so that's OK.
The solution you are looking for is, I suspect, to read a line at a time, populating an array of 2n+1 entries (n above the line, plus the line, plus n below the line). Once the array is full, shift the first entry out of it, and push a new entry onto the end. When the *middle* entry in the array matches your desired string, dump the whole array to the screen. Something like this ...
open(FILE, 'file') || die("Yaroo!\n");
my $N = 3; # 3 lines above and below
my $target = 'stuff what you want';
my @window = ('') x ($N + $N + 1);
while(<FILE>) {
    shift @window;
    push @window, $_;
    print @window, "\n\n" if($window[$N] =~ /$target/);
}
foreach(1 .. $N) {
    shift @window;
    print @window, "\n\n" if($window[$N] =~ /$target/);
}
update: I should have explained, the foreach loop copes with the case where the target text appears within the last N lines of the file. | [reply] [d/l] [select] |
Re: grab 'n' lines from a file above and below a /match/
by Anonymous Monk on Sep 17, 2004 at 20:19 UTC
|
# In fat crayon mode for your pleasure
# spoofing your big log file handle as a little array for brevity
my @logs = ();
for(1..20){push(@logs, $_)}
my $lookBack = 2; # the number of lines behind the match you want
my $lookAhead = 2; # the number of lines ahead of the match you want
my $matchReg = '9'; # what you're matching on (can be inlined)
my $matched = 0; # lines of lookAhead still to print after a match
my @buffer = (); # your lookBack buffer
# Iterate through the logs looking for match
foreach my $line (@logs) {
    # Feed the buffer
    push(@buffer, $line);
    # Trim the buffer to $lookBack lines plus the current one
    shift(@buffer) if $#buffer > $lookBack;
    # Do stuff if this line matches
    if ($line =~ /$matchReg/) {
        # Print the buffer (the matching line is its last entry)
        for(@buffer){print "lookback: $_\n"}
        # Wave a flag we have a match
        $matched = $lookAhead;
        # Nothing else to see here, move along
        next;
    }
    # Still working the lookAhead from a prior match?
    if($matched){
        # one less line of lookAhead to print
        --$matched;
        print "LookAhead: $line\n";
    }
}
| [reply] [d/l] |