penantes has asked for the wisdom of the Perl Monks concerning the following question:

Wise monks,

I'm facing a problem.

I need to "grep" chunks of information from a log file, delimited by, let's say, a start marker (the date) and an end marker (#end trans#), for a certain user=$user, e.g.:

2005/06/06 12:00:00 user=me func(<- start
blah blah blah
blah blah
blah
) #end trans# <- end
2005/06/06 12:01:00 user=other func(
blah blah blah
blah blah
) #end trans#

The information shown here as "blah blah" is dynamic: it can be 1 line or 20.

I've tried to do a while inside a while, but it didn't work.

Can you help me?

Thanks in advance

Replies are listed 'Best First'.
Re: Fetch data between markers
by thundergnat (Deacon) on Jun 06, 2005 at 16:25 UTC

    Set the input record separator to '#end trans#'.

    use warnings;
    use strict;

    $/ = '#end trans#';

    my $user = 'me';

    while (my $lines = <DATA>) {
        if ($lines =~ /user=$user\b/) {
            my @matches = $lines =~ /\bblah\b/g;
            print "$user - ", join ' ', @matches, "\n";
        }
    }

    __DATA__
    2005/06/06 12:00:00 user=me func(<- start
    blah blah blah
    blah blah
    blah
    ) #end trans# <- end
    2005/06/06 12:01:00 user=other func(
    blah blah blah
    blah blah
    ) #end trans#

    Update: added code to filter on user name.

Re: Fetch data between markers
by DrPeter (Scribe) on Jun 06, 2005 at 18:32 UTC
    This sounds like a job for the .. operator.

    As described in the Perl Cookbook:

    while (<>) {
        if (/BEGIN PATTERN/ .. /END PATTERN/) {
            # line falls between BEGIN and END in the text.
        }
    }

    Just replace BEGIN PATTERN and END PATTERN as needed for your data.
      For those wondering why this works.....

      From the Perl Cookbook:

      The range operators, .. and ..., are probably the least understood of Perl's myriad operators. They were designed to allow easy extraction of ranges of lines without forcing the programmer to retain explicit state information. Used in scalar context, such as in the test of if and while statements, these operators return a true or false value that's partially dependent on what they last returned. The expression left_operand .. right_operand returns false until left_operand is true, but once that test has been met, it stops evaluating left_operand and keeps returning true until right_operand becomes true, after which it restarts the cycle. Put another way, the first operand turns on the construct as soon as it returns a true value, whereas the second one turns it off as soon as it returns true.
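To make the quoted description concrete, here is a minimal sketch of the flip-flop behaviour on the sample log from the question. The helper name `lines_between` and the in-memory `@log` array are assumptions made for illustration; they are not from the thread. Note that, per perlop, each `..` operator keeps its own state even across calls to the subroutine that contains it, so this sketch assumes every block in the input is properly closed.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return every line that lies
# between a line matching $begin and the next line matching $end, inclusive.
sub lines_between {
    my ($begin, $end, @lines) = @_;
    my @kept;
    for (@lines) {
        # In scalar context, .. is false until $begin matches $_, then stays
        # true up to and including the line on which $end matches.
        push @kept, $_ if /$begin/ .. /$end/;
    }
    return @kept;
}

# Sample data modelled on the log excerpt in the question.
my @log = (
    "2005/06/06 12:00:00 user=me func(",
    "blah blah blah",
    ") #end trans#",
    "2005/06/06 12:01:00 user=other func(",
    "blah blah",
    ") #end trans#",
);

# Prints only the three lines of the user=me transaction.
print "$_\n" for lines_between(qr/user=me\b/, qr/#end trans#/, @log);
```

No explicit "am I inside a block?" variable is needed here; the operator carries that state itself, which is the point the Cookbook passage makes.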

      TheStudent
        Hi there
        I carefully read everybody's opinions and found the .. operator to be the simplest way to do it. However, I think I didn't express myself very well, since your code (in general) is very complex on the regex side. Although it's valid, it is not necessary for the simple thing I needed to do, which I paste below.

        I abandoned the @array solutions because I'm working with 1.2 to 1.3 gigabyte files :)


        Nevertheless I deeply thank you all for your fast replies and for sharing with me that bit of knowledge.

        ps: I actually found fun in those while in while solutions :) geeee what a geek!

        Here is how it will end up:

        #!/usr/bin/perl
        use warnings;
        use strict;

        # It needs ARGS and readable file
        my $logfile=$ARGV[0];
        my $contract=$ARGV[1];

        unless (@ARGV == 2) {
            print "USAGE: $0 \"logfile\" \"numero\"\n";
            exit(1);
        }

        my $output="/tmp/session_".$contract.".txt";

        unless (-e $logfile) {
            # Portuguese: "The file $logfile used for input does not exist. Please check its name."
            print "O ficheiro $logfile usado para input n\343o existe. Verifique o nome do mesmo sff.\n";
            exit(1);
        }
        if (-e $output) {
            unlink($output);
        }

        open(OUTPUT,">>",$output) or die("Could not open conf file.");
        open(LOG, $logfile) or die("Could not open conf file.");

        while (<LOG>) {
            if (/:NOTICE:user=$contract,session=\d+:/ .. /\)\[/) {
                print OUTPUT $_
            }
        }
        close(LOG);
        close(OUTPUT);

        Actually, I had already accomplished that result using a flag variable, and it is actually 7 seconds faster than the .. operator, but I have deprecated it because the script above is better coded (IMHO).

        open(OUTPUT,">>",$output) or die("Could not open conf file.");
        open(LOG, $logfile) or die("Could not open conf file.");

        while (defined($line = <LOG>) ) {
            if ($line =~/:NOTICE:user=$contract,session=(\d+):/) {
                print OUTPUT $line;
                $echo=1;
            }
            elsif ($echo == 1) {
                print OUTPUT $line;
                if ($line =~/\)\[/) {
                    $echo=0;
                }
            }
        }

      This sounds like a job for the .. operator.

      Here's a possible implementation (using an AoA):

      use strict;
      use warnings;

      my $start = qr/
          (\d+\/\d+\/\d+)   # date
          \s+
          (\d+:\d+:\d+)     # time
          \s+
          user=([^\(]+)     # user
      /x;
      my $end = qr/^\)\s*#[^#]+#/;                  #U2

      my ( @parsed, @items );
      while ( <DATA> ) {
          chomp;
          my $j = /$start/ .. /$end/;
          next unless $j;
          @items = /$start/ if $j == 1;
          push @items, $_ if $j > 1 and $j !~ /E/;
          if ( $j =~ /E/ ) {                        #U
              $j = 0, next unless /#end trans#/;    #U
              push @parsed, [ @items ];             #U
          }                                         #U
      }

      # Print it:
      foreach my $aref ( @parsed ) {
          print "$_\n" for @$aref;
          print "\n";
      }

      __DATA__
      2005/06/06 12:00:00 user=me func(
      blah blah blah
      blah blah
      blah
      ) #end trans#
      2005/06/06 12:01:00 user=other func(
      blah blah blah
      blah blah
      ) #end trans#
      2005/06/06 12:01:00 user=yet other func(
      blah blah blah
      blah bl... OOOOPS!
      ) #trans aborted#
      2005/06/06 13:01:00 user=yet ANother func(
      blah blah blah
      foo bar baz
      ) #end trans#

      Update: Modified lines marked #U to take into account revised DATA (added another 'field' at the end). Previous code was:

      push @parsed, [ @items ] if $j =~ /E/;

      Update 2: Oops, sorry, just realised I forgot to update the line marked #U2. Now this is done, my previous update makes sense, I think...

Re: Fetch data between markers
by davidrw (Prior) on Jun 06, 2005 at 16:30 UTC
    What's the while inside while code you tried? (We can advise whether it's a minor fix, or compare it to other solutions.)

    As for other solutions, the first question, I think, is whether you can slurp in the whole log file or not. Assuming you can't due to size, you were on the right track with a while loop... perhaps something like:

    while (my $s1 = <LOG>) {
        next unless $s1 =~ m#^(\d+/\d+/\d+) (\d+:\d+:\d+) user=(\S+) func\(#;
        my ($date, $time, $user) = ($1, $2, $3);
        ... # above variables are available
        while (my $s2 = <LOG>) {
            last if $s2 =~ /^\) #end trans#/;
            ... # $s2 is log data
        }
    }

    Note you could do this w/a single while loop (as opposed to nested ones) if you used a boolean or two to keep track of being inside a log data section or not..
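A minimal sketch of that single-loop variant, assuming the log format shown in the question. The helper name `bodies_for_user` and the in-memory `@log` array are made up for illustration; with a real file you would read line by line from the filehandle instead, keeping the same flag logic.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): collect the body lines of each
# transaction belonging to $user, using one loop and one boolean flag.
sub bodies_for_user {
    my ($user, @lines) = @_;
    my @blocks;        # one array ref of body lines per matching transaction
    my $inside = 0;    # true while we are between a start and an end marker
    for my $line (@lines) {
        if (!$inside) {
            # Start marker: date, time, then the user we are looking for.
            if ($line =~ m{^\d+/\d+/\d+ \d+:\d+:\d+ user=\Q$user\E func\(}) {
                $inside = 1;
                push @blocks, [];
            }
        }
        elsif ($line =~ /^\) #end trans#/) {
            $inside = 0;    # end marker closes the current transaction
        }
        else {
            push @{ $blocks[-1] }, $line;
        }
    }
    return @blocks;
}

# Sample data modelled on the log excerpt in the question.
my @log = (
    "2005/06/06 12:00:00 user=me func(",
    "blah blah blah",
    "blah blah",
    "blah",
    ") #end trans#",
    "2005/06/06 12:01:00 user=other func(",
    "blah blah blah",
    "blah blah",
    ") #end trans#",
);

for my $block (bodies_for_user('me', @log)) {
    print "$_\n" for @$block;
    print "\n";
}
```

Lines belonging to other users are skipped simply because the flag never turns on for them, so only one pass over the data is needed.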

    Update: I like the above solution of $/ = '#end trans#' -- my original thought was to split() on the start line, but that would require the whole log in memory, so I switched back to the while/while method. Hopefully it will at least serve as a guide in debugging your while loop attempt.
Re: Fetch data between markers
by TedPride (Priest) on Jun 06, 2005 at 17:26 UTC
    You weren't specific as to whether you wanted the start marker to be date or user. I've assumed here that it's user, and that you only care about the associated data:
    use strict;
    use warnings;

    my $user = 'other';
    my $val;

    while (<DATA>) {
        if (m/^\d+\/\d+\/\d+ \d+:\d+:\d+ user=$user func\(/) {
            $val .= $_ while ($_ = <DATA>) !~ /\) #end trans#/;
            last;
        }
    }

    print $val;

    __DATA__
    2005/06/06 12:00:00 user=me func(
    blah blah blah
    blah blah
    blah
    ) #end trans#
    2005/06/06 12:01:00 user=other func(
    blah blah blah
    blah blah
    ) #end trans#