Re: get n lines before or after a pattern

by davido (Archbishop)
on Jul 25, 2012 at 16:33 UTC (#983698)

in reply to get n lines before or after a pattern

When you hear yourself saying, "I need to know what comes n lines before XYZ," you should be thinking, "I need to stash the n previous lines while I iterate through the file." When you hear yourself saying, "I need to know what comes after XYZ until PDQ is found," you should be thinking about how to track state (i.e., how to remember that the trigger has been seen). You can keep track of state with a variable, or you can do it by flowing into a different branch of code. The snippet below accomplishes your goal by keeping the two most recent lines stashed at all times (clearing them only after XYZ is found), and by flowing into a different branch once XYZ has been found, until PDQ shows up.

As I mentioned above, this is one of several common ways of dealing with state.

use strict;
use warnings;

my $find = 'jack';
my $trigger_re = qr{^name\s+$find\b};
my $finally_re = qr(^lastname\s+\p{Alpha}+\b);

my @stash;
while( my $line = <DATA> ) {
    chomp $line;
    if( $line =~ $trigger_re ) {
        print "$_\n" for @stash;
        @stash = ();
        print $line, "\n";
        while ( my $next = <DATA> ) {
            if( $next =~ $finally_re ) {
                print $next;
                last;
            }
        }
    }
    else {
        push @stash, $line;
        while( @stash > 2 ) {
            shift @stash;
        }
    }
}

__DATA__
start
id 10
address Richmond
name jack
xxxxx
aaaaa
lastname black
yyyy
zzzzz
id 11
address Central
name rick
cccccc
dddddd
lastname hanna
eeeee
yyyyy
id 12
address denver
name jack
sssss
tttttt
lastname strong
rrrrr
mmmmm
id 13
address Virginia
name mick
aaaaaaa
ooooooo
lastname jagger
gggggg
hhhhhh
id 14
address Maine
name rick
sssss
sssss
lastname stewart
ssssss
ffffff
end

The output is...

id 10
address Richmond
name jack
lastname black
id 12
address denver
name jack
lastname strong
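For comparison, here is a rough sketch of the other approach mentioned earlier: tracking state with a variable instead of a nested loop. The sub name extract_records, the $in_record flag, and the $keep parameter are my own inventions for illustration, and the sub returns the selected lines rather than printing them so that it is easier to reuse. Treat it as one possible shape, not the canonical one.

```perl
use strict;
use warnings;

# Hypothetical helper: same selection logic as the snippet above, but the
# "have we seen the trigger?" state lives in a flag rather than in a nested loop.
sub extract_records {
    my ( $fh, $trigger_re, $finally_re, $keep ) = @_;
    my ( @stash, @out );
    my $in_record = 0;    # true between the trigger line and the closing line

    while ( my $line = <$fh> ) {
        chomp $line;
        if ($in_record) {
            # Ignore everything until the closing line turns up.
            if ( $line =~ $finally_re ) {
                push @out, $line;
                $in_record = 0;
            }
        }
        elsif ( $line =~ $trigger_re ) {
            push @out, @stash, $line;
            @stash     = ();
            $in_record = 1;
        }
        else {
            # Remember at most $keep preceding lines.
            push @stash, $line;
            shift @stash while @stash > $keep;
        }
    }
    return @out;
}

# Small in-memory sample, shortened from the data above.
my $data = join "\n", 'start', 'id 10', 'address Richmond', 'name jack',
    'xxxxx', 'lastname black', 'yyyy', 'id 11', 'address Central',
    'name rick', 'lastname hanna', 'end';
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print "$_\n"
    for extract_records( $fh, qr{^name\s+jack\b}, qr{^lastname\s+\p{Alpha}+\b}, 2 );
```

The flag version reads the file in a single loop, which tends to be easier to extend when more states are needed; the nested-loop version keeps the "inside a record" logic visually separate.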

If the stash hasn't received two lines ahead of "name jack", it will quietly print however many it accumulated (max 2). If the "lastname" never shows up, it will quietly read through to the end of the file. This may not be what you want; it's possible that you'll want to carp about a malformed record the moment the next "name" shows up. That's pretty easy to implement, so I'll leave it to you if you find it advantageous. Similarly, it's a simple check to verify that two lines are stored in @stash prior to printing, and it would be easy to carp a warning about a malformed record there as well.
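If you do want those checks, something along these lines might serve as a starting point. It is only a sketch under a few assumptions of mine: the sub name extract_checked, the $name_re parameter (the pattern that marks the start of any record's "name" line), and the wording of the warnings are all made up for illustration. It uses a state flag rather than a nested loop so that a premature "name" line can be handled as a normal line after the warning.

```perl
use strict;
use warnings;
use Carp qw(carp);

# Hypothetical helper: like the snippet above, but carps on malformed records.
sub extract_checked {
    my ( $fh, $trigger_re, $finally_re, $name_re, $keep ) = @_;
    my ( @stash, @out );
    my $in_record = 0;    # true between the trigger line and the closing line
    my $line_no   = 0;

    while ( my $line = <$fh> ) {
        chomp $line;
        $line_no++;

        if ( $in_record && $line =~ $name_re ) {
            carp "Malformed record: new 'name' at line $line_no before closing line";
            $in_record = 0;    # fall through so this line is handled normally
        }
        if ($in_record) {
            if ( $line =~ $finally_re ) {
                push @out, $line;
                $in_record = 0;
            }
            next;
        }
        if ( $line =~ $trigger_re ) {
            carp "Malformed record: only " . @stash
                . " line(s) stashed before line $line_no"
                if @stash < $keep;
            push @out, @stash, $line;
            @stash     = ();
            $in_record = 1;
        }
        else {
            push @stash, $line;
            shift @stash while @stash > $keep;
        }
    }
    carp "Malformed record: input ended before closing line" if $in_record;
    return @out;
}

# Deliberately broken sample: the first record has no "lastname" line.
my $data = "id 10\nname jack\nxxxxx\nname jack\nlastname black\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print "$_\n"
    for extract_checked( $fh, qr{^name\s+jack\b}, qr{^lastname\s+\p{Alpha}+\b},
        qr{^name\s}, 2 );
```

Note that lines inside a record are never stashed, so a record that starts before the previous one has closed will also trigger the "too few stashed lines" warning. Whether that is noise or useful information depends on your data.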

I build the regexes outside of the loop just to keep the loop code as simple (and general) as possible. This has the added efficiency benefit of ensuring that the regex that contains variable interpolation will only be compiled once rather than each time through the loop.
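One small refinement you might consider when the name to find comes from user input: quote it with \Q...\E before it goes into the pattern, so that any regex metacharacters in it match literally. The helper name make_trigger below is hypothetical, and the quoting is my addition rather than something the snippet above does.

```perl
use strict;
use warnings;

# Hypothetical helper: build the trigger pattern once for a given name.
sub make_trigger {
    my ($find) = @_;
    # \Q...\E quotes regex metacharacters in $find so they match literally.
    return qr{^name\s+\Q$find\E\b};
}

my $trigger_re = make_trigger('jack');    # compiled once, reused for every line
for my $line ( 'name jack', 'name jackson', 'surname jack' ) {
    printf "%-14s %s\n", $line, ( $line =~ $trigger_re ? 'match' : 'no match' );
}
```

Be aware that the trailing \b only behaves as expected when the name ends in a word character, which is the case for plain names like the ones in the sample data.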

