
Re: File Parsing and Pattern Matching

by johngg (Abbot)
on Sep 05, 2013 at 22:18 UTC (#1052630)


in reply to File Parsing and Pattern Matching

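I would read the records in paragraph mode rather than line by line, then pull out all the information with a single regex. The CAUSE and EFFECT fields are captured inside look-aheads that carry the 0 or 1 quantifier (?), so a record still matches when either field is missing and the corresponding capture is simply left undefined.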
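As a small illustration of paragraph mode on its own, here is a minimal sketch. The helper name read_paragraphs and the sample string are mine, made up for the illustration. Setting $/ to the empty string makes each read return one blank-line separated chunk, and runs of several blank lines count as a single separator.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script below): split a string
# into blank-line separated records using paragraph mode.
sub read_paragraphs {
    my ($text) = @_;
    open my $fh, q{<}, \ $text or die $!;   # in-memory file handle
    local $/ = q{};                         # paragraph mode
    my @records = <$fh>;                    # one element per paragraph
    close $fh;
    chomp @records;     # in paragraph mode chomp strips all trailing newlines
    return @records;
}

my @recs = read_paragraphs(
    "// HEADER\n\nTYPE A\nCAUSE X\nENDTYPE\n\n\nTYPE B\nENDTYPE\n" );
print scalar( @recs ), qq{ records read\n};
```

The local keeps the change to $/ confined to the helper, which is the same reason the script below wraps its read loop in a bare block.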

use strict;
use warnings;
use 5.014;

use Data::Dumper;

open my $inFH, q{<}, \ <<EOD or die $!;
// HEADER TAG
// VERSION TAG

TYPE VALUE1
EQUALS MAIN
I am useless text
CAUSE FAIL
EFFECT ERROR
ENDTYPE

TYPE VALUE2
EQUALS MAIN
I am useful test
ENDTYPE

TYPE VALUE3
EQUALS MAIN
CAUSE DEGRADED
ENDTYPE

TYPE VALUE4
EQUALS MAIN
EFFECT WARNING
ENDTYPE
EOD

my $rxExtract = qr
    {(?xs)
        TYPE\s ( \S+ )
        (?= .* (?: CAUSE\s ( \S+ ) ) )?
        (?= .* (?: EFFECT\s ( \S+ ) ) )?
    };

my %results;
{
    local $/ = q{};
    scalar <$inFH>;
    while ( <$inFH> )
    {
        next unless m{$rxExtract};
        $results{ $1 } = {
            CAUSE  => defined $2 ? $2 : q{UNDEF},
            EFFECT => defined $3 ? $3 : q{UNDEF},
        };
    }
}

say qq{$_:$results{ $_ }->{ CAUSE },$results{ $_ }->{ EFFECT }}
    for sort keys %results;
print qq{\n};
print Data::Dumper
    ->new( [ \ %results ], [ qw{ *results } ] )
    ->Sortkeys( 1 )
    ->Dumpxs();

The results.

VALUE1:FAIL,ERROR
VALUE2:UNDEF,UNDEF
VALUE3:DEGRADED,UNDEF
VALUE4:UNDEF,WARNING

%results = (
             'VALUE1' => {
                           'CAUSE' => 'FAIL',
                           'EFFECT' => 'ERROR'
                         },
             'VALUE2' => {
                           'CAUSE' => 'UNDEF',
                           'EFFECT' => 'UNDEF'
                         },
             'VALUE3' => {
                           'CAUSE' => 'DEGRADED',
                           'EFFECT' => 'UNDEF'
                         },
             'VALUE4' => {
                           'CAUSE' => 'UNDEF',
                           'EFFECT' => 'WARNING'
                         }
           );
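If the single regex feels too dense, a possible alternative is one small match per field on each paragraph. This is only a sketch of a different technique, not the method used above: the helper name parse_records is hypothetical, and it assumes each keyword starts its own line, as in the sample data. A failed match in list context returns an empty list, so the variable stays undefined and the defined-or operator (//, Perl 5.10 or later) supplies the UNDEF placeholder.

```perl
use strict;
use warnings;
use 5.010;    # for say and the // (defined-or) operator

# Hypothetical helper: parse blank-line separated records, one match per field.
sub parse_records {
    my ($text) = @_;
    my %results;
    open my $fh, q{<}, \ $text or die $!;
    local $/ = q{};                          # paragraph mode
    while ( my $record = <$fh> ) {
        # Paragraphs without a TYPE line (such as the header) are skipped.
        my ($type) = $record =~ m{^TYPE\s+(\S+)}m or next;
        my ($cause)  = $record =~ m{^CAUSE\s+(\S+)}m;
        my ($effect) = $record =~ m{^EFFECT\s+(\S+)}m;
        $results{ $type } = {
            CAUSE  => $cause  // q{UNDEF},
            EFFECT => $effect // q{UNDEF},
        };
    }
    close $fh;
    return \ %results;
}

my $parsed = parse_records(
    "// HEADER TAG\n\nTYPE VALUE1\nCAUSE FAIL\nEFFECT ERROR\nENDTYPE\n\n"
  . "TYPE VALUE3\nCAUSE DEGRADED\nENDTYPE\n" );
say qq{$_:$parsed->{ $_ }->{ CAUSE },$parsed->{ $_ }->{ EFFECT }}
    for sort keys %$parsed;
```

The trade-off is three passes over each record instead of one, in exchange for patterns that are easier to read and extend when more optional fields turn up.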

I hope this is of interest.

Cheers,

JohnGG

