PerlMonks  

Re^2: How best to strip text from a file?

by Anonymous Monk
on Nov 05, 2012 at 14:38 UTC


in reply to Re: How best to strip text from a file?
in thread How best to strip text from a file?

I have a similar but different problem. Say I have a file with a list of records. Every record has at least one field, "FOO:"; "BAR:" and "BAZ:" are optional fields. Each value may span multiple lines, and the line breaks aren't consistent between fields, e.g.:

FOO: Lorem ipsum dolor sit amet,
consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore
BAR: 2012
BAZ: 1234-567-890
FOO: test
BAZ: 0987-654-321
FOO: test2
BAR: 2014

I'm having a hard time getting my head around regexes, and any help would be appreciated.


Re^3: How best to strip text from a file?
by Corion (Pope) on Nov 05, 2012 at 14:49 UTC

    Where does one record end and the next record start?

    If FOO: marks the start of a new record, I wouldn't try to capture everything with one regular expression. Instead, go through the input line by line: when a line starts with a field name, switch to collecting into that field, and when a new FOO: start marker is found, flush the data collected so far:

    use strict;
    use Data::Dumper;

    my %record;

    sub flush {
        print Dumper \%record;
        %record = ();
    };

    my $current;
    while (<DATA>) {
        if( /^(FOO):(.*)/ ) {
            flush() if keys %record;
            $current = $1;
            $record{ $current } .= $2;
        } elsif( /^([A-Z]+):(.*)/ ) {
            $current = $1;
            $record{ $current } .= $2;
        } else {
            $record{ $current } .= $_;
        };
    };
    flush() if keys %record;

    __DATA__
    FOO: Lorem ipsum dolor sit amet,
    consectetur adipisicing elit, sed do eiusmod
    tempor incididunt ut labore et dolore
    BAR: 2012
    BAZ: 1234-567-890
    FOO: test
    BAZ: 0987-654-321
    FOO: test2
    BAR: 2014
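
    A possible variation on the same line-by-line idea, as a rough sketch only: instead of dumping each record as soon as it is complete, collect the records into a list of hashes and join continuation lines with a single space. The helper name parse_records and the whitespace handling are assumptions made for illustration; they are not part of the code above. It also assumes, as above, that field names are all uppercase and that FOO: opens every record.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a list of lines into a list of hash refs,
# one per record. A line starting with "FOO:" is assumed to open a record.
sub parse_records {
    my @lines = @_;
    my ( @records, $record, $field );

    for my $line (@lines) {
        chomp $line;
        if ( $line =~ /^([A-Z]+):\s*(.*)/ ) {
            my ( $name, $value ) = ( $1, $2 );
            $value =~ s/\s+$//;              # drop trailing whitespace
            if ( $name eq 'FOO' ) {
                $record = {};                # start a new record
                push @records, $record;
            }
            next unless $record;             # ignore fields before the first FOO:
            $field = $name;
            $record->{$field} = $value;
        }
        elsif ( $record && defined $field ) {
            # Continuation line: trim it and append to the current field.
            $line =~ s/^\s+|\s+$//g;
            next unless length $line;
            $record->{$field} =
                length $record->{$field}
              ? "$record->{$field} $line"
              : $line;
        }
    }
    return @records;
}

# Sample input, split into lines (newlines kept, as when reading a file).
my @sample = split /^/, <<'END';
FOO: Lorem ipsum dolor sit amet,
consectetur adipisicing elit
BAR: 2012
FOO: test
BAZ: 0987-654-321
END

for my $r ( parse_records(@sample) ) {
    print join( '; ', map { "$_=$r->{$_}" } sort keys %$r ), "\n";
}
```

    Keeping the records in a list makes it easier to sort or filter them later; the trade-off is that the whole file is held in memory, which the flush-as-you-go version avoids.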
