PerlMonks  

Parsing file and joining content into string

by Mark.Allan (Sexton)
on Jan 15, 2013 at 18:06 UTC (#1013434, perlquestion)
Mark.Allan has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

I am not asking anyone to write code for me, but I am looking for good advice on where to start with my issue.

I have a flat file. Reading its contents is simple enough, but the part I can't work out is how to join the contents that fall between two markers in the file into one complete string.

Here is an extract from the test file:

# 20130115 175816.654330
modify 0;mc_ueid='mc.hq_i_aix_01.10f59802.0';
mc_modhist=[hq_i_aix_01];
repeat_count=1;
END
# 20130115 175817.304403
modify 0;mc_ueid='mc.hq_i_aix_01.10d41bfd.2';
mc_modhist=[hq_i_aix_01];
repeat_count=14555;
END
# 20130115 175817.615425
modify 0;mc_ueid='mc.hq_i_aix_01.10c99dfe.1';
mc_modhist=[hq_i_aix_01];
repeat_count=14605;
END
# 20130115 175818.571722
modify 0;mc_ueid='mc.hq_i_aix_01.10f58d1f.0';
mc_modhist=[hq_i_aix_01];
repeat_count=10;
END

I need to start at the # of each entry and finish at the END, remove the carriage returns, and join the data into one complete string, keeping the ";" separators. I need to do this for every block of data that falls between # and END. I hope I've explained it OK.

Any help would be much appreciated.

Re: Parsing file and joining content into string
by Old_Gray_Bear (Bishop) on Jan 15, 2013 at 18:20 UTC
    Take a look at the special variable $/, AKA $INPUT_RECORD_SEPARATOR. If you set it to 'END', each read will return one whole record (everything up to and including the next END) instead of a single line. Slice 'n' dice of the record is left as an exercise.
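A minimal sketch of the $/ approach, assuming each record ends with END on a line of its own and the file uses Unix line endings. The helper name read_records and the sample layout are illustrative and not from the original post. The separator is set to "END\n" rather than 'END' so that the line break after the marker is consumed as well, and the header line is joined to the rest with a single space, which is a guess at the desired output.

```perl
use strict;
use warnings;

# Hypothetical helper: read "#" ... "END" records from an open filehandle
# and return each one flattened into a single string.
sub read_records {
    my ($fh) = @_;
    local $/ = "END\n";    # a "line" now ends at END, not at every newline
    my @records;
    while ( my $record = <$fh> ) {
        chomp $record;                 # chomp strips the current $/, i.e. "END\n"
        $record =~ s/;\n/;/g;          # join the field lines, keeping each ';'
        $record =~ s/\n/ /g;           # any other line break becomes one space
        $record =~ s/^\s+|\s+$//g;     # trim leading and trailing whitespace
        next unless length $record;    # skip whitespace-only leftovers
        push @records, $record;
    }
    return @records;
}

# Sample input modelled on the extract in the question (layout assumed).
my $sample = <<'EOT';
# 20130115 175816.654330
modify 0;mc_ueid='mc.hq_i_aix_01.10f59802.0';
mc_modhist=[hq_i_aix_01];
repeat_count=1;
END
# 20130115 175817.304403
modify 0;mc_ueid='mc.hq_i_aix_01.10d41bfd.2';
mc_modhist=[hq_i_aix_01];
repeat_count=14555;
END
EOT

open my $in, '<', \$sample or die "Cannot open in-memory file: $!";
print "$_\n" for read_records($in);
close $in;
```

Because $/ is changed with local inside the sub, the rest of the program keeps reading line by line as usual.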

    ----
    I Go Back to Sleep, Now.

    OGB

Re: Parsing file and joining content into string
by LanX (Abbot) on Jan 15, 2013 at 18:33 UTC
    > I am not asking for anyone to write any code here but I am looking for good advice

    great! =)

    > I need to start at the # of each entry and finish at the END

    Two possible approaches:

    1. Change the INPUT_RECORD_SEPARATOR to two line breaks: $/ = "\n\n"

    Iterating with  while ( $chunk = <$INPUT> ) will then give you multiple lines at a time to process in chunks.

    2. Use the flip-flop operator '..' to read all lines from the start pattern through the end pattern:

    if ( $line =~ /^# \d+ \d+\.\d+$/ .. $line =~ /^END$/ ) {
        $chunk .= $line;
    }
    else {
        process($chunk) if $chunk;
        $chunk = '';
    }

    You're free to directly process($line) and to skip the '$chunk'-part completely.
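The flip-flop approach can be sketched as a complete loop. This is only an illustration under a few assumptions: the header line looks like "# date time", END sits on a line of its own, and the file uses Unix line endings. The helper name collect_chunks is made up for the example. Unlike the fragment above, it hands each chunk over as soon as END is seen, so it also works when records follow each other without blank lines in between.

```perl
use strict;
use warnings;

# Hypothetical helper: gather each "# date time" ... END block into one string.
sub collect_chunks {
    my ($fh) = @_;
    my @chunks;
    my $chunk = '';
    while ( my $line = <$fh> ) {
        # In scalar context '..' is the flip-flop: it turns true on the line
        # matching the left pattern and stays true up to and including the
        # line matching the right pattern.
        if ( $line =~ /^# \d+ \d+\.\d+$/ .. $line =~ /^END$/ ) {
            $chunk .= $line;
            if ( $line =~ /^END$/ ) {    # record complete: store it and reset
                push @chunks, $chunk;
                $chunk = '';
            }
        }
    }
    return @chunks;
}

# Sample input modelled on the extract in the question (layout assumed).
my $sample = <<'EOT';
# 20130115 175816.654330
modify 0;mc_ueid='mc.hq_i_aix_01.10f59802.0';
mc_modhist=[hq_i_aix_01];
repeat_count=1;
END
# 20130115 175817.304403
modify 0;mc_ueid='mc.hq_i_aix_01.10d41bfd.2';
mc_modhist=[hq_i_aix_01];
repeat_count=14555;
END
EOT

open my $in, '<', \$sample or die "Cannot open in-memory file: $!";
print "--- record ---\n$_" for collect_chunks($in);
close $in;
```

Lines outside any "#" to END range are ignored, because the flip-flop is false for them.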

    > join the contents of the data into one complete string keeping the ";" separators.

    Well, $chunk is a complete string now, so use a substitution that replaces ";\n" with ";".

    If you process line by line, try chomp $line to remove the trailing "\n".
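Both joining options can be sketched side by side. The helper names flatten_chunk and join_lines are hypothetical, and the sample string assumes each field line ends with ";" followed by a newline.

```perl
use strict;
use warnings;

# Whole-chunk approach: delete the newline that follows each ';'.
sub flatten_chunk {
    my ($chunk) = @_;
    $chunk =~ s/;\n/;/g;
    return $chunk;
}

# Line-by-line approach: chomp every line, then concatenate them.
sub join_lines {
    my @lines = @_;
    chomp @lines;
    return join '', @lines;
}

my $chunk = "modify 0;mc_ueid='mc.hq_i_aix_01.10f59802.0';\n"
          . "mc_modhist=[hq_i_aix_01];\n"
          . "repeat_count=1;\n";

print flatten_chunk($chunk), "\n";

# split /^/ breaks the string into lines and keeps each trailing newline.
print join_lines( split /^/, $chunk ), "\n";
```

For this sample both calls print the same single-line string. They differ on lines that do not end in ";": the substitution leaves those line breaks in place, while chomp removes every one.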

    > I need to do this for each occurrence of data which falls between # and END.

    Do you need to get rid of the start and end lines?

    Then substitute them away using the patterns given above; see s///.

    Or extend the flip-flop condition by appending  and not //

    An empty pattern '//' reuses the last successfully matched pattern, which in your case is the one that matched the first or the last line, so  not  excludes them.
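A sketch of dropping the delimiter lines, using a different mechanism from the empty-pattern trick: in scalar context the range operator returns a sequence number, which is 1 on the first line of a range and has "E0" appended on the last. Testing that value avoids depending on which pattern matched last. The helper name body_lines is hypothetical, and the patterns assume the same header and END layout as above.

```perl
use strict;
use warnings;

# Hypothetical helper: return only the body lines of each record,
# without the "# date time" header and without the END marker.
sub body_lines {
    my ($fh) = @_;
    my @body;
    while ( my $line = <$fh> ) {
        # The flip-flop yields '' outside a range, 1 on the first line of a
        # range, and a number with "E0" appended on the last line.
        my $seq = ( $line =~ /^# \d+ \d+\.\d+$/ .. $line =~ /^END$/ );
        next unless $seq;                       # outside any record
        next if $seq == 1 or $seq =~ /E0$/;     # header line or END line
        chomp $line;
        push @body, $line;
    }
    return @body;
}

# Sample input modelled on the extract in the question (layout assumed).
my $sample = <<'EOT';
# 20130115 175816.654330
modify 0;mc_ueid='mc.hq_i_aix_01.10f59802.0';
mc_modhist=[hq_i_aix_01];
repeat_count=1;
END
EOT

open my $in, '<', \$sample or die "Cannot open in-memory file: $!";
print join( '', body_lines($in) ), "\n";
close $in;
```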

    So I'm curious to see your code and I hope others won't take the fun away! =)

    HTH!

    Cheers Rolf

Re: Parsing file and joining content into string
by flexvault (Vicar) on Jan 15, 2013 at 18:51 UTC

    Welcome Mark.Allan,

    If you loop on the file as:

    my $string = "";
    while ( <$IN> ) {    ## $IN needs to be open
        chomp;
        if ( $_ eq "END" ) {
            $string = join ( " ", split " ", $string );
            # $string =~ s/$other//g;    ## In case you need to remove other characters
            print $OUT "$string\n";      ## $OUT is open output file
            $string = "";
            next;
        }
        $string .= $_;
    }

    I know you didn't want code, but it was easier to type code than to describe it :-)

    That special use of 'join' is in the Camel book (3rd edition) on page 154. It removes whitespace before and after the string, and reduces runs of multiple spaces to one. I typed this in, so there may be some typos, but it should get you going. Remember to put 'use strict; use warnings;' in your code.
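The join/split idiom can be tried in isolation. The helper name squeeze_whitespace is made up for this sketch. It relies on the special case where split is given a single space string, which splits on any run of whitespace and ignores leading whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: trim both ends and collapse runs of whitespace
# (spaces, tabs, newlines) into single spaces.
sub squeeze_whitespace {
    my ($text) = @_;
    return join ' ', split ' ', $text;
}

print squeeze_whitespace("  repeat_count=1;   mc_modhist=[hq_i_aix_01]; \n"), "\n";
```

Note that this turns line breaks into spaces rather than removing them, so fields joined this way are separated by "; " unless the lines were already concatenated without whitespace.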

    Hope it helps.

    "Well done is better than well said." - Benjamin Franklin
