Parsing file and joining content into string

by Mark.Allan (Sexton)
on Jan 15, 2013 at 18:06 UTC ( [id://1013434]=perlquestion )

Mark.Allan has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

I am not asking for anyone to write any code here, but I am looking for good advice on where to start with my issue.

I have a flat file and need to read its contents (which is simple enough), but the part I can't work out is how to join the contents into one complete string between markers in the file.

Here is an extract from the test file:

# 20130115 175816.654330
modify 0;mc_ueid='mc.hq_i_aix_01.10f59802.0';
mc_modhist=[hq_i_aix_01];
repeat_count=1;
END
# 20130115 175817.304403
modify 0;mc_ueid='mc.hq_i_aix_01.10d41bfd.2';
mc_modhist=[hq_i_aix_01];
repeat_count=14555;
END
# 20130115 175817.615425
modify 0;mc_ueid='mc.hq_i_aix_01.10c99dfe.1';
mc_modhist=[hq_i_aix_01];
repeat_count=14605;
END
# 20130115 175818.571722
modify 0;mc_ueid='mc.hq_i_aix_01.10f58d1f.0';
mc_modhist=[hq_i_aix_01];
repeat_count=10;
END

I need to start at the # of each entry and finish at the END, remove the carriage returns, and join the data into one complete string, keeping the ";" separators. I need to do this for each block of data which falls between # and END. I hope I've explained it OK.

Any help would be much appreciated.

Replies are listed 'Best First'.
Re: Parsing file and joining content into string
by LanX (Saint) on Jan 15, 2013 at 18:33 UTC
    > I am not asking for anyone to write any code here but I am looking for good advice

    great! =)

    > I need to start at the # of each entry and finish at the END

    2 possible approaches

    1. you change the INPUT_RECORD_SEPARATOR to 2 linebreaks: $/ = "\n\n" (this only helps if your records are separated by an empty line)

    Iterating now with while ( $chunk = <$INPUT> ) will give you multiple lines to process in chunks.

    2. you use the flip-flop operator '..' to read all lines from the start pattern through the end pattern

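    if ( $line =~ /^# \d+ \d+\.\d+$/ .. $line =~ /^END$/ ) {
        $chunk .= $line;               # line belongs to the current record
    }
    else {
        process($chunk) if $chunk;     # first line outside a record: hand over what was collected
        $chunk = '';
    }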

    You're free to directly process($line) and to skip the '$chunk'-part completely.
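
    Approach 1 above could look roughly like the sketch below. It assumes the records are separated by an empty line, which the posted extract does not confirm; the in-memory sample data and the @chunks array are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data; the blank line between records is an assumption.
my $data = <<'EOT';
# 20130115 175816.654330
modify 0;mc_ueid='mc.hq_i_aix_01.10f59802.0';
mc_modhist=[hq_i_aix_01];
repeat_count=1;
END

# 20130115 175817.304403
modify 0;mc_ueid='mc.hq_i_aix_01.10d41bfd.2';
mc_modhist=[hq_i_aix_01];
repeat_count=14555;
END
EOT

# Read from the string as if it were a file.
open my $INPUT, '<', \$data or die "Cannot open in-memory file: $!";

my @chunks;
{
    local $/ = "\n\n";    # one read now returns one blank-line-delimited chunk
    while ( my $chunk = <$INPUT> ) {
        $chunk =~ s/\s+\z//;    # drop the trailing separator/newlines
        push @chunks, $chunk;
    }
}
close $INPUT;

print scalar(@chunks), " chunks read\n";
```

    Using local inside a bare block keeps the changed $/ from leaking into the rest of the program.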

    > join the contents of the data into one complete string keeping the ";" separators.

    well $chunk is a complete string now, do a regex that substitutes ";\n" with ";"

    If you process line-by-line try chomp $line to kill all '\n'
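
    A minimal sketch of that substitution, assuming the record is already held in one string with its newlines intact. The variable names are only illustrative, and the optional \s* is an addition to swallow any indentation on continuation lines.

```perl
use strict;
use warnings;

# One record in a single string, newlines still in place (values from the question).
my $chunk = "modify 0;mc_ueid='mc.hq_i_aix_01.10f59802.0';\n"
          . "mc_modhist=[hq_i_aix_01];\n"
          . "repeat_count=1;\n";

# Replace each ";" + newline (plus any whitespace that follows) with a bare ";".
( my $joined = $chunk ) =~ s/;\n\s*/;/g;

print "$joined\n";
```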

    > I need to do this for each occurrence of data which falls between # and END.

    You need to get rid of the start and end lines?

    Then substitute them away with the patterns given, see s///

    Or extend the flip-flop expression and add "and not //"

    An empty pattern '//' reuses the last successfully matched pattern. On the first and last line of a record that is the start or end pattern, so '//' matches exactly those two lines, and 'not' excludes them.
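
    Put together, the "and not //" idea might look like the sketch below. The sample data, @records and the in-memory filehandle are assumptions for illustration. The trick depends on the start or end pattern being the last successful match, so the loop body deliberately avoids any further regex matches.

```perl
use strict;
use warnings;

# Hypothetical sample data: two records with no blank line between them.
my $data = <<'EOT';
# 20130115 175816.654330
modify 0;mc_ueid='mc.hq_i_aix_01.10f59802.0';
mc_modhist=[hq_i_aix_01];
repeat_count=1;
END
# 20130115 175817.304403
modify 0;mc_ueid='mc.hq_i_aix_01.10d41bfd.2';
mc_modhist=[hq_i_aix_01];
repeat_count=14555;
END
EOT

open my $INPUT, '<', \$data or die "Cannot open in-memory file: $!";

my @records;
my $chunk = '';
while ( my $line = <$INPUT> ) {
    if ( ( $line =~ /^# \d+ \d+\.\d+$/ .. $line =~ /^END$/ ) and not $line =~ // ) {
        # Inside a record, but neither the "#" line nor the END line.
        chomp $line;
        $chunk .= $line;
    }
    else {
        # Marker lines and anything outside a record end up here.
        push @records, $chunk if length $chunk;
        $chunk = '';
    }
}
close $INPUT;

print "$_\n" for @records;
```

    Because the END line itself lands in the else branch, each record is flushed as soon as it is complete, even when the next record follows immediately.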

    So I'm curious to see your code and I hope others won't take the fun away! =)

    HTH!

    Cheers Rolf

Re: Parsing file and joining content into string
by Old_Gray_Bear (Bishop) on Jan 15, 2013 at 18:20 UTC
    Take a look at the special variable $/ AKA $INPUT_RECORD_SEPARATOR. If you set it to 'END', then your read will return all the data a paragraph at a time. Slice 'n' dice of the paragraph is left as an exercise.
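
    One way that suggestion could be sketched, assuming every record ends with END on a line of its own. It uses "END\n" rather than a bare 'END' so the newline after the marker is consumed as well; dropping the "#" timestamp line is optional, and the sample data and @records are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data in the layout described in the question.
my $data = <<'EOT';
# 20130115 175816.654330
modify 0;mc_ueid='mc.hq_i_aix_01.10f59802.0';
mc_modhist=[hq_i_aix_01];
repeat_count=1;
END
# 20130115 175817.304403
modify 0;mc_ueid='mc.hq_i_aix_01.10d41bfd.2';
mc_modhist=[hq_i_aix_01];
repeat_count=14555;
END
EOT

open my $INPUT, '<', \$data or die "Cannot open in-memory file: $!";

my @records;
{
    local $/ = "END\n";    # a "line" now ends at the END marker
    while ( my $para = <$INPUT> ) {
        chomp $para;                        # strips the trailing "END\n"
        $para =~ s/\A# \d+ \d+\.\d+\n//;    # drop the "#" timestamp line (optional)
        $para =~ s/\n\s*//g;                # join the remaining lines
        push @records, $para if length $para;
    }
}
close $INPUT;

print "$_\n" for @records;
```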

    ----
    I Go Back to Sleep, Now.

    OGB

Re: Parsing file and joining content into string
by flexvault (Monsignor) on Jan 15, 2013 at 18:51 UTC

    Welcome Mark.Allan,

    If you loop on the file as:

    my $string = "";
    while ( <$IN> )    ## $IN needs to be open
    {
        chomp;
        if ( $_ eq "END" )
        {
            $string = join ( " ", split " ", $string );
            # $string =~ s/$other//g;   ## In case you need to remove other characters
            print $OUT "$string\n";     ## $OUT is open output file
            $string = "";
            next;
        }
        $string .= $_;
    }

    I know you didn't want code, but it was easier to type code than describe :-)

    That special use of 'join' is in the Camel book (3rd edition) on page 154. It removes whitespace before and after the string, plus reduces multiple spaces to 1. I typed this in, so there may be some typos, but it should get you going, and remember to use 'use strict; use warnings;' in your code.
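
    To see what that join/split idiom does in isolation, here is a small sketch. The input string is made up, with shortened placeholder values 'x' and y.

```perl
use strict;
use warnings;

# Made-up input with leading, trailing and repeated whitespace.
my $string = "   modify 0;mc_ueid='x';    mc_modhist=[y];\t repeat_count=1;  ";

# split " " (a single-space string, not a regex) splits on runs of whitespace
# and ignores leading whitespace; join " " glues the pieces back with one space.
my $clean = join " ", split " ", $string;

print "[$clean]\n";
```

    Note that this leaves a single space wherever the input had whitespace, so fields that were indented on their own lines come out separated by "; " rather than a bare ";".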

    Hope it helps.

    "Well done is better than well said." - Benjamin Franklin
