addn'l help with parsing here doc

by smackdab (Pilgrim)
on Oct 17, 2003 at 02:09 UTC (#299923=perlquestion)
smackdab has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I had some great help the other day from jonadab and others and would like a little more on adapting this code...

It works great for parsing a file, but sometimes I have already read the file into a $buffer and need to do the same processing.

If there is a way to easily convert the neat "local $/" file-handle trick, I don't see it. Do I need to somehow keep track of the character count in the $buffer?

The code parses config files containing key=value pairs and heredocs:

    while (<CONFIG>) {
        if (/^\s*#/) {
            # ignore comment line
        } elsif (/^\s*$/) {
            # ignore blank line
        } elsif (/(\w+)\s*=\s*[<]{2}(\w+)/) { # heredoc
            (my $name, local $/) = ($1, "\n$2"); # ++ysth
            $config{$name} = <CONFIG>;
            chomp $config{$name}; # as etcshadow points out.
        } elsif (/(\w+)\s*=\s*(.*?)\s*$/) { # regular pair
            $config{$1}=$2;
        } else {
            warn "Ptooey: Could not parse config line: $_\n";
        }
    }

Replies are listed 'Best First'.
Re: addn'l help with parsing here doc
by graff (Chancellor) on Oct 17, 2003 at 03:29 UTC
    Given that some entire config file has been loaded into $buffer, it would seem easiest to split that into lines, and then behave pretty much the same way as reading from a file, except that "local $/" is of no use in this case. The following is untested:
        my @lines = split /\n/, $buffer;
        my ( $name, $end ) = ( '', '' );
        for ( @lines ) {
            if ( $name ) {                          # inside a heredoc: test this first
                if ( /^\Q$end\E$/ ) {
                    chomp $config{$name};           # remove trailing "\n"
                    $name = '';
                } else {
                    $config{$name} .= "$_\n";
                }
            } elsif ( /^\s*#/ or /^\s*$/ ) {
                next;                               # skip comments, blank lines
            } elsif ( /(\w+)\s*=\s*<<(\w+)/ ) {     # heredoc (before the regular pair,
                ( $name, $end ) = ( $1, $2 );       # which would also match "k = <<END")
                $config{$name} = '';
            } elsif ( /(\w+)\s*=\s*(.*?)\s*$/ ) {   # regular pair
                $config{$1} = $2;
            } else {
                warn "Ptooey: Could not parse config line: $_\n";
            }
        }
    It's a little grotty, in the sense that you have to put the "within a HERE doc" behavior first in the "if...elsif..." series, because who knows whether/when the contents of a here-doc might trigger a false-alarm match on one of the other conditions. Also, as written above, there's nothing to warn about a here-doc that is not terminated at the last line in $buffer -- but that should be easy to figure out.
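One way the missing warning might look, as a minimal sketch rather than a tested extension of the code above. The function name parse_buffer and the sample input are assumptions made up for this illustration. The idea is that if $name is still set once the loop ends, the terminator never appeared.

```perl
use strict;
use warnings;

# Hypothetical helper: parse key=value and heredoc entries from a string.
sub parse_buffer {
    my ($buffer) = @_;
    my %config;
    my ( $name, $end ) = ( '', '' );

    for my $line ( split /\n/, $buffer ) {
        if ( $name ne '' ) {                          # inside a heredoc
            if ( $line eq $end ) {
                chomp $config{$name};                 # drop the final "\n"
                $name = '';
            }
            else {
                $config{$name} .= "$line\n";
            }
        }
        elsif ( $line =~ /^\s*#/ or $line =~ /^\s*$/ ) {
            next;                                     # comment or blank line
        }
        elsif ( $line =~ /(\w+)\s*=\s*<<(\w+)/ ) {    # heredoc start
            ( $name, $end ) = ( $1, $2 );
            $config{$name} = '';
        }
        elsif ( $line =~ /(\w+)\s*=\s*(.*?)\s*$/ ) {  # regular pair
            $config{$1} = $2;
        }
        else {
            warn "Could not parse config line: $line\n";
        }
    }

    # Still inside a heredoc after the last line: the terminator never showed up.
    warn "Unterminated heredoc '$name' (expected '$end')\n" if $name ne '';

    return \%config;
}

# Sample input invented for this sketch.
my $cfg = parse_buffer("color = red\ntext = <<END\nline one\n# kept\nEND\nsize = 3\n");
print "$_ => $cfg->{$_}\n" for sort keys %$cfg;
```

Whether an unterminated heredoc should only warn, or should also discard the partial value, is a design choice this sketch leaves open.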
Re: addn'l help with parsing here doc
by etcshadow (Priest) on Oct 17, 2003 at 03:46 UTC
    You can outright replace any logic that goes like:
    while(<FILE>) { ...
    with this:
    for (split "(?<=\Q$/\E)", $contents_of_FILE) { ...
    and it does exactly the same thing.

    Which just means: split the contents of the file on a zero-width positive lookbehind assertion for $/ (the input record separator). This is subtly different from just splitting on $/ (or, more safely, on \Q$/\E): splitting on $/ itself removes $/ from the output of the split, whereas splitting on the zero-width positive lookbehind assertion leaves it in. (This is because what is being split on is the empty string following each occurrence of $/, so that empty string is the thing that gets removed.)
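The difference between the two splits can be seen in a small sketch. The variable names and sample string are assumptions made up for the illustration, and $/ is copied into $sep only to keep the regex delimiters readable.

```perl
use strict;
use warnings;

my $contents = "alpha\nbeta\ngamma\n";   # sample data for illustration
my $sep      = $/;                       # default input record separator, "\n"

# Splitting on the separator itself consumes it.
my @stripped = split /\Q$sep\E/, $contents;

# Splitting on the empty string that follows each separator keeps it,
# so each element looks like a record read with <FILE>.
my @kept = split /(?<=\Q$sep\E)/, $contents;

for my $record (@kept) {
    chomp( my $line = $record );         # chomp works as it would in a <FILE> loop
    print "record: [$line]\n";
}
```

Because nothing is removed by the lookbehind split, joining @kept back together gives the original string.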

    Anyway, this may not be the best way to deal with the situation of your problem, but it is the most general solution for dropping in a replacement of a <FILE> loop with some sort of loop over the contents of FILE.
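An alternative that nobody in this thread mentions, offered only as a sketch: Perl 5.8 and later can open a filehandle on a scalar reference, so a loop that changes $/ and reads from the handle inside its own body can stay as it is. The sample $buffer below is an assumption, and the loop is adapted from the question's code with a lexical handle in place of CONFIG.

```perl
use strict;
use warnings;

# Sample buffer invented for this sketch.
my $buffer = "color = red\ntext = <<END\nline one\nline two\nEND\nsize = 3\n";
my %config;

# Open a read-only, in-memory filehandle on the string (Perl 5.8 or later).
open my $fh, '<', \$buffer or die "Cannot open in-memory handle: $!";

while (<$fh>) {
    if ( /^\s*#/ or /^\s*$/ ) {
        next;                                   # comment or blank line
    }
    elsif ( /(\w+)\s*=\s*<<(\w+)/ ) {           # heredoc
        my $name = $1;
        local $/ = "\n$2";                      # read up to the terminator
        $config{$name} = <$fh>;
        chomp $config{$name};                   # strips the "\nEND" just read
    }
    elsif ( /(\w+)\s*=\s*(.*?)\s*$/ ) {         # regular pair
        $config{$1} = $2;
    }
    else {
        warn "Could not parse config line: $_";
    }
}
close $fh;

print "$_ => $config{$_}\n" for sort keys %config;
```

After the heredoc is read, the rest of the terminator line is a bare newline, which the blank-line test skips on the next pass.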

    Not an editor command: Wq
