Parsing using m//g

by pbeckingham (Parson)
on Sep 25, 2006 at 15:48 UTC (#574763=perlquestion)
pbeckingham has asked for the wisdom of the Perl Monks concerning the following question:

Can someone help? I have given myself the challenge of doing some simple parsing, but in a complex way. Without focusing on why I choose to do this, can someone guide me towards a viable solution? Given the following input:


    name1=value1
    name2 = value2

This code parses it:

    while (<$input>) {
        chomp;
        next if /^\s*#/;    # skip comment lines
        next if /^\s*$/;    # skip blank lines

        if (/^ \s* ([^=\s]+) \s* = \s* (.+) $/x) {
            # name is in $1, value is in $2
        }
    }

That's not the question though. The question is, how would I parse the following:

    name1=value1
    name2 = value2
    name3 = value3
    but wait, there is more
    name4= value4

With Perl that has the form:

    my $contents = do {local $/; <$input>};
    while ($contents =~ / ANSWER_HERE /msg) {
        # name is in $1, value is in $2
    }

Specifically, I want to use the //g form, to iterate over the string, and not perform a line-by-line parse, as in the first example. My attempts have thus far failed. The closest I got (without success) was:

    my $contents = do {local $/; <$input>};
    my $name = qr/\s* [^=\s]+ \s*/x;
    while ($contents =~ /^ ($name) = \s* (.+) (?= ^ $name = | $ ) /msgx) {
        # name is in $1, value is in $2
    }



pbeckingham - typist, perishable vertebrate.

Re: Parsing using m//g
by ikegami (Pope) on Sep 25, 2006 at 16:00 UTC
        my $contents = do { local $/; <DATA> };

        while ($contents =~ /
              \s* ([^=\s]+) \s* = \s*
              (
                 (?:
                    (?! \s* (?: [^=\s]+ \s* = | $ ) )
                    .
                 )*
              )
           /xmsg
        ) {
           print("[$1 => $2]\n");
        }

        __DATA__
        name1=value1
        name2 = value2
        name3 = value3
        but wait, there is more
        name4= value4

    Outputs:

        [name1 => value1]
        [name2 => value2]
        [name3 => value3]
        [name4 => value4]

    Update: The above works by never allowing bad data in the value. The following is an alternate solution that works by starting with an empty value, and extending it as much as possible.

        my $contents = do { local $/; <DATA> };

        while ($contents =~ /
              \s* ([^=\s]+) \s* = \s*
              (.*?)  # Extend the value.
              (?= \s* (?: [^=\s]+ \s* = | $ ) )
           /xmsg
        ) {
           print("[$1 => $2]\n");
        }

      To be correct, your output would have to be:

          [name1 => value1]
          [name2 => value2]
          [name3 => value3 but wait, there is more]
          [name4 => value4]



      pbeckingham - typist, perishable vertebrate.

        Simply change /.../xmsg to /.../xsg.
        and/or
        Simply change $ to \z.

        Update: Added the second (and better) option.
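
For reference, here is a minimal sketch of the second solution (the lazy `(.*?)` version) with both suggested changes applied: `/m` dropped and `$` replaced by `\z`. The helper name `parse_pairs` and the variable `$sample` are hypothetical and only serve to make the example self-contained; the pattern itself is the one from the reply above. With `/m`, `$` can match at the end of every line, so the lookahead is satisfied right after `value3`; `\z` matches only at the very end of the string, so the value keeps growing until the next `name =` is seen.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns a list of
# [name, value] array refs found in the given string.
sub parse_pairs {
    my ($contents) = @_;
    my @pairs;
    while ($contents =~ /
            \s* ([^=\s]+) \s* = \s*              # name, then the equals sign
            (.*?)                                # value, grown one character at a time
            (?= \s* (?: [^=\s]+ \s* = | \z ) )   # stop before the next name or at end of input
        /xsg) {
        push @pairs, [$1, $2];
    }
    return @pairs;
}

# Sample input, assumed to be one pair per line with one continuation line.
my $sample = "name1=value1\n"
           . "name2 = value2\n"
           . "name3 = value3\n"
           . "but wait, there is more\n"
           . "name4= value4\n";

for my $pair (parse_pairs($sample)) {
    printf "[%s => %s]\n", @$pair;
}
```

Note that the value captured for `name3` still contains the embedded newline between `value3` and `but wait, there is more`; join or trim it afterwards if a single-line value is wanted.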
