
Parsing using m//g

by pbeckingham (Parson)
on Sep 25, 2006 at 15:48 UTC (#574763=perlquestion)
pbeckingham has asked for the wisdom of the Perl Monks concerning the following question:

Can someone help? I have given myself the challenge of doing some simple parsing, but in a complex way. Without focusing on why I choose to do this, can someone guide me towards a viable solution? Given the following input:

    name1=value1
    name2 = value2
This code parses it:
    while (<$input>) {
        chomp;
        next if /^\s*#/;    # skip comment lines
        next if /^\s*$/;    # skip blank lines
        if (/^ \s* ([^=\s]+) \s* = \s* (.+) $/x) {
            # name is in $1, value is in $2
        }
    }
That's not the question though. The question is, how would I parse the following:
    name1=value1
    name2 = value2
    name3 = value3
    but wait, there is more
    name4= value4
With Perl that has the form:
    my $contents = do { local $/; <$input> };
    while ($contents =~ / ANSWER_HERE /msg) {
        # name is in $1, value is in $2
    }
Specifically, I want to use the //g form, to iterate over the string, and not perform a line-by-line parse, as in the first example. My attempts have thus far failed. The closest I got (without success) was:
    my $contents = do { local $/; <$input> };
    my $name = qr/\s* [^=\s]+ \s*/x;
    while ($contents =~ /^ ($name) = \s* (.+) (?= ^ $name = | $ ) /msgx) {
        # name is in $1, value is in $2
    }

pbeckingham - typist, perishable vertebrate.

Replies are listed 'Best First'.
Re: Parsing using m//g
by ikegami (Pope) on Sep 25, 2006 at 16:00 UTC
        my $contents = do { local $/; <DATA> };
        while ($contents =~ /
            \s* ([^=\s]+) \s* = \s*
            ( (?: (?! \s* (?: [^=\s]+ \s* = | $ ) ) . )* )
        /xmsg) {
            print("[$1 => $2]\n");
        }
        __DATA__
        name1=value1
        name2 = value2
        name3 = value3
        but wait, there is more
        name4= value4


        [name1 => value1]
        [name2 => value2]
        [name3 => value3]
        [name4 => value4]

    Update: The above works by never allowing bad data in the value. The following is an alternate solution that works by starting with an empty value, and extending it as much as possible.

        my $contents = do { local $/; <DATA> };
        while ($contents =~ /
            \s* ([^=\s]+) \s* = \s*
            (.*?)   # Extend the value.
            (?= \s* (?: [^=\s]+ \s* = | $ ) )
        /xmsg) {
            print("[$1 => $2]\n");
        }

      To be correct, your output would have to be:

          [name1 => value1]
          [name2 => value2]
          [name3 => value3
          but wait, there is more]
          [name4 => value4]

      pbeckingham - typist, perishable vertebrate.

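        Simply change /.../xmsg to /.../xsg, so that $ no longer matches at the end of every line.
        Or, better, simply change $ to \z, which matches only at the very end of the string.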

        Update: Added the second (and better) option.
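As a minimal sketch of the second option, the lazy-value pattern from the reply above can be wrapped in a small helper with $ replaced by \z. The helper name parse_pairs and the inline sample data are assumptions made for illustration and do not come from the thread. The exact line layout of the continuation text is also assumed.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns a list of
# [name, value] array references found in the given string.
sub parse_pairs {
    my ($contents) = @_;
    my @pairs;
    while ($contents =~ /
        \s* ([^=\s]+) \s* = \s*                 # a name, then the equals sign
        (.*?)                                   # the value, grown one character at a time
        (?= \s* (?: [^=\s]+ \s* = | \z ) )      # stop before the next name or at end of string
    /xsg) {
        push @pairs, [$1, $2];
    }
    return @pairs;
}

# Assumed sample input, with the continuation text on its own line.
my $contents = <<'END';
name1=value1
name2 = value2
name3 = value3
but wait, there is more
name4= value4
END

printf "[%s => %s]\n", @$_ for parse_pairs($contents);
```

Because no ^ or $ remains in the pattern, the /m flag is no longer needed, and the value of name3 should now keep its continuation line. One caveat of the pattern as written: a value that itself contains text shaped like name= would be split at that point.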

Node Type: perlquestion [id://574763]
Approved by herveus