PerlMonks  

string parsing with split

by MeatLips (Novice)
on Jan 24, 2011 at 20:35 UTC (#883994)
MeatLips has asked for the wisdom of the Perl Monks concerning the following question:

Here's a question for the wise monks... I have this single line string:

key1=val1 key2=val2 key3=val3 key4="val4a val4b" key5="val5key=(0 1 2 3)" key6=(val6a val6b)

I want to parse that into an array so I can have something simple like this:

foreach my $x (@array) { print "$x\n"; }

print the following:

key1=val1
key2=val2
key3=val3
key4="val4a val4b"
key5="val5key=(0 1 2 3)"
key6=(val6a val6b)

I've been tearing my hair out, poring over various regex texts, trying various ways to use 'split', anything to figure out how to split this pig up the way I want it. What would the wise monks here suggest?

Re: string parsing with split
by BrowserUk (Pope) on Jan 24, 2011 at 20:42 UTC

    Use a lookahead:

    $s = q[key1=val1 key2=val2 key3=val3 key4="val4a val4b" key5="val5key=(0 1 2 3)" key6=(val6a val6b)];;
    @a = split " (?=key)", $s;;
    print for @a;;
    key1=val1
    key2=val2
    key3=val3
    key4="val4a val4b"
    key5="val5key=(0 1 2 3)"
    key6=(val6a val6b)

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      This would work if I literally had the word "key" in the string, but I was only using that as an example. The keys in the actual string are all different words.

      Here's something that looks a bit closer to the actual string:

      platform=linux hpfFamily=hwseries npsFamily=SWseries rackcnt=1 SPAs=1 SPUpSPA=6 CPUs=6 shrParts="/part1 /home/part2" maintIface="eth2" DEncpSPA=4 nDEncs=4 npsMName="SystemName" portnums="ports=(0 1 2 3)" ints=(eth0 eth1)
        Here's something that looks a bit closer to the actual string:

        Then here's something that may be a bit closer to a working solution for you?

        $s = q[platform=linux hpfFamily=hwseries npsFamily=SWseries rackcnt=1 SPAs=1 SPUpSPA=6 CPUs=6 shrParts="/part1 /home/part2" maintIface="eth2" DEncpSPA=4 nDEncs=4 npsMName="SystemName" portnums="ports=(0 1 2 3)" ints=(eth0 eth1)];;
        print for split ' (?=\w+=)', $s;;
        platform=linux
        hpfFamily=hwseries
        npsFamily=SWseries
        rackcnt=1
        SPAs=1
        SPUpSPA=6
        CPUs=6
        shrParts="/part1 /home/part2"
        maintIface="eth2"
        DEncpSPA=4
        nDEncs=4
        npsMName="SystemName"
        portnums="ports=(0 1 2 3)"
        ints=(eth0 eth1)
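For anyone wanting to try the lookahead split as a standalone script, here is a minimal sketch. The shortened sample string and the variable names `@fields` and `@broken` are made up for illustration. It also shows one caveat that appears to follow from the pattern: a quoted value that itself contains a space followed by `word=` would be cut in two.

```perl
use strict;
use warnings;

# Shortened version of the sample string from the thread (illustrative only).
my $s = 'platform=linux rackcnt=1 shrParts="/part1 /home/part2" '
      . 'portnums="ports=(0 1 2 3)" ints=(eth0 eth1)';

# Split on a single space only when it is followed by "word=".
# The lookahead (?=...) checks for the next key without consuming it,
# so the key stays attached to the field that follows the split point.
my @fields = split / (?=\w+=)/, $s;

print "$_\n" for @fields;

# Caveat: a value containing " word=" is split as well.
my @broken = split / (?=\w+=)/, 'a="x y=1" b=2';
print scalar(@broken), " fields\n";
```

With the sample above this should print five fields, while the caveat string comes back as three pieces rather than two. If values like that can occur in the real data, one of the matching approaches elsewhere in the thread may be a safer fit.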

Re: string parsing with split
by jethro (Monsignor) on Jan 24, 2011 at 20:57 UTC
    I think Text::CSV is a well-known solution to such problems.
      It's not up to the task. It can be configured to allow quoting to start in the middle of a field (loose quotes), but it doesn't support multiple quoting characters or balanced quotes (start quote character (e.g. "(") different than end quote character (")")).
Re: string parsing with split
by ikegami (Pope) on Jan 24, 2011 at 20:58 UTC
    Another way,
    my @pairs = / \G \s* ( [^=]+ = (?: " [^"]* " | \( [^)]* \) | \S* ) ) /xg;
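The one-liner above matches against `$_`. As a hedged sketch of how it might be used on a named variable, the same pattern is bound to `$s` below and spread out with comments. The shortened sample string is illustrative, and the pattern itself is unchanged from the post.

```perl
use strict;
use warnings;

my $s = 'key1=val1 key4="val4a val4b" key5="val5key=(0 1 2 3)" key6=(val6a val6b)';

my @pairs = $s =~ /
    \G \s*                 # continue where the last match ended, skip blanks
    (
        [^=]+ =            # key: everything up to the first equals sign
        (?:
            " [^"]* "      # a double-quoted value, or
          | \( [^)]* \)    # a parenthesised value, or
          | \S*            # a bare value with no whitespace
        )
    )
/xg;

print "$_\n" for @pairs;
```

Because of the `\G` anchor, matching stops at the first position where no key=value pair can be found, so malformed input gives a shorter list instead of skipping ahead.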
Re: string parsing with split
by AnomalousMonk (Abbot) on Jan 24, 2011 at 22:27 UTC

    Here's my take on the problem. It has the advantage, IMHO, of being more easily adaptable to changing requirements because it is more modular.

    Notes:

    • The regex uses  \x22 in place of  " (double-quote) to avoid Windoze command-line escape-ology.
    • A quoted string cannot contain any sort of double-quote, escaped or otherwise.
    • A parenthetic group cannot contain a  ')' (right-paren).
    (Sorry for the line-wrap.)

    >perl -wMstrict -le
    "my $s = 'key1=val1 key2=val2 key3=val3 key4=\"val4a val4b\" '
           . 'key5=\"val5key=(0 1 2 3)\" key6=(val6a val6b)'
           ;
    ;;
    my $key   = qr{ [[:alpha:]] [[:alnum:]]+ }xms;
    my $val   = qr{ [[:alpha:]] [[:alnum:]]+ }xms;
    my $d_quo = qr{ \x22 [^\x22]* \x22 }xms;
    my $paren = qr{ [(] [^)]* [)] }xms;
    ;;
    my $vals = qr{ $val | $d_quo | $paren }xms;
    ;;
    my @opts = $s =~ m{ $key \s* = \s* $vals }xmsg;
    ;;
    print qq{'$s'};
    print qq{'$_'} for @opts;
    "
    'key1=val1 key2=val2 key3=val3 key4="val4a val4b" key5="val5key=(0 1 2 3)" key6=(val6a val6b)'
    'key1=val1'
    'key2=val2'
    'key3=val3'
    'key4="val4a val4b"'
    'key5="val5key=(0 1 2 3)"'
    'key6=(val6a val6b)'
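To illustrate the adaptability point, here is a rough sketch that runs the same building blocks against a shortened version of the more realistic string from earlier in the thread. The `extract` helper and the looser value pattern are assumptions added for illustration, not part of the original post. As posted, `$val` needs a letter followed by at least one alphanumeric, so a pair such as `rackcnt=1` looks like it would be skipped. Swapping in a looser `$val` is a one-line change.

```perl
use strict;
use warnings;

# Hypothetical helper: same building blocks as above, with the bare-value
# pattern passed in so that it can be swapped without touching the rest.
sub extract {
    my ($s, $val) = @_;

    my $key   = qr{ [[:alpha:]] [[:alnum:]]+ }xms;
    my $d_quo = qr{ " [^"]* " }xms;          # written with \x22 in the post
    my $paren = qr{ [(] [^)]* [)] }xms;
    my $vals  = qr{ $val | $d_quo | $paren }xms;

    return $s =~ m{ $key \s* = \s* $vals }xmsg;
}

my $s = 'platform=linux rackcnt=1 maintIface="eth2" '
      . 'portnums="ports=(0 1 2 3)" ints=(eth0 eth1)';

# Bare-value pattern as posted: a letter, then one or more alphanumerics.
my @strict = extract($s, qr{ [[:alpha:]] [[:alnum:]]+ }xms);

# Looser assumption: any run of alphanumerics, so "1" is accepted too.
my @loose = extract($s, qr{ [[:alnum:]]+ }xms);

print "strict: '$_'\n" for @strict;
print "loose:  '$_'\n" for @loose;
```

Running it from a script file rather than the command line also avoids the shell quoting issue that the `\x22` escape works around.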
