PerlMonks  

How to slice complicated string into an array

by theNeuron (Novice)
on Feb 26, 2016 at 11:18 UTC (#1156201)

theNeuron has asked for the wisdom of the Perl Monks concerning the following question:

How to slice this string into an array?

"word1",word2,"word3,word4"

result should be [ 'word1', 'word2', 'word3,word4' ]

That is if the word is surrounded by apostroph don't include it into the result.

The regex seems to be simple: (?: "([^"]*)" | ([^,]*) )(?:,|$), but to my surprise both the matched and the unmatched parentheses contribute to the result, making it something like

[ 'word1', undef, undef, 'word2', 'word3,word4', undef ]

I have tried to make use of %+ which should use the leftmost defined result. That works ok, but the code seems to be too unreadable for what it does:

    while (
        m/
            (?:                  # non-capturing braces
                "(?<val>[^"]*)"  # Find either string delimited by '"'
                |                # or
                (?<val>[^,]*)    # string up to next ','
            )
            (?:,|$)              # After which there is either , or end of line
        /gx
    ) {
        push @input, $+{val};
    }

I have tried to use @input = map { %+{val} } m/.../gx, but that fills the whole array with just the last match from whole string.
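A minimal sketch of why the map attempt behaves this way, assuming the intent was $+{val} inside the block and using [^,]+ for unquoted fields so that no empty match occurs at the end of the string. In list context m//g finishes all of its matches before map runs the block even once, so %+ only ever reflects the final successful match:

```perl
use strict;
use warnings;

my $s = '"word1",word2,"word3,word4"';

# The match operator runs to completion first and returns one value per
# capture group per match (3 matches x 2 groups). Only afterwards does map
# evaluate its block, at which point %+ belongs to the last match.
my @input = map { $+{val} }
    $s =~ m/(?: "(?<val>[^"]*)" | (?<val>[^,]+) )(?:,|$)/gx;

print scalar(@input), " elements\n";
print "$_\n" for @input;
```

The while loop does not have this problem because its body runs between matches, while %+ still describes the match that was just made.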

Is there a way to avoid the while + push loop and store the results into the array directly, just for the sake of making the code simpler and nicer?

Many thanks

__

Vlad

Replies are listed 'Best First'.
Re: How to slice complicated string into an array
by Corion (Pope) on Feb 26, 2016 at 11:54 UTC

    Have you thought about using Text::CSV_XS for parsing your CSV data?

      No, I haven't. And as I just replied above, an external module is not very helpful in my environment. Thank you for the tip, though.
Re: How to slice complicated string into an array
by Anonymous Monk on Feb 26, 2016 at 11:53 UTC

    Is that CSV? Then use Text::CSV / Text::CSV_XS.

    use warnings;
    use strict;
    my $line = q{"word1",word2,"word3,word4"};
    use Text::CSV;
    my $csv = Text::CSV->new({binary=>1,auto_diag=>2});
    $csv->parse($line);
    use Data::Dumper;
    print Dumper( [ $csv->fields() ] );
    __END__
    $VAR1 = [
              'word1',
              'word2',
              'word3,word4'
            ];

      Yes, that is CSV and I should have mentioned it. That module is not part of the distribution (at least not of the Perl I have), which would limit the script's usability in our environment. But thank you for the pointer.

        As a core module Text::ParseWords is convenient for simple cases.

        # perl
        use strict;
        use Text::ParseWords;
        use Data::Dumper;
        my $str = '"word1",word2,"word3,word4"';
        my @words = quotewords(',', 0, $str);
        print Dumper \@words;

        poj
Re: How to slice complicated string into an array
by AnomalousMonk (Bishop) on Feb 26, 2016 at 14:31 UTC

    I think I like the Text::CSV or Text::ParseWords module-based parsing approaches a bit better, but if you have Perl version 5.10+, you can use a regex approach that's almost identical to your OPed regex if you take advantage of the  (?|...|...) "branch reset" construct (see Extended Patterns in perlre):

    c:\@Work\Perl\monks>perl -wMstrict -le
    "use 5.010;
     ;;
     use Data::Dump qw(dd);
     ;;
     my $s = '\"word1\",word2,\"word3,word4\"';
     ;;
     my @ra = $s =~ m{ (?| \"([^^\x22]*)\" | ([^,]+) ) (?:,|$) }xmsg;
     dd \@ra;
     "
    ["word1", "word2", "word3,word4"]

    (Note that what appears above as  \"([^^\x22]*)\" due to the peculiarities of Windose command line interpretation and the limitations of my personal REPL should really be  "([^\x22]*)" or better yet  "([^"]*)" )
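
For anyone running this from a script file rather than the Windows command line, here is a minimal sketch of the same branch-reset idea without the shell escaping. The variable name @fields is only illustrative, and the pattern is the cleaned-up form mentioned in the note above:

```perl
use strict;
use warnings;
use 5.010;    # the (?|...) branch reset needs Perl 5.10 or later

my $s = '"word1",word2,"word3,word4"';

# Inside (?|...) every alternative numbers its capture group as $1,
# so each match contributes exactly one element in list context.
my @fields = $s =~ m{
    (?| "([^"]*)"     # quoted field: capture the text between the quotes
      | ([^,]+)       # unquoted field: capture up to the next comma
    )
    (?:,|$)           # followed by a comma or the end of the string
}xg;

say join '|', @fields;
```

Because the unquoted branch uses +, an empty unquoted field (two commas in a row) is skipped rather than returned as an empty string. Whether that matters depends on the data.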


    Give a man a fish:  <%-{-{-{-<

Re: How to slice complicated string into an array
by ww (Archbishop) on Feb 26, 2016 at 12:29 UTC

    FWIW, and not a part of your problem, but it appears that what you're calling an "apostroph" (sic) is a double quote, or quotation mark (at least in all the flavors of English I know), and not an apostrophe.

    Imprecise or incorrect language is confusing; sometimes even to the person using it!

Re: How to slice complicated string into an array
by Anonymous Monk on Feb 26, 2016 at 11:40 UTC
    If you use the match operator in list context, then also prefix it with grep defined.
      That didn't come to my mind, thank you! grep { defined } it is.
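
A minimal sketch of that suggestion, assuming the two-group pattern from the original question with [^,]+ for the unquoted branch. The list-context match returns undef for whichever group did not take part in a match, and grep drops those entries:

```perl
use strict;
use warnings;

my $s = '"word1",word2,"word3,word4"';

# Each match yields two values, one per capture group; the group that did
# not participate is undef, so keep only the defined ones.
my @fields = grep { defined $_ }
    $s =~ m/(?: "([^"]*)" | ([^,]+) )(?:,|$)/gx;

print join('|', @fields), "\n";
```

An empty quoted field ("") is kept, because an empty string is still defined.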
Re: How to slice complicated string into an array
by Anonymous Monk on Feb 26, 2016 at 13:39 UTC
    Hi, there is a nice old core module that could help.
    use v5.14;
    use Text::ParseWords;
    my $line = q{"word1",word2,"word3,word4"};
    my @words = quotewords( ",", 0, $line );
    foreach( @words ){
        say $_;
    }

      Sorry for that post, I haven't seen poj's answer before.
Re: How to slice complicated string into an array
by Anonymous Monk on Feb 29, 2016 at 08:40 UTC
    Buy the Perl Cookbook which has a neat solution:
    my @a;
    my $s = '"word1",word2,"word3,word4"';
    push(@a, $+) while $s =~ m {"([^\"\\]*(?:\\.[^\"\\]*)*)",? | ([^,]+),? | ,}gx;
    print "|$_|\n" for (@a);
