PerlMonks  

How to slice complicated string into an array

by theNeuron (Novice)
on Feb 26, 2016 at 11:18 UTC

theNeuron has asked for the wisdom of the Perl Monks concerning the following question:

How to slice this string into an array?

"word1",word2,"word3,word4"

result should be [ 'word1', 'word2', 'word3,word4' ]

That is if the word is surrounded by apostroph don't include it into the result.

The regex seems to be simple: (?: "([^"]*)" | ([^,]+) )(?:,|$), but to my surprise both the matched and unmatched parentheses contribute to the result, making it something like

[ 'word1', undef, undef, 'word2', 'word3,word4', undef ]

I have tried to make use of %+ which should use the leftmost defined result. That works ok, but the code seems to be too unreadable for what it does:

while (
    m/
        (?:                   # non-capturing braces
            "(?<val>[^"]*)"   # Find either string delimited by '"'
          |                   # or
            (?<val>[^,]*)     # string up to next ','
        )
        (?:,|$)               # After which there is either , or end of line
    /gx
) {
    push @input, $+{val};
}

I have tried to use @input = map { $+{val} } m/.../gx, but that fills the whole array with just the last match from the whole string.
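A minimal sketch of why the map attempt behaves this way, assuming `[^,]+` for the unquoted alternative (the variable name `@stale` is made up for illustration). In list context the match with /g runs to completion before map calls its block even once, so `%+` only describes the final successful match by the time it is read:

```perl
use strict;
use warnings;
use 5.010;    # named captures and %+ need Perl 5.10 or later

my $line = q{"word1",word2,"word3,word4"};

# The match list is built first: two capture groups per match, three matches.
# Only then does map run, and by then %+ refers to the last match only.
my @stale = map { $+{val} }
    $line =~ m/ (?: "(?<val>[^"]*)" | (?<val>[^,]+) ) (?:,|$) /gx;

print scalar(@stale), " elements: ", join('|', @stale), "\n";
```

Every element is the same string because map never looks at the values the match returned in `$_`; it reads `%+` each time instead.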

Is there a way to avoid the while + push loop and store the results into an array directly, just for the sake of making the code simpler and nicer?

Many thanks

__

Vlad

Replies are listed 'Best First'.
Re: How to slice complicated string into an array
by Corion (Pope) on Feb 26, 2016 at 11:54 UTC

    Have you thought about using Text::CSV_XS for parsing your CSV data?

      No, I haven't. And as I just replied above, an external module is not very helpful in my environment. Thank you for the tip though.
Re: How to slice complicated string into an array
by Anonymous Monk on Feb 26, 2016 at 11:53 UTC

    Is that CSV? Then use Text::CSV / Text::CSV_XS.

    use warnings;
    use strict;
    my $line = q{"word1",word2,"word3,word4"};
    use Text::CSV;
    my $csv = Text::CSV->new({binary=>1,auto_diag=>2});
    $csv->parse($line);
    use Data::Dumper;
    print Dumper( [ $csv->fields() ] );
    __END__
    $VAR1 = [
              'word1',
              'word2',
              'word3,word4'
            ];

      Yes, that is CSV and I should have mentioned it. That module is not part of the distribution (at least for the Perl I have), which would limit the script's usability in our environment. But thank you for the pointer.

        As a core module Text::ParseWords is convenient for simple cases.

        #!perl
        use strict;
        use Text::ParseWords;
        use Data::Dumper;
        my $str = '"word1",word2,"word3,word4"';
        my @words = quotewords(',', 0, $str);
        print Dumper \@words;
        poj
Re: How to slice complicated string into an array
by AnomalousMonk (Bishop) on Feb 26, 2016 at 14:31 UTC

    I think I like the Text::CSV or Text::ParseWords module-based parsing approaches a bit better, but if you have Perl version 5.10+, you can use a regex approach that's almost identical to your OPed regex if you take advantage of the  (?|...|...) "branch reset" construct (see Extended Patterns in perlre):

    c:\@Work\Perl\monks>perl -wMstrict -le
    "use 5.010;
     ;;
     use Data::Dump qw(dd);
     ;;
     my $s = '\"word1\",word2,\"word3,word4\"';
     ;;
     my @ra = $s =~ m{ (?| \"([^^\x22]*)\" | ([^,]+) ) (?:,|$) }xmsg;
     dd \@ra;
    "
    ["word1", "word2", "word3,word4"]
    (Note that what appears above as  \"([^^\x22]*)\" due to the peculiarities of Windose command line interpretation and the limitations of my personal REPL should really be  "([^\x22]*)" or better yet  "([^"]*)" )
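For readers not working from a Windows command line, the same branch-reset idea might look like this as a plain script. This is only a sketch: the helper name `branch_reset_fields` is made up for illustration, and the pattern is the simple one from the thread, so it does not handle escaped quotes or empty fields.

```perl
use strict;
use warnings;
use 5.010;    # (?|...) needs Perl 5.10 or later

# Hypothetical helper: return the fields of one simple CSV-like line.
sub branch_reset_fields {
    my ($line) = @_;
    # Inside (?| ... ) each alternative numbers its groups from $1 again,
    # so every match contributes exactly one capture to the returned list.
    my @fields = $line =~ m{ (?| "([^"]*)" | ([^,]+) ) (?:,|$) }xg;
    return @fields;
}

say join '|', branch_reset_fields(q{"word1",word2,"word3,word4"});
```

Because both alternatives write to `$1`, no undef placeholders appear in the result and no filtering step is needed.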


    Give a man a fish:  <%-{-{-{-<

Re: How to slice complicated string into an array
by ww (Archbishop) on Feb 26, 2016 at 12:29 UTC

    FWIW and not a part of your problem, but it appears that what you're calling an "apostroph(sic)" is a double quote (at least in all the flavors of English I know) and not an apostrophe.

    Imprecise or incorrect language is confusing; sometimes even to the person using it!

Re: How to slice complicated string into an array
by Anonymous Monk on Feb 26, 2016 at 11:40 UTC
    If you use the match operator in list context, prefix it with grep defined.
      That didn't come to my mind, thank you! grep { defined } it is.
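A minimal sketch of that suggestion, assuming `[^,]+` for unquoted fields (the helper name `split_fields` is hypothetical). The list-context match returns one slot per capture group for every match, and grep drops the slots belonging to the alternative that did not take part:

```perl
use strict;
use warnings;

# Hypothetical helper: split one simple CSV-like line into fields.
sub split_fields {
    my ($line) = @_;
    # m//g in list context yields ($1, $2) for each match; whichever group
    # did not participate is undef, so filter those out.
    return grep { defined }
        $line =~ m/ (?: "([^"]*)" | ([^,]+) ) (?:,|$) /gx;
}

my @fields = split_fields(q{"word1",word2,"word3,word4"});
print join('|', @fields), "\n";
```

Note that this simple pattern skips empty unquoted fields (as in `a,,b`), which a dedicated CSV parser would keep.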
Re: How to slice complicated string into an array
by Anonymous Monk on Feb 26, 2016 at 13:39 UTC
    Hi, there is a nice old core module that could help.
    use v5.14;
    use Text::ParseWords;
    my $line = q{"word1",word2,"word3,word4"};
    my @words = quotewords( ",", 0, $line );
    foreach( @words ){ say $_; }
      Sorry for that post, I haven't seen poj's answer before.
Re: How to slice complicated string into an array
by Anonymous Monk on Feb 29, 2016 at 08:40 UTC
    Buy the Perl Cookbook which has a neat solution:
    my @a;
    my $s = '"word1",word2,"word3,word4"';
    push(@a, $+) while $s =~ m{
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?
      | ([^,]+),?
      | ,
    }gx;
    print "|$_|\n" for (@a);
