PerlMonks  

How can I split a comma-delimited string when the fields can have commas in them?

( #5722=categorized question )
Contributed by dbetz on Mar 20, 2000 at 20:48 UTC

Description:

Normally I would use split on ',', but that will not work because of the comma embedded within certain fields. Is there a "perl" way to parse this into its proper fields?

"R", "2164", "27-2164", "270102", "Add Terminal Server to John, Jane, and George's PC's", "3/13/00", "3/27/00", "00/00/00", "02:00:00", "Jane Doe", " 3 - Released", "Jane Doe"

Edited by davido: Added code tags and more legible formatting.

Answer: How can I split a comma-delimited string when the fields can have commas in them?
contributed by chromatic

The best answer is to use Text::CSV from CPAN. Otherwise, you'll have to craft a regex which can handle obscure cases like commas between quotes, escaped quotes within quotes, and other funny stuff like that.

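One possibility, as a rough starting point (it grabs the text before the next comma and does not yet handle commas inside quotes), is:

$string =~ m!"*?([^,]+)"*?(?=,)!;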
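
For the Text::CSV route recommended above, a minimal sketch might look like the following. It assumes Text::CSV is installed from CPAN. The option choices (binary and allow_whitespace) and the variable names are illustrative assumptions, not part of the original answer, and the input is a shortened version of the question's sample line.

```perl
use strict;
use warnings;
use Text::CSV;    # CPAN module, not part of core Perl

# allow_whitespace is assumed here so the parser tolerates the space that
# follows each comma in the sample; binary => 1 is commonly set so that
# non-ASCII data is accepted.
my $csv = Text::CSV->new({ binary => 1, allow_whitespace => 1 })
    or die "Cannot use Text::CSV: " . Text::CSV->error_diag();

# Shortened version of the record from the question.
my $line = q{"R", "2164", "Add Terminal Server to John, Jane, and George's PC's", "3/13/00"};

my @fields;
if ($csv->parse($line)) {
    @fields = $csv->fields();    # one element per CSV field
} else {
    die "Parse failed: " . $csv->error_diag();
}

print "$_\n" for @fields;
```

The module keeps the quoting and escaping rules in one place, so the calling code only deals with the resulting list of fields.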

Answer: How can I split a comma-delimited string when the fields can have commas in them?
contributed by turnstep

Here's an answer from Mastering Regular Expressions:

sub parse_csv {
    my $text = shift;    ## record containing comma-separated values
    my @new  = ();
    push(@new, $+) while $text =~ m{
        ## the first part groups the phrase inside the quotes
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?
          | ([^,]+),?
          | ,
    }gx;
    push(@new, undef) if substr($text, -1, 1) eq ',';
    return @new;         ## list of values that were comma-separated
}

## Use like this:
@goodlist = parse_csv($csvlist);

Ugly, to be sure, but the complexity level really kicks up a notch when you add the delimiters into the fields themselves. Also, the above snippet allows quotes inside the fields, as long as they are backslashed.
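
To make the behaviour described above concrete, here is a small sketch that runs the same pattern over a made-up record containing an embedded comma and backslashed quotes. The routine is repeated only so the example runs on its own. The record and the variable names are hypothetical, and the record is written without spaces after the commas.

```perl
use strict;
use warnings;

# Same pattern as the parse_csv routine quoted above, repeated here
# only so that this sketch is self-contained.
sub parse_csv {
    my $text = shift;
    my @new;
    push(@new, $+) while $text =~ m{
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?   # quoted field, backslash escapes allowed
          | ([^,]+),?                    # unquoted field
          | ,                            # empty field
    }gx;
    push(@new, undef) if substr($text, -1, 1) eq ',';
    return @new;
}

# Hypothetical record: one field holds commas, another holds \" escapes.
my $record = q{"R","John, Jane, and George's PC","say \"hi\"",plain};

my @fields = parse_csv($record);

# The backslashes are left in the captured value; the pattern does not
# unescape them.
print "[$_]\n" for @fields;
```

Note that the quoted alternative only applies when the opening quote comes immediately after the preceding comma. With a space after each comma, as in the question's sample line, the field appears to fall through to the unquoted alternative and is split at its embedded commas, so such input would need the spaces removed first.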
