http://www.perlmonks.org?node_id=269127


in reply to Re: Re: Writing a CSV Parser/Printer
in thread Writing a CSV Parser/Printer

You have to provide us with line numbers yourself; PerlMonks can't add them for you. You can get them with:

perl -pe '$_="$.: $_"' your_input > your_output

I'm not sure what your desired output should look like, but maybe this will help. It uses a RegEx:

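use strict;
use warnings;

while (<DATA>) {
    chomp;                          # otherwise the last field keeps its newline
    my (@fields) = split /, /;      # fields are separated by comma + space
    foreach (@fields) {
        # strip the surrounding quotes; succeeds only for a well-formed field
        if (s/^"((?:[^"\\]|\\.)*)"$/$1/) {
            tr/\\//d;               # No more \
            print "$_\n";
        }
    }
}
__END__
"Perlmonks", "http://www.perlmonks.org", "excellent ;)"
"csv", "csv\"xxx", "trall\ala"
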
Short explanation for the RegEx:

s/^"((?:[^"\\]|\\.)*)"$/$1/

^"
matches your field's quotechar at the start of the field
(...)
captures ("remembers") what was matched inside the quotes, so that it is available as $1 in the replacement
(?:...)*
groups whatever stands in place of the ... without capturing it. The trailing * lets that group repeat as often as possible, even zero times
[^"\\]
matches any single character except " and \
|
is an alternative. Either the left or the right part has to match
\\.
matches a backslash followed by any single character, i.e. any "escaped" character
"$
again your quotechar, but now at the end of the field
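
To see the pieces working together, here is a minimal sketch that applies the same substitution to single fields. The helper name unquote_field is made up for this illustration; it is not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): strip the
# surrounding quotes from one field and drop the escaping backslashes.
# Returns undef if the field is not a well-formed quoted string.
sub unquote_field {
    my ($field) = @_;
    return unless $field =~ s/^"((?:[^"\\]|\\.)*)"$/$1/;
    $field =~ tr/\\//d;    # same clean-up as above: delete every backslash
    return $field;
}

# Single-quoted strings keep the backslashes literally.
for my $raw ('"csv"', '"csv\"xxx"', '"trall\ala"', 'unquoted') {
    my $clean = unquote_field($raw);
    print defined $clean ? "$raw => $clean\n" : "$raw => (no match)\n";
}
```

Note that tr/\\//d removes every backslash, so an escaped backslash inside a field would disappear as well; whether that is wanted depends on your data.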

Re: Re: Re: Re: Writing a CSV Parser/Printer
by Anonymous Monk on Jun 26, 2003 at 07:42 UTC

    Thanks for the excellent explanation :)

    With regards to the error line numbers: there aren't any errors anymore - it just doesn't produce the desired results. I have a feeling it's quite a ways away as well - your approach is far clearer.

    One question about the split if it's fed data like:

    "csv", "csv\"x, xx", "trall\ala"

    It will choke on the second entry. How would I go about avoiding this? I could use something like split/","/; which would make problems far less likely, but is there a better way? Some sort of notation for when it's inside the field?

      Okay. That one isn't easy ;-)

      I have to admit that I have no good solution at hand.

      Maybe a search here will help.
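
One direction such a search might turn up, given here only as a rough sketch and not as a tested solution: instead of splitting on the separator, match the quoted fields themselves, reusing the RegEx from above. A ", " inside the quotes is then just field content. The helper name quoted_fields is invented for this example, and the sketch assumes every field is quoted.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): pull the quoted
# fields out of one line by matching the fields rather than splitting
# on the separator.
sub quoted_fields {
    my ($line) = @_;
    # In list context, /g returns the captured group of every match,
    # i.e. the content of each quoted field in order.
    my @fields = $line =~ /"((?:[^"\\]|\\.)*)"/g;
    tr/\\//d for @fields;    # drop the escaping backslashes
    return @fields;
}

my $line = '"csv", "csv\"x, xx", "trall\ala"';
print "$_\n" for quoted_fields($line);
```

This does not check what sits between the fields and silently skips anything that is not quoted, so for real data a CSV module is probably the safer choice.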

      BTW: Why don't you register here? It's far easier to recognize you if you're no longer Anonymous Monk. It just costs you some time...

        No problem - I got Text::CSV to work with BrowserUk's help, and since writing a parser is rather tedious, error-prone work, I'll probably leave that for another day :)

        Thanks for the help :)