
Re: Text::CSV - parsing

by Eily (Monsignor)
on Mar 14, 2017 at 18:06 UTC


in reply to Text::CSV - parsing

Actually Text::CSV handles quoted fields pretty well, but the default quote character is ", not '. See the quote_char section of the Text::CSV documentation. Note that sep_char is a single character, so spaces after it are not removed. That makes 101, '1997' invalid CSV, because the quote_char has to be the first thing after the sep_char. To allow spaces, set allow_whitespace to 1:

use strict;
use Text::CSV;
use Data::Dumper;

my @lines = <DATA>;
my $csv = Text::CSV->new({ sep_char => ',', quote_char => "'", allow_whitespace => 1 });
my @AoA;
for (@lines) {
    if (/^\((.+?)\).?/) {
        my $con = $1;
        if ($csv->parse($con)) {
            my @fields = $csv->fields();
            push @AoA, [@fields];
        }
        else {
            warn "Line could not be parsed: $_\n";
        }
    }
}
print Dumper \@AoA;
__DATA__
(101, '1997-02-25', 'S1', 31.00, NULL, 0.00, 'this becomes two fields, so no go', 5.11),
(102, '1998-03-26', 'S1', 31.00, NULL, 0.00, 'this will remain one field', 6.11),

Output:

$VAR1 = [
          [
            '101',
            '1997-02-25',
            'S1',
            '31.00',
            'NULL',
            '0.00',
            'this becomes two fields, so no go',
            '5.11'
          ],
          [
            '102',
            '1998-03-26',
            'S1',
            '31.00',
            'NULL',
            '0.00',
            'this will remain one field',
            '6.11'
          ]
        ];
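
To see the effect of allow_whitespace in isolation, here is a minimal sketch that parses the same short line with and without the option. The variable names ($strict, $loose and so on) are made up for illustration, and the expectation that the first parse fails is based on the reasoning above rather than on anything specific to your data:

```perl
use strict;
use warnings;
use Text::CSV;

my $line = "101, '1997'";    # note the space after the comma

# Same options as above, but without allow_whitespace.
my $strict    = Text::CSV->new({ sep_char => ',', quote_char => "'" });
my $strict_ok = $strict->parse($line);
print "without allow_whitespace: ",
    ( $strict_ok ? "parsed" : "failed: " . $strict->error_diag ), "\n";

# With allow_whitespace, spaces around the separator are skipped.
my $loose = Text::CSV->new({
    sep_char         => ',',
    quote_char       => "'",
    allow_whitespace => 1,
});
my $loose_ok = $loose->parse($line);
my @fields   = $loose_ok ? $loose->fields : ();
print "with allow_whitespace: ",
    ( $loose_ok ? join( '|', @fields ) : "failed: " . $loose->error_diag ), "\n";
```

Calling error_diag in string context, as done here, should give a readable message for the failing case.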

Re^2: Text::CSV - parsing
by haukex (Archbishop) on Mar 14, 2017 at 18:16 UTC

    Good post, just a note: this will not work right if the quoted strings happen to be broken onto two lines. In that case, Text::CSV's getline is better (see the section "Embedded newlines" in the doc):

    use warnings;
    use strict;
    use Text::CSV;
    use Data::Dump;

    my $csv = Text::CSV->new ({ binary=>1, auto_diag=>2, quote_char=>"'", allow_whitespace=>1 });
    while ( my $row = $csv->getline( \*DATA ) ) {
        dd $row;
    }
    $csv->eof;
    __DATA__
    (101, '1997-02-25', 'S1', 31.00, NULL, 0.00, 'this becomes two fields,
    so no go', 5.11),
    (102, '1998-03-26', 'S1', 31.00, NULL, 0.00, 'this
    will remain one field', 6.11),

    Output:

    [
      "(101",
      "1997-02-25",
      "S1",
      "31.00",
      "NULL",
      "0.00",
      "this becomes two fields,\nso no go",
      "5.11)",
      "",
    ]
    [
      "(102",
      "1998-03-26",
      "S1",
      "31.00",
      "NULL",
      "0.00",
      "this\nwill remain one field",
      "6.11)",
      "",
    ]

      Well, this doesn't deal with the parentheses properly, so that makes your ++solution the better one. With it you don't have to worry about embedded newlines, enclosing parentheses, the non-default quote character or extra whitespace. And parsing DB data as SQL makes sense :)
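
If someone still wants to stay with getline, the leftover parentheses and the empty trailing field can be cleaned up after parsing. This is only a rough sketch, assuming every record has the "(...)," shape shown in this thread; the in-memory filehandle and the @rows array are there for illustration and are not part of either post:

```perl
use strict;
use warnings;
use Text::CSV;

# Sample data in the thread's format, with newlines inside the quoted strings.
my $data = <<'END';
(101, '1997-02-25', 'S1', 31.00, NULL, 0.00, 'this becomes two fields,
so no go', 5.11),
(102, '1998-03-26', 'S1', 31.00, NULL, 0.00, 'this
will remain one field', 6.11),
END

open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

my $csv = Text::CSV->new({
    binary           => 1,      # needed for embedded newlines
    auto_diag        => 2,
    quote_char       => "'",
    allow_whitespace => 1,
});

my @rows;
while ( my $row = $csv->getline($fh) ) {
    # The trailing comma on each record yields an empty last field; drop it.
    pop @$row if @$row && $row->[-1] eq '';
    next unless @$row;

    $row->[0]  =~ s/^\(//;      # strip the opening parenthesis
    $row->[-1] =~ s/\)$//;      # strip the closing parenthesis
    push @rows, $row;
}
close $fh;

printf "%d rows, first id %s, last field %s\n",
    scalar @rows, $rows[0][0], $rows[0][-1];
```

Note that this would also trim a genuine "(" at the start of the first field or a ")" at the end of the last one, so it only fits data where those columns are plain numbers, as they are here.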
