PerlMonks  

Re^7: Perl custom sort for Portuguese Lanaguage

by haukex (Bishop)
on Jul 08, 2020 at 21:06 UTC


in reply to Re^6: Perl custom sort for Portuguese Lanaguage
in thread Perl custom sort for Portuguese Lanaguage

This works for me: csv (in => 'quux.csv', filter => {1 => sub { !/^#/ }});

Unfortunately that also filters lines whose first field is "#foo" (with the quotes). I remember Tux recently saying filtering before parsing wasn't supported, though I'm having trouble finding the reference at the moment (it could have been in the chatterbox too*). It may be a bit tricky because this is valid CSV too:

abc,"d
#e
f",ghi

(That's one row, ["abc", "d\n#e\nf", "ghi"].)
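To illustrate why filtering before parsing is tricky, here is a minimal sketch using only core Perl. The helper name strip_comment_lines is hypothetical and only for illustration; it is not part of Text::CSV or Text::CSV_XS. It drops every physical line that starts with "#", which silently changes the quoted middle field of the row above:

```perl
use strict;
use warnings;

# One CSV record whose quoted middle field spans three physical lines.
my $data = qq{abc,"d\n#e\nf",ghi\n};

# Naive pre-filter (hypothetical helper, for illustration only):
# drop every physical line that starts with "#", before any CSV parsing.
sub strip_comment_lines {
    my ($text) = @_;
    # split /^/m keeps the trailing newline on each line
    return join '', grep { !/^#/ } split /^/m, $text;
}

my $filtered = strip_comment_lines($data);
print $filtered;
```

After the filter, the middle field would parse as "d\nf" rather than "d\n#e\nf", because the "#e" line was inside a quoted field and not a comment at all.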

* Update: I looked again and I think it must have been in the chatterbox; I do distinctly remember someone having a similar question recently...

Replies are listed 'Best First'.
Re^8: Perl custom sort for Portuguese Lanaguage
by choroba (Archbishop) on Jul 08, 2020 at 21:27 UTC
    The meta info knows whether the field was quoted or not.
    #!/usr/bin/perl
    use warnings;
    use strict;

    use Text::CSV_XS;

    my $csv = 'Text::CSV_XS'->new ({ binary => 1, auto_diag => 1, keep_meta_info => 1 });
    open my $in, '<:encoding(utf8)', shift or die $!;
    while (my $row = $csv->getline($in)) {
        next if $row->[0] =~ m/^#/ && ! $csv->is_quoted(0);
        $csv->say(*STDOUT, $row);
    }

    Tested with

    #x,y,z skip
    abc,"d
    #e
    f",ghi keep
    #comment skip
    a,b,c,#xyz keep
    "#foo",x,y,z keep
    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      The meta info knows whether the field was quoted or not.

      True, though AFAICT the meta_info doesn't seem to keep track of escaped characters:

      use warnings;
      use strict;
      use Data::Dump;
      use Text::CSV;

      my $csv = Text::CSV->new({ binary=>1, auto_diag=>2,
          keep_meta_info=>1, escape_char=>"\\" });
      while ( my $row = $csv->getline(*DATA) ) {
          dd $row, $csv->meta_info;
      }
      $csv->eof or $csv->error_diag;
      __DATA__
      foo,bar
      "#foo","bar"
      #foo,bar
      \#foo,bar
Re^8: Perl custom sort for Portuguese Lanaguage
by soonix (Canon) on Jul 09, 2020 at 06:31 UTC
    In this special case it looks like there won't be Portuguese words starting with a "#", so it would work for the OP, as long as he is aware of it.
      In this special case it looks like there won't be Portuguese words starting with a "#", so it would work for the OP, as long as he is aware of it.

      True as well :-) (I guess this is more about the generic case of filtering comments from CSV files.)
