PerlMonks  

Re: Perl script for the post processing of one CSV file

by rjt (Curate)
on Oct 03, 2019 at 05:52 UTC


in reply to Perl script for the post processing of one CSV file

What have you tried already? See How do I post a question effectively?, but you have at least provided some good sample input and output, so that's a good start.

See Text::CSV for the CSV processing.

use Text::CSV 'csv';

my $aoh = csv (in => 'input.csv', headers => 'auto');

# Fetch the "patN" column names, in order. This works
# for single digit "patN" names, as that was your
# example. Multi-digit names will require a more complex
# sort, left as an exercise to the reader.
my @pats = sort grep /^pat\d+$/, keys %{$aoh->[0]};

my %sums; # Column sums. $sums{column} += $value;

for my $row (@$aoh) {
    for (map { $row->{$_} } @pats) {
        # You are now iterating over every patN value,
        # in order. Perform your transformation
    }

    # Just an example.
    $sums{$_} += $row->{$_} for @pats;
}

This is just a skeleton to get you started. Certainly more code than you posted. The code will require modification, does not output anything, and some of my simple logic may prove to be inadequate, but hopefully illustrates the general approach you might take. Definitely read the Text::CSV manual thoroughly, and probably perldata as well.
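The skeleton leaves the multi-digit sort as an exercise. One possible approach, offered as a minimal sketch rather than the only way to do it, is to sort on the numeric suffix instead of the whole string. The helper name sort_pats and the sample column names are made up for illustration.

```perl
use strict;
use warnings;

# Return only the patN names, ordered by their numeric suffix,
# so that pat10 sorts after pat2 instead of before it.
# sort_pats is a hypothetical helper, not part of Text::CSV.
sub sort_pats {
    my @names = @_;
    return map  { $_->[1] }                      # unwrap the original name
           sort { $a->[0] <=> $b->[0] }          # compare the numbers
           map  { /^pat(\d+)$/ ? [ $1, $_ ] : () }  # keep [N, "patN"] pairs
           @names;
}

print join(",", sort_pats(qw(pat10 id pat2 pat1 name))), "\n";
# prints: pat1,pat2,pat10
```

In the skeleton this would replace the plain sort grep line, for example my @pats = sort_pats(keys %{$aoh->[0]});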

use strict; use warnings; omitted for brevity.

Replies are listed 'Best First'.
Re^2: Perl script for the post processing of one CSV file
by Tux (Abbot) on Oct 03, 2019 at 07:03 UTC

    You can stream it and prevent memory hogs on big data. Additionally, install Text::CSV_XS for speed.

    use Text::CSV_XS "csv";

    my %sums; # Column sums. $sums{column} += $value
    my @head;
    my @pats;
    csv (in => "input.csv", out => undef, bom => 1, kh => \@head, on_in => sub {
        my ($csv, $row) = @_;
        unless (@pats) {
            # Fetch the "patN" column names, in order. This works
            # for single digit "patN" names, as that was your
            # example. Multi-digit names will require a more complex
            # sort, left as an exercise to the reader.
            # XXX @pats = sort { grep m/^pat\d+$/ } @head;
            @pats = sort grep m/^pat\d+$/ => @head; # This line fixed
        }
        for (@{$row}{@pats}) {
            # You are now iterating over every patN value,
            # in order. Perform your transformation
        }
        # Just an example.
        $sums{$_} += $row->{$_} for @pats;
    });

    Update: I changed the grep line, which I had blindly copied from the original code.


    Enjoy, Have FUN! H.Merijn
      You can stream it and prevent memory hogs on big data. Additionally, install Text::CSV_XS for speed
      use Text::CSV_XS "csv";

      Install Text::CSV_XS, yes, but don't explicitly use Text::CSV_XS. Text::CSV is smart enough to pull in the XS version if installed, and will fall back to pure Perl if not. There is usually no point in having the script break if XS isn't available.

      perl -MText::CSV -e 'print Text::CSV->module'
      Text::CSV_XS
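The same check can be done from inside a script. This is a minimal sketch that assumes nothing beyond the module method shown in the one-liner; the wrapper name csv_backend is made up, and the eval is only there so the snippet still runs on a machine without Text::CSV.

```perl
use strict;
use warnings;

# Return the name of the backend Text::CSV picked, or undef when
# Text::CSV itself is not installed. csv_backend is a hypothetical
# helper wrapped around the module's own "module" method.
sub csv_backend {
    return scalar eval { require Text::CSV; Text::CSV->module };
}

my $backend = csv_backend();
print defined $backend
    ? "Text::CSV is using $backend\n"
    : "Text::CSV is not installed\n";
```

Because the script only ever says use Text::CSV, it keeps working whether or not the XS version is present.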

      I agree with your streaming suggestion. I had opted to keep my example simple given the 20k line input. Good to teach the streaming approach, though. ++

      use strict; use warnings; omitted for brevity.
      Hi Superdoc,

      Thanks, I have tried it but am getting the error below:

      Not enough arguments for grep at csv_to_output_report.pl line 18, near "m/^pat\d+$/ }"

      Could you help me resolve this error?

      Thanks, Kshitij
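For anyone hitting the same message: it comes from the brace form of that line. A rough sketch of the difference, using a made-up header row in place of the real file:

```perl
use strict;
use warnings;

# Hypothetical header row standing in for the one read from the CSV file.
my @head = qw(id pat2 name pat1);

# Broken: the braces make the grep a sort comparison block, so grep is
# left with a pattern and no list to filter. This does not compile and
# reports "Not enough arguments for grep".
# my @pats = sort { grep m/^pat\d+$/ } @head;

# Working: grep filters @head first, then sort orders what is left.
my @pats = sort grep { m/^pat\d+$/ } @head;

print "@pats\n";   # prints: pat1 pat2
```

The posted code above has since been updated to the working form.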

      Hi Superdoc,

      I can't use these subroutines "bom" and "kh" since I am using older version of perl. Can you help me out with the code without using these subroutines?

      Thanks, Kshitij

        Those are arguments, not subroutines, and they are part of Text::CSV, not Perl itself. What version of Perl are you using (perl -v from a prompt), and what version of Text::CSV do you have installed? Find that out with perl -MText::CSV -e 'print Text::CSV->VERSION' again from a prompt. The latest is 2.00, released in May of this year.

        You can upgrade your version of Text::CSV using CPAN (and install Text::CSV_XS while you're at it!), and you will then be able to use these attributes. See Using CPAN.
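If upgrading really is not possible, the streaming sum can also be sketched with the object interface (new, getline, column_names, getline_hr), which I believe predates bom and kh; check the documentation of your installed version before relying on that. The function name sum_pat_columns is made up, and the sort is still the simple single-digit one.

```perl
use strict;
use warnings;
use Text::CSV;

# Sum every patN column while reading one row at a time.
# Takes an open filehandle and returns a hash ref of column sums.
# sum_pat_columns is a hypothetical name used for illustration.
sub sum_pat_columns {
    my ($fh) = @_;

    my $csv = Text::CSV->new({ binary => 1 })
        or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

    # The first line is the header; reuse it as hash keys for later rows.
    my $head = $csv->getline($fh) or die "No header line found";
    $csv->column_names(@$head);

    my @pats = sort grep { /^pat\d+$/ } @$head;

    my %sums;
    while (my $row = $csv->getline_hr($fh)) {
        $sums{$_} += $row->{$_} for @pats;
    }
    return \%sums;
}
```

Usage would look something like open my $fh, '<', 'input.csv' or die $!; my $sums = sum_pat_columns($fh);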

        since I am using older version of perl

        I have some great news for you. You can upgrade individual modules without upgrading the entire perl installation. However, if your perl is that old (which version?), why not take this opportunity to upgrade the entire system anyway?
