http://www.perlmonks.org?node_id=998554


in reply to Sort CSV file within Excel based on specific column

I agree with runrig, that you don't need Excel to sort your csv file. Consider the following:

FileName.csv file contents before (adapted from here):

REVIEW_DATE,AUTHOR,ISBN,DISCOUNTED_PRICE 1985/01/21,"Douglas Adams",0345391802,5.95 1990/01/12,"Douglas Hofstadter",0465026567,9.95 1998/07/15,"Timothy ""The Parser"" Campbell",0968411304,18.99 1999/12/03,"Richard Friedman",0060630353,5.95 2001/09/19,"Karen Armstrong",0345384563,9.95 2002/06/23,"David Jones",0198504691,9.95 2002/06/23,"Julian Jaynes",0618057072,12.50 2003/09/30,"Scott Adams",0740721909,4.95 2004/10/04,"Benjamin Radcliff",0804818088,4.95 2004/10/04,"Randel Helms",0879755725,4.50

The script which numerically sorts the above csv on the last field:

use strict; use warnings; use Text::CSV; my ( $csvFileName, @csvLines ) = 'FileName.csv'; my $csv = Text::CSV_XS->new( { binary => 1, auto_diag => 2 } ) or die "Cannot use CSV: " . Text::CSV->error_diag(); open my $csvfh, '<', $csvFileName or die $!; # Get first line (array reference to parsed column names) my $columnNames = $csv->getline($csvfh); # $row contains an array reference to the parsed csv line while ( my $row = $csv->getline($csvfh) ) { push @csvLines, $row; } close $csvfh; # $a->[3] dereferences the array reference to get the third element @csvLines = sort { $a->[3] <=> $b->[3] } @csvLines; # Add column names array reference to beginning of array unshift @csvLines, $columnNames; $csv->eol("\n"); # Print the sorted csv lines to a file open $csvfh, '>', "sorted_$csvFileName" or die $!; $csv->print( $csvfh, $_ ) for @csvLines; close $csvfh;

Results written to file sorted_FileName.csv:

REVIEW_DATE,AUTHOR,ISBN,DISCOUNTED_PRICE 2004/10/04,"Randel Helms",0879755725,4.50 2003/09/30,"Scott Adams",0740721909,4.95 2004/10/04,"Benjamin Radcliff",0804818088,4.95 1985/01/21,"Douglas Adams",0345391802,5.95 1999/12/03,"Richard Friedman",0060630353,5.95 1990/01/12,"Douglas Hofstadter",0465026567,9.95 2001/09/19,"Karen Armstrong",0345384563,9.95 2002/06/23,"David Jones",0198504691,9.95 2002/06/23,"Julian Jaynes",0618057072,12.50 1998/07/15,"Timothy ""The Parser"" Campbell",0968411304,18.99

You'll need to change the following:

{ $a->[3] <=> $b->[3] }

to (at least):

{ $a->[5] <=> $b->[5] }

If column F isn't numeric, change the <=> to cmp.

Hope this helps!

Edit: Script modified to output to file. Also removed the or die $! from--and shortened--the $csv->print (thank you, Tux).

Replies are listed 'Best First'.
Re^2: Sort CSV file within Excel based on specific column
by Tux (Canon) on Oct 12, 2012 at 06:02 UTC

    When using auto_diag => 2, or die … is unneeded. It will happen before inside Text::CSV or Text::CSV_XS, as they use the underlying print.

    for my $csvLine (@csvLines) { $csv->print( $csvfh, $csvLine ) or die $!; } -> $csv->print ($csvfh, $_) for @csvLines;

    Enjoy, Have FUN! H.Merijn

      Thank you, Tux. Have modified the script.

Re^2: Sort CSV file within Excel based on specific column
by Anonymous Monk on Sep 02, 2014 at 06:52 UTC
    I was looking for this, and is working exactly what i need thank you so much