Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

Sorting a CSV file

by farhan (Novice)
on Jul 17, 2006 at 02:00 UTC ( #561634=perlquestion: print w/ replies, xml ) Need Help??
farhan has asked for the wisdom of the Perl Monks concerning the following question:

I have comma separated CSV file. And there are 5 columns date,delta time, bits, bitsin, bitsout.
Task is
1) sort the file by date
2) ignore the bits
3) compare the bitsin and bitsout and keep the highest one
4) sort the higest one
5) print all above four steps comma separated
I have done first 3 steps and i dont know how i can perform step 4 any body can help me

Raw Data File Contents
12/6/2006 00:02:23,301,1151.78735352,677.26245117,474.52490234
12/6/2006 00:07:22,299,1108.97656250,641.49835205,467.47827148
12/6/2006 00:12:22,300,1020.15997314,590.40002441,429.76000977
12/6/2006 00:17:23,301,981.26245117,562.01995850,419.24252319

OutPut File
1/6/2006 00:03:46 300,641.06665039,477.60000610,641.06665039
1/6/2006 00:08:46 300,563.78668213,425.60000610,563.78668213
1/6/2006 00:13:47 301,505.99334717,430.72424316,505.99334717

Wanting Output

1/6/2006 00:03:46 300,641.06665039,477.60000610,641.06665039 ,505.99334717
1/6/2006 00:08:46 300,563.78668213,425.60000610,563.78668213 ,563.78668213
1/6/2006 00:13:47 301,505.99334717,430.72424316,505.99334717 ,641.06665039

open(RAW,"raw_data") or die ("could not open the file $!"); while(<RAW>) { chomp; next if /^(\s)*$/; my @this_record = split(/,/,$_); push(@records,\@this_record); } my @sorted = sort{$a->[0] <=> $b->[0]} @records; foreach $record (@sorted) { if($record->[3] >= $record->[4]) { $final = $record->[3]; } else { $final = $record->[4]; } print "$record->[0] $record->[1],$record->[3],$record->[4],$final\n"; } close(RAW);

2006-07-17 Retitled by Corion, as per Monastery guidelines
Original title: 'Sorting'

Comment on Sorting a CSV file
Download Code
Re: Sorting a CSV file
by McDarren (Abbot) on Jul 17, 2006 at 02:59 UTC
    "4) sort the higest one "

    Sort the highest one what?

    Do you mean:

    1. sort by bitsin?
    2. sort by bitsout?
    3. extract the highest bitsin value?
    4. extract the highest bitsout value?
    5. something else entirely?

    You need to clarify.

    By the way, the "sorting by date" that you've implemented does not work. If you enable warnings, you will see the following output:

    Argument "12/6/2006 00:07:22" isn't numeric in numeric comparison (<=> +) at foo.pl line 15, <DATA> line 4. Argument "12/6/2006 00:02:23" isn't numeric in numeric comparison (<=> +) at foo.pl line 15, <DATA> line 4. Argument "12/6/2006 00:17:23" isn't numeric in numeric comparison (<=> +) at foo.pl line 15, <DATA> line 4. Argument "12/6/2006 00:12:22" isn't numeric in numeric comparison (<=> +) at foo.pl line 15, <DATA> line 4.
    You get that because you are tring to do a numeric sort on "stringified" data. To properly sort by date, you'll need to first convert the strings into something that sort can make sense of. For example, a unix timestamp. Something like Date::Manip or Date::Calc will be useful for that.

    Cheers,
    Darren

      Darren, Thanks for reply.If u enable warning, it gives error otheriwse it is sorting by date. later on I will use different approach
      but in mean time i want solution. Do you know how can i solve my above mentioned problem?
        "..it gives error otheriwse it is sorting by date"

        No. As I already pointed out, it is not sorting by date at all. It only appears to be be working because your sample data happens to be ordered by date. Try re-arranging the lines in your sample data and you will see what I mean.

      Sorry Darrent I could not make it clear.
      Sort by date, bitsin,bitsout, compare bitsin and bitsout and keep the highest one, sort the highest value which you get from previous comparision
      Out put should look like as date, bitsin, bitsout, highest one, sort highest one
      01/01/2006 00:03:12, 210, 212, 212, 212
        I'm sorry but I honestly don't understand what it is that you want. I've read it through over and over again and I just don't get it. Hopefully somebody else will be able to understand and point you in the right direction.

        Again, I'm sorry.
        Darren

Re: Sorting a CSV file
by rsiedl (Friar) on Jul 17, 2006 at 03:39 UTC
    Hi,

    You might want to look at Text::CSV for parsing your csv file. It could make things a bit simpler.

    And then, as McDarren says, Date::Calc would be quite helpful for your sort. Perhaps convert it to a unixtime format and then perform the sort.

    A sort sub to get the job done:
    # Store your values in a hash of hashes # $kols{line1} = { date => 12-12-06, bits => 12, ... } my $orderby = "date" # or "bits, ..." my $updown = "DESC" # or "ASC" foreach (sort { &mysort($updown) } keys %kols) { } # end-foreach sub mysort { my ($direction) = @_; if ($direction eq "DESC") { $kols{$b}{$orderby} <=> $kols{$a}{$orderby} || $kols{$b}{$orderby} cmp $kols{$a}{$orderby} } # end-if else { $kols{$a}{$orderby} <=> $kols{$b}{$orderby} || $kols{$a}{$orderby} cmp $kols{$b}{$orderby} } # end-else } # end-sub

    Cheers.
Re: Sorting a CSV file
by CountZero (Bishop) on Jul 17, 2006 at 14:17 UTC
    Think of your csv-file as a simple form of database and then you can use DBI and DBD::CSV on it and use the strength of SQL to select and sort as you like.

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://561634]
Approved by McDarren
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (18)
As of 2014-07-14 17:04 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    When choosing user names for websites, I prefer to use:








    Results (268 votes), past polls