Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

Sort on Table headers

by Anonymous Monk
on Apr 18, 2012 at 19:37 UTC ( #965790=perlquestion: print w/ replies, xml ) Need Help??
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I have a tab-delimited table that I want to sort (descending) its records based on the entries in two columns (score and value), an excerpt of this table is provided below. The scenario is that if two consecutive records have an equal reading for the value then we resort to sorting these based on which score reading is higher. I thought it will be easy, but I tried everything from simple sorting to indexed looping to a Schwartzian transform and apparently I am not sure I am implementing anything right after all hence I seek more enlightenment from you. My data structure is an array of arrays.
use strict; use warnings; use Data::Dumper my $header = <DATA>; #capture the header line my @headers = split(/\t+/,$header); while(my $line = <DATA>){ my @records = split(/\t+/,$line); # print "@records"; push @recordsArray,@records; } #I lost control #my @sorted_array = map{$_->[0]} sort{$a->[1] <=> $b->[1]} map{[$_, sp +lit /\t+/]} @recordsArray; #the only little bit of the above line which is operational my @sorted_array = map{split(/\t/, $_)} @recordsArray; print join("\t", @headers); print join("\t",@sorted_array); #print Dumper(\@recordsArray); __DATA__ ID distance score value start stop done remaining + N_425 614 17.01 425 40 12 308 322 N_542 1290 18.74 542 53 15 237 251 N_372 870 15.66 372 80 15 262 276 N_236 814 15.65 236 69 13 185 200
After sorting the table will look like the following:
ID distance score value start stop done remaining N_542 1290 18.74 542 53 15 237 251 N_372 870 15.66 372 80 15 262 276 N_236 814 15.65 236 69 13 185 200 N_425 614 17.01 425 40 12 308 322

Comment on Sort on Table headers
Select or Download Code
Re: Sort on Table headers
by BillKSmith (Chaplain) on Apr 18, 2012 at 20:27 UTC
    I have implemented your specification, but did not get the expected result. (Did you mean to sort on distance rather than value?)
    use strict; use warnings; use Data::Dumper; my $header = <DATA>; #capture the header line my @headers = split(/\t+/,$header); my @recordsArray; while(my $line = <DATA>){ my @records = split(/\t+/,$line); # print "@records"; push @recordsArray,[@records]; } #print Dumper(\@recordsArray); my @SortedArray = sort {$b->[3]<=>$a->[3] or $b->[2]<=>$a->[2]} @recor +dsArray; print Dumper(\@SortedArray); __DATA__ ID distance score value start stop done remaining + N_425 614 17.01 425 40 12 308 322 N_542 1290 18.74 542 53 15 237 251 N_372 870 15.66 372 80 15 262 276 N_236 814 15.65 236 69 13 185 200
Re: Sort on Table headers
by snape (Pilgrim) on Apr 18, 2012 at 20:53 UTC

    What do you mean "if two consecutive records have an equal readings". The code takes into consideration if two fields have same values the n sort it via score.

    #!/tools/bin/perl use strict; use warnings; use Data::Dumper; open DATA, "file1" or die $!; my %recordsArray; my $header = <DATA>; #capture the header line my @headers = split(/\t+/,$header); while(my $line = <DATA>){ chomp($line); my @records = split(/\t+/,$line); # print "@records"; my $score = $records[2]; my $value = $records[3]; $recordsArray{$score}{$value} = $line; } print $header,"\n"; foreach my $score (reverse sort keys %recordsArray){ foreach my $value (reverse sort keys %{$recordsArray{$score}}){ print $recordsArray{$score}{$value},"\n"; } } ----DATA was modified for ids having same values----- ID distance score value start stop done remaining + N_425 614 17.01 425 40 12 308 322 N_542 1290 18.74 542 53 15 237 251 N_372 870 15.66 372 80 15 262 276 N_236 814 15.70 372 69 13 185 200 --------Result-------------------- ID distance score value start stop done remaining + N_542 1290 18.74 542 53 15 237 251 N_425 614 17.01 425 40 12 308 322 N_236 814 15.70 372 69 13 185 200 N_372 870 15.66 372 80 15 262 276
Re: Sort on Table headers
by temporal (Pilgrim) on Apr 18, 2012 at 21:33 UTC
    #I lost control

    I lol'd, going to start leaving comments about my decaying psychological state in my code:

    # Day 106 # out of Ramen # must set more mousetraps # Day 108 # perl -w # produces maniacal laughter # Day 111 # murdered the Roomba # had it coming # not palatable # Day 117 # woke up from nightmare # was writing this subroutine # the mice, the horrible mice! # Day 118 # Dobbins forks my code, AGAIN # I SEE YOU, DOBBINS # Day 120 # hearing voices # they're speaking LISP # parenthesizing me in their... # ...unholy cdr # Day 122 # something is living # in /etc # WATCHING # cd cd cd cd cd cd # Day 124 # no more mice. starvation setting in # must invite Dobbins for... dinner # Day 126 # print "all $work and no @play makes ${\ $self->set('a dull boy') }\n +" x 1e10; # Day 127 # code nearly complete # test cases... missing # must execute... deploy to... production # Day 135 # ...power dead... # ...soul-crushing darkness closing in... # UPS fading...
      # Day 149 # Well? Shall I go? # Yes, I should go. # Day 153 # I have not moved.
Re: Sort on Table headers
by scorpio17 (Monsignor) on Apr 18, 2012 at 21:39 UTC

    I think you want something more like this:

    use strict; use warnings; my @recordsArray; my @headers; while(my $line = <DATA>) { chomp $line; next unless $line; # skip blank lines if ($. == 1) { # first line? @headers = split(/\s+/, $line); next; } push @recordsArray, [ split(/\s+/, $line) ]; # store array ref } # sort first on column 2 (score), then on column 3 (value) # column 3 will only be used when column 2 values are the same my @sorted_array = sort { $a->[2] <=> $b->[2] || $a->[3] <=> $b->[3] } @recordsArray; print join("\t", @headers), "\n"; for my $row (@sorted_array) { # each element is an array ref... print join("\t", @$row), "\n"; # ...must dereference the array refs } __DATA__ ID distance score value start stop done remaining + N_425 614 17.01 425 40 12 308 322 N_542 1290 18.74 542 53 15 237 251 N_372 870 15.66 372 80 15 262 276 N_236 814 15.65 236 69 13 185 200 N_991 814 14.65 9 69 13 185 200 N_992 814 14.65 8 69 13 185 200 N_993 814 14.65 7 69 13 185 200 N_994 814 14.65 6 69 13 185 200 N_995 814 14.65 5 69 13 185 200

    which produces this output:

    ID distance score value start stop done remaining N_995 814 14.65 5 69 13 185 200 N_994 814 14.65 6 69 13 185 200 N_993 814 14.65 7 69 13 185 200 N_992 814 14.65 8 69 13 185 200 N_991 814 14.65 9 69 13 185 200 N_236 814 15.65 236 69 13 185 200 N_372 870 15.66 372 80 15 262 276 N_425 614 17.01 425 40 12 308 322 N_542 1290 18.74 542 53 15 237 251

    I added some additional data, just to make sure it was sorting correctly when col 2 values for two rows were equal. If this isn't what you want, you'll have to clarify the problem.

Re: Sort on Table headers
by Kenosis (Priest) on Apr 18, 2012 at 22:15 UTC

    You can coax a Schwartzian transform into doing this work for you:

    use strict; use warnings; map{print $_->[0]} sort{$a->[1] <=> $b->[1] || $a->[2] <=> $b->[2]} map{[$_, (split /\t/)[2], (split /\t/)[3]]} <DATA>; #ID distance score value start stop done remainin +g __DATA__ N_425 614 17.01 425 40 12 308 322 N_542 1290 18.74 542 53 15 237 251 N_372 870 15.66 372 80 15 262 276 N_236 814 15.65 236 69 13 185 200 N_991 814 14.65 9 69 13 185 200 N_992 814 14.65 8 69 13 185 200 N_993 814 14.65 7 69 13 185 200 N_994 814 14.65 6 69 13 185 200 N_995 814 14.65 5 69 13 185 200

    Results:

    N_995 814 14.65 5 69 13 185 200 N_994 814 14.65 6 69 13 185 200 N_993 814 14.65 7 69 13 185 200 N_992 814 14.65 8 69 13 185 200 N_991 814 14.65 9 69 13 185 200 N_236 814 15.65 236 69 13 185 200 N_372 870 15.66 372 80 15 262 276 N_425 614 17.01 425 40 12 308 322 N_542 1290 18.74 542 53 15 237 251

    Thank you, scorpio17, for the extended data set; and the sorting routine here is essentially the same as yours.

    Hope this helps!

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://965790]
Approved by biohisham
Front-paged by biohisham
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others about the Monastery: (10)
As of 2014-07-23 23:22 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (154 votes), past polls