PerlMonks  

How to sort hash keys numerically?

by rnaeye (Friar)
on Mar 20, 2013 at 16:21 UTC ( [id://1024565]=perlquestion )

rnaeye has asked for the wisdom of the Perl Monks concerning the following question:

Hi! Monks,
Can I please ask for your wisdom? I want to print the hash keys in numerical order, but my short script sorts the keys in ASCII order. I was wondering if you could advise me on this. A second question is that I am planning to run this script on a file that contains 10-15 million lines. Do you think this is the best way of doing it? I am just trying to calculate the coverage at each base position. Thank you for your help!

use warnings;
use strict;
use 5.010;

my %base_positon;
while(<DATA>){
    my ($chr, $start, $end)= split;
    $base_positon{$_}++ for ($start..$end);
}

# while( my($key, $value) = each %base_positon){
#     say $key,"\t",$value;
# }

foreach my $key (sort(keys %base_positon) ){
    say $key,"\t", $base_positon{"$key"};
}

__DATA__
chrM 0 49 M01193:66:000000000-A386C:1:1112:20711:7517 0 + +
chrM 0 49 M01193:66:000000000-A386C:1:1112:12448:7530 0 + +
chrM 0 46 M01193:66:000000000-A386C:1:2108:26167:23502 0 + +
chrM 0 46 M01193:66:000000000-A386C:1:1101:17077:1444 0 + -
chrM 0 50 M01193:66:000000000-A386C:1:1101:17602:1741 42 + +
chrM 0 46 M01193:66:000000000-A386C:1:1101:13807:1866 0 + +
chrM 0 46 M01193:66:000000000-A386C:1:1101:16360:2204 0 + -
chrM 0 46 M01193:66:000000000-A386C:1:1101:13075:2236 0 + -
chrM 0 46 M01193:66:000000000-A386C:1:1101:15485:2329 0 + -
chrM 0 50 M01193:66:000000000-A386C:1:1101:13054:2607 42 + -

Replies are listed 'Best First'.
Re: How to sort hash keys numerically?
by BrowserUk (Patriarch) on Mar 20, 2013 at 16:40 UTC
    A second question is that I am planning to run this script on a file that contains 10-15 million lines. Do you think this is the best way of doing it?

    Code updated.

    Maybe not. All your keys are numeric and chrM is only around 16k in length, so you can save space by using an array instead of a hash and save time by not having to sort:

    use warnings;
    use strict;
    use 5.010;

    my @pos;
    while(<DATA>){
        my( $chr, $start, $end ) = split;
        $pos[ $_ ]++ for $start .. $end;
    }

    # Array indices are already in numeric order, so no sort is needed.
    for my $pos ( 0 .. $#pos ){
        next unless defined $pos[ $pos ];
        say $pos,"\t", $pos[ $pos ];
    }

    However, this might exhaust your memory if you tried to run it on some of the larger genomes.
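
One possible way around the memory concern, offered as a sketch rather than anything proposed in this thread: instead of touching every base in an interval, record only a +1 where an interval starts and a -1 just past where it ends, then sweep the sorted positions once. Memory then grows with the number of distinct endpoints rather than with genome length. The sub name coverage_segments and the [from, to, depth] output shape are made up for illustration, and the sketch assumes a single chromosome with inclusive end positions, as in the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: turns inclusive [start, end] intervals into
# run-length coverage segments [from, to, depth].
sub coverage_segments {
    my @intervals = @_;
    my %delta;
    for my $iv (@intervals) {
        my ( $start, $end ) = @$iv;
        $delta{$start}++;        # coverage rises at the start
        $delta{ $end + 1 }--;    # and falls just past the inclusive end
    }

    my @segments;
    my $depth = 0;
    my $prev;
    for my $pos ( sort { $a <=> $b } keys %delta ) {
        # Emit the stretch since the previous breakpoint if it was covered.
        push @segments, [ $prev, $pos - 1, $depth ]
            if defined $prev && $depth > 0;
        $depth += $delta{$pos};
        $prev = $pos;
    }
    return @segments;
}

# The three distinct ranges that occur in the sample data.
my @segments = coverage_segments( [ 0, 49 ], [ 0, 46 ], [ 0, 50 ] );
print join( "\t", @$_ ), "\n" for @segments;
```

Each printed line is a range and its depth rather than one line per base, so the output format differs from the original script; expanding a segment back into per-base lines is a simple loop if that format is needed.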


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: How to sort hash keys numerically?
by davido (Cardinal) on Mar 20, 2013 at 16:25 UTC

    Perl's documentation for sort provides the following example:

    # sort numerically ascending
    @articles = sort {$a <=> $b} @files;

    So applying that to your code, it would look like this:

    foreach my $key ( sort { $a <=> $b } keys %base_position ) { ...

    Perl's sort defaults to sorting ASCII-betically. If you want it to sort numerically, you need to supply a code block that uses the <=> operator (perlop).
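
A minimal sketch of the difference, using a made-up %coverage hash rather than the data from the question:

```perl
use strict;
use warnings;

# Hypothetical sample: keys are base positions, values are coverage counts.
my %coverage = ( 0 => 10, 2 => 10, 10 => 10, 47 => 4, 100 => 1 );

# Default sort compares keys as strings, so "10" and "100" land before "2".
my @as_strings = sort keys %coverage;

# A block using <=> compares the keys as numbers instead.
my @as_numbers = sort { $a <=> $b } keys %coverage;

print "string order:  @as_strings\n";    # 0 10 100 2 47
print "numeric order: @as_numbers\n";    # 0 2 10 47 100
```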

    Update:

    Your second question is a good one. One area where you could improve would be to use an array instead of a hash as an accumulator. Although both array and hash lookups are O(1), the constant factor involved in hash lookups is more expensive than for array lookups. Since you're doing lookups 40 to 50 times per line from the file, and since all of your indices seem to be within a narrow numeric range from zero to fifty, an array makes a lot of sense. This would also eliminate the need to sort.

    However, it's possible you're IO bound anyway, and that the time spent on hash lookups is dwarfed by the time spent waiting for input. Profiling would tell a lot.
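
As a rough starting point for measuring the hash-versus-array difference, here is a sketch using the core Benchmark module. The workload (1,000 intervals of 0..49) and the sub names are assumptions chosen to resemble the sample data. It times only the counting loops, not file reading, so it cannot show whether the real script is IO bound; a profiler such as Devel::NYTProf run against the actual input would be needed for that.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical workload: 1,000 intervals shaped like the sample data.
my @intervals = map { [ 0, 49 ] } 1 .. 1000;

# Accumulate per-position counts in a hash, as in the original script.
sub count_with_hash {
    my %count;
    for my $iv (@intervals) {
        $count{$_}++ for $iv->[0] .. $iv->[1];
    }
    return \%count;
}

# Accumulate the same counts in an array indexed by position.
sub count_with_array {
    my @count;
    for my $iv (@intervals) {
        $count[$_]++ for $iv->[0] .. $iv->[1];
    }
    return \@count;
}

# A negative count asks Benchmark to run each sub for at least
# that many CPU seconds, then print a comparison table.
cmpthese( -1, {
    hash  => \&count_with_hash,
    array => \&count_with_array,
} );
```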


    Dave

Re: How to sort hash keys numerically?
by rnaeye (Friar) on Mar 20, 2013 at 17:11 UTC

    Thank you guys!

Node Type: perlquestion [id://1024565]
Approved by toolic