PerlMonks  

Re^2: sorting an array of file names

by hotel (Beadle)
on Dec 19, 2009 at 17:58 UTC


in reply to Re: sorting an array of file names
in thread sorting an array of file names

First of all, thanks for the help, and forgive my messed-up English. I wanted to sort the array by the "number" before the first dot, not by the first digit, of course. Anyway, it turned out to be a matter of using ST really, and I noticed that implementing an ST myself is beyond my programming skills for now. However, when I used the code shmem proposed, it worked. Thanks to all of you for your time. I hope the OP (or at least its topic) will be useful to future googlers looking for a solution to this particular, ehm, matter.

Replies are listed 'Best First'.
Re^3: sorting an array of file names
by almut (Canon) on Dec 19, 2009 at 19:50 UTC
    Anyway, it turned out to be a matter of using ST

    I don't think you need a Schwartzian Transform here.  An ST makes sense if the individual comparison operation is computationally expensive. This is not the case with interpreting a string as a number, in particular as the conversion is done only once for each string and then "cached" in the NV/IV fields of the scalar variable(*).  In other words, the simple approach (not using ST) is even faster in this case:

    #!/usr/bin/perl
    use strict;
    use warnings;
    no warnings 'numeric';
    use Benchmark 'cmpthese';

    for my $e (2..5) {
        my $n = 10**$e;
        print "\nNumber of file names: $n\n";
        my @data;
        push @data, join(".", int(rand($n)), int(rand($n)), 'force.0.5.1LGY.pdb')
            for 1..$n;

        cmpthese( 10**(6-$e), {
            'simple' => sub {
                my @unsorted = @data;
                my @sorted = sort { $a <=> $b } @unsorted;
            },
            'ST' => sub {
                my @unsorted = @data;
                my @sorted =
                    map $_->[0],
                    sort { $a->[1] <=> $b->[1] }
                    map { [ $_, int $_ ] } @unsorted;
            },
        } );
    }

    __END__

    Number of file names: 100
              Rate     ST simple
    ST      3247/s     --   -75%
    simple 12987/s   300%     --

    Number of file names: 1000
             Rate     ST simple
    ST      248/s     --   -79%
    simple 1176/s   375%     --

    Number of file names: 10000
             Rate     ST simple
    ST     10.3/s     --   -74%
    simple 39.2/s   280%     --

    Number of file names: 100000
           s/iter     ST simple
    ST       1.87     --   -50%
    simple  0.943    99%     --

    Another beneficial side effect of the simple approach is that if you happen to have two names like this

    30.31.force.0.5.1LGY.pdb
    30.32.force.0.5.1LGY.pdb

    they would be ordered in some useful way, because the fractional part of the number is automatically taken into consideration when just treating the name as a number.
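
    A minimal sketch of that side effect, separate from the benchmark above. The helper name sort_numerically and the sample list are made up for illustration. It compares the names with <=> under no warnings 'numeric', so each name is ordered by whatever leading part Perl can read as a number (30.31, 30.32, ...):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): sort file names by their
# leading numeric prefix. When a string is used as a number, Perl reads
# as much of it as looks like a number ("30.31" out of
# "30.31.force.0.5.1LGY.pdb") and ignores the rest.
sub sort_numerically {
    no warnings 'numeric';    # silence "isn't numeric" warnings from <=>
    return sort { $a <=> $b } @_;
}

# Made-up sample data in the same shape as the names discussed above.
my @names = (
    '30.32.force.0.5.1LGY.pdb',
    '4.7.force.0.5.1LGY.pdb',
    '30.31.force.0.5.1LGY.pdb',
);

print "$_\n" for sort_numerically(@names);
```

    With this sample list the names come out as 4.7..., 30.31..., 30.32.... Keep in mind that the second field is compared as a decimal fraction rather than as a separate integer, so a name starting with 30.4 would sort after one starting with 30.31.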


    (*)

    use Devel::Peek;

    my $s = "30.31.force.0.5.1LGY.pdb";
    Dump $s;
    print 0+$s, "\n";  # treat as number
    Dump $s;

    __END__
    SV = PV(0x605150) at 0x604fa0
      REFCNT = 1
      FLAGS = (PADBUSY,PADMY,POK,pPOK)
      PV = 0x6370d0 "30.31.force.0.5.1LGY.pdb"\0
      CUR = 24
      LEN = 32
    30.31
    SV = PVNV(0x607880) at 0x604fa0
      REFCNT = 1
      FLAGS = (PADBUSY,PADMY,NOK,POK,pIOK,pNOK,pPOK)
      IV = 30      <---
      NV = 30.31   <---
      PV = 0x6370d0 "30.31.force.0.5.1LGY.pdb"\0
      CUR = 24
      LEN = 32
