Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

How do I sort a list of strings?

( #2254=categorized question: print w/ replies, xml ) Need Help??
Contributed by vroom on Jan 21, 2000 at 00:18 UTC
Q&A  > sorting


Answer: How I do I sort a list of strings?
contributed by vroom

Use the sort operator sort defaults to a textual comparison unless something else is put in the sort block see How do I sort a list of numbers?

@sorted=sort @unsorted;
Answer: How I do I sort a list of strings?
contributed by Mago

Quoting from perldoc -f sort:

sort SUBNAME LIST sort BLOCK LIST sort LIST

In list context, this sorts the LIST and returns the sorted list value. In scalar context, the behaviour of sort() is undefined.
If SUBNAME or BLOCK is omitted, sorts in standard string comparison order. If SUBNAME is specified, it gives the name of a subroutine that returns an integer less than, equal to, or greater than 0, depending on how the elements of the list are to be ordered. (The <=> and cmp operators are extremely useful in such routines.) SUBNAME may be a scalar variable name (unsubscripted), in which case the value provides the name of (or a reference to) the actual subroutine to use. In place of a SUBNAME, you can provide a BLOCK as an anonymous, in-line sort subroutine.
If the subroutine's prototype is ($$), the elements to be compared are passed by reference in @_, as for a normal subroutine. This is slower than unprototyped subroutines, where the elements to be compared are passed into the subroutine as the package global variables $a and $b (see example below). Note that in the latter case, it is usually counter-productive to declare $a and $b as lexicals.
In either case, the subroutine may not be recursive. The values to be compared are always passed by reference, so don't modify them.
You also cannot exit out of the sort block or subroutine using any of the loop control operators described in perlsyn or with goto.
When use locale is in effect, sort LIST sorts LIST according to the current collation locale.
Perl 5.6 and earlier used a quicksort algorithm to implement sort. That algorithm was not stable, and could go quadratic. (A stable sort preserves the input order of elements that compare equal. Although quicksort's run time is O(NlogN) when averaged over all arrays of length N, the time can be O(N**2), quadratic behavior, for some inputs.) In 5.7, the quicksort implementation was replaced with a stable mergesort algorithm whose worst case behavior is O(NlogN). But benchmarks indicated that for some inputs, on some platforms, the original quicksort was faster. 5.8 has a sort pragma for limited control of the sort. Its rather blunt control of the underlying algorithm may not persist into future perls, but the ability to characterize the input or output in implementation independent ways quite probably will.

Examples:

# sort lexically @articles = sort @files; # same thing, but with explicit sort routine @articles = sort {$a cmp $b} @files; # now case-insensitively @articles = sort {uc($a) cmp uc($b)} @files; # same thing in reversed order @articles = sort {$b cmp $a} @files; # sort numerically ascending @articles = sort {$a <=> $b} @files; # sort numerically descending @articles = sort {$b <=> $a} @files; # this sorts the %age hash by value instead of key # using an in-line function @eldest = sort { $age{$b} <=> $age{$a} } keys %age; # sort using explicit subroutine name sub byage { $age{$a} <=> $age{$b}; # presuming numeric } @sortedclass = sort byage @class; sub backwards { $b cmp $a } @harry = qw(dog cat x Cain Abel); @george = qw(gone chased yz Punished Axed); print sort @harry; # prints AbelCaincatdogx print sort backwards @harry; # prints xdogcatCainAbel print sort @george, 'to', @harry; # prints AbelAxedCainPunishedcatchaseddoggonetoxyz # inefficiently sort by descending numeric compare using # the first integer after the first = sign, or the # whole record case-insensitively otherwise @new = sort { ($b =~ /=(\d+)/)[0] <=> ($a =~ /=(\d+)/)[0] || uc($a) cmp uc($b) } @old; # same thing, but much more efficiently; # we'll build auxiliary indices instead # for speed @nums = @caps = (); for (@old) { push @nums, /=(\d+)/; push @caps, uc($_); } @new = @old[ sort { $nums[$b] <=> $nums[$a] || $caps[$a] cmp $caps[$b] } 0..$#old ]; # same thing, but without any temps @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] || $a->[2] cmp $b->[2] } map { [$_, /=(\d+)/, uc($_)] } @old; # using a prototype allows you to use any comparison subroutine # as a sort subroutine (including other package's subroutines) package other; sub backwards ($$) { $_[1] cmp $_[0]; } # $a and $b are not set + here package main; @new = sort other::backwards @old; # guarantee stability, regardless of algorithm use sort 'stable'; @new = sort { substr($a, 3, 5) cmp substr($b, 3, 5) } @old; # force use of mergesort (not portable outside Perl 5.8) use sort '_mergesort'; # note discouraging _ @new = sort { substr($a, 3, 5) cmp substr($b, 3, 5) } @old;

If you're using strict, you must not declare $a and $b as lexicals. They are package globals. That means if you're in the main package and type
@articles = sort {$b <=> $a} @files;

then $a and $b are $main::a and $main::b (or $::a and $::b), but if you're in the FooPack package, it's the same as typing
@articles = sort {$FooPack::b <=> $FooPack::a} @files;

The comparison function is required to behave. If it returns inconsistent results (sometimes saying $x[1] is less than $x[2] and sometimes saying the opposite, for example) the results are not well-defined.

Edited by QandAEditors for character escapes

Please (register and) log in if you wish to add an answer



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • Outside of code tags, you may need to use entities for some characters:
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others browsing the Monastery: (10)
    As of 2014-07-22 21:44 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?

      My favorite superfluous repetitious redundant duplicative phrase is:









      Results (129 votes), past polls