PerlMonks  

Sorting on identical values

by slugger415 (Monk)
on Aug 08, 2019 at 17:15 UTC ( #11104187=perlquestion )

slugger415 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Perl monks, I am in need of some sorting cleverness.

I have a list of long strings that includes URLs and other bits, e.g.

http://myurl.com/search/stringA?SomeMoreStuffA
http://myurlA.com/search/stringB?SomeMoreStuffB
http://myurlB.com/search/stringC?SomeMoreStuffX
http://myurlC.com/search/stringA?SomeMoreStuffXYZ
http://myurl.com/search/stringZ?SomeMoreStuffZZZ

I want to sort on just the strings between /search/ and the ? character, e.g.

$URL =~ m{/search/(.+)\?};
my ($searchString) = $1;

(Note that rows 1 and 4 above will have identical $searchString values.)

But I also want to keep a reference to the full original URL string and anything else in the string.

I can obviously create an array with the $searchString values and sort that, but how do I map them to the original full string? I can't use a hash, since there are often duplicates (e.g. rows 1 and 4 above, both 'stringA').

Any thoughts? Thanks much as always.

Scott

Replies are listed 'Best First'.
Re: Sorting on identical values
by tybalt89 (Prior) on Aug 08, 2019 at 17:26 UTC
    #!/usr/bin/perl

    # https://perlmonks.org/?node_id=11104187

    use strict;
    use warnings;

    print map $_->[0],
      sort { $a->[1] cmp $b->[1] }
      map [ $_, m{/search/(.*?)\?} ],
      <DATA>;

    __DATA__
    http://myurl.com/search/stringA?SomeMoreStuffA
    http://myurlA.com/search/stringB?SomeMoreStuffB
    http://myurlB.com/search/stringC?SomeMoreStuffX
    http://myurlC.com/search/stringA?SomeMoreStuffXYZ
    http://myurl.com/search/stringZ?SomeMoreStuffZZZ
Re: Sorting on identical values
by holli (Abbot) on Aug 08, 2019 at 17:26 UTC
    You want a Schwartzian transform. Basically:
    my @sorted_urls =
        map  { $_->[0] }
        sort { $a->[1] cmp $b->[1] }
        map  { m|/search/([^\?]+)\?|; [$_, $1] }
        @urls;
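The one-liner above can be read bottom to top as three steps: decorate, sort, undecorate. What follows is only a minimal sketch of those steps written out separately, using the sample URLs from the question. The intermediate names `@decorated` and `@ordered`, and the tie-break on the full URL for identical search strings, are assumptions added for illustration and are not part of the original reply.

```perl
use strict;
use warnings;

my @urls = (
    'http://myurl.com/search/stringA?SomeMoreStuffA',
    'http://myurlA.com/search/stringB?SomeMoreStuffB',
    'http://myurlB.com/search/stringC?SomeMoreStuffX',
    'http://myurlC.com/search/stringA?SomeMoreStuffXYZ',
    'http://myurl.com/search/stringZ?SomeMoreStuffZZZ',
);

# Step 1 (decorate): pair each original string with its sort key.
# Capturing in list context avoids reading $1 after a failed match.
my @decorated = map {
    my ($key) = m{/search/([^?]+)\?};
    [ $_, defined $key ? $key : '' ]
} @urls;

# Step 2 (sort): compare the keys; identical keys fall back to
# comparing the full strings (an assumed tie-break rule).
my @ordered = sort { $a->[1] cmp $b->[1] or $a->[0] cmp $b->[0] } @decorated;

# Step 3 (undecorate): throw the keys away, keep the original strings.
my @sorted_urls = map { $_->[0] } @ordered;

print "$_\n" for @sorted_urls;
```

Because every element carries its original string along with its key, duplicate keys such as the two 'stringA' rows are not a problem; nothing is ever used as a hash key.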


    holli

    You can lead your users to water, but alas, you cannot drown them.
Re: Sorting on identical values
by BillKSmith (Prior) on Aug 08, 2019 at 18:15 UTC
    A module (List::SomeUtils) can make this easy.
    >type slugger.pl
    use strict;
    use warnings;
    use List::SomeUtils qw(sort_by);
    use Data::Dumper;

    my @strings = (
        "http://myurl.com/search/stringA?SomeMoreStuffA",
        "http://myurlA.com/search/stringB?SomeMoreStuffB",
        "http://myurlB.com/search/stringC?SomeMoreStuffX",
        "http://myurlC.com/search/stringA?SomeMoreStuffXYZ",
        "http://myurl.com/search/stringZ?SomeMoreStuffZZZ",
    );
    my @sorted_strings = sort_by {/search(.+)\?/;$1} @strings;
    print Dumper(\@sorted_strings);

    >perl slugger.pl
    $VAR1 = [
              'http://myurl.com/search/stringA?SomeMoreStuffA',
              'http://myurlC.com/search/stringA?SomeMoreStuffXYZ',
              'http://myurlA.com/search/stringB?SomeMoreStuffB',
              'http://myurlB.com/search/stringC?SomeMoreStuffX',
              'http://myurl.com/search/stringZ?SomeMoreStuffZZZ'
            ];

    >
    Bill
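For readers without the module installed, here is a rough core-Perl sketch of the same idea: compute each key once, sort the indices by key, and slice the original list. The helper name `sort_by_key` is hypothetical, and this is not how List::SomeUtils implements `sort_by`; it only mimics the calling style. The index tie-break is an added assumption so that items with identical keys keep their input order.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's code): sort a list by a key
# computed from $_ by the block, computing each key exactly once.
sub sort_by_key(&@) {
    my ( $key_of, @items ) = @_;

    my @keys;
    push @keys, scalar $key_of->() for @items;    # $_ is the current item

    # Sort positions rather than values; equal keys keep input order.
    my @order = sort { $keys[$a] cmp $keys[$b] or $a <=> $b } 0 .. $#items;
    return @items[@order];
}

my @strings = (
    'http://myurl.com/search/stringA?SomeMoreStuffA',
    'http://myurlA.com/search/stringB?SomeMoreStuffB',
    'http://myurlB.com/search/stringC?SomeMoreStuffX',
    'http://myurlC.com/search/stringA?SomeMoreStuffXYZ',
    'http://myurl.com/search/stringZ?SomeMoreStuffZZZ',
);

my @sorted = sort_by_key { m{/search/([^?]+)\?} ? $1 : '' } @strings;

print "$_\n" for @sorted;
```

Sorting indices instead of values is one way to keep the link between each key and its full original string without a hash, which is what the question asked for.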
Re: Sorting on identical values (updated)
by AnomalousMonk (Bishop) on Aug 08, 2019 at 20:31 UTC

    And bringing up the rear, might as well have a GRT example:

    c:\@Work\Perl\monks>perl
    use strict;
    use warnings;

    print "perl version $] \n";

    my @unsorted = qw(
        http://myurl.com/search/stringA?SomeMoreStuffA
        http://myurlA.com/search/stringB?SomeMoreStuffB
        http://myurlB.com/search/stringC?SomeMoreStuffX
        http://myurlC.com/search/stringA?SomeMoreStuffXYZ
        http://myurl.com/search/stringZ?SomeMoreStuffZZZ
    );

    my $delim = '?';

    sub decorate   { return pack 'a* a a*', m{ /search/ ([^\Q$delim\E]*) }xms, $delim, $_; }
    sub undecorate { return m{ [\Q$delim\E] (.*) }xms; }

    my @sorted =
        map undecorate(),
        sort
        # map { print("== '$_' \n"); $_; }  # for debug
        map decorate(),
        @unsorted
        ;

    print "'$_' \n" for @sorted;
    __END__
    perl version 5.008009
    'http://myurl.com/search/stringA?SomeMoreStuffA'
    'http://myurlC.com/search/stringA?SomeMoreStuffXYZ'
    'http://myurlA.com/search/stringB?SomeMoreStuffB'
    'http://myurlB.com/search/stringC?SomeMoreStuffX'
    'http://myurl.com/search/stringZ?SomeMoreStuffZZZ'

    Update:

    sub decorate   { return pack 'a* a a*', m{ /search/ ([^\Q$delim\E]*) }xms, $delim, $_; }
    The use of pack in this function is overkill. join achieves the same effect with a little bit less overhead and arguably more clarity:
        sub decorate { return join '', m{ /search/ ([^\Q$delim\E]*) }xms, $delim, $_; }
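As a quick check of the claim that the two forms are interchangeable here, the following sketch builds the decorated string both ways for one sample URL and compares them. The variable names `$url`, `$key`, `$packed` and `$joined` are introduced only for this illustration.

```perl
use strict;
use warnings;

my $delim = '?';
my $url   = 'http://myurl.com/search/stringA?SomeMoreStuffA';

# Same extraction as decorate(), but capturing into a named variable.
my ($key) = $url =~ m{ /search/ ([^\Q$delim\E]*) }xms;

# Decorated form: sort key, one delimiter character, original string.
my $packed = pack 'a* a a*', $key, $delim, $url;
my $joined = join '', $key, $delim, $url;

print "key:       $key\n";
print "decorated: $joined\n";
print "identical: ", ( $packed eq $joined ? 'yes' : 'no' ), "\n";
```

One caveat with this kind of decoration: a plain string sort on the decorated values agrees with a sort on the bare keys only if no character that can appear in a key sorts below the delimiter. With '?' as the delimiter, a key such as 'stringA1' would sort ahead of 'stringA', because '1' compares lower than '?'.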


    Give a man a fish:  <%-{-{-{-<

Re: Sorting on identical values
by slugger415 (Monk) on Aug 08, 2019 at 22:41 UTC

    Wonderful! Thank you all for the useful replies. (I now have to learn more about Schwartzian sorting... great examples.)

      The Schwartzian transform is a general technique or pattern that is applicable to much more than sorting problems.
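One possible illustration of the pattern outside sorting, offered as a sketch rather than anything from the thread: decorate each URL with its search string, filter on that precomputed value, then undecorate. Here the filter keeps only URLs whose search string occurs more than once. The names `@decorated`, `%count` and `@duplicates` are assumptions made for this example.

```perl
use strict;
use warnings;

my @urls = (
    'http://myurl.com/search/stringA?SomeMoreStuffA',
    'http://myurlA.com/search/stringB?SomeMoreStuffB',
    'http://myurlB.com/search/stringC?SomeMoreStuffX',
    'http://myurlC.com/search/stringA?SomeMoreStuffXYZ',
    'http://myurl.com/search/stringZ?SomeMoreStuffZZZ',
);

# Decorate: [ original string, extracted search string ].
my @decorated = map {
    my ($key) = m{/search/([^?]+)\?};
    [ $_, defined $key ? $key : '' ]
} @urls;

# Count how often each search string occurs.
my %count;
$count{ $_->[1] }++ for @decorated;

# Filter on the precomputed key, then undecorate.
my @duplicates = map { $_->[0] } grep { $count{ $_->[1] } > 1 } @decorated;

print "$_\n" for @duplicates;
```

The key is extracted once per URL and reused by both the counting pass and the filter, which is the same economy the transform provides inside a sort.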


      Give a man a fish:  <%-{-{-{-<

      Refer to the FAQ.
      perldoc -q "How do I sort an array by (anything)?"
      Bill

Node Type: perlquestion [id://11104187]
Approved by toolic
Front-paged by haukex