
Re: Getting impossible things right (behaviour of keys)

by blakem (Monsignor)
on Oct 24, 2001 at 13:08 UTC (#121043=note)

in reply to Getting impossible things right (behaviour of keys)

Use a custom sort in your foreach, so that the longest suffixes are tried first. Replace

foreach my $suffix (keys %sufdata)

with
foreach my $suffix (sort {length($b) <=> length($a)} keys %sufdata)
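A minimal sketch of the sorted loop in use. The contents of %sufdata and the helper name strip_suffix are hypothetical, since the original hash and surrounding code are not shown in this thread:

```perl
use strict;
use warnings;

# Hypothetical suffix table; the real %sufdata is not shown in the thread.
my %sufdata = (
    s   => '',
    es  => '',
    ies => 'y',
);

# Try the longest suffixes first so that 'ies' is tested before 'es' and 's'.
sub strip_suffix {
    my ($name) = @_;
    foreach my $suffix (sort { length($b) <=> length($a) } keys %sufdata) {
        # \Q...\E quotes any regex metacharacters in the suffix.
        if ($name =~ s/\Q$suffix\E$/$sufdata{$suffix}/) {
            return $name;
        }
    }
    return $name;    # no rule applied
}

print strip_suffix('ponies'), "\n";
print strip_suffix('boxes'),  "\n";
```

Without the sort, keys returns the suffixes in no particular order, so 's' could be tried before 'ies'.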


Re:{2} Getting impossible things right (behaviour of keys)
by jeroenes (Priest) on Oct 24, 2001 at 13:43 UTC
    Or even:
    foreach my $suffix (sort { length($b) <=> length($a) or $a cmp $b } keys %sufdata)
    to sort on length first and then alphabetically.
      Or you could exploit the fact that the regex looks for the leftmost matching string, and do something like:
      my $pattern = join('|',(sort {$a cmp $b} keys %sufdata));
      if ($name =~ s/($pattern)$/$sufdata{$1}/o) {
          print "Suffix $1 -> $name\n";
          return;
      }
      print " no rule apply -> $name\n";


        Hmm.. from perlre:
        Alternatives are tried from left to right, so the first alternative found for which the entire expression matches, is the one that is chosen. This means that alternatives are not necessarily greedy. For example: when matching `foo|foot' against "barefoot", only the "foo" part will match, as that is the first alternative tried, and it successfully matches the target string.
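The perlre example can be checked directly. This is only a small sketch of the quoted behaviour; the variable names are made up for illustration:

```perl
use strict;
use warnings;

# Unanchored: at the leftmost position where a match can start,
# the alternatives are tried in the order written, so 'foo' is chosen.
my ($short_first) = 'barefoot' =~ /(foo|foot)/;

# Listing the longer alternative first changes which one is chosen.
my ($long_first) = 'barefoot' =~ /(foot|foo)/;

print "foo|foot matched: $short_first\n";
print "foot|foo matched: $long_first\n";
```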

        So that would pick a suffix whose length is effectively random.


        Update: I see. Can you explain that regex-feature?

        If this is going to be at all robust (umm, and work as desired; sorry, Blakem), I would change the sort to the following:
        my $regex=join '|', map {substr $_,2} sort {$a cmp $b} map {pack "SA*",length($_),quotemeta($_)} keys %sufdata;
        Your code doesn't actually sort the words by length. (Yes, I _am_ deliberately storing the length before I quotemeta it.)


        Thanks to Amoe I reexamined this and realized I had missed an opportunity for laziness, that geeky virtue:

        my $regex=join '|', map {substr $_,2} sort map {pack "SA*",length($_),quotemeta($_)} keys %sufdata;
        Although IIRC perl will optimize the first into the second anyway, it does save about 10 chars or so.
        Oh, also, for the curious: this is a more modern form of the Schwartzian Transform, which is a very cool trick. Unfortunately I can't remember the name of this version, nor the link to the excellent document I read about it. Hopefully someone who does will post a reply.

        Tilly kindly supplied the link (see replies to this post). However the name I had in mind is the GRT or Guttman Rosler Transform.
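For readers who have not met the GRT before, here is a minimal sketch of the pack, sort, strip pattern. It is not the code from the post above: the sample list is hypothetical, and it assumes a 4-byte big-endian 'N' key (rather than 'S') with the length complemented so that longer strings sort first.

```perl
use strict;
use warnings;

# Hypothetical sample data for illustration.
my @suffixes = qw(s ies es ness);

# Guttman-Rosler Transform:
#   1. pack the sort key in front of each item,
#   2. sort the packed strings with the default string comparison,
#   3. strip the key off again.
# 'N' is a 32-bit big-endian integer, so byte-wise string order agrees
# with numeric order. Subtracting the length from 0xFFFFFFFF makes
# longer strings come first.
my @sorted =
    map  { substr $_, 4 }
    sort
    map  { pack 'N a*', 0xFFFFFFFF - length($_), $_ }
    @suffixes;

print "@sorted\n";
```

Because the original string follows the key, items of equal length fall back to plain string order without any extra comparison code.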

        DeMerphq / Yves
        Have you registered your Name Space?

        I don't believe sorting on cmp alone will do it in this case. If you had two keys 'ba' and 'cba', then you're going to get the wrong behavior. As long as you're tying the regex to the end of the string, length should be sufficient.

        Dr. Michael K. Neylon - || "You've left the lens cap of your mind on again, Pinky" - The Brain
        "I can see my house from here!"
        It's not what you know, but knowing how to find it if you don't know that's important

        I feel ashamed...

        but I don't understand this code. I would like to understand it before I test whether it works or not. (Call me a theory fetish monk.)

        First, I don't understand why you use cmp for the sort.
        Then (or because of this) I don't see how we can be sure that the longest suffixes are examined first.

        I do understand what ($name =~ s/($pattern)$/$sufdata{$1}/o) does. And in fact it is elegant to use a regexp instead of one's own iteration; it is just the $pattern creation that isn't clear.
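        One way to explore the question is to build the alternation in both orders and compare the results. This is only a sketch: the two-key %sufdata (borrowing the 'ba' / 'cba' example from earlier in the thread) and the helper matched_suffix are hypothetical. Because the pattern is anchored with $, the engine tries start positions from left to right, and the leftmost start position that can reach the end of the string belongs to the longest matching suffix.

```perl
use strict;
use warnings;

# Hypothetical suffix table for illustration only.
my %sufdata = (ba => 'X', cba => 'Y');

# Build the same alternation in two different orders.
my $short_first = join '|', map { quotemeta($_) }
                  sort { length($a) <=> length($b) } keys %sufdata;
my $long_first  = join '|', map { quotemeta($_) }
                  sort { length($b) <=> length($a) } keys %sufdata;

# Return the suffix that the anchored alternation captures, or undef.
sub matched_suffix {
    my ($pattern, $name) = @_;
    return $name =~ /($pattern)$/ ? $1 : undef;
}

print "$short_first on xcba: ", matched_suffix($short_first, 'xcba'), "\n";
print "$long_first on xcba: ",  matched_suffix($long_first,  'xcba'), "\n";
```

        Both calls capture 'cba' here, since the match starting at 'c' is found before the engine ever tries a match starting at 'b'.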

