
Re: Re: Re:{2} Getting impossible things right (behaviour of keys)

by demerphq (Chancellor)
on Oct 24, 2001 at 14:45 UTC (#121071)


in reply to Re: Re:{2} Getting impossible things right (behaviour of keys)
in thread Getting impossible things right (behaviour of keys)

If this is going to be at all robust (umm and work as desired, sorry Blakem) I would change the sort to the following:

my $regex=join '|', map {substr $_,2} sort {$a cmp $b} map {pack "SA*",length($_),quotemeta($_)} keys %sufdata;
Your code doesn't actually sort the words by length. (Yes, I _am_ deliberately storing the length before I quotemeta it.)

:-)

Update
Thanks to Amoe I reexamined this and realized I missed an opportunity for laziness, that geeky virtue:

my $regex=join '|', map {substr $_,2} sort map {pack "SA*",length($_),quotemeta($_)} keys %sufdata;
Although IIRC perl will optimize the first into the second anyway, it does save about 10 chars or so.
Oh, also for the curious: this is a more modern form of the Schwartzian Transform, which is a very cool trick. Unfortunately I can't remember the name of this version, nor the link to the excellent document I read about it. Hopefully someone who does will post a reply.

Update2
Tilly kindly supplied the link (see replies to this post). However the name I had in mind is the GRT or Guttman Rosler Transform.
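
For readers who want to see the moving parts, here is a minimal sketch of the packed sort from the one-liner above, unrolled into its three steps. The contents of %sufdata are hypothetical stand-ins, since the thread never shows them. The sketch also swaps the "S" template for "n", which is an assumption on my part: "S" is a native-endian short, so on a little-endian machine its bytes would probably stop sorting in numeric order once a key reaches 256 characters, whereas "n" is always big-endian.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the thread's hash; only the keys matter here.
my %sufdata = map { $_ => 1 } qw(cba a dcba ba a.b);

# The transform in three steps, read bottom to top:
#   1. pack each key into a string whose first two bytes hold its length
#      (the length of the raw key, taken before quotemeta adds backslashes),
#   2. sort those strings with the default string comparison,
#   3. strip the two-byte prefix to recover the quoted key.
# "n" is a big-endian 16-bit integer, so byte order follows numeric order.
my @by_length =
    map { substr $_, 2 }
    sort
    map { pack 'nA*', length($_), quotemeta($_) }
    keys %sufdata;

my $regex = join '|', @by_length;
print "$regex\n";   # prints: a|ba|a\.b|cba|dcba
```

Note that this sorts shortest first, exactly as the one-liner does; wrapping the sort in reverse would give longest first if that is what the alternation needs.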

DeMerphq / Yves
--
Have you registered your Name Space?


Re (tilly) 6: Getting impossible things right (behaviour of keys)
by tilly (Archbishop) on Oct 24, 2001 at 15:34 UTC
    I think the phrase you want is, "packed default".

    It is discussed in this paper on efficient sorting in Perl.

Re: Re: Re: Re:{2} Getting impossible things right (behaviour of keys)
by blakem (Monsignor) on Oct 24, 2001 at 21:35 UTC
    Ah, but you don't *need* to sort by length... the regex is anchored at the end, so the pattern that matches first from left to right will *already be* the longest match. For instance, look at the following code:
    #!/usr/bin/perl -wT
    use strict;
    my $text = 'fedcba';
    $text =~ (/(a|ba|cba|dcba)$/);
    print "Match: $1\n";

    =OUTPUT
    Match: dcba
    It matches the longest one, even though it's at the end of the alternation... that's because the regex engine tries start positions in the string from left to right, and the first position that yields a match wins. That was the whole point of my post; sorry I wasn't more explicit.
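
To make the role of the anchor concrete, the sketch below runs the same kind of alternation in both orders, first with the trailing $ and then without it. The variable names are only illustrative, and the unanchored patterns are an added contrast that does not appear in the thread.

```perl
use strict;
use warnings;

my $text = 'fedcba';

# Anchored at the end: every alternative has to finish on the last character,
# so the leftmost start position that works also gives the longest match,
# whatever order the alternatives are written in.
my ($anchored_short_first) = $text =~ /(a|ba|cba|dcba)$/;
my ($anchored_long_first)  = $text =~ /(dcba|cba|ba|a)$/;

# Not anchored: all alternatives can start at position 0, so the first one
# listed that matches there wins, and the order starts to matter.
my ($unanchored_short_first) = $text =~ /(f|fe|fed|fedc)/;
my ($unanchored_long_first)  = $text =~ /(fedc|fed|fe|f)/;

print "anchored, short first:   $anchored_short_first\n";    # dcba
print "anchored, long first:    $anchored_long_first\n";     # dcba
print "unanchored, short first: $unanchored_short_first\n";  # f
print "unanchored, long first:  $unanchored_long_first\n";   # fedc
```

So sorting by length only earns its keep when the alternatives share a starting point and nothing after the group forces the longer ones to be tried.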

    -Blake

      Yes I see it now. Leftmost longest. I omitted the implication of the $. I should have caught the hint.

      Good one. :-)

      Yves / DeMerphq
      --
      Have you registered your Name Space?

        Well, it was my last post of the night, and I skimped on the explanation so I could get some sleep ;-P

        This thread illustrates why I try to post complete, self-contained scripts. (i.e. sample input via __DATA__, printed output samples, etc) I think much of the confusion could have been avoided, if the original script had a set of expected inputs and outputs.

        -Blake
