
Re: Re: Re: Perl's pearls

by petral (Curate)
on Jan 02, 2002 at 20:17 UTC (#135696)


in reply to Re: Re: Perl's pearls
in thread Perl's pearls

It seems like the main improvement/optimization would be not looping twice through the list of all words.  Move *all* processing into the main loop:

my (%word, %gram);
while (<>) {
    chomp;
    # $_ = lc $_;
    /[^a-z]/ and next;
    my $sig = pack "C*", sort unpack "C*", $_;
    if (exists $word{$sig}) {
        if (exists $gram{$sig}) {
            next if $gram{$sig} =~ /\b$_\b/;
            $gram{$sig} .= " $_";              # rare
        } else {
            next if $word{$sig} eq $_;
            $gram{$sig} = "$word{$sig} $_";    # rare
        }
    } else {
        $word{$sig} = $_;                      # mostly
    }
}
print join "\n", (sort values %gram), '';      # just output short list

Only the first word of an anagram set is in both lists.
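To see the single-pass idea on a tiny input, here is a rough sketch. The names signature and anagram_sets are made up for illustration and are not part of the script above; the sketch also builds the key with join/sort/split rather than pack/unpack, and takes a word list as arguments instead of reading <>. The logic is meant to be the same: one hash remembers the first word for each signature, the other only grows when a second distinct word turns up.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): canonical key for a word.
# Two words are anagrams exactly when their sorted letters are identical.
sub signature {
    my ($word) = @_;
    return join '', sort split //, $word;
}

# Hypothetical wrapper around the same single-pass logic, taking a list
# of words so it can be tried on a small sample.
sub anagram_sets {
    my (%word, %gram);
    for my $w (@_) {
        next if $w =~ /[^a-z]/;                  # skip anything but lowercase a-z
        my $sig = signature($w);
        if (!exists $word{$sig}) {
            $word{$sig} = $w;                    # first word seen with this key
        }
        elsif (!exists $gram{$sig}) {
            next if $word{$sig} eq $w;           # repeat of the first word
            $gram{$sig} = "$word{$sig} $w";      # second distinct word starts a set
        }
        else {
            next if $gram{$sig} =~ /\b\Q$w\E\b/; # already in the set
            $gram{$sig} .= " $w";
        }
    }
    my @sets = sort values %gram;
    return @sets;
}

print "$_\n" for anagram_sets(qw(canoe opt ocean pot zebra top opt));
```

With that sample, "zebra" only ever lands in %word, so it never reaches the output, while "canoe" and "opt" end up in both hashes as the first words of their sets.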
Here are some more finds, mostly from the short OED word list linked in the original post:
ablest bleats stable tables
adroitly dilatory idolatry
angered derange enraged grandee grenade
ascertain cartesian sectarian
asleep elapse please
aspirant partisan
attentive tentative
auctioned cautioned education
canoe ocean
comedian demoniac
compile polemic
covert vector
danger gander garden
deist diets edits idest sited tides
emits items metis mites smite times
emitter termite
lapse leaps pales peals pleas
nastily saintly
obscurantist subtractions
observe obverse verbose
opt pot top
opts post pots spot stop tops
opus soup
oy yo
petrography typographer
peripatetic precipitate
present repents serpent
presume supreme
resin rinse risen siren
salivated validates
slitting stilting tiltings titlings tlingits
views wives
vowels wolves
woodlark workload


  p


Re: Re: Re: Re: Perl's pearls
by gmax (Abbot) on Jan 02, 2002 at 20:55 UTC
    Brilliant! On my computer, your script is 13% faster than mine, using my 100_000-word list. With the one that you suggested (thanks, BTW), which is more than twice as large, the gain is 23%!
    It means that your solution is more scalable and thus better suited to this kind of task.
    Like every "eureka" solution, your improvement looks quite simple, now that I see it! :-)
    Thanks.
     _  _ _  _  
    (_|| | |(_|><
     _|   
    
