Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid
 
PerlMonks  

Re: Challenge: 8 Letters, Most Words (2of12int.txt in 20^H^H 10 secs; Pure Perl)

by BrowserUk (Pope)
on Oct 09, 2013 at 06:25 UTC ( #1057488=note: print w/ replies, xml ) Need Help??


in reply to Challenge: 8 Letters, Most Words

Exhaustive search should work for any dictionary. (The 10 secs doesn't include the sort, but it could use a high watermark.):

#! perl -slw use strict; use Time::HiRes qw[ time ]; sub uniq{ my %x; @x{@_} = (); keys %x } my $start = time; my @lookup = map{ my $bits = pack 'C', $_; [ grep vec( $bits, $_, 1 ), 0 .. 7 ] } 0 .. 255; my %dict; while( <> ) { chomp; next if length > 8; my $r = \%dict; $r = $r->{ $_ } //= {} for sort split '', $_; push @{ $r->{_} }, $_; } my %stats; sub X { my( $first, $soFar, $tree ) = @_; if( @$soFar == 8 ) { my @words = uniq map { my $r = \%dict; $r = $r->{ $_ } for @{ $soFar }[ @{ $lookup[ $_ ] } ]; exists $r->{_} ? @{ $r->{_} } : (); } 0 .. 255; $stats{ join '', @$soFar } = \@words; return; } for( $first .. 'z' ) { next unless exists $tree->{ $_ }; X( $_, [ @$soFar, $_ ], $tree->{ $_ } ); } return; } X( 'a', [], \%dict ); print time - $start; <STDIN>; printf "[%d] %s : @{[ @{$stats{ $_ }} ]}\n", scalar @{$stats{$_}}, $_ +for sort{ @{ $stats{ $b } } <=> @{ $stats{ $a } } } keys %stats; __END__ [ 7:09:30.87] C:\test>1056884-calc 2of12inf.txt 10.4807748794556 [346] aeinprst : nape tars pets ape panties taper inept nastier print +airs spinet sniper pane rants pint peat spear terns tears trains reps earn tines paints napes ate strep rest repaint ti inapt ri +nse ripest astir priest neat part nets pans retina apes tarps reaps nip rain tern res sip sprint instep sear inters in prints +ires erst parent snip re astern satire praise stein pirates pirate pertains tan tar pie par tis nite ant apter pa ants rites ranis + rite eta as pastier rents nit ties pastern rate sapient strip pants pits apt pent snipe niters ran pate tsar stain pints panie +r spirea rapines tapers repast irate pars span pairs tens raps ripen stair tarn tans retsina tin rips pies inert eats reins ear +esprit aster rapes pin tarsi sera neaps rani step rein resin inter piastre stria painters sprain neap tripes tires raise tier en st +ern parents pas ares sari pert satin sine nitres ripens tape prates striae retain arts pet ani sat tare pertain insert reap pit pra +te pea trap antsier pantie sterna tries rasp east pen spit sit arise naps tire tea pares pat snare stir spar pine rap tares is pa +n rant earns pant art it painter sate pear tips pens rep pitas tarp penis aspen antis teas nips tap aspire piaster traps past pier er +a ears asp sane star seat entrap rip snit siren sin nae pins spare tapes stare niter spate sitar rent prise taps anti trip trips ti +ns spire ens tarns spent traipse antes tripe rise tapir tip pest peas nest sent its sir pries retains anise rat pates pare rats tr +ain saner pita eat nitre spat parse tapirs tiers ten risen nears sap spite set stripe sprite paniers sire strap tie sea septa pis + site psi eras repaints an pain paste ripe ire net entraps nits pair apse pains rates rapt per saint near paint rains spin strain + etas snap pines rapist rape are rapine sepia pantries spine peats tear at parties pi ante sprat retinas inset spa arisen ins parts + nites tine piers panes pats nap pears air [344] aeilprst : liars tars platies pets ape taper lairs airs lire lit +er ails peat spear plies palest tears lets lest reps ate strep rest lips leis slate ti ripest astir lie priest seal part apes t +arps piles reaps trial plate res sip paler litres sear retail ires erst rails salt plates re satire praise pirates pirate lea + liras tar pie retails pleas par pleats tis apter pa realist pelt rites peril alters rite relit stale eta earls as litre pa +stier lies trails ties rate strip pits apt petal pate list tsar spirea tapers petals repast irate pars plea pairs pleat stap +ler raps stair saltier lit pales tali alert rips spilt pies eats laser ear peal slit esprit aster rapes tarsi sera ales slip +alit spelt step alter piastre silt stria tripes tires raise liar tier leapt tiler teals pas ares sari sale pert tape prates +pal striae riles sail alp plats arts pet plait sat tare reap pit late pails steal prate least pea tiles trap tries slier rasp +sepal pastel east spit sit arise alps tire tea pares pat rial stir spar laps later rap tares is la rials ilea leas staler alert +s art it last sate ale aisle pear tips leaps rep pitas tarp trail lept tile stile teas tap aspire piaster teal traps past tai +l pier split era ears asp star seat pile rip spare rile tapes stare spate sitar pelts triples las prise islet lisper lepta tap +s plaster staple trip pail spiel lair pilaster trips slat pearl spire lip traipse tripe rise tapir lei pale tip slept pest peas +tilers its sir lite pries rat pearls serial rail pates pare rats lapse tails pita eat slap spat pals parse plaits tapirs lisp + tiers lap pliers salter tales sap spite set real ail stripe sprite sire strap splat tie sea septa pis site psi eras plat pa +ste ripe ire let pair apse leap rates rapt per triple etas perils rapist rape liters lira are trials spiral sepia peats tear + at parties pi earl peals sprat spa isle parts tale piers pats pears air ...

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Comment on Re: Challenge: 8 Letters, Most Words (2of12int.txt in 20^H^H 10 secs; Pure Perl)
Download Code
Re^2: Challenge: 8 Letters, Most Words (2of12int.txt in 20^H^H 10 secs; Pure Perl)
by hdb (Prior) on Oct 09, 2013 at 09:40 UTC

    It is quite impressive. I have not yet understood how it works. It seems not to work on the following dictionary:

    exactest one two three four five six seven eight nine ten

    where one solution is eehnortw covering four words but your script says [1] aceesttx : exactest.

    Update: After further study it looks to me, that you are checking all 8 letter classes derived from the 8 letter words in the dictionary and see how many words are captured. Which is probably giving you the correct solution for many real world dictionaries. This way, you do avoid the combinatorial explosion that makes this challenge difficult...

    Still quite impressive!

        Even if it is not guaranteed to deliver the correct solution, it is so much faster than my attempt that also gives the same answer. My brute force appraoch already runs for more than 24 hours...

      It does recurse through all 8-letter combos, but short-circuits at each level if the combination so far cannot be derived from the dictionary supplied.

      next unless exists $tree->{ $_ };

      If the letter combination for your 'eehnortw' existed in the dictionary, it would be found very quickly:

      [17:25:23.81] C:\test>type hdb.dict exactest one two three four five six seven eight nine ten torewhen [17:25:47.91] C:\test>1056884-calc hdb.dict 0.00380706787109375 [5] eehnortw : torewhen three one two ten [1] aceesttx : exactest

      Comment out the short-circuit and it will consider all 8-letter possibilities .. and run more slowly.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

      There are 25 letter sets that will cover 4 words of your sample dictionary:

      [19:35:10.42] C:\test>1056884-calc hdb.dict 2aaaaenot 3aaaenotw 4aeinnotw 373.12408208847 [4] einnootw : one two nine ten [4] efinnotv : five one nine ten [4] einnotvw : one two nine ten [4] eenostvw : one two seven ten [4] einnotuw : one two nine ten [4] einnostw : one two nine ten [4] eghinnot : one eight nine ten [4] ceinnotw : one two nine ten [4] eehnortw : three one two ten [4] efinnotw : one two nine ten [4] efinotvw : five one two ten [4] eiinnotw : one two nine ten [4] eeinnotw : one two nine ten [4] eghinotw : one two eight ten [4] ehinnotw : one two nine ten [4] einnortw : one two nine ten [4] einnotwx : one two nine ten [4] einostwx : six one two ten [4] einnnotw : one two nine ten [4] eginnotw : one two nine ten [4] efnortuw : one two four ten [4] einnotww : one two nine ten [4] aeinnotw : one two nine ten [4] einnottw : one two nine ten [4] einnostx : six one nine ten [3] eenostwx : one two ten ...

      375 seconds isn't as impressive, but better than 24 hours :)

      This makes use of another optimisation that only benefits when the dictionary is small like yours:

      #! perl -slw use strict; use Time::HiRes qw[ time ]; $|++; sub uniq{ my %x; @x{@_} = (); keys %x } my $start = time; my @lookup = map{ my $bits = pack 'C', $_; [ grep vec( $bits, $_, 1 ), 0 .. 7 ] } 0 .. 255; my %dict; my %alphabet; while( <> ) { chomp; next if length > 8; my $r = \%dict; undef( $alphabet{ $_ } ), $r = $r->{ $_ } //= {} for sort split '' +, $_; push @{ $r->{_} }, $_; } my @alphabet = sort keys %alphabet; my $best = [ 0, '' ]; my %stats; sub X { my( $first, $soFar, $tree ) = @_; if( @$soFar == 8 ) { my @words = uniq map { my $r = \%dict; $r = $r->{ $_ } for @{ $soFar }[ @{ $lookup[ $_ ] } ]; exists $r->{_} ? @{ $r->{_} } : (); } 0 .. 255; return unless @words > 1; print @{ $best = [ scalar @words, join '', @$soFar ] } if @wor +ds > $best->[0]; $stats{ join '', @$soFar } = \@words; return; } for( grep $_ ge $first, @alphabet ) { # next unless exists $tree->{ $_ }; X( $_, [ @$soFar, $_ ], $tree->{ $_ } ); } return; } X( 'a', [], \%dict ); print time - $start; <STDIN>; printf "[%d] %s : @{[ @{$stats{ $_ }} ]}\n", scalar @{$stats{$_}}, $_ +for sort{ @{ $stats{ $b } } <=> @{ $stats{ $a } } } keys %stats;

      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
      @$soFar, $_

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1057488]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others scrutinizing the Monastery: (7)
As of 2014-09-21 20:16 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (175 votes), past polls