Welcome to the Monastery PerlMonks

### Re^2: Challenge: 8 Letters, Most Words (2of12int.txt in 20^H^H 10 secs; Pure Perl)

by hdb (Monsignor)
 on Oct 09, 2013 at 09:40 UTC ( #1057521=note: print w/replies, xml ) Need Help??

It is quite impressive. I have not yet understood how it works. It seems not to work on the following dictionary:

```exactest
one
two
three
four
five
six
seven
eight
nine
ten

where one solution is eehnortw covering four words but your script says [1] aceesttx : exactest.

Update: After further study it looks to me, that you are checking all 8 letter classes derived from the 8 letter words in the dictionary and see how many words are captured. Which is probably giving you the correct solution for many real world dictionaries. This way, you do avoid the combinatorial explosion that makes this challenge difficult...

Still quite impressive!

Replies are listed 'Best First'.
Re^3: Challenge: 8 Letters, Most Words (2of12int.txt in 20^H^H 10 secs; Pure Perl)
by BrowserUk (Pope) on Oct 09, 2013 at 16:37 UTC

It does recurse through all 8-letter combos, but short-circuits at each level if the combination so far cannot be derived from the dictionary supplied.

```        next unless exists \$tree->{ \$_ };

If the letter combination for your 'eehnortw' existed in the dictionary, it would be found very quickly:

```[17:25:23.81] C:\test>type hdb.dict
exactest
one
two
three
four
five
six
seven
eight
nine
ten
torewhen

[17:25:47.91] C:\test>1056884-calc hdb.dict
0.00380706787109375

[5] eehnortw : torewhen three one two ten
[1] aceesttx : exactest

Comment out the short-circuit and it will consider all 8-letter possibilities .. and run more slowly.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^3: Challenge: 8 Letters, Most Words (2of12int.txt in 20^H^H 10 secs; Pure Perl)
by BrowserUk (Pope) on Oct 09, 2013 at 18:47 UTC

There are 25 letter sets that will cover 4 words of your sample dictionary:

```[19:35:10.42] C:\test>1056884-calc hdb.dict
2aaaaenot
3aaaenotw
4aeinnotw
373.12408208847

[4] einnootw : one two nine ten
[4] efinnotv : five one nine ten
[4] einnotvw : one two nine ten
[4] eenostvw : one two seven ten
[4] einnotuw : one two nine ten
[4] einnostw : one two nine ten
[4] eghinnot : one eight nine ten
[4] ceinnotw : one two nine ten
[4] eehnortw : three one two ten
[4] efinnotw : one two nine ten
[4] efinotvw : five one two ten
[4] eiinnotw : one two nine ten
[4] eeinnotw : one two nine ten
[4] eghinotw : one two eight ten
[4] ehinnotw : one two nine ten
[4] einnortw : one two nine ten
[4] einnotwx : one two nine ten
[4] einostwx : six one two ten
[4] einnnotw : one two nine ten
[4] eginnotw : one two nine ten
[4] efnortuw : one two four ten
[4] einnotww : one two nine ten
[4] aeinnotw : one two nine ten
[4] einnottw : one two nine ten
[4] einnostx : six one nine ten
[3] eenostwx : one two ten
...

375 seconds isn't as impressive, but better than 24 hours :)

This makes use of another optimisation that only benefits when the dictionary is small like yours:

```#! perl -slw
use strict;
use Time::HiRes qw[ time ];

\$|++;

sub uniq{ my %x; @x{@_} = (); keys %x }

my \$start = time;

my @lookup = map{
my \$bits = pack 'C', \$_;
[ grep vec( \$bits, \$_, 1 ), 0 .. 7 ]
} 0 .. 255;

my %dict;
my %alphabet;
while( <> ) {
chomp;
next if length > 8;
my \$r = \%dict;
undef( \$alphabet{ \$_ } ), \$r = \$r->{ \$_ } //= {} for sort split ''
+, \$_;
push @{ \$r->{_} }, \$_;
}
my @alphabet = sort keys %alphabet;

my \$best = [ 0, '' ];
my %stats;
sub X {
my( \$first, \$soFar, \$tree ) = @_;
if( @\$soFar == 8 ) {
my @words = uniq map {
my \$r = \%dict;
\$r = \$r->{ \$_ } for @{ \$soFar }[ @{ \$lookup[ \$_ ] } ];
exists \$r->{_} ? @{ \$r->{_} } : ();
} 0 .. 255;
return unless @words > 1;
print @{ \$best = [ scalar @words, join '', @\$soFar ] } if @wor
+ds > \$best->[0];
\$stats{ join '', @\$soFar } = \@words;
return;
}

for( grep \$_ ge \$first, @alphabet ) {
#        next unless exists \$tree->{ \$_ };
X( \$_, [ @\$soFar, \$_ ], \$tree->{ \$_ } );
}
return;
}

X( 'a', [], \%dict );

print time - \$start; <STDIN>;
printf "[%d] %s : @{[ @{\$stats{ \$_ }} ]}\n", scalar @{\$stats{\$_}}, \$_
+for
sort{ @{ \$stats{ \$b } } <=> @{ \$stats{ \$a } } } keys %stats;

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
@\$soFar, \$_
Re^3: Challenge: 8 Letters, Most Words (2of12int.txt in 20^H^H 10 secs; Pure Perl)
by LanX (Chancellor) on Oct 09, 2013 at 12:20 UTC

Even if it is not guaranteed to deliver the correct solution, it is so much faster than my attempt that also gives the same answer. My brute force appraoch already runs for more than 24 hours...

> My brute force appraoch already runs for more than 24 hours...

to calculate the dictionary word with the highest coverage??? o_O

Cheers Rolf

( addicted to the Perl Programming Language)

Create A New User
Node Status?
node history
Node Type: note [id://1057521]
help
Chatterbox?
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others contemplating the Monastery: (5)
As of 2017-08-21 05:51 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
Who is your favorite scientist and why?

Results (317 votes). Check out past polls.

Notices?