Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Re^10: Challenge: 8 Letters, Most Words

by McA (Curate)
on Oct 05, 2013 at 06:30 UTC ( #1057008=note: print w/ replies, xml ) Need Help??


in reply to Re^9: Challenge: 8 Letters, Most Words
in thread Challenge: 8 Letters, Most Words

Good morning all.

I suspect L~R will be right with his assumption concerning the runtime. I do have to ask my colleagues if they mind when they loose one processor on the development machine the next week... :-)

Therefor I show my solution to receive the critics I earn.

Have a nice day
McA

#!/usr/bin/perl use strict; use warnings; use Data::Dumper; use 5.010; $| = 1; my %words; my %sorted; my %alphabet; while(defined(my $line = <>)) { chomp $line; next if $line =~ /%$/; # ignore entries with '%' at the end my $slot = length $line; $line = lc $line; next if $slot > 8; next if exists $words{$line}; $words{$line} = 1; my @chars = sort split //, $line; %alphabet = (%alphabet, map { $_ => 1 } @chars); my $characters = join('', @chars); if(defined $sorted{$characters}) { push @{$sorted{$characters}}, $line; } else { $sorted{$characters} = [$line]; } } my @sorted = keys %sorted; say "Base: " . scalar @sorted . " unique words"; my @alphabet = sort keys %alphabet; my $word; my $count = @alphabet; my $permutations = 0; my %found; my $max_found = 0; foreach (my $pos1 = 0; $pos1 < $count; $pos1++) { foreach (my $pos2 = 0; $pos2 < $count; $pos2++) { next if $pos2 < $pos1; foreach (my $pos3 = 0; $pos3 < $count; $pos3++) { next if $pos3 < $pos2; foreach (my $pos4 = 0; $pos4 < $count; $pos4++) { next if $pos4 < $pos3; foreach (my $pos5 = 0; $pos5 < $count; $pos5++) { next if $pos5 < $pos4; foreach (my $pos6 = 0; $pos6 < $count; $pos6++) { next if $pos6 < $pos5; foreach (my $pos7 = 0; $pos7 < $count; $pos7++ +) { next if $pos7 < $pos6; foreach (my $pos8 = 0; $pos8 < $count; $po +s8++) { next if $pos8 < $pos7; # Check what can be produced by this c +ombination $permutations++; say $permutations if $permutations % 1 +000 == 0; my %source; $source{$_}++ for(@alphabet[$pos1, $po +s2, $pos3, $pos4, $pos5, $pos6, $pos7, $pos8]); #say "================================ +==========="; #say "Source: ". Dumper(\%source); my @last; my $source; INNER: foreach my $word (@sorted) { #say "Word: $word"; my %source_copy = (%source); for(my $i = 0; $i < length $word; +$i++) { my $c = substr($word, $i, 1); next INNER unless $source_copy +{$c}; $source_copy{$c}--; } $source = join '', @alphabet[$pos1 +, $pos2, $pos3, $pos4, $pos5, $pos6, $pos7, $pos8]; #say join(', ', @{$sorted{$word}}) + . "' can be produced by '$source'"; push @last, @{$sorted{$word}}; } # something found which can be produce +d if(@last) { if(@last > $max_found) { %found = (); $found{$source} = [@last]; $max_found = @last; } elsif (@last == $max_found) { $found{$source} = [@last]; } } } } } } } } } } say "Found. $permutations out of $count unique characters"; say Dumper(\%found);


Comment on Re^10: Challenge: 8 Letters, Most Words
Download Code
Re^11: Challenge: 8 Letters, Most Words
by Limbic~Region (Chancellor) on Oct 05, 2013 at 13:23 UTC
    McA,
    My brain is mush so unfortunately, this critique is just the things that jumped out at me.
    %alphabet = (%alphabet, map { $_ => 1 } @chars);
    Would be better written as:
    @alphabet{@chars} = (); # No need to set values to 1 as you only ever +use keys
    I am not sure why you avoid autovivication but
    if(defined $sorted{$characters}) { push @{$sorted{$characters}}, $line; } else { $sorted{$characters} = [$line]; }
    Would work just as well as:
    push @{$sorted{$characters}}, $line;
    You waste a lot of time going through loops that you abort early
    foreach (my $pos2 = 0; $pos2 < $count; $pos2++) { next if $pos2 < $pos1;
    Would work a lot more efficiently as:
    for (my $pos2 = $pos1; $pos2 < $count; $pos2++) {
    I haven't tested it but the way you determine if one is a subset of the other would probably be better as subtracting one hash from another.

    Cheers - L~R

      Hi L~R,

      you're right with all. It is the result of putting code together when it's too late. I let the program run with Devel::NYTProf on a small subset of the dict. The result is what I assumed: Answering the question if a combination of letters is able to produce a word is the most expensive task. I tried to make it faster with a C-like implementation in perl, but it was slower than the hash-approach.

      The program is still running. Currently at iteration 879000. So, a long time to go... ;-)

      Anyway, a very very intersting puzzle. And it was fun looking at the different approaches and seeing that some of the "old" monks took that challenge.

      Best regards
      McA

        Re^12: Challenge: 8 Letters, Most Words

        ^12 just seems wrong somehow :)

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1057008]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others lurking in the Monastery: (12)
As of 2014-07-31 22:03 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (253 votes), past polls