Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Re^9: Challenge: 8 Letters, Most Words

by Limbic~Region (Chancellor)
on Oct 05, 2013 at 03:24 UTC ( #1056993=note: print w/ replies, xml ) Need Help??


in reply to Re^8: Challenge: 8 Letters, Most Words
in thread Challenge: 8 Letters, Most Words

McA,
You may want to look at my solution as it will likely be the heat death of the universe before your code finishes if you wrote it in Perl. I too considered divide and conquer approach but abandoned it for simplicity and the fact that I can do millions of iterations per second in C.

Cheers - L~R


Comment on Re^9: Challenge: 8 Letters, Most Words
Re^10: Challenge: 8 Letters, Most Words
by McA (Priest) on Oct 05, 2013 at 06:30 UTC

    Good morning all.

    I suspect L~R will be right with his assumption concerning the runtime. I do have to ask my colleagues if they mind when they loose one processor on the development machine the next week... :-)

    Therefor I show my solution to receive the critics I earn.

    Have a nice day
    McA

    #!/usr/bin/perl use strict; use warnings; use Data::Dumper; use 5.010; $| = 1; my %words; my %sorted; my %alphabet; while(defined(my $line = <>)) { chomp $line; next if $line =~ /%$/; # ignore entries with '%' at the end my $slot = length $line; $line = lc $line; next if $slot > 8; next if exists $words{$line}; $words{$line} = 1; my @chars = sort split //, $line; %alphabet = (%alphabet, map { $_ => 1 } @chars); my $characters = join('', @chars); if(defined $sorted{$characters}) { push @{$sorted{$characters}}, $line; } else { $sorted{$characters} = [$line]; } } my @sorted = keys %sorted; say "Base: " . scalar @sorted . " unique words"; my @alphabet = sort keys %alphabet; my $word; my $count = @alphabet; my $permutations = 0; my %found; my $max_found = 0; foreach (my $pos1 = 0; $pos1 < $count; $pos1++) { foreach (my $pos2 = 0; $pos2 < $count; $pos2++) { next if $pos2 < $pos1; foreach (my $pos3 = 0; $pos3 < $count; $pos3++) { next if $pos3 < $pos2; foreach (my $pos4 = 0; $pos4 < $count; $pos4++) { next if $pos4 < $pos3; foreach (my $pos5 = 0; $pos5 < $count; $pos5++) { next if $pos5 < $pos4; foreach (my $pos6 = 0; $pos6 < $count; $pos6++) { next if $pos6 < $pos5; foreach (my $pos7 = 0; $pos7 < $count; $pos7++ +) { next if $pos7 < $pos6; foreach (my $pos8 = 0; $pos8 < $count; $po +s8++) { next if $pos8 < $pos7; # Check what can be produced by this c +ombination $permutations++; say $permutations if $permutations % 1 +000 == 0; my %source; $source{$_}++ for(@alphabet[$pos1, $po +s2, $pos3, $pos4, $pos5, $pos6, $pos7, $pos8]); #say "================================ +==========="; #say "Source: ". Dumper(\%source); my @last; my $source; INNER: foreach my $word (@sorted) { #say "Word: $word"; my %source_copy = (%source); for(my $i = 0; $i < length $word; +$i++) { my $c = substr($word, $i, 1); next INNER unless $source_copy +{$c}; $source_copy{$c}--; } $source = join '', @alphabet[$pos1 +, $pos2, $pos3, $pos4, $pos5, $pos6, $pos7, $pos8]; #say join(', ', @{$sorted{$word}}) + . "' can be produced by '$source'"; push @last, @{$sorted{$word}}; } # something found which can be produce +d if(@last) { if(@last > $max_found) { %found = (); $found{$source} = [@last]; $max_found = @last; } elsif (@last == $max_found) { $found{$source} = [@last]; } } } } } } } } } } say "Found. $permutations out of $count unique characters"; say Dumper(\%found);
      McA,
      My brain is mush so unfortunately, this critique is just the things that jumped out at me.
      %alphabet = (%alphabet, map { $_ => 1 } @chars);
      Would be better written as:
      @alphabet{@chars} = (); # No need to set values to 1 as you only ever +use keys
      I am not sure why you avoid autovivication but
      if(defined $sorted{$characters}) { push @{$sorted{$characters}}, $line; } else { $sorted{$characters} = [$line]; }
      Would work just as well as:
      push @{$sorted{$characters}}, $line;
      You waste a lot of time going through loops that you abort early
      foreach (my $pos2 = 0; $pos2 < $count; $pos2++) { next if $pos2 < $pos1;
      Would work a lot more efficiently as:
      for (my $pos2 = $pos1; $pos2 < $count; $pos2++) {
      I haven't tested it but the way you determine if one is a subset of the other would probably be better as subtracting one hash from another.

      Cheers - L~R

        Hi L~R,

        you're right with all. It is the result of putting code together when it's too late. I let the program run with Devel::NYTProf on a small subset of the dict. The result is what I assumed: Answering the question if a combination of letters is able to produce a word is the most expensive task. I tried to make it faster with a C-like implementation in perl, but it was slower than the hash-approach.

        The program is still running. Currently at iteration 879000. So, a long time to go... ;-)

        Anyway, a very very intersting puzzle. And it was fun looking at the different approaches and seeing that some of the "old" monks took that challenge.

        Best regards
        McA

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1056993]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (9)
As of 2014-12-18 06:18 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (43 votes), past polls