Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery

Re^9: Challenge: 8 Letters, Most Words

by Limbic~Region (Chancellor)
on Oct 05, 2013 at 03:24 UTC ( #1056993=note: print w/replies, xml ) Need Help??

in reply to Re^8: Challenge: 8 Letters, Most Words
in thread Challenge: 8 Letters, Most Words

You may want to look at my solution as it will likely be the heat death of the universe before your code finishes if you wrote it in Perl. I too considered divide and conquer approach but abandoned it for simplicity and the fact that I can do millions of iterations per second in C.

Cheers - L~R

Replies are listed 'Best First'.
Re^10: Challenge: 8 Letters, Most Words
by McA (Priest) on Oct 05, 2013 at 06:30 UTC

    Good morning all.

    I suspect L~R will be right with his assumption concerning the runtime. I do have to ask my colleagues if they mind when they loose one processor on the development machine the next week... :-)

    Therefor I show my solution to receive the critics I earn.

    Have a nice day

    #!/usr/bin/perl use strict; use warnings; use Data::Dumper; use 5.010; $| = 1; my %words; my %sorted; my %alphabet; while(defined(my $line = <>)) { chomp $line; next if $line =~ /%$/; # ignore entries with '%' at the end my $slot = length $line; $line = lc $line; next if $slot > 8; next if exists $words{$line}; $words{$line} = 1; my @chars = sort split //, $line; %alphabet = (%alphabet, map { $_ => 1 } @chars); my $characters = join('', @chars); if(defined $sorted{$characters}) { push @{$sorted{$characters}}, $line; } else { $sorted{$characters} = [$line]; } } my @sorted = keys %sorted; say "Base: " . scalar @sorted . " unique words"; my @alphabet = sort keys %alphabet; my $word; my $count = @alphabet; my $permutations = 0; my %found; my $max_found = 0; foreach (my $pos1 = 0; $pos1 < $count; $pos1++) { foreach (my $pos2 = 0; $pos2 < $count; $pos2++) { next if $pos2 < $pos1; foreach (my $pos3 = 0; $pos3 < $count; $pos3++) { next if $pos3 < $pos2; foreach (my $pos4 = 0; $pos4 < $count; $pos4++) { next if $pos4 < $pos3; foreach (my $pos5 = 0; $pos5 < $count; $pos5++) { next if $pos5 < $pos4; foreach (my $pos6 = 0; $pos6 < $count; $pos6++) { next if $pos6 < $pos5; foreach (my $pos7 = 0; $pos7 < $count; $pos7++ +) { next if $pos7 < $pos6; foreach (my $pos8 = 0; $pos8 < $count; $po +s8++) { next if $pos8 < $pos7; # Check what can be produced by this c +ombination $permutations++; say $permutations if $permutations % 1 +000 == 0; my %source; $source{$_}++ for(@alphabet[$pos1, $po +s2, $pos3, $pos4, $pos5, $pos6, $pos7, $pos8]); #say "================================ +==========="; #say "Source: ". Dumper(\%source); my @last; my $source; INNER: foreach my $word (@sorted) { #say "Word: $word"; my %source_copy = (%source); for(my $i = 0; $i < length $word; +$i++) { my $c = substr($word, $i, 1); next INNER unless $source_copy +{$c}; $source_copy{$c}--; } $source = join '', @alphabet[$pos1 +, $pos2, $pos3, $pos4, $pos5, $pos6, $pos7, $pos8]; #say join(', ', @{$sorted{$word}}) + . "' can be produced by '$source'"; push @last, @{$sorted{$word}}; } # something found which can be produce +d if(@last) { if(@last > $max_found) { %found = (); $found{$source} = [@last]; $max_found = @last; } elsif (@last == $max_found) { $found{$source} = [@last]; } } } } } } } } } } say "Found. $permutations out of $count unique characters"; say Dumper(\%found);
      My brain is mush so unfortunately, this critique is just the things that jumped out at me.
      %alphabet = (%alphabet, map { $_ => 1 } @chars);
      Would be better written as:
      @alphabet{@chars} = (); # No need to set values to 1 as you only ever +use keys
      I am not sure why you avoid autovivication but
      if(defined $sorted{$characters}) { push @{$sorted{$characters}}, $line; } else { $sorted{$characters} = [$line]; }
      Would work just as well as:
      push @{$sorted{$characters}}, $line;
      You waste a lot of time going through loops that you abort early
      foreach (my $pos2 = 0; $pos2 < $count; $pos2++) { next if $pos2 < $pos1;
      Would work a lot more efficiently as:
      for (my $pos2 = $pos1; $pos2 < $count; $pos2++) {
      I haven't tested it but the way you determine if one is a subset of the other would probably be better as subtracting one hash from another.

      Cheers - L~R

        Hi L~R,

        you're right with all. It is the result of putting code together when it's too late. I let the program run with Devel::NYTProf on a small subset of the dict. The result is what I assumed: Answering the question if a combination of letters is able to produce a word is the most expensive task. I tried to make it faster with a C-like implementation in perl, but it was slower than the hash-approach.

        The program is still running. Currently at iteration 879000. So, a long time to go... ;-)

        Anyway, a very very intersting puzzle. And it was fun looking at the different approaches and seeing that some of the "old" monks took that challenge.

        Best regards

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1056993]
LanX understands corrector but thinks it's editor
[LanX]: don't feign ignorance
[erix]: right
[erix]: "don't pretend to be ... "
[LanX]: or literally "do not pretend to be dumber than you are"
[erix]: we look right through you ! ;)
[erix]: italy berserks
[erix]: pretty absurd to blame them for "criminal association in illegal immigration"

How do I use this? | Other CB clients
Other Users?
Others chanting in the Monastery: (4)
As of 2018-03-19 22:30 GMT
Find Nodes?
    Voting Booth?
    When I think of a mole I think of:

    Results (246 votes). Check out past polls.