jimt,
I am sad to tell you that although your iterative version need not be called more than necessary, it is terribly slow. I timed (not Benchmarked) 4 versions, and unfortunately yours wasn't even a contender:
  • java recursive 1 = 12 seconds
  • perl recursive 1 = 13 seconds
  • perl recursive 2 = 28 seconds
  • perl iterative 1 = 6_190 seconds (only 25% done, getting slower, and producing duplicates)

The code to generate the 3_477 line data file and the recursive java version can be found at How many words does it take?. The two recursive perl versions are below:

They both share the following code:
    #!/usr/bin/perl
    use strict;
    use warnings;

    my %seen;

    for my $file (@ARGV) {
        open(my $fh, '<', $file) or die "Unable to open '$file' for reading: $!";
        while (<$fh>) {
            my ($set) = $_ =~ /^(\w+)/;
            powerset($set);
        }
    }

    sub powerset {
        my $set = shift @_;
        return if $seen{$set}++;
        print "$set\n";
        powerset($_) for subsets($set);
    }
The 13 second version has subsets() as
    sub subsets {
        my $set = shift @_;
        return if length($set) == 1;
        my ($head, $char, $tail) = ($set, '', '');
        my @ret;
        while ($head) {
            $char = chop $head;
            push @ret, $head . $tail;
            $tail = $char . $tail;
        }
        return @ret;
    }
The 28 second version has subsets() as
    sub subsets {
        my $set = shift @_;
        return if length($set) == 1;
        my @list = split //, $set;
        my $pos = @list;
        my @ret;
        while ($pos--) {
            push @ret, join '', @list[grep $_ != $pos, 0 .. $#list];
        }
        return @ret;
    }
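Since the two subsets() variants are meant to be interchangeable, a quick sanity check that they return the same strings may be useful before comparing timings. The following is only a sketch: the names subsets_chop, subsets_slice, and same_subsets are hypothetical (the post calls both variants subsets()), and the bodies are copied from the two versions above.

```perl
use strict;
use warnings;

# Hypothetical name for the 13 second (chop-based) variant.
sub subsets_chop {
    my $set = shift @_;
    return if length($set) == 1;
    my ($head, $char, $tail) = ($set, '', '');
    my @ret;
    while ($head) {
        $char = chop $head;          # peel off the last character
        push @ret, $head . $tail;    # the set without that character
        $tail = $char . $tail;
    }
    return @ret;
}

# Hypothetical name for the 28 second (split/slice-based) variant.
sub subsets_slice {
    my $set = shift @_;
    return if length($set) == 1;
    my @list = split //, $set;
    my $pos = @list;
    my @ret;
    while ($pos--) {
        # keep every index except $pos
        push @ret, join '', @list[grep $_ != $pos, 0 .. $#list];
    }
    return @ret;
}

# True when both variants yield the same strings for $set.
sub same_subsets {
    my $set = shift @_;
    my @a = sort { $a cmp $b } subsets_chop($set);
    my @b = sort { $a cmp $b } subsets_slice($set);
    return "@a" eq "@b";
}

print same_subsets('cdglnst') ? "match\n" : "differ\n";
```

The explicit sort block avoids the parsing ambiguity where Perl could treat the function name after sort as a comparison routine.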

I made minor modifications to your code to handle my dataset as well as produce comparable output:

    # All references to $calls removed

    # $limbic_sets = [ ... ]
    # foreach my $limbic_set (@$limbic_sets) { ... }
    # The above two lines became
    open(my $fh, '<', 'phase1.data') or die $!;
    while ( <$fh> ) {
        my ($limbic_set) = $_ =~ /^(\w+)/;
        $limbic_set = [ split //, $limbic_set ];
        # ...
    }

    # removed print "checks set @$limbic_set\n";

    # my $format = "%2s" x scalar(@$padded_limbic_set) . " (%d)\n";
    # printf($format, (map {defined $_ && $display->{$_} ? $_ : ' '} @$padded_limbic_set), $idx);
    # The above 2 lines became
    print join '', map {defined $_ && $display->{$_} ? $_ : ''} @$padded_limbic_set;
    print "\n";

Update 1: After your 3rd update, your code finished in a respectable 78 seconds. Unfortunately, it is still producing about 40% duplicates. Additionally, it doesn't produce the correct output (it is missing 92_835 strings out of 508_062). For instance, 'cdglnst' does not appear at all in your output.
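For anyone wanting to reproduce this kind of comparison, one way to count duplicates and list missing strings is a pair of hash lookups. This is a minimal sketch and not the script actually used for the figures above; count_duplicates and missing_from are hypothetical helper names, and the tiny word lists stand in for the real output files.

```perl
use strict;
use warnings;

# Number of lines that repeat an earlier line in the list.
sub count_duplicates {
    my ($lines) = @_;
    my %seen;
    return scalar grep { $seen{$_}++ } @$lines;
}

# Distinct strings present in @$reference but absent from @$candidate.
sub missing_from {
    my ($reference, $candidate) = @_;
    my %have = map { $_ => 1 } @$candidate;
    my %reported;
    return grep { !$have{$_} && !$reported{$_}++ } @$reference;
}

# Stand-in data; real use would read each program's output into an array.
my @reference = qw(abc ab ac bc a b c);
my @candidate = qw(abc ab ab ac a a b);

printf "%d duplicates\n", count_duplicates(\@candidate);
print "missing: @{[ missing_from(\@reference, \@candidate) ]}\n";
```

With outputs of around half a million short strings, holding both lists in memory as hashes should be manageable, though that is an assumption about the machine rather than something measured here.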

Update 2: After your 4th update, your code narrowly makes 3rd place with 26 seconds and correct output! I included the entire perl script I am using above to ensure we are comparing apples to apples. Admittedly, yours does scale much better with both speed and memory. Unfortunately, it still isn't quite up to the task I needed. I will have to put this in my back pocket for later though.

Cheers - L~R


In reply to Re^4: Powerset short-circuit optimization by Limbic~Region
in thread Powerset short-circuit optimization by Limbic~Region
