http://www.perlmonks.org?node_id=1232781


in reply to Parallelization of multiple nested loops

I was able to produce the output file in just over 3 minutes on a crappy laptop. It's not parallel, it's not as simple as the module BrowserUk suggested, and it's certainly not as screamingly fast as marioroy's solution. But I wanted to share it in case someone finds it interesting. It should be straightforward to use Getopt::Std to make the values and the number of tiers command-line options.

#!/usr/bin/perl
use warnings;
use strict;

# values
my @val = qw(0.0 0.2 0.4 0.6 0.8 1.0);
my $tiers = 11;

# map array indices to values
my $m = {};
{
    my $i = int 0;
    map { $m->{$i++} = $_ } @val;
}

# first tier
my $p = \@val;

# create each additional tier skipping first
# that's already in $p
for (my $i = 2; $i <= $tiers; $i++) {
    my $tmp;
    map { $tmp->[$_] = $p; } keys %{$m};
    $p = $tmp;
}

# output file
open(my $outfile, '>', '/tmp/output.txt') or die $!;

# use recursion to descend the huge matrix
# build up the string at each tier
my $fn;
$fn = sub {
    my ($aref, $str) = @_;
    for (my $i = int 0; $i < @{$aref}; $i++) {
        if(ref($aref->[$i])) {
            $fn->($aref->[$i], $str."\t".$val[$i]);
            next;
        }
        # end of the line, print last tier of values
        print $outfile $str."\t".$_."\t1\t1\n" for @val;
        last;
    }
};

# kick off the recursion, could do these in parallel
# at the top-most layer
for (my $i = int 0; $i < @{$p}; $i++) {
    $fn->($p->[$i], $val[$i]);
}
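For the Getopt::Std idea mentioned above, something like the following might do. It's only a sketch: the option letters -v and -t, the comma-separated value format, and the parse_args helper are my own assumptions, not part of the script above.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use Getopt::Std;

# Hypothetical options (not in the original script):
#   -v  comma-separated list of values
#   -t  number of tiers
sub parse_args {
    my @args = @_;
    local @ARGV = @args;    # getopts() reads from @ARGV

    # defaults match the hard-coded values in the script
    my %opt = (v => '0.0,0.2,0.4,0.6,0.8,1.0', t => 11);
    getopts('v:t:', \%opt)
        or die "usage: $0 [-v v1,v2,...] [-t tiers]\n";

    die "-t must be a positive integer\n"
        unless $opt{t} =~ /^[1-9]\d*$/;

    my @val = split /,/, $opt{v};
    return (\@val, $opt{t});
}

my ($val, $tiers) = parse_args(@ARGV);
print "values: @$val\n";
print "tiers:  $tiers\n";
```

The returned array-ref and tier count would then replace the hard-coded @val and $tiers.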

Unsurprisingly, the I/O seems to take up a good deal of the time. It's a 17.4GB file with 362797056 lines (6 values across 11 tiers, so 6^11), but perl only seems to take about 5MB of resident memory (27MB virtual) while running. I certainly wouldn't want to keep the output in memory, but the array-refs would be just fine to pass around.
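As a sanity check on those numbers, the line count and file size can be worked out from the values and tier count alone. This is just arithmetic on the output format the script prints (tab-separated values followed by "\t1\t1\n"); the variable names are mine.

```perl
use warnings;
use strict;

my @val   = qw(0.0 0.2 0.4 0.6 0.8 1.0);
my $tiers = 11;

# one output line per combination: 6 choices at each of 11 tiers
my $lines = scalar(@val) ** $tiers;

# every value is 3 characters wide, so any line has the same length:
# 11 values joined by tabs, plus the trailing "\t1\t1\n"
my $bytes_per_line = length(join("\t", ($val[0]) x $tiers) . "\t1\t1\n");

my $total = $lines * $bytes_per_line;

printf "%.0f lines x %d bytes = %.0f bytes (%.1f GB)\n",
    $lines, $bytes_per_line, $total, $total / 1e9;
# prints: 362797056 lines x 48 bytes = 17414258688 bytes (17.4 GB)
```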

So depending on what else is being done, and how many times the parameters are changed, it might make sense to just hold the initial huge matrix of array-refs in memory and pull combinations off for further processing in batches.
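One way to pull combinations off in batches, roughly sketched. Note this swaps in a different technique: it keeps an odometer-style array of indices instead of walking the nested array-refs, which gives the same ordering (last tier varies fastest) without the recursion. The make_batcher name and the batch size are made up for illustration.

```perl
use warnings;
use strict;

# Hypothetical helper: returns a closure that yields up to $batch_size
# combinations per call (each an array-ref of values), and an empty
# list once every combination has been handed out.
sub make_batcher {
    my ($val, $tiers, $batch_size) = @_;
    my @idx  = (0) x $tiers;    # current index at each tier
    my $done = 0;

    return sub {
        my @batch;
        while (!$done && @batch < $batch_size) {
            push @batch, [ map { $val->[$_] } @idx ];

            # advance the odometer, rightmost tier fastest
            my $t = $#idx;
            while ($t >= 0 && ++$idx[$t] == @$val) {
                $idx[$t--] = 0;
            }
            $done = 1 if $t < 0;    # every tier wrapped around
        }
        return @batch;
    };
}

# small demo: 3 values, 2 tiers, batches of 4
my $next = make_batcher([qw(0.0 0.2 0.4)], 2, 4);
while (my @batch = $next->()) {
    print join("\t", @$_), "\n" for @batch;
    print "-- end of batch --\n";
}
```

Only the index array and the current batch are held in memory, so the batch size can be tuned to whatever the downstream processing wants.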