Update: Added the chunk_size and bounds_only options to MCE::Shared::Sequence in trunk, similar to the MCE options of the same names. This allows MCE::Hobo workers to run as fast as MCE workers. I also corrected the demonstration. Seeing this run faster than serial code made my day.
I had to try something with the upcoming MCE 1.7 release. Parallelism may be beneficial for big sequences. MCE 1.7 will ship with MCE::Hobo, a threads-like module for processes. Because the workers are processes, they benefit from the Copy-on-Write feature of modern operating systems. In essence, the @strings array is not copied for each worker unless the worker writes to it.
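To illustrate the Copy-on-Write point, here is a minimal sketch using only core Perl (plain fork and pipes, not MCE::Hobo itself). The data and the helper name count_in_children are made up for illustration. Each child reads the parent's @strings without an up-front copy and sends a small result back through a pipe. Note that Perl's own bookkeeping can still touch some memory pages, so sharing is not guaranteed to be perfect.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical data standing in for the @strings array of the post.
my @strings = map { [ "str$_", $_ % 2 ] } 1 .. 1000;

# Fork $n children. Each child only reads @strings, so the OS can keep
# sharing the parent's pages instead of copying the array per worker.
sub count_in_children {
    my ($n) = @_;
    my @readers;
    for my $id ( 0 .. $n - 1 ) {
        pipe( my $r, my $w ) or die "pipe: $!";
        my $pid = fork();
        die "fork: $!" unless defined $pid;
        if ( $pid == 0 ) {    # child process
            close $r;
            my $count = 0;
            for my $i ( 0 .. $#strings ) {
                next unless $i % $n == $id;    # this child's share
                $count++ if $strings[$i][1] == 1;
            }
            print {$w} "$count\n";
            close $w;                          # flushes the result
            POSIX::_exit(0);                   # skip END blocks and parent buffers
        }
        close $w;
        push @readers, [ $pid, $r ];
    }
    my $total = 0;
    for my $pair (@readers) {
        my ( $pid, $r ) = @$pair;
        chomp( my $line = readline $r );
        close $r;
        waitpid $pid, 0;
        $total += $line;
    }
    return $total;
}

print count_in_children(4), "\n";    # prints 500
```

MCE::Hobo takes care of the forking and of returning results via join, so none of this plumbing is needed in the real code.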
Starting from Roy Johnson's demonstration, I made the following changes to enable parallelism via MCE::Hobo workers. This requires MCE from trunk, or developer release 1.699_011 or later.
...
print "Sorted. Finding matches...\n";
# Now walk through the list. The best match for each string will be the
# previous or next element in the list that is not from the original substring,
# so for each entry, just look for the next one. See how many initial letters
# match and track the best matches
#
# my @matchdata = (0); # (length, index1-into-strings, index2-into-strings)
# for my $i1 (0..($#strings - 1)) {
#     my $i2 = $i1 + 1;
#     ++$i2 while $i2 <= $#strings and $strings[$i2][1] eq $strings[$i1][1];
#     next if $i2 > $#strings;
#     my ($common) = map length, ($strings[$i1][0] ^ $strings[$i2][0]) =~ /^(\0*)/;
#     if ($common > $matchdata[0]) {
#         @matchdata = ($common, [$i1, $i2]);
#     }
#     elsif ($common == $matchdata[0]) {
#         push @matchdata, [$i1, $i2];
#     }
# }

use MCE::Hobo;
use MCE::Shared;

my $sequence = MCE::Shared->sequence(
    { chunk_size => 500, bounds_only => 1 }, 0, $#strings - 1
);

sub walk_list {
    my @matchdata = (0); # (length, index1-into-strings, index2-into-strings)
    while ( my ( $beg, $end ) = $sequence->next ) {
        for my $i1 ( $beg .. $end ) {
            my $i2 = $i1 + 1;
            ++$i2 while $i2 <= $#strings and $strings[$i2][1] eq $strings[$i1][1];
            next if $i2 > $#strings;
            my ($common) = map length, ($strings[$i1][0] ^ $strings[$i2][0]) =~ /^(\0*)/;
            if ($common > $matchdata[0]) {
                @matchdata = ($common, [$i1, $i2]);
            }
            elsif ($common == $matchdata[0]) {
                push @matchdata, [$i1, $i2];
            }
        }
    }
    return @matchdata;
}

MCE::Hobo->create( \&walk_list ) for 1 .. 8;

my @matchdata = (0); # (length, index1-into-strings, index2-into-strings)
for my $hobo ( MCE::Hobo->list ) {
    my @ret = $hobo->join;
    if ( $ret[0] > $matchdata[0] ) {
        @matchdata = @ret;
    }
    elsif ( $ret[0] == $matchdata[0] ) {
        shift @ret;
        push @matchdata, @ret;
    }
}
print "Best match: $matchdata[0] chars\n";
...
MCE 1.7 is nearly complete in trunk. The MCE::Shared::Sequence module is helpful here. I will try to finish MCE 1.7 by the end of the month.
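For anyone wondering what chunk_size and bounds_only hand to each worker, here is a rough sketch in plain Perl. The chunk_bounds helper is hypothetical and is not how MCE::Shared::Sequence is implemented. It only shows the idea, as used by the code above: each call to next yields one begin and end pair instead of every index, so workers make far fewer trips to the shared sequence.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of MCE: split the inclusive range
# $first .. $last into [begin, end] pairs covering at most
# $chunk_size items each.
sub chunk_bounds {
    my ( $first, $last, $chunk_size ) = @_;
    my @bounds;
    for ( my $beg = $first; $beg <= $last; $beg += $chunk_size ) {
        my $end = $beg + $chunk_size - 1;
        $end = $last if $end > $last;    # the last chunk may be shorter
        push @bounds, [ $beg, $end ];
    }
    return @bounds;
}

# Prints:
# 0..499
# 500..999
# 1000..1200
printf "%d..%d\n", @$_ for chunk_bounds( 0, 1200, 500 );
```

With a chunk size of 500, a worker pays the cost of one shared-object call per 500 indices rather than one per index, which is presumably where the speedup over the earlier version comes from.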
Regards, Mario