The following demonstration generates one billion random numbers. Workers write directly to STDOUT, in order, via MCE::Relay.
Update: Reduced RAM usage by processing a sequence of numbers instead (~45 MB per worker). For predictable output, this requires seeding the generator for each sequence number. Previously, the code spawned 100 workers, each running 1e7 loop iterations (~24 GB RAM).
use v5.030;
use PDL;
use MCE 1.895;
use Math::Prime::Util;
use Math::Random;
use Math::Random::MT::Auto;
CORE::srand(42); # Also needed for predictable MCE results:
# MCE sets its internal seed from CORE::rand()
PDL::srandom(42); # PDL::srand(N) v2.062 ~ v2.089
Math::Prime::Util::srand(42);
Math::Random::random_set_seed(42, 42);
Math::Random::MT::Auto::set_seed(42);
MCE->new(
    max_workers => MCE::Util::get_ncpu(),
    chunk_size  => 1,
    init_relay  => 0,
    posix_exit  => 1,
    sequence    => [ 1, 1000 ],
    user_func   => sub {
        # my ($mce, $seq, $chunk_id) = @_;
        my $seq = $_;
        my $output = "";
        # Workers seed the generator using MCE's seed and wid value.
        # For a sequence of numbers, compute the seed similarly using the $seq value.
        my $seed = abs(MCE->seed - ($seq * 100000)) % 1073741780;
        CORE::srand($seed);
        # PDL::srand($seed); # PDL v2.062 ~ 2.089
        PDL::srandom($seed); # PDL v2.089_01
        Math::Prime::Util::srand($seed);
        Math::Random::random_set_seed($seed, $seed);
        Math::Random::MT::Auto::set_seed($seed);
        my $prngMT = Math::Random::MT::Auto->new();
        $prngMT->srand($seed);
        for ( 1 .. 1e6 ) {
            # my $r = CORE::rand();
            # my $r = PDL->random();
            # my $r = Math::Prime::Util::drand();
            # my $r = Math::Prime::Util::irand64();
            # my $r = Math::Random::random_normal();
            # my $r = Math::Random::random_uniform();
            # my $r = Math::Random::MT::Auto::rand();
            my $r = $prngMT->rand();
            $output .= "$r\n";
        }
        # Relay the output so that chunks are printed in sequence order.
        MCE::relay { print $output; };
    }
)->run;
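The per-sequence seeding idea can be tried in isolation, without MCE. The following is a minimal sketch using only CORE::srand and CORE::rand. The helper names seed_for and draws_for are hypothetical, and $base is an assumed stand-in for the value MCE->seed returns in the real script. It shows that the same sequence number always reproduces the same draws, regardless of which worker handles it.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the seed formula used in the script above.
# $base is an assumed stand-in for MCE->seed.
sub seed_for {
    my ($base, $seq) = @_;
    return abs($base - $seq * 100_000) % 1_073_741_780;
}

# Seed CORE::rand for one sequence number, then draw $n values.
sub draws_for {
    my ($base, $seq, $n) = @_;
    srand(seed_for($base, $seq));
    return [ map { rand() } 1 .. $n ];
}

my $base   = 42;                       # assumed base seed, for illustration only
my $first  = draws_for($base, 7, 3);
my $second = draws_for($base, 7, 3);   # same sequence number, same seed

print "seed for seq 7: ", seed_for($base, 7), "\n";
print "seq 7 repeatable: ", ("@$first" eq "@$second" ? "yes" : "no"), "\n";
```

Because the seed depends only on the base seed and the sequence number, the output does not depend on how many workers run or which worker picks up a given number.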
Time to run:
$ time perl test.pl > out # 17GB ~ 19GB file size
CORE::rand . . . . . . . . . . . 12.275s
PDL->random . . . . . . . . . . . 2m04.810s
Math::Prime::Util::drand . . . . 13.358s
Math::Prime::Util::irand64 . . . 12.462s
Math::Random::random_normal . . . 16.341s
Math::Random::random_uniform . . 16.328s
Math::Random::MT::Auto::rand . . 15.040s
$prngMT->rand . . . . . . . . . . 13.339s
$ wc -l out
1000000000
GNU sort lacks parallel capability when processing STDIN. See mcesort (GitHub Gist). Another option is parsort from GNU parallel.
# Sorting consumes 17GB ~ 19GB temporary space.
# Specify a path with sufficient storage.
$ perl test.pl | LC_ALL=C sort -T/dev/shm -u | wc -l
$ perl test.pl | LC_ALL=C mcesort -T/dev/shm -j75% -u | wc -l
$ perl test.pl | LC_ALL=C parsort -T/dev/shm --parallel=12 -u | wc -l
CORE::rand . . . . . . . . . . . 1000000000
PDL->random . . . . . . . . . . . 999999510
Math::Prime::Util::drand . . . . 999999553
Math::Prime::Util::irand64 . . . 1000000000
Math::Random::random_normal . . . 816337225
Math::Random::random_uniform . . 799455502
Math::Random::MT::Auto::rand . . 999999527
$prngMT->rand . . . . . . . . . . 999999527
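One way to read the unique counts is to compare them with what a generator that can emit only a limited number of distinct values would produce. The sketch below uses the standard approximation m * (1 - exp(-n / m)) for the expected number of distinct values after n uniform draws from m equally likely outcomes. The bit widths tried here are assumptions for illustration, not documented properties of any of the modules, and the helper name expected_unique is hypothetical.

```perl
use strict;
use warnings;

# Expected number of distinct values after $n uniform draws from
# $m equally likely outcomes (approximation): m * (1 - exp(-n / m)).
sub expected_unique {
    my ($n, $m) = @_;
    return $m * (1 - exp(-$n / $m));
}

my $n = 1e9;   # one billion draws, as in the test above

# Assumed output widths in bits, chosen only for comparison.
for my $bits (31, 32, 53) {
    my $m = 2 ** $bits;
    printf "%2d bits: about %.0f distinct values expected\n",
        $bits, expected_unique($n, $m);
}
```

Under the 31-bit assumption the estimate lands at roughly 0.8 billion, which is in the same range as the count reported for Math::Random::random_uniform. This is only a plausibility check. The formula does not apply to random_normal, which is not uniform.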
The mcesort variant runs on Linux, macOS, FreeBSD, and derivatives.