I'm taking CORE::rand() and PDL::random() for a spin using child processes rather than threads. There are 8 workers, each writing 50,000 lines, so a unique-line count below 400,000 would indicate duplicates in the output.
use v5.030;
use PDL;
use MCE 1.894;
MCE->new(
    max_workers => 8, user_func => sub {
        for (1..50000) {
            # my $r = CORE::rand();
            my $r = PDL->random;
            MCE->say("$r");
        }
    }
)->run;
CORE::rand()
$ perl test4.pl | LC_ALL=C sort -u | wc -l
400000
$ perl test4.pl | LC_ALL=C sort -u | wc -l
400000
$ perl test4.pl | LC_ALL=C sort -u | wc -l
400000
PDL->random
$ perl test4.pl | LC_ALL=C sort -u | wc -l
400000
$ perl test4.pl | LC_ALL=C sort -u | wc -l
400000
$ perl test4.pl | LC_ALL=C sort -u | wc -l
400000
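To see what the same pipeline reports when duplicates do exist, here is a minimal shell sketch on a tiny hand-made input. The function names count_unique and list_dups are hypothetical helpers introduced only for illustration; they are not part of the test scripts above.

```shell
# Hypothetical helpers for illustration only.

# count_unique: number of distinct lines on stdin,
# using the same sort -u | wc -l pipeline as the runs above.
# tr strips the padding some wc implementations add.
count_unique() {
    LC_ALL=C sort -u | wc -l | tr -d ' '
}

# list_dups: print each line that occurs more than once (once per value).
list_dups() {
    LC_ALL=C sort | uniq -d
}

# Four lines in, one value repeated.
printf '0.25\n0.5\n0.25\n0.75\n' | count_unique   # prints 3
printf '0.25\n0.5\n0.25\n0.75\n' | list_dups      # prints 0.25
```

If a real run ever came back below the expected total, piping the script output through something like list_dups would show which values collided.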
Next, I tried 12 million lines (24 workers, 500,000 each) in a tight loop, appending to a string so that workers no longer wait on serialized output for every line, as they did previously. Again, no duplicates.
use v5.030;
use PDL;
use MCE 1.894;
MCE->new(
    max_workers => 24, user_func => sub {
        my $output = "";
        for (1..500000) {
            # my $r = CORE::rand();
            my $r = PDL->random;
            $output .= "$r\n";
        }
        MCE->print($output);
    }
)->run;
CORE::rand() and PDL->random
$ perl test5.pl | LC_ALL=C sort -u | wc -l
12000000
$ perl test5.pl | LC_ALL=C sort -u | wc -l
12000000
$ perl test5.pl | LC_ALL=C sort -u | wc -l
12000000
Sorting takes a while. The mcesort program, which has a mini-MCE integrated, can help here. Copy the script to /usr/local/bin (or a bin path of your choice) and make it executable with sudo chmod +x /usr/local/bin/mcesort.
$ perl test5.pl | LC_ALL=C mcesort -j6 -u | wc -l
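As a rough sketch, the choice between the two sorters could be wrapped in a small function that uses mcesort when it is on the PATH and falls back to plain sort otherwise. The name unique_count is hypothetical; the -j6 and -u options are simply the ones from the command above.

```shell
# Hypothetical wrapper for illustration only.
# Counts distinct lines on stdin, preferring mcesort if it is installed.
unique_count() {
    if command -v mcesort >/dev/null 2>&1; then
        # Same options as the mcesort command shown above.
        LC_ALL=C mcesort -j6 -u | wc -l | tr -d ' '
    else
        # Fallback: the plain sort pipeline used in the earlier runs.
        LC_ALL=C sort -u | wc -l | tr -d ' '
    fi
}
```

It would be used the same way as before, for example perl test5.pl | unique_count.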