PerlMonks  

Re^12: PDL and srand puzzle - one billion demonstration (updated)

by marioroy (Prior)
on Jun 08, 2024 at 01:11 UTC


in reply to Re^11: PDL and srand puzzle - not likely fewer random bits than rand
in thread PDL and srand puzzle

The following demonstration generates one billion random numbers. Workers write directly to STDOUT, in order, via MCE::Relay.

Update: Reduced RAM usage (~45MB per worker) by processing a sequence of numbers instead. For predictable output, this requires seeding the generator for each sequence number. Previously, the code spawned 100 workers, each running 1e7 loop iterations (~24GB RAM).

use v5.030;
use PDL;
use MCE 1.895;
use Math::Prime::Util;
use Math::Random;
use Math::Random::MT::Auto;

CORE::srand(42);    # This also, for MCE predictable results
                    # MCE sets internal seed = CORE::rand()

PDL::srandom(42);   # PDL::srand(N) v2.062 ~ v2.089
Math::Prime::Util::srand(42);
Math::Random::random_set_seed(42, 42);
Math::Random::MT::Auto::set_seed(42);

MCE->new(
    max_workers => MCE::Util::get_ncpu(),
    chunk_size  => 1,
    init_relay  => 0,
    posix_exit  => 1,
    sequence    => [ 1, 1000 ],
    user_func   => sub {
        # my ($mce, $seq, $chunk_id) = @_;
        my $seq = $_;
        my $output = "";

        # Worker seeds generator using MCE's seed and wid value.
        # For sequence of numbers, compute similarly using $seq value.
        my $seed = abs(MCE->seed - ($seq * 100000)) % 1073741780;

        CORE::srand($seed);
        # PDL::srand($seed);  # PDL v2.062 ~ 2.089
        PDL::srandom($seed);  # PDL v2.089_01
        Math::Prime::Util::srand($seed);
        Math::Random::random_set_seed($seed, $seed);
        Math::Random::MT::Auto::set_seed($seed);

        my $prngMT = Math::Random::MT::Auto->new();
        $prngMT->srand($seed);

        for ( 1 .. 1e6 ) {
            # my $r = CORE::rand();
            # my $r = PDL->random();
            # my $r = Math::Prime::Util::drand();
            # my $r = Math::Prime::Util::irand64();
            # my $r = Math::Random::random_normal();
            # my $r = Math::Random::random_uniform();
            # my $r = Math::Random::MT::Auto::rand();
            my $r = $prngMT->rand();
            $output .= "$r\n";
        }

        MCE::relay { print $output; };
    }
)->run;

Time to run:

$ time perl test.pl > out   # 17GB ~ 19GB file size

CORE::rand . . . . . . . . . . .  12.275s
PDL->random . . . . . . . . . . . 2m04.810s
Math::Prime::Util::drand . . . .  13.358s
Math::Prime::Util::irand64 . . .  12.462s
Math::Random::random_normal . . . 16.341s
Math::Random::random_uniform . .  16.328s
Math::Random::MT::Auto::rand . .  15.040s
$prngMT->rand . . . . . . . . . . 13.339s

$ wc -l out
1000000000
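
The per-sequence seeding used in the script can be tried in isolation. The following is a minimal sketch, not the author's code: `$base_seed` is a hypothetical stand-in for the value `MCE->seed` returns at run time, and `seed_for` and `values_for` are helper names invented for illustration. Only the seed arithmetic is taken from the script.

```perl
use strict;
use warnings;

# Hypothetical stand-in for MCE->seed; the real value comes from MCE at run time.
my $base_seed = 42;

# Same arithmetic as the script: derive a distinct seed per sequence number.
sub seed_for {
    my ($base, $seq) = @_;
    return abs($base - ($seq * 100_000)) % 1_073_741_780;
}

# Reseed the core generator for one sequence number, then draw $count values.
sub values_for {
    my ($base, $seq, $count) = @_;
    CORE::srand(seed_for($base, $seq));
    return map { CORE::rand() } 1 .. $count;
}

my @first  = values_for($base_seed, 7, 3);
my @second = values_for($base_seed, 7, 3);   # same sequence number, same seed
my @other  = values_for($base_seed, 8, 3);   # different sequence number

print "seq 7 reproducible: ",       ("@first" eq "@second" ? "yes" : "no"), "\n";
print "seq 7 differs from seq 8: ", ("@first" ne "@other"  ? "yes" : "no"), "\n";
```

Because the seed depends only on the base seed and the sequence number, it does not matter which worker picks up a given sequence number; the output for that number stays the same from run to run.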

GNU sort lacks parallel capability when reading from STDIN. See mcesort (GitHub Gist). Another option is parsort from GNU parallel.

# Sorting consumes 17GB ~ 19GB temporary space.
# Specify a path with sufficient storage.

$ perl test.pl | LC_ALL=C sort -T/dev/shm -u | wc -l
$ perl test.pl | LC_ALL=C mcesort -T/dev/shm -j75% -u | wc -l
$ perl test.pl | LC_ALL=C parsort -T/dev/shm --parallel=12 -u | wc -l

CORE::rand . . . . . . . . . . .  1000000000
PDL->random . . . . . . . . . . . 999999510
Math::Prime::Util::drand . . . .  999999553
Math::Prime::Util::irand64 . . .  1000000000
Math::Random::random_normal . . . 816337225
Math::Random::random_uniform . .  799455502
Math::Random::MT::Auto::rand . .  999999527
$prngMT->rand . . . . . . . . . . 999999527
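
One rough way to read the unique-line counts above is the standard birthday-style estimate: drawing n values uniformly and independently from m equally likely outcomes yields about m * (1 - exp(-n/m)) distinct values. The sketch below is an illustration only. The values of m are assumptions chosen for the example (about 2**31 distinct outputs for a 31-bit generator, and about 1e15 distinct printed strings when a float is stringified to 15 significant digits); they are not statements about the internals of any module in the table. The helper name `expected_unique` is invented here.

```perl
use strict;
use warnings;

# Expected number of distinct values after $n uniform, independent draws
# from $m equally likely outcomes: m * (1 - exp(-n/m)).
sub expected_unique {
    my ($n, $m) = @_;
    return $m * (1 - exp(-$n / $m));
}

my $n = 1e9;   # one billion draws, as in the demonstration

# Assumed sizes of the output space (illustrative, not measured).
my %assumed_m = (
    'about 31 bits of output'        => 2**31,
    '15 significant printed digits'  => 1e15,
);

for my $label (sort keys %assumed_m) {
    my $unique = expected_unique($n, $assumed_m{$label});
    printf "%-32s unique ~ %.0f, duplicates ~ %.0f\n",
        $label, $unique, $n - $unique;
}
```

Under these assumptions the 31-bit case gives roughly 8e8 distinct values and the 15-digit case gives roughly 500 duplicates, the same order of magnitude as several rows in the table. A generator that is reseeded per sequence is not strictly an independent uniform source, so treat this as a plausibility check rather than an explanation.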

The mcesort variant runs on Linux, macOS, FreeBSD, and derivatives.

Re^13: PDL and srand puzzle - PDL piddle iteration (updated)
by marioroy (Prior) on Jun 08, 2024 at 03:11 UTC

    I tried another variation for PDL.

    Update: Reduced RAM usage (~45MB per worker) by processing a sequence of numbers instead. For predictable output, this requires seeding the generator for each sequence number. Previously, the code spawned 100 workers, each running 1e7 loop iterations (~24GB RAM).

    use v5.030;
    use PDL;
    use MCE 1.895;

    CORE::srand(42);    # This also, for MCE predictable results
                        # MCE sets internal seed = CORE::rand()

    PDL::srandom(42);   # PDL::srand(N) v2.062 ~ v2.089

    MCE->new(
        max_workers => MCE::Util::get_ncpu(),
        chunk_size  => 1,
        init_relay  => 0,
        posix_exit  => 1,
        sequence    => [ 1, 1000 ],
        user_func   => sub {
            # my ($mce, $seq, $chunk_id) = @_;
            my $seq = $_;
            my $output = "";

            # Worker seeds generator using MCE's seed and wid value.
            # For sequence of numbers, compute similarly using $seq value.
            my $seed = abs(MCE->seed - ($seq * 100000)) % 1073741780;

            # PDL::srand($seed);  # PDL v2.062 ~ 2.089
            PDL::srandom($seed);  # PDL v2.089_01

            # Generate one million values in a single call, then
            # read them back one element at a time.
            my $pdl = PDL->random(1e6);

            foreach (0 .. $pdl->nelem - 1) {
                my $r = $pdl->at($_);
                $output .= "$r\n";
            }

            MCE::relay { print $output; };
        }
    )->run;

    Time to run:

    $ time perl test_pdl.pl > out   # 17GB file size

    PDL->random . . . . . . . . . . . 27.211s

    $ wc -l out
    1000000000
