PDL and srand puzzle

by syphilis (Archbishop)
on Jun 04, 2024 at 11:03 UTC ( [id://11159773]=perlquestion )

syphilis has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I pondered elsewhere over the reason that the following one-liner produced a different result each time it was run:
C:\>perl -MPDL -E "srand(3); say rand();"
0.422170424674736

C:\>perl -MPDL -E "srand(3); say rand();"
0.000277820349769087
It was explained to me that the PDL module exports its own srand() function - a function that silently clobbers perl's srand() function, and does not seed perl's pseudo-random generator.

I was surprised, firstly, that PDL would do such a thing. Why don't they call their srand() function something like (eg) pdl_srand() ?

Then, secondly, I was surprised that perl would allow its srand() function to be clobbered in this way, without so much as a whimper.
Not even the loading of strict.pm or warnings.pm leads to a warning that srand() is no longer what I think it is.

Thirdly, I wondered how one might protect oneself from such a trap.
All I could think of is to replace the srand(3) call in my one-liner with CORE::srand(3) .... is there another way ?
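As a minimal sketch of the CORE:: approach (the helper name first_rand_after_seed is made up for illustration), qualifying both calls guarantees that the builtin generator is the one being seeded and read, whatever a loaded module has exported:

```perl
use strict;
use warnings;

# Hypothetical helper: seed Perl's builtin generator explicitly and
# return the first number it produces. The CORE:: prefix bypasses any
# sub named srand or rand that a module may have exported into main.
sub first_rand_after_seed {
    my ($seed) = @_;
    CORE::srand($seed);
    return CORE::rand();
}

my $x = first_rand_after_seed(3);
my $y = first_rand_after_seed(3);

print $x == $y ? "repeatable\n" : "not repeatable\n";
```

The same seed yields the same first value on both calls, because both calls go to the builtin functions.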

It's going to get rather tedious if one starts sticking a "CORE::" prefix in front of every perl core function call, just in case one of the loaded modules has overridden that particular perl core function.
And I'm not seriously suggesting that I'm about to start doing that. This is the first time I've ever struck such an issue, so I reckon it's a very rare scenario.

What do others think about the points I've just raised ?

For the purpose of playing around with this I created the following module (OverRider.pm):
package OverRider;
use strict;
use warnings;

require Exporter;
our @ISA     = qw(Exporter);
our @EXPORT  = qw(sqrt);
our $VERSION = '0.01';

sub sqrt {
    return sprintf "%.7g", $_[0] ** 0.5;
}

1;
Then I placed that file in one of my @INC directories and ran:
D:\>perl -Mstrict -Mwarnings -MOverRider -le "print sqrt(2); print CORE::sqrt(2);"
1.414214
1.4142135623731
Cheers,
Rob

Replies are listed 'Best First'.
Re: PDL and srand puzzle
by haj (Vicar) on Jun 04, 2024 at 12:56 UTC

    The uncontrollable mass-import of core subroutine names by PDL is indeed something ... special. I guess it is fair to attribute it to the fact that big parts of it were written in the previous century, when extensive @EXPORT lists were rather popular. While it is f**cking convenient most of the time, apparently it can bite you.

    That said, the combination of srand and rand is special also by two other facts: 1) the functions are not independent of each other, and 2) PDL did override srand but not rand, it has a function random instead. This might be indeed the only pitfall of PDL's exports: I would expect all others to behave like their CORE equivalents.

    I see two ways around it (that is, without changes in PDL):

    • Use PDL::Lite instead of PDL. This imports only a handful of functions (pdl, ndarray, barf and null). All other PDL functions are still available from the PDL namespace, or as object methods on ndarrays.
    • Use random instead of rand.

    The following two examples give consistent (but different) results each:

    perl -MPDL::Lite -E "srand(3); say rand();"
    perl -MPDL -E "srand(3); say random();"

    Anecdote: Once I got bitten by an export list the other way around: one of my programs also used Math::Trig. This overrides a list of (non-core) functions also provided by PDL, but ... of course it doesn't provide the magic (and gave strange error messages) when called with an ndarray as an argument.

Re: PDL and srand puzzle
by hippo (Archbishop) on Jun 04, 2024 at 11:22 UTC

    Thanks for the linked context. Having read that, I now see what they've done and why. It still seems questionable to me but probably has too much history to go changing the function name or how it behaves now. If the PDL srand() did what it does and then also called CORE::srand() with the same argument then I think that would be less surprising. It's the best compromise I can come up with just now.
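A minimal sketch of that compromise, using a made-up package My::Generator as a stand-in (this is not PDL's code): its srand() records the seed for its own generator and then forwards the same seed to the builtin.

```perl
use strict;
use warnings;

# Hypothetical package standing in for a module with its own generator.
package My::Generator {
    my $seed_seen;    # stand-in for the module's private generator state

    sub srand {
        my ($seed) = @_;
        $seed_seen = $seed;     # seed the module's own generator
        CORE::srand($seed);     # and seed Perl's builtin generator too
        return $seed;
    }

    sub seed_seen { return $seed_seen }
}

My::Generator::srand(3);
my $first = rand();
My::Generator::srand(3);
my $second = rand();

print $first == $second ? "repeatable\n" : "not repeatable\n";
```

With this arrangement, a caller who ends up with the module's srand() instead of the builtin still gets repeatable results from rand().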


    🦛

Re: PDL and srand puzzle
by etj (Priest) on Jun 04, 2024 at 22:52 UTC
    I'm open to renaming Primitive::srand; it was only added (as the Changes file shows, I'm sure you all looked) in "2.062 2021-11-19". I think that given PDL's random-number function is called "random" (i.e. a bit longer than Perl's "rand"), what do people think of "setrand"?

      Thank you, syphilis for this thread. I did not realize this is broken. PDL overriding "srand" and not "rand" also breaks parallel code. The results are not repeatable.

      use v5.030;
      use PDL;
      use MCE;

      srand(3);

      MCE->new(
          max_workers => 4,
          user_func   => sub {
              MCE->say(MCE->wid, " ", rand());
          }
      )->run;
      $ perl ex.pl | sort
      1 0.455586975281225
      2 0.224720416431413
      3 0.993853857581602
      4 0.76298729873179

      $ perl ex.pl | sort
      1 0.778804464106546
      2 0.547937905256735
      3 0.317071346406923
      4 0.0862047875571115

      $ perl ex.pl | sort
      1 0.49638699840671
      2 0.265520439556898
      3 0.0346538807070864
      4 0.803787321857275
      This requires a workaround in MCE. Ditto for MCE::Child and MCE::Hobo.
      # The PDL module 2.062 ~ 2.089 exports its own srand() function, that
      # silently clobbers Perl's srand function, and does not seed Perl's
      # pseudo-random generator. https://perlmonks.org/?node_id=11159773

      if ( $INC{'PDL/Primitive.pm'} && PDL::Primitive->can('srand') ) {
          # Call PDL's random() function if exported i.e. use PDL.
          my $caller = caller(); local $@;

          $caller = caller(1) if ( $caller =~ /^MCE/ );
          $caller = caller(2) if ( $caller =~ /^MCE/ );
          $caller = caller(3) if ( $caller =~ /^MCE/ );

          $self->{_seed} = eval "$caller->can('random')"
              ? int(PDL::Primitive::random() * 1e9)
              : int(CORE::rand() * 1e9);
      }
      else {
          $self->{_seed} = int(CORE::rand() * 1e9);
      }

      Edit 1: MCE v1.894, MCE::Shared v1.889

      if ( $INC{'PDL/Primitive.pm'} ) { ... }

      Edit 2: MCE configures an internal seed. It turns out that MCE may not know the srand or setter used by the application. Releasing MCE 1.895 and MCE::Shared 1.890. I updated the demonstration to process a sequence of numbers (lesser memory consumption). See also, Predictability Summary.

      Reverting back to the following.
      $self->{_seed} = int(CORE::rand() * 1e9);

        I will add PDL to the list in MCE, MCE::Child, and MCE::Hobo.

        # Set the seed of the base generator uniquely between workers.
        # The new seed is computed using the current seed and $_wid value.
        # One may set the seed at the application level for predictable
        # results (non-thread workers only). Ditto for PDL, Math::Prime::Util,
        # Math::Random, and Math::Random::MT::Auto.
        
        if ( !$self->{use_threads} ) {
            my $_wid  = $_args[1];
            my $_seed = abs($self->{_seed} - ($_wid * 100000)) % 2147483560;

            CORE::srand($_seed);

            # PDL 2.062 ~ 2.089
            PDL::srand($_seed) if $INC{'PDL.pm'} && PDL->can('srand');

            # PDL 2.089_01+
            PDL::srandom($_seed) if $INC{'PDL.pm'} && PDL->can('srandom');

            ...
        }

        This resolves calling PDL->random at the application level and expecting repeatable results.

      This is a good attitude, thanks etj

      Perhaps srandom? Or setseed, or just rename random() to rand() so that PDL completely takes over Perl's (edit: I mean both srand() and rand()), as I believe one of syphilis's frustrations is that he is setting the seed of the wrong RNG and sees no effect. The latter would of course not be courteous unless you really know what you are doing, as this can have security implications for 3rd parties or even core modules.

        srandom makes sense to me. Then both subs follow the pattern of appending "om" to the core variants.

        If one is to override perl's rand then it should be kept to "opt-in" behaviour to avoid violating the principle of least surprise.

Re: PDL and srand puzzle
by syphilis (Archbishop) on Jun 05, 2024 at 01:56 UTC
    Interesting replies - including aspects that I had not even considered.

    I think that one lesson (for me, at least) is that I should not have expected "-MPDL" to be as innocuous as I had assumed.
    For that level of innocuity, I think I should instead have used "-mPDL".

    But I'm still a bit confused about perl's handling of "some_func" when both "main::some_func" and "CORE::some_func" exist.
    In the one-liner I gave (that loads PDL), both "main::srand" and "CORE::srand" exist - and perl decides that "srand(3)" means "main::srand(3)".
    But in this next one-liner (where both "main::sqrt" and "CORE::sqrt" exist) perl goes the other way - and decides that "sqrt(2)" means "CORE::sqrt(2)".
    D:\>perl -wle "print sqrt(2);print main::sqrt(2); print CORE::sqrt(2); sub sqrt { return sprintf('%.6g', $_[0] ** 0.5) }"
    1.4142135623731
    1.41421
    1.4142135623731
    Why the inconsistency ?
    And why no warnings about the ambiguity of calling "some_func()" when both "main::some_func()" and "CORE::some_func()" exist ?

    Cheers,
    Rob

      The CORE namespace takes priority. This is important because otherwise you would end up in situations like you have with PDL's srand all the time. If you want to override the core function you can do it like this:

      perl -wle "use subs 'sqrt'; print sqrt(2); print CORE::sqrt(2); sub sqrt { return sprintf('%.6g',$_[0] ** 0.5) } "
      1.41421
      1.4142135623731
        The CORE namespace takes priority.

        Thanks for that - I now think I know what the rule is:
        <RULE>
        If main::foo and CORE::foo both exist, then "foo" calls main::foo if main::foo has been imported in from some module.
        Otherwise "foo" calls CORE::foo.
        </RULE>
        Is that the way it works ?
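The rule can be checked without a separate module file. The sketch below uses a made-up package, My::Override, whose import() installs a sub named sqrt into the caller at compile time; that is essentially what Exporter does for names in @EXPORT.

```perl
use strict;
use warnings;

# Hypothetical exporter, defined inline so the sketch is self-contained.
BEGIN {
    package My::Override;

    sub import {
        my $caller = caller;
        no strict 'refs';
        # Installing the sub from another package marks it as imported.
        *{"${caller}::sqrt"} = sub { return 'imported sqrt' };
    }
}

# Stands in for "use My::Override;" with a module loaded from a file.
BEGIN { My::Override->import }

my $plain = sqrt(4);          # resolves to the imported main::sqrt
my $core  = CORE::sqrt(4);    # always the builtin

print "sqrt(4)       = $plain\n";
print "CORE::sqrt(4) = $core\n";
```

A sub that is merely declared in the same package does not get this treatment, which matches the rule above; `use subs` works because it imports the name.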

        I'm still a bit puzzled as to why there should be this difference - but if that's the rule, then that's the rule.

        Cheers,
        Rob

      I guess that the call to sqrt gets compiled to CORE::sqrt before the interpreter processes your subroutine declaration. You can get the warning about the ambiguity if you declare your subroutine before calling sqrt (I got rid of the percent format because it breaks shell syntax on Linux):

      perl -wle "sub sqrt { return q(more than 1.4) } print sqrt(2);print main::sqrt(2); print CORE::sqrt(2);"
      That prints:
      Ambiguous call resolved as CORE::sqrt(), qualify as such or use & at -e line 1.
      1.4142135623731
      more than 1.4
      1.4142135623731
        Ambiguous call resolved as CORE::sqrt(), qualify as such or use & at -e line 1.

        Thanks. I had tried pre-declaring the sub but I must have stuffed up that test because I certainly didn't see that warning.
        I've just double-checked, and the warning is present for me, too.

        Update: Oh ... and for the warning to appear, the ambiguity has to be detected at compile time.
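A small sketch of the compile-time point, assuming a string eval is an acceptable way to control when compilation happens: the sub is declared before the call inside the eval'd code, and a local __WARN__ handler collects whatever the compiler reports.

```perl
use strict;
use warnings;

my @warnings;
my $result;
{
    # Collect warnings instead of printing them to STDERR.
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    # The string is compiled when eval runs, so the handler is already
    # installed. The sub declaration precedes the call, which is what
    # lets the compiler notice the ambiguity.
    $result = eval q{
        sub sqrt { return 'mine' }
        sqrt(4);
    };
}

print "result: $result\n";
print "warning: $_" for @warnings;
```

The call still resolves to the builtin; the declaration only makes the ambiguity visible to the compiler.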

        Cheers,
        Rob.
Re: PDL and srand puzzle
by ikegami (Patriarch) on Jun 05, 2024 at 02:11 UTC

    Thirdly, I wondered how one might protect oneself from such a trap.

    Explicitly list your imports. This has the huge added benefit that it's easy to see from where each sub originates.

    You could also use CORE::srand.
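As an illustration of explicit import lists, here is a sketch using the core POSIX module rather than PDL (chosen only because it ships with Perl and exports many names by default):

```perl
use strict;
use warnings;

# Only the two names listed are imported; POSIX's large default
# export list is skipped.
use POSIX qw(floor ceil);

my $down = floor(2.7);
my $up   = ceil(2.2);
print "floor(2.7) = $down, ceil(2.2) = $up\n";

# fmod is a default POSIX export, but it was not requested here,
# so main has no sub of that name.
print main->can('fmod') ? "fmod imported\n" : "fmod not imported\n";
```

Every imported name is visible on the `use` line, so a clobbered builtin cannot slip in unnoticed.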
