PerlMonks  

Calling an overload::Method

by hv (Prior)
on Feb 03, 2022 at 18:43 UTC ( [id://11141116]=perlmeditation )

You probably won't need to know this, but here it is just in case you do. I stumbled across it while trying to add some speedups to Math-GMP, which uncovered a bug in Test-Builder; a cpangrep then found 5 other distributions trying to do the same thing, all of which also had the same bug (as well as a few that simply bundle Test::Builder).

If you use overload to make your objects behave specially - as Math::GMP does, to make the objects act more like perl's numeric scalars - you provide those behaviours by way of a hash keyed on the (occasionally cryptic) name of the behaviour, providing a coderef or a function name:

    package My::SuperNumber;
    use overload (
        # overload the subtraction operator
        '-' => sub {
            my($self, $other, $swap) = @_;
            my $result = $self->my_super_subtract($other);
            return $swap ? -$result : $result;
        },
    );

That $swap is there to handle asymmetric operations like subtraction: if the caller asks for $super - 12 perl will call that coderef with ($super, 12, 0) as parameters, but if they ask for 34 - $super perl will instead pass ($super, 34, 1) to say that the parameters have been swapped.
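To see the swap flag in action, here is a minimal pure-Perl sketch. The class `My::Wrapped` and its `value` accessor are invented for this illustration and are not part of Math::GMP or any real module.

```perl
use strict;
use warnings;

# Hypothetical wrapper class, invented purely for illustration.
package My::Wrapped {
    use overload (
        '-' => sub {
            my($self, $other, $swap) = @_;
            my $result = $self->{value}
                - (ref($other) ? $other->{value} : $other);
            # If perl swapped the operands we have computed (self - other)
            # but the caller asked for (other - self), so negate.
            return My::Wrapped->new($swap ? -$result : $result);
        },
    );
    sub new   { my($class, $value) = @_; return bless { value => $value }, $class }
    sub value { return $_[0]{value} }
}

package main;

my $super = My::Wrapped->new(100);
my $left  = $super - 12;    # handler sees ($super, 12, false)
my $right = 34 - $super;    # handler sees ($super, 34, true)
printf "%d %d\n", $left->value, $right->value;    # prints "88 -66"
```

Without the `$swap` check, the second expression would come out as 66 rather than -66.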

Now for simplicity perl always calls these overload functions the same way, even for unary operators:

    use overload (
        # this is the cryptic name of the "numification" operator
        '0+' => sub {
            my($self, $other, $swap) = @_;
            return $self->as_number;
        },
    );

$other and $swap are still passed in when calling this method (as undef and 0), but of course everyone ignores them and writes the function above as if it took only one parameter. And that's all fine until you want to use XS to provide the overloaded method - at the C level you can't just ignore arguments you don't care about, you have to supply a signature that matches how it is called. So you have to write something like this:

    use overload (
        # this is the cryptic name of the "stringification" operator
        '""' => \&op_stringify,
    );

    # and then in the XS code:
    char *
    op_stringify(left, right, swap)
            SV *    left
            SV *    right
            bool    swap
        CODE:
            RETVAL = my_stringify(left);
        OUTPUT:
            RETVAL
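The three-argument call for a unary operator can be observed from plain Perl, without any XS. This is a minimal sketch: the class `My::Probe` and its `@seen` variable are hypothetical names invented here, and the handler does nothing except record what perl passed to it.

```perl
use strict;
use warnings;

# Hypothetical probe class, invented purely for illustration.
package My::Probe {
    our @seen;    # argument list of the most recent handler call
    use overload (
        '""' => sub {
            @seen = @_;
            return 'probe';
        },
    );
    sub new { return bless {}, shift }
}

package main;

my $probe = My::Probe->new;
my $text  = "$probe";    # triggers the '""' handler

printf "handler received %d arguments\n", scalar @My::Probe::seen;
printf "second is %s, third is %s\n",
    defined $My::Probe::seen[1] ? 'defined' : 'undef',
    $My::Probe::seen[2]         ? 'true'    : 'false';
```

On the perls I am aware of this reports three arguments, with the second undefined and the third false. That is the shape a fixed XS signature has to accept.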

.. and that's all fine too - XS writers expect to have to do things in slightly more convoluted ways. However, back in the land of Perl, the overload module also provides a Method function which allows you to ask for the coderef that _would_ be called for a particular overloaded operation. You might use that to check if it's safe to use something as a string, for example:

    sub as_string {
        my($obj) = @_;
        return "$obj" unless ref $obj;
        die "Give me a string, or something that pretends to be one"
            unless overload::Method($obj, '""');
        return "$obj";
    }
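For reference, `overload::Method` returns a code reference when the class overloads the requested operator and undef when it does not. A small sketch with two hypothetical classes (`My::Stringy` and `My::Plain`, both invented here) shows the two outcomes:

```perl
use strict;
use warnings;
use overload ();    # load the module so overload::Method is available

# Hypothetical classes, invented purely for illustration.
package My::Stringy {
    use overload '""' => sub { 'stringy' };
    sub new { return bless {}, shift }
}
package My::Plain {
    sub new { return bless {}, shift }
}

package main;

for my $obj (My::Stringy->new, My::Plain->new) {
    my $method = overload::Method($obj, '""');
    printf "%s: %s\n", ref($obj),
        $method ? 'overloads stringification' : 'no stringification overload';
}
```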

However you might also use it to invoke that method yourself, and this is the thing that everyone got wrong:

    sub as_string {
        my($obj) = @_;
        return "$obj" unless ref $obj;
        my $method = overload::Method($obj, '""')
            or die "Give me a string, or something that pretends to be one";
        # WRONG
        return $method->($obj);
        # RIGHT
        return $method->($obj, undef, 0);
    }

If you invoke the coderef returned by overload::Method without supplying 3 arguments, that may work fine as long as you only interact with objects whose overloads are provided by perl code, but trying to invoke an XS method that way will fall over at runtime with a message like Usage: My::SuperNumber::op_stringify(left, right, swap) - because they _had_ to write it with that signature to let perl call it. And since use of XS is fairly rare and use of overload is even more so, it may be a long time before you discover there's a bug. So don't do that. :)
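The failure can be reproduced without writing any XS by using a Perl handler that insists on exactly three arguments. This is only a stand-in: the class `My::Strict` and its hand-written usage message are invented for this sketch, and a real XS function would raise its own usage error instead.

```perl
use strict;
use warnings;
use overload ();

# Hypothetical class whose handler insists on exactly three arguments,
# standing in for an XS function with a fixed (left, right, swap) signature.
package My::Strict {
    use overload (
        '""' => sub {
            die "Usage: My::Strict::op_stringify(left, right, swap)\n"
                unless @_ == 3;
            my($left) = @_;
            return "strict($left->{value})";
        },
    );
    sub new { my($class, $value) = @_; return bless { value => $value }, $class }
}

package main;

my $obj    = My::Strict->new(42);
my $method = overload::Method($obj, '""');

my $wrong = eval { $method->($obj) };     # one argument: dies
my $error = $@;
my $right = $method->($obj, undef, 0);    # same shape as perl's own call
my $plain = "$obj";                       # perl supplies all three itself

print "one-argument call failed: $error";
print "three-argument call gave: $right\n";
```

Passing `undef, 0` costs nothing when the handler is ordinary Perl that ignores the extra arguments, so it is the safer habit in either case.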

Replies are listed 'Best First'.
Re: Calling an overload::Method
by syphilis (Archbishop) on Feb 04, 2022 at 11:36 UTC
    "I stumbled across it while trying to add some speedups to Math-GMP"

    Nice post ++, btw.
    I've always assumed that if you want to get the maximum performance out of Math::GMP, then you'll avoid the overloading altogether and explicitly make the method calls.
    I think this is generally true and the following script seems to support that view, though it only compares the 2 ways of performing a right shift.
        use warnings;
        use Benchmark;
        use Math::GMP;
        no warnings 'once';

        $str1 = '123456789' x 1000;
        $gmp1 = Math::GMP->new($str1);

        timethese(100000, {
            'Math::GMP 1' => '$ret1 = $gmp1 >> 1000;',
            'Math::GMP 2' => '$ret2 = $gmp1->div_2exp_gmp(1000)',
        });

        print "gmp calculation ok\n" if $ret1 == $ret2;

        __END__
        Outputs:
        Benchmark: timing 100000 iterations of Math::GMP 1, Math::GMP 2...
        Math::GMP 1: 0 wallclock secs ( 0.19 usr + 0.00 sys = 0.19 CPU) @ 534759.36/s (n=100000)
                    (warning: too few iterations for a reliable count)
        Math::GMP 2: 0 wallclock secs ( 0.12 usr + 0.00 sys = 0.12 CPU) @ 800000.00/s (n=100000)
                    (warning: too few iterations for a reliable count)
        gmp calculation ok
    I might as well showcase my own Math::GMPz module while I'm at it, though I see its main benefit as being that it wraps much more of the gmp library than is covered by Math::GMP.
        use warnings;
        use Benchmark;
        use Math::GMP;
        use Math::GMPz qw(:mpz);
        no warnings 'once';

        $str1  = '123456789' x 1000;
        $gmp1  = Math::GMP->new ($str1);
        $gmpz1 = Math::GMPz->new($str1);
        $ret4  = Math::GMPz->new();

        timethese(100000, {
            'Math::GMP 1'  => '$ret1 = $gmp1 >> 1000;',
            'Math::GMP 2'  => '$ret2 = $gmp1->div_2exp_gmp(1000);',
            'Math::GMPz 1' => '$ret3 = $gmpz1 >> 1000;',
            'Math::GMPz 2' => 'Rmpz_div_2exp($ret4, $gmpz1, 1000);',
        });

        print "gmp calculation ok\n" if $ret1 == $ret2;
        print "gmpz calculation ok\n" if $ret3 == $ret4;
        print "both gmp and gmpz agree\n" if "$ret1" eq "$ret3";

        __END__
        Outputs:
        Benchmark: timing 100000 iterations of Math::GMP 1, Math::GMP 2, Math::GMPz 1, Math::GMPz 2...
        Math::GMP 1: 0 wallclock secs ( 0.19 usr + 0.00 sys = 0.19 CPU) @ 534759.36/s (n=100000)
                    (warning: too few iterations for a reliable count)
        Math::GMP 2: 0 wallclock secs ( 0.12 usr + 0.00 sys = 0.12 CPU) @ 800000.00/s (n=100000)
                    (warning: too few iterations for a reliable count)
        Math::GMPz 1: 0 wallclock secs ( 0.09 usr + 0.00 sys = 0.09 CPU) @ 1063829.79/s (n=100000)
                    (warning: too few iterations for a reliable count)
        Math::GMPz 2: 0 wallclock secs ( 0.03 usr + 0.00 sys = 0.03 CPU) @ 3225806.45/s (n=100000)
                    (warning: too few iterations for a reliable count)
        gmp calculation ok
        gmpz calculation ok
        both gmp and gmpz agree
    Both Math::GMP and Math::GMPz were built against the very same gmp-6.2.1 library.

    Cheers,
    Rob
Re: Calling an overload::Method
by etj (Deacon) on Mar 23, 2022 at 14:49 UTC
    It seems to me the way round that at XS level is to use a varargs signature? (I've recently been battling PDL's overload methods)
        char *
        op_stringify(left, ...)
                SV *    left
            CODE:
                RETVAL = my_stringify(left);
            OUTPUT:
                RETVAL

      I didn't try that - if it works, then yes I guess that's an option. It is possible that it will limit portability, since varargs stuff has traditionally been tricky in that regard - but since you're not actually looking at the additional arguments (ie not calling va_start), maybe that won't be a problem.

      I've no idea what C that will translate to: I'm curious how it will decide to invoke the typemaps, since you have not declared what type the additional arguments are. In principle that might even make it slightly faster.

        There are no portability issues since Perl XS functions don't use va_start for either varargs or non-varargs XS. Please look in a .c file generated from a .xs :-)
