Unique numeric ID for reference?

by Jeffrey Kegler (Hermit)
on Oct 06, 2007 at 22:00 UTC (#643175)

Jeffrey Kegler has asked for the wisdom of the Perl Monks concerning the following question:

How do I get a unique number to identify a reference? I could stringify it to something like "ARRAY(0x18045c0)", and then pull out the hex number with a regex, but I'm in a context where speed is a big deal. I was trying to do some magic with pack("P"), then unpack, but it's not working for me.

Oh yeah, it's gotta be portable, too. That is, the number itself can change from implementation to implementation, but in every environment it has to have the property of uniqueness. In particular, the Perl trick has to work regardless of the integer and pointer sizes of the particular implementation.

(The reason I'm asking is I'm trying to do a Guttman-Rosler-Schwartz transform, and some of the elements of the array I'm sorting are references to other arrays. I could put an arbitrary ID number in the subarrays as I create them and dereference the ID number for the sort, but I'm trying to save a few cycles.)

Re: Unique numeric ID for reference?
by Juerd (Abbot) on Oct 06, 2007 at 23:31 UTC

    Just use the reference as a number, and it will evaluate to its address. To force numeric context, you could add 0 to it: $address = $ref + 0;
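A minimal sketch of the numeric-context trick described above. The variable names are illustrative and do not come from the thread:

```perl
use strict;
use warnings;

my @first  = (1, 2, 3);
my @second = (1, 2, 3);

my $ref_first  = \@first;
my $ref_again  = \@first;     # a second reference to the same array
my $ref_second = \@second;    # a different array with equal contents

# A reference used as a number evaluates to the address of its referent.
my $id_first  = $ref_first + 0;
my $id_again  = $ref_again + 0;
my $id_second = $ref_second + 0;

print "same array:      ", ( $id_first == $id_again  ? "equal" : "different" ), "\n";
print "different array: ", ( $id_first == $id_second ? "equal" : "different" ), "\n";
```

The number is only unique while the referent is alive: once the data is freed, Perl may hand the same address to something else.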

    But why not store the reference itself? That saves you the need to look it up.

    Juerd # { site => '', do_not_use => 'spamtrap', perl6_server => 'feather' }

      Actually, to my surprise, simply sticking an ID number in the referred-to object (it's an array) and dereferencing it is fastest. Here are the numbers:
                    Rate   String  Refaddr  Numeric ID Field
      String    747332/s       --     -18%     -24%     -28%
      Refaddr   912330/s      22%       --      -8%     -13%
      Numeric   989664/s      32%       8%       --      -5%
      ID Field 1044125/s      40%      14%       6%       --
      And here's the code that did the Benchmark:
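The benchmark code itself did not survive in this copy of the thread. What follows is a rough sketch of a comparable benchmark, not the poster's original: the record layout (ID number in slot 0), the regex used for the "String" case, and the run time are all assumptions.

```perl
use strict;
use warnings;
no warnings 'portable';    # hex() warns about values above 0xffffffff
use Benchmark qw(cmpthese);
use Scalar::Util qw(refaddr);

# Hypothetical record: slot 0 holds a hand-assigned ID number.
my $rec = [ 42, 'payload' ];

my %candidates = (
    # Stringify, pull out the hex digits, convert to a number.
    'String'   => sub { my ($hex) = "$rec" =~ /\(0x([0-9a-f]+)\)/; hex $hex },
    # Ask Scalar::Util for the address.
    'Refaddr'  => sub { refaddr($rec) },
    # Force the reference into numeric context.
    'Numeric'  => sub { $rec + 0 },
    # Dereference and read the stored ID.
    'ID Field' => sub { $rec->[0] },
);

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese( -1, \%candidates );
```

Absolute rates and even the ranking may differ by Perl version and by whether the XS build of Scalar::Util is installed.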

        0 doesn't really count as a unique id, does it?

      Yes, of course. Forcing the ref to numeric does it, and that is my answer. Embarrassingly easy.

      "But why not store the reference itself?" Not sure what you mean here. For my Guttman-Rosler-Schwartz sort I'm creating a sort key with pack. Given the number you just showed me how to get, I stuff it into a "J" field. How would I "store the reference itself"? And why do I want to? Since the packed keys in a GRS Transform are thrown away once the sort is done, I've no real use for an actual reference -- all I need is a unique numeric cookie. I don't need to dereference from the sort key.
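A small sketch of the idea described here: a packed sort key whose "J" field holds the reference's address as a throwaway cookie. The record layout and the lookup hash are assumptions made for illustration, not the poster's actual code.

```perl
use strict;
use warnings;

# Hypothetical records: [ priority, label ].
my @records = ( [ 3, 'c' ], [ 1, 'a' ], [ 2, 'b' ], [ 1, 'a2' ] );

# Map each address back to its record so it can be recovered after the sort.
my %by_addr;
$by_addr{ $_ + 0 } = $_ for @records;

my @sorted =
    map  { $by_addr{ unpack 'x4 J', $_ } }            # skip priority, read address
    sort                                              # plain string sort on packed keys
    map  { pack 'N J', $_->[0], $_ + 0 } @records;    # priority, then address cookie

print join( ' ', map { "$_->[0]:$_->[1]" } @sorted ), "\n";
```

"N" is big-endian, so the priority field sorts numerically under a string sort. "J" packs Perl's own native unsigned integer, so the template does not hard-code a pointer width; its byte order is native, which means the order among equal priorities is arbitrary but consistent, and that is all a unique cookie needs.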

      Or do I miss your point?

      jeffrey kegler

        Then I wonder why you are sorting by reference address. Is that ever useful? It's kind of randomish.

Re: Unique numeric ID for reference?
by shmem (Chancellor) on Oct 06, 2007 at 22:07 UTC
    You want refaddr() from Scalar::Util (XS version).


    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      I already looked at that, and it would work fine, but the problem is portability. If I use refaddr my code would only run on boxes with XS. I want the user to have the benefit of portability into situations where XS is not gonna happen.

      thanks, jeffrey

        refaddr() is also provided by the non-XS version, which essentially extracts the numeric part of a stringified reference and returns its decimal value. It is slower than the XS version, though.
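For reference, a minimal usage sketch of refaddr() from Scalar::Util. The variable names are illustrative only:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my @array = ( 1, 2, 3 );
my $ref   = \@array;

my $id    = refaddr($ref);             # address of @array as a plain number
my $same  = refaddr( \@array );        # another reference, same referent
my $other = refaddr( [ 1, 2, 3 ] );    # a different array
my $none  = refaddr('not a ref');      # undef for non-references

print "id matches numeric context\n" if $id == $ref + 0;
```

The calling code is the same whether the XS or the pure-Perl build of Scalar::Util is installed; only the speed differs.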


Re: Unique numeric ID for reference?
by mwah (Hermit) on Oct 06, 2007 at 23:07 UTC
    Jeffrey Kegler wrote: "... and then pull out the hex number with a regex, but I'm in a context where speed is a big deal."

    As shmem pointed out, refaddr is 5.8.8 core (Scalar::Util)

    Otherwise, you could do that by hand, like:
    my @array = (1,2,3,4);
    my $ref = \@array;
    my $addr = hex +($ref=~/(?<=\()[^)]+/g)[0];
    Which should not be *that* slow, especially if you
    do big business with GRT.

    To get the complete ref string back, do a simple:
    my $rs = sprintf "ARRAY(0x%x)", $addr;
    My question: why can't you take the whole
    "ref" string, including "ARRAY(...)"?
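The by-hand extraction and the sprintf round trip above can be checked together in a short sketch. The regex here is a slightly different one from the post's, chosen to capture the type name as well; treat it as an illustration under that assumption.

```perl
use strict;
use warnings;
no warnings 'portable';    # hex() warns about values above 0xffffffff

my @array = ( 1, 2, 3, 4 );
my $ref   = \@array;

# Stringify once, then capture the type name and the hex digits.
my ( $type, $hex ) = "$ref" =~ /^(\w+)\(0x([0-9a-f]+)\)$/;
my $addr = hex $hex;

# Rebuild the original string from the pieces.
my $rebuilt = sprintf '%s(0x%x)', $type, $addr;

print "$ref\n$rebuilt\n";
```

On a 64-bit perl, hex() under "use warnings" complains that large addresses are non-portable, which is why that warning category is switched off here.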



      The competition is an indexed dereference, and my guess is a regex has to be slower than that.

      In any case, Juerd's pointing out that all I needed to do was force the reference to numeric (duh! I feel so dumb) seems to settle the matter.

      As for just using the string, the GRS transform means putting it into a sortable pack() string. There's more than one reference in the sort key, and a couple of ordinary numerics and a Boolean as well. Since the strings are variable length they require a little bit of trickery to pack and they are longer than integers. I also wonder if stringify'ing doesn't have a higher cost than forcing to numeric.

      For yucks, I probably should Benchmark the choices, but if anything beats Juerd's suggestion, I'll be a little stunned.


Re: Unique numeric ID for reference?
by Prof Vince (Friar) on Oct 07, 2007 at 13:16 UTC
    If you want to apply a Schwartzian transform to an array of references, you may be interested in using Tie::RefHash.
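A brief sketch of what Tie::RefHash offers, assuming the goal is to key a hash by reference and still get usable references back from keys(). The hash and variable names are hypothetical.

```perl
use strict;
use warnings;
use Tie::RefHash;

tie my %label_for, 'Tie::RefHash';

my $first  = [ 1, 2 ];
my $second = [ 3, 4 ];
$label_for{$first}  = 'first';
$label_for{$second} = 'second';

# With a plain hash the keys would be strings like "ARRAY(0x...)";
# here they are still references and can be dereferenced.
my @sums = sort { $a <=> $b } map { $_->[0] + $_->[1] } keys %label_for;

print "@sums\n";
```

A tied hash adds overhead on every access, so whether this helps in a speed-sensitive sort would need measuring.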
