Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling
 
PerlMonks  

Sparse arrays?

by diotalevi (Canon)
on Dec 18, 2002 at 06:01 UTC ( [id://220781]=perlquestion: print w/replies, xml ) Need Help??

diotalevi has asked for the wisdom of the Perl Monks concerning the following question:

I thought I'd read that perl does sparse arrays. So under what conditions does that happen? This simple test shows that @ary becomes a 1001 element array through this simple slice assignment. Perhaps related is that the uninitialized elements have the READONLY flag set which initialized yet undefined elements don't have. And then there's the super-wacky REFCNT value. I'd think that the READONLY flag is used by exists() and delete() but this code doesn't test that guess. Care to spread some enlightenment about sparse arrays and maybe the other part as well?

use Devel::Peek; @ary[1,100,500,1000] = (); DumpArray(@ary); __DATA__ .... Elt No. 998 0x4284 SV = NULL(0x0) at 0x4284 REFCNT = 2147483634 FLAGS = (READONLY) Elt No. 999 0x106d0 SV = NULL(0x0) at 0x106d0 REFCNT = 1 FLAGS = ()

Replies are listed 'Best First'.
Re: Sparse arrays?
by MarkM (Curate) on Dec 18, 2002 at 07:36 UTC

    Explanation for observed behaviour:

    1. Perl arrays take the form of a C pointer to a pointer array.
    2. Each element in the pointer array refers to a single scalar value, or the value NULL.
    3. Arrays are automatically extended during assignment when they are indexed with an index that does not exist.
    4. Array extension initializes the newly created elements to the value NULL.
    5. From Perl, array elements with the value NULL appear to be the undef() value.
    6. The undef() value does not have any qualifiers, and is reused. All simple undef() values referenced from Perl can refer to the same scalar data structure internally, minimizing memory usage.
    7. The special scalar data structure used to indicate the undef() value is marked as an 'immortal scalar'.
    8. Immortal scalars are given artificially large reference counts to minimize the code path that normally decrements the reference count, and if the reference count reaches zero, deallocates the scalar. Immortals live even if the reference count reaches zero, so by setting the reference count to a very large value each time it reaches zero, the reference decrement code is simplified on the critical path.

    Summary:

    Perl supports sparse arrays of a sort. Arrays that are extended, but not explicitly initialized do not require nearly as much memory as arrays that are extended and explicitly initialized. On an architecture with 32-bit C pointers, an array of length 1000000, that has not been explicitly initialized, will require only 4000000 bytes of RAM.

Re: Sparse arrays?
by submersible_toaster (Chaplain) on Dec 18, 2002 at 06:18 UTC

    I point out immediatly that I have NFI about these kind of internals, but I certainly wonder or boggle at the refcount. Now when I execute this code on linux p5.6.1, I get that same value for refcount , 2147483634.

    So just for the hell of it, did the same on IRIX , and all the 'autovivified' but empty elements have 2147483566 as their refcount.

    This will probably never effect how I code perl in the slightest, but now I just HAVE to know - WTF? , what is this magic number all about? How is it different accross architectures ? (ok that's a bit of a dud question, since most numbers get handled differently) .. but eh!?

    submersible_toaster floats off to try and make 2 billion odd references to $i='JAPH';

    Update: ++ everyone , nice explanation MarkM.

      Bad Things would happen if that particular scalar were to be garbage collected.

        sadly I will never find out : ( .. . . .

        unless
        maybe if I take the proxy offline xmas day..

      Notice that the recount values you are examining only occur when the SV is NULL. The refcount doesn't matter when the SV is NULL and probably just hasn't been initialized. Depending on the platform and compiler, the refcount might be anything but it's a moot point because it isn't used.

      -sauoq
      "My two cents aren't worth a dime.";
      
      Actually the refcount for NULL is *set* to be that high. You would be very disappointed if all your undef's would be garbage collected :-0

      That number is 2^32 - minus the times that thing (NULL) has been "freed" (decreased refcount that is...)

      regards,

      janx

Re: Sparse arrays?
by BrowserUk (Patriarch) on Dec 18, 2002 at 06:27 UTC

    Isn't the point that you have allocated space for 1001 pointers to data elements, but no space for the elements themselves?

    My best guess on the REFCNT is that it doesn't get initialised until something is assigned to that element.


    Examine what is said, not who speaks.

      Update: If you didn't catch it, I was being sarcastic on the "wasted" space. It's itsy-bitsy by perl standards.

      Exactly! I was missing the forest for the trees. Take a close look at the memory addresses being used here. I swapped the undef elements for numeric 1 just to see a real SvIV. I think what actually happens here is that the ARRAY pointer array is fully allocated out to the limit of @ary. The unused elements appear to be null pointers or at least all share a single empty Sv struct with no allocated svpv structures. So actually... that array with 1001 elements appears to have consumed enough memory for five elements plus an additional 994 32 bit pointers for a whopping 3K of "wasted" space. If I'm wrong on what this actually looks like then that'd be really good to know.

      My interpretation is that arrays are sparse by default when there are uninitialized elements. Getting a sparse array to work nicely under iteration operators appears to require some additional footwork. This looks like the sort of thing you'd want to restrict to knowledgable perl users since it doesn't quite work with the normal perl idioms and is highly likely to confuse (but is a useful thing to have). If anyone knows of a different way to go directly from real element to real element that'd really keen.

      @ary[1,100,500,1000] = (1) x 4; # Iterate a sparse array exists($ary[$_]) and print "$_\n" for 0 .. $#ary; # Iterate a normal array print "$_\n" for @ary
      Elt No. 994 0x4284 SV = NULL(0x0) at 0x4284 REFCNT = 2147483634 FLAGS = (READONLY) Elt No. 995 0x4284 SV = NULL(0x0) at 0x4284 REFCNT = 2147483634 FLAGS = (READONLY) Elt No. 996 0x4284 SV = NULL(0x0) at 0x4284 REFCNT = 2147483634 FLAGS = (READONLY) Elt No. 997 0x4284 SV = NULL(0x0) at 0x4284 REFCNT = 2147483634 FLAGS = (READONLY) Elt No. 998 0x4284 SV = NULL(0x0) at 0x4284 REFCNT = 2147483634 FLAGS = (READONLY) Elt No. 999 0x706c SV = IV(0x12094) at 0x706c REFCNT = 1 FLAGS = (IOK,pIOK) IV = 1

      Fun Fun Fun in the Fluffy Chair

        If anyone knows of a different way to go directly from real element to real element that'd really keen.

        Easy-peasy:
        foreach my $element (@sparse) { if ( defined $element ) { # YOUR CODE HERE # } }
        defined() should be fast enough that calling it even a million times is no big deal.
        --
        Spring: Forces, Coiled Again!
Re: Sparse arrays?
by Enlil (Parson) on Dec 18, 2002 at 06:27 UTC
      I have always thought that sparse arrays was just another term for a hash.

      No. Hashes are unordered collections. You can impose an order on them by requiring ordered keys and then use a hash to implement a sparse array. Additional information, such as a normal dense array containing the sorted hash keys, are needed in order to allow iteration though.

      -sauoq
      "My two cents aren't worth a dime.";
      
        Additional information, such as a normal dense array containing the sorted hash keys, are needed in order to allow iteration though.

        Of course you can obtain that on the fly trivially with foreach( sort { $a <=> $b } keys %sparse ). It's really more of an efficiency question whether you want to keep that seperately or do it each time (and either way could be done transparently with a tied array).

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://220781]
Approved by Enlil
Front-paged by Courage
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others contemplating the Monastery: (6)
As of 2024-03-19 05:37 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found