Re: Sparse arrays?
by MarkM (Curate) on Dec 18, 2002 at 07:36 UTC
|
Explanation for observed behaviour:
- Perl arrays take the form of a C pointer to a pointer array.
- Each element in the pointer array refers to a single scalar value, or the value NULL.
- Arrays are automatically extended during assignment when they are indexed with an index that does not exist.
- Array extension initializes the newly created elements to the value NULL.
- From Perl, array elements with the value NULL appear to be the undef() value.
- The undef() value does not have any qualifiers, and is reused. All simple undef() values referenced from Perl can refer to the same scalar data structure internally, minimizing memory usage.
- The special scalar data structure used to indicate the undef() value is marked as an 'immortal scalar'.
- Immortal scalars are given artificially large reference counts to minimize the code path that normally decrements the reference count, and if the reference count reaches zero, deallocates the scalar. Immortals live even if the reference count reaches zero, so by setting the reference count to a very large value each time it reaches zero, the reference decrement code is simplified on the critical path.
Summary:
Perl supports sparse arrays of a sort. Arrays that are extended, but not explicitly initialized do not require nearly as much memory as arrays that are extended and explicitly initialized. On an architecture with 32-bit C pointers, an array of length 1000000, that has not been explicitly initialized, will require only 4000000 bytes of RAM.
| [reply] [Watch: Dir/Any] |
Re: Sparse arrays?
by submersible_toaster (Chaplain) on Dec 18, 2002 at 06:18 UTC
|
I point out immediatly that I have NFI about these kind of
internals, but I certainly wonder or boggle at the refcount.
Now when I execute this code on linux p5.6.1, I get that same value for refcount , 2147483634.
So just for the hell of it, did the same on IRIX , and all the 'autovivified' but
empty elements have 2147483566 as their refcount.
This will probably never effect how I code perl in the slightest, but now I just HAVE
to know - WTF? , what is this magic number all about? How is it different accross architectures ? (ok that's
a bit of a dud question, since most numbers get handled differently) .. but eh!?
submersible_toaster floats off to try and make 2 billion odd references to $i='JAPH';
Update: ++ everyone , nice explanation MarkM.
| [reply] [Watch: Dir/Any] [d/l] |
|
| [reply] [Watch: Dir/Any] |
|
sadly I will never find out : ( .. . . .
unless
maybe if I take the proxy offline xmas day..
| [reply] [Watch: Dir/Any] |
|
|
-sauoq
"My two cents aren't worth a dime.";
| [reply] [Watch: Dir/Any] |
|
| [reply] [Watch: Dir/Any] |
Re: Sparse arrays?
by BrowserUk (Patriarch) on Dec 18, 2002 at 06:27 UTC
|
Isn't the point that you have allocated space for 1001 pointers to data elements, but no space for the elements themselves?
My best guess on the REFCNT is that it doesn't get initialised until something is assigned to that element.
Examine what is said, not who speaks.
| [reply] [Watch: Dir/Any] |
|
Update: If you didn't catch it, I was being sarcastic on the "wasted" space. It's itsy-bitsy by perl standards.
Exactly! I was missing the forest for the trees. Take a close look at the memory addresses being used here. I swapped the undef elements for numeric 1 just to see a real SvIV. I think what actually happens here is that the ARRAY pointer array is fully allocated out to the limit of @ary. The unused elements appear to be null pointers or at least all share a single empty Sv struct with no allocated svpv structures. So actually... that array with 1001 elements appears to have consumed enough memory for five elements plus an additional 994 32 bit pointers for a whopping 3K of "wasted" space. If I'm wrong on what this actually looks like then that'd be really good to know.
My interpretation is that arrays are sparse by default when there are uninitialized elements. Getting a sparse array to work nicely under iteration operators appears to require some additional footwork. This looks like the sort of thing you'd want to restrict to knowledgable perl users since it doesn't quite work with the normal perl idioms and is highly likely to confuse (but is a useful thing to have). If anyone knows of a different way to go directly from real element to real element that'd really keen.
@ary[1,100,500,1000] = (1) x 4;
# Iterate a sparse array
exists($ary[$_]) and print "$_\n" for 0 .. $#ary;
# Iterate a normal array
print "$_\n" for @ary
Elt No. 994 0x4284
SV = NULL(0x0) at 0x4284
REFCNT = 2147483634
FLAGS = (READONLY)
Elt No. 995 0x4284
SV = NULL(0x0) at 0x4284
REFCNT = 2147483634
FLAGS = (READONLY)
Elt No. 996 0x4284
SV = NULL(0x0) at 0x4284
REFCNT = 2147483634
FLAGS = (READONLY)
Elt No. 997 0x4284
SV = NULL(0x0) at 0x4284
REFCNT = 2147483634
FLAGS = (READONLY)
Elt No. 998 0x4284
SV = NULL(0x0) at 0x4284
REFCNT = 2147483634
FLAGS = (READONLY)
Elt No. 999 0x706c
SV = IV(0x12094) at 0x706c
REFCNT = 1
FLAGS = (IOK,pIOK)
IV = 1
Fun Fun Fun in the Fluffy Chair | [reply] [Watch: Dir/Any] [d/l] [select] |
|
If anyone knows of a different way to go directly from real element to real element that'd really keen.
Easy-peasy:
foreach my $element (@sparse) {
if ( defined $element ) {
# YOUR CODE HERE #
}
}
defined() should be fast enough that calling it even a million times is no big deal.
-- Spring: Forces, Coiled Again!
| [reply] [Watch: Dir/Any] [d/l] |
|
Re: Sparse arrays?
by Enlil (Parson) on Dec 18, 2002 at 06:27 UTC
|
| [reply] [Watch: Dir/Any] |
|
I have always thought that sparse arrays was just another term for a hash.
No. Hashes are unordered collections. You can impose an order on them by requiring ordered keys and then use a hash to implement a sparse array. Additional information, such as a normal dense array containing the sorted hash keys, are needed in order to allow iteration though.
-sauoq
"My two cents aren't worth a dime.";
| [reply] [Watch: Dir/Any] |
|
| [reply] [Watch: Dir/Any] [d/l] |