Allocation of anonymous arrays

OwlHoot has asked for the wisdom of the Perl Monks concerning the following question:


Re: Allocation of anonymous arrays
by dsheroh (Monsignor) on Feb 07, 2014 at 10:53 UTC

extremely

Hash keys are strings, not scalars. So, when you try to use an array ref as a hash key, it gets stringified and your hash key is the resulting string (e.g., the literal text "ARRAY(0xdeadbeef)"), not the original reference. If he really wants 'random' unique keys for the hash, he could just as well use [] for all the keys. (Or, really, if you're going to use random garbage keys for your hash, you may as well use an array instead, since you won't be able to do key-based lookups anyhow.)

As for your actual question, meditate upon this:

$ perl -e 'print [] . "\n" . [] . "\n" . [] . "\n";'
ARRAY(0x9e557ec)
ARRAY(0x9e6ef80)
ARRAY(0x9e6efbc)

Re: Allocation of anonymous arrays
by Athanasius (Archbishop) on Feb 07, 2014 at 09:46 UTC

Hello OwlHoot, and welcome to the Monastery!

Consider:

my @foo = (42, 43, 45);
my @bar = (42, 43, 45);

Would Perl see that both arrays contain the same values, and therefore “optimise” the storage by having @foo and @bar refer to the same set of storage locations? No, because these arrays must be allowed to change independently. For example, incrementing $foo[0] should have no impact on the value of $bar[0].

Now, an anonymous array is just an array which is accessed by a reference rather than a name. Two anonymous arrays which happen to share the same data cannot be optimised to refer to common storage, because either is free to change independently of the other as the script runs.
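
A minimal sketch of that point, assuming nothing beyond core Perl (the variable names $x and $y are made up for illustration). It compares the addresses of two anonymous arrays built from identical literals and then changes only one of them:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Two anonymous arrays built from identical literals.
my $x = [42, 43, 45];
my $y = [42, 43, 45];

# Each [ ... ] constructor allocates its own array, so the addresses differ.
print "distinct storage\n" if refaddr($x) != refaddr($y);

# Changing one array leaves the other untouched.
$x->[0]++;
print "x: @$x\n";   # x: 43 43 45
print "y: @$y\n";   # y: 42 43 45
```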

So, I think your colleague is correct. But — please explain the motivation for using array references as hash keys in this way?!?

Hope that helps,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,


Re^2: Allocation of anonymous arrays

by OwlHoot (Novice) on Feb 07, 2014 at 11:19 UTC

Thanks for the reply, Athanasius (and to everyone who has replied so promptly).

With reference to your first example, in theory I would have thought perl (or "a Similar Language") could indeed have @foo and @bar refer to the same storage initially. Then as soon as any code wanted to amend one of them, or even define a reference to them, perl could create a copy and start using that.

(Granted this probably wouldn't be a good idea in practice because most arrays defined explicitly will usually be amended, or references to them defined, during the subsequent running of the program.)

Literal array references not initially assigned to a variable seem even more suitable for this "copy on write" treatment, as they are not initially, and quite likely never will be, amended or pointed to explicitly. But I can see that in principle one could have something like:

  my %fred =
  (
    [1, 2, 3] => [1, 1, 0],
    [3, 4, 2] => [1, 0, 1],
    [2, 2, 1] => [1, 1, 0],
  );

  my $i = 0;

  foreach my $ref (keys %fred)
  {
    $ref->[2] = ++$i;
  }

So I'm still not entirely convinced one way or the other!

As for why my colleague chose this system, it is a test script in which each key array represents a test case in compact form and the value represents outcome flags. (I would have represented these in a rather different way, all in one array for example, but each to their own.)

Regards

John R Ramsden


Re^3: Allocation of anonymous arrays

by andal (Hermit) on Feb 07, 2014 at 13:14 UTC

So I'm still not entirely convinced one way or the other!

Sorry guys (you and your colleague), you are arguing about silly things. Of course, your colleague is right. Perl will never optimize away arrays with the same content. It is just silly for perl to run through all arrays trying to see if they have the same content. But your colleague is completely wrong using array references as keys.

Have you actually tried to run the code you gave above? If you run it with "use strict;" then you'll get the error "Can't use string ("ARRAY(0xc80de8)") as an ARRAY ref while "strict refs" in use". This is because the array that you gave when you created the hash is already gone, so perl would try to create a new NAMED array, and the name for that array would be ARRAY(0xc80de8) or whatever unique string identified the original array.
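
A small sketch that reproduces the error without killing the script, assuming a one-entry version of the hash from the question. The eval block traps the exception so the message can be printed:

```perl
use strict;
use warnings;

my %fred = ( [1, 2, 3] => [1, 1, 0] );
my ($key) = keys %fred;

# The key is a plain string, so ref() returns the empty string.
print "key '$key' is ", (ref $key ? "a reference" : "a plain string"), "\n";

# Dereferencing the string under "strict refs" dies; eval traps the error.
my $ok = eval { $key->[2] = 1; 1 };
print "error: $@" unless $ok;
```

Without "use strict", the same assignment would quietly go through a symbolic reference to a package array whose name is the key string, which is rarely what anyone wants.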


Re^3: Allocation of anonymous arrays

by AnomalousMonk (Archbishop) on Feb 07, 2014 at 14:22 UTC

... perl ... could indeed have @foo and @bar refer to the same storage initially. Then as soon as any code wanted to amend one of them, or even define a reference to them, perl could create a copy and start using that.

(Granted this probably wouldn't be a good idea ...)

You're right: Perl could do that, and it's a bad idea, and so Perl doesn't do that.

... each key array represents a test case in compact form ...

The representational form is compact, indeed: it is the nullity; it has ceased to be; bereft of life, it rests in peace. As soon as the anonymous array constructor [ ... ] finishes its job, the reference it returns is immediately converted into a string and ceases to exist as a reference. Because the referent can no longer be accessed in any way whatsoever (because it has no reference), it is marked for garbage collection (its reference count is zero) and, in the fullness of time, it softly and silently vanishes away.
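
One way to watch this happen is with a weak reference, which does not keep its referent alive. This is only a sketch with made-up variable names ($aref, $watch). Note that Perl's reference counting frees the array as soon as the last strong reference goes away, so there is no waiting involved:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $aref = [1, 2, 3];

# A weak reference does not count towards the array's reference count,
# so it can be used to observe when the array is freed.
my $watch = $aref;
weaken($watch);

# Using the reference as a hash key stores only its string form.
my %fred = ( $aref => [1, 1, 0] );

# Drop the last strong reference; the hash key does not count as one.
undef $aref;

print defined $watch ? "array still alive\n" : "array has been freed\n";
```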


Re^3: Allocation of anonymous arrays

by OwlHoot (Novice) on Feb 07, 2014 at 11:33 UTC

  foreach my $ref (keys %fred)
  {
    $fred{$ref}->[2] = ++$i;
  }

Re^4: Allocation of anonymous arrays

by Anonymous Monk on Feb 07, 2014 at 11:49 UTC

Re: Allocation of anonymous arrays
by Discipulus (Canon) on Feb 07, 2014 at 09:38 UTC

@widowzDoubleQuotation> perl -MData::Dumper -e "%h = ([1, 2, 3] => [1, 1, 0],[3, 4, 5] => [1, 1, 0],[1, 1, 0] => [1, 2, 1],[0, 2, 4] => [1, 2, 1],); print Dumper \%h; print map {qq!$_ is a !.ref($_).qq!\n!} keys %h"

__OUTPUT__
$VAR1 = {
          'ARRAY(0x1d46694)' => [
                                  1,
                                  2,
                                  1
                                ],
          'ARRAY(0x1ca21d4)' => [
                                  1,
                                  2,
                                  1
                                ],
          'ARRAY(0x1ca15f4)' => [
                                  1,
                                  1,
                                  0
                                ],
          'ARRAY(0x6eb01c)' => [
                                 1,
                                 1,
                                 0
                               ]
        };
ARRAY(0x1d46694) is a
ARRAY(0x1ca21d4) is a
ARRAY(0x1ca15f4) is a
ARRAY(0x6eb01c) is a ##undef ie is not a ref but a bare string

There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.


Re: Allocation of anonymous arrays
by shmem (Chancellor) on Feb 07, 2014 at 12:56 UTC

The code accesses this using the array references as a key, and as the "key" arrays are all distinct these references must be unique. So no problem there.

No problem except that the keys of that hash aren't anonymous arrays. They are strings; keys of hashes are always strings. You cannot get back at the anonymous array from its string representation, which is there only as a label. Instead of the hex number attached to it (which actually is the address of a C structure), it could just as well have the md5 checksum over the array members attached.

my %fred =
 (
   [1, 2, 3] => [1, 1, 0],
   [3, 4, 5] => [0, 1, 0],
   [0, 2, 4] => [1, 2, 1],
 );
for my $k (keys %fred) {
   print "$k element 0: '",$k->[0],"'\n";
   print "array: (",join(",",@$k),")\n";
}
__END__
ARRAY(0x17ba658) element 0: ''
array: ()
ARRAY(0x17b3b80) element 0: ''
array: ()
ARRAY(0x1796998) element 0: ''
array: ()

You might suspect that the anonymous arrays went out of scope after the keys were generated from them, so storing the arrays somewhere would keep them alive. That's true, but even so, the original arrays are not accessible via their string representation:

my @ary = ([1, 2, 3],[3, 4, 5],[0, 2, 4]);
my %fred =
 (
   $ary[0] => [1, 1, 0],
   $ary[1] => [0, 1, 0],
   $ary[2] => [1, 2, 1],
 );
for my $k (keys %fred) {
   print "$k element 0: '",$k->[0],"'\n";
   print "array: (",join(",",@$k),")\n";
}
__END__
ARRAY(0x19cca60) element 0: ''
array: ()
ARRAY(0x19ccb80) element 0: ''
array: ()
ARRAY(0x19af998) element 0: ''
array: ()

If you want a reversible hash which allows you to get at the values of the arrays identified by their stringy names, you need to construct two hashes: one to map the strings to the arrays, and the hash with those strings as keys and values:

my @ary =  (
   [1, 2, 3], [1, 1, 0],
   [3, 4, 5], [0, 1, 0],
   [0, 2, 4], [1, 2, 1],
);

my (@arystrings, %aryhash);

for my $ary ( @ary ) {
    $aryhash{$ary} = $ary;
    push @arystrings, "$ary"; # stringify: same string as the key above
}

my %fred = @arystrings; # treats @arystrings as (key,value,key,value,...) list

print_stuff();

%fred = reverse %fred; # reverse hash
print_stuff();

sub print_stuff {
    for my $k (keys %fred) {
        print "$k element 0: '",$aryhash{$k}->[0],"'\n";
        print "array: (",join(",",@{$aryhash{$k}}),")\n";
    }
}
__END__
ARRAY(0x25e6998) element 0: '1'
array: (1,2,3)
ARRAY(0x2603b80) element 0: '3'
array: (3,4,5)
ARRAY(0x260f948) element 0: '0'
array: (0,2,4)
ARRAY(0x260f8d0) element 0: '0'
array: (0,1,0)
ARRAY(0x260f9c0) element 0: '1'
array: (1,2,1)
ARRAY(0x2603a60) element 0: '1'
array: (1,1,0)

Note that keys produces the keys of a hash in random order.
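
If the array contents themselves are what identify a test case, a simpler alternative is to build the key from the contents instead of from the address. The following is only a sketch under that assumption: the @cases layout and the comma separator are invented for illustration, and two identical test cases would collide on the same key.

```perl
use strict;
use warnings;

# Hypothetical test table: each entry pairs a test case with its outcome flags.
my @cases = (
    [ [1, 2, 3], [1, 1, 0] ],
    [ [3, 4, 5], [0, 1, 0] ],
    [ [0, 2, 4], [1, 2, 1] ],
);

# Build each key from the array's contents rather than its address.
my %fred;
for my $case (@cases) {
    my ($input, $flags) = @$case;
    $fred{ join ',', @$input } = $flags;
}

# The key can be split again to recover the original values.
for my $key (sort keys %fred) {
    my @input = split /,/, $key;
    print "case (@input) => flags (@{ $fred{$key} })\n";
}
```

Because the key is derived from the data, it stays meaningful after the original array is gone, unlike the ARRAY(0x...) strings above.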

perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'


Re: Allocation of anonymous arrays (ref addr)
by Anonymous Monk on Feb 07, 2014 at 09:36 UTC

Any ideas?

An explanation of what problem you're trying to solve would be helpful

At first look (at this late hour), the way that hash is populated/used looks kinda silly and pointless :)

Perl will happily reuse reference addresses ... since the hash keys are strings, as soon as the references are gone, their refaddrs are free for reuse by perl

If you want unique keys you're better off using a UUID or some such like Session::Token - Portable, secure, efficient, simple random session token generation that satisfies those OWASP recommendations


Re^2: Allocation of anonymous arrays (ref addr repeats)

by Anonymous Monk on Feb 07, 2014 at 09:53 UTC

fw() for 1 .. 10;
sub fw {
    my %uniq;
    for my $ix ( 0 .. 100 ){
        for my $key ( wf() ){
            my $count = $uniq{$key}++;
            if( $count > 1 ){
                my $keycount = keys %uniq;
                my $buckets  = %uniq;
                print "started repeating at iteration $ix after only $keycount in $buckets buckets \n";
                return;
            }
        }
    }
}
sub wf {
    my %f;
    $f{ [] } = [0];
    $f{ [] } = [1];
    $f{ [] } = [2];
    $f{ [] } = [3];
    return keys %f;
}
__END__
started repeating at iteration 8 after only 26 in 20/32 buckets
started repeating at iteration 4 after only 14 in 13/32 buckets
started repeating at iteration 5 after only 17 in 14/32 buckets
started repeating at iteration 5 after only 17 in 13/32 buckets
started repeating at iteration 5 after only 19 in 15/32 buckets
started repeating at iteration 5 after only 18 in 16/32 buckets
started repeating at iteration 6 after only 18 in 16/32 buckets
started repeating at iteration 7 after only 20 in 15/32 buckets
started repeating at iteration 6 after only 18 in 14/32 buckets
started repeating at iteration 7 after only 26 in 21/32 buckets

Re^3: Allocation of anonymous arrays (ref addr repeats)

by Discipulus (Canon) on Feb 07, 2014 at 11:36 UTC


Re^4: Allocation of anonymous arrays (ref addr repeats)

by Anonymous Monk on Feb 07, 2014 at 11:46 UTC

Re: Allocation of anonymous arrays
by LanX (Saint) on Feb 07, 2014 at 15:23 UTC

But the deeper problem, as others have already mentioned, must be stressed:

Perl has no way to get a reference back from its stringification!
It's a one-way street...

Then reversing this hash is like treating a marijuana addict with cocaine.

This whole technique is pointless as long as you don't manually store a lookup-hash to be able to transform key-string to ref.

  DB<107> $aref=[1,2,3]; $lookup{$aref}=$aref
 => [1, 2, 3]

  DB<108> \%lookup
 => { "ARRAY(0x8ffd450)" => [1, 2, 3] }

Tell your colleague there is no way to use literal arrays here, because the information gets lost.¹

Cheers Rolf

( addicted to the Perl Programming Language)

PS: as a side note, Python allows other data-types to be keys, but only if they are immutable ... like literal strings are.

update

¹) As long as he doesn't use a tied hash from a fancy CPAN module


Re^2: Allocation of anonymous arrays

by AnomalousMonk (Archbishop) on Feb 07, 2014 at 15:44 UTC

This whole technique is pointless as long as you don't manually store a lookup-hash to be able to transform key-string to ref.

But then, in the example given, you have to have the value of $aref (i.e., the reference) in order to stringize it and use it to look up the value of $aref, which also seems pointless.


Re^3: Allocation of anonymous arrays

by LanX (Saint) on Feb 07, 2014 at 16:11 UTC

You don't have to keep all $arefs after construction.

  DB<112> for $aref ( [1,2,3],[4,5,6],[7,8,9] ) { 
                      $lookup{$aref} = $aref; 
                      $hash{$aref}   = [ reverse @$aref ]; 
                      }
 => ""

  DB<113> \%hash
 => {
  "ARRAY(0xa547e40)" => [9, 8, 7],
  "ARRAY(0xa5c2e68)" => [6, 5, 4],
  "ARRAY(0xa5c3188)" => [3, 2, 1],
}

  DB<114> \%lookup
 => {
  "ARRAY(0xa547e40)" => [7, 8, 9],
  "ARRAY(0xa5c2e68)" => [4, 5, 6],
  "ARRAY(0xa5c3188)" => [1, 2, 3],
}

  DB<115> print "@{$lookup{$_}}\n" for keys %hash
7 8 9
4 5 6
1 2 3

Hiding all of this behind a tied hash should be feasible (depending on implementation details of Tie::Hash, in other words when, where and how "stringization" happens).

Anyway I didn't try to find such implementations on CPAN.
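
For what it's worth, the core module Tie::RefHash does this kind of thing: it keeps the real references as keys of the tied hash. A minimal sketch, reusing the %fred name from earlier in the thread (the data is just an example):

```perl
use strict;
use warnings;
use Tie::RefHash;

# Tie the hash so that reference keys are kept as references.
tie my %fred, 'Tie::RefHash';

$fred{ [1, 2, 3] } = [1, 1, 0];
$fred{ [3, 4, 5] } = [0, 1, 0];

for my $key (keys %fred) {
    # With the tie in place, keys come back as real array references.
    print "(@$key) => (@{ $fred{$key} })\n";
}
```

The tied hash holds on to the key references, so the key arrays stay alive for as long as their entries do. Lookups still go by identity, not by content: a second [1, 2, 3] is a different key.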

One use case could be to implement sets of complex data structures including set operations

Cheers Rolf

( addicted to the Perl Programming Language)


Re^4: Allocation of anonymous arrays

by AnomalousMonk (Archbishop) on Feb 08, 2014 at 16:07 UTC

Re^5: Allocation of anonymous arrays

by LanX (Saint) on Feb 08, 2014 at 16:19 UTC

Re: Allocation of anonymous arrays
by sundialsvc4 (Abbot) on Feb 07, 2014 at 15:20 UTC

No, Perl won’t put two arrays into the same storage just because (at the moment ...) their values are identical. But ... in practice, it sure is easy to come into a situation where you think that it did ... where you think that you’re just changing a value “over here,” and ... (gasp!) ... a value “over there” just changed, too!

What will actually turn out to have happened, in those cases, is that you had a reference to the same list of values in two or more places ... that you thought that you were telling Perl to move (to duplicate) an array or a hash (something that requires to be “referenced” ...). But what Perl was actually doing was creating references. It seemed to work, until you changed what you thought was an independent, isolated value, and saw that the changes had apparently propagated. Easy to do. And the [erroneous ...] explanation, that the optimizer did something wrong, is also an intuitively-appealing assumption.
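
A short sketch of the difference, with made-up variable names. Taking a reference shares the storage, while a list assignment copies the elements:

```perl
use strict;
use warnings;

my @original = (1, 2, 3);

my $alias = \@original;     # a reference: same storage as @original
my @copy  = @original;      # a list assignment: independent storage

# Writing through the reference changes @original but not @copy.
$alias->[0] = 99;

print "original: @original\n";   # original: 99 2 3
print "copy:     @copy\n";       # copy:     1 2 3
```

Note that the list assignment is only a shallow copy: if the elements were themselves references, both arrays would still point at the same inner data.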

Perl tries hard to be a “DWIM = Do What I Mean, TMTOWTDI = There’s More Than One Way To Do It™” language, and so its very-flexible syntax is occasionally misleading. If you are used to dealing with strongly-typed and/or compiled languages where there’s really only one way to do it and where any deviations from that will be caught for you at compile time ... Perl isn’t like that. Its design is not like that, which is neither right nor wrong.

Re^2: Allocation of anonymous arrays

by Bloodnok (Vicar) on Feb 08, 2014 at 10:56 UTC

A user level that continues to overstate my experience :-))


Re: Allocation of anonymous arrays
by sundialsvc4 (Abbot) on Feb 07, 2014 at 17:11 UTC

Sometimes the value that serves as the complete key to a particular value is suitable for use as a hash key that leads to exactly one value. Even so, it can work almost as well to formulate a hash key that potentially leads to a list of values, through which an appropriate accessor method has to search. A well-made accessor makes that easy and invisible. For that matter, you could have several hashrefs and multiple references, much like the indexes that point into a DB table.
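
A rough sketch of that idea, assuming invented sample data (@records) and a hypothetical accessor named find_value. Each hash key leads to a list of values, and the accessor hides the search through that list:

```perl
use strict;
use warnings;

# Hypothetical records: several entries may share the same key.
my @records = (
    [ 'alpha', 1 ],
    [ 'beta',  2 ],
    [ 'alpha', 3 ],
);

# Each key maps to an array of every value filed under it.
my %index;
push @{ $index{ $_->[0] } }, $_->[1] for @records;

# Accessor that hides the search through the list behind one call.
sub find_value {
    my ($key, $wanted) = @_;
    return grep { $_ == $wanted } @{ $index{$key} || [] };
}

print "alpha: @{ $index{alpha} }\n";   # alpha: 1 3
print "found\n" if find_value('alpha', 3);
```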

