Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

Hashes do preserve insertion order after all

by kikuchiyo (Pilgrim)
on Jul 31, 2019 at 16:00 UTC ( #11103651=perlmeditation: print w/replies, xml ) Need Help??

It is often repeated that hashes in Perl do not preserve insertion order. Yet:

#!/usr/bin/perl use strict; use warnings; use feature qw/say/; my %hoh = ( foo => { value => 'first' }, bar => { value => 'second' }, baz => { value => 'third' }, ); for my $elem (sort values %hoh) { say "value => " . $elem->{value}; }

Output:

value => first value => second value => third

Edit and clarification: I'm not saying that you should ever use this (in fact, I'm saying that you shouldn't, the comments below describe why). It looks like as if it really preserved insertion order, but actually the trick relies on implementation details (memory allocation) you can't (or shouldn't want to) control.

Replies are listed 'Best First'.
Re: Hashes do preserve insertion order after all
by roboticus (Chancellor) on Jul 31, 2019 at 16:16 UTC

    kikuchiyo:

    No, it's letting you sort by the address, as you mention. So it's giving you the creation order instead of the insertion order.

    #!/usr/bin/perl use strict; use warnings; use feature qw/say/; my %hoh = ( foo => { value => 'first' }, bar => { value => 'second' }, baz => { value => 'third' }, ); my %h2; $h2{first} = $hoh{baz}; $h2{second} = $hoh{foo}; $h2{third} = $hoh{bar}; for my $elem (sort values %h2) { say "value => " . $elem->{value}; }

    Here, I'm inserting the same items into another hash in a different order, yet it returns the same order as the first.

    $ perl p.pl value => first value => second value => third

    The creation order isn't guaranteed, either. Once you start adding and deleting objects over a longer time period, deleted chunks of memory can be reused later breaking the correlation of address and creation time.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

Re: Hashes do preserve insertion order after all
by LanX (Archbishop) on Jul 31, 2019 at 16:20 UTC
    I've seen quite often that Perl was reusing reference addresses after garbage collection had them released before.

    So how do you want to be sure they really grow chronologically?

    IIRC there are sufficient solutions on CPAN for an "ordered" hash by tieing an array to the hash.

    What's wrong with them?

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery FootballPerl is like chess, only without the dice

      I've seen quite often that Perl was reusing reference addresses after garbage collection had them released before.

      I was curious as to how often this address reusing happens:

      #!/bin/bash for i in {1..15}; do perl -Mstrict -wle' use strict; use warnings; my %addr; my $count = 0; my $tries = 2**$ARGV[0]; for my $try (1..$tries) { my %h; for (reverse 1..1000) { $h{$_} = { value => $_}; $count++; } $addr{$_}++ for values %h; } my $reused = scalar grep $addr{$_}>1, keys %addr; my $addrs = scalar keys %addr; print "Address of reference reused $reused times, all addresses $addrs +, in $count tries";' $i done
      Address of reference reused 1 times, all addresses 1999, in 2000 tries Address of reference reused 1001 times, all addresses 2999, in 4000 tr +ies Address of reference reused 2995 times, all addresses 3007, in 8000 tr +ies Address of reference reused 3011 times, all addresses 3023, in 16000 t +ries Address of reference reused 3043 times, all addresses 3055, in 32000 t +ries Address of reference reused 3107 times, all addresses 3119, in 64000 t +ries Address of reference reused 3235 times, all addresses 3247, in 128000 +tries Address of reference reused 3491 times, all addresses 3503, in 256000 +tries Address of reference reused 4003 times, all addresses 4015, in 512000 +tries Address of reference reused 5027 times, all addresses 5039, in 1024000 + tries Address of reference reused 7075 times, all addresses 7087, in 2048000 + tries Address of reference reused 11171 times, all addresses 11183, in 40960 +00 tries Address of reference reused 19363 times, all addresses 19375, in 81920 +00 tries Address of reference reused 35747 times, all addresses 35759, in 16384 +000 tries Address of reference reused 68515 times, all addresses 68527, in 32768 +000 tries

      These numbers seem to be stable on my machine across multiple runs.

      The bottom line is that address reusing happens all the time.

        I suppose Perl is keeping its memory footprint low by reusing them.

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery FootballPerl is like chess, only without the dice

       What's wrong with them?

      efficiency?

        Elaborate?

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery FootballPerl is like chess, only without the dice

Re: Hashes do preserve insertion order after all
by swl (Deacon) on Jul 31, 2019 at 22:49 UTC
Re: Hashes do preserve insertion order after all
by bliako (Vicar) on Aug 01, 2019 at 00:27 UTC

    Allegedly the solution to a common headache I had recently: save control points to a curve path where order of insertion is important - shape depends on it. But also delete a control point in the middle of the path (by key): meaning retain insertion order and at the same time search/insert/delete O(1) by key ... hmmm

    kikuchiyo please prepare a test case to prove your hack's worth (especially when memory is not large - as others here said already).

    Here is a thought:

    #!/usr/bin/perl use strict; use warnings; use feature qw/say/; my %hoh = ( foo => { value => 'second', time => time}, bar => { value => 'first', time => time}, baz => { value => 'third', time => time}, ); for my $elem (sort {$a->{time} <=> $b->{time}} values %hoh) { say "value => " . $elem->{value}; }

    which does not work because time() granularity is too thick/un-fine. Using DateTime::HiRes is not a guarantee either.

    Should one be using instruction-based counting?

    Perhaps Perl has a monotonically increasing counter, sort of like an incrementing-only program counter/instruction pointer? Alas, even than can not guarantee one instruction to insert only one hash element (can it?).

    A very interesting problem which has solutions but not EFFICIENT solutions. This is one of the most ingenious I have seen.

    bw, bliako

    (edit: aesthetical edits a minute after posting)

      BTW, of course this works...:

      #!/usr/bin/perl use strict; use warnings; use feature qw/say/; my $HC = 0; my %hoh = ( foo => { value => 'second', time => $HC++}, bar => { value => 'first', time => $HC++}, baz => { value => 'third', time => $HC++}, ); for my $elem (sort {$a->{time} <=> $b->{time}} values %hoh) { say "value => " . $elem->{value}; }

      bw, bliako

      All the criticism against this hack brought up in the other replies is valid.

      I see one context where it might still have a viable use case: in very short, one shot, one liner scripts that build one data structure, transform or filter it some way then print the result. This is the context where I've discovered it by the way.

      Example: I had a complex, multilevel JSON, and I wanted a flat list of some of the values from a sub-hash. The json_xs command line script from the JSON::XS module has a mode to execute arbitrary code before printing the result, so I used the following one liner:

      json_xs -t string -e 'my $x; for (sort values %{ $_->{database}->{data +bases} }) {$x .= $_->{dbname}. $/;} $_=$x' < conf/test.json

      In this (and only this) context this hack can be used quite reliably, and it has the advantage that it is concise and does not require any external modules (and presumably it is fast, because all it uses is the built-in reference stringification and sort's default lexical sorting).

Re: Hashes do preserve insertion order after all
by Anonymous Monk on Aug 04, 2019 at 21:51 UTC
    When you start to delete things you'll see that the ordering quickly breaks and once broken does not return.
      The OP was explicitly sorting by ref address of the values.

      No matter how many hash entries you delete this order won't change.

      The real problems with this approach were already addressed.

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery FootballPerl is like chess, only without the dice

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://11103651]
Approved by choroba
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (6)
As of 2019-09-19 03:00 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    The room is dark, and your next move is ...












    Results (238 votes). Check out past polls.

    Notices?