Creating hash of arrays (in a faster way)

by Anonymous Monk
on Nov 01, 2013 at 09:27 UTC (#1060727=perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I'm just wondering whether it is possible to create a hash of arrays using a faster method than the second solution in the code below:

# WRONG
{
    my %hash;
    my @cols = qw(min max sum);
    my @values = qw(1 3 4);
    @hash{@cols} = @values;
    use Data::Dumper;
    print Dumper \%hash;
}

# CORRECT (but slow)
{
    my %hash;
    my @cols = qw(min max sum);
    my @values = qw(1 3 4);
    foreach my $col (@cols) {
        push @{$hash{$col}}, shift @values;
    }
    use Data::Dumper;
    print Dumper \%hash;
}
Thanks in advance.

Replies are listed 'Best First'.
Re: Creating hash of arrays (in a faster way)
by BrowserUk (Pope) on Nov 01, 2013 at 11:00 UTC

    I'd do it this way. I doubt it is any faster than yours, but destroying the input array by a thousand paper cuts seems gratuitous:

    @cols = qw(min max sum min max sum);;
    @values = qw(1 3 4 7 8 9);;
    push @{ $hash{ $cols[ $_ ] } }, $values[ $_ ] for 0 .. $#values;;
    pp \%hash;;
    { max => [3, 8], min => [1, 7], sum => [4, 9] }

    I think the real question here is: where do the input arrays come from? Is it necessary to build them before constructing the hash, or could the hash be built directly from wherever the arrays are populated from?
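As a minimal sketch of building the hash directly from the source, assume (purely for illustration) that the data arrives as a header line followed by whitespace-separated rows. The in-memory `$input` string stands in for whatever file or stream the arrays are really populated from:

```perl
use strict;
use warnings;

# Hypothetical input: a header line followed by whitespace-separated rows.
my $input = <<'END';
min max sum
1 3 4
7 8 9
END

open my $fh, '<', \$input or die "Cannot open in-memory file: $!";

# The first line names the columns.
my @cols = split ' ', scalar <$fh>;

my %hash;
while ( my $line = <$fh> ) {
    my @values = split ' ', $line;

    # Append each field to the array for its column, without
    # keeping a separate array of all values around.
    push @{ $hash{ $cols[$_] } }, $values[$_] for 0 .. $#values;
}
close $fh;

# %hash is now ( min => [1, 7], max => [3, 8], sum => [4, 9] )
```

Each row is consumed as it is read, so no intermediate `@values` array covering the whole input is ever built.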

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Creating hash of arrays (in a faster way)
by choroba (Chancellor) on Nov 01, 2013 at 09:35 UTC
    You can use map to fix the first method.
    @hash{@cols} = map [$_], @values;

    The speed can be measured by Benchmark.
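A rough sketch of such a measurement with the core Benchmark module follows. The sub names and the iteration count are arbitrary choices for illustration, and the loop version works on a copy of `@values` so that every iteration starts from the same data:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @cols   = qw(min max sum);
my @values = qw(1 3 4);

# The loop from the question, run on a copy so @values survives.
sub with_push_shift {
    my @copy = @values;
    my %hash;
    push @{ $hash{$_} }, shift @copy for @cols;
    return \%hash;
}

# The map-based hash slice assignment.
sub with_map {
    my %hash;
    @hash{@cols} = map { [$_] } @values;
    return \%hash;
}

# Run each sub a fixed number of times and print a comparison table.
cmpthese( 500_000, {
    push_shift => \&with_push_shift,
    map_slice  => \&with_map,
} );
```

Note that the copy in `with_push_shift` adds a small cost of its own, so the numbers are only a rough guide. With three elements the difference may well be lost in the noise.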

    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      I would assume from the data structure that the arrays might be intended to contain more than one value, although that has not been demonstrated in the code of the question.

      My approach, keeping with the spirit of the push as I perceive it, is below. It's less ugly than what I thought it would be:

      for my $row ( @hash{ @cols }) { push @$row, shift @values }
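To illustrate how that loop behaves when the arrays receive more than one value, here is a small sketch. The `@rows` sample data is hypothetical; the loop itself is the one above, applied once per row:

```perl
use strict;
use warnings;

my @cols = qw(min max sum);
my @rows = ( [ 1, 3, 4 ], [ 7, 8, 9 ] );    # hypothetical sample rows

my %hash;
for my $r (@rows) {
    my @values = @$r;    # copy, because shift consumes the array

    # Each $slot is an alias to $hash{$col}; push autovivifies the
    # array reference on the first pass and appends on later ones.
    for my $slot ( @hash{@cols} ) {
        push @$slot, shift @values;
    }
}

# %hash is now ( min => [1, 7], max => [3, 8], sum => [4, 9] )
```

The hash slice is evaluated once per row, so the per-element hash lookup by key disappears from the inner loop body.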

      I also thought about an approach that avoids the loop, but my solutions only shuffle the loop around, and doing weirdo aliasing tricks through @_ did not lead to a good solution for me. It seems that array-assigning to the aliased @_ does not do individual assignment to the slots:

      sub alias { \@_ };
      {
          my %hash;
          my @cols = qw(min max sum);
          my @values = qw(1 3 4);
          @{ alias map { $_->[ 0+ @$_ ] } @hash{ @cols } }= @values;
          print Dumper \%hash;
      }

      Avoiding the magic aliasing and passing around references does not really improve the situation, even though it works.

      {
          my %hash;
          my @cols = qw(min max sum);
          my @values = qw(1 3 4);
          for( map { \$_->[ 0+@$_ ] } @hash{ @cols } ) {
              $$_= shift @values;
          };
          print Dumper \%hash;
      }
        You're correct. The arrays are intended to contain more than one value.
        Iterating over the values of the hash seems to be quite a bit faster. Thank you very much.
Re: Creating hash of arrays (in a faster way)
by Laurent_R (Abbot) on Nov 01, 2013 at 11:20 UTC

    To start with, just one (possibly silly) question: are you sure that the data structure you obtain with the "correct but slow" code is really what you want? I am asking that because the data structure you get with that does not seem terribly useful (at least in the limited context). A simple hash would appear to be more useful, and, in that case, the first attempt (which you qualify as wrong) seems OK.
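For comparison, a short sketch of the two data structures side by side, using the sample data from the question. The variable names `%flat` and `%hoa` are made up for this illustration:

```perl
use strict;
use warnings;

my @cols   = qw(min max sum);
my @values = qw(1 3 4);

# Flat hash: one scalar per key (the first snippet in the question).
my %flat;
@flat{@cols} = @values;

# Hash of arrays: one array reference per key.
my %hoa;
@hoa{@cols} = map { [$_] } @values;

print "flat max: $flat{max}\n";      # the scalar itself
print "hoa  max: $hoa{max}[0]\n";    # first element of the array
```

The flat hash is enough when each key holds exactly one value. The hash of arrays only pays off once further values are pushed onto each key.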

    Now, assuming that you really want a hash of arrays such as the one you build in your second code snippet, if anything is slow, it is probably not so much the foreach construct, which is pretty fast, but the fact that you are shifting the @values each time, meaning that Perl has to recalculate the @values array at each iteration.

    The map version proposed by choroba is likely to be faster not so much because map is faster than foreach (most benchmarks that I have done show that the difference between the two constructs is usually quite small and foreach is quite often slightly faster, at least when they allow similar syntax constructs), but because it does not shift the @values array each time.

    I did not do benchmarks and I may turn out to be wrong in this specific case, but I wanted to call your attention to it in case you are going to benchmark various solutions.
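Rather than assume either way, the cost of shift can be measured. The sketch below compares a shift-based loop with an index-based one on a larger, hypothetical input of 1000 distinct columns; the sub names and sizes are arbitrary:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical larger input: 1000 distinct column names and values.
my @cols   = map "col$_", 1 .. 1000;
my @values = 1 .. 1000;

# Consumes a copy of the values with shift.
sub by_shift {
    my @copy = @values;
    my %hash;
    push @{ $hash{$_} }, shift @copy for @cols;
    return \%hash;
}

# Reads the values by index and leaves the input untouched.
sub by_index {
    my %hash;
    push @{ $hash{ $cols[$_] } }, $values[$_] for 0 .. $#cols;
    return \%hash;
}

cmpthese( 1_000, {
    by_shift => \&by_shift,
    by_index => \&by_index,
} );
```

The shift variant also pays for copying `@values` on every call, so any difference in the table cannot be attributed to shift alone.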

    Update: I was interrupted while writing the above and had to do something else. When I started to write this, there was only one answer (choroba's). I would probably not have written the above if I had seen all the other useful answers that came in between, since it turns out I am probably not saying very much that is new.

Re: Creating hash of arrays (in a faster way)
by LanX (Chancellor) on Nov 01, 2013 at 13:21 UTC
    It looks to me like you want to transpose a matrix with headlines (aka a table) to get a hash of columns.

    just for fun a solution with List::MoreUtils :

    (though I doubt it's long you are not slurping a text table =)

    DB<201> \@tab
     => [
      ["A", "B", "C", "D"],
      ["a1", "b1", "c1", "d1"],
      ["a2", "b2", "c2", "d2"],
      ["a3", "b3", "c3", "d3"],
    ]
    DB<202> @head= @{$tab[0]}
     => ("A", "B", "C", "D")
    DB<203> use List::MoreUtils qw/part/
    DB<204> $i=-1; %col=();
    DB<205> @col{@head} = part {$i++; $i %= @head } map {@$_} @tab[1..3]
     => (
      ["a1", "a2", "a3"],
      ["b1", "b2", "b3"],
      ["c1", "c2", "c3"],
      ["d1", "d2", "d3"],
    )
    DB<206> \%col
     => {
      A => ["a1", "a2", "a3"],
      B => ["b1", "b2", "b3"],
      C => ["c1", "c2", "c3"],
      D => ["d1", "d2", "d3"],
    }
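For readers without List::MoreUtils installed, the same transposition can be sketched with core Perl only. This is a plain nested loop, not the `part` technique shown in the session above, and it uses the same sample table:

```perl
use strict;
use warnings;

my @tab = (
    [ "A",  "B",  "C",  "D"  ],
    [ "a1", "b1", "c1", "d1" ],
    [ "a2", "b2", "c2", "d2" ],
    [ "a3", "b3", "c3", "d3" ],
);

# The first row holds the headlines, the rest are data rows.
my ( $head, @rows ) = @tab;

my %col;
for my $row (@rows) {
    # Append each cell to the array named by its column headline.
    push @{ $col{ $head->[$_] } }, $row->[$_] for 0 .. $#$head;
}

# %col is now ( A => ["a1","a2","a3"], B => ["b1","b2","b3"],
#               C => ["c1","c2","c3"], D => ["d1","d2","d3"] )
```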

    Cheers Rolf

    ( addicted to the Perl Programming Language)

Node Type: perlquestion [id://1060727]
Approved by Happy-the-monk
Front-paged by Corion