Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

Processing arrays 2 elements at a time (TIMTOWTDI)

by grinder (Bishop)
on Aug 02, 2002 at 15:14 UTC ( #187120=perlmeditation: print w/ replies, xml ) Need Help??

I had a reference to an array the other day, that looked something like this:

  my $r = [ 1, 2, 3, 4, 5, 6 ];

the problem being that I wanted to restructure the values, by taking them by pairs, hence:

  my $r = [ [1, 2], [3, 4], [5, 6] ];

Looks easy enough, and certainly in Perl 6 it will be a snap, since you can iterate over lists taking n at a time (notwithstanding any glaring syntax errors due to my ignorance):

for @$r -> $x, $y { push @$pairs, [$x, $y]; }

But in Perl 5 things are not quite so simple. I'm pretty sure this topic has been covered before, but after searching for a while on obvious things (qw/array element processing time/), all I found was Processing Multiple Array Elements and Breaking output lines into N-element chunks but I'm sure there's another one out there somewhere. So I turned to the chatterbox, and with a little help from my friends™ I managed to get it figured out, and so I thought I'd preserve the solutions here for posterity.

My first idea (no giggling please) was to either push a new anonymous array onto the end of the array or push an element to the last array of the array, which gives:

my $r2; my $flipper = 0; for( @$r ) { if( $flipper++ % 2 ) { push @{$r2->[$#{$r2}]}, $_; } else { push @$r2, [ $_ ]; } }

It looks pretty silly, but it doesn't destroy the contents of $r, and it can scale beyond more than 2 elements at a time (by saying $flipper++ % 2 > . At this point other people chimed in with different suggestions.

domm suggested treating the array as a hash and iterating over keys and values:

my %h=(@$r); for (keys %h) { push @$r2, [$_, $h2{$_} ] } # or possibly # push @$r2, [$_, $h2{$_} ] for keys %h

But that has a number of problems

  • it doesn't preserve the order of the array;
  • all array elements that become keys must be unique;
  • the keys are stringified;
  • it doesn't scale beyond 2 elements at a time.

BrowserUk suggested the following map:

my $i = 0; my $r2 = map {[ $r->[$i*2], $r->[$i++*2+1] ]} (0 .. $#{$r}/2 - 1); # or (0 .. scalar(@$r)/2 - 1)

It does the job, but there's too much happening for it to be clear what is going on, specifically the limit-1 and postincrement. Plus it's really hard to format the whitespace. This was later amended to:

$r2 = [ map{ [ $r->[$_*2], $r->[$_*2+1] ] } (0 .. $#{$r}/2) ]

which gets rid of the icky index variable, and then for an encore rearranged the code to use array slices instead: $r2->[$_] = [ @$r[$_*2, $_*2+1] ] for (0..$#{$r} / 2);

sauoq suggested a good olde C-like approach:

for( my $i=0; $i < scalar @$r; $i += 2 ) { $r2->[$i/2] = [ @$r[$i, $i+1] ] }

To a Perl beginner, it will be pretty obvious what's going on.

Someone else suggested (and I forget who (and that's maybe just as well)) doing the following:

my @copy = @$r; for( @copy ) { push @$r2, [ $_, shift @copy ]; }

But using shift in a foreach is evil. Sometimes you can get away with it, but in this case the result is wrong. It's off-by-one in a bizarre way. And of course it destroys the source array.

But I think this prodded someone else into getting this approach done the right way, for they (again I forget who) suggested:

my @copy = @$r; while( @copy ) { push @$r2, [ shift @c, shift @c ]; }

and in actual fact that is the code that I adopted (despite the fact that it is a destructive read), the main criterion being how easy is it for a non-Perl expert to understand. The code already slings references to anonymous structures all about, so this loop has a certain appealing clarity.

jmcnamara chimed in late with:

$r2 = [ map { [$r->[2*$_], $r[2*$_+1]] } 0 .. scalar @$r / 2 - 1 ]

which could, in light of the other solutions, be amended to

$r2 = [ map { [@$r[2*$_, 2*$_+1]] } 0 .. scalar @$r / 2 - 1 ]

In one of the threads I found, merlyn suggests using splice, which would give something like

@copy = @$r; while (my @pair = splice @copy, 0, 2) { push @$r2, [@pair]; }

This particular variant has the property of having the simplest and clearest method of extending to an arbitrary number of tuples. Just increase the value of the splice's third parameter.

Well all that's well and good, but what would a TIMTOWTDI meditation be without a Benchmark

Array with 8 elements

Benchmark: timing 50000 iterations of aslice, c_loop, flipper,
 map_i, map_jm, map_jm2, map_pure, shifter, splicer...
    aslice: 11 wallclock secs ( 7.29 usr +  0.00 sys =  7.29 CPU) @ 6858.71/s (n=50000)
    c_loop: 10 wallclock secs ( 5.91 usr +  0.00 sys =  5.91 CPU) @ 8460.24/s (n=50000)
   flipper: 18 wallclock secs ( 9.79 usr +  0.00 sys =  9.79 CPU) @ 5107.25/s (n=50000)
     map_i:  9 wallclock secs ( 5.37 usr +  0.00 sys =  5.37 CPU) @ 9310.99/s (n=50000)
    map_jm:  9 wallclock secs ( 5.06 usr +  0.00 sys =  5.06 CPU) @ 9881.42/s (n=50000)
   map_jm2:  8 wallclock secs ( 5.00 usr +  0.00 sys =  5.00 CPU) @ 10000.00/s (n=50000)
  map_pure:  7 wallclock secs ( 5.30 usr +  0.00 sys =  5.30 CPU) @ 9433.96/s (n=50000)
   shifter:  6 wallclock secs ( 3.99 usr +  0.00 sys =  3.99 CPU) @ 12531.33/s (n=50000)
   splicer:  8 wallclock secs ( 5.91 usr +  0.00 sys =  5.91 CPU) @ 8460.24/s (n=50000)
            Rate flipper aslice c_loop splicer map_i map_pure map_jm map_jm2 shifter
flipper   5107/s      --   -26%   -40%    -40%  -45%     -46%   -48%    -49%    -59%
aslice    6859/s     34%     --   -19%    -19%  -26%     -27%   -31%    -31%    -45%
c_loop    8460/s     66%    23%     --     -0%   -9%     -10%   -14%    -15%    -32%
splicer   8460/s     66%    23%     0%      --   -9%     -10%   -14%    -15%    -32%
map_i     9311/s     82%    36%    10%     10%    --      -1%    -6%     -7%    -26%
map_pure  9434/s     85%    38%    12%     12%    1%       --    -5%     -6%    -25%
map_jm    9881/s     93%    44%    17%     17%    6%       5%     --     -1%    -21%
map_jm2  10000/s     96%    46%    18%     18%    7%       6%     1%      --    -20%
shifter  12531/s    145%    83%    48%     48%   35%      33%    27%     25%      --

Array with 1000 elements

Benchmark: timing 500 iterations of aslice, c_loop, flipper,
 map_i, map_jm, map_jm2, map_pure, shifter, splicer...
    aslice:  8 wallclock secs ( 6.26 usr +  0.00 sys =  6.26 CPU) @ 79.87/s (n=500)
    c_loop:  6 wallclock secs ( 4.89 usr +  0.00 sys =  4.89 CPU) @ 102.25/s (n=500)
   flipper: 14 wallclock secs ( 9.55 usr +  0.00 sys =  9.55 CPU) @ 52.36/s (n=500)
     map_i:  9 wallclock secs ( 4.83 usr +  0.00 sys =  4.83 CPU) @ 103.52/s (n=500)
    map_jm:  9 wallclock secs ( 4.57 usr +  0.00 sys =  4.57 CPU) @ 109.41/s (n=500)
   map_jm2:  8 wallclock secs ( 4.62 usr +  0.00 sys =  4.62 CPU) @ 108.23/s (n=500)
  map_pure:  6 wallclock secs ( 4.81 usr +  0.00 sys =  4.81 CPU) @ 103.95/s (n=500)
   shifter:  4 wallclock secs ( 3.53 usr +  0.00 sys =  3.53 CPU) @ 141.64/s (n=500)
   splicer:  8 wallclock secs ( 5.50 usr +  0.00 sys =  5.50 CPU) @ 90.91/s (n=500)
           Rate flipper aslice splicer c_loop map_i map_pure map_jm2 map_jm shifter
flipper  52.4/s      --   -34%    -42%   -49%  -49%     -50%    -52%   -52%    -63%
aslice   79.9/s     53%     --    -12%   -22%  -23%     -23%    -26%   -27%    -44%
splicer  90.9/s     74%    14%      --   -11%  -12%     -13%    -16%   -17%    -36%
c_loop    102/s     95%    28%     12%     --   -1%      -2%     -6%    -7%    -28%
map_i     104/s     98%    30%     14%     1%    --      -0%     -4%    -5%    -27%
map_pure  104/s     99%    30%     14%     2%    0%       --     -4%    -5%    -27%
map_jm2   108/s    107%    35%     19%     6%    5%       4%      --    -1%    -24%
map_jm    109/s    109%    37%     20%     7%    6%       5%      1%     --    -23%
shifter   142/s    171%    77%     56%    39%   37%      36%     31%    29%      --

The code

#! /usr/bin/perl -w use strict; use Benchmark qw/cmpthese/; use Data::Dumper; $Data::Dumper::Indent = 0; my $iters = shift || 1000; my $len = shift || 8; sub flipper { my @r2; my $flipper = 0; for( @_ ) { if( $flipper++ % 2 ) { push @{$r2[$#r2]}, $_; } else { push @r2, [ $_ ]; } } \@r2; } sub map_i { my $i = 0; [ map { [$_[$i*2], $_[$i++*2+1]] } (0 .. scalar(@_)/2 - 1) ]; } sub shifter { my @r2; while( @_ ) { push @r2, [ shift @_, shift @_ ]; } \@r2; } sub c_loop { my @r2; for( my $i=0; $i < scalar @_; $i += 2 ) { push @r2, [ @_[$i, $i+1] ] } \@r2; } sub map_pure { [ map{ [$_[$_*2], $_[$_*2+1]] } (0 .. scalar @_ / 2 - 1) ]; } sub aslice { my @r2; $r2[$_] = [ @_[$_*2, $_*2+1] ] for (0 .. scalar @_ / 2 - 1); \@r2; } sub map_jm { [ map { [$_[2*$_], $_[2*$_+1]] } 0 .. scalar @_ / 2 - 1 ]; } sub map_jm2 { [ map { [@_[2*$_, 2*$_+1]] } 0 .. scalar @_ / 2 - 1 ]; } sub splicer { my @r2; while (my @pair = splice @_, 0, 2) { push @r2, [@pair]; } \@r2; } ###################################################### my $r = [ 1 .. $len ]; for my $sub( qw/ flipper shifter splicer c_loop aslice map_i map_pure map_jm map_jm2 /) { no strict 'refs'; # test # print "$sub ", Dumper( &$sub( @$r )), "\n"; } cmpthese( $iters, { flipper => sub { flipper( @$r )}, shifter => sub { shifter( @$r )}, splicer => sub { splicer( @$r )}, c_loop => sub { c_loop( @$r )}, aslice => sub { aslice( @$r )}, map_i => sub { map_i( @$r )}, map_pure => sub { map_pure( @$r )}, map_jm => sub { map_jm( @$r )}, map_jm2 => sub { map_jm2( @$r )}, }, ); __END__

I find it interesting the code that I consider as being the easiest for a non-expert to read is also the fastest... And when you see how bad my initial idea was, you know understand why I went asking in the CB (I blame lack of coffee/lack of sleep/early morning).


print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'

Comment on Processing arrays 2 elements at a time (TIMTOWTDI)
Select or Download Code
Re: Processing arrays 2 elements at a time (TIMTOWTDI)
by dimmesdale (Friar) on Aug 02, 2002 at 15:44 UTC
    Well, having required something like this recently, I feel obliged to comment.

    I particurly liked merlyn's solution:

    @copy = @$r; while (my @pair = splice @copy, 0, 2) { push @$r2, [@pair]; }

    However, I'll modify it slightly to show everyone what I came up with. (Maybe it's just me, but bit-wise operators always seem to pop-up in my code ... probably where they shouldn't :)

    @copy = @$r; # I guess you know that your array will work for this, but # it may be desirable to check (especially for n-'tuples') # that the array is divisible by 2 (or n). I guess that # depending on what you're expecting this can be done on a # all or nothing basis, or inside the loop while (0..$#copy) { next if $_&1; # this skips all odds (you used a mod in # your original I think). To expand to more # 'tuples', you'd have to change this to # next if $_%n push @$r2, [$copy[$_],$copy[$_+1]]; }
Re: Processing arrays 2 elements at a time (TIMTOWTDI)
by petral (Curate) on Aug 02, 2002 at 15:47 UTC
        the code that I consider as being the easiest for a non-expert to read is also the fastest

    Although, note that it probably would not be the fastest inline:
    my @copy = @$r; while( @copy ) { push @$r2, [ shift @copy, shift @copy ]; }
    It is only the artifact of using @_ in the sub call which avoids the copy.

      p
      So don't make your copy first. It's a reference after all, which in this case makes life easy.
      my @copy; while (@$r) { push @copy, [ shift @$r, shift @$r ]; } $r = \@copy;
      This way we build our new array (while the one gets bigger, the other gets smaller), then we simply change what the reference points to.
        That's very neat.   My only point was that the way grinder was presenting it (non-destructive of the original array), it would probably not be the fastest if the array copy was not hidden in the overhead for the subroutine call.

          p
Re: Processing arrays 2 elements at a time (TIMTOWTDI)
by FoxtrotUniform (Prior) on Aug 02, 2002 at 15:59 UTC

    Update: D'oh! Silly coder, read the whole article before posting. Ignore me, I'm being redundant.


    I'm surprised that nobody mentioned splice, as merlyn suggested here:

    while(my @pair = splice @$r, 0, 2) { push @$r2, \@pair; }

    --
    The hell with paco, vote for Erudil!
    :wq

Re: Processing arrays 2 elements at a time (TIMTOWTDI)
by shotgunefx (Parson) on Aug 03, 2002 at 01:40 UTC
    Processing Multiple Array Elements is much slower than any of these but to be fair I wrote it a long time ago. :)

    I would most likely use a closure now or a closure hiding behind a regular function. It also works a little different. You can nest them. and continue were you left off.

    -Lee

    "To be civilized is to deny one's nature."
Re: Processing arrays 2 elements at a time (TIMTOWTDI)
by Juerd (Abbot) on Aug 03, 2002 at 10:52 UTC

    Here's a somewhat more efficient splicer that is almost as fast as the shifter function:

    sub s2 { my @r2; while (@_) { push @r2, [ splice @_, 0, 2 ]; } return \@r2; }
    This is faster than splicer() because it doesn't have the overhead of assigning the chunks to a temporary array.

    By removing the overhead of entering a block on each iteration, it'll be even faster:

    sub s2 { my @r2; push @r2, [ splice @_, 0, 2 ] while @_; return \@r2; }
    As fast as the unaltered shifter() to be precise (but of course shifter() would still win if it were altered in the same way).

    - Yes, I reinvent wheels.
    - Spam: Visit eurotraQ.
    

Re: Processing arrays 2 elements at a time (TIMTOWTDI)
by domm (Chaplain) on Aug 03, 2002 at 18:40 UTC
    Just for reference, I was the one who suggested the "shifter".

    And, to be honest, I suggested the evil foreach-shift-thingy, too.

    -- #!/usr/bin/perl for(ref bless{},just'another'perl'hacker){s-:+-$"-g&&print$_.$/}

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://187120]
Approved by rob_au
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chanting in the Monastery: (9)
As of 2014-08-20 06:12 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (105 votes), past polls