Re: Non-destructive array processing

by pdcawley (Hermit)
on Jan 20, 2003 at 21:59 UTC (#228506)


in reply to Non-destructive array processing

I choose neither of the above.

my @array = 1..10;
my @ary_copy = (@array);
while (my @chunk = splice @ary_copy, 0, 2) {
    print "Chunk: @chunk\n";
}
print "Original array is still intact! (@array)\n";

Straightforward, easy to understand and not overly clever. (But the closure trick is very clever. I wonder which is faster?)


Re: Re: Non-destructive array processing
by Ovid (Cardinal) on Jan 20, 2003 at 22:27 UTC

    Yes, the closure trick is clever, but I don't wonder which is faster. Aside from my assumption that it must be slower due to the function overhead in Perl, the obfuscation factor alone would cause me to eschew it. However, there's a more subtle problem at work that's going to kill many programmers. Since @_ aliases the argument list, the following two lines are equivalent:

    my $r1 = sub { \@_ }->(@array);
    my $r2 = \@array;

    What that means is that any processing on $r1 is going to affect @array. The following snippet will clarify.

    use Data::Dumper;
    my @array = 1..10;
    my $aref = sub {\@_}->(@array);
    $_++ foreach @$aref;
    print Dumper \@array;
    my $r2 = \@array;
    print \@array,"\n",$r2;

    The way to get around that with a closure is to do this:

    my $r = sub {my @a = @_; \@a}->(@array);

    Clearly that's not going to be faster than simply copying the array.

    Cheers,
    Ovid

    New address of my CGI Course.
    Silence is Evil (feel free to copy and distribute widely - note copyright text)

      Heh. I'm so used to slinging objects around rather than simple scalars I just took it as read that it would be a shallow copy. Note that, in the original case that's not a problem because assigning to @chunk makes a copy of the value.

        Whoa! You're right. That doesn't seem to be very DWIM. The following snippet does not alter the elements of @array.

        my @array = 1..10;
        my $r1 = sub { \@_ }->(@array);
        while (my @chunk = splice @$r1, 0, 2) {
            print "Chunk: @chunk\n";
        }
        $_++ foreach @$r1;
        print "Original array is still intact! (@array)\n";

        However, moving the auto-increment line above the while loop does alter them:

        my @array = 1..10;
        my $r1 = sub { \@_ }->(@array);
        $_++ foreach @$r1;
        while (my @chunk = splice @$r1, 0, 2) {
            print "Chunk: @chunk\n";
        }
        print "Original array is not intact! (@array)\n";

        This looks like some weird "copy on write" behavior that I was not aware of. Is this new, or is it Yet Another Feature that I didn't know about? :)

        Update: Okay, I see that adrianh answered the question.
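
What looks like copy-on-write in the two snippets above is probably something simpler. A minimal sketch, assuming the explanation given elsewhere in this thread (the reference points at a fresh array whose elements are aliases of the elements of @array): by the time the increment runs after the loop, splice has already emptied the alias array, so there is nothing left to increment. The name $left is only for illustration.

```perl
use strict;
use warnings;

my @array = 1 .. 10;
my $r1 = sub { \@_ }->(@array);

# splice removes the aliases from the anonymous array, two at a time.
# @chunk receives copies of the values; @array itself is not touched.
while (my @chunk = splice @$r1, 0, 2) { }

my $left = scalar @$r1;    # number of aliases still in the array
$_++ foreach @$r1;         # loops over an empty list, so nothing changes

print "aliases left: $left\n";
print "array: @array\n";
```

Run the increment before the loop instead and the aliases are still there, which is why the second snippet changes @array.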

        Cheers,
        Ovid


      Since @_ aliases the argument list, the following two lines are equivalent:
      my $r1 = sub { \@_ }->(@array);
      my $r2 = \@array;

      No they're not :-)

      The first is a reference to an array that has every element aliased to every element of @array.

      The second is a reference to @array.

      The "trick" wouldn't work otherwise, since changing the array through $r2 will change @array. For example:

      my @array = (1..10);
      my $r1 = sub { \@_ }->(@array);
      pop @$r1;
      print "unchanged @array\n";
      my $r2 = \@array;
      pop @$r2;
      print "changed @array\n";

      gives us

      unchanged 1 2 3 4 5 6 7 8 9 10
      changed 1 2 3 4 5 6 7 8 9
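
To put the two kinds of change side by side, here is a small sketch under the same assumption as the example above (a new array whose elements alias those of @array). The names $alias and $len_after_pop are only for illustration. Changing the shape of the alias array leaves @array alone, while assigning to one of its elements writes through.

```perl
use strict;
use warnings;

my @array = (1 .. 5);
my $alias = sub { \@_ }->(@array);

# Structural change: only the alias array gets shorter.
pop @$alias;
my $len_after_pop = @array;

# Element change: the element is an alias, so this writes to $array[0].
$alias->[0] = 'changed';

print "length of \@array after pop: $len_after_pop\n";
print "array: @array\n";
```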
        The first is a reference to an array that has every element aliased to every element of @array.

        Very, very true. (I was going to say so if it hadn't already been said...) This has some devious implications. Consider the consequences of this:

        $\="\n";
        my ($x,$y,$z)=(1,2,3);
        my $r=sub{ \@_ }->($x,$y,$z,$z,$y,$x);
        print $r->[0]; # 1
        $r->[-1]=10;
        print $r->[0]; # 10

        So personally I wouldn't use this approach at all (for this, anyway). There's waaaay too much chance that somebody would come along and morph the code without understanding the deeper implications and then run around screaming about bizarre bugs.
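
One way to see the sharing directly is to compare references to the elements. This is only a sketch: the variable names $same_scalar, $is_x and @copy are made up for illustration, and it relies on the fact that two references to the same scalar compare equal with ==.

```perl
use strict;
use warnings;

my ($x, $y, $z) = (1, 2, 3);
my $r = sub { \@_ }->($x, $y, $z, $z, $y, $x);

# References to the same scalar are numerically equal.
my $same_scalar = (\$r->[0] == \$r->[-1]) ? 1 : 0;
my $is_x        = (\$r->[0] == \$x)       ? 1 : 0;

# A plain copy breaks the sharing: each element is an independent value.
my @copy = @$r;
$copy[-1] = 10;

print "first and last element share storage: $same_scalar\n";
print "first element is \$x itself: $is_x\n";
print "x after writing to the copy: $x\n";
```

Writing to @copy leaves $x alone, which is the behaviour most readers of the code would expect from an array.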

        --- demerphq
        my friends call me, usually because I'm late....

Re: Re: Non-destructive array processing
by demerphq (Chancellor) on Jan 21, 2003 at 13:23 UTC
    I'm so glad I read the other replies before posting mine. This is exactly what I would have posted.

    The cleverness in both those examples from Juerd is OK for personal code and even module code for CPAN, but IMO generally unusable within a work/production context. First off, they don't really look like they do what they do; second, they are confusing and error prone. Whereas yours looks exactly like what it does. No maintenance programmer is going to get confused years after I've left the company.

    ++

    --- demerphq
    my friends call me, usually because I'm late....

      No maintenance programmer is going to get confused years after I've left the company.
      Now there's a motto to live by.

      unusable within a work/production context

      I wouldn't handle large data sets in production code. This is primarily for one-time hacks, but I wondered what other people would prefer.

      No maintenance programmer is going to get confused years after I've left the company.

      Note that if code like this ever goes into production, I do of course add proper comments, including a note that you shouldn't use @$r elsewhere (re your other post).

      Juerd
      - http://juerd.nl/
      - spamcollector_perlmonks@juerd.nl (do not use).
      

        I do of course add proper comments

        I think the point is that it's better not to have to comment at all. If the code looks like it does what it does then you don't need a comment. Plus comments get out of date sometimes and then cause confusion. I realize that a well chosen comment can be extremely useful, but I think you know what I mean.

        *shrug* It's a nice trick for some things, though. :-) Although I've usually used it as a named routine.

        sub aliased_array { \@_ }
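
For what it's worth, here is a sketch of how the named routine might be used for the chunking problem from the original question. Only aliased_array comes from the post above; $view and @chunks are names made up for the example.

```perl
use strict;
use warnings;

# Named version of the trick: returns a reference to a fresh array
# whose elements are aliases of the arguments, not copies.
sub aliased_array { \@_ }

my @array = 1 .. 10;
my $view  = aliased_array(@array);

# Consume the alias array two elements at a time; @array keeps its elements.
my @chunks;
while (my @chunk = splice @$view, 0, 2) {
    push @chunks, "@chunk";
}

print "Chunk: $_\n" for @chunks;
print "Original array is still intact! (@array)\n";
```

The name at least tells the reader that aliasing is going on, which the bare sub { \@_ } does not.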

        --- demerphq
        my friends call me, usually because I'm late....

Re: Re: Non-destructive array processing
by Juerd (Abbot) on Jan 21, 2003 at 19:20 UTC

    my @ary_copy = (@array);
    straightforward, easy to understand

    I agree, and it is exactly how I usually do this. Unfortunately, I ran into some huge data and couldn't copy without installing additional RAM :)

    But the closure trick is very clever. I wonder which is faster

    Wonder no more, unless my benchmark is wrong, of course :)

    #!/usr/bin/perl -w
    use strict;
    use Benchmark qw(cmpthese);

    our @array;

    sub pdcawley {
        my @copy = @array;
        while (my @chunk = splice @copy, 0, 2) { }
    }

    sub juerd {
        my $refs = sub { \@_ }->(@array);
        while (my @chunk = splice @$refs, 0, 2) { }
    }

    sub bench {
        printf "\n\e[1m%s\e[0m\n", shift;
        cmpthese(-10, { pdcawley => \&pdcawley, juerd => \&juerd });
    }

    @array = (1) x 32767;
    bench "Long array, tiny values";

    @array = ("x" x 32) x 32767;
    bench "Long array, small values";

    @array = (1) x 32;
    bench "Short array, tiny values";

    @array = ("x" x 32) x 32;
    bench "Short array, small values";

    @array = ("x" x (2**20)) x 32;
    bench "Short array, large values";

    @array = ("x" x (8 * 2**20)) x 32;
    bench "Short array, huge values";

    (Note: stripped)

    Long array, tiny values
    pdcawley   26.1/s     --   -17%
    juerd      31.4/s    20%     --

    Long array, small values
    pdcawley   12.9/s     --   -38%
    juerd      20.7/s    60%     --

    Short array, tiny values
    pdcawley  32909/s     --    -1%
    juerd     33197/s     1%     --

    Short array, small values
    pdcawley  19203/s     --   -17%
    juerd     23084/s    20%     --

    Short array, large values
    pdcawley   1.83/s     --   -53%
    juerd      3.89/s   112%     --

    Short array, huge values
    pdcawley   4.32       --   -53%
    juerd      2.04     112%     --

    I'd like to test it with an array of 32 elements of 20 MB each, but the copy doesn't fit in memory.

    Anyhow, it seems that using the array of aliases is much more efficient than using a copy, especially with large data sets.

    Juerd
    - http://juerd.nl/
    - spamcollector_perlmonks@juerd.nl (do not use).
    

      Crumbs. Clarity really costs in some cases doesn't it?

      I normally fight shy of commenting code if I can possibly help it, generally preferring to sweat over making the code as clear as possible, but if I found myself having to use that trick then I'd definitely fence it around with comments.
