How to swap scalar values without copies

by Anonymous Monk
on Feb 19, 2004 at 23:22 UTC ( #330403=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am trying to solve an object initialisation problem: suppose this is my object constructor:
    sub new {
        my ($pkg) = $_[0];
        my $this = bless {
            data => $_[1],
        }, $pkg;
        return $this;
    }
The non-trivial argument ($_[1], a scalar) passed to the constructor can be a VERY large buffer, and I want to store it into the object. I have the following constraints:
  • there must NEVER be, at any time, two copies of this buffer (because it is very memory-expensive)
  • $this->{data} must have scalar semantics, i.e. I don't want to save a reference
  • The piece of code invoking the constructor must not be able to access the buffer through the actual argument after the invocation

Ideally, the best solution should be an operator like swap $a, $b, exchanging the underlying data objects "pointed to" by the two scalars $a and $b. In this way I could initialise $this->{data} to, say, a null string, and then exchange its value with that of $_[1].

I was able to find the module Data::Swap on CPAN, which does exactly this, but I would like to know if this can be done natively in Perl.

Thanks to everybody in advance,

Replies are listed 'Best First'.
Re: How to swap scalar values without copies
by davido (Cardinal) on Feb 20, 2004 at 05:51 UTC
    I dug for an hour or so into the concept of using a set of tied scalars whose "STORE" methods only store references to $_[1], so that copies are never made. My goal was to encapsulate all use of references within the module that handles the tied scalars, so that the scalars themselves just look like normal plain-vanilla scalars, and yet they never actually create a copy of the original data (they should only internally store references to the original data).

    Unfortunately, my attempts, while producing functional code, seem to still make copies. diotalevi suggests that this is because the ( $x, $y ) = ( $y, $x ) construct has its own optimizations that create the copies internally no matter what you do. I also suspect that my FETCH method's ${$self->{VRef}} construct is creating a copy.

    So my attempt at a pure Perl solution to swapping the values of two lexically scoped scalars without creating (explicitly or internally) copies, and without explicitly using references in the main code, is a failure.

    Nevertheless, I thought that the attempt was worth demonstrating to see if anyone else could do anything with it. I really thought I was on to something, but it seems that there isn't a pure-Perl solution to the OP's quandary. Before I paste it here, I just wanted to quickly thank blokhead for helping me test it. Here it is:

    package Tie::Scalar::NoCopies;

    # This package compiles, runs, and implements a tied
    # scalar. But it doesn't do as it advertises... ie.,
    # it doesn't suppress copies.

    use strict;
    use warnings;

    sub TIESCALAR {
        my ( $class ) = @_;
        my $self = {};
        $self->{VRef} = undef;
        bless $self, $class;
    }

    sub STORE {
        my $self = shift;
        $self->{VRef} = \$_[0];
        return ${$self->{VRef}};
    }

    sub FETCH {
        my $self = shift;
        return ${$self->{VRef}};
    }

    sub DESTROY {
        my $self = shift;
    }

    1;

    package main;

    use strict;
    use warnings;

    my ($var1, $var2);

    tie $var1, "Tie::Scalar::NoCopies";
    tie $var2, "Tie::Scalar::NoCopies";

    $var1 = 'a' x 10;
    $var2 = 'b' x 10;

    print "$var1\t$var2\n";

    ( $var1, $var2 ) = ( $var2, $var1 );

    print "$var1\t$var2\n";



      >> I also suspect that my FETCH method's ${$self->{VRef}} construct is creating a copy.

      I definitely think that ${....} creates copies. I set up a shell with virtual memory restricted to a value chosen by me (with ulimit -v ...). This value is such that a small perl program can allocate a large buffer, but not two copies of the buffer. Now, the following code:

          $this->{data} = \$large_buffer;
          print length ${$this->{data}} . "\n";
      cannot be executed (it dies of "Out of memory"). So it seems that the reference way (and the function returning the lvalue) are not viable, because the act of dereferencing seems to allocate a copy of the object.

      Another indication that dereferencing is costly is that a program which repeatedly dereferences a reference to a large buffer runs far slower than a program which accesses the buffer directly (orders of magnitude slower than the dereferencing overhead per se could justify).

      Can you tell me if this is true? If not, what is eating the memory?
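      One possible culprit in the snippet above, offered as a guess rather than a diagnosis: length is a named unary operator, and named unary operators bind less tightly than the . concatenation operator. So length ${$ref} . "\n" is parsed as length( ${$ref} . "\n" ), and the concatenation has to build a temporary string as large as the buffer before length ever runs. A minimal sketch with a small buffer (the variable names here are illustrative, not from the original post):

```perl
use strict;
use warnings;

my $buffer = 'x' x 10;
my $ref    = \$buffer;

# Parsed as length( ${$ref} . "\n" ): the concatenation builds a temporary
# string one character longer than the buffer, and no newline is printed.
my $with_temp = length ${$ref} . "\n";

# Parenthesising the argument to length avoids building the temporary.
my $direct = length(${$ref});

print "$with_temp $direct\n";   # prints "11 10"
```

      If that is what happened here, the extra allocation would come from the concatenation, not from the dereference itself.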

Re: How to swap scalar values without copies
by BrowserUk (Patriarch) on Feb 20, 2004 at 03:21 UTC

    This is possible in pure Perl, if you can live with using a global variable.

        $buffer = 'x' x 10000000;   # memory used 12,200kb
        *x = \$buffer;              # mem used 12,200kb

        print length $x;
        10000000

        print substr $x, 0, 30;
        xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

    $x is now an alias for the buffer and can be used as any scalar can be. However, I don't think that there is any way to use a glob assignment with a hash element.
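    A small sketch of the glob-assignment alias under use strict, assuming package variables declared with our (the declarations and the address check are additions for illustration). Comparing the two references shows that both names refer to the same scalar, so a write through one is visible through the other:

```perl
use strict;
use warnings;

our ($buffer, $x);

$buffer = 'x' x 1_000;

# Assigning a scalar reference to the glob aliases $x to $buffer.
*x = \$buffer;

# Both names now refer to the same underlying scalar.
print( (\$x == \$buffer) ? "same scalar\n" : "different scalars\n" );

# A write through the alias shows up in the original variable.
substr($x, 0, 3) = 'ABC';
print substr($buffer, 0, 5), "\n";   # prints "ABCxx"
```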

    A possible alternative would be to wrap an lvalue sub (or method) around the $buffer.

        sub test : lvalue { $buffer };

        print length test;
        10000000

        print substr test, 0, 10;
        xxxxxxxxxx

        substr( test, 3, 3 ) = 'ABC';
        print substr test, 0, 10;
        xxxABCxxxx

        print length test;
        10000000

    This behaves like a scalar for most purposes albeit that the syntax looks a little strange. As a method, the syntax would be

        print substr obj->test, 20_000, 5;
        substr( obj->test, 20_000, 5 ) = 'hello';

    which isn't too awful. It would mean utilising a global with the risk of being stomped on elsewhere, but using a suitably obscure name should minimise the risk of that.
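    As a variation on the lvalue idea, here is a hedged sketch that keeps the buffer out of a global: the object stores a reference to the caller's scalar (taken from the aliased $_[1]) and an lvalue method hides the dereference. The class name MySegment, the hash key buf_ref, and the method name data are hypothetical. Note that the caller's variable is still the same scalar, so this does not stop the caller from touching the buffer afterwards.

```perl
use strict;
use warnings;

package MySegment;

sub new {
    my $class = $_[0];
    # $_[1] is an alias to the caller's variable, so \$_[1] refers to the
    # caller's scalar itself; the string is not copied here.
    return bless { buf_ref => \$_[1] }, $class;
}

# The lvalue method hides the ${ ... } dereference from calling code.
sub data : lvalue { ${ $_[0]{buf_ref} } }

package main;

my $buffer = 'x' x 100;
my $seg    = MySegment->new($buffer);

substr($seg->data, 3, 3) = 'ABC';
print substr($seg->data, 0, 10), "\n";   # prints "xxxABCxxxx"
print length($seg->data), "\n";          # prints "100"

# The caller's variable is the same scalar, so it sees the change too.
print substr($buffer, 0, 10), "\n";      # prints "xxxABCxxxx"
```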

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    Timing (and a little luck) are everything!
      I think that the second solution (using a method returning an lvalue) solves two of my three problems. Thank you very much!
Re: How to swap scalar values without copies
by asarih (Hermit) on Feb 19, 2004 at 23:47 UTC
    Is this not what you need?
        $a=\$x; $b=\$y;
        print "$a $b\n";
        ($a,$b)=($b,$a);
        print "$a $b\n";

    update: changed $a and $b to references.

      (I see now the update, and I don't know how to update my reply in turn ...).

      Swapping the values of the references does not waste memory, but it violates rule no. 2, namely that I want $a to be a string, not a reference to a string.
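      For illustration, a minimal sketch of the reference swap under discussion (the variable names $ra and $rb are illustrative). Only the two references are exchanged; the strings they point to stay where they are, which is why no buffer is copied, and also why the result is a reference and not a plain string:

```perl
use strict;
use warnings;

my $x = 'a' x 10;
my $y = 'b' x 10;

my ($ra, $rb) = (\$x, \$y);

# Swaps two small reference values; the referenced strings are not copied.
($ra, $rb) = ($rb, $ra);

print "$$ra $$rb\n";   # prints "bbbbbbbbbb aaaaaaaaaa"

# $ra now points at the original $y, and $rb at the original $x.
print( ($ra == \$y && $rb == \$x) ? "referents unchanged\n" : "unexpected\n" );
```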

        I don't think you can have it both ways. You either need to work around having a reference or live with the major memory requirement. I'm not an XS-guru, but it appears Data::Swap just does the reference stuff for you.

        : () { :|:& };:

        Note: All code is untested, unless otherwise stated

        That is a very stupid requirement. You're creating a problem where one doesn't exist.
      No, this violates the no-copy rule. If $a is a 10Mb string instead of 'a', then, during the assignment ($a,$b)=($b,$a) the process allocates 10 more Mb. I have verified the following resident memory counts:
          print "stage 0\n"; sleep 5;
          # memory usage now: 1800 Kb
          $a='a' x 10000000; $b='b';
          print "stage 1\n"; sleep 5;
          # memory usage now: 21336 Kb
          ($a,$b)=($b,$a);
          print "stage 2\n"; sleep 5;
          # memory usage now: 31104 Kb
        One method I think possible is at the XS level. A simple string scalar is stored internally by Perl as an SV of type SVt_PV, which looks like this:
               SV            xpv
            +--------+    +-----+
            | ANY    |--->| PVX |---> char[]
            | REFCNT |    | CUR |
            | FLAGS  |    | LEN |
            +--------+    +-----+
        To do an immediate swap without copying strings, at XS level, do this:
               SV                 xpv
            +--------+          +-----+
            | ANY    |-+      ->| PVX |---> char[]
            | REFCNT |  \    /  | CUR |
            | FLAGS  |   \  /   | LEN |
            +--------+    \/    +-----+
                          /\
               SV        /  \     xpv
            +--------+  /    \  +-----+
            | ANY    |-+      ->| PVX |---> char[]
            | REFCNT |          | CUR |
            | FLAGS  |          | LEN |
            +--------+          +-----+

        I can provide an XS example if you are not sure how to do this at XS level.

Re: How to swap scalar values without copies
by davido (Cardinal) on Feb 20, 2004 at 03:53 UTC
    My curiosity has finally got the best of me, and I can no longer resist asking (though I really tried to let it go) why references are out of the question.

    As an OO programmer you're certainly comfortable with references already, so there must be a good reason why they can't be used in this case.

    I'm unable to restrain myself from posing the question. I hope I'm about to learn something new. ;)


      Well, I'm currently comfortable with C++ references. Although the name is the same, the syntax is different. In C++ I can write:
      int &pluto = pippo;
      and, from now on (up to the end of its scope), pluto is just a synonym for pippo. I'm not asserting I can do this for an object member.

      In C++, if I have a class member named data, in a method of that class I can simply refer to it as data, which is the same as this->data. In Perl, the same member must be referred to as $this->{data}, in addition to explicitly setting the object pointer every time: my $this = shift;, for instance.

      But, if I store a reference, I am obliged to refer to the corresponding scalar as ${$this->{data}}, and this clutters the code way too much for me ($$data would be acceptable). Please, tell me that I am wrong and I can save all this typing.

      Moreover, just saving a reference to an argument as a class member is not completely satisfactory. My ideal would be that the passed buffer becomes a private member, i.e. it is not possible to modify it from outside the constructor by referring to the actual argument:

          read(IN, $buffer, 1000000);
          push @mysegments, new MyPackage::MySegment($buffer);
          print "+$buffer+"; # gives ++
      but the buffer is still living as a private member in my object.

Node Type: perlquestion [id://330403]
Approved by Limbic~Region