PerlMonks  

simply appending to a scalar...

by abachus (Monk)
on Jun 25, 2006 at 13:16 UTC (#557427=perlquestion)

abachus has asked for the wisdom of the Perl Monks concerning the following question:

Hello there,

What is the most efficient way to append data to a scalar ? I don't think the following is, or is it ? :

$data = $data . $more;
For what it's worth, I plan to implement some code that will be handling rather large amounts of data, but the data is likely to arrive in approx. 8 KB chunks from a filehandle with a sysread call.

thanks y'all,

Isaac.

Replies are listed 'Best First'.
Re: simply appending to a scalar...
by japhy (Canon) on Jun 25, 2006 at 13:27 UTC
    $x = $x . $y can be written as $x .= $y, and yes, that's the "best" way to append to a scalar.

    Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
    How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
Re: simply appending to a scalar...
by davido (Cardinal) on Jun 25, 2006 at 16:36 UTC

    There are seven ways I can immediately think of, and you've picked one of the clearest and easiest. Here is the list I can come up with:

    • The . (dot) operator: concatenation.
    • The .= operator: append.
    • substr: Substring manipulation.
    • join: Joining two or more strings.
    • s/(...)/$1$string/: Substitution.
    • The qq/...../ or "...." operator: interpolation.
    • open: print to an in-memory filehandle opened in append mode, e.g. open my $fh, '>>', \$variable or die $!;

    I'm sure I've missed a few, but these ways jump to mind immediately. The last method I listed, open, is pretty obfuscatory in nature. There aren't many situations I can think of where it would be a favorable approach, especially if simple concatenation is your goal. But it's there, so I mentioned it.

    dot (.) and dot-equals (.=) are definitely the simplest and clearest approaches.


    Dave

      Examples of mentioned solutions:

      • $data = $data . $more;
      • $data .= $more;
      • substr($data, length($data), 0, $more);
      • substr($data, length($data)) = $more;
      • $data = join('', $data, $more);
      • $data =~ s/\z/$more/;
      • $data = "$data$more";
      • { open(my $fh, '>>', \$data) or die $!; print $fh $more; } (Requires 5.8)
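A quick way to convince yourself that the forms listed above are interchangeable is to run each one on its own copy of the same string and compare the results. The following is only a sketch: the variable names and the sample strings are made up for illustration, and the in-memory filehandle variant assumes append mode ('>>') and Perl 5.8 or later.

```perl
use strict;
use warnings;

my ($base, $more) = ('Old buffer ', 'data');

# Each sub appends $more to its own copy of $base and returns the result.
my %append = (
    dot       => sub { my $d = $base; $d = $d . $more;                  $d },
    dot_eq    => sub { my $d = $base; $d .= $more;                      $d },
    substr4   => sub { my $d = $base; substr($d, length($d), 0, $more); $d },
    substr_lv => sub { my $d = $base; substr($d, length($d)) = $more;   $d },
    join      => sub { my $d = $base; $d = join('', $d, $more);         $d },
    subst     => sub { my $d = $base; $d =~ s/\z/$more/;                $d },
    interp    => sub { my $d = $base; $d = "$d$more";                   $d },
    in_memory => sub {
        my $d = $base;
        open(my $fh, '>>', \$d) or die "open: $!";   # '>>' appends to the scalar
        print {$fh} $more;
        close $fh;
        $d;
    },
);

# Run every variant and show what it produced.
my %result = map { $_ => $append{$_}->() } keys %append;
printf "%-9s => %s\n", $_, $result{$_} for sort keys %result;
```

Note that the s/\z/$more/ form interpolates $more into the replacement, so it is only a fair comparison for plain strings like the one used here.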

      Other useful solutions:

      • $data = sprintf('%s%s', $data, $more);
      • $data = pack('a*a*', $data, $more);
      • $data = pack('(a*)*', $data, $more); (Requires 5.8)
      • $data = do { local $"; my @array = ($data, $more); "@array" };

      Other text formatting functions:

      • format formats text sent to a file handle.
      • formline is the guts of format.

      And that's only using tools meant for text manipulation and/or formatting. You could do weird stuff like:

      • $data = reverse scalar reverse $data, $more;
      • $data = do { local $;; my %hash; $hash{$data, $more}++; (keys(%hash))[0] };

      The example of the open function you've mentioned has become of some interest also.

      I am able to open and print to the in-memory filehandle, though unable to syswrite to it and i don't know why :(

      thank you for your patience,

      Isaac.

        Because syswrite operates at the system level, bypassing Perl's higher-level file IO.
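For anyone who wants to poke at this, here is a small sketch (variable names are illustrative). The first half shows the form that works: print to a scalar opened in append mode. The second half only reports what fileno and syswrite return for an in-memory handle on your perl; the values are deliberately not asserted, since I would not rely on them being the same across versions.

```perl
use strict;
use warnings;

my $data = 'head';

# '>>' opens the scalar for appending; '>' would truncate it first.
open(my $fh, '>>', \$data) or die "Cannot open in-memory handle: $!";
print {$fh} 'tail';
close($fh) or die "close: $!";
print "after print: $data\n";

# Probe on a separate scalar: is there an OS-level descriptor behind
# such a handle, and what does syswrite make of it?
my $probe = '';
open(my $mem, '>>', \$probe) or die "Cannot open in-memory handle: $!";
my $fd      = fileno($mem);
my $written = syswrite($mem, 'x');
printf "fileno: %s, syswrite returned: %s\n",
    defined $fd      ? $fd      : 'undef',
    defined $written ? $written : "undef ($!)";
close $mem;
```

If syswrite reports failure on your system, that is consistent with the explanation above: there is no system-level file behind the handle for it to write to.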

        Let me just say this however: Of the seven ways I listed for appending information to a string, in-memory filehandles are the most difficult, least Perlish one; the one that I strongly urge you to consider not using. There are three ways down from the top floor of the Sears Tower building: You can take the elevator, you can take the staircase, or you can jump off the roof. I would advise against jumping off the roof. But doing so will get you to the bottom, pronto.

        There is a very limited set of problems for which using in-memory filehandles will be the optimal choice. ...very limited, and you rarely see them in run-of-the-mill everyday code. It's a little like symbolic references (though even less useful); they exist, they have uses, but you will rarely see them in code, and will probably never actually need to implement them yourself.

        You may have one of those fairly uncommon situations where the in-memory filehandle leads to better code. If you do, I'd love to see what that application is. Then the next time someone asks where it's a good idea to use them, I'll have at least one good example. lol


        Dave

Re: simply appending to a scalar...
by Hue-Bond (Priest) on Jun 25, 2006 at 13:40 UTC

    Off the top of my head:

    use Benchmark qw/cmpthese/;

    my $append = 'B' x 50;

    cmpthese (100, {
        dot    => sub { my $c = 'A' x 50; $c = $c . $append for 1..1e5; },
        dot_eq => sub { my $c = 'A' x 50; $c .= $append for 1..1e5; },
        substr => sub { my $c = 'A' x 50; substr $c, length $c, 0, $append for 1..1e5; },
    });

    __END__
             Rate substr    dot dot_eq
    substr 16.5/s     --   -35%   -37%
    dot    25.5/s    55%     --    -2%
    dot_eq 26.0/s    58%     2%     --

    Update: Erm, had to put the 1e5 inside each sub to reduce the overhead of the my declaration.

    Update2: After playing with it for a while, I find it interesting what happens when $append has a size that is a power of 2. Size of $c doesn't matter:

    use Benchmark qw/cmpthese/;

    my $append = 'B' x 256;

    cmpthese (100, {
        dot    => sub { my $c = 'A' x 50; $c = $c . $append for 1..1e5; },
        dot_eq => sub { my $c = 'A' x 50; $c .= $append for 1..1e5; },
        substr => sub { my $c = 'A' x 50; substr $c, length $c, 0, $append for 1..1e5; },
    });

    __END__
             Rate substr dot_eq    dot
    substr 6.99/s     --   -35%   -49%
    dot_eq 10.7/s    53%     --   -21%
    dot    13.6/s    95%    27%     --

    So if you are working with 8 Kb chunks, you'd better use $data = $data . $more.

    --
    David Serrano

      So if you are working with 8 Kb chunks...

      ... you're doing IO and microbenchmarks of Perl operators really don't matter. If you really need all the speed you can get, and if you've already optimized the rest of the slow parts in your program, tune your buffer size.
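If you do get to the point of tuning the buffer size, something along these lines can serve as a starting point. It is a rough sketch, not a proper benchmark: slurp_chunks is a hypothetical helper, the 4 MB scratch file and the list of sizes are arbitrary, and timings against a freshly written (and therefore cached) file say little about a real socket or disk.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Time::HiRes ();

# Hypothetical helper: read a whole file with sysread, one chunk at a time,
# appending each chunk with .= as discussed in this thread.
sub slurp_chunks {
    my ($path, $chunk_size) = @_;
    open(my $in, '<:raw', $path) or die "open $path: $!";
    my ($data, $chunk) = ('', '');
    while (1) {
        my $n = sysread($in, $chunk, $chunk_size);
        die "sysread: $!" unless defined $n;
        last if $n == 0;            # end of file
        $data .= $chunk;
    }
    close $in;
    return $data;
}

# Build a 4 MB scratch file so the sketch is self-contained.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} 'x' x (4 * 1024 * 1024);
close($tmp) or die "close: $!";

# Time the same read with several chunk sizes.
my %length_for;
for my $size (1024, 8 * 1024, 64 * 1024, 1024 * 1024) {
    my $start = Time::HiRes::time();
    my $data  = slurp_chunks($path, $size);
    $length_for{$size} = length $data;
    printf "%8d-byte chunks: %.4f s\n", $size, Time::HiRes::time() - $start;
}
```

Run it against your real data source rather than a scratch file before drawing any conclusions.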

      How many times did you run the 256 one? dot_eq equals dot for me every time.

      ActivePerl 5.8.0 on WinXP.

        How many times did you run the 256 one?

        Before reading your post, 3 times; now 6. Always same results:

        substr: >6.80
        dot_eq: 10.2 .. 10.7
        dot:    13.3 .. 13.6

        This is perl, v5.8.8 built for i486-linux-gnu-thread-multi on Debian/Linux.

        --
        David Serrano

Re: simply appending to a scalar...
by bart (Canon) on Jun 25, 2006 at 15:34 UTC
    For what it's worth, I plan to implement some code that will be handling rather large amounts of data, but the data is likely to arrive in approx. 8 KB chunks from a filehandle with a sysread call.
    In that case, let sysread do the work for you.
    sysread FILEHANDLE,SCALAR,LENGTH,OFFSET

    An OFFSET may be specified to place the read data at some place in the string other than the beginning.

    Test script:
    $_ = "Old buffer ";
    sysread DATA, $_, 4, length;
    print;
    __DATA__
    datadatadata
    Result:
    Old buffer data
    
    So this effectively can be used to append.
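Extended to a full read loop, the same idea might look like the sketch below. The pipe is only a stand-in for the real data source, and the payload, the 1024-byte chunk size and the variable names are assumptions made for the example (the question mentions roughly 8 KB chunks).

```perl
use strict;
use warnings;

# A pipe stands in for the real data source (an assumption for this sketch).
pipe(my $reader, my $writer) or die "pipe: $!";
binmode $_ for $reader, $writer;

my $payload = join '', map { "line $_\n" } 1 .. 300;   # a few KB of sample data
print {$writer} $payload;
close($writer) or die "close: $!";                     # signals end of data

my $data       = 'Old buffer ';
my $chunk_size = 1024;
my $reads      = 0;

while (1) {
    # OFFSET = length($data) makes sysread store the new bytes after the
    # current end of $data, so no separate append step is needed.
    my $n = sysread($reader, $data, $chunk_size, length $data);
    die "sysread: $!" unless defined $n;
    last if $n == 0;               # 0 means end of file
    $reads++;
}
close $reader;

printf "%d reads, %d bytes total\n", $reads, length $data;
```

Checking for undef separately from 0 matters here: undef is an error, while 0 simply means the other end has finished sending.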
      bingo, i'm very grateful for the answers i've received,
      indeed sysread() can do all that i need and i overlooked that.
      thank you :)

      Isaac.

Node Type: perlquestion [id://557427]
Approved by Limbic~Region