PerlMonks  

Re: How to concatenate N binary buffers?

by RichardK (Priest)
on Nov 13, 2012 at 17:07 UTC (#1003672)


in reply to How to concatenate N binary buffers?

Why bother? Why not just write each buffer separately?

    syswrite $out, $buf1, $bufsize;
    syswrite $out, $buf2, $bufsize;


Re^2: How to concatenate N binary buffers?
by AnomalousMonk (Monsignor) on Nov 13, 2012 at 17:57 UTC
        syswrite $out, $buf1, $bufsize;
        syswrite $out, $buf2, $bufsize;

    If a constant number $bufsize of bytes is always written, then when fewer than $bufsize bytes are read, typically at the end of a file, a block of junk will be written to the output file between the $buf1 and $buf2 blocks (Update: and also from the end of $buf2 to the end of the output file).

    Better to follow choroba's advice and record the bytes actually read, then write just that number of bytes (but still without the need for explicit concatenation):
        syswrite $out, $buf1, $size1;
        syswrite $out, $buf2, $size2;
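
    The advice above can be sketched as a copy loop that records each sysread's return value and writes exactly that many bytes. This is a minimal illustration, not code from the thread; the file names `in.bin` and `out.bin` are hypothetical:

    ```shell
    # Create a 5-byte input file, then copy it with a 4-byte buffer so the
    # last read is deliberately short (1 byte instead of $bufsize).
    printf 'hello' > in.bin
    perl -e '
        open my $in,  "<:raw", "in.bin"  or die $!;
        open my $out, ">:raw", "out.bin" or die $!;
        my $bufsize = 4;
        while ( (my $size = sysread($in, my $buf, $bufsize)) > 0 ) {
            syswrite $out, $buf, $size;   # write what was read, not $bufsize
        }
    '
    cmp in.bin out.bin && echo OK
    ```

    Because the short final read writes only $size bytes, the copy is byte-identical and no junk block appears at the end.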

Re^2: How to concatenate N binary buffers?
by mantager (Sexton) on Nov 14, 2012 at 07:03 UTC

    It's what I am going to do, but it seemed strange to repeat the syswrite N times, once for each data chunk. I was just wondering if there's a "safe" way to concatenate data (meaning: does the "." operator change the buffers or not?) and then issue just one write (writing a larger chunk should also optimize I/O).

    Another interesting question: is it faster to concat data and then write once, or is it faster to issue one syswrite for each buffer?
    I'm afraid I'll have no time to benchmark.

    Thanks.

      does the "." operator change the buffers or not?
      T.I.T.S. Or, Try It To See.:
      # perl -E '$s .= chr for 0 .. 31; say $s' | xxd
      0000000: 0001 0203 0405 0607 0809 0a0b 0c0d 0e0f  ................
      0000010: 1011 1213 1415 1617 1819 1a1b 1c1d 1e1f  ................
      0000020: 0a
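
      In the same try-it-to-see spirit, a quick check (hypothetical values, not from the thread) that "." builds a new string and leaves both operands untouched, even with embedded NUL and high bytes:

      ```shell
      perl -E '
          my $a = "\x00\x01";
          my $b = "\xfe\xff";
          my $c = $a . $b;                  # concatenation allocates a new string
          say length($c);                   # 4
          say $a eq "\x00\x01" && $b eq "\xfe\xff" ? "unchanged" : "changed";
      '
      ```

      So concatenating binary buffers before a single syswrite is safe; the cost is only the extra copy into the new string.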
      I'll have no time to benchmark
      I am afraid we will have no time to answer.
      لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

        So I took the time to benchmark:

        $ ./syswrite_or_concat.pl 1000000
        Double syswrite:
        timethis 1000000:  2 wallclock secs ( 0.43 usr + 0.94 sys = 1.37 CPU) @ 729927.01/s (n=1000000)
        Data concat:
        timethis 1000000:  1 wallclock secs ( 0.32 usr + 0.45 sys = 0.77 CPU) @ 1298701.30/s (n=1000000)
        Data join:
        timethis 1000000:  1 wallclock secs ( 0.37 usr + 0.46 sys = 0.83 CPU) @ 1204819.28/s (n=1000000)
        #!/usr/bin/env perl
        # ex: set tabstop=4 et syn=perl:
        use strict;
        use warnings;
        use 5.010.001;
        use Benchmark qw(timethis);
        use Fcntl;

        my $count   = shift || 1_000_000;
        my $file    = '/dev/shm/outfile';
        my $bufsize = 256;
        my $data1   = chr(1) x $bufsize;
        my $data2   = chr(2) x $bufsize;

        say "Double syswrite: ";
        my $out;
        sysopen($out, $file, O_WRONLY|O_CREAT)
            or die "unable to write on $file";
        timethis($count, sub {
            syswrite($out, $data1, $bufsize) == $bufsize
                or die "unable to write whole data1 buffer";
            syswrite($out, $data2, $bufsize) == $bufsize
                or die "unable to write whole data2 buffer";
        });
        close $out;

        say "Data concat:";
        sysopen($out, $file, O_WRONLY|O_CREAT)
            or die "unable to write on $file";
        my $doublebuf = 2 * $bufsize;
        timethis($count, sub {
            syswrite($out, $data1.$data2, $doublebuf) == $doublebuf
                or die "unable to write all data";
        });
        close $out;

        say "Data join:";
        sysopen($out, $file, O_WRONLY|O_CREAT)
            or die "unable to write on $file";
        timethis($count, sub {
            syswrite($out, join('', $data1, $data2), $doublebuf) == $doublebuf
                or die "unable to write all data";
        });
        close $out;
        unlink $file;
      Another interesting question: is it faster to concat data and then write once, or is it faster to issue one syswrite for each buffer? I'm afraid I'll have no time to benchmark.

      You try to read data from a broken RAID. And your main problem is performance? Sorry, I don't get it. You should be happy with every single bit you can still read. And while one can use perl for data rescue, I think you should not use perl for the job. There are better tools, and you seem to know none of them, including Perl. Pay an expert to recover your RAID. If you need your data back fast, you'll probably have to pay an extra fee.

      Questions you should ask later: Why did nobody notice that the RAID had problems before it failed? Why is there no recent, verified backup of the RAID?

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
        You try to read data from a broken RAID. And your main problem is performance?

        Yes. Well, no. I mean, I was just wondering... but what if the RAID array were many terabytes in size? It could take days to rescue it. A few days saved could come in handy :)

        I think you should not use perl for the job. There are better tools, and you seem to know none of them, including Perl

        Come on, now you're being unfair. I am no Perl developer, for sure, but I know a little. And if I'm here asking, it's surely because I don't know everything... why is there a "Seekers of Perl Wisdom" section at all, otherwise? It's not "guru meditations", afaik.
        As for the "better tools", I would really like to know more, if you're willing to tell me.

        Pay an expert to recover your RAID

        Why? I did it myself. Using Perl, among other things. I'm enough of an expert to get data back from a corrupted disk, RAID array or whatever else is still willing to give me some bits of information :)
