Re: Binary files

by BrowserUk (Patriarch)
on Mar 16, 2005 at 15:40 UTC ( [id://439984]=note )


in reply to Binary files

my $xlsdata = do{ local $/ = \-s( XLSFILE ); <XLSFILE> };
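For anyone unfamiliar with the idiom: assigning a reference to an integer to `$/` asks for fixed-size records, so a record length equal to the file size returns the whole file in one read. Below is a minimal sketch of that next to the traditional undefined-`$/` slurp. The temp file, `$path`, `$by_size` and `$plain` are illustrative names that are not in the original post, and a lexical handle opened with `:raw` is assumed in place of the bareword `XLSFILE`.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical test data: a small temp file holding every byte value,
# standing in for the real spreadsheet.
my ( $tmp, $path ) = tempfile( UNLINK => 1 );
binmode $tmp;
print {$tmp} join '', map { chr } 0 .. 255;
close $tmp or die "close failed: $!";

# Slurp by size: $/ holds a reference to the record length in bytes,
# here the size of the whole file.
open my $fh, '<:raw', $path or die "Cannot open $path: $!";
my $size    = -s $fh;
my $by_size = do { local $/ = \$size; <$fh> };
close $fh;

# Traditional slurp: an undefined $/ reads to end of file.
open $fh, '<:raw', $path or die "Cannot open $path: $!";
my $plain = do { local $/; <$fh> };
close $fh;

printf "%d bytes read, identical: %s\n",
    length($by_size), ( $by_size eq $plain ? 'yes' : 'no' );
```

The sketch assumes a non-empty file; an empty one would give a record length of zero, which is best guarded against separately.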

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco.
Rule 1 has a caveat! -- Who broke the cabal?

Replies are listed 'Best First'.
Re^2: Binary files
by tlm (Prior) on Mar 17, 2005 at 14:55 UTC

    I don't get it. Why would one prefer

    my $xlsdata = do{ local $/ = \-s( XLSFILE ); <XLSFILE> };
    over plain ol'
    my $xlsdata = do { local $/; <XLSFILE> };
    ??

    the lowliest monk

      Leeeeeeeeeeeeeeeeeeeeets BENCHMARK!

      use Benchmark qw(:all);

      timethese( 100, {
          'By File Size' => sub {
              open XLSFILE, "x";
              my $xlsdata = do { local $/ = \-s (XLSFILE); <XLSFILE> };
          },
          'Plain Slurp' => sub {
              open XLSFILE, "x";
              my $xlsdata = do { local $/; <XLSFILE> };
          },
      } );
      Where "x" is a 22MB file:
      By File Size: 11 wallclock secs ( 7.54 usr +  3.11 sys = 10.65 CPU) @  9.39/s (n=100)
      Plain Slurp:  11 wallclock secs ( 8.50 usr +  2.95 sys = 11.45 CPU) @  8.73/s (n=100)
      It appears that using the file size is a gnat's whisker faster (9.39/s against 8.73/s), which is as I would expect, as the traditional slurp will have to go looking for the end of the file.

      /J\

      I originally got it into my head that setting the size for slurping was a good thing after reading Slurp-Eazy. I seem to recall that it seemed quicker on my old portable under 5.6.1.

      However, looking again now under 5.8.4, the picture is both mixed and confusing:

      (ordered both ways to eliminate the possibility of "first run bias")

      #! perl -slw
      use strict;
      use Benchmark qw[ cmpthese ];

      open our $raw, '<:raw', $ARGV[ 0 ] or die $!;
      open our $txt, '< ', $ARGV[ 0 ] or die $!;
      our $s = -s $raw;

      cmpthese -3, {
          d_raw_trad => q[ my $data = do{ local $/; <$raw> }; ],
          c_raw_size => q[ my $data = do{ local $/ = \$s; <$raw> }; ],
          b_txt_trad => q[ my $data = do{ local $/; <$txt> }; ],
          a_txt_size => q[ my $data = do{ local $/ = \$s; <$txt> }; ],
      };

      __END__
      [15:46:09.53] P:\test>440395 100.dat
                      Rate   raw_size   raw_trad   txt_trad   txt_size
      a_raw_size    7969/s         --       -87%       -88%       -96%
      b_raw_trad   63136/s       692%         --        -2%       -69%
      c_txt_trad   64282/s       707%         2%         --       -69%
      d_txt_size  206315/s      2489%       227%       221%         --

      [15:53:19.11] P:\test>440395 100.dat
                      Rate c_raw_size d_raw_trad b_txt_trad a_txt_size
      c_raw_size    7855/s         --       -87%       -88%       -96%
      d_raw_trad   62625/s       697%         --        -4%       -70%
      b_txt_trad   65166/s       730%         4%         --       -68%
      a_txt_size  206449/s      2528%       230%       217%         --

      Now quite why slurping a file in text mode would be faster (and so much faster!) than doing so in raw mode leaves me at a total loss for an explanation. It seems that I should reassess what I thought I knew about Perl IO layers and performance.
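Two things may be worth ruling out before reading too much into those rates. The handles in the benchmark above are opened once and never rewound, so every iteration after the first starts at end of file. And the layer stacks on the two handles may not be what one assumes. The sketch below is only a way to check both. The temp file and variable names are illustrative stand-ins for `100.dat`, and `PerlIO::get_layers` is used to list the layers, whose output will vary by platform and perl version.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for 100.dat: a 100-byte temp file.
my ( $tmp, $path ) = tempfile( UNLINK => 1 );
binmode $tmp;
print {$tmp} 'x' x 100;
close $tmp or die "close failed: $!";

open my $raw, '<:raw', $path or die "Cannot open $path: $!";
open my $txt, '<',     $path or die "Cannot open $path: $!";

# Show which PerlIO layers each handle actually ended up with.
my @raw_layers = PerlIO::get_layers($raw);
my @txt_layers = PerlIO::get_layers($txt);
print "raw layers: @raw_layers\n";
print "txt layers: @txt_layers\n";

# Read twice from the same handle without rewinding: the second read
# starts at end of file and returns nothing.
my $first  = do { local $/; <$raw> };
my $second = do { local $/; <$raw> };

# Rewind to the start, then read again: the whole file comes back.
seek $raw, 0, 0 or die "seek failed: $!";
my $third = do { local $/; <$raw> };

printf "first: %d, second: %d, after seek: %d\n",
    length($first),
    ( defined $second ? length($second) : 0 ),
    length($third);
```

If the second read comes back empty, as it should, then a benchmark that reuses the handle is mostly timing reads at end of file. Adding a `seek` to each test snippet, or reopening the file inside it, would make the four cases comparable.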

