Re^5: use File::Slurp for (!"speed");

by revdiablo (Prior)
on Nov 09, 2004 at 22:07 UTC (#406533)

in reply to Re^4: use File::Slurp for (!"speed");
in thread use File::Slurp for (!"speed");

The original poster's argument is that their system slurps files that fall below the break-even point for File::Slurp. It's therefore more efficient (in runtime speed) to use inline idiomatic slurp.

Indeed, I acknowledged that when I said, "don't get me wrong. I'm not trying to disagree with your main point. In your particular case, switching to File::Slurp was probably not the right idea."

A fairer benchmark would compare the slurp methods as they'll be used. Since idiomatic slurp doesn't require a function call, it seems sensible not to add the extra overhead into the benchmark.

Perhaps the idiomatic slurp is almost always used inline, but that was not an assumption I was willing to make. I've used the idiomatic slurp wrapped in a subroutine before, and it's not inconceivable to me that others would do the same. I was trying to compare the speed of the two slurp methods, not the speed of their most common usage patterns.
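For anyone not familiar with the idiom, here is a minimal sketch of the two usage patterns being compared. The scratch file and its contents are only placeholders for illustration, not part of the benchmark.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file so the sketch is self-contained (placeholder contents).
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "line one\nline two\n";
close $fh or die "Can't close '$file': $!";

# Inline: localizing @ARGV points <> at our file, and localizing $/
# (left undefined by the list assignment) makes one read return it all.
my $inline = do { local (@ARGV, $/) = $file; <> };

# The same idiom wrapped in a sub, at the cost of one function call per slurp.
sub slurp { local (@ARGV, $/) = $_[0]; <> }
my $wrapped = slurp($file);

print "same contents\n" if $inline eq $wrapped;
```

Both forms do the same read; the only difference the benchmark can pick up between them is the call overhead.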

I figured some folks would take issue with that, which is why I wrote the paragraph explaining my reasoning. Looking back, I probably should have just included the inline version in my benchmark in the first place, rather than trying to explain it away.

I'm sure there's a point where File::Slurp's efficiency overwhelms the function call penalty. It would be interesting to see where that point actually is, even though it probably varies from one system to another.

Then let's find out, shall we? Here's an updated version of the benchmark:

use strict;
use warnings;

use Benchmark qw(cmpthese);
use File::Slurp;

my $TESTFILE = "foo";
die "'$TESTFILE' exists" if -e $TESTFILE;

sub slurp { local (@ARGV, $/) = $_[0]; <> }

for my $size (500, 5_000, 50_000, 500_000, 5_000_000, 25_000_000) {
    print "--- Test size: $size\n";

    open my $out, ">", $TESTFILE;
    print $out "x"x$size;
    close $out;

    cmpthese(-2, {
        fs_sub   => sub { my $x = read_file $TESTFILE },
        is_sub   => sub { my $x = slurp $TESTFILE },
        is_nosub => sub { my $x = do { local (@ARGV, $/) = $TESTFILE; <> }; },
    });

    unlink $TESTFILE;
}

And here are the results I got:

$ perl benchmark
--- Test size: 500
            Rate   fs_sub   is_sub is_nosub
fs_sub   30871/s       --     -35%     -39%
is_sub   47468/s      54%       --      -7%
is_nosub 50957/s      65%       7%       --
--- Test size: 5000
            Rate   fs_sub   is_sub is_nosub
fs_sub   28327/s       --     -29%     -33%
is_sub   40049/s      41%       --      -5%
is_nosub 42162/s      49%       5%       --
--- Test size: 50000
            Rate   is_sub is_nosub   fs_sub
is_sub   11441/s       --      -3%     -15%
is_nosub 11763/s       3%       --     -13%
fs_sub   13453/s      18%      14%       --
--- Test size: 500000
          Rate   is_sub is_nosub   fs_sub
is_sub   279/s       --     -14%     -16%
is_nosub 326/s      17%       --      -1%
fs_sub   331/s      18%       1%       --
--- Test size: 5000000
           Rate   is_sub is_nosub   fs_sub
is_sub   29.3/s       --     -13%     -17%
is_nosub 33.8/s      15%       --      -4%
fs_sub   35.3/s      21%       5%       --
--- Test size: 25000000
           Rate   is_sub is_nosub   fs_sub
is_sub   6.10/s       --     -13%     -18%
is_nosub 7.04/s      15%       --      -6%
fs_sub   7.46/s      22%       6%       --

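On my system, File::Slurp (FS) loses to the idiomatic slurp (IS) at 500 and 5k but wins at 50k, so the break-even point falls somewhere in between. The 50k test is rather strange, though. The IS in a sub is only 3% slower than it is inline, while FS is 14% faster than the inline IS. At 500k, FS and the inline IS are essentially neck and neck (1% apart). FS starts to pull away again once we get into the 5m and 25m tests, by 5% and 6%, but with files that large, slurping seems a bit dubious anyway. Like you said, this is probably all system-dependent, but interesting nonetheless.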
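To narrow down where the crossover between the 5k and 50k tests actually falls, one could rerun the comparison at intermediate sizes. This is only a sketch: the sizes, the make_test_file helper, and the shorter -1 run time are my own assumptions, and I haven't posted numbers for it.

```perl
use strict;
use warnings;

use Benchmark qw(cmpthese);
use File::Slurp;
use File::Temp qw(tempfile);

# Hypothetical helper: write $size bytes of "x" to a temp file and
# return its path. The file is removed when the program exits.
sub make_test_file {
    my ($size) = @_;
    my ($out, $file) = tempfile(UNLINK => 1);
    print {$out} "x" x $size;
    close $out or die "Can't close '$file': $!";
    return $file;
}

# Assumed intermediate sizes between the 5_000 and 50_000 tests.
for my $size (10_000, 20_000, 30_000, 40_000) {
    my $file = make_test_file($size);

    print "--- Test size: $size\n";
    cmpthese(-1, {
        fs_sub   => sub { my $x = read_file($file) },
        is_nosub => sub { my $x = do { local (@ARGV, $/) = $file; <> } },
    });
}
```

The first size at which fs_sub comes out ahead of is_nosub would be a rough estimate of the break-even point, for that machine only.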
