PerlMonks  

Re^3: use File::Slurp for (!"speed");

by revdiablo (Prior)
on Nov 08, 2004 at 22:58 UTC


in reply to Re^2: use File::Slurp for (!"speed");
in thread use File::Slurp for (!"speed");

let's not arbitrarily turn something into a function call that doesn't have to be

I didn't turn it into a function call arbitrarily. In fact, I explained my reasoning for doing so. I was attempting to compare the speed of File::Slurp to the idiomatic slurp. I was not attempting to measure the speed of Perl's subroutine calls, much less loading of modules. It's already well-established that these things are slow.

This is how I see things so far: you stated that File::Slurp is slower than the idiomatic slurp. I demonstrated that it is, but only under very constrained circumstances. Then, you constrained the circumstances even further. I don't get the point you're going for here...

Re^4: use File::Slurp for (!"speed");
by rcaputo (Chaplain) on Nov 09, 2004 at 20:15 UTC
    I didn't turn it into a function call arbitrarily. In fact, I explained my reasoning for doing so. I was attempting to compare the speed of File::Slurp to the idiomatic slurp. I was not attempting to measure the speed of Perl's subroutine calls, much less loading of modules. It's already well-established that these things are slow.

    A fairer benchmark would compare the slurp methods as they'll be used. Since idiomatic slurp doesn't require a function call, it seems sensible not to add the extra overhead into the benchmark. Unfortunately the modular solution can't escape Perl's function calls, so the overhead must be factored into its performance.

    I'm sure there's a point where File::Slurp's efficiency overwhelms the function call penalty. It would be interesting to see where that point actually is, even though it probably varies from one system to another.

    The original poster's argument is that their system slurps files that fall below the break-even point for File::Slurp. It's therefore more efficient (in runtime speed) to use inline idiomatic slurp.

    -- Rocco Caputo - http://poe.perl.org/

      The original poster's argument is that their system slurps files that fall below the break-even point for File::Slurp. It's therefore more efficient (in runtime speed) to use inline idiomatic slurp.

      Indeed, I acknowledged that when I said, "don't get me wrong. I'm not trying to disagree with your main point. In your particular case, switching to File::Slurp was probably not the right idea."

      A fairer benchmark would compare the slurp methods as they'll be used. Since idiomatic slurp doesn't require a function call, it seems sensible not to add the extra overhead into the benchmark.

      Perhaps the idiomatic slurp is almost always used inline, but that was not an assumption I was willing to make. I've used the idiomatic slurp wrapped in a subroutine before, and it's not inconceivable to me that others would do the same. I was trying to compare the speed of the two slurp methods, not the speed of their most common usage patterns.

      I figured some folks would take issue with that, which is why I wrote the paragraph explaining my reasoning. Looking back, I probably should have just included the inline version in my benchmark in the first place, rather than trying to explain it away.
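      For readers who have not seen both forms side by side, here is a minimal sketch of the idiomatic slurp used inline and wrapped in a subroutine. It is an illustration only, not part of the benchmark: the name `slurp_wrapped` and the temporary file are assumptions made for the example, and only core modules are used.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper name; the idiom inside is the one from the benchmark.
sub slurp_wrapped {
    my ($path) = @_;
    # @ARGV takes the path and $/ is left undef, so <> reads the whole file.
    local (@ARGV, $/) = $path;
    return scalar(<>);
}

# Scratch file so the example is self-contained.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "line one\nline two\n";
close $fh or die "close failed: $!";

# Same idiom twice: once behind a sub call, once inline in a do block.
my $wrapped = slurp_wrapped($path);
my $inline  = do { local (@ARGV, $/) = $path; <> };

print "Both forms read identical content\n" if $wrapped eq $inline;
```

      The two forms return the same data; the only difference the benchmark can pick up between them is the cost of the extra subroutine call.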

      I'm sure there's a point where File::Slurp's efficiency overwhelms the function call penalty. It would be interesting to see where that point actually is, even though it probably varies from one system to another.

      Then let's find out, shall we? Here's an updated version of the benchmark:

      use strict;
      use warnings;

      use Benchmark qw(cmpthese);
      use File::Slurp;

      my $TESTFILE = "foo";
      die "'$TESTFILE' exists" if -e $TESTFILE;

      sub slurp { local (@ARGV, $/) = $_[0]; <> }

      for my $size (500, 5_000, 50_000, 500_000, 5_000_000, 25_000_000) {
          print "--- Test size: $size\n";

          open my $out, ">", $TESTFILE;
          print $out "x"x$size;
          close $out;

          cmpthese(-2, {
              fs_sub   => sub { my $x = read_file $TESTFILE },
              is_sub   => sub { my $x = slurp $TESTFILE },
              is_nosub => sub { my $x = do { local (@ARGV, $/) = $TESTFILE; <> }; },
          });

          unlink $TESTFILE;
      }

      And here are the results I got:

      $ perl benchmark
      --- Test size: 500
                  Rate   fs_sub   is_sub is_nosub
      fs_sub   30871/s       --     -35%     -39%
      is_sub   47468/s      54%       --      -7%
      is_nosub 50957/s      65%       7%       --
      --- Test size: 5000
                  Rate   fs_sub   is_sub is_nosub
      fs_sub   28327/s       --     -29%     -33%
      is_sub   40049/s      41%       --      -5%
      is_nosub 42162/s      49%       5%       --
      --- Test size: 50000
                  Rate   is_sub is_nosub   fs_sub
      is_sub   11441/s       --      -3%     -15%
      is_nosub 11763/s       3%       --     -13%
      fs_sub   13453/s      18%      14%       --
      --- Test size: 500000
                Rate   is_sub is_nosub   fs_sub
      is_sub   279/s       --     -14%     -16%
      is_nosub 326/s      17%       --      -1%
      fs_sub   331/s      18%       1%       --
      --- Test size: 5000000
                 Rate   is_sub is_nosub   fs_sub
      is_sub   29.3/s       --     -13%     -17%
      is_nosub 33.8/s      15%       --      -4%
      fs_sub   35.3/s      21%       5%       --
      --- Test size: 25000000
                 Rate   is_sub is_nosub   fs_sub
      is_sub   6.10/s       --     -13%     -18%
      is_nosub 7.04/s      15%       --      -6%
      fs_sub   7.46/s      22%       6%       --

      The 50k test is rather strange. The IS in a sub is only 3% slower than the inline IS, and FS is 14% faster than the inline version. But past that point, FS and the inline IS are essentially neck and neck. FS starts to pull away once we get into the 5M and 25M tests, but at those sizes, slurping seems a bit dubious. Like you said, this is probably all system-dependent, but interesting nonetheless.
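      To put the percentages in absolute terms, here is a rough sketch that converts the reported rates into time per call and finds the first tested size at which fs_sub is faster than is_nosub. The rates are copied from the results above. The conversion is an illustrative addition rather than part of the benchmark, and it says nothing about where the break-even point falls between the tested sizes.

```perl
use strict;
use warnings;

# Rates in calls per second, copied from the benchmark results above.
my @results = (
    #       size  fs_sub  is_sub  is_nosub
    [        500,  30871,  47468,  50957 ],
    [      5_000,  28327,  40049,  42162 ],
    [     50_000,  13453,  11441,  11763 ],
    [    500_000,    331,    279,    326 ],
    [  5_000_000,   35.3,   29.3,   33.8 ],
    [ 25_000_000,   7.46,   6.10,   7.04 ],
);

my $crossover;
for my $row (@results) {
    my ($size, $fs, $is_sub, $is_nosub) = @$row;

    # Microseconds per call is the inverse of the rate.
    my ($fs_us, $inline_us) = map { 1e6 / $_ } $fs, $is_nosub;

    printf "%10d bytes: fs_sub %10.1f us, is_nosub %10.1f us, diff %+10.1f us\n",
        $size, $fs_us, $inline_us, $fs_us - $inline_us;

    # Remember the first tested size where File::Slurp has the higher rate.
    $crossover //= $size if $fs > $is_nosub;
}

print "First tested size where fs_sub beats is_nosub: $crossover\n";
```

      On these numbers the sign of the difference flips somewhere between the 5,000 and 50,000 byte tests. Narrowing it down further would mean rerunning the benchmark with sizes in that range.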
