Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask

Re: code optimization

by choroba (Bishop)
on Nov 03, 2011 at 11:21 UTC ( #935635=note: print w/replies, xml ) Need Help??

in reply to code optimization

I do not see a way to make it much faster, but still: do not compute the division several times. Also, if you comment the defined $a and defined $w and part in my code, the sub will run about 5% faster, but you should be sure your input does not contain any invalid lines.
#!/usr/bin/perl use warnings; use strict; use Benchmark 'cmpthese'; sub arivu { my $input = shift; open my $FH, '<', $input or die; my $minimumAmount = 1e12; my ($amount, $weight); while (<$FH>) { if (/(\d+) (\d+)/) { if ($minimumAmount > ($1/$2)) { $minimumAmount = $1 / $2; $amount = $1; $weight = $2; } } } close $FH; return "$amount\t$weight\n"; } # arivu sub choroba { my $input = shift; open my $FH, '<', $input or die; my $minimumAmount = 1e12; my ($amount, $weight); while (<$FH>) { my ($a, $w) = split; if (defined $a and defined $w and (my $ratio = $a / $w) < $minimumAmount) { $minimumAmount = $ratio; $amount = $a; $weight = $w; } } close $FH; return "$amount\t$weight\n"; } # choroba my $input; $input .= (join ' ', map int(1 + rand 1000), 1 .. 2) . "\n" for 1 .. 1 +000; cmpthese(-1, {arivu => sub {arivu \$input}, choroba => sub {choroba \$input}, }); print arivu \$input; print choroba \$input; __END__ Output: Rate arivu choroba arivu 466/s -- -17% choroba 565/s 21% -- 8 953 8 953

Replies are listed 'Best First'.
Re^2: code optimization
by JavaFan (Canon) on Nov 03, 2011 at 12:28 UTC
    do not compute the division several times.
    That's debatable. The trade-off is, arivu calculates the division only a second time if it's actually smaller than the previous minimum value. You store the result in a variable *all the time*. Your benchmark is biased in your favour because Perl actually keeps the variable skeleton around for each iteration of the benchmark. You're also using split instead of pattern with backreferences. If both "arivu" and "choroba" use split, the smallest ratio is in the beginning of the list, and the benchmark uses strings instead of subs (to avoid costs being amortized over runs), arivu's solution is actually marginally faster. (On my machine, and by 5% +/- 6%).

    So, what can we conclude?

    • Benchmarking is harder than you think.
    • The real gain is in the split vs backref pattern.
    • Doing a "low cost" operation all the time isn't always faster than doing a slightly "less lower cost" sometimes.
      Interesting. I actually forgot to mention split in my comment, thanks for pointing it up.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://935635]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (3)
As of 2018-11-18 11:58 GMT
Find Nodes?
    Voting Booth?
    My code is most likely broken because:

    Results (205 votes). Check out past polls.