Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling
 
PerlMonks  

Reminder to self: must use Memoize more often!

by tobyink (Abbot)
on Jan 12, 2013 at 12:13 UTC ( #1013033=CUFP: print w/ replies, xml ) Need Help??

For functions which are non-volatile (i.e. the same inputs will always produce the same outputs) with no side-effects, it often makes sense to "memoize" them. That is, cache the results for the next time they're called.

This is an especially good idea in the case of recursive functions. The following benchmark script illustrates the massive speed up.

use strict; use warnings; use Memoize; use Benchmark 'cmpthese'; sub fib1 { my $n = shift; return $n if $n < 2; return fib1($n-1) + fib1($n-2); } sub fib2 { my $n = shift; return $n if $n < 2; return fib2($n-1) + fib2($n-2); } memoize('fib2'); my %fib3_cache; sub fib3 { my $n = shift; return $fib3_cache{$n} ||= do { $n < 2 ? $n : fib3($n-1) + fib3($n-2) }; } for (0..5) { die "something bad" unless fib1($_)==fib2($_); die "something bad" unless fib1($_)==fib3($_); } cmpthese(-1, { fib1 => sub { fib1 20 }, fib2 => sub { fib2 20 }, fib3 => sub { fib3 20 }, }); __END__ Rate fib1 fib2 fib3 fib1 18.2/s -- -100% -100% fib2 37594/s 206668% -- -87% fib3 289129/s 1590107% 669% --

The Memoize module gives you a pretty decent speed up, without needing to change the guts of the function. Just add memoize('fib2') and the Memoize module will wrap the original function with memoization code.

The manually memoized version is clearly faster still, because it avoids the overhead of function wrapping - the memoization code is added to the guts of the function.

Of course, there's no such thing as a free lunch - you're trading CPU for memory.

perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'

Comment on Reminder to self: must use Memoize more often!
Select or Download Code
Re: Reminder to self: must use Memoize more often!
by BrowserUk (Pope) on Jan 12, 2013 at 13:55 UTC
    return $fib3_cache{$n} ||= do {

    I'd strongly recommend trading //= for ||=. Any function that can return 0, but costs to determine that, looses out as is.

    You can get greater benefits still for functions that take low integers as inputs by using an array instead of a hash for the cache:

    my @fib4_cache; sub fib4 { my $n = shift; return $fib4_cache[$n] //= do { $n < 2 ? $n : fib4($n-1) + fib4($n-2) }; } __END__ C:\test>memofib.pl Rate fib1 fib2 fib3 fib4 fib1 88.7/s -- -100% -100% -100% fib2 219835/s 247838% -- -89% -91% fib3 2073455/s 2338415% 843% -- -19% fib4 2551114/s 2877136% 1060% 23% --

    And see the benefits more clearly if you do away with some of the benchmark overheads by loosing the sub wrappers (The module adds its own wrapper internally):

    cmpthese(-1, { fib1 => q{ fib1 20 }, fib2 => q{ fib2 20 }, fib3 => q{ fib3 20 }, fib4 => q{ fib4 20 }, }); __END__ C:\test>memofib.pl Rate fib1 fib2 fib3 fib4 fib1 87.4/s -- -100% -100% -100% fib2 223254/s 255377% -- -89% -91% fib3 2071504/s 2370384% 828% -- -20% fib4 2579867/s 2952119% 1056% 25% --

    And finally, there's a couple more % to be had:

    my @fib5_cache; sub fib5 { my $n = shift; return $fib5_cache[ $n ] //= do { $n < 2 ? $n : ( $fib5_cache[ $n-1 ] //= fib5($n-1) ) + ( $fib5_cache[ $n-2 ] //= fib5($n-2) ) ## corrected; thanks +to moriitz }; } __END__ C:\test>memofib.pl Rate fib1 fib2 fib3 fib4 fib5 fib1 89.0/s -- -100% -100% -100% -100% fib2 218452/s 245352% -- -89% -91% -91% fib3 2022174/s 2272005% 826% -- -18% -20% fib4 2478824/s 2785096% 1035% 23% -- -2% fib5 2537045/s 2850512% 1061% 25% 2% --

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      On my machine at least, fib6 is 4% faster than fib5:

      my @fib6_cache; sub fib6 { my $n = shift; $fib6_cache[ $n ] //= $n < 2 ? $n : ( $fib6_cache[ $n-1 ] //= fib6($n-1) ) + ( $fib6_cache[ $n-2 ] //= fib6($n-2) ); }
      perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'

        Fair enough. I guess we're into Benchmark vagaries. (T'was a bit unnecessary given the 1000% already gained :)


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Reminder to self: must use Memoize more often!
by flexvault (Parson) on Jan 12, 2013 at 17:34 UTC

    Hello tobyink,

    I use 'Memoize' and it is an awesome product. But before I knew about it, I did some memorization using a hash. More than 10 years ago I wrote the following code for an editor that I converted from 'traditional C' to Perl:

    sub ATS { # PyrEdit assumes the HOME is col 1, row 1 our %CRToutput; my $col = int(shift); my $row = int(shift); my $Saved = "$col|$row"; if (exists $CRToutput{$Saved}) { return($CRToutput{$Saved}); } my $loc; if ( $SYS{"ATS"} eq "xterm" ) ## all 'xterm' compatibles { $loc = "\e\[$row\;$col"."H"; } else { ## PyrEdit assumes that HOME is col 1, row 1 if ( $SYS{"ATSDEC"} eq "Y" ) { $row--; $col--; } $loc = qx/tput cup $row $col/; chomp($loc); } $CRToutput{$Saved} = $loc; return($loc); }

    In *nix the 'tput' command is very expensive, and since I had to support lots of different terminal types, I cached the results using a hash. The results were as good as the 'traditional C' after just a few seconds of use.

    Just an alternative.

    Update: The original sub didn't have the 'my's. It was only after joining PM and learning of the value of 'use strict; use warnings' that I went back and put the 'my's into the code. I'm sure I've updated it in more than one place. It's currently on my mind since I received a package of xml files on Thursday without any "\n" and I added an '-xml' option the next day. After 2-3 hours of coding, I have a XML formatter that show the proper indention and structure for all(mine) XML files.

    "Well done is better than well said." - Benjamin Franklin

Re: Reminder to self: must use Memoize more often!
by davido (Archbishop) on Jan 13, 2013 at 06:12 UTC

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: CUFP [id://1013033]
Front-paged by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others surveying the Monastery: (3)
As of 2014-08-31 02:54 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (294 votes), past polls