Beefy Boxes and Bandwidth Generously Provided by pair Networks
Come for the quick hacks, stay for the epiphanies.
 
PerlMonks  

Re (tilly) 3: Efficiency Question

by tilly (Archbishop)
on Feb 23, 2001 at 23:58 UTC ( [id://60541]=note: print w/replies, xml ) Need Help??


in reply to Re: Re (tilly) 1: Efficiency Question
in thread Efficiency Question

Fastest and smallest are incompatible.

Also do you need at least 30, or exactly?

If I wanted to be bare bones I would do something like this untested code:

use strict; use vars qw($max_size $reset_size); $max_size = 60; $reset_size = 30; { my %count; my %cache; sub get_elem { my $id = shift; if (exists $count{$id}) { ++$count{$id}; return $cache{$id}; } else{ return; } } my $size; sub set_elem { { my $id = shift; my $elem = shift; if (exists $count{$id}) { $cache{$id} = $elem; } else { if ($max_size < ++$size) { _clear_cache(); } $count{$id} = 0; $cache{$id} = $elem; } redo if @_; } } sub _clear_cache { my @ids = sort {$count{$b} <=> $count{$a}} keys %count; @ids = @ids[0..($reset_size-1)]; my %new; %count = (); foreach (@ids) { $new{$_} = $cache{$_}; $count{$_} = 0; } %cache = %new; $size = $reset_size; } }
Note that I go over 30 here, but doing so makes the actual lookup much faster.

Replies are listed 'Best First'.
Re: Re (tilly) 3: Efficiency Question
by MeowChow (Vicar) on Feb 24, 2001 at 01:56 UTC
    You can speed up the _clear_cache, which would in probably eat up the most cycles, being called once every 30 insertions, like so:
    sub clear_cache { my @ids = sort {$count{$b} <=> $count{$a}} keys %count; splice @ids, 0, $reset_size-1; delete @cache{@ids}; %count = (); @count{keys %cache} = (0) x $reset_size; $size = $reset_size; }
    Faster to delete elements from an existing hash, I've found, than to rebuild a hash. This of course depends mostly on the ratio of deleted/total hash entries.

    Also, this is isn't quite a frequency-expiration cache, since time is not a factor into the expiration routine, and you would stand a pretty good chance of getting stale cache entries that once had a bunch of hits, but remain in the cache even though they are entirely unused. I'd probably switch the count over to something like a shift-based MRU, called once every 30 fetches:

    my $do_shift = 0; sub get_elem { my $id = shift; if (++$do_shift == 30) { shift_mru(); $do_shift = 0; } if (exists $count{$id}) { $count{$id} |= 0x80000000; return $cache{$id}; } else{ return; } } sub shift_mru { while (my ($id, $count) = each %count) { $count{$id} >>= 1; } }
    I also seem to recall that there are ways to optimize the sorting algorithm for a return of fewer results than the total number to be sorted, but that's getting rather hairy :-)
       MeowChow                                   
                   s aamecha.s a..a\u$&owag.print

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://60541]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others wandering the Monastery: (3)
As of 2025-07-14 06:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?
    erzuuliAnonymous Monks are no longer allowed to use Super Search, due to an excessive use of this resource by robots.