http://www.perlmonks.org?node_id=383137


in reply to Re^9: Re-orderable keyed access structure?
in thread Re-orderable keyed access structure?

But let's leave the ad hominem out...
Please do pick up a book or two on algorithms and data structures; this is stuff anyone who is serious about programming should know.

Yes. Let's do that.

In a heap with 1,000 elements, you need at most 10 swaps. How much money will you bet on splice?

Quite a lot, were I a betting man! :)

#! perl -slw use strict; use Benchmark qw[ cmpthese ]; our $SIZE ||= 1000; our $ITERS||= -5; sub spliceIt{ my @a = reverse 0 .. $SIZE; unshift @a, splice @a, $SIZE, 1; ## Update: changed push to unshif +t! # print "@a"; } sub heapIt { my @a = reverse 0 .. $SIZE; $a[ $SIZE ] = $SIZE + 1; moveUp( \@a, $SIZE ); # print "@a"; } sub moveUp { my( $ref, $l ) = @_; my $p = int $l /2; return if $p >= $l; @{ $ref }[ $p, $l ] = @{ $ref }[ $l, $p ]; moveUp( $ref, $p ); } print "Testing $SIZE items for $ITERS iterations"; cmpthese( $ITERS, { splice => \&spliceIt, heap => \&heapIt, }); __END__ P:\test>heaptest -ITERS=-5 -SIZE=100 Testing 100 items for -5 iterations Rate heap splice heap 19502/s -- -37% splice 30887/s 58% -- P:\test>heaptest -ITERS=-5 -SIZE=1000 Testing 1000 items for -5 iterations Rate heap splice heap 3159/s -- -4% splice 3288/s 4% -- P:\test>heaptest -ITERS=-5 -SIZE=10000 Testing 10000 items for -5 iterations Rate heap splice heap 327/s -- -0% splice 328/s 0% -- P:\test>heaptest -ITERS=-5 -SIZE=20000 Testing 20000 items for -5 iterations Rate splice heap splice 159/s -- -1% heap 160/s 1% --

From where you left off. A new item not currently in cache is called for, it is read from disk, the lowest item* (currently index 12) is replaced by the new item in the array** and the new item given a weight of 17.

a) 0 [ 16 ] b) 0 [ 16 ] c) 0 [ 16 ] d) 0 [ 16 ] e) 0 * 17 ] 1 [ 12 ] 1 [ 12 ] 1 [ 12 ] 1 * 17 ] 1 * 16 ] 2 [ 13 ] 2 [ 13 ] 2 [ 13 ] 2 [ 13 ] 2 [ 13 ] 3 [ 10 ] 3 [ 10 ] 3 * 17 ] 3 * 12 ] 3 [ 12 ] 4 [ 9 ] 4 [ 9 ] 4 [ 9 ] 4 [ 9 ] 4 [ 9 ] 5 [ 11 ] 5 [ 11 ] 5 [ 11 ] 5 [ 11 ] 5 [ 11 ] 6 [ 7 ] 6 * 17 ] 6 * 10 ] 6 [ 10 ] 6 [ 10 ] 7 [ 6 ] 7 [ 6 ] 7 [ 6 ] 7 [ 6 ] 7 [ 6 ] 8 [ 5 ] 8 [ 5 ] 8 [ 5 ] 8 [ 5 ] 8 [ 5 ] 9 [ 4 ] 9 [ 4 ] 9 [ 4 ] 9 [ 4 ] 9 [ 4 ] 10 [ 3 ] 10 [ 3 ] 10 [ 3 ] 10 [ 3 ] 10 [ 3 ] 11 [ 8 ] 11 [ 8 ] 11 [ 8 ] 11 [ 8 ] 11 [ 8 ] 12 * 17 ] 12 * 7 ] 12 [ 7 ] 12 [ 7 ] 12 [ 7 ]

Now, another new item is called for, so I need to locate the lowest weighted item in the array. *How do I do this?


And another problem, when I need to locate one of these items that are moving around in this heap via it's key.

**How do I locate it?

Actually, it's just the original one. That of maintaining the linkage between the items in the array(heap) and their keys. No matter how long I "look at the pictures"--or read the text--at heaps, I do not see the mechanism by which the lowest weighted item in the heap is located (other than a linear search).

To re-state the requirements. I need to be able to:

  1. Locate the highest weighted item.

    This is required to allow promotion of the lastest accessed item to the top in the classic LRU algorithm.

  2. Locate the lowest weighted item.

    Also an LRU requirement(or any variation), as this is the one that will be discarded when the cache is full and a new element must be added.

  3. Locate an item in the cache via it's key.

    As the items get moved around, that linkage *must* be maintained.

    Embedding the key within the item would require a linear search to locate it. The purpose of the exercise was to avoid a linear search.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

Replies are listed 'Best First'.
Re^11: Re-orderable keyed access structure?
by Aristotle (Chancellor) on Aug 25, 2004 at 21:07 UTC

    I was going to address this at some point, but to be honest, I've lost interest. So I decided to post a closing note instead of letting this silently slip into oblivion.

    The point I was going to make is that your heap code is not exactly efficient. The algorithm description is formulated recursively, but you don't need to implement it that way. Since it's just tail recursion, you can trivially write it iteratively, which would gain a lot of ground.

    Still, that would almost certainly only accelerate the code by a constant factor, which does not make it worth the effort for the smallish data sets you're working with.

    I confess my surprise to find Perl is that slow. I am no stranger to optimizing Perl, but I've never come across such a stellar disparity between a builtin and explicit code before.

    Makeshifts last the longest.

      By all means rewrite the recursive routines iteratively and re-run the benchmark, but to be realistic, you would need to run this one.

      However, even if the iterative approach was 3 orders of magnitude faster than the recursive (which I strongly doubt), it still leaves the original problem of how to retain keyed access to the data stored in the array after it has been re-ordered.


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail
      "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon