All: Does anyone see a need for a module that will allow you to retrieve hash keys in a pre-defined sorted order? This functionality can be used with any user defined sort routine, can be changed on the fly, works on multi-dimensional hashes, and DWYM if you decide to add key/values later.

There are two modules already that do similar things:

  • Tie::IxHash
  • Tie::SortHash

    The problem with Tie::IxHash is that it only preserves insertion order as well as provide rudimentary ability and getting keys/values in sorted order.

    The problems with Tie::SortHash are:

  • It uses pseudo hashes
  • It uses eval string to accomplish the arbitrary sort
  • It loops (n2 + n) / 2 times through the hash for keys, n = # of keys
  • It uses its own form of garbage collection
  • The test suite is not exhaustive
  • The calling syntax is not flexible, making expansion extremely difficult
  • Originally, I thought I would just contact the author with suggestions on how to correct these deficiencies or offer to take over maintenance. Almost a month has gone by with no response. I originally posted this as a proposed inplace upgrade for the module. Upon revisiting it yesterday, I have decided that if I do anything, it will be to upload a brand new module. It is too hard to maintain backwards compatability and have the module robust.

    The question I have is - why, what's the point? I am not aware that there is a big need for this module. I would certainly not want to put more code up on CPAN without a good reason.

    The new module that I have worked on is below, though it still unpolished, requires the POD to be finished, and needs an exhaustive test suite.

    #!/usr/bin/perl -w package Tie::SortedHash; use strict; use Carp; use constant HASH => 0; use constant LOOKUP => 1; use constant ARRAY => 2; use constant SORT => 3; use constant CHANGED => 4; use constant OPT => 5; our $VERSION = '1.00'; sub TIEHASH { my $class = shift; croak "Incorrect number of parameters" if @_ % 2; my %options = @_; my $self = bless [], $class; $self->_Build(\%options); return $self; } sub FETCH { my($self, $key) = @_; $self->[HASH]{$key}; } sub STORE { my($self, $key, $value) = @_; $self->[HASH]{$key} = $value; $self->[CHANGED] = 1; } sub EXISTS { my($self, $key) = @_; exists $self->[HASH]{$key}; } sub DELETE { my($self, $key) = @_; delete $self->[HASH]{$key}; if ($self->[OPT] == 2 && exists $self->[LOOKUP]{$key}) { splice(@{$self->[ARRAY]}, $self->[LOOKUP]{$key}, 1); delete $self->[LOOKUP]{$key}; } } sub FIRSTKEY { my $self = shift; $self->_ReOrder if $self->[OPT] == 1 || ($self->[OPT] == 2 && $sel +f->[CHANGED]); $self->_Iterate; } sub NEXTKEY { my ($self, $lastkey) = @_; $self->_Iterate($lastkey); } sub CLEAR { my $self = shift; $self->[HASH] = {}; $self->[CHANGED] = 1; } sub DESTROY { } sub _Build { my ($self, $opt) = @_; my $sort = exists $opt->{SORT} ? $opt->{SORT} : sub { my $hash = shift; sort {$a cmp $b || $a <=> $b} keys %$hash; }; $self->sortroutine($sort); my $hash = exists $opt->{HASH} ? $opt->{HASH} : {}; croak "$hash is not a hash ref" if ref $hash ne 'HASH'; @{$self->[HASH]}{keys %$hash} = values %$hash; if (exists $opt->{OPTLEVEL}) { croak "$opt->{OPTLEVEL} is not valid optimization level" if $o +pt->{OPTLEVEL} !~ /^[123]$/; $self->[OPT] = $opt->{OPTLEVEL}; } else { $self->[OPT] = 1; } } sub _ReOrder { my $self = shift; $self->[LOOKUP] = (); $self->[ARRAY] = (); my $index = 0; for my $key ($self->[SORT]($self->[HASH])) { $self->[LOOKUP]{$key} = $index; $self->[ARRAY][$index] = $key; $index++; } $self->[CHANGED] = 0; } sub _Iterate { my ($self, $lastkey) = @_; if ($self->[OPT] != 3) { my $index = defined $lastkey ? $self->[LOOKUP]{$lastkey} : -1; $index++; return $self->[ARRAY][$index]; } else { my $match; for my $key ($self->[SORT]($self->[HASH])) { + return $key if $match || ! defined $lastkey; $match = 1 if $key eq $lastkey; } } return undef; } sub sortroutine { my($self, $sort) = @_; croak "$sort is not a code ref" if ref $sort ne 'CODE'; $self->[SORT] = $sort; $self->[CHANGED] = 1; } 1; __END__ =head1 NAME Tie::HashSort - Perl module to get hash keys in a sorted order =head1 SYNOPSIS use Tie::HashSort; my %hash = ( 'John' => 33, 'Jacob' => 29, 'Jingle' => 15, 'Heimer' => 48, 'Smitz' => 12, ); my $sort = sub { my $hash = shift; sort {$hash->{$b} <=> $hash->{$a}} keys %$hash; }; tie my %sorted_hash, 'Tie::SortedHash', 'HASH' => \%hash, 'SORT' = +> $sort, 'OPTLEVEL' => 2; for my $name ( keys %sorted_hash ) { print "$name is $hash{$name} ears old.\n"; } ### OUTPUT ### Heimer is 48 ears old. John is 33 ears old. Jacob is 29 ears old. Jingle is 15 ears old. Smitz is 12 ears old. =head1 DESCRIPTION This module is a designed to retrieve hash keys in a pre-defined sorte +d order. It is often frustrating to have a hash return elements in a seemingly +random order when using C<keys()>, C<values()> or C<each()>. =head2 Tie In order to C<tie()> your hash to C<Tie::SortedHash>, use the followin +g syntax: tie HASH, 'Tie::SortedHash', OPTIONS; =cut
    I believe I have added functionality and avoided all the problems in Tie::SortHash. Here is my philosophy on the 3 optimization levels:
  • Level 1: Trade lots of memory for speed. Create an array with the hash keys in sorted order each time FIRSTKEY is called. NEXTKEY is then a simple matter of getting the index of the lastkey and adding 1. This is accomplished by having a hash lookup table.
  • Level 2: Identical to level 1 only faster. The array with sorted keys and hash lookup table are only recreated when a new key is added, a value is changed, or the sort routine is changed. Deleting a key doesn't require a rebuild because the element is deleted from the array and the lookup hash. This optimization only works on 1 dimensional hashes since it is impossible to detect values being changed below the root level. Depending on the sort routine, this may affect the order.
  • Level 3: No optimization at all. It is slower, but consumes no additional memory - (n2 + n) /2.
  • So what do you think - should I work on polishing the code, finishing the POD, building the test suite, and uploading it to CPAN or not?

    Cheers - L~R