Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
All: Does anyone see a need for a module that will allow you to retrieve hash keys in a pre-defined sorted order? This functionality can be used with any user defined sort routine, can be changed on the fly, works on multi-dimensional hashes, and DWYM if you decide to add key/values later.

There are two modules already that do similar things:

  • Tie::IxHash
  • Tie::SortHash

    The problem with Tie::IxHash is that it only preserves insertion order as well as provide rudimentary ability and getting keys/values in sorted order.

    The problems with Tie::SortHash are:

  • It uses pseudo hashes
  • It uses eval string to accomplish the arbitrary sort
  • It loops (n2 + n) / 2 times through the hash for keys, n = # of keys
  • It uses its own form of garbage collection
  • The test suite is not exhaustive
  • The calling syntax is not flexible, making expansion extremely difficult
  • Originally, I thought I would just contact the author with suggestions on how to correct these deficiencies or offer to take over maintenance. Almost a month has gone by with no response. I originally posted this as a proposed inplace upgrade for the module. Upon revisiting it yesterday, I have decided that if I do anything, it will be to upload a brand new module. It is too hard to maintain backwards compatability and have the module robust.

    The question I have is - why, what's the point? I am not aware that there is a big need for this module. I would certainly not want to put more code up on CPAN without a good reason.

    The new module that I have worked on is below, though it still unpolished, requires the POD to be finished, and needs an exhaustive test suite.

    #!/usr/bin/perl -w package Tie::SortedHash; use strict; use Carp; use constant HASH => 0; use constant LOOKUP => 1; use constant ARRAY => 2; use constant SORT => 3; use constant CHANGED => 4; use constant OPT => 5; our $VERSION = '1.00'; sub TIEHASH { my $class = shift; croak "Incorrect number of parameters" if @_ % 2; my %options = @_; my $self = bless [], $class; $self->_Build(\%options); return $self; } sub FETCH { my($self, $key) = @_; $self->[HASH]{$key}; } sub STORE { my($self, $key, $value) = @_; $self->[HASH]{$key} = $value; $self->[CHANGED] = 1; } sub EXISTS { my($self, $key) = @_; exists $self->[HASH]{$key}; } sub DELETE { my($self, $key) = @_; delete $self->[HASH]{$key}; if ($self->[OPT] == 2 && exists $self->[LOOKUP]{$key}) { splice(@{$self->[ARRAY]}, $self->[LOOKUP]{$key}, 1); delete $self->[LOOKUP]{$key}; } } sub FIRSTKEY { my $self = shift; $self->_ReOrder if $self->[OPT] == 1 || ($self->[OPT] == 2 && $sel +f->[CHANGED]); $self->_Iterate; } sub NEXTKEY { my ($self, $lastkey) = @_; $self->_Iterate($lastkey); } sub CLEAR { my $self = shift; $self->[HASH] = {}; $self->[CHANGED] = 1; } sub DESTROY { } sub _Build { my ($self, $opt) = @_; my $sort = exists $opt->{SORT} ? $opt->{SORT} : sub { my $hash = shift; sort {$a cmp $b || $a <=> $b} keys %$hash; }; $self->sortroutine($sort); my $hash = exists $opt->{HASH} ? $opt->{HASH} : {}; croak "$hash is not a hash ref" if ref $hash ne 'HASH'; @{$self->[HASH]}{keys %$hash} = values %$hash; if (exists $opt->{OPTLEVEL}) { croak "$opt->{OPTLEVEL} is not valid optimization level" if $o +pt->{OPTLEVEL} !~ /^[123]$/; $self->[OPT] = $opt->{OPTLEVEL}; } else { $self->[OPT] = 1; } } sub _ReOrder { my $self = shift; $self->[LOOKUP] = (); $self->[ARRAY] = (); my $index = 0; for my $key ($self->[SORT]($self->[HASH])) { $self->[LOOKUP]{$key} = $index; $self->[ARRAY][$index] = $key; $index++; } $self->[CHANGED] = 0; } sub _Iterate { my ($self, $lastkey) = @_; if ($self->[OPT] != 3) { my $index = defined $lastkey ? $self->[LOOKUP]{$lastkey} : -1; $index++; return $self->[ARRAY][$index]; } else { my $match; for my $key ($self->[SORT]($self->[HASH])) { + return $key if $match || ! defined $lastkey; $match = 1 if $key eq $lastkey; } } return undef; } sub sortroutine { my($self, $sort) = @_; croak "$sort is not a code ref" if ref $sort ne 'CODE'; $self->[SORT] = $sort; $self->[CHANGED] = 1; } 1; __END__ =head1 NAME Tie::HashSort - Perl module to get hash keys in a sorted order =head1 SYNOPSIS use Tie::HashSort; my %hash = ( 'John' => 33, 'Jacob' => 29, 'Jingle' => 15, 'Heimer' => 48, 'Smitz' => 12, ); my $sort = sub { my $hash = shift; sort {$hash->{$b} <=> $hash->{$a}} keys %$hash; }; tie my %sorted_hash, 'Tie::SortedHash', 'HASH' => \%hash, 'SORT' = +> $sort, 'OPTLEVEL' => 2; for my $name ( keys %sorted_hash ) { print "$name is $hash{$name} ears old.\n"; } ### OUTPUT ### Heimer is 48 ears old. John is 33 ears old. Jacob is 29 ears old. Jingle is 15 ears old. Smitz is 12 ears old. =head1 DESCRIPTION This module is a designed to retrieve hash keys in a pre-defined sorte +d order. It is often frustrating to have a hash return elements in a seemingly +random order when using C<keys()>, C<values()> or C<each()>. =head2 Tie In order to C<tie()> your hash to C<Tie::SortedHash>, use the followin +g syntax: tie HASH, 'Tie::SortedHash', OPTIONS; =cut
    I believe I have added functionality and avoided all the problems in Tie::SortHash. Here is my philosophy on the 3 optimization levels:
  • Level 1: Trade lots of memory for speed. Create an array with the hash keys in sorted order each time FIRSTKEY is called. NEXTKEY is then a simple matter of getting the index of the lastkey and adding 1. This is accomplished by having a hash lookup table.
  • Level 2: Identical to level 1 only faster. The array with sorted keys and hash lookup table are only recreated when a new key is added, a value is changed, or the sort routine is changed. Deleting a key doesn't require a rebuild because the element is deleted from the array and the lookup hash. This optimization only works on 1 dimensional hashes since it is impossible to detect values being changed below the root level. Depending on the sort routine, this may affect the order.
  • Level 3: No optimization at all. It is slower, but consumes no additional memory - (n2 + n) /2.
  • So what do you think - should I work on polishing the code, finishing the POD, building the test suite, and uploading it to CPAN or not?

    Cheers - L~R


    In reply to Any Point in Uploading Tie::SortedHash by Limbic~Region

    Title:
    Use:  <p> text here (a paragraph) </p>
    and:  <code> code here </code>
    to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Domain Nodelet?
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this?Last hourOther CB clients
    Other Users?
    Others browsing the Monastery: (3)
    As of 2025-06-20 07:00 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?

      No recent polls found

      Notices?
      erzuuliAnonymous Monks are no longer allowed to use Super Search, due to an excessive use of this resource by robots.