Dump JudyHS

by diotalevi (Canon)
on Dec 29, 2008 at 22:57 UTC (#733140, snippet)

Description:

This dumps the contents of a Judy::HS/JudyHS(3) array. I had to violate its API to do this. JudyHS is built from nested Judy::L/JudyL(3) arrays. The top level is keyed by string length. The next level is keyed by a hash of the string. Each further level is keyed by another 4 or 8 bytes of the input string until no more are needed, at which point the tree terminates in a C struct that holds the key and value.
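As a rough mental model of the layering described above, here is a pure-Perl toy that uses ordinary hashes in place of the JudyL arrays. It does not touch Judy at all. The names `toy_hash`, `chunks`, `toy_set` and `toy_leaves` are made up for illustration, the checksum is only a stand-in for whatever hash JudyHS really uses, and unlike JudyHS the toy always descends through every chunk instead of stopping as soon as the string is unambiguous.

```perl
#!perl
use strict;
use warnings;
use Config '%Config';

# Toy model only: plain Perl hashes stand in for the nested JudyL arrays.
use constant WORD => 0+$Config{longsize};    # 4 or 8 bytes per level

# Hypothetical stand-in hash (a byte checksum); not the JudyHS hash.
sub toy_hash {
    my ($str) = @_;
    return unpack( '%32C*', $str );
}

# Split a string into WORD-sized pieces, one piece per nesting level.
sub chunks {
    my ($str) = @_;
    return unpack( '(a' . WORD . ')*', $str );
}

# Insert: length level, then hash level, then one level per chunk.
sub toy_set {
    my ( $root, $str, $value ) = @_;
    my $node = $root->{ length $str }{ toy_hash($str) } ||= { kids => {} };
    $node = $node->{kids}{$_} ||= { kids => {} } for chunks($str);
    $node->{leaf} = { value => $value, string => $str };
    return;
}

# Walk every level and collect the leaves, much as the dumper walks JudyHS.
sub toy_leaves {
    my ($root) = @_;
    my @leaves;
    for my $len ( sort { $a <=> $b } keys %$root ) {
        for my $hash ( sort { $a <=> $b } keys %{ $root->{$len} } ) {
            my @todo = ( $root->{$len}{$hash} );
            while ( my $node = shift @todo ) {
                push @leaves, $node->{leaf} if $node->{leaf};
                push @todo, map { $node->{kids}{$_} } sort keys %{ $node->{kids} };
            }
        }
    }
    return @leaves;
}

my %root;
toy_set( \%root, "alpha\n", 1 );
toy_set( \%root, "beta\n",  2 );
printf "%d: %s", $_->{value}, $_->{string} for toy_leaves( \%root );
```

The point of the toy is only the shape: length first, hash second, then word-sized slices of the string, ending in a record that carries both the value and the full string.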

The script below loads Judy::HS with a map from each line of a file to its line number. The data is completely arbitrary; I did it just to show myself that I could enumerate the contents of Judy::HS if I needed to.

Judy.h in the Judy C library has a nice, readable description of the structure that's being dumped here.

#!perl
use strict;
use warnings;
use Config '%Config';
use Judy::HS qw( Set );
use Judy::L qw( First Next );
use Judy::Mem qw( Peek Ptr2String2 );

use constant LONGSIZE => 0+$Config{longsize};

# Load $hs with a pile of data.
my $hs;
@ARGV = "$ENV{HOME}/Documents/Political Data/Secretary of state/Statewidevoters13102.txt";
while (<>) {
  Set( $hs, $_, $. );
}


# Nested printing.
our $P = -1;
sub p { print ' ' x ( 4 * $P ), @_ }


# Loop over JudyL array, each entry contains all strings of length $lengthKey.
my ( undef, $lengthL, $lengthKey ) = First( $hs, 0 );
while ( defined $lengthKey ) {
  local $P = 1+$P;
  p( "LENGTH: $lengthKey\n" );


  # Loop over JudyL array, each entry contains all strings that map to the same $hashKey.
  my $hashCount = 0;
  my ( undef, $hashL, $hashKey ) = First( $lengthL, 0 );
  while ( defined $hashKey ) {
    local $P = 1+$P;
    p( sprintf "HASH @{[ ++ $hashCount ]}: 0x%x\n", $hashKey );


    # Recurse down through JudyL until I find the key/value.
    dumpLTree( $hashL );

    ( undef, $hashL, $hashKey ) = Next( $lengthL, $hashKey );
  }

  ( undef, $lengthL, $lengthKey ) = Next( $hs, $lengthKey );
}


sub dumpLTree {
  my ( $l ) = @_;

  # Find the stored key/values.
  if ( Judy::JLAP_INVALID & $l ) {
    $l &= ~Judy::JLAP_INVALID;
    local $P = 1+$P;


    # Unpack the C struct containing my key and value. The value is the
    # first word; the string bytes follow it.
    my $value = Peek( $l );
    my $str   = Ptr2String2( LONGSIZE + $l, $lengthKey );
    p( "{Value: $value, String: $str}\n" );
  }
  else {

    # Go deeper.
    my ( undef, $innerL, $key ) = First( $l, 0 );
    while ( defined $key ) {
      local $P = 1+$P;
      p( "str: $key\n" );

      # Recurse into the nested array (the entry's value), not its key.
      dumpLTree( $innerL );
      ( undef, $innerL, $key ) = Next( $l, $key );
    }
  }
}
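
The leaf decoding in dumpLTree (a Peek for the value, then Ptr2String2 at an offset of LONGSIZE for the string) can be tried out without Judy by doing the same thing to an ordinary Perl byte string. This is only a sketch of the layout as the script above reads it, assuming one native long followed by the raw string bytes; `toy_leaf` and `read_leaf` are hypothetical helpers, not part of Judy::Mem.

```perl
#!perl
use strict;
use warnings;

# Build a buffer shaped like the leaf as the dumper reads it:
# one native unsigned long (the value) followed by the raw key bytes.
sub toy_leaf {
    my ( $value, $str ) = @_;
    return pack( 'L! a*', $value, $str );
}

# Decode it again. The buffer stores no length for the string, so the
# caller supplies it, just as the dumper passes $lengthKey to Ptr2String2.
sub read_leaf {
    my ( $buf, $len ) = @_;
    my ( $value, $str ) = unpack( "L! a$len", $buf );
    return ( $value, $str );
}

my $line = "hello\n";
my ( $value, $str ) = read_leaf( toy_leaf( 42, $line ), length $line );
print "{Value: $value, String: $str}\n";
```

The string length has to come from outside the leaf, which is why the dumper needs the key from the top-level array to read the string back.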