Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

Generating A Hash Value

by Anonymous Monk
on Jul 03, 2003 at 10:37 UTC ( [id://271118]=perlquestion: print w/replies, xml ) Need Help??

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,
I'm trying to replicate a java hashing algorithm for a string in a Perl script.
The way Java calculates the hash value is:

public int hashCode()
Returns a hashcode for this string. The hashcode for a String object is computed as

s[0]*31^(n-1)+s[1]*31^(n-2)+ ... +s[n-1]

using int arithmetic, where s[i] is the ith character of the string, n is the length of the string, and ^ indicates exponentiation. (The hash value of the empty string is zero.)

Can anyone help me with this.
Thanks

Replies are listed 'Best First'.
Re: Generating A Hash Value
by Chady (Priest) on Jul 03, 2003 at 11:30 UTC

    Here's a snippet that I think does what you mean?

    # assuming that... my $string = "Perl Monks"; my $n = length($string); my $i = 1; my $result = 0; # here it is. $result += ord($_) * 31^($n-$i++) for split '', $string; ^ ^ ^ ^ ^ ^ ^ ^ | | | | | | | +-- String to split | | | | | | +-- split all the charact +ers | | | | | +-- post increment $i | | | | +-- $i is the index | | | +--- $n is the length of $string | | +--- this is the character returned from split | +--- ord() returns the ascii code of the character +--- add what we have to $result

    He who asks will be a fool for five minutes, but he who doesn't ask will remain a fool for life.

    Chady | http://chady.net/
Re: Generating A Hash Value
by BrowserUk (Patriarch) on Jul 03, 2003 at 11:34 UTC

    When you say "using int arithmetic"; signed or unsigned? 32-bit?

    Could you supply the hash value generated for "The quick brown fox jumps over the lazy dog" so that we may have something to check our attempts against.

    This might be close, and reasonable efficient as it avoids the exponentiation by processing the string backwards and accumulating the exponent, and uses integer math. However truncating to 32-bits in reverse order maybe screwing up the result?

    #! perl -slw use strict; # s[0]*31^(n-1)+s[1]*31^(n-2)+ ... +s[n-1] sub JavaHash { use integer; my ($s) = @_; my $n = length $s; my @s = unpack 'C*', $s; my $hash = 0; my $p = 1; for ( reverse @s ) { print "$hash : $_: $p"; $hash += $_ * $p; $p *= 31; } return $hash; } die 'No arguments' unless @ARGV; printf '%10d:%s' , JavaHash( $_ ), $_ for @ARGV;

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller


Re: Generating A Hash Value
by Tomte (Priest) on Jul 03, 2003 at 11:31 UTC

    What exactly is your problem? you happen to have the algorithm, all you have to do is write it down in perl ;) Here is a rather clumsy solution, since you don't provide a test-sample, I don't know if the results are correct:

    use strict; use warnings; my @strings = ('abzdefghijk', 'a b z d e f g h i j k', '', 'abzde fghijk',); print $_, " => ", hc($_), "\n" for @strings; sub hc { my $string = shift; my $erg = 0; my $max = int(length($string) - 1); # this line is your algorithm! $erg += (ord(substr($string,$_,1)))*31**($max-$_) for 0..$max; $erg; }
    I'd like a way to get rid of the substr ;)

    regards,
    tomte


    Hlade's Law:

    If you have a difficult task, give it to a lazy person --
    they will find an easier way to do it.

Re: Generating A Hash Value
by bart (Canon) on Jul 03, 2003 at 12:54 UTC
    It looks like what you need to do is: get the last value you had, multiply it by 31, and add the next number; repeat. In Perl that would become:
    sub hashing { my $s = shift; my $acc = 0; foreach my $c (unpack 'C*', $s) { $acc *= 31; $acc += $c; } return $acc; }
    You will have overflow problems, I think. Likely you should only retain the $n (<=32?) lower bits of $acc. Your remark on "using int arithmetic" also points in that direction. So I'll modify that into:
    sub hashing { my $s = shift; my $acc = 0; foreach my $c (unpack 'C*', $s) { $acc *= 31; $acc += $c; $acc %= 0x100000000; } return $acc; }
Re: Generating A Hash Value
by thor (Priest) on Jul 03, 2003 at 15:49 UTC
    Though your question has been answered, I have to wonder if this is really Java's hashing algorighm. It seems as though having an arbitrarily long hash key would be unwieldy. Though, if the algorithm is really that simple, I suppose it'd be fast...

    thor

      Two things:
      1. You don't have to wonder, you can read it for yourself.
      2. Seondly, and more importantly, the way the hashCode() method works in java allows for objects of any type to serve as hash keys. The original poster asked a question about the implementation of that method for Strings. (also: Perl allows for arbitrarily long hash keys, too, since any strings are hash keys, and strings are arbitrarily long)
      3. Three things: there is no third thing.
        To address your items, point by point:
        1. I could have, but I'm lazy. And laziness is a virtue.
        2. I understand what purpose hashCode() has. However, in perl, hash codes are not of arbitrary length. There is an algorithm described here that describes an algorithm that takes an arbitrary input and produces a 32-bit hash code. A bit slower, I suppose, but keeps you from having to keep track of an arbitrarily large integer for your hash key. So, it's not necessarily the case that one use arbitrarily large hash codes (which is different from arbitrarily large hash keys)
        3. Good, because I don't either...:).
        4. thor

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://271118]
Front-paged by broquaint
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others avoiding work at the Monastery: (7)
As of 2024-03-28 11:58 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found