Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Re: Re: Re: hash collision DOS

by crazyinsomniac (Prior)
on Jun 02, 2003 at 13:50 UTC ( #262374=note: print w/ replies, xml ) Need Help??

Comment on Re: Re: Re: hash collision DOS
Re: Re: Re: Re: hash collision DOS
by iburrell (Chaplain) on Jun 02, 2003 at 21:32 UTC
    It doesn't have anything to do with the Vars() method. The worst-case performance of the hash slows down insertion of elements. CGI.pm inserts parameter names as keys when parsing the query string.
      Look at this
      C:\>perl -MData::Dumper -MCGI -e"die Dumper( CGI->new( { 1..10 }) ) " $VAR1 = bless( { '.charset' => 'ISO-8859-1', '1' => [ 2 ], '3' => [ 4 ], 'escape' => 1, '5' => [ 6 ], '7' => [ 8 ], '.parameters' => [ '7', '9', '1', '3', '5' ], '9' => [ 10 ], '.fieldnames' => {} }, 'CGI' ); C:\>
      Do all the keys map to one bucket? I'm pretty sure they don't.
      C:\>perl -MData::Dumper -MCGI -e"warn Dumper( $a=CGI->new );die Dumper +(scalar $a->Vars() ) " a= b= c= d= e= f= g= $VAR1 = bless( { '.charset' => 'ISO-8859-1', 'a' => [ '' ], 'escape' => 1, 'b' => [ '' ], 'c' => [ '' ], 'd' => [ '' ], 'e' => [ '' ], 'f' => [ '' ], 'g' => [ '' ], '.parameters' => [ 'a', 'b', 'c', 'd', 'e', 'f', 'g' ], '.fieldnames' => {} }, 'CGI' ); $VAR1 = { 'a' => '', 'b' => '', 'c' => '', 'd' => '', 'e' => '', 'f' => '', 'g' => '' };
      The only way all the various keys would map to one bucket would be when you construct a hash by calling Vars. Do you see what I'm saying, or did I miss something crucial.
      update: I improved the 2nd snippet. Now it shows all the various keys going to a single bucket.

      update: Why not just post the example?

       
      ______crazyinsomniac_____________________________
      Of all the things I've lost, I miss my mind the most.
      perl -e "$q=$_;map({chr unpack qq;H*;,$_}split(q;;,q*H*));print;$q/$q;"

        The keys aren't going into a single bucket above. The Vars() is just stripping out the non-parameter keys that CGI.pm stores inside of itself. The 'a', 'b' parameters are there in the object.

        The way to determine what is happening inside a hash is evaluating it in scalar context. That gives you the number of buckets being used. tilly wrote a program that uses this feature to generate a list of colliding keys. This algorithm is fast and doesn't depend on reverse engineering the Perl hash algorithm.

        I ran some tests on a 10,000 keys generated by tilly's method. Both inserting them into a hash and parsing the query string with CGI. It takes over 20 seconds to parse the query string in the pathological case versus less than a second for 10,000 normal strings. I haven't been willing to wait long enough to let 100,000 strings run. For a sample, here are the first 10 integers that collide and the scalar hash value showing they all go in one bucket.

        8 14 22 30 38 46 54 62 70 78 86 1/8

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://262374]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others pondering the Monastery: (7)
As of 2014-07-29 09:18 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (212 votes), past polls