Re: hash collision DOS (patch)
by tye (Sage) on Jun 02, 2003 at 04:53 UTC
Interesting timing. Just a couple of weeks back p5p accepted my resubmission of a patch I originally submitted years ago that never went anywhere.
So in Perl 5.8.0 and earlier (and perhaps a version or two after 5.8.0), it isn't hard at all to come up with a space of keys that will all get stuck into the same bucket when you put them in a hash. In fact, you can use 1/8th of all possible strings and end up with a hash that contains 8 buckets, 7 of which are empty, and the 8th of which contains as long a linear linked list as you want.
In versions of Perl that have my patch applied, the more keys you add, the harder it is to find keys that fall into the same bucket. If you want 1,000 keys to fall into the same bucket, then you have to be very picky, only using 0.1% of possible strings. So it is like picking out 1,000 keys out of 1,000,000 keys. If you want 1,000,000 keys in the same bucket, then you can only use 1/1,000,000 of strings so it is like picking out your 1M keys from a space of 1,000,000,000,000 strings.
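To see how easy the pre-patch case is, here is a minimal sketch using a toy multiply-by-33 string hash (the same family as the function Perl used at the time, but not the exact internal algorithm). Since 33 ≡ 1 (mod 8), the bucket index in an 8-bucket table depends only on the sum of the character codes, so exactly 1/8th of all strings share a bucket and colliding keys are trivial to enumerate:

```perl
use strict;
use warnings;

# Toy string hash in the multiply-and-add family (NOT perl's exact
# internal function).  Because 33 mod 8 == 1, the value mod 8 reduces
# to the sum of the character codes mod 8.
sub toy_hash {
    my ($key) = @_;
    my $h = 0;
    $h = ($h * 33 + ord($_)) & 0xFFFFFFFF for split //, $key;
    return $h;
}

# Enumerate keys that all land in bucket 0 of an 8-bucket table.
my @colliding;
for my $n (0 .. 5000) {
    my $key = "k$n";
    push @colliding, $key if toy_hash($key) % 8 == 0;
    last if @colliding >= 100;
}
printf "found %d keys sharing one bucket, e.g. %s %s\n",
    scalar @colliding, @colliding[0, 1];
```

Roughly one candidate in eight collides, so an attacker needs to scan only a little more than 8n strings to collect n keys for one bucket.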
It looks like the authors of the slashdot article aren't even aware of this problem, however. They actually go to the trouble of finding keys that will get thrown into the same bucket no matter how many buckets get allocated, so they could have simplified their work greatly.
- tye
Re: hash collision DOS
by krisahoch (Deacon) on Jun 01, 2003 at 13:44 UTC
use strict;
use warnings;

my %testhash;
my $i     = 0;
my $start = time;
while (<>) {
    chomp;
    $testhash{$_} = "x";
    $i++;
    if (0 == ($i % 100)) {
        # In perls of this era, a hash in scalar context reports
        # "used/total" buckets.
        my ($used, $total) = split '/', scalar(%testhash);
        print "$i ", time - $start, " $used $total\n";
    }
}
If there is anyone out there who writes programs like this, then be wary of DoS attacks from Rice University. ;)
Update: Changed grammar of two sentences to sound less like a neanderthal
Kristofer Hoch
Si vos can lego is, vos es super erudio
Re: hash collision DOS
by crazyinsomniac (Prior) on Jun 01, 2003 at 13:07 UTC
Any ideas on workarounds and fixes to reduce the risk of being DOS'ed ?
Yeah, avoid doing $query->Vars() or Vars() except when testing.
Don't just get everything, get only what you need.
Aside from that it's a non issue.
It's not just dumping a hash structure that causes it. Solutions would include things like limiting the total number of hash elements, or perturbing the input data in a less predictable way.
The white paper is a bit short on details, but I'm not sure I'd characterize it as a "non-issue".
Update: See this for more detail and example exploits.
From CGI.pm:
sub param {
my($self,@p) = self_or_default(@_);
return $self->all_parameters unless @p;
my($name,$value,@other);
#~~~~~Snip~~~~~
($name,$value,@other) = rearrange([NAME,[DEFAULT,VALUE,VALUES]],@p);
#~~~~~Snip~~~~~
return wantarray ? @{$self->{$name}} : $self->{$name}->[0];
}
It also looks like CGI::Simple does the same:
From CGI::Simple 0.06
sub param {
my ( $self, $param, @p ) = @_;
#~~~~~Snip~~~~~
return wantarray ? @{$self->{$param}} : $self->{$param}->[0];
}
How do I love -d? Let me count the ways...
Re: hash collision DOS
by astaines (Curate) on Jun 01, 2003 at 23:01 UTC
|
Hi,
As far as I can tell from the article, what they have shown is that it is possible to exploit the worst-case behaviour of hashing algorithms to consume resources, and that this exploit can be used against the hashing algorithms in both Perl 5.8.0 and 5.6.1 (and in many other programs).
What happens when such an attack succeeds is that each individual hash lookup degrades from O(1) to O(n), so the total cost of n hash operations goes from O(n) to O(n^2). This matters a lot if n is several tens of thousands, but not at all if n is around 10 or 100. You can exploit this if (and only if) you can send large amounts of data to a program which then gets stored in a hash in the program.
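The quadratic blow-up is easy to demonstrate with a sketch: when every key lands in one bucket, insert number i has to scan the i entries already chained there, so the total work for n inserts is n(n-1)/2 comparisons:

```perl
use strict;
use warnings;

# Count key comparisons when inserting $n keys that all hash to the
# same bucket: insert i scans the i existing chain entries looking
# for a duplicate, so the total is n*(n-1)/2, i.e. O(n^2) overall
# instead of the usual O(n).
sub collision_cost {
    my ($n) = @_;
    my @chain;              # the single overloaded bucket
    my $comparisons = 0;
    for my $i (1 .. $n) {
        $comparisons += scalar @chain;   # scan the whole chain
        push @chain, "key$i";
    }
    return $comparisons;
}

printf "n=%5d  comparisons=%d\n", $_, collision_cost($_) for 100, 1000;
# n=  100  comparisons=4950
# n= 1000  comparisons=499500
```

A tenfold increase in keys costs a hundredfold increase in work, which is why tens of thousands of keys hurt while a hundred do not.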
To fix this you just need to patch perl to use a different hashing algorithm. I leave this as an exercise for you...
---- Anthony Staines
Re: hash collision DOS
by Jenda (Abbot) on Jun 02, 2003 at 10:52 UTC
|
Well maybe CGI.pm and/or CGI::Lite could let us restrict the CGI parameters that are accepted and stored and throw away all others. Why should the $CGI object remember the parameters we are not interested in anyway?
Jenda
Always code as if the guy who ends up maintaining your code
will be a violent psychopath who knows where you live.
-- Rick Osborne
Edit by castaway: Closed small tag in signature
acceptOnly( thesekeys => qw[ these keys ] );
acceptOnly( thismanykeys => 44 );
This would be trivial to add ... just a thought
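One possible shape for it, as a sketch (acceptOnly and its calling convention are hypothetical, not an existing CGI.pm API; this variant takes a hashref of already-parsed parameters plus a whitelist):

```perl
use strict;
use warnings;

# Hypothetical acceptOnly(): keep only whitelisted keys in a raw
# parameter hash, deleting (and counting) everything else.
sub acceptOnly {
    my ($params, @allowed) = @_;
    my %ok = map { $_ => 1 } @allowed;
    my $dropped = 0;
    for my $key (keys %$params) {
        next if $ok{$key};
        delete $params->{$key};
        $dropped++;
    }
    return $dropped;
}

my %params  = (name => 'tye', age => 42, junk1 => 'x', junk2 => 'y');
my $dropped = acceptOnly(\%params, qw(name age));
print "dropped $dropped bogus parameters\n";   # dropped 2 bogus parameters
```

Note that filtering an already-built hash comes after the collision cost has been paid; to actually blunt the attack the whitelist would have to be applied before the keys ever enter the hash.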
MJD says you
can't just make shit up and expect the computer to know what you mean, retardo!
I run a Win32 PPM
repository for perl 5.6x+5.8x. I take requests.
** The Third rule of perl club is a statement of fact: pod is sexy.
PodMaster is right. ->delete() comes too late. And even the $CGI::POST_MAX doesn't help much.
Imagine you have a file upload script. There you need to keep $CGI::POST_MAX rather high, so an attacker is still able to post quite a few CGI parameters. And then even the creation of the hash that CGI.pm uses to store the data may take a lot of time. A grep and delete afterwards would only make the issue worse.
Jenda
Always code as if the guy who ends up maintaining your code
will be a violent psychopath who knows where you live.
-- Rick Osborne
Edit by castaway: Closed small tag in signature
Re: hash collision DOS
by tilly (Archbishop) on Jun 03, 2003 at 03:35 UTC
It is quite easy to solve the problem permanently. Just use another fast structure, like a B-tree, and pathological cases become impossible. Alternately, one can use a hash, add logic to detect when a hash bucket has an abnormally long linked list, and then switch it on the fly to a more robust structure.
The problem is that the average performance of hashes is very, very hard to beat, and any kind of paranoid safety logic will be hit all of the time, slowing Perl down a bit on virtually every step. Odds are that nobody in the wild has actually been hurt by Perl's hashing behaviour (yet), and I guarantee that people would wonder why Perl slowed down overall. When do you make the engineering decision that average-case performance matters more this time than worst-case performance?
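The "switch on the fly" idea can be sketched in a few lines. This toy bucket starts as an unsorted list (cheap in the common case) and, once a chain grows suspiciously long, keeps its keys sorted so lookups become binary searches; the threshold, names, and re-sort-on-insert shortcut are all demo-grade choices, not how perl's hashes actually work:

```perl
use strict;
use warnings;

my $MAX_CHAIN = 8;    # chain length that triggers the fallback

sub bucket_new { return { keys => [], sorted => 0 } }

sub bucket_insert {
    my ($b, $key) = @_;
    push @{ $b->{keys} }, $key;
    if (@{ $b->{keys} } > $MAX_CHAIN) {
        # Demo-grade: re-sort on every insert past the threshold; a
        # real implementation would insert in place or use a tree.
        @{ $b->{keys} } = sort @{ $b->{keys} };
        $b->{sorted} = 1;
    }
}

sub bucket_find {
    my ($b, $key) = @_;
    my $k = $b->{keys};
    if (!$b->{sorted}) {                  # short chain: linear scan
        $key eq $_ and return 1 for @$k;
        return 0;
    }
    my ($lo, $hi) = (0, $#$k);            # long chain: binary search
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        my $cmp = $key cmp $k->[$mid];
        return 1 if $cmp == 0;
        $cmp < 0 ? ($hi = $mid - 1) : ($lo = $mid + 1);
    }
    return 0;
}

my $b = bucket_new();
bucket_insert($b, sprintf "key%04d", $_) for 1 .. 1000;
print "sorted fallback engaged: $b->{sorted}\n";
```

An attacker who forces 1,000 keys into one bucket now pays O(log n) per lookup instead of O(n), while hashes that never trip the threshold never pay for the safety logic.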
I commented on a possible 'fix' for the potential problem with perl's current hashing function at Re: Hash Clash on purpose and would be interested in your assessment of the notion.
If that isn't a fix, or cannot be implemented without undue penalty, I wonder if there isn't at least some mileage in having a new warning, along the lines of the 'Deep recursion' warning, that simply notes that a hash is exhibiting anomalous or 'worst case' behaviour at runtime. It wouldn't fix anything, but its presence in the logs could go a long way toward tracking such behaviour down.
Maybe a corresponding pragma that allowed the 'hash quality' threshold that would trigger the warning to be specified?
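The warning plus threshold knob might look something like this sketch (the function, its name, and the default threshold are all made up for illustration; perl has no such facility). Given per-bucket key counts, it warns when the longest chain exceeds a configurable multiple of the average chain length:

```perl
use strict;
use warnings;

# Sketch of the proposed 'hash quality' warning: given how many keys
# landed in each bucket, warn (in the spirit of perl's "Deep
# recursion" warning) when the longest chain exceeds $threshold
# times the average chain length.  Returns 1 if it warned, else 0.
sub check_hash_quality {
    my ($counts, $threshold) = @_;
    $threshold ||= 8;                     # stand-in for the pragma knob
    my $keys = 0;
    $keys += $_ for @$counts;
    return 0 unless $keys;
    my $avg = $keys / @$counts;
    my ($max) = sort { $b <=> $a } @$counts;
    if ($max > $threshold * $avg) {
        warn "Anomalous hash behaviour: longest chain $max, average $avg\n";
        return 1;
    }
    return 0;
}

# A healthy table vs. one under collision attack (100 buckets, all
# keys crammed into one of them):
print check_hash_quality([2, 3, 1, 2]),         "\n";   # 0: nothing unusual
print check_hash_quality([1000, (0) x 99]),     "\n";   # 1: warns
```

Keeping the check out of the hot lookup path (running it only on resize, say) is what would make the overhead tolerable.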
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
I would have to examine the hash function in some detail to tell whether or not that would work as a fix (at a minimal performance loss and the space overhead of having to store the initial value for each hashing function in the hash).
Deep-recursion-style checks in the list traversal logic are exactly the solution that I thought was likely to slow things down. Perl spends a lot of its time doing hash lookups, and any overhead (eg having to use up an extra register to track how deep the list went) is likely to show up as a performance problem.
But thinking about the issue, I am not sure that a clever person couldn't manage to add such a check in some way that didn't cause significant performance problems. I simply have not worked close enough to the metal to have a sense of that. If you could add the check cheaply, though, there is no reason that you couldn't fall back to a more complex and robust data structure instead of a linked list. The immediate algorithmic overhead is paid for by the fact that you only use it when you really need it.
Re: hash collision DOS
by comand (Acolyte) on Jun 01, 2003 at 18:52 UTC