Just another Perl shrine  
PerlMonks 
comment on 
( [id://3333]=superdoc: print w/replies, xml )  Need Help?? 
"In fact, an infinite number of collisions!"
... "Not trying to split hairs, but I would say a finite, large number but not infinite." Sigh. md5 is a reasonably good hash. As it's input # grows, even as it approaches infinity, there are no numbers in it's range (output space) that cease to be 'hit.' So, theoretically, you can feed it an infinite number of consecutive inputs, and some subset of them they will give you an infinite number of collisions on any given point on the output space. But we're not talking about math in theory, we're talking about math in the real world. There are limits, based on speed of computation, memory size, disk size, etc. Based on these, there is a finite (though very large) number of possible md5 sums calculable in any given timeframe  even if that timeframe is "from the advent of the abacus to the heat death of the universe, when there's no entropy generatable and no work can be done." Less facetiously, I'd say that the difficulty of computing md5 sums from, say, >1 Terabyte inputs means that there will be a very low number of collisions from inputs that high. Why bother, when you can get a collision from underquadrupledigit bytes? So, the answer is really 'both.' In theory, there's an infinite number of collisions for any md5 output. In practice, there certainly isn't, and the number of collisions that will be generated in our lifetimes is finite to the point of being understandable, and maybe even visualized, by our little human brains. edit: more importantly, "Therefore the number of inputs that will map to any given md5 is infinity / 2**128" is incorrect. You're assuming even distribution from the domain to the range. This is not proven (otherwise given any consecutive set of (2**128)1 elements, they'd cover the range of md5 minus one, and adding one more would cover the range entirely. Not yet proven to be true, and in fact quite unlikely). So division doesn't follow, thus while your conclusion is correct your path to get there isn't. In reply to Re^3: MD5  what's the alternative
by danderson

