Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
a finite, large number but not infinite.

md5 hashing can be applied to inputs of any length. "Any length" means an infinite number of possible inputs. Adapting the old schoolboy definition of infinite, however many inputs you have hashed, you can always add one more byte to any one of them to get one more. Infinite.

Every input in this range of inputs will be mapped to one of the 2**128 possible md5s.

Therefore the number of inputs that will map to any given md5 is infinity / 2**128

Which is infinity.


It has often been said that there are collisions in MD5.

Of course there are. It could not be otherwise. Whenever the range of inputs is greater than the range of outputs, there will always be collisions. It simply is not possible for it to be otherwise. (Would that it were, it would make a brilliant compression algorithm:)

Theoretically, if you generated the md5s for all of the (16-byte binary encoded) integers 0 .. 2**128 -1, you might not find any duplicates. That is to say, the input range equals the output range, so it is possible that the md5 algorithm would be a 'perfect hash' for the input data.

And if you double the length of the input data to 32 bytes, then (theoretically) there will be exactly 2 inputs (plaintexts) for every output (md5).

I don't think that this "perfect hash" status has ever been confirmed (or even suggested). It would simply take too long to verify.

The example of duplicate plaintexts posted were 128 (8-bit) bytes each. 2**1024 inputs / 2**128 outputs.

There will be at least 8 duplicate, 128-byte inputs for every md5. That makes it much easier to find a duplicate.

As far as your Linux passwords are concerned, not only would the cracker need to generate a plaintext that gave the same MD5, it would also need to be of a length that the password verification process would accept as a password. Whilst accepting longer passwords generally gives greater security, that is only true up to a certain limit. With md5, that limit is 16 (8-bit) bytes.

If the plaintexts where restricted to 32 x 8-bit bytes, then there will be 2**256 inputs and 2**128 outputs. Therefore there *must be* at least 2 x 32-byte inputs that will produce every md5.

If the plaintext is restricted to 16 x 8-bit bytes, then it is possible (but not proven), that the are no duplicates.

If the inputs are restricted to 12 (8-bit bytes), then it is very unlikely that it will be duplicates. If you further restrict that to 7-bit ASCII, even less likelyhood. Exclude control characters, less again.

This is one of those cases where more isn't always better.

Actually, in my experience, more is very rarely better in computing:)


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

In reply to Re^3: MD5 - what's the alternative by BrowserUk
in thread MD5 - what's the alternative by kiat

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Domain Nodelet?
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this?Last hourOther CB clients
    Other Users?
    Others examining the Monastery: (3)
    As of 2024-09-18 05:47 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?
      The PerlMonks site front end has:





      Results (23 votes). Check out past polls.

      Notices?
      erzuuli‥ 🛈The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.