in reply to Breaking the indecipherable cipher, courtesy Charles Babbage.

I do not remember very well the theory, but one of the hardest part of the kasiski analysis was to discover which was the key length; and the fact is that you can only discover it, if the length of the cipher text is longer enough and the key shorter enough (we'll need practice to know this). isn't it?

Anyway, in those years, neither computer (i do not include the mechanic machine for calculations made by Babage) nor high level programming languages exists. Now, we can break that code by brute force (the only important thing, is to know if the ciphertext is a polialphabetic cipher or not) for discovering the key length with the frequency analysis help ... a'm i wrong?

I think i must read those old books of cryptography to analyse with great appetite your code ... but i'm curious and i cannot wait, did you use brute force, or another intersting algorithm for breaking the vigenere cipher? I mean, on your last post, you said that you've been written a extrange method to break the vigenere cipher.

Saludos :-)


PS: umm, i've made a code breaker script for the scitale cipher ... but it have some problems with thick scitales ^_^> ... i hope to submit it later, when i have some spare time. cheers

PS2: i forgot to say, great work! (i didn't know Babage break the Vigenere cipher before than Kasiski did)

perl -Te 'print map { chr((ord)-((10,20,2,7)[$i++])) } split //,"turo"'

Replies are listed 'Best First'.
Re^2: Breaking the indecipherable cipher, courtesy Charles Babbage.
by chargrill (Parson) on May 19, 2006 at 18:45 UTC

    You're absolutely correct - the cipher text must be long enough and in a recognizable langauge, because my code looks for repeated substrings at some interval, and compares the letter fequency distribution of every Nth letter to the standard letter frequency distribution for a given langauge (English in my case ;). In fact it looks for multiple repeating substrings to better try to zero in on the likely key length. Once repeated substrings are found, it computes the greatest common factor of the intervals for a given repeat, and then computes the greatest common factor for those greatest comman factors, and uses that as the key length.

    That doesn't always work :)

    I've had mixed results - if I rename to just "chargrill" and encode the same text, the above code can't find any common factors, and then fails. It would actually take a fair amount of refactoring to account for this.

    Honestly, I think my algorithm for figuring out what key letter would make an encrypted cipher string of every Nth character matches the standard letter frequency distribution is the most ... interesting. I couldn't think of a better way to do it than with the human eye, and I think my algorithm gets as close to a visual inspection of peaks and valleys of letter frequency charts as I could think of...

    $,=42;for(34,0,-3,9,-11,11,-17,7,-5){$*.=pack'c'=>$,+=$_}for(reverse s +plit//=>$* ){$%++?$ %%2?push@C,$_,$":push@c,$_,$":(push@C,$_,$")&&push@c,$"}$C[$# +C]=$/;($#C >$#c)?($ c=\@C)&&($ C=\@c):($ c=\@c)&&($C=\@C);$%=$|;for(@$c){print$_^ +$$C[$%++]}

      your algorithm works well your only fail is to feed your vigener cipher machine with simbols others than alphabetics ones.

      Problems of keeping spaces on your ciphertext are that there are some words that are very common on your language (examples: the, and, of, I, you ...), and the same goes for other simbols like the ``''' and the ``?''. So in old cipher methods is important to remove this simbols (except on transposition ciphers like the scitale ^_^>)

      i think your algorithm failed because of those spureous simbols

      prove with this text instead:

      TheVigenerecipherknownbysomeaslechiffreindechiffrableFrenchfor theindecipherablecipherisamethodofencryptionthatusesaseriesof differentCaesarciphersbasedonthelettersofakeywordthoughthereis someargumentamongcryptographiccirclesastowhichincarnationofthis particularpolyalphabeticsubstitutioncanaccuratelybeattributedtoBlaise deVigenereThisimplementationisasimpleformofapolyalphabetic substitutionToencipheratableofalphabetscanbeusedtermedatabula rectaorVigeneretableItconsistsofthealphabetwrittenout26times indifferentrowseachalphabetshiftedcycliclytotheleftcomparedtothe previousalphabetIncidentallytheresagoodchancethatthisplaintext islongenoughtodisplaythecryptographicalweaknessofthisparticular algorithmCanyouspotit

      perl -Te 'print map { chr((ord)-((10,20,2,7)[$i++])) } split //,"turo"'

        Yeah - perfect example: The following text, taken from Simon Singh's Code Challenge:

        ... fails miserably, but if I save it as "trajan" and run the following:

        $ cat trajan | perl -pe 's/\W|\s//gm' | perl

        It works like a champ. But this treatment can also foil my "best guess" as to the key length:

        $ tail -12 | perl -pe 's/\W|\s//gm' \ | perl - | perl

        ... the "key" found is "vrgpn", and the plaintext is then shown as "tyekipelepetiehnrinmwebnsxmcaqlvcwio..."

        All of which makes me think I need to do some refactoring to look out for these kinds of problems, or at the very least add some interactivity along the lines of the posted solution to the code challenge mentioned above.

        $,=42;for(34,0,-3,9,-11,11,-17,7,-5){$*.=pack'c'=>$,+=$_}for(reverse s +plit//=>$* ){$%++?$ %%2?push@C,$_,$":push@c,$_,$":(push@C,$_,$")&&push@c,$"}$C[$# +C]=$/;($#C >$#c)?($ c=\@C)&&($ C=\@c):($ c=\@c)&&($C=\@C);$%=$|;for(@$c){print$_^ +$$C[$%++]}