This is an archived low-energy page for bots and other anonmyous visitors.
Please sign up if you are a human and want to interact.
in reply to Re: Breaking the indecipherable cipher, courtesy Charles Babbage. in thread Breaking the indecipherable cipher, courtesy Charles Babbage.
You're absolutely correct - the cipher text must be long enough and in a recognizable langauge, because my code looks for repeated substrings at some interval, and compares the letter fequency distribution of every Nth letter to the standard letter frequency distribution for a given langauge (English in my case ;). In fact it looks for multiple repeating substrings to better try to zero in on the likely key length. Once repeated substrings are found, it computes the greatest common factor of the intervals for a given repeat, and then computes the greatest common factor for those greatest comman factors, and uses that as the key length.
That doesn't always work :)
I've had mixed results - if I rename vigenere.pl to just "chargrill" and encode the same text, the above code can't find any common factors, and then fails. It would actually take a fair amount of refactoring to account for this.
Honestly, I think my algorithm for figuring out what key letter would make an encrypted cipher string of every Nth character matches the standard letter frequency distribution is the most ... interesting. I couldn't think of a better way to do it than with the human eye, and I think my algorithm gets as close to a visual inspection of peaks and valleys of letter frequency charts as I could think of...
--chargrill
$,=42;for(34,0,-3,9,-11,11,-17,7,-5){$*.=pack'c'=>$,+=$_}for(reverse s
+plit//=>$*
){$%++?$ %%2?push@C,$_,$":push@c,$_,$":(push@C,$_,$")&&push@c,$"}$C[$#
+C]=$/;($#C
>$#c)?($ c=\@C)&&($ C=\@c):($ c=\@c)&&($C=\@C);$%=$|;for(@$c){print$_^
+$$C[$%++]}
Re^3: Breaking the indecipherable cipher, courtesy Charles Babbage.
by turo (Friar) on May 19, 2006 at 15:26 UTC
|
your algorithm works well your only fail is to feed your vigener cipher machine with simbols others than alphabetics ones.
Problems of keeping spaces on your ciphertext are that there are some words that are very common on your language (examples: the, and, of, I, you ...), and the same goes for other simbols like the ``''' and the ``?''. So in old cipher methods is important to remove this simbols (except on transposition ciphers like the scitale ^_^>)
i think your algorithm failed because of those spureous simbols
prove with this text instead:
TheVigenerecipherknownbysomeaslechiffreindechiffrableFrenchfor
theindecipherablecipherisamethodofencryptionthatusesaseriesof
differentCaesarciphersbasedonthelettersofakeywordthoughthereis
someargumentamongcryptographiccirclesastowhichincarnationofthis
particularpolyalphabeticsubstitutioncanaccuratelybeattributedtoBlaise
deVigenereThisimplementationisasimpleformofapolyalphabetic
substitutionToencipheratableofalphabetscanbeusedtermedatabula
rectaorVigeneretableItconsistsofthealphabetwrittenout26times
indifferentrowseachalphabetshiftedcycliclytotheleftcomparedtothe
previousalphabetIncidentallytheresagoodchancethatthisplaintext
islongenoughtodisplaythecryptographicalweaknessofthisparticular
algorithmCanyouspotit
perl -Te 'print map { chr((ord)-((10,20,2,7)[$i++])) } split //,"turo"'
| [reply] [d/l] [select] |
|
|
Yeah - perfect example: The following text, taken from Simon Singh's Code Challenge:
... fails miserably, but if I save it as "trajan" and run the following:
$ cat trajan | perl -pe 's/\W|\s//gm' | perl babbage.pl
It works like a champ. But this treatment can also foil my "best guess" as to the key length:
$ tail -12 vigenere.pl | perl -pe 's/\W|\s//gm' \
| perl vigenere.pl - | perl babbage.pl
... the "key" found is "vrgpn", and the plaintext is then shown as "tyekipelepetiehnrinmwebnsxmcaqlvcwio..."
All of which makes me think I need to do some refactoring to look out for these kinds of problems, or at the very least add some interactivity along the lines of the posted solution to the code challenge mentioned above.
--chargrill
$,=42;for(34,0,-3,9,-11,11,-17,7,-5){$*.=pack'c'=>$,+=$_}for(reverse s
+plit//=>$*
){$%++?$ %%2?push@C,$_,$":push@c,$_,$":(push@C,$_,$")&&push@c,$"}$C[$#
+C]=$/;($#C
>$#c)?($ c=\@C)&&($ C=\@c):($ c=\@c)&&($C=\@C);$%=$|;for(@$c){print$_^
+$$C[$%++]}
| [reply] [d/l] [select] |
|
|
turo@somewhere:~# tail -12 vigenere.pl | perl -pe 's/\W|\s//gm' | pe
+rl vigenere.pl - | perl babbage.pl
Key: vigenerepl
Plaintext:
thevigenerecipherknownbysomeaslechiffreindechiffrable ...
The more bizarre thing, is that if i rename the file to chargrill and do the same, the result is wrong
turo@somewhere:~# tail -12 chargrill | perl -pe 's/\W|\s//gm' | perl
+ chargrill - | perl babbage.pl
Key: a
Plaintext:
voemoxmyptlczvymcvpvwehpazxghscktptqhyeztumnskmfigstpqtl ...
Maybe the solution is to choose a keylengh (and key), get the result for the bests fits, and pass some test to see if the result is right or not (based on letter frequencies) ... or something like that
turo
PS: only one more question, which book did you bought? (I've got the David Khan's Codebreakers; but is too big) ... Is the Sinmon Singh's one? ... woff, woff
perl -Te 'print map { chr((ord)-((10,20,2,7)[$i++])) } split //,"turo"'
| [reply] [d/l] [select] |
|
|