You're absolutely correct - the cipher text must be long enough and in a recognizable langauge, because my code looks for repeated substrings at some interval, and compares the letter fequency distribution of every Nth letter to the standard letter frequency distribution for a given langauge (English in my case ;). In fact it looks for multiple repeating substrings to better try to zero in on the likely key length. Once repeated substrings are found, it computes the greatest common factor of the intervals for a given repeat, and then computes the greatest common factor for those greatest comman factors, and uses that as the key length.

That doesn't always work :)

I've had mixed results - if I rename vigenere.pl to just "chargrill" and encode the same text, the above code can't find any common factors, and then fails. It would actually take a fair amount of refactoring to account for this.

Honestly, I think my algorithm for figuring out what key letter would make an encrypted cipher string of every Nth character matches the standard letter frequency distribution is the most ... interesting. I couldn't think of a better way to do it than with the human eye, and I think my algorithm gets as close to a visual inspection of peaks and valleys of letter frequency charts as I could think of...

--chargrill
```\$,=42;for(34,0,-3,9,-11,11,-17,7,-5){\$*.=pack'c'=>\$,+=\$_}for(reverse s
+plit//=>\$*
){\$%++?\$ %%2?push@C,\$_,\$":push@c,\$_,\$":(push@C,\$_,\$")&&push@c,\$"}\$C[\$#
+C]=\$/;(\$#C
>\$#c)?(\$ c=\@C)&&(\$ C=\@c):(\$ c=\@c)&&(\$C=\@C);\$%=\$|;for(@\$c){print\$_^
+\$\$C[\$%++]}
• Comment on Re^2: Breaking the indecipherable cipher, courtesy Charles Babbage.

Replies are listed 'Best First'.
Re^3: Breaking the indecipherable cipher, courtesy Charles Babbage.
by turo (Friar) on May 19, 2006 at 19:26 UTC

your algorithm works well your only fail is to feed your vigener cipher machine with simbols others than alphabetics ones.

Problems of keeping spaces on your ciphertext are that there are some words that are very common on your language (examples: the, and, of, I, you ...), and the same goes for other simbols like the ``''' and the ``?''. So in old cipher methods is important to remove this simbols (except on transposition ciphers like the scitale ^_^>)

i think your algorithm failed because of those spureous simbols

```TheVigenerecipherknownbysomeaslechiffreindechiffrableFrenchfor
theindecipherablecipherisamethodofencryptionthatusesaseriesof
differentCaesarciphersbasedonthelettersofakeywordthoughthereis
someargumentamongcryptographiccirclesastowhichincarnationofthis
particularpolyalphabeticsubstitutioncanaccuratelybeattributedtoBlaise
deVigenereThisimplementationisasimpleformofapolyalphabetic
substitutionToencipheratableofalphabetscanbeusedtermedatabula
rectaorVigeneretableItconsistsofthealphabetwrittenout26times
indifferentrowseachalphabetshiftedcycliclytotheleftcomparedtothe
previousalphabetIncidentallytheresagoodchancethatthisplaintext
islongenoughtodisplaythecryptographicalweaknessofthisparticular
algorithmCanyouspotit

perl -Te 'print map { chr((ord)-((10,20,2,7)[\$i++])) } split //,"turo"'

Yeah - perfect example: The following text, taken from Simon Singh's Code Challenge:

... fails miserably, but if I save it as "trajan" and run the following:

```\$ cat trajan  | perl -pe 's/\W|\s//gm' |  perl babbage.pl

It works like a champ. But this treatment can also foil my "best guess" as to the key length:

```\$ tail -12 vigenere.pl | perl -pe 's/\W|\s//gm' \
| perl vigenere.pl - | perl babbage.pl

... the "key" found is "vrgpn", and the plaintext is then shown as "tyekipelepetiehnrinmwebnsxmcaqlvcwio..."

All of which makes me think I need to do some refactoring to look out for these kinds of problems, or at the very least add some interactivity along the lines of the posted solution to the code challenge mentioned above.

--chargrill
```\$,=42;for(34,0,-3,9,-11,11,-17,7,-5){\$*.=pack'c'=>\$,+=\$_}for(reverse s
+plit//=>\$*
){\$%++?\$ %%2?push@C,\$_,\$":push@c,\$_,\$":(push@C,\$_,\$")&&push@c,\$"}\$C[\$#
+C]=\$/;(\$#C
>\$#c)?(\$ c=\@C)&&(\$ C=\@c):(\$ c=\@c)&&(\$C=\@C);\$%=\$|;for(@\$c){print\$_^
+\$\$C[\$%++]}

automation is right ;) ...

i did the same as you did, but i've obtained the correct result ...

```turo@somewhere:~# tail -12 vigenere.pl | perl -pe 's/\W|\s//gm'   | pe
+rl vigenere.pl - | perl babbage.pl
Key: vigenerepl
Plaintext:
thevigenerecipherknownbysomeaslechiffreindechiffrable ...

The more bizarre thing, is that if i rename the file to chargrill and do the same, the result is wrong

```turo@somewhere:~# tail -12 chargrill | perl -pe 's/\W|\s//gm'   | perl
+ chargrill - | perl babbage.pl
Key: a
Plaintext:
voemoxmyptlczvymcvpvwehpazxghscktptqhyeztumnskmfigstpqtl ...