A higher-order approach would be to look at probabilities for chains of letters. For example, in English the probability of a 'u' should be higher than average after a 'q'.
This is based on the concept of monkeys at a typewriter producing texts (Sir Arthur Eddington, 1927). I own a 1988 special edition of Scientific American on computer problems (translated into German) in which a BASIC program is discussed: it collects statistics from longer input texts and then produces a random text, weighting the probabilities for the next letter by the collected statistics for the N preceding letters. Punctuation and space characters are included as letters in the randomly produced string. At N=4 one entry might be ('Per', 'l', 50%). At N=5 most of the words are correct and resemble the input text, but the sentences are nonsense.
Perl really shines at these applications. The author, Brian Hayes, hints at the large memory requirements of the N-dimensional array, in which many fields remain empty. A perfect application for a hash...
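For illustration, here is a minimal Perl sketch of that scheme, using a hash of frequency counts in place of the sparse N-dimensional array. The chain length $n, the output length, and the whitespace handling are my own assumptions, not taken from the BASIC original:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $n   = 4;        # number of preceding characters to condition on (assumption)
    my $len = 500;      # number of characters to generate (assumption)

    local $/;           # slurp mode
    my $text = <>;      # read the whole input text from STDIN or a file
    $text =~ s/\s+/ /g; # normalize whitespace; spaces count as letters

    # Collect statistics: for every N-character context, count which
    # character follows it.  Contexts that never occur simply never
    # appear in the hash -- no wasted memory on empty fields.
    my %freq;
    for my $i (0 .. length($text) - $n - 1) {
        my $context = substr($text, $i, $n);
        my $next    = substr($text, $i + $n, 1);
        $freq{$context}{$next}++;
    }

    # Generate: start with a random context from the input, then repeatedly
    # draw the next character with probability proportional to its count.
    my $out = substr($text, int(rand(length($text) - $n)), $n);
    for (1 .. $len) {
        my $context = substr($out, -$n);
        my $counts  = $freq{$context} or last;  # dead end: stop early
        my $total   = 0;
        $total += $_ for values %$counts;
        my $pick = int(rand($total));
        for my $char (keys %$counts) {
            $pick -= $counts->{$char};
            if ($pick < 0) { $out .= $char; last; }
        }
    }
    print "$out\n";

Run it as, say, perl markov.pl input.txt; longer inputs and a larger $n give output that more closely resembles the source text, just as described above.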