A Beginning Guide To Evolutionary Algorithms

What Are They?

In an evolutionary algorithm (EA), we search the solution space (the set of all possible inputs) of a difficult problem for the best solutions, but not naïvely like in a random or brute-force search. We use some biological principles to help guide the search:

Reproduction: New solutions don't just appear out of thin air (or our random number generator) -- we combine existing good solutions to come up with new (hopefully even better) solutions.
Mutation: Just like in real life, mutations can help or hurt. But they keep the gene pool from becoming stagnant and homogenous -- preventing EA "inbreeding".
Survival of the fittest: The "good" solutions in the population are the ones we pick to pass down their genes. We eliminate the poor solutions from consideration.

EA is an umbrella term encompassing ideas used in the more specific fields of genetic programming, genetic algorithms, and evolutionary computing, so the concepts covered here apply to those fields as well.

Armed with just these principles, you could implement your own rudimentary (and working) EAs. You may already have implemented something like this before. However, as they say, the devil's in the details. It's important to understand how the implementation details affect your EA.

Why/When Use EAs?

(Adapted from evoweb's take on the subject)

You only need to specify the goal of the computation, and need not have a complete mathematical understanding of the problem.
Works well with incomplete and noisy data.
Easy and cheap to implement.
Easily adaptable by simply modifying parameters.
Efficiently parallelized.

On top of all that, EAs are refreshingly fun to use compared to other forms of analysis, and they leave plenty of room for creativity. Here are a few examples of real-world applications that I know about:

Finding circuit layouts that minimize production costs, using graph representations.
Modelling biological principles like cooperation, speciation, specialization.
Finding effective movement strategies for cheap (and dumb) robots.
Creating classifiers and predictors for data sets, often with neural nets.
Generating music/art that satisfies certain aesthetic criteria.

The Basics:

In EAs, we work with a population of individuals, data structures that represent elements of the problem's solution space. When we talk about representation, we mean the type of internal data structure the EA uses to store the individuals (array, string, tree, neural net, etc). Representation does matter, but for the scope of this document's examples, strings and arrays will suffice. As they are easily available as native data types in any sane language, they are much easier to implement and conceptualize. The actual encoding of a solution into a data structure is called the individual's gene. We'll talk a little more about representation later.

Fitness:

If our goal is to find individuals which represent "good" solutions, we should probably be a little more specific about what we mean by "good." We must have a way of scoring an individual's effectiveness for the problem. We call this measurement the individual's fitness (as in survival of the fittest). The fitness measure should reflect the characteristics you desire in a "good" solution, so the higher an individual's fitness, the better it demonstrates the traits you want. The fitness measure is always dependent on the representation you use. It sets your EA's goal.

Commonly, fitness is just a function of the individual's gene data structure. However, the fitness measure need not be a true function in the mathematical sense. It might be probablistic, or it might depend also on other members of the population. It also often involves a model or simulation of the problem, executed with the individuals of the population.

The Process:

The most basic evolutionary algorithm psuedocode is rather simple:

Create an initial population (usually at random)
Until "done": (exit criteria)
- Select some pairs to be parents (selection)
- Combine pairs of parents to create offspring (recombination)
- Perform some mutation(s) on the offspring (mutation)
- Select some population members to be replaced by the new offspring (replacement)
Repeat

This is extremely general, so let's look at each step in a little more detail. As you can imagine, there are a zillion ways to fill in the implementation details of each step, so I'll only list the most common ones.

The exit criteria sets the target for the fitness measure, but also usually includes an upper limit on the number of iterations, in case the evolution gets "stuck." A typical exit criteria might be: "stop when some individual achieves a fitness of 100, or when we have iterated 10,000 times." We'll talk more about evolution getting "stuck" later. Sticking with the biology jargon, each iteration of the loop is called a generation.

Selection and replacement grant breeding rights and cause extinction within the population, respectively. They are independent of the representation scheme, and should only rely on your choice of fitness measure. Usually a small fraction of the population are chosen for breeding or replacement each generation. For simplicity, often the same number of individuals are chosen for breeding and replacement, although this is not required (causing the population to change in size). Here are a few of the most common selection and replacement methods:

Random: Choose completely at random, with uniform probability given to each individual (regardless of fitness).
Absolute: Always breed the n best-fit individuals, and replace the n least-fit individuals. (No randomness, always a deterministic choice)
Roulette: Pick randomly, but with relative weights proportional to fitness. Higher-fit individuals have a better chance of getting chosen for breeding, and less-fit individuals have a better chance of getting chosen for replacement
Rank: Same as roulette, but make the relative weights proportional to an individual's rank within the population, not fitness. The least-fit individual has rank 1, while the most-fit has rank N (the size of the population).

Selection and replacement methods are independent of each other. You could use absolute replacement with rank selection, for example.

Recombination (or breeding) is the process of using existing pairs of "parent" genes to produce new "offspring" genes. The details of this operation depend on your representation scheme, but by far the most common recombination operation is called crossover. Crossover can be used with string and array representations. It involves making copies of the parents and then swapping a chunk between the copies. Here's a visual example on two string genes:

   parent 1: "aaaaaaaaaaaaaa"  (these are only copies -- the parents a
+re
   parent 2: "AAAAAAAAAAAAAA"      not modified in this operation)
      cut here:     ^    ^     (these two points chosen at random)
      and then swap sections..
   result 1: "aaaaaaAAAAAAaa"
   result 2: "AAAAAAaaaaaaAA"
[download]

The concept of crossover can be extended and used in other representations as well. For instance, a crossover operation on two tree structures might involve the exchange of two subtrees. Common variations on crossover include swapping chunks from different parts of the two genes or exchanging more than one chunk.

Mutation is a random process which slightly modifies the gene of an individual. With string genes, a mutation usually consists of changing a fixed number of characters, or changing each character with a very low probability (e.g, a 5% chance of changing each character). Other interesting mutations include lengthening, shortening, or modifying the gene, each with a respective probability.

A "Hello World" Example:

It's time for an example! The "Hello World" of EAs is called ONE-MAX. The problem is to produce a string of all 1s, starting from a population of random binary strings. This may seem silly since we already know the final answer, but the same could be said for "Hello World" programs in programming languages. In this case, it's the EA process and the concepts, not the final answer, that are most important. Using a string representation for the genes, the code is as follows:

use strict;
use List::Util qw/shuffle sum/;

my $str_length = 20;
my $pop_size   = 50;

my @population = sort { fitness($a) <=> fitness($b) }
                 map  { rand_string() } 1 .. $pop_size;

my $generations;
while ( $generations++ < 1000 and fitness($population[-1]) != $str_len
+gth ) {
    my @parents = shuffle @population[-10 .. -1];

    my @children;
    push @children, crossover( splice(@parents, 0, 2) )
        while @parents;

    @population[0 .. 4] = map { mutate($_) } @children;

    @population = sort { fitness($a) <=> fitness($b) } @population;

    printf "Average fitness after %d generations is: %g\n",
        $generations, (sum map { fitness($_) } @population)/@populatio
+n;
}

#####

sub fitness {
    return $_[0] =~ tr/1/1/;
}

sub crossover {
    my ($s1, $s2) = @_;
    my ($start, $end) = sort {$a <=> $b} map { int rand length $s1 } 1
+ .. 2;

    (substr($s1, $start, $end - $start), substr($s2, $start, $end - $s
+tart)) =
      (substr($s2, $start, $end - $start), substr($s1, $start, $end - 
+$start));
    return ($s1, $s2);
}

sub mutate {
    my $s = shift;
    for (0 .. length($s) - 1) {
        substr($s, $_, 1) = 1 - substr($s, $_, 1) if rand() < 0.2;
    }
    return $s;
}

sub rand_string {
    join "" => map { rand() > 0.5 ? 0 : 1 } 1 .. $str_length;
}
[download]

Can you pick out which parts of this code correspond to the parts of the pseudocode? What type of mutation was used (N-point or probabalistic)? What type of selection and replacement scheme were used? What percentage of the population gets breeding rights at each generation? What is the exit criteria? How could this code be made more efficient (there are many ways)? How could the EA process be modularized? How much harder would this have been to write in C or Java? ;)

Now What? How Do I Choose?

You now probably have a feeling for the wide range of EA building blocks. But there are so many, how will you choose what's best for a particular problem? What makes them different? It's time for a little theory...

Fitness Landscapes & Diversity:

One way to think of how EAs solve problems is through hill-climbing. Think of breeding as a process of exploring the solution space: starting with high-fitness individuals, recombination and mutation bring new individuals into the population, whose genes are "nearby" the genes of the parents. Selection and replacement fuel the up-hill part: the new individuals who have a higher fitness will in turn be explored while the lower ones will eventually be discarded and so on, until you discover individuals that have the highest fitness of all nearby individuals -- they are at the top of that particular "hill" of fitness. Notice that "nearness" of other individuals is measured in the number of mutations and/or recombinations needed to get from here to there. So your choice of mutation and recombination operators determines the fitness landscape.

On one hand, hill-climbing casues EA populations is to slowly cluster near the tops of these hills as they try to achieve maximum fitness. When most of the population's members are very close to one another (very few mutations or crossovers apart), their genes are very similar, they have much genetic material in common, and we say the population is not diverse. Hill-climbing is desired (we do want to maximize fitness after all), but only in moderation. If it happens too fast, it's easy for the whole population may become "stuck" on a small number of fitness hills that are not the highest in the solution space. Mathematically speaking, these are local optima.

On the other hand, when the population is diverse and spread out in the landscape, you may combine two "distant" parents to get a child somewhere in the middle, maybe on a new fitness hill. This allows for more fitness hills to be discovered, reducing the chance of getting stuck on a local optima.

(You may have noticed that in the ONE-MAX example, there are none of these. There's only one fitness hill, with the string of all 1s at the top. Its fitness landscape is a 20-dimensional hypercube. Mutation moves along one or more edges of the cube, and crossover moves to any vertex along the subcube induced by the parents. Non-trivial problems generally have fitness landscapes that are too complex to characterize.)

Here is how diversity is affected in general by the different operations:

The recombination-replacement process decreases diversity: When we reuse the good genes and throw out the bad genes, the population tends to approach a stale and homogenous state. Since we only reuse genetic material already existing in the population throw out some information, we can never create new genetic material.
Mutation is the only process that increases diversity: It's the only way new genetic material is introduced throughout the EA process. Without mutation, there would be no way to achieve gene configurations that aren't merely rearrangements of the initial population's genes. However, mutations that are too large or too frequent can make individuals "jump" between fitness hills. We want mutation to move individuals a small enough distance that they usually stay on the same fitness hill.
Stronger fitness bias in selection/replacement means faster decrease in diversity: If you select and replace individuals completely at random (no bias towards fitness), it will take a long time for the population to become homogenous (but it will still happen eventually). On the other hand, if you always breed the most-fit individuals and always replace the least-fit (no randomness, 100% fitness bias), then only the best-fit individuals (and their presumably high-fitness children) keep producing children generation after generation. The high-fitness genetic material gets reused for breeding over and over again, while the lower-fitness genetic material never has a chance to be reused.

Population diversity is needed to ensure a many fitness hills are encountered. But eventually diversity must be sacrificed so that good solutions can "climb the hill" to maximize fitness.

Representation Matters, Too!

Mutation and recombination (and therefore the fitness landscape) rely on your choice of representation scheme. The representation should therefore make mutation and recombination behave like the biological concepts they represent. For instance, a small change in an individual's gene should make only a small to moderate change in its fitness characterstics. Likewise, combining parts of the genes of two individuals should produce an individual that shares some of its parents' characterstics. However, the result need not be merely an average of the parents; there may be synergy between different parts of the genes.

In solving difficult problems with EAs, finding a good representation scheme with good recombination and mutation operations can often be the hardest piece of the puzzle. There is no magic advice for choosing the "right" representation, and in addition to adhering to these guidelines, the choice must be feasible to implement.

Some Final Notes

EAs are a lot of fun, especially when you are modelling some real-world situation. Because at times EAs seem to work like magic, it's easy to forget that they have the same limitations (in terms of complexity) as any computer model. Do expect interesting results, but don't expect sentient digital lifeforms to crawl out of the primordial ooze of your 1s and 0s.

I think you'll enjoy working with evolutionary algorithms, as they're a bit of a departure from classical computation/analysis methods. Hopefully this guide will give you the background needed to have a lot of fun tinkering with EAs. Be creative and inventive, and happy evolving!

blokhead

Update: fixed moved evoweb link (thanks atcroft)

Comment on A Beginning Guide To Evolutionary Algorithms Select or Download Code

Replies are listed 'Best First'.
Re: A Beginning Guide To Evolutionary Algorithms by biosysadmin (Deacon) on Oct 14, 2003 at 12:30 UTC
I have another concept that you might add to your tutorial. Evolutionary algorithms assume that the route to an optimal solution is paved with progressively better solutions. This is not be true for some problems, so it's important to keep that in mind. Also, sometimes your evolutionary algorithms tend to migrate towards a sub-optimal solution that is not anywhere "near" an optimal solution. In your hill metaphor, you might call this a false peak of sorts. One way around this is to introduce a cataclysmic event that kills off most of your population and replacing them with randomly initialized individuals. One more thing: at my school we call them genetic algorithms, perhaps you could not that somewhere in your "What Are They?" section? Awesome write-up, this is a very good explanation of a cool paradigm in problem solving. :)	[reply]
Re: Re: A Beginning Guide To Evolutionary Algorithms by blokhead (Monsignor) on Oct 14, 2003 at 15:05 UTC
Evolutionary algorithms assume that the route to an optimal solution is paved with progressively better solutions. This isn't true. When this is the case, every mutation that increases fitness is a move in the "right direction", so you just have one big fitness hill. But then you don't even need real evolution! Just take one individual, examine some possible single-mutations from it until you get something better. Keep repeating, and you're guaranteed to get an optimal solution in the end. We called these "stochastic hill-climbers," IIRC. Notice that the ONE-MAX example is like this. Every new 1-character you get in the string gene is a step in the right direction, up the hypercube so to speak. This might be a reason why ONE-MAX is a deceptively simple first example? This is not be true for some problems, so it's important to keep that in mind. You're exactly right here. I wanted to imply this by mentioning suboptimal fitness hills, but it's probably better to say it explicitly. By climbing up a suboptimal hill, you are taking steps "up", but you won't get to an optimal solution. To get to an optimal solution, must to take steps "down" the hill first. So the route to an optimal solution isn't paved with progressively better things. This is why we don't want to kill off lower-fitness individuals as soon as we lay eyes on them in hard problems `;)` .. if they have a chance of breeding, they might even lead us up a different fitness hill than the one we're stuck on. One more thing: at my school we call them genetic algorithms I never got most of these silly and similar-sounding terms straight in my head! According to the EA FAQ, EA encompasses genetic algorithms. However, the psuedocode they give for both seems the same. So I honestly don't know the difference. It's seems pretty safe to interchange the two terms, and definitely within the structure laid out in my writeup (which follows the pseudocode of both). blokhead	[reply]
Re: Re: Re: A Beginning Guide To Evolutionary Algorithms by biosysadmin (Deacon) on Oct 15, 2003 at 12:44 UTC
I don't think I said exactly what I meant in my first reply. I didn't mean that each individual mutation must increase fitness for an individual for a population. Here's a more specific explanation: There are some sets of parameters that perform well when combined together in the same individual. One hypothesis of evolutionary algorithms is that combination of these good "building blocks" will form an overall more fit individual. There exist problems that violate this hypothesis, and are therefore very difficult for a simple genetic algorithm. If you do a google search for the "Minimal Deceptive Problem," you should find some references for this. In general, these deceptive problems are very rare and are also easy to spot if you track the evolution of your population over time. Hope this cleared my comment up. Again, awesome guide. :)	[reply]
Re: A Beginning Guide To Evolutionary Algorithms by Jaap (Curate) on Oct 14, 2003 at 09:47 UTC
Perhaps for a nice discussion, we could try to implement the "travelling salesman problem" where a salesman has to visit a bunch of points represented as (x,y) coordinates in such a way that the total distance travelled (the length of the path) is minimized (this is the fitness criterium) We'd need a good representation of a solution, perhaps a 2D array with a series of (x,y) coordinates. We'd also need a subroutine to calculate the length of the path. And we'd need the crossover() and mutate() functions.	[reply]
Re: A Beginning Guide To Evolutionary Algorithms by Jaap (Curate) on Oct 14, 2003 at 09:26 UTC
Excellent article! You might however have replaced the words "solution space" with something less intimidating like ehhh.... "solution". I know it's a space in the mathematical sense but not everyone has followed math classes untill the age of 25. This article is almost a tutorial. We could move it to the tut section.	[reply]
Re: A Beginning Guide To Evolutionary Algorithms by Jaap (Curate) on Oct 14, 2003 at 09:34 UTC
The problem is to produce a string of all 1s... This is not the real problem. It is the absolute solution to the problem (because of the fitness algorithm). Normally you don't know what the solution looks like.	[reply]
Re: Re: A Beginning Guide To Evolutionary Algorithms by Anonymous Monk on Oct 14, 2003 at 15:11 UTC
Which is what he said in his post.	[reply]
Re: A Beginning Guide To Evolutionary Algorithms by AssFace (Pilgrim) on Oct 14, 2003 at 16:12 UTC
Does Perl have any modules that would allow plotting the 3D problem/solution space? I do a lot of work with genetic algorithms and neural nets and it would be interesting to be able to easily output your results on a 3D contour map of the problem space... Although I guess in most cases you can't map the solution space or problem space, just the parts of it that you have discovered and then have to look at the graph and guess the rest. ------------------------------------------------------------------- There are some odd things afoot now, in the Villa Straylight.	[reply]
Re: Re: A Beginning Guide To Evolutionary Algorithms by blokhead (Monsignor) on Oct 14, 2003 at 16:59 UTC
The fitness landscape is a 3-dimensional surface only if the genes of the individuals have exactly 2 degrees of freedom. Even in the ONE-MAX problem above, the fitness landscape is a 21-dimensional surface. Your indpendent variable is any tuple in `{0,1}^20`, and you've got the dependent variable, fitness. Granted, the 20 independent dimensions are not big (just 0 and 1), but it's still more than I know how to visualize effectively. Using terms like "hills" to help visualize the problem space is helpful, but only because we usually omit the fact that the hills are almost never 3-dimensional. `;)` It's worse if you're evolving structures other than strings and arrays. An n-character string is easily analogous to a tuple in some n-dimensional space. But what if we are evolving neural net structures, or state machines, or parse trees? How do we "plot" the fitnesses of these things? How do you even measure the distance between two state machines so that you can build the "grid" that the fitness will be plotted on? Strict mathematical analysis on these problems is tough. Even trying to look at the spaces involved is often next to impossible. There can't be any single plug-n-play way of dumping a plot of your fitness landscape. Doing such a plot requires a large understanding of the independent variables, and probably lots of trial and error. You're probably better off analyzing the EA results using statistical analysis, as opposed to inspecting the actual fitness landscape. blokhead	[reply]
Re: Re: Re: A Beginning Guide To Evolutionary Algorithms by AssFace (Pilgrim) on Oct 14, 2003 at 17:43 UTC
I suspected that sort of thing. That said, I think for the things that I do, I could still argue that a 3D map of the space completed would be feasible... perhaps not terribly useful - but for what I do, eye-candy actually does have merit when presenting to non-tech heads. If it is plotting in one direction the time that you are testing over (say +/- X), then another direction (say +/- Y) the performance score of your code, and then in say the Z +/- direction perhaps the series of evolutions that you do. That way I could show them with pretty charts that as the cycles of the code went on, I moved away from certain points and towards others. I see your point at the level of usefulness of this - but I still raise my question of if there is a way to in Perl to make 3D plots - I haven't seen it, but then I honestly haven't looked all that hard in the past bit (I looked over a year ago I can recall). ------------------------------------------------------------------- There are some odd things afoot now, in the Villa Straylight.	[reply]
Re: Re: Re: A Beginning Guide To Evolutionary Algorithms by Anonymous Monk on Oct 14, 2003 at 19:36 UTC
In fact, one major problem in evolutionary biology and phylogenetics is the near-impossibility of defining the distance between two different phylogenetic trees mathematically. For instance, you may wish to have some quantitative metric to indicate how much a tree is perturbed by the addition/removal of an additional element, but as best as I know that's virtually impossible.	[reply]
Re: A Beginning Guide To Evolutionary Algorithms by AssFace (Pilgrim) on Oct 20, 2003 at 16:12 UTC
Also, don't forget the variant on that that is evolutionary programming. An evolutionary algorithm takes a result, usually a string, and manipulates it so that it gets progressively better over time based on some scoring system. Evolutionary programming does a similar thing, but with the program code itself. An example of this is in the game GridWars, the team that came in second used a program that was all evolutionarily built. Interestingly enough, in that particular competition, speed matters very much, and that is one drawback of evolutionary programming - it tends to make bulky code that isn't always as fast as it could be. I haven't yet tried it myself with eval, but I suspect it would be relatively easy to implement in Perl. The idea is that you have your desired result, and instead of manipulating your string as the desired output - you manipulate the code that generates your output. Usually this means adding if/then statements and adjusting the values that they compare, and perhaps nesting them as well. I haven't ever written in Lisp, but I have read that it is relatively easy to implement in Lisp. This is a good way to let a program design a program for you. Then later on it usually benefits the speed of the program for a human to go over the final result and look for areas that it could be changed to optimize various parts of it. ------------------------------------------------------------------- There are some odd things afoot now, in the Villa Straylight.	[reply]
Re: Re: A Beginning Guide To Evolutionary Algorithms by blokhead (Monsignor) on Oct 20, 2003 at 18:52 UTC
In evolutionary programming (also known as genetic programming), you evolve structures that are programs. The most common way to do this is by evolving lists of statements or parse trees / abstract syntax trees. Interestingly enough, in that particular competition, speed matters very much, and that is one drawback of evolutionary programming - it tends to make bulky code that isn't always as fast as it could be. This is something we studied in my EA class. The reason for this is that evolution tends "pad" its solutions to protect against mutation. The best solutions you find will usually have many sections of dead/unreachable code and/or redundancy, which will increase the probability of a mutation not causing harm to the "real" code. It also increases the change that the important code is not split during a crossover as well. An interesting research topic is how the average percentage of dead code is related to your mutation rate. I haven't yet tried it myself with eval, but I suspect it would be relatively easy to implement in Perl. You may wish to look at Genetic Programming or breeding Perls, which is a neat exercise. I currently have a few relevant Perl/EA links on my homenode at the moment. And also, for the moment, I have in my scratchpad a reimplementation of Breeding Perls using one of my own EA modules. It uses a fixed-length array of simple Perl statements, and uses eval to score the fitness. Usually this means adding if/then statements and adjusting the values that they compare, and perhaps nesting them as well. For nesting, conditionals, etc, you would need to evolve an abstract syntax tree instead of just a list of statements (that is, if you want the code to actually be syntactically valid for each individual). But another alternative very similar to what you suggest is called an ISAc list. I forget what it stands for, but it was invented by some guy from a previous semester of the EA class I took. It's very much like modern assembly codes with only jump and no-op instructions. But each instruction has a boolean predicate based on two data values and a comparison operator. I'm a little short on time right now, but I can come back a bit later and explain this in full.. OK, an ISAc list has 4 parts, two offsets into a data vector, an action, and a jump location. The environment defines a data vector, which holds some values. Some values may be constants, some may change each time (like a list of the opponents' last few moves, etc). The environment also defines a comparison operator. To execute each instruction, the two data-vector values are compared, and if the comparison succeeded the action is taken. Otherwise, flow continues to the next statement (looping around at the bottom). This will probably be clearer with an example: `data vector = (x1,..,x5) = (3,5,2,7,2) comparison operator = greater-than isac list: 1: (x1, x3, jump, 3) # if (x1 > x3) goto 3; 2: (x5, x3, noop, 1) # if (x5 > x3) noop; 3: (x2, x1, foo, 5) # if (x2 > x1) return "foo"; 4: ...` [download] Two actions, "jump" and "noop" are special flow-control actions. Other actions cause a return from the structure, as the program's "final answer." Anyway, the lists seem to evolve well, as crossovers and mutations are quite straight-forward. This is a good way to let a program design a program for you. Then later on it usually benefits the speed of the program for a human to go over the final result and look for areas that it could be changed to optimize various parts of it. The biggest problem is that computer-evolved programs are notoriously impossible to understand. You don't need to look any farther than the example outputs of Breeding Perls. It may not be possible to do much more with evolved programs than just remove dead/unreachable code, but this is an actively researched field and you may find some interesting papers out there. blokhead	[reply] [d/l]
Re: A Beginning Guide To Evolutionary Algorithms by AssFace (Pilgrim) on Oct 20, 2003 at 16:26 UTC
One area where these are great to use is old school encryption breaking. Back when encryption was based on things such as the German Enigma or the French Vigenere, etc, they were all essentially ways of using multiple alphabets to encode a message. If trying to brute force it by hand, this would take a lot of time as the number of alphabets used increased. People like Turning worked out ways around it with mathematical approaches that greatly reduced the search space that one would need to go through. Even today with fast computers, if you wanted to brute force the Poe Cipher (which is how I got into Perl, during the process of breaking that - I wasn't the first though - 5th I think), it would have taken more time than we can even really grasp. That is where evolutionary algorithms come in. In the case of the Poe Cipher - all of his written works and many of his letters are online at the Gutenburg project - so you can download those. After which, you can scan through them all and populate a Markov Matrix with their probabilities of showing up in various patterns. Digraphs, trigraphs, etc (number of characters in a row). After that, you can now scan over a section of text and score it with that MM - the higher the score, the more likely it is to be actual text instead of just total gibberish. Then you simpy have a tight loop that evolves the alphabets, tests output to see what its score is - if the score is higher than the last time, then we have a better set of alphabets. Then you mate and mutate them and test the score again, etc etc. Due to errors in the encoding (whether purposeful or not), it wasn't likely that there would be 100% completion of the decryption by the program - but if you were outputting the code to the screen, once you started seeing text that you could understand, then you knew you were close and then could eventually do the rest by hand. There are also applications similar to this in the stock market world - which I will gladly go into detail on if anyone is interested. ------------------------------------------------------------------- There are some odd things afoot now, in the Villa Straylight.	[reply]
Re: A Beginning Guide To Evolutionary Algorithms by t'mo (Pilgrim) on Oct 14, 2003 at 17:06 UTC
People who read your article shoud be able to understand mstone's "aside" in this now. ++ ...every application I have ever worked on is a glorified munger...	[reply]
Re: A Beginning Guide To Evolutionary Algorithms by Anonymous Monk on Jul 16, 2007 at 06:35 UTC
Hi. This is an excellent article, nice work. I'm writing an e-book about global optimization, including topics like evolutionary algorithms, genetic algorithms, and genetic programming. Maybe it can also be of some utility as further reading here, see http://www.it-weise.de/projects/book.pdf. Thomas.	[reply]
Re: A Beginning Guide To Evolutionary Algorithms by rupesh (Hermit) on Aug 12, 2004 at 05:58 UTC
I know this is a bit late for this node. I was just going thru some of your nodes and this one seemed to be interesting... Anyway, have you read Dan Brown's Digital Fortress? That was one of my recent reads and it was a wonderful experience.. The reason i mention the book, is because it is full of Encryption Algorithms (including some of his other books) ... the characters are people from Crypto!.. One the whole, a great read.. both your's above as well as the book. Thanks!	[reply]