http://www.perlmonks.org?node_id=307330


in reply to Perl, Genetic Algorithms and Encoding...

Greetings!

Ok, so before we get started, let me preach for a few minutes. A genetic algorithm is an incredibly expensive "directed stochastic" search of a very large problem domain in which an elegant algorithmic method for finding solutions (or approximations to solutions) does not exist. Due to that fact, once you've resigned to using a genetic algorithm to attempt to approximate a solution to your problem domain, you are making the statement: "I can't write an algorithm other than an exhaustive search of some sort to find this answer in a reasonable amount of time, so I am going to use a heuristic to attempt to approximate it in the time I have alotted." You'll notice I used the word "time" twice. Perl (although it's my favoritest language ever to program in) really is a slow language for a great many tasks. And programming a genetic algorithm in perl (other than for purposes of prototyping a GA) is usually not a "good idea" due to the fact that you're sacrificing orders of magnitude of performance over the exact same algorithm implemented in C or C++.

But I don't *know* C or C++, I hear you say. Well, guess what... the subset of C or C++ you neeed to learn in order to effectively use any of the *fine* genetic algorithm libraries out there is minimal at best. If I could, let me direct you to GAlib: Matthew's Genetic Algorithm Library. Written for use in C++, it already provides an entire framework for programming genetic algorithms with very little effort, and it also provides buckets of sample programs to show you how to use it. You can most likely modify one of the sample problems slightly, and get what you want. Even better there is a thesis available for you to read about using GAlib for jobshop scheduling available here! Things just don't get any nicer than that.

Now, as to how you would actually encode such a thing, the problem that you have is that you not only have to worry about encoding of the problem domain into a genome... but due to the nature of your data, your crossover and mutation operators are going to have to be intelligent enough to not render a genome invalid (making up a class number that doesn't exist, for example) or your selection criteria is going to have to check each genome for "validity."

Good luck, but just remember, there is *volumes* of literature regarding constraint based scheduling using genetic algorithms. With a bit of googleing you should find a thesis, dissertation, or sample implementations to get you on your way.