Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

Genetic Programming or breeding Perls

by gumpu (Friar)
on Sep 05, 2000 at 21:06 UTC ( #31147=CUFP: print w/ replies, xml ) Need Help??

This is a follow up on the discussion Be a monkey! about getting monkeys to write a Shakespeare novel by randomly typing on a typewriter. I have translated this into the following programming challenge:

Write a Perl program that given a limited number of statements, creates a number that comes as close as possible to some target number. The program starts with

my $x=1; my $y=1;

which is followed by a combination of at most 30 statements chosen from the following 5 statements:

$x += 1; $x =$y; $x |=$y; $x +=$y; $y =$x;

The last statement of the program is the result of the program.

For instance if the target number is 10, a possible solution is:

my $x=1; my $y=1; $x+=1; $y=$x; $x+=$y; $x+=1; $y=$x; $x+=$y;

Such a program is easy to find for small number. It is harder for larger numbers. For instance a solution for 10512 is:

my $x=1;my $y=1; $x+=1; $x+=1;$x+=$y ;$y=$x ;$x+=$y ;$x+=$y ;$y=$x ;$x +=$y ; $x+=$y ;$x+=$y ;$x+=$y ;$x+=1;$x+=$y ;$x+=$y ;$y=$x ;$x+=$y ;$x+=$y +; $x+=$y ;$y=$x ;$x+=$y ;$x+=$y ;$x+=$y ;$y=$x ;$x+=$y ;$x|=$y ;$y=$x +; $x+=$y ;$x+=$y ;

There are 6^30 = 2.2 * 10^23 ways to combine the 5 statements into a program of at most 30 of such statements. 6^30 = 2.2*10^23, which makes it quite impossible to just search all possible combinations for an answer. Randomly generating such programs is about as likely to find an answer as a monkey writing one or two lines of Shakespeare by randomly typing.

However the following program does find an exact solution or a close approximation to the target number. It uses the technique of genetic programming, based on natural selection.

It works with a population of individuals. Each individual has 30 genes. Each gene is a Perl statement. To evaluate the fitness of an individual these statements are stringed together into a Perl program. This program is then evaluated using eval(). The better this program is as creating the target number, the fitter the individual.

The population cycles through a number of generations, in which a new population is generated (bred) from the old population. Pairs of individual are formed and they get two children. Each child is either an exact copy of one of one of its parents or a recombination of its parents. Which individuals are to become parents is based on their fitness (how well they did at reaching the target number). The fitter the individual the more offspring it has.

After this all parents die and the cycle repeats.

Generation after generation the population will become better at creating the target number. The nice quality of this technique is that no knowledge about the search space is needed. The only thing you need to define is a function that tells how good a particular solution is. Try changing the Gene base (possible Perl statements) or the target number. Not that there is no intelligence behind the process, but it can come-up with surprising solutions. Would be a good technique to create obfuscated code. :)

It is ofcourse not a solution to all problems, for instance it does not work if there is only one good solution in the entire search-space. There must be intermediate solutions too. However it can be used to solve very hard problems. For instance I use a modified version of this program to solve the problem of how to pack 256 connections of varying capacity into 32 links. This has a search space of 2*10^385. Searching this would require more time then the life time of the current universe and many that come after it :)

This is my first attempt of using OO in Perl. So please point-out any improvements. (I used the cookbook and the Perl FAQ as information sources).

UPDATE: RE-ADDED CODE

#!/usr/bin/perl -w # An implementation of Genetic programming in Perl. use strict; package GenePool; sub new { my $class = shift; my $self = {}; bless ($self, $class); # Each gene is a Perl statement. $self->{GENES} = ['$x+=1 ;', '$x=$y ;', '$y=$x ;', '$x|=$y ;', '$x+=$y ;', ' ;']; return $self; } # Randomly select a gene from the gene pool sub random_gene { my $self = shift; return ${$self->{GENES}}[rand(@{$self->{GENES}})]; } package Individual; sub new { my $class = shift; my $self = {}; bless ($self, $class); $self->{LENGTH} = 32; $self->{GENES} = []; # An array of perl statements. $self->{NEW_GENES} = []; # The genes of the individual in # the next generation. return $self; } # create the genes for this individual by randomly choosing # 30 Genes (Perl statements) from the GenePool. sub create { my $self = shift; my $genebase = GenePool->new(); push (@{$self->{GENES}}, 'my $x=1;my $y=1;'); for my $i (1 .. $self->{LENGTH}) { push (@{$self->{GENES}}, $genebase->random_gene()); } } # Convert the genes into an string of statements that can # be evaluated using eval(). sub get_code { my $self = shift; my $code = ""; map { $code .= $_} (@{$self->{GENES}}); return $code; } # Set the new set of genes and do some mutation. sub set_new_genes { my $self = shift; @{$self->{NEW_GENES}} = @_; # Once in a while there is an error during copying and a # gene is mutated. if (rand(1.0) < 0.005) { my $mutate = 1 + int(rand(@{$self->{NEW_GENES}} - 1)); my $genebase = GenePool->new(); ${$self->{GENES}}[$mutate] = $genebase->random_gene(); } } # Get a copy of the genes sub get_genes { my $self = shift; return @{$self->{GENES}}; } # Switch the new genes with the old genes sub switch_genes { my $self = shift; @{$self->{GENES}} = @{$self->{NEW_GENES}}; } package Population; sub new { my $class = shift; my $self = {}; bless ($self, $class); $self->{SIZE} = 1999; # The population size $self->{INDIVIDUALS} = []; $self->{FITNESSES} = []; # The fitness of each individual for my $i (0 .. $self->{SIZE}) { my $individual = Individual->new(); $individual->create(); push (@{$self->{INDIVIDUALS}}, $individual); } return $self; } # Determine the fitnes of all individuals in the population sub survival { my $self = shift; my $fitnesses = $self->{FITNESSES}; my $i = 0; foreach my $individual (@{$self->{INDIVIDUALS}}) { my $value = eval($individual->get_code()); ${$fitnesses}[$i] = objective($value); ++$i; } } # Scale the fitnes values such that they are all between 0 and 1 # and such that the total sum is 1. sub scale { my $self = shift; my $fitnesses = $self->{FITNESSES}; my $i; my $size = $self->{SIZE}; my $min = ${$fitnesses}[0]; for ($i = 0; $i < $size; ++$i) { $min = ${$fitnesses}[$i] if (${$fitnesses}[$i] < $min); } my $sum = 0.0; for ($i = 0; $i < $size; ++$i) { ${$fitnesses}[$i] -= $min; $sum += ${$fitnesses}[$i]; } for ($i = 0; $i < $size; ++$i) { ${$fitnesses}[$i] /= $sum; } } # Function that determines how fit an individual is # That is how close it comes to the objective. (target number) # The higher the number the fitter the individual. sub objective { my $value = shift; return -abs(10512 - $value); } # Display the fitest individual sub statistics { my $self = shift; my $fitnesses = $self->{FITNESSES}; my $i; my $index = 0; my $size = $self->{SIZE}; my $max = ${$fitnesses}[0]; for ($i = 0; $i < $size; ++$i) { if (${$fitnesses}[$i] > $max) { $max = ${$fitnesses}[$i]; $index = $i; } } my $individual = ${$self->{INDIVIDUALS}}[$index]; print " ", eval($individual->get_code()), "\n"; print $individual->get_code(), "\n"; } # Randomly select an individual from the population. # The fitter an individual it there more likely it is it # is chosen. sub choose { my $self = shift; my $f = rand(1.0); my $index = 0; my $sum = 0.0; foreach my $fitnes (@{$self->{FITNESSES}}) { $sum += $fitnes; return ${$self->{INDIVIDUALS}}[$index] if $sum >= $f; ++$index; } die "can't select an individual"; } # Generate a new poplation out of the old population by # letting the fitest individuals mate. sub breed { my $self = shift; my $size = $self->{SIZE}; for (my $i = 0; $i < $size;) { # Get the genes from two randomly chosen (fitest) individuals my @genes1 = ($self->choose())->get_genes(); my @genes2 = ($self->choose())->get_genes(); my @new_genes1 = @genes1; my @new_genes2 = @genes2; # Now either # (1) copy both genes into the new population or # (2) select a random cut point and swap the two gene # halves, that is # xxxxxxxxx becomes xxxxxyyyy # yyyyyyyyy yyyyyxxxx if (rand(1.0) > 0.5) { my $cut = 1 + int(rand(@genes1 - 1)); splice @new_genes1, $cut; splice @new_genes2, $cut; push @new_genes1, (splice @genes2, $cut); push @new_genes2, (splice @genes1, $cut); } ${$self->{INDIVIDUALS}}[$i++]->set_new_genes(@new_genes1); ${$self->{INDIVIDUALS}}[$i++]->set_new_genes(@new_genes2); } } # swap the old genes with the newly created genes to get the new # population. sub switch { my $self = shift; foreach my $individual (@{$self->{INDIVIDUALS}}) { $individual->switch_genes(); } } package main; my $population = Population->new(); for my $generation (0 .. 100) { $population->survival(); $population->scale(); $population->statistics(); $population->breed(); $population->switch(); }

Comment on Genetic Programming or breeding Perls
Select or Download Code
(Ovid) RE: Genetic Programming or breeding Perls
by Ovid (Cardinal) on Sep 06, 2000 at 01:26 UTC
    I would just like to say that I, for one, am seriously impressed. I'm going to have to start playing around with this and see what I can come up with (though I can tell that this will be a steep learning curve).

    Cheers,
    Ovid

RE: Genetic Programming or breeding Perls
by Nooks (Monk) on Sep 06, 2000 at 03:32 UTC
    Well done! I tried to do this several months ago and failed because I didn't hit on the idea of using a sequence of individual statements. (I tried to define a small language I could eval and had trouble making it both expressive and easy to evaluate.)

    I notice you left out division---presumably you don't want to have to deal with errors from the eval statement?

      He also left out multiplication and subtraction. Presumably it was to minimize the operators used. I was a bit surprised at "|=" myself.

        Presumably it was to minimize the operators used.

        Yup. It thought it would make it easier to understand the problem (the programming challenge). It is also to limit the number of possible solutions. If there are many building blocks the search space is large but also full of good solutions. Then even a random search works. With this I hoped to demonstrate that even with limited building blocks the algorithm can work to a good solution. (It would be interesting to create a Perl program that can determine the solution density, say using monte carlo or so).

        I was a bit surprised at

        |=

        myself.

        :) I added that to show that the algorithm can come up with solutions that are not easily visable to humans. You can even add things like

        $x ^= 715; $x >>= 1;

        and it will come up with surprising results.

        Have Fun

      I notice you left out division---presumably you don't want to have to deal with errors from the eval statement?

      Yup that was one of the reasons. It would have made the program longer that it already is.

      Programs with errors in them are not a problem as long as they have different fitness values. For instance as long as program with one error in it has a higher fitness as a program with two errors in it. If all programs that do not evaluate correctly, result in the same fitness value there is no way for the algorithm to gradually move to a better solution.

      GP works only if there are intermediate solutions.

      Have Fun

RE: Genetic Programming or breeding Perls
by nop (Hermit) on Sep 06, 2000 at 14:49 UTC
    There's a rich literature on genetic programming. For example, check out John Koza's stuff at www.cs.bham.ac.uk/~wbl/biblio/gp-html/JohnKoza.html.

    Another interesting thread is Tom Ray's Tierra. Realizing that computer programs are just too brittle to evolve well (as one bad character in a program can render it all useless, vs. the laxness in say DNA coding and codons), Ray proposed an interesting computer architecture loosely based on biology that supports evolving programs. www.hip.atr.co.jp/~ray/tierra/

    Another interesting idea: Danny Hillis (of Connection Machine fame ) suggested in a 1990 paper that programs evolve better when confronted with "parasites". Hillis claimed he could evolve better sorting programs when they were pitted against an evolving landscape of "hard" sort sequences (which in turn were generated by different program evolving hard sequences with the goal of stumping the sorters... see citeseer.nj.nec.com/context/15365/9503

    A good site for such topics is SFI www.santafe.edu

    This field hasn't generated much of practical significance yet, but it is cool.
      This field hasn't generated much of practical significance yet, but it is cool.

      Well, I beleive that there are a number of chip frabricators that are using GA to minimize the effect of parasitic transistors, also to minimize labour in layout.

      Danny Hillis's idea does work quite nicely if you can get it right.

      Another idea is to implement a sex difference. By having the genome translate into two different phenomes (for instance in a tracker implementation make sex X build from the genome from the left and sex X from the right) and forcing phenome/genomes to mate with the opposite (you can have as many sexes as you want, I played around with 3 sexes, when two go together they produce a child of the third type. Got the idea from a Piers Anthony book, from the Tarot series.) this minimizes the chance of getting stuck on a local min/maxima. Partially because the best solution will be forced to breed with the other sex, which is evaled differently, thus ensuring that 'good' genes get mixed with 'bad' genes (from either sexes POV). This means the randomness (im sure there is a more appropriate word) in the genepool stays higher.

      Another idea is implement chromosomes. Ie split the genome up into smaller packets that can mutated/bread individually. That way random insertions dont fubar the whole genome, just the chromosome they are in.

      I found the challenging aspect, and perhaps the limiting spect is coming up with an appropriate fitness function. If you can score your creatures then you can can solve the problem they arte trying to solve so whats the point? I mean not really but you get the idea.

      For instance if you do a tracker implementation, by very subtly changing the fitness function you basically kill any possiblity of solving the problem (eating all of the dots).

      From what I know there is really no way to know that your fitness function will enable to population to improve.

      Sorta reminds me of the classic night.day tank problem in neural nets actually..

      Yves

      --
      You are not ready to use symrefs unless you already know why they are bad. -- tadmc (CLPM)

RE: Genetic Programming or breeding Perls
by joe (Acolyte) on Sep 23, 2000 at 02:10 UTC
    Where is the code for this?
      Oy! Good point! Clicking on the <Read More> link doesn't pull up the code! I clicked it several times. Perhaps the <Read More> link should be titled "How to Keep Monks in Suspense!"

      Cheers,
      Ovid

      Update: Looks like the problem has been fixed.

      Join the Perlmonks Setiathome Group or just go the the link and check out our stats.

        The code used to be attached to this article. It was edited by someone other than I. I will try and put it back.

        Have Fun

RE: Genetic Programming or breeding Perls
by Petruchio (Vicar) on Sep 24, 2000 at 10:26 UTC
    I think I can pretty safely say that no number of monkeys, under any circumstances, will ever reproduce a single Shakespeare novel... since he was a playwright, and not a novelist. Sorry to nitpick. :-)

    Ay, mistress, and Petruchio is the master;
    That teacheth tricks eleven and twenty long
    - Shakespeare

      I believe your statement is false. Shakespeare was a human. Humans are primates. Primates are monkeys. Thus, a monkey did in fact write all Shakespearean novels! Imagine what any number of monkeys, under any circumstance could do! Now who's nitpicking? :-) cheers, Thomax "What find I here? Fair Portia's counterfeit! What demi-god Hath come so near creation?" -- Shakespeare (Bassanio in The Merchant of Venice)
        Actually monkeys are primates but not the other way around.

        The primates include lemurs, monkeys, and apes. The apes are the ones without tails, and we are apes. Among the great apes the chimpanzee and bonobo are closest, then we join in, then gorillas, and the orangutang is more distant. This is measuring by percentage of genetic material that is the same.

        Yes, you heard it right. We are biologically more similar to chimps than either we or chimps are to gorillas.

        Thou liest, thou jesting monkey, thou - The Tempest :-)

        As tilly kindly pointed out, only some primates are monkeys, and humans are not amongst these.

        Some people might still say that, "a monkey did in fact write all Shakespearean novels". Predicating a quality (written by a monkey) to a non-existent object (a Shakespearean novel) is logically problematic. And these problems are very interesting, though I'm sorry to say that I no longer remember them well enough to speak knowledgeably about them.

        In any case, it doesn't matter. Let's say a monkey did write all Shakespearean novels (in which case so did each of the squirrels in my yard... and indeed, they wrote themselves, too. The novels that is, not the squirrels). Still, I said a monkey would never reproduce a single Shakespeare novel, and that holds true.

        It interests me, however, that in the same post you claim both to be a monkey and to be picking nits... at least you seem well-groomed. ;-)

        Now, God help thee, poor monkey! - Macbeth

        (Monkeys have prehensile tails. Apes do not. Shakespeare might be counted as an ape, but for some minor taxonomical differences.)
RE: Genetic Programming or breeding Perls
by dumpest (Novice) on Sep 27, 2000 at 02:37 UTC
    WOW... this was really fun and interesting code... I have a question...isn't there a patent on Genetic Programming algorithms? I remember hearing about this once before...but maybe it was in relation to something else... Anyone have any information about this...?
      I know that the general idea is not new... if I remember correctly, these sort of ideas run at least back to the '60s, when they were known as "branch and bound algorithms". Thus my hunch is that while a particular implementation might well be under active patent, loads of prior art could be found for anything more general. So there's $0.02 worth of insight from someone with only $0.01 worth of knowledge on the topic. :-)
        If I remember correctly, these sort of ideas run at least back to the '60s,

        Yup, and even further back. The bibliography of Genetic Algorithms by David Goldberg lists books from 1951 and 1952. Also they basically 'stole' the idea from mother nature :)

        Have Fun

        No, sorry, branch-and-bound is not genetic. Actual true genetic programming started in the '70's, with Holland.
RE: Genetic Programming or breeding Perls
by runrig (Abbot) on Sep 27, 2000 at 03:38 UTC
    Very cool :) One small nitpick though, is the use of map in void context. Specifically (in Individual->get_code):
    my $code = ""; map { $code .= $_} (@{$self->{GENES}});
    could be:
    my $code = join('', @{$self->{GENES}});
    Also, if your Individual->create() method returned $self at the end, you could change this (in Population->new):
    for my $i (0 .. $self->{SIZE}) { my $individual = Individual->new(); $individual->create(); push (@{$self->{INDIVIDUALS}}, $individual); }
    To this:
    push (@{$self->{INDIVIDUALS}}, Individual->new->create) for 0..$self-> +{SIZE};
    Or maybe even this:
    $self->{INDIVIDUALS} = [ map {Individual->new->create} 0..$self->{SIZE +} ];
    But now I'm probably being too nit-picky :) </code>

      Feel free to nitpick some more :), the replacement you suggest look very elegant. It's my first "complicated" perl program so there are bound to be more points that can be improved.

      Have Fun

RE: Genetic Programming or breeding Perls
by Anonymous Monk on Sep 30, 2000 at 02:36 UTC
    Whoa -- someone else hit upon the same idea that I did! Two years ago, whilst a Sophmore (yeah, in HS -- I'm only now a Senior) I needed a science project for biology. Having previously fallen in love with Perl, and being interested in GP, I also wrote my own GP system in Perl. Your system is considerably different from mine -- probably mostly because your implementation isn't quite true to the definition of GP as defined by Koza. Normally, GP individuals are actual program trees, with branching constructs and multiple layers. This makes crossover harder than just string manipulation -- you have to keep track of the inherent structure of the individual. Many people writing GP in C or Java use pointers to construct the tree. Perl being what it is, I wrote a tokenizer that tokenized the syntacticly correct Perl individuals, and munged them as strings. Not as "elegant," nor as fast, but muchly fun. ;> Anyways, I was thinking, sooner or later, of throwing the code up on CPAN -- but later is the key word. I'm using my Perl GP implentation to do some research for the Westinghouse competition, which demands a 20-page paper. Which is due October 2nd. So I'm a little busy right now.. ;> Incidentally, the paper is about making distributed GP more efficient -- and I of course wrote my own client/server GP implentation in Perl. Gotta love threaded Perl. Anyways, I'll wander through here again once I have Free Time again, and post again -- hopefully with a little more clarity and content.
      Aargh -- the monastery ate my linefeeds and login. Oops. Take pity on the initiate and forgive the lack of formatting.
RE: Genetic Programming or breeding Perls
by Crayman (Acolyte) on Oct 16, 2000 at 07:01 UTC
    cool concept and the implementation looks well done. i need to study it more in order to make any detailed or deep comments. extremely minor nitpiks and some questions. 1) someone already covered the map in void context. 2) my $min = ${$fitnesses}[0]; for ($i = 0; $i < $size; ++$i) { # set $i = 1 since you used 0th index to load $min. # same thing for the $max value in a diff routine later. $min = ${$fitnesses}[$i] if (${$fitnesses}[$i] < $min); } 3) sub random_gene { my $self = shift; return ${$self->{GENES}}[rand(@{$self->{GENES}})]; why do you allow perl to truncate the above, which provides a fair distribution for the 0th index, but you handle explicitly below, which never includes the 0th index? there may be a good reason, but i couldn't figure out what it was. if (rand(1.0) < 0.005) { my $mutate = 1 + int(rand(@{$self->{NEW_GENES}} - 1)); 4) if (rand(1.0) > 0.5) { my $cut = 1 + int(rand(@genes1 - 1)); # i understand it here - since replacing the entire gene from beginning might not make sense 5) i factored out the following from the code so I could stick everything on the top and play: my $Genes = ['$x+=1 ;', '$x=$y ;', '$y=$x ;', '$x|=$y ;', '$x+=$y ;', + ' ;']; my $IndivGeneLen = 32; my $PopSize = 1999; my $Target = 10512; my $NumGenerations = 100; next thought - this can be generalized into a module. very cool, thanks!
    ___cliff_AT_rayman.com___
      i figured out the answer to #3. the declaration of the lexical variables occur in index 0, that makes them position dependent. therefore index 0 has to be skipped during mutations and splicing. something like this might be worthy of a comment in the code. ___cliff rayman___cliff_AT_rayman.com___
      i figured out the answer to #3.  the declaration of the lexical
      variables occur in index 0, that makes them position dependent.
      therefore index 0 has to be skipped during mutations and splicing.
      something like this might be worthy of a comment in the
      code.
      ___cliff rayman___cliff_AT_rayman.com___
      
Re: Genetic Programming or breeding Perls
by t'mo (Pilgrim) on May 03, 2001 at 00:22 UTC

    I realize that this thread may not be followed anymore, but if it does (and especially if gumpu is out there listening), it seems to me that each Individual ought to know its own fitness. In other words, FITNESS should be a field in each Individual, and FITNESSES should not be field in Population.

    Any thoughts?

      You are right. FITNESS should be a field in each Individual. Fitness is something associated with an individual. Population should only keep track of the statistics of all the individuals in the population. Bad design choice; probably had a case of 'premature optimization' :)

      Have Fun

      I disagree. Fitness is a combination of the individual abilities and the constraints of the environment. If you put an individual in a different environment its fitness for that environment will change.

        Yes, fitness is related to the environment. But...

        In this case, if the algorighm becomes simpler if FITNESS is more tightly associated with the INDIVIDUAL. For example, this:

        sub choose { my $self = shift; my $f = rand(1.0); my $index = 0; my $sum = 0.0; foreach my $fitnes (@{$self->{FITNESSES}}) { $sum += $fitnes; return ${$self->{INDIVIDUALS}}[$index] if $sum >= $f; ++$index; } die "can't select an individual"; }

        becomes:

        sub choose { my $self = shift; my $f = rand(1.0); my $sum = 0.0; foreach my $individual (@{$self->{INDIVIDUALS}}) { $sum += $individual->{FITNESS}; return $individual if $sum >= $f; } die "can't select an individual"; }

        which I consider an improvement, not only due to the fact that there's (a little) less code, but the concept of choosing an individual and not a fitness is emphasized.

        Finally, I must admit that I have a bias in relation to the idea you presented. Yes, environment does determine fitness. However, what if you're trying to evolved generalized behavior, i.e., a program that will perform well in any environment, and not simply the one's it was trained in? The little bit of work I've done with GP has been focused in the direction of trying to avoid such "over-training" or specialization.

Re: Genetic Programming or breeding Perls
by kwilliams (Sexton) on May 24, 2001 at 19:26 UTC
    I thought I should point out the article that Brad Murray and I wrote a couple years ago for The Perl Journal on GAs in Perl. It's in issue #15.
Re: Genetic Programming or breeding Perls
by Anonymous Monk on May 24, 2001 at 19:30 UTC
    I would like to point out that the code given does not implement Genetic Programming. It implements Genetic Algorithms. The difference is significant. -- John Porter
Re: Genetic Programming or breeding Perls
by rtpc (Initiate) on Aug 21, 2001 at 18:13 UTC
    Hhhmm, I've written my on vesion, with somewhat different operators (genes) to help with the speed of convergence. Actually, I changed the goal too, it now looks for an individual number, repedatively; it eventually prints ot JUST ANOTHER PERL HACKER. But I'm having a problem, quite often the populaion i overrun with exactly the same genes. I really need hlp figuring out how this is happening. The big differences are the choose and breed routines, which are called select and mate in this code. The mate routine selects some of the most fit oganisms, breeds them, and replaces some of the least fit organisms with the offspring. Here's the code:
    #!/usr/bin/perl package GenePool; sub new { my $class = shift; my $self = {}; bless($self, $class); $self->{GENES} = [ '$x+=1;', '$y+=1;', '$X-=1;', '$y-=1;', '$X+=$y;', '$y+=$x;', '$x-=$y;', '$y-=$x;', ';' ]; return $self; } sub gene { my $self = shift; return ${$self->{GENES}}[rand(@{$self->{GENES}})]; } package Organism; sub new { my $class = shift; my $self = {}; bless($self, $class); $self->{LENGTH} = 12; $self->{GENES} = []; $self->{NEXTGEN} = []; return $self; } #create genes for this organism by selecting genes from the genepool sub initialize { my $self = shift; my $genepool = GenePool->new(); foreach (1..$self->{LENGTH}){ push(@{$self->{GENES}}, $genepool->gene()); } return; } sub set_nextgen { my $self = shift; @{$self->{NEXTGEN}} = @_; return; } sub age { my $self = shift; @{$self->{GENES}} = @{$self->{NEXTGEN}} unless $#{$self->{NEXTGEN} +} == -1; return; } sub get_genes { my $self = shift; return @{$self->{GENES}}; } #return code to be evaluated sub get_code { my $self = shift; my $code = ""; foreach(@{$self->{GENES}}){ $code .= $_; } return $code; } package Population; sub new { my ($class, $size) = @_; my $self = {}; bless($self, $class); $size++ if $size%2 == 0; $size+=2 if ($size+1)%4 != 0; $self->{SIZE} = $size; $self->{ORGANISMS} = []; $self->{FITNESSES} = []; $self->{OBJECTIVE} = ""; $self->{VERBOSE} = ""; $self->{MIDDLE} = 0; foreach (0..$self->{SIZE}){ my $organism = Organism->new(); $organism->initialize; push(@{$self->{ORGANISMS}}, $organism); } return $self; } sub set_verbosity { my $self = shift; $self->{VERBOSE} = shift; return; } sub get_verbosity { my $self = shift; return $self->{VERBOSE}; } sub set_objective { my $self = shift; $self->{OBJECTIVE} = shift; return; } sub get_objective { my $self = shift; return $self->{OBJECTIVE}; } #determine the fitnesses of each organism sub fitness { my $self = shift; my $f = $self->{FITNESSES}; my $i = 0; print STDERR "\nFITNESS OF GENEPOOL evaluation(code_value)\n" if $self +->{VERBOSE}; foreach my $organism (@{$self->{ORGANISMS}}){ my $val = eval('my $x = 1; my $y = 1;' . $organism->get_code() +); ${$f}[$i] = $self->evaluate($val); print STDERR ${$f}[$i] . "($val)" if $self->{VERBOSE}; $i++; } print STDERR "\n" if $self->{VERBOSE}; return; } sub evaluate { my ($self, $val) = @_; return -abs($self->{OBJECTIVE} - $val); } #scale the fitnesses so they are less than one and add to one sub scale_fitness { my $self = shift; my $fitnesses = $self->{FITNESSES}; my $i; my $min = ${$fitnesses}[0]; my $size = $self->{SIZE}; my $sum = 0.0; for ($i = 0; $i <= $size; ++$i ){ $min = ${$fitnesses}[$i] if ${$fitnesses}[$i] < $min; } for ( $i = 0; $i <= $size; ++$i ){ ${$fitnesses}[$i] -= $min; $sum += ${$fitnesses}[$i]; } for ( $i = 0; $i <= $size; ++$i ){ if($sum != 0){ ${$fitnesses}[$i] /= $sum; }else{ ${$fitnesses}[$i] = 1/$#{$fitnesses}; } } return; } #pick the fitest individual, excepting those we are told to ignore # (which are the previous picks) sub select { my ($self, $type, @excludelist) = @_; my $index = 0; my ($fitest, $ffit); $ffit = 1.0 if $type eq 'least fit'; foreach my $fitness (@{$self->{FITNESSES}}){ my $next = ""; foreach (@excludelist){ if($index == $_){ $next = "next"; last; }} $index++; next if $next eq "next"; if ( (($type eq 'fitest') && ($ffit <= $fitness)) || (($type eq 'least fit') && ($ffit >= $fitness)) ){ $fitest = $index - 1; $ffit = $fitness; } } return ${$self->{ORGANISMS}}[$fitest], $fitest; } sub find_middle { my $self = shift; my $f = 0; foreach my $fitness (@{$self->{FITNESSES}}){ $f += $fitness; } $self->{MIDDLE} = $f/(@{$self->{FITNESSES}}+1); return; } #find the individual who's fitness is nearest the "middle" sub find_nearest_middle { my $self = shift; my $middle = $self->{MIDDLE}; my ($nearest, $n, $i); foreach (@{$self->{FITNESSES}}){ if ( abs($nearest - $middle) > abs($_ - $middle)){ $nearest = $_; $n = $i; } $i++; } return ${$self->{ORGANISMS}}[$n]; } sub mutate { my @genes = @_; my $genepool = GenePool->new(); foreach my $i (0..$#genes){ $genes[$i] = $genepool->gene if rand(1.0) > 0.825; } return @genes; } #produce offspring for the next generation sub mate { my $self = shift; my $size = $self->{SIZE}; my @minexcludes = (); my @maxexcludes = (); for ( my $i = 0; $i < $size; $i+=4 ){ my $chance = rand(1.0); if($chance > 0.5){ my (@genes_one, @genes_two, $org_one, $org_two, $index_one +, $index_two); ($org_one, $index_one) = $self->select('fitest', @maxexclu +des); push(@maxexcludes, $index_one); @genes_one = $org_one->get_genes(); ($org_two, $index_two) = $self->select('fitest', @maxexclu +des); push(@maxexcludes, $index_two); @genes_two = $org_two->get_genes(); my @new_genes_one = @genes_one; my @new_genes_two = @genes_two; my $point = 1 + int(rand(@genes_one - 1)); splice @new_genes_one, $point; splice @new_genes_two, $point; push @new_genes_one, (splice @genes_two, $point); push @new_genes_two, (splice @genes_one, $point); my (undef, $min_one) = $self->select('least fit', @minexcl +udes); push(@minexcludes, $min_one); my (undef, $min_two) = $self->select('least fit', @minexcl +udes); push(@minexcludes, $min_two); ${$self->{ORGANISMS}}[$min_one]->set_nextgen(@new_genes_on +e); ${$self->{ORGANISMS}}[$min_two]->set_nextgen(@new_genes_tw +o); }elsif($chance < 0.05){ if(rand(1.0)>0.5){ my (@genes_one, $org_one, $index_one); ($org_one, $index_one) = $self->select('fitest', @maxe +xcludes); push(@maxexcludes, $index_one); @genes_one = $org_one->get_genes(); my @new_genes_one = @genes_one; @new_genes_one = mutate(@genes_one); my (undef, $min_one) = $self->select('least fit', @min +excludes); push(@minexcludes, $min_one); ${$self->{ORGANISMS}}[$min_one]->set_nextgen(@new_gene +s_one); }else{ my (@genes_two, $org_two, $index_two); ($org_two, $index_two) = $self->select('fitest', @maxe +xcludes); push(@maxexcludes, $index_two); @genes_two = $org_two->get_genes(); my @new_genes_two = @genes_two; @new_genes_two = mutate(@genes_two); my (undef, $min_two) = $self->select('least fit', @min +excludes); push(@minexcludes, $min_two); ${$self->{ORGANISMS}}[$min_two]->set_nextgen(@new_gene +s_two); } } } return; } sub generate { my $self = shift; foreach my $organism (@{$self->{ORGANISMS}}){ $organism->age(); } return; } package Statistics; sub new { my $class = shift; my $self = {}; bless($self, $class); $self->{GRADE} = [ ' ', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I +', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z' ]; return $self; } sub grade_fitest { my ($self, $pop) = @_; my ($best, undef) = ${$pop->{ORGANISMS}}[$pop->select('fitest')]; my $indexorg = eval('my $x=1; my $y=1;' . $best->get_code()); my $index = $indexorg; my $obj = $pop->get_objective(); if($index < 0){ $index = abs($index) + 1 + $obj; } return "*", $indexorg if $index > 26; return ${$self->{GRADE}}[$index], $indexorg; } sub show_fitest { my ($self, $pop) = @_; my ($best, undef) = ${$pop->{ORGANISMS}}[$pop->select('fitest')]; return $best->get_code(); } sub show_middle { my ($self, $pop) = @_; $pop->find_middle; my $middle = $pop->find_nearest_middle(); my $indexorg = eval('my $x=1; my $y=1;' . $middle->get_code()); my $index = $indexorg; my $obj = $pop->get_objective(); if($index < 0){ $index = abs($index) + 1 + $obj; } return "*", $indexorg if $index > 26; return ${$self->{GRADE}}[$index], $indexorg; } package main; $|=1; my @string = (10, 21, 19, 20, 0, 14, 15, 20, 8, 5, 18, 0, 16, 5, 18, 1 +2, 0, 8, 1, 3, 11, 5, 18); my (@kings, @halloffame); foreach my $target (@string){ my $population = Population->new(49); $population->set_objective($target); #$population->set_verbosity(1); my $generation = 0; my $found = 0; while(!$found){ print "Generation[$generation]"; $population->fitness(); $population->scale_fitness(); my $statistics = Statistics->new(); my ($grade, $val) = $statistics->grade_fitest($population); my $fitest = $statistics->show_fitest($population); print join("", @kings) . $grade; $population->mate(); $population->generate(); if($val == $target){ print "\n$fitest\n"; $found++; push(@kings, $grade); push(@halloffame, $fitest); }elsif($population->get_verbosity){ my ($grade, $val) = $statistics->show_middle($population); print " ($grade, $val)"; } print "\n" unless $found; $generation++; } } print "\nHALL OF FAME\n", join("\n", @halloffame);

      "But I'm having a problem, quite often the populaion i overrun with exactly the same genes. I really need hlp figuring out how this is happening."

      There can be a number of reasons for that. (1) Your population size is too small (GP works best with large populations) (2) Your mutation rate is too low (but your program it looks fine), or (3) Individuals with a low fitness have a too low probability to reproduce. If only the fitest individuals are allowed to reproduce they will take over the whole population. So weaker individuals have to have a chance to reproduce too. The probability for this depends on the population size. For a large population it can be low, for a small population is has to be high.

      Hope that helps

      Have Fun

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: CUFP [id://31147]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others about the Monastery: (9)
As of 2014-12-29 11:47 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (187 votes), past polls