Evolutionary computation in Perl | ||
---|---|---|
<<< Previous | Introduction: an evolutionary algorithm step by step | Next >>> |
The programs shown here have not been optimized for efficiency or elegance, but rather for clarity and brevity. Even so, they can probably be improved, so I would like to hear any comment or criticism you have on them. |
Although Darwin's view of forces at work in the evolutionary process seems quite simple, putting them in black on white in an actual algorithm is something completely different. What kind of modifications should be applied to the data structures? What do we do with modified data structures? Should we put them in the population, just like that? Should we (gulp), like, off some other member of the population? If so, which one?
Historically, the first solutions to this conundrum seemed to be: create a population of possible solutions, take a member of the population, change it until you obtain something better (from the point of view of how close it is to the solution, that is, its fitness), and then eliminate one of the least fit members of the population, possible the least fit. That way, every time you introduce a new member of the population you are going one step up the evolutionary ladder (just like the X-Men.)
We will work, from now on, on the following problem: evolve a population of ASCII strings so that they are as close as possible to the string yetanother. The fitness of each (initially random) string will be the difference between the ASCII values of the letter in each position and the ASCII value of the letter in the target string. Optimal strings will have 0 fitness, so the process will try to minimize fitness. The job is done by the following Perl program (1stga.pl, whitespace and POD comments have been suppressed):
#Declare variables (1) my $generations = shift || 500; #Which might be enough my $popSize = 100; # No need to keep it variable my $targetString = 'yetanother'; my $strLength = length( $targetString ); my @alphabet = ('a'..'z'); (1) my @population; #Declare some subs (not the best place to do it, but we are going to #need them (2) sub fitness ($;$) { my $string = shift; my $distance = 0; for ( 0..($strLength -1)) { $distance += abs( ord( substr( $string, $_, 1)) - ord( substr( $targetString, $_, 1))); } return $distance; (2) } sub printPopulation { for (@population) { print "$_->{_str} -> $_->{_fitness} \n"; } } (3) sub mutate { my $chromosome = shift; my $mutationPoint = rand( length( $chromosome->{_str})); substr( $chromosome->{_str}, $mutationPoint, 1 ) = $alphabet[( rand( @alphabet))]; (3) } #Init population (4) for ( 1..$popSize ) { my $chromosome = { _str => '', _fitness => 0 }; for ( 1..$strLength ) { $chromosome->{_str} .= $alphabet[( rand( @alphabet))]; } $chromosome->{_fitness} = fitness( $chromosome->{_str} ); push @population, $chromosome; (4) } #Print initial population printPopulation(); #Sort population @population = sort { $a->{_fitness} <=> $b->{_fitness} } @population; #Go ahead with the algorithm (5) for ( 1..$generations ) { my $chromosome = $population[ rand( @population)]; #Generate offspring that is better my $clone ={}; do { $clone = { _str => $chromosome->{_str}, _fitness => 0 }; mutate( $clone ); $clone->{_fitness} = fitness( $clone->{_str} ); } until ( $clone->{_fitness} > $chromosome->{_fitness}); #Substitute worst $population[$#population]=$clone; #Re-sort @population = sort { $a->{_fitness} <=> $b->{_fitness} } @population; #Print best print "Best so far: $population[0]->{_str} => $population[0]->{_fitness} \n"; #Early exit last if $population[0]->{_fitness} == 0; } (5) |
Use the data structure that best suits your expertise, tickles your fancy, or the one that is closest to the problem you want to solve. Testing different data structures for performance will not hurt you, either. |
This program, when run, produces this output: cekyhtvvjh -> 97 mwehwoxscv -> 82 lalmbnbghi -> 81 [More like this...] Best so far: vowjmwwgft => 41 Best so far: vowjmwwgft => 41 Best so far: vowjmwwgft => 41 [Maaany more like this...]
There are several problems with this algorithm. First, the population is not really used, and it is not actually needed. It is actually a hill-climbing algorithm, and very ineffective at that, since it takes an element, improves it a bit, puts it back into the population, takes another one... it would be much better to just take a random string, and start to change it until it hits target, right? In any case, since it is using a random mutation, what we are performing is basically random search over the space of all possible strings. Not an easy task, and this is the reason why the solution is usually not found, even given several hundred thousands iterations.
Blind mutation usually takes you nowhere, and it takes a long time to do so. |
This indicates there is something amiss here; even if nature is a blind watchmaker, it has the help of a very powerful tool: sex. And that is what we will use in the next section.
<<< Previous | Home | Next >>> |
Introduction: an evolutionary algorithm step by step | Up | Here comes sex |