With a little help from our friend PerlFirst, we are going to take advantage of a Perl module, to
extend our library with new classes. There's a wealth of Perl modules
out there, and many of them are devoted to working with data
structures such as lists or strings. For instance, the
Algorithm::Permute class comes in handy to
create a permutation operator that acts on strings (included binary
strings), as is done next
(1)
package Algorithm::Evolutionary::Op::Permutation;
use Carp;
use Algorithm::Evolutionary::Op::Base;
use Algorithm::Permute;
our @ISA = qw (Algorithm::Evolutionary::Op::Base);
our $APPLIESTO = 'Algorithm::Evolutionary::Individual::String';(1)
our $ARITY = 1; (2)
sub new {
my $class = shift;
my $rate = shift || 1;
my $self = Algorithm::Evolutionary::Op::Base::new( 'Algorithm::Evolutionary::Op::Permutation', $rate );
return $self; (2)
} (3)
sub create {
my $class = shift;
my $rate = shift || 1;
my $self = { rate => $rate };
bless $self, $class; (3)
return $self; (4)
}
sub apply ($;$) {
my $self = shift;
my $arg = shift || croak "No victim here!";
my $victim = $arg->clone();
croak "Incorrect type ".(ref $victim) if ! $self->check( $victim );
my @arr = split("",$victim->{_str});
my $p = new Algorithm::Permute( \@arr );
$victim->{_str} = join( "",$p->next ); (4)
return $victim;
}
|
- (1)
- This is the usual introduction to modules, which
should be preceded with some POD documentation: description, synopsis,
and so on. After declaration of the package name, we declare needed
modules:Algorithm::Permute, a class for fast
permutations, and base
class for all operators,
Algorithm::Evolutionary::Op::Base. Two
constants should be defined also for the module: one of them is
optional, the $APPLIESTO variable, which states to
which individual class it might apply to; this will be used in the
apply method, but if it applies to a whole
hierarchy, for instance, all subclasses of
String, it's better to find out a more
sophisticated check; the second one, $ARITY, is
used by other objects to find the number of arguments the
apply method needs.
| Do not reinvent the wheel: always look up CPAN
when writing operators or individuals; you might find the right class
for the job. |
- (2)
- The new method does not
do much this time, other than forward object creation to the base
class (as all objects should do). An operator just has a
rate as default variable; and this one has not
got any other.
- (3)
- This is equivalent to new, and it's a fossil. Do
not worry about it, it will probably be eliminated in other versions
of the library
- (4)
- This is the most important method: is the one that
actually does the job, and the one it is called to modify
chromosomes. Chromosomes are passed by value, that is why it is cloned, and the result of the
modification is returned; this is the way the higher-level classes,
such as Algorithm::Evolutionary::Op::Full,
expect them to behave that way, but they might do something different
for particular algorithms (for instance,
Algorithm::Evolutionary::Op::QuadXOver takes both
arguments by reference, and is used within the canonical GA). This
method creates a permutation object from the chromosome, and permutes
it, assigns it back to the created chromosome, and returns it.
Other methods, such as set
or asXML, can also be overridden from the base
class, but just if you want to do something very specific; the base
class versions will do just fine in most cases. Sub-classing an
Individual, or creating new kinds of data structures to evolve, is
just as simple, and it is left as an exercise to the reader.
We will use this class to evolve DNA strings; just as we
did before, but, first, our target string will only be composed of A,
C, G and T, and, second, we will have no "distance to char", but
overall distance among strings. The exercise is purely academic, but a
similar problem is solved when sequence alignments want to be done,
only in that cases there are several targets. We will use another
module, String::Approx to compute approximate
distances among strings (ea-ex5.pl).
use Algorithm::Evolutionary::Op::Creator;
use Algorithm::Evolutionary::Op::Permutation;
use Algorithm::Evolutionary::Op::IncMutation;
use Algorithm::Evolutionary::Op::Crossover;
use Algorithm::Evolutionary::Op::CanonicalGA;
use String::Approx qw( adistr );
my $target =shift || 'CGATACGTTGCA';
my $maxGenerations = shift || 100;
my $popSize = shift || 100;
my $fitness = sub {
my $indi = shift;
return 1 - abs ( adistr( $indi->Chrom, $target ) );
};
my $incmutation = new Algorithm::Evolutionary::Op::IncMutation;
my $mutation = new Algorithm::Evolutionary::Op::Permutation;
my $crossover = new Algorithm::Evolutionary::Op::Crossover;
my $ez = new Algorithm::Evolutionary::Op::Easy $fitness, 0.4, [$mutation, $crossover, $incmutation ];
my $indiType = 'String';
my $hash = { length => length( $target ),
chars => ['A','C','G','T']} ;
my $creator = new Algorithm::Evolutionary::Op::Creator( $popSize, 'String', $hash);
my @population = ();
$creator->apply( \@population );
my $gen;
do {
$ez->apply (\@population );
print "Best so far: ", $population[0]->asString(), "\n";
} until ( $population[0]->Chrom eq $target ) || ($gen++ > $maxGenerations) ;
print "Final\n", $population[0]->asString();
|
This program is very similar to previous
examples. The only differences are that we use a different kind of
chromosome,
Individual::String, which uses any
alphabet, and that we use several variation operators:
Op::IncMutation, which increments a single
element in the chromosome by one, taking into account the alphabet
(that is, it would cycle A -> C -> G -> T);
Op::Permutation, which we just declared. The
fitness returns the distance between the string and the target string,
taking into account the length difference, and the insertions and
deletions needed to turn a string into the other. This is a problem,
since AA and TA will have the same distance to GA, and there are many
mutations which are
neutral, leading to no
change in fitness. Furthermore, strings such as GAAAAA are at a
distance of 1 (or 1/divided by total length) from AAAAAG, but a very
lucky permutation is needed to turn one into the other. This leads to
the fact that, in this case, the evolutionary algorithm does not
always find the solution.
| Combinatorial
optimization problems like this one are usually
hard for evolutionary algorithms (and for any other search method, for
that matter). It always help to have a good knowledge of the problem,
and use any other methods available to us to improve search and make
the fitness landscape less rugged. |