Article
01/27/2016

June 2012

Volume 27 Number 06

Test Run - Evolutionary Optimization Algorithms

By James McCaffrey | June 2012

James McCaffrey

An evolutionary optimization algorithm is an implementation of a meta-heuristic modeled on the behavior of biological evolution. These algorithms can be used to find approximate solutions to difficult or impossible numerical minimization problems. You might be interested in evolutionary optimization algorithms for three reasons. First, knowing how to code these algorithms can be a practical addition to your developer, manager and tester skill sets. Second, some of the techniques used, such as tournament selection, can be reused in other algorithms and coding scenarios. And third, many engineers simply find these algorithms interesting in their own right.

An evolutionary optimization algorithm is essentially a type of genetic algorithm in which the virtual chromosomes are made of real values instead of some form of bit representation. The best way for you to see where I’m headed is to take a look Figure 1 and Figure 2.

Figure 1 Schwefel’s Function

Figure 2 Evolutionary Optimization Demo

The graph in Figure 1 shows Schwefel’s function, a standard benchmark numerical minimization problem. Schwefel’s function is defined as:

f(x,y) = (-x * sin(sqrt(abs(x)))) + (-y * sin(sqrt(abs(y))))

The function has a minimum value of -837.9658 when x = y = 420.9687. In Figure 1 this minimum value is at the far-left corner of the graph. The function is deliberately designed to be difficult for optimization algorithms to solve because of the numerous false local minimum values.

The screenshot in Figure 2 shows the output from a C# console application. After displaying some informational messages, the program sets eight parameters and then uses an evolutionary algorithm to search for the optimum solution. In this example, the algorithm found a best solution of (420.9688, 420.9687), which is extremely close to, but not quite equal to the optimal solution of (420.9687, 420.9687).

Evolutionary optimization algorithms model a solution as a chromosome in an individual. In high-level pseudo-code, the algorithm implemented in Figure 2 is:

initialize a population of random solutions
determine best solution in population
loop
  select two parents from population
  make two children from the parents
  place children into population
  make and place an immigrant into population
  check if a new best solution exists
end loop
return best solution found

In the sections that follow, I’ll present all the code that generated the screenshot in Figure 2. You can access the code from msdn.com/magazine/msdnmag0612. This article assumes you have intermediate or advanced programming skills and at least a very basic understanding of genetic algorithms. I coded the demo program using C#, but you should be able to easily refactor the code into other languages such as Visual Basic .NET or IronPython.

Program Structure

The overall program structure is shown in Figure 3. I used Visual Studio 2010 to create a new C# console application project named EvolutionaryOptimization. In the Solution Explorer window, I renamed Program.cs to EvolutionaryOptimizationProgram.cs, which automatically renamed class Program. I also deleted all the unneeded template-generated using statements.

Figure 3 Overall Structure of the EvolutionaryOptimization Program

using System;
namespace EvolutionaryOptimization
{
  class EvolutionaryOptimizationProgram
  {
    static void Main(string[] args)
    {
      try
      {
        int popSize = 100;
        int numGenes = 2;
        double minGene = -500.0; double maxGene = 500.0;
        double mutateRate = 1.0 / numGenes;
        double precision = 0.0001;
        double tau = 0.40;
        int maxGeneration = 8000;
        Evolver ev = new Evolver(popSize, numGenes, 
          minGene, maxGene, mutateRate,
          precision, tau, maxGeneration);
        double[] best = ev.Evolve();
        Console.WriteLine("\nBest solution found:");
        for (int i = 0; i < best.Length; ++i)
          Console.Write(best[i].ToString("F4") + " ");
        double fitness = Problem.Fitness(best);
        Console.WriteLine("\nFitness of best solution = " 
          + fitness.ToString("F4"));
      }
      catch (Exception ex)
      {
        Console.WriteLine("Fatal: " + ex.Message);
      }
    }
  }
  public class Evolver
  {
    // Member fields
    public Evolver(int popSize, int numGenes, 
      double minGene, double maxGene,
      double mutateRate, double precision, 
      double tau, int maxGeneration) { ... }
    public double[] Evolve() { ... }
    private Individual[] Select(int n) { ... }
    private void ShuffleIndexes() { ... }
    private Individual[] Reproduce(Individual parent1, 
      Individual parent2) { ... }
    private void Accept(Individual child1, 
      Individual child2) { ... }
    private void Immigrate() { ... }
  }
  public class Individual : IComparable<Individual>
  {
    // Member fields
    public Individual(int numGenes, 
      double minGene, double maxGene,
      double mutateRate, double precision) { ... }
    public void Mutate() { ... }
    public int CompareTo(Individual other) { ... }
  }
  public class Problem
  {
    public static double Fitness(double[] chromosome) { ... }
  }
} // NS

In addition to the main program class, the EvolutionaryOptimization demo has three program-defined classes. Class Evolver contains most of the algorithm logic. Class Individual models a possible solution to the minimization problem. Class Problem defines the function to be minimized, Schwefel’s function in this case. An alternative design structure would be to place the Individual class inside the Evolver class.

The Individual Class

The definition of class Individual begins:

public class Individual : IComparable<Individual>
{
  public double[] chromosome;
  public double fitness;
  private int numGenes;
  private double minGene;
  private double maxGene;
  private double mutateRate;
  private double precision;
  static Random rnd = new Random(0);
...

The class inherits from the IComparable interface so that arrays of Individual objects can be automatically sorted by their fitness. The class has eight data members. The chromosome array represents a possible solution to the target problem. Notice that chromosome is an array of double, rather than an array with some form of bit representation typically used by genetic algorithms. Evolutionary optimization algorithms are sometimes called real-valued genetic algorithms.

The fitness field is a measure of how good the chromosome-solution is. For minimization problems, smaller values of fitness are better than larger values. For simplicity, chromosome and fitness are declared with public scope so that they’re visible to the logic in the Evolver class. A gene is one value in the chromosome array and the numGenes field holds the number of real values in a possible solution. For Schwefel’s function, this value will be set to 2. With many numeric optimization problems, minimum and maximum values for each gene can be specified and these values are stored in minGene and maxGene. If these values are not known, minGene and maxGene can be set to double.MinValue and double.MaxValue. I’ll explain the mutateRate and precision fields when I present the code that uses them.

The Individual class definition continues with the class constructor:

public Individual(int numGenes, double minGene, double maxGene,
  double mutateRate, double precision)
{
  this.numGenes = numGenes;
  this.minGene = minGene;
  this.maxGene = maxGene;
  this.mutateRate = mutateRate;
  this.precision = precision;
  this.chromosome = new double[numGenes];
  for (int i = 0; i < this.chromosome.Length; ++i)
    this.chromosome[i] = (maxGene - minGene) 
    * rnd.NextDouble() + minGene;
  this.fitness = Problem.Fitness(chromosome);
}

The constructor allocates memory for the chromosome array and assigns random values in the range (minGene, maxGene) to each gene cell. Notice that the value of the fitness field is set by calling the externally defined Fitness method. Alternatively, you could pass into the constructor a reference to the Fitness method via a Delegate. The Mutate method is defined as such:

public void Mutate()
{
  double hi = precision * maxGene;
  double lo = -hi;
  for (int i = 0; i < chromosome.Length; ++i) {
    if (rnd.NextDouble() < mutateRate)
      chromosome[i] += (hi - lo) * rnd.NextDouble() + lo;
  }
}

The mutation operation walks through the chromosome array, changing randomly selected genes to random values in the range (lo, hi). The range of values to assign is determined by the precision and maxGene member values. In the example in Figure 2, precision is set to 0.0001 and maxGene is set to 500. The highest possible value for a gene mutation is 0.0001 * 500 = 0.05, which means that if a gene is mutated, its new value will be the old value plus or minus a random value between -0.05 and +0.05. Notice that the precision value corresponds to the number of decimals in the solution; this is a reasonable heuristic to use for the precision value. The mutation rate value controls how many genes in the chromosome will be modified. One heuristic for the value of the mutateRate field is to use 1.0 / numGenes, so that on average one gene in the chromosome will be mutated every time Mutate is called.

The Individual class definition concludes with the CompareTo method:

...
  public int CompareTo(Individual other)
  {
    if (this.fitness < other.fitness) return -1;
    else if (this.fitness > other.fitness) return 1;
    else return 0;
  }
}

The CompareTo method defines a default sort ordering for Individual objects, in this case from smallest fitness (best) to largest fitness.

The Problem Class

The Problem class encapsulates the target problem for the evolutionary algorithm:

public class Problem
{
  public static double Fitness(double[] chromosome)
  {
    double result = 0.0;
    for (int i = 0; i < chromosome.Length; ++i)
      result += (-1.0 * chromosome[i]) *
        Math.Sin(Math.Sqrt(Math.Abs(chromosome[i])));
    return result;
  }
}

Because a chromosome array represents a possible solution, it’s passed as an input parameter to the Fitness method. With arbitrary minimization problems, the target function to be minimized is usually called a cost function. In the context of evolutionary and genetic algorithms, however, the function is usually called a fitness function. Notice the terminology is a bit awkward because lower values of fitness are better than higher values. In this example, the fitness function is completely self-contained. In many optimization problems, the fitness function requires additional input parameters, such as a matrix of data or a reference to an external data file.

The Evolver Class

The Evolver class definition begins:

public class Evolver
{
  private int popSize;
  private Individual[] population;
  private int numGenes;
  private double minGene;
  private double maxGene;
  private double mutateRate;
  private double precision;
  private double tau;
  private int[] indexes;
  private int maxGeneration;
  private static Random rnd = null;
...

The popSize member holds the number of individuals in the population. Larger values of popSize tend to increase the algorithm’s accuracy at the expense of speed. In general, evolutionary algorithms are much faster than ordinary genetic algorithms because evolutionary algorithms work on real values and don’t incur the overhead of converting and manipulating bit representations. The heart of the Evolver class is an array of Individual objects named population. The tau and the indexes members are used by the Selection method, as you’ll see shortly.

The Evolver class definition continues with the constructor definition shown in Figure 4.

Figure 4 Evolver Class Constructor

public Evolver(int popSize, int numGenes, double minGene, double maxGene,
  double mutateRate, double precision, double tau, int maxGeneration)
{
  this.popSize = popSize;
  this.population = new Individual[popSize];
  for (int i = 0; i < population.Length; ++i)
    population[i] = new Individual(numGenes, minGene, maxGene,
      mutateRate, precision);
  this.numGenes = numGenes;
  this.minGene = minGene;
  this.maxGene = maxGene;
  this.mutateRate = mutateRate;
  this.precision = precision;
  this.tau = tau;
  this.indexes = new int[popSize];
  for (int i = 0; i < indexes.Length; ++i)
    this.indexes[i] = i;
  this.maxGeneration = maxGeneration;
  rnd = new Random(0);
}

The constructor allocates memory for the population array and then uses the Individual constructor to fill the array with individuals that have random gene values. The array named indexes is used by the Select method, which picks two parents. I’ll explain indexes later, but notice the constructor allocates one cell per individual and sequentially assigns integers from 0 to popSize-1.

The Evolve method, listed in Figure 5, is surprisingly short.

Figure 5 The Evolve Method

public double[] Evolve()
{
  double bestFitness = this.population[0].fitness;
  double[] bestChomosome = new double[numGenes];
  population[0].chromosome.CopyTo(bestChomosome, 0);
  int gen = 0;
  while (gen < maxGeneration)  {
    Individual[] parents = Select(2);
    Individual[] children = Reproduce(parents[0], parents[1]);
    Accept(children[0], children[1]);
    Immigrate();
    for (int i = popSize - 3; i < popSize; ++i)  {
      if (population[i].fitness < bestFitness)  {
        bestFitness = population[i].fitness;
        population[i].chromosome.CopyTo(bestChomosome, 0);
      }
    }
    ++gen;
  }
  return bestChomosome;
}

The Evolve method returns the best solution found as an array of type double. As an alternative, you could return an Individual object where the chromosome holds the best solution found. The Evolve method begins by initializing the best fitness and best chromosomes to the first ones in the population. The method iterates exactly maxGenerations times, using gen (generation) as a loop counter. One of several alternatives is to stop when no improvement has occurred after some number of iterations. The Select method returns two good, but not necessarily best, individuals from the population. These two parents are passed to Reproduce, which creates and returns two children. The Accept method places the two children into the population, replacing two existing individuals. The Immigrate method generates a new random individual and places it into the population. The new population is then scanned to see if any of the three new individuals in the population is a new best solution.

The Select method is defined as:

private Individual[] Select(int n)
{
  int tournSize = (int)(tau * popSize);
  if (tournSize < n) tournSize = n;
  Individual[] candidates = new Individual[tournSize];
  ShuffleIndexes();
  for (int i = 0; i < tournSize; ++i)
    candidates[i] = population[indexes[i]];
  Array.Sort(candidates);
  Individual[] results = new Individual[n];
  for (int i = 0; i < n; ++i)
    results[i] = candidates[i];
  return results;
}

The method accepts the number of good individuals to select and returns them in an array of type Individual. To minimize the amount of code, I’ve omitted normal error checking, such as verifying that the number of individuals requested is less than the population size. The Select method uses a technique called tournament selection. A subset of random candidate individuals is generated and the best n of them are returned. The number of candidates is computed into variable tournSize, which is some fraction, tau, of the total population size. Larger values of tau increase the probability that the best two individuals will be selected.

Recall that the Evolver class has a member array named indexes with values 0..popSize-1. The ShuffleIndexes helper method rearranges the values in array indexes into random order. The top n of these random indexes are used to pick candidates from the population. The Array.Sort method then sorts the candidate individuals from smallest (best) fitness to largest. The top n individuals of the sorted candidates are returned. There are many different evolutionary selection algorithms. An advantage of tournament selection compared to most other techniques is that selection pressure can be tuned by modifying the tau parameter.

The ShuffleIndexes helper method uses the standard Fisher-Yates shuffle algorithm:

private void ShuffleIndexes()
{
  for (int i = 0; i < this.indexes.Length; ++i) {
    int r = rnd.Next(i, indexes.Length);
    int tmp = indexes[r];
    indexes[r] = indexes[i];
    indexes[i] = tmp;
  }
}

The Reproduce method is listed in Figure 6. The method begins by generating a random crossover point. The indexing is a little bit tricky, but child1 is created from the left part of parent1 and the right part of parent2. Child2 is created from the left part of parent2 and the right part of parent1. The idea is illustrated in Figure 7, where there are two parents with five genes and the crossover point is 2. A common design alternative is to use multiple crossover points. After the Individual children objects are created they are mutated and their new fitness is computed.

Figure 6 The Reproduce Method

private Individual[] Reproduce(Individual parent1, Individual parent2)
{
  int cross = rnd.Next(0, numGenes - 1);
  Individual child1 = new Individual(numGenes, minGene, maxGene,
    mutateRate, precision);
  Individual child2 = new Individual(numGenes, minGene, maxGene,
     mutateRate, precision);
  for (int i = 0; i <= cross; ++i)
    child1.chromosome[i] = parent1.chromosome[i];
  for (int i = cross + 1; i < numGenes; ++i)
    child2.chromosome[i] = parent1.chromosome[i];
  for (int i = 0; i <= cross; ++i)
    child2.chromosome[i] = parent2.chromosome[i];
  for (int i = cross + 1; i < numGenes; ++i)
    child1.chromosome[i] = parent2.chromosome[i];
  child1.Mutate(); child2.Mutate();
  child1.fitness = Problem.Fitness(child1.chromosome);
  child2.fitness = Problem.Fitness(child2.chromosome);
  Individual[] result = new Individual[2];
  result[0] = child1; result[1] = child2;
  return result;
}

Figure 7 The Crossover Mechanism

The Accept method places the two child individuals created by Reproduce into the population:

private void Accept(Individual child1, Individual child2)
{
  Array.Sort(this.population);
  population[popSize - 1] = child1;
  population[popSize - 2] = child2;
}

The population array is sorted by fitness, which places the two worst individuals in the last two cells of the array, where they’re then replaced by the children. There are many alternative approaches to choosing which two individuals die. One possibility is to use a technique called roulette wheel selection to probabilistically select the two individuals to replace, where worse individuals have a higher probability of being replaced.

The Immigrate method generates a new random individual and places it into the population just above the location of the two children that were just generated (immigration helps prevent evolutionary algorithms from getting stuck in local minima solutions; design alternatives include creating more than one immigrant and placing the immigrant at a random location in the population):

private void Immigrate()
{
  Individual immigrant =
    new Individual(numGenes, minGene, maxGene, mutateRate, precision);
  population[popSize - 3] = immigrant;
}

Starting Point for Experimentation

In this article, evolutionary optimization is used to estimate the minimum value of a mathematical equation. Although evolutionary optimization algorithms can be used in this way, more often they’re used to find the values for a set of numeric parameters in a larger optimization problem for which there’s no effective deterministic algorithm. For example, when using a neural network to classify existing data in order to make predictions on future data, the main challenge is to determine a set of neuron weights and biases. Using an evolutionary optimization algorithm is one approach for estimating the optimal weight and bias values. In most cases, evolutionary optimization algorithms are not well-suited for use on non-numerical combinatorial optimization problems such as the Traveling Salesman Problem, where the goal is to find the combination of cities with the shortest total path length.

Evolutionary algorithms, like pure genetic algorithms, are meta-heuristics. This means they’re a general framework and a set of conceptual guidelines that can be used to create a specific algorithm to solve a specific problem. So the example presented in this article can be viewed more as a starting point for experimentation and creating evolutionary optimization code rather than a fixed, static code base.

Dr. James McCaffrey works for Volt Information Sciences Inc., where he manages technical training for software engineers working at the Microsoft Redmond, Wash., campus. He has worked on several Microsoft products including Internet Explorer and MSN Search. He’s the author of “.NET Test Automation Recipes” (Apress, 2006), and can be reached at jammc@microsoft.com.

Thanks to the following Microsoft technical expert for reviewing this article: Anne Loomis Thompson