February 2016

Volume 31 Number 2

[Test Run]

Roach Infestation Optimization

By James McCaffrey

James McCaffreyRoach infestation optimization is a numerical optimization algorithm that’s loosely based on the behavior of common cockroaches such as Periplaneta americana. Yes, you read that correctly. Let me explain.

In machine learning, a numerical optimization algorithm is often used to find a set of values for variables (usually called weights) that minimize some measure of error. For example, logistic regression classification uses a math equation where, if there are n predictor variables, there are n+1 weight values that must be determined. The process of determining the values of the weights is called training the model. The idea is to use a collection of training data that has known correct output values. A numerical optimization algorithm is used to find the values of the weights that minimize the error between computed output values and known correct output values.

There are many different numerical optimization algorithms. The most common are based on calculus derivatives, but there are also algorithms that are based on the behaviors of natural systems. These are sometimes called bio-inspired algorithms. This article explains a relatively new (first published in 2008) technique called roach infestation optimization (RIO). RIO optimization loosely models the foraging and aggregating behavior of a collection of roaches.

A good way to get an idea of what RIO is and to see where this article is headed is to take a look at the demo program in Figure 1. The goal of the demo is to use RIO to find the minimum value of the Rastrigin function with eight input variables. The Rastrigin function is a standard benchmark function used to evaluate the effectiveness of numerical optimization algorithms. The function has a known minimum value of 0.0 located at x = (0, 0, . . 0)  where the number of 0 values is equal to the number of input values.

The Roach Infestation Optimization Algorithm in Action
Figure 1 The Roach Infestation Optimization Algorithm in Action

The Rastrigin function is difficult for most optimization algorithms because it has many peaks and valleys that create local minimum values that can trap the algorithms. It’s not possible to easily visualize the Rastrigin function with eight input values, but you can get a good idea of the function’s characteristics by examining a graph of the function for two input values, shown in Figure 2.

The Rastrigin Function with Two Input Variables
Figure 2 The Rastrigin Function with Two Input Variables

The demo program sets the number of roaches to 20. Each simulated roach has a position that represents a possible solution to the minimization problem. More roaches increase the chance of finding the true optimal solution at the expense of performance. RIO typically uses 10 to 100 roaches.

RIO is an iterative process and requires a maximum loop counter value. The demo sets the maximum value to 10,000 iterations. The maximum number of iterations will vary from problem to problem, but values between 1,000 and 100,000 are common. RIO has an element of randomness and the demo sets the seed value for the random number generator to 6, because 6 gave representative demo output.

In the demo shown in Figure 1, the best (smallest) error associated with the best roach position found so far was displayed every 500 time units. After the algorithm finished, the best position found for any roach was x = (0, 0, 0, 0, 0, 0, 0, 0), which is, in fact, the correct answer. But notice if the maximum number of iterations had been set to 5,000 instead of 10,000, RIO would not have found the one global minimum. RIO, like almost all numerical optimization algorithms, is not guaranteed to find an optimal solution in practical scenarios.

This article assumes you have at least intermediate programming skills but doesn’t assume you know anything about numerical optimization or the roach infestation optimization algorithm. The demo program is coded using C#, but you shouldn’t have too much difficulty refactoring the code to another language such as Visual Basic or JavaScript.

The complete demo code, with a few minor edits to save space, is presented in this article. The demo is also available in the code download that accompanies this article. The demo code has all normal error checking removed to keep the main ideas as clear as possible and the size of the code small.

Overall Program Structure

The overall program structure is presented in Figure 3. To create the demo, I launched Visual Studio and created a new C# console application named RoachOptimization. The demo has no significant Microsoft .NET Framework dependencies so any recent version of Visual Studio will work.

Figure 3 Roach Optimization Demo Program Structure

using System;
namespace RoachOptimization
{
  class RoachOptimizationProgram
  {
    static void Main(string[] args)
    {
      Console.WriteLine("Begin roach optimization demo");
      // Code here
      Console.WriteLine("End roach demo");
      Console.ReadLine();
    }
    static double[] SolveRastrigin(int dim, int numRoaches,
      int tMax, int rndSeed) { . . }
    public static double Error(double[] x) { . . }
    static double Distance(double[] pos1,
      double[] pos2) { . . }
    static void Shuffle(int[] indices,
      int seed) { . . }
  } // Program
  public class Roach
  {
    // Defined here
  }
}

After the template code loaded into the Visual Studio editor, in the Solution Explorer window I renamed file Program.cs to the more descriptive RoachOptimizationProgram.cs and Visual Studio automatically renamed class Program for me. At the top of the source code, I deleted all unnecessary using statements, leaving just the single reference to System.

I coded the demo using a mostly static-method approach rather than a full Object-Oriented Programming (OOP) approach. The demo has all the control logic in the Main method. It begins by setting up the algorithm input parameter values:

Console.WriteLine("Begin roach optimization demo");
int dim = 8;
int numRoaches = 20;
int tMax = 10000;
int rndSeed = 6;

The dim variable specifies the number of input values for Rastrigin’s function. In a non-demo machine learning scenario, the dim represents the number of weights in the prediction model. The number of roaches is set to 20. Variable tMax is the maximum number of iterations. RIO, like most bio-inspired algorithms, is probabilistic. Here, a random variable seed value is set to 6.

Next, the RIO parameters are echoed to the console:

Console.WriteLine("Goal is to minimize Rastrigin's " +
  "function in " + dim + " dimensions");
Console.WriteLine("Problem has known min value = 0.0 " +
  "at (0, 0, .. 0) ");
Console.WriteLine("Setting number of roaches = " +
  numRoaches);
Console.WriteLine("Setting maximum iterations = " +
  tMax);
Console.WriteLine("Setting random seed = " + rndSeed);;

The roach optimization algorithm is called like so:

Console.WriteLine("Starting roach optimization ");
double[] answer = SolveRastrigin(dim, numRoaches,
  tMax, rndSeed);
Console.WriteLine("Roach algorithm completed");

The Main method concludes by displaying the results:

double err = Error(answer);
Console.WriteLine("Best error found = " +
  err.ToString("F6") + " at: ");
for (int i = 0; i < dim; ++i)
  Console.Write(answer[i].ToString("F4") + " ");
Console.WriteLine("");
Console.WriteLine("End roach optimization demo");
Console.ReadLine();

The roach optimization algorithm presented in this article is based on the 2008 research paper, “Roach Infestation Optimization,” by T. Havens, C. Spain, N. Salmon and J. Keller. You can find the paper in several locations on the Web.

Understanding the Roach Optimization Algorithm

In RIO, there’s a collection of simulated roaches. Each roach has a position in n-dimensions that represents a possible solution to a minimization problem. Simulated roach behavior is based on three behaviors of real roaches. First, roaches tend to move toward dark areas. Second, roaches like to group together. Third, when roaches get hungry, they will leave their current location to search for food. Exactly how simulated roaches model these behaviors will become clear when the code is presented.

Expressed in very high-level pseudocode, the roach optimization algorithm is presented in Figure 4. At first glance, the algorithm seems quite simple; however, there are a lot of details that aren’t apparent in the pseudo-code.

Figure 4 The Roach Algorithm at a High Level

initialize n roaches to random positions
loop tMax times
  compute distances between all roaches
  compute median distance
  for-each roach
    compute number neighbors
    exchange data with neighbors
    if not hungry
      compute new velocity
      compute new position
      check if new best position
    else if hungry
      relocate to new position
    end-if
  end-for
end loop
return best position found

The Roach Class

The definition of program-defined class Roach begins as:

public class Roach
{
  public int dim;
  public double[] position;
  public double[] velocity;
  public double error;
...

The dim field is the problem dimension, which is 8 in the case of the demo. The position field is an array that conceptually represents the location of a roach, and also represents a possible solution to a minimization problem. The velocity field is an array of values that determine where the roach will move in the next time unit. For example, if dim = 2 and position = (5.0, 3.0) and velocity = (1.0, -2.0), the roach will move to (6.0, 1.0). The error field is the error associated with the current position.

The definition continues:

public double[] personalBestPos;
public double personalBestErr;
public double[] groupBestPos;
public int hunger;
private Random rnd;

The personalBestPos field holds the best position found by the simulated roach at any point during its movement. The personalBestError holds the error that corresponds to personalBestPos. The groupBestPos field holds the best position found by any of a group of neighbor roaches. The hunger field is an integer value that represents how hungry the roach is. The Random object rnd is used to initialize a roach to a random position.

The Roach class definition continues by defining a constructor:

public Roach(int dim, double minX, double maxX,
  int tHunger, int rndSeed)
{
  this.dim = dim;
  this.position = new double[dim];
  this.velocity = new double[dim];
  this.personalBestPos = new double[dim];
  this.groupBestPos = new double[dim];
...

The minX and maxX parameters are used to set limits for the components of the position vector. Parameter tHunger is a maximum hunger value. When a roach’s hunger reaches tHunger, the roach will move to a new location. The constructor allocates space for the four array fields.

Next, the constructor initializes the Random object and then uses it to set the initial hunger value to a random value:

this.rnd = new Random(rndSeed);
this.hunger = this.rnd.Next(0, tHunger);

Next, the constructor sets the initial position, velocity, personal best location and group best location arrays to random values between minX and maxX:

for (int i = 0; i < dim; ++i) {
  this.position[i] = (maxX - minX) * rnd.NextDouble() + minX;
  this.velocity[i] = (maxX - minX) * rnd.NextDouble() + minX;
  this.personalBestPos[i] = this.position[i];
  this.groupBestPos[i] = this.position[i];
}

The Roach constructor definition finishes like so:

...
    error = RoachOptimizationProgram.Error(this.position);
    personalBestErr = this.error;
  } // ctor
} // Roach

The error field is set by calling external method Error, defined in the calling program class. An alternative approach is to compute the error value before calling the constructor, then pass the error value in as a parameter to the constructor.

Implementing the Roach Algorithm

The RIO algorithm is contained in static method SolveRastrigin, whose definition begins as:

static double[] SolveRastrigin(int dim, int numRoaches,
  int tMax, int rndSeed)
{
  double C0 = 0.7;
  double C1 = 1.43;
  double[] A = new double[] { 0.2, 0.3, 0.4 };
...

Constants C0 and C1 are used when computing a roach’s new velocity, as you’ll see shortly. The values used, 0.7 and 1.43, come from particle swarm theory. You might want to investigate other values.

Roaches that are close to each other are called neighbors. Neighbors will sometimes, but not always, exchange information. The array named A holds probabilities. The first value, 0.2, is the probability that a roach with one neighbor will exchange information with that neighbor. The second value, 0.3, is the probability that a roach will exchange information if it has two neighbors. The third value, 0.4, is the probability of exchanging information if a roach has three or more neighbors.

These probability values, (0.2, 0.3, 0.4), are not the ones that were used in the source research paper. The original study used probability values of (0.49, 0.63, 0.65), which correspond to actual roach behavior as described in a biology research paper. I discovered the probabilities that match real roach behavior weren’t as effective as the artificial probability values used in the demo. The definition of method SolveRastrigin continues with:

int tHunger = tMax / 10;
double minX = -10.0;
double maxX = 10.0;
int extinction = tMax / 4;
Random rnd = new Random(rndSeed);

Local variable tHunger determines when a roach will become hungry and leave its current location and neighbors. For example, if tMax is 10,000 as in the demo, then, when a roach’s hunger value reaches tMax / 10 = 1,000, the roach will move to a new position.

Variables minX and maxX set limits on a roach’s position vector. The values (-10.0, +10.0) are normal for machine learning weights and also correspond to usual limits for the Rastrigin function. For example, for a problem with dimension = 3, the position vector is an array of three cells, all of which will have values between -10.0 and +10.0.

Local variable extinction determines when all the roaches will die and be reborn at new positions. This mechanism is a restart and helps prevent the algorithm from getting stuck at a non-­optimal solution.

Local Random object rnd is used by the algorithm for three purposes. The order in which the roaches are processed is randomized; the exchange of information between neighbor roaches occurs with a certain probability; and there’s a random component to each roach’s new velocity. Method SolveRastrigin continues:

Roach[] herd = new Roach[numRoaches];
for (int i = 0; i < numRoaches; ++i)
  herd[i] = new Roach(dim, minX, maxX, tHunger, i);

The collection of simulated roaches is an array named herd. There are all kinds of interesting names for collections of animals, such as a pod of whales and a gaggle of geese. As a matter of fact, a collection of cockroaches is called an intrusion of roaches. (This information could make a useful bar bet for you.)

Notice that the loop index variable, i, is passed to the Roach constructor. The index variable acts as a random seed for the Random object that’s part of the Roach class definition. Passing a loop index variable to be used as a random seed value is a common technique in machine learning. The method definition continues with:

int t = 0;  // Loop counter (time)
int[] indices = new int[numRoaches];
for (int i = 0; i < numRoaches; ++i)
  indices[i] = i;
double bestError = double.MaxValue;
double[] bestPosition = new double[numRoaches];
int displayInterval = tMax / 20;

The array named indices holds values (0, 1 2, . . numRoaches-1). The array will be shuffled in the main processing loop so that the order in which the roaches are processed is different every time. Local variables bestPosition and bestError hold the best position/solution and associated error found by any roach at any time. Local variable displayInterval determines when a progress message will be displayed to the console. Next, an array-of-arrays style matrix is instantiated to hold the distance between all pairs of roaches:

double[][] distances = new double[numRoaches][];
for (int i = 0; i < numRoaches; ++i)
  distances[i] = new double[numRoaches];

For example, if distances[0][3] = 7.89 then distances[3][0] is also 7.89 and the distance between roach 0 and roach 3 is 7.89. Note that the redundant data isn’t serious because in most cases you won’t have a huge number of roaches. Next, the main processing loop starts:

while (t < tMax)
{
  if (t > 0 && t % displayInterval == 0) {
    Console.Write("time = " + t.ToString().PadLeft(6));
    Console.WriteLine(" best error = " +
      bestError.ToString("F5"));
  }
...

Then the distances between roaches is calculated:

for (int i = 0; i < numRoaches - 1; ++i) {
  for (int j = i + 1; j < numRoaches; ++j) {
    double d = Distance(herd[i].position,
                 herd[j].position);
    distances[i][j] = distances[j][i] = d;
  }
}

The distance values are calculated using helper method Distance, which will be presented shortly. The array indexing here is a bit tricky but I’ll leave it to you to verify if you’re curious. Next, the distance values are copied from the distances matrix into an array so they can be sorted and then the median distance can be determined:

double[] sortedDists =
  new double[numRoaches * (numRoaches - 1) / 2];
int k = 0;
for (int i = 0; i < numRoaches - 1; ++i) {
  for (int j = i + 1; j < numRoaches; ++j) {
    sortedDists[k++] = distances[i][j];
  }
}

The size of the array is best explained by example. Suppose there are n = 4 roaches. Then the distances matrix will have size 4x4. The values on the diagonal, [0][0], [1][1], [2][2] and [3][3] will be 0.0 and shouldn’t be included. That leaves the 6 values at [0][1], [0][2], [0][3], [1][2], [1][3] and [2][3]. You don’t need the identical distance values at symmetric indices [1][0], [2][0] and so on. So, there are n * (n-1) / 2 distinct distance values. Next, the median distance between roaches is calculated and the roach indices are randomized:

Array.Sort(sortedDists);
double medianDist = sortedDists[sortedDists.Length / 4];
Shuffle(indices, t); // t is used as a random seed

Here, because I divide by 4, the distance is one-fourth from the beginning of the sorted distances array so the result really isn’t a median, it’s a quartile. The original research paper used the actual median (by dividing by 2), but I found that the quartile worked better than the median. The idea is that the median or quartile determines how many neighbors a roach has, which in turn influences how closely the roaches are grouped. Using the quartile keeps the roaches further apart, giving them a better chance to find a tricky global minimum value for the target function to minimize.

The roach indices are randomized using helper method Shuffle, which will be presented shortly. Notice that the time index variable, t, is passed to the Shuffle method and acts as a seed for the Shuffle random number generator. Next, the loop to process each roach begins:

for (int i = 0; i < numRoaches; ++i)  // Each roach
{
  int idx = indices[i]; // Roach index
  Roach curr = herd[idx]; // Ref to current roach
  int numNeighbors = 0;
...

A reference to the current roach, herd[idx], is created and named curr. This is just for convenience. Next, the number of neighbors of the current roach is calculated:

for (int j = 0; j < numRoaches; ++j) {
  if (j == idx) continue;
  double d = distances[idx][j];
  if (d < medianDist) // Is a neighbor
    ++numNeighbors;
}

The condition j == idx is used to prevent the current roach from being counted as a neighbor to itself. Next, the effective number of neighbors is determined:

int effectiveNeighbors = numNeighbors;
if (effectiveNeighbors >= 3)
  effectiveNeighbors = 3;

Recall that the purpose of calculating the number of neighbors is to determine the probability that neighbors will exchange information. But the probability of information exchange is the same for 3 or more neighbors. Next, the algorithm determines if information should be exchanged:

for (int j = 0; j < numRoaches; ++j) {
  if (j == idx) continue;
  if (effectiveNeighbors == 0) continue;
  double prob = rnd.NextDouble();
  if (prob > A[effectiveNeighbors - 1]) continue;
...

The current roach is compared against all other roaches. If the current roach has no neighbors then there’s no information exchange. If the current roach has one or more neighbors, the A array of probabilities is used to decide if information should be exchanged or not. Next:

double d = distances[idx][j];
if (d < medianDist) { // a neighbor
  if (curr.error < herd[j].error) { // curr better than [j]
    for (int p = 0; p < dim; ++p) {
      herd[j].groupBestPos[p] = curr.personalBestPos[p];
      curr.groupBestPos[p] = curr.personalBestPos[p];
    }
  }
...

When information exchange between neighbor roaches occurs, the group best position and associated error of the better of the two roaches is copied to the worse roach. The second branch of the information exchange code is:

...
    else { // [j] is better than curr
      for (int p = 0; p < dim; ++p) {
        curr.groupBestPos[p] = herd[j].personalBestPos[p];
        herd[j].groupBestPos[p] = herd[j].personalBestPos[p];
      }
    }
  } // If a neighbor
} // j, each neighbor

After information exchange between neighbor roaches is taken care of, the current roach moves if it’s not hungry. The first part of the move process is to calculate the new velocity of the current roach:

if (curr.hunger < tHunger) {
  for (int p = 0; p < dim; ++p)
    curr.velocity[p] = (C0 * curr.velocity[p]) +
     (C1 * rnd.NextDouble() * (curr.personalBestPos[p] -
       curr.position[p])) +
     (C1 * rnd.NextDouble() * (curr.groupBestPos[p] -
       curr.position[p]));

The new velocity has three components. The first component is the old velocity, which is sometimes called inertia in particle swarm terminology. Inertia acts to keep a roach moving in the same direction. The second component is the roach’s best known position, which is sometimes called  the cognitive term. The cognitive component prevents a roach from moving to bad positions. The third component is the best known position of the roach’s neighbors. This component is more or less unique to RIO and doesn’t have a standard name. This third term acts to keep groups of roaches together.

After the velocity of the current roach has been calculated, the roach is moved:

for (int p = 0; p < dim; ++p)
  curr.position[p] = curr.position[p] + curr.velocity[p];
double e = Error(curr.position);
curr.error = e;

After the current roach is moved, its new position is checked to see if it’s a new best for the roach:

if (curr.error < curr.personalBestErr) {
  curr.personalBestErr = curr.error;
  for (int p = 0; p < dim; ++p)
    curr.personalBestPos[p] = curr.position[p];
}

Next, the new position is checked to see if it’s a new global best, and the hunger counter is incremented:

if (curr.error < bestError) {
    bestError = curr.error;
    for (int p = 0; p < dim; ++p)
      bestPosition[p] = curr.position[p];
  }
  ++curr.hunger;
} // If not hungry

The each-roach loop finishes by dealing with hungry roaches:

else { // Roach is hungry
  {
    herd[idx] = new Roach(dim, minX, maxX, tHunger, t);
  }
} // j each roach

If a roach’s hunger counter reaches the tHunger threshold, the roach moves to a new, random location. After all roaches have been processed, the algorithm finishes by checking if it’s time for a global extinction, incrementing the main loop time counter and returning the best position found by any roach:

if (t > 0 && t % extinction == 0) { // Extinction?
      Console.WriteLine("Mass extinction at t = " +
        t.ToString().PadLeft(6));
      for (int i = 0; i < numRoaches; ++i)
        herd[i] = new Roach(dim, minX, maxX, tHunger, i);
    }
    ++t;
  } // Main while loop
  return bestPosition;
} // Solve

Notice that the algorithm is contained in a method named Solve­Rastrigin rather than a more general name such as Solve. The idea here is that RIO is really a meta-heuristic, rather than a prescriptive algorithm, and needs to be customized to whatever minimization problem you’re trying to solve.

The Helper Methods

Method SolveRastrigin calls helper methods Distance, Error and Shuffle. Helper method Distance returns the Euclidean distance (square root of the sum of squared term differences):

static double Distance(double[] pos1, double[] pos2)
{
  double sum = 0.0;
  for (int i = 0; i < pos1.Length; ++i)
    sum += (pos1[i] - pos2[i]) * (pos1[i] - pos2[i]);
  return Math.Sqrt(sum);
}

Helper method Error returns the squared difference between the calculated value of Rastrigin’s function at a given roach position x and the true minimum value of zero:

public static double Error(double[] x)
{
  double trueMin = 0.0; double rastrigin = 0.0;
  for (int i = 0; i < x.Length; ++i) {
    double xi = x[i];
    rastrigin += (xi * xi) - (10 * Math.Cos(2 * Math.PI * xi)) + 10;
  }
  return (rastrigin - trueMin) * (rastrigin - trueMin);
}

Method Shuffle randomizes the order of the values in an array using the Fisher-Yates mini-algorithm:

static void Shuffle(int[] indices, int seed)
{
  Random rnd = new Random(seed);
  for (int i = 0; i < indices.Length; ++i) {
    int r = rnd.Next(i, indices.Length);
    int tmp = indices[r]; indices[r] = indices[i];
    indices[i] = tmp;
  }
}

The original research version of RIO doesn’t randomize the roach processing order, but I’ve found that this approach almost always improves the accuracy of bio-inspired optimization algorithms.

A Few Comments

So, just how effective is roach-inspired optimization compared to other numerical optimization techniques? Based on a few limited experiments I’ve performed, RIO appears to solve some benchmark problems better than other algorithms, but is weaker on other problems. I conclude that while RIO is not a fantastic new general-purpose optimization algorithm, it does have promise and could be useful for certain specific minimization problems. And for sure, roach infestation optimization has one of the most unusual names in all of computer science.


Dr. James McCaffrey works for Microsoft Research in Redmond, Wash. He has worked on several Microsoft products including Internet Explorer and Bing. Dr. McCaffrey can be reached at jammc@microsoft.com.

Thanks to the following Microsoft technical experts for reviewing this article: Todd Bello, Marciano Moreno Diaz Covarrubias and Alisson Sol