Toward Competent Genetic Algorithms: Linkage Learning

We endorse the term Competent Genetic Algorithms to those GAs that solve hard problems quickly, accurately and reliably [1]. We already know that GAs process building blocks (BB): low order---few specific bits---and low length---small distance between specific bits---schema with above average fitness. However, crossover may disturb these BB. Ideally, crossover should identify the fundamental BB of the problem at hand and mix them well, but in the real world this phenomenon scarcely happens. In order to tackle this issue a radical approach is required: remove the classic selecto-recombinative operators out of the GA loop and develop strategies that automatically identify BBs ensuring that these are not disrupted. Researchers call this strategy as Linkage Learning [2].

Estimation of Distribution Algorithms (EDAs) use probabilistic models that perform the task. They learn a probabilistic model and then build new solutions by sampling candidates from the model.

One of the simplest forms of EDA is the so-called compact genetic algorithm (cGA, [2]). CGA uses a probability vector to represent populations of strings. Furthermore, population is completely replaced by this probability vector---i.e., no explicit population is stored in memory, hence its name. At each iteration cGA generates two solutions out of the probability vector. Then it evaluates these two solutions and finally it updates the probability vector according to the fitness computation.


I coded a simple cGA in R solving the trap-n function: a simple boolean function that is deceptive---i.e., it is misleading toward local optima [1].  The situation is the following: for n = 5, the learner has to reach the chromosome 11111, but the fitness computation misleads the search towards 00000 (a local optima!). Figure 1 depicts this situation in the trap-5 function. This problem is hard for a traditional GA (specially for the simple GA), but cGA solves it quickly and accurately.



In the following I provide the code for the cGA in R. 


1:  #Class definition.  
2:  setClass( "Individual", representation( chromosome="array", fitness="numeric", size="numeric" ) )  
3:  #Constructor.  
4:  Individual <- function( indSize, p ) {  
5:      ind <- array( 0 , indSize )  
6:      for( i in 1:indSize ) {  
7:          if( rbinom( 1, 1, 0.5 ) < p[i] ) {  
8:              ind[i] = 1  
9:          }  
10:      }  
11:      #Trap-5 function fitness computation.  
12:      fit <- 0  
13:      no <- sum( ind )  
14:      if( no <= (indSize - 1) ) {  
15:          fit <- (indSize - 1 ) - no  
16:      } else {  
17:          fit <- indSize  
18:      }  
19:        
20:      new( "Individual", chromosome = as.array( ind ), fitness = fit, size = indSize )  
21:  }  
22:  #Accesor Methods.  
23:  getFitness <- function( obj ) obj@fitness  
24:  getSize <- function( obj ) obj@size  
25:  getChromosome <- function( obj ) obj@chromosome  
26:    
27:  #It updates the probability vector.  
28:  updatePVector <- function( x, y, p, indSize, popSize ) {  
29:      d <- 1 / popSize  
30:      for( i in 1:indSize ) {  
31:          if( x@chromosome[i] > y@chromosome[i] ) {  
32:              p[i] <- p[i] + d  
33:              if( p[i] > 1.0 ) {  
34:                  p[i] = 1.0  
35:              }  
36:          } else if( x@chromosome[i] > y@chromosome[i] ) {  
37:              p[i] <- p[i] - d  
38:              if( p[i] < 0.0 ) {  
39:                  p[i] = 0.0  
40:              }  
41:          }  
42:      }  
43:      return( p )  
44:  }  
45:    
46:  #The main cGA algorithm.  
47:  cGA <- function( indSize, popSize, iterations ) {  
48:      p <- array( 0.5, indSize ) #Vector of probabilities  
49:      t <- 0 #Time stamp.  
50:      while( t != iterations ) {  
51:          x <- Individual( indSize, p )  
52:          y <- Individual( indSize, p )  
53:          if( getFitness( x ) > getFitness( y ) ) {  
54:              p <- updatePVector( x, y, p, indSize, popSize )  
55:          } else {  
56:              p <- updatePVector( y, x, p, indSize, popSize )  
57:          }  
58:          t <- t + 1  
59:      }  
60:      return( p )  
61:  }  
62:  #Usage: cGA( 5, 4, 30 )  
63:    


References

[1] Goldberg, D.E. "The Design of Innovation: Lessons from and for Competent Genetic Algorithms." Kluwer Academic Publishers, Norwell, MA, USA, 2002

[2] Harik, G.R.; Lobo, F.G.; Goldberg, D.E., "The compact genetic algorithm," Evolutionary Computation, IEEE Transactions on, vol. 3, no. 4, pp.287, 297, 1999

Edit: notice that Figure 1 depicted a distinct trap-5 function---in the picture version of the function I forgot to count 0 as a valid solution. Also notice that the fitness function leads the system toward 00000 and not 00001.

Comments

Popular posts from this blog

Data Streams and VFML

Toward ensemble methods: A primer with Random Forest

Optimization: Simplex Algorithm