Up: Determination of orbital parameters


Appendix

The aim of this Appendix is only to introduce and describe some of the most elementary features of GAs. For a much more complete discussion of GAs and their performance compared with other algorithms, see Holland (1975), Davis (1991), or Mitchell (1996). For a review of GAs in astronomy and astrophysics, the article by Charbonneau (1995) is strongly recommended.
  
Figure 6: A typical chromosome of an individual in the inverse-function example. The first gene codes for the first decimal of x, the second gene for the second decimal of x, etc. Decoding the chromosome yields the values x = 0.539 and y = 0.618.
  
Figure 7: The crossover procedure. The chromosomes are divided at the crossover point, which is indicated by a thick line, and the parts are joined as shown above.

GAs have been applied in many different fields, including machine learning, population genetics, neural network design, and economics. They are well suited to search and optimization, and are particularly useful when the search space is large.

Since GAs are inspired by natural evolution, their terminology borrows from biology, with terms such as genes, populations, and fitness. For an introduction to the terminology, see e.g. Charbonneau (1995) or Mitchell (1996). Such terms are set in italics when they are first introduced in this Appendix, and their meaning should be clear from the context.

When a GA is applied to an optimization problem, the variables of the problem are first encoded in strings of integers of a given length. Initially, a population (i.e. a set) of $N_{\rm pop}$ individuals is formed by randomly generating such a string for each individual. Each string constitutes the chromosome (i.e. the genetic material) of the individual. The encoding can be either binary or decimal, so that the values at the different genes (i.e. locations) along the string are integers in the range [0, 1] (binary case) or [0, 9] (decimal case). The whole set of $N_{\rm pop}$ individuals, with their corresponding chromosomes, constitutes the first generation.
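The formation of the first generation can be sketched in a few lines of Python. This is only an illustration: the function name `random_population` and its parameters `n_pop`, `n_genes` and `base` are assumptions made here, not names used in the paper.

```python
import random

def random_population(n_pop, n_genes, base=10, rng=random):
    """Return n_pop chromosomes, each a list of n_genes integers in [0, base - 1].

    base=10 gives decimal encoding, base=2 gives binary encoding.
    """
    return [[rng.randrange(base) for _ in range(n_genes)]
            for _ in range(n_pop)]

# Four individuals with six decimal genes each (hypothetical sizes).
population = random_population(n_pop=4, n_genes=6)
for chromosome in population:
    print(chromosome)
```

Passing a seeded `random.Random` instance as `rng` makes the initial population reproducible.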

As a trivial example (the "inverse function example"), imagine that one wishes to find a pair of numbers (x0,y0) such that a given function h(x,y) takes a particular value h0 = h(x0,y0). For simplicity, assume that h(x,y) takes the value h0 only at the single point (x0,y0) and that x and y both lie in the interval [0, 1]. If decimal encoding with three-digit accuracy is used, the chromosome of an individual could have the form shown in Fig. 6.
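A minimal sketch of the decoding step follows, assuming the chromosome is stored as a flat list of decimal digits with the genes for x first and those for y second, as in the caption of Fig. 6. The function name `decode` and the parameter `n_vars` are hypothetical.

```python
def decode(chromosome, n_vars=2):
    """Split a decimal chromosome into n_vars equal parts and read each
    part as the decimals of a number in [0, 1), i.e. 0.d1d2d3..."""
    n_digits = len(chromosome) // n_vars
    values = []
    for k in range(n_vars):
        genes = chromosome[k * n_digits:(k + 1) * n_digits]
        integer = 0
        for g in genes:
            integer = integer * 10 + g      # build e.g. 539 from [5, 3, 9]
        values.append(integer / 10 ** n_digits)
    return values

x, y = decode([5, 3, 9, 6, 1, 8])
print(x, y)  # x = 0.539, y = 0.618, the values quoted for Fig. 6
```

With this encoding the largest representable value is 0.999, so an optimum at exactly 1 can only be approached to within the chosen accuracy.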

  
Figure 8: The mutation procedure. For each of the six locations along the string, a random number r between 0 and 1 is generated and compared with $p_{\rm mut}$. If r is smaller than $p_{\rm mut}$, a new, random value is assigned to the gene.

When the first generation has been formed, the fitness of its constituent individuals should be evaluated. Thus, for each individual, the variables are obtained by decoding the chromosome. Given those variables, the relevant computation can be carried out. In the inverse function example, the computation consists of forming h(x,y) using the values of x and y. Then the result of the computation is compared with the desired result, and a fitness value is assigned such that the smaller the deviation from the desired result, the higher the fitness.

If $h(x,y) = {\rm e}^{x y}$ in the inverse function example, and one is looking for values x and y in [0,1] such that $h(x,y) = {\rm e}$ (the correct solution of course being x=1, y=1), then the individual shown in Fig. 6 would give the value h(0.539,0.618) = 1.39529, the deviation would be $\delta = e - 1.39529$, and the corresponding fitness value f could be defined as $f = 1/(1+\delta)$.
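The fitness evaluation for this example can be written out as a short sketch. The function names are assumptions made for illustration, and the absolute value of the deviation is used so that the same definition would also work for a function that can overshoot the target; for $h(x,y) = {\rm e}^{x y}$ on [0,1] the deviation is never negative, so this agrees with the definition of $\delta$ in the text.

```python
import math

def h(x, y):
    """The example function h(x, y) = exp(x * y)."""
    return math.exp(x * y)

def fitness(x, y, target=math.e):
    """Fitness f = 1 / (1 + delta), where delta is the deviation from the target."""
    delta = abs(target - h(x, y))
    return 1.0 / (1.0 + delta)

print(round(h(0.539, 0.618), 5))   # value of h for the individual in Fig. 6
print(fitness(0.539, 0.618))       # its fitness, roughly 0.43
print(fitness(1.0, 1.0))           # the exact solution has fitness 1 (up to round-off)
```

The fitness defined this way lies between 0 and 1, and reaches 1 only when the deviation vanishes.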

When all the individuals of the first generation have been evaluated and fitness values have been assigned, the second generation is formed by applying various procedures inspired by natural evolution to the chromosomes of the individuals in the first generation. These procedures include selection (followed by crossover) and mutation.

In order to perform a crossover between two chromosomes, two parents are selected from the generation just evaluated. The choice of parents is made in such a way that individuals with higher fitness have a greater probability of being selected than individuals with lower fitness. The fitness values can either be used directly, or some more sophisticated method can be employed. Linear fitness ranking is one example of such a method, in which the individuals are sorted according to their fitness and the best individual is assigned a new fitness equal to $N_{\rm pop}$, the second best is assigned a new fitness equal to $N_{\rm pop} - 1 $, and so on. This procedure enhances the differences between the individuals, especially if their original fitness values (before ranking) are very similar to each other.
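Linear fitness ranking as described above might be sketched as follows. The function name `linear_rank_fitness` is hypothetical, and the handling of ties (the individual appearing earlier in the list gets the higher rank) is an assumption, since the text does not specify it.

```python
def linear_rank_fitness(fitnesses):
    """Replace raw fitness values by ranks: the best individual gets N_pop,
    the second best N_pop - 1, and so on down to 1 for the worst."""
    n_pop = len(fitnesses)
    # Indices of the individuals, sorted from best to worst raw fitness.
    order = sorted(range(n_pop), key=lambda i: fitnesses[i], reverse=True)
    ranked = [0] * n_pop
    for position, i in enumerate(order):
        ranked[i] = n_pop - position
    return ranked

# Three nearly identical raw fitness values become clearly separated.
print(linear_rank_fitness([0.501, 0.503, 0.502]))  # [1, 3, 2]
```

In the usage example the raw values differ by less than one per cent, whereas the ranked values differ by factors of two and three, which is the enhancement referred to in the text.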

There exist several methods of choosing parents, and here only one of the simplest shall be discussed, namely roulette-wheel selection. When this selection method is used, the sum of the fitnesses $f_{i}, i = 1,2,...,N_{\rm pop}$ is formed, a random number r between 0 and $\sum_{i}f_{i}$ is generated, and the first individual i which satisfies the condition
\begin{displaymath}
\sum_{j=1}^{i} f_{j} \geq r, \eqno(7)
\end{displaymath}
is selected as a parent. As an example, if $N_{\rm pop} = 3$ and the fitness values are 2, 5, and 3, the first individual is selected if $r \leq 2$, the second is selected if $2 < r \leq 7$ and the third is selected if r > 7. When two parents have been chosen (usually with replacement, i.e. such that an individual can be chosen several times), crossover is performed by dividing the chromosomes of the two parents into two parts, and joining the parts as shown in Fig. 7. The point at which the cut is performed is called the crossover point.
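Roulette-wheel selection according to condition (7) can be sketched as below. The function names are assumptions; note that Python indices start at 0, so index 0 corresponds to the first individual in the text. The final `return` is a guard against floating-point round-off and is not part of the description above.

```python
import random

def roulette_index(fitnesses, r):
    """Return the first index i whose cumulative fitness sum is >= r."""
    cumulative = 0.0
    for i, f in enumerate(fitnesses):
        cumulative += f
        if cumulative >= r:
            return i
    return len(fitnesses) - 1  # guard against round-off in the sum

def select_parent(fitnesses, rng=random):
    """Draw r uniformly between 0 and the total fitness and apply the rule."""
    r = rng.uniform(0.0, sum(fitnesses))
    return roulette_index(fitnesses, r)

# The example from the text: fitness values 2, 5 and 3.
fitnesses = [2, 5, 3]
print(roulette_index(fitnesses, 1.5))  # 0: first individual (r <= 2)
print(roulette_index(fitnesses, 6.0))  # 1: second individual (2 < r <= 7)
print(roulette_index(fitnesses, 8.0))  # 2: third individual (r > 7)
```

Calling `select_parent` twice gives a pair of parents chosen with replacement, so the same individual may be picked both times.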

When crossover is carried out, two partial solutions to the problem can be joined to form a full solution. Returning to the inverse function example with $h(x,y) = {\rm e}^{x y}$ as above, it is clear that the two parents in Fig. 7 would both be rather far from the correct solution x=1, y=1. However, the second of the two new individuals (bottom row in the figure) formed by crossover would be much closer to the solution and would obtain a high fitness value.
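The joining of partial solutions can be illustrated with a sketch of single-point crossover. The two parents below are hypothetical and are not the chromosomes drawn in Fig. 7; they were chosen so that each carries one good half (0.999 is the closest representable value to 1 with three decimals).

```python
def crossover(parent_a, parent_b, point):
    """Cut both chromosomes at `point` and exchange the tails."""
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2

# Hypothetical parents: a has a good y (0.999), b has a good x (0.999).
parent_a = [1, 2, 3, 9, 9, 9]   # x = 0.123, y = 0.999
parent_b = [9, 9, 9, 1, 2, 3]   # x = 0.999, y = 0.123

child_1, child_2 = crossover(parent_a, parent_b, point=3)
print(child_1)  # [1, 2, 3, 1, 2, 3] -> x = 0.123, y = 0.123
print(child_2)  # [9, 9, 9, 9, 9, 9] -> x = 0.999, y = 0.999
```

Here the second child combines the good halves of both parents and lies close to the solution x=1, y=1, while the first child inherits the two poor halves.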

In this way, a new set of chromosomes is formed. Usually not all new chromosomes are formed by crossover. Instead, a crossover rate (denoted $p_{\rm c}$) is introduced such that crossover is applied to a given pair of parents only if $q \leq p_{\rm c}$, where q is a random number between 0 and 1. If $q > p_{\rm c}$, the two parents are copied without modification. Finally, mutation is applied to the new chromosomes. In order to perform mutation on a chromosome, a random number r between 0 and 1 is generated for each gene along the string, and the condition $r < p_{\rm mut}$, where $p_{\rm mut}$ is the mutation probability, is tested. If the condition is satisfied, the value of the gene is changed to a new random value. The procedure is illustrated in Fig. 8.
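The crossover rate and the mutation step might be sketched as follows for decimal encoding. The function names are hypothetical, the crossover point is drawn uniformly among the interior cut positions (an assumption, since the text does not say how it is chosen), and the new random gene value is allowed to coincide with the old one.

```python
import random

def maybe_crossover(parent_a, parent_b, p_c, rng=random):
    """Apply single-point crossover if q <= p_c, otherwise copy the parents."""
    q = rng.random()
    if q <= p_c:
        point = rng.randrange(1, len(parent_a))  # interior cut position
        return (parent_a[:point] + parent_b[point:],
                parent_b[:point] + parent_a[point:])
    return list(parent_a), list(parent_b)

def mutate(chromosome, p_mut, base=10, rng=random):
    """For each gene, draw r in [0, 1) and assign a new random value if r < p_mut."""
    mutated = list(chromosome)
    for k in range(len(mutated)):
        r = rng.random()
        if r < p_mut:
            mutated[k] = rng.randrange(base)
    return mutated

rng = random.Random(1)  # seeded only to make the sketch reproducible
a, b = maybe_crossover([5, 3, 9, 6, 1, 8], [1, 2, 3, 4, 5, 6], p_c=0.8, rng=rng)
print(mutate(a, p_mut=0.1, rng=rng))
print(mutate(b, p_mut=0.1, rng=rng))
```

Both functions return new lists and leave their arguments unchanged, so a parent that is selected several times is not altered by earlier operations.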

Sometimes, the best chromosome(s) are copied directly into the next generation (i.e. without crossover or mutation). This is referred to as elitism.

The chromosomes thus obtained (or, more strictly, the individuals corresponding to the chromosomes) constitute the second generation. The individuals of the second generation are then evaluated and fitness values are assigned to each individual, after which the third generation is formed etc. This process continues until an acceptable solution to the problem has been found.
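The complete cycle for the inverse function example can be put together in a compact sketch. Everything here is illustrative: the parameter values (`n_pop=30`, `p_c=0.8`, `p_mut=0.05`), the use of elitism for a single best individual, and the stopping rule (a fixed number of generations instead of a test for an acceptable solution) are assumptions, not choices taken from the paper.

```python
import math
import random

def decode(chromosome):
    """Read the two halves of a decimal chromosome as x = 0.d1d2..., y = 0.d1d2..."""
    half = len(chromosome) // 2
    def to_value(genes):
        return int("".join(map(str, genes))) / 10 ** len(genes)
    return to_value(chromosome[:half]), to_value(chromosome[half:])

def fitness(chromosome):
    """f = 1 / (1 + delta) with delta = |e - exp(x * y)|."""
    x, y = decode(chromosome)
    return 1.0 / (1.0 + abs(math.e - math.exp(x * y)))

def select(population, fitnesses, rng):
    """Roulette-wheel selection: first individual whose cumulative sum is >= r."""
    r = rng.uniform(0.0, sum(fitnesses))
    cumulative = 0.0
    for individual, f in zip(population, fitnesses):
        cumulative += f
        if cumulative >= r:
            return individual
    return population[-1]  # guard against round-off

def next_generation(population, rng, p_c=0.8, p_mut=0.05):
    fitnesses = [fitness(c) for c in population]
    best_index = max(range(len(population)), key=fitnesses.__getitem__)
    new_population = [list(population[best_index])]  # elitism: keep the best
    while len(new_population) < len(population):
        a = select(population, fitnesses, rng)
        b = select(population, fitnesses, rng)
        if rng.random() <= p_c:                      # crossover
            point = rng.randrange(1, len(a))
            a, b = a[:point] + b[point:], b[:point] + a[point:]
        for child in (a, b):                         # mutation, gene by gene
            new_population.append(
                [rng.randrange(10) if rng.random() < p_mut else g for g in child])
    return new_population[:len(population)]

def run(n_pop=30, n_genes=6, n_generations=100, seed=0):
    rng = random.Random(seed)
    population = [[rng.randrange(10) for _ in range(n_genes)] for _ in range(n_pop)]
    for _ in range(n_generations):
        population = next_generation(population, rng)
    return max(population, key=fitness)

best = run()
print(best, decode(best), fitness(best))
```

Because the best individual is always carried over unchanged, the best fitness in the population cannot decrease from one generation to the next in this sketch.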

The description above only scratches the surface of the vast subject of GAs and the interested reader is again referred to the references cited at the beginning of the Appendix.

Acknowledgements

I would like to thank Dr. K.J. Donner for carefully reading the manuscript, and an anonymous referee for many helpful and constructive comments. I dedicate this work to the memory of my friend and thesis advisor Dr. Björn Sundelius.



Copyright The European Southern Observatory (ESO)