- 3.1 The simulations
- 3.2 Evaluation of the simulations
- 3.3 The genetic algorithm
- 3.4 Notes on the simulations

As unknown variables I have taken the relative (systemic) velocities between the galaxies in
the plane of the sky ( and ), the separation along the
line of sight (), the masses (*m _{1}* and

Observations provide values of the relative systemic velocities along the line of sight () and the separations in the plane of the sky ( and ).

For each simulation, values of the unknown variables are obtained as described
in Sect. 3.3 below, and the orbital parameters can then be determined from
the masses (*m _{1}* and

The orbital parameters are of course not absolutely necessary for orbit integration
- the cartesian coordinates and velocities would suffice - but the
orbital parameters have the advantage of giving information which is easier to
visualize. For instance, the value of *e* immediately shows whether an
orbit is elliptical or hyperbolic.

When the orbital parameters have been found, i.e. decoded from the chromosome (see Sect.
3.3), the positions of the two galaxies are
integrated backwards in time. During the backwards integration, the galaxies are
represented by point particles moving on a two-body orbit. Starting at
time *t _{0}* the backward integration lasts for steps. At each
step

I use sign conventions such that *a* < 0 for hyperbolic orbits.
Using the orbital parameters, the relative positions and velocities
of the two galaxies can be obtained, through the transformations discussed in e.g.
Danby (1988) or Deutsch (1963).

At the end of the backward integration, a disc of particles is added to each of the
galaxies. Before the first simulation is carried out, the program reads two copies of a
standard disc of unit mass and unit scale length. However, the masses and scale lengths
of the two galaxies are generally different from unity, and therefore the positions and
velocities of the particles in each disc are, for each simulation, scaled to appropriate values.
The directions of rotation of the two discs are determined by the values of the spin parameters,
*s _{1}* and

Then, the orbit is integrated forward in time until the final step (corresponding to the time of the observation) is reached, at which point the position and velocity data are stored in the manner described below, and the next simulation can begin.

For hyperbolic orbits the backward integration is straightforward and can be terminated when the galaxies are at sufficient distance (a few galactic radii, say) from each other. The situation for elliptical orbits is more difficult: If the duration of the backward integration is badly chosen, the galaxies may not be sufficiently separated at the start of the forward integration. To avoid this problem, the value of can be chosen individually for each orbit in such a way that the backward integration proceeds until the apocentre is reached. Furthermore, the problems encountered for elliptical orbits are more general than that, since previous encounters may have damaged the discs of the galaxies, especially for short period orbits. However, it would be unfortunate to completely neglect elliptical orbits, and thus in Sect. 5 some cases of interacting galaxies on such orbits will be considered as well.

Since I only use artificial observations here, I will use as data the amount of mass at each pixel rather than the amount of light. When real data is used, a scale factor (the mass-to-light ratio) must be introduced.

Thus, at the end of each run a grid is superposed on the two galaxies, and the amount of mass in each grid cell is stored, each particle contributing if it belongs to the first galaxy or if it belongs to the second, where and are the number of particles used for the first and the second galaxy, respectively.

The data from the orbit integration which provides the (artificial) observation is stored in the same way, and is read by the program before the first simulation is carried out.

The number of grid points in the *x*- and *y*-directions, denoted *n*_{x} and *n*_{y},
as well as the size, , of the (quadratic) grid cells are input parameters
to the program, and should obviously be chosen in such a way that the relevant parts of the
two galaxies are contained within the grid. For the data presented in Fig. 1,
the *x*- and *y*-coordinates ranged from -14.0 to 14.0 in program units,
so if the grid parameters are taken to be, for example, *n*_{x} = *n*_{y} = 4
and *L* = 7.0, the corresponding data will be the matrix of mass values in the
right of Fig. 1.

In order to evaluate a given simulation, the deviation between its results and the
observational data should be measured. The deviation measure can be defined in
different ways, and it will here be defined as

(1) |

When real data is used, it is the amount of light at each grid point,
rather than the mass, that is detected, and the deviation measure could instead be
defined as, for example,

(2) |

Note that, with the deviation measure defined in Eqs. (1) and
(2), the GA *does not* make use of the radial velocity field of the
two interacting galaxies: The only velocity information used are the systemic
velocities of the two galaxies. It is important that the algorithm should be
able to function without knowledge of the complete radial velocity field of
the interacting system since, in many cases, one does not have access to
both accurate position *and* velocity data. However, if the velocity
information is available, it should of course be used as it may help the GA
to distinguish between orbits for which the position data appear similar.
Thus, in analogy with Eq. (1), a velocity deviation measure can be
defined as

(3) |

When the chromosomes have been initialized, the program loops through all of the
individuals in the first generation. For each individual, the
chromosome is decoded and the orbital parameters are computed as outlined in Sect.
3.1. Then, the two orbit integrations (backward and forward) are carried out,
the result is compared with the observational data in the manner
described in Sect. 3.2, and the resulting value of the deviation is
computed and stored. In cases where only position data is available, the function

(4) |

(5) |

In order to produce the genetic material, i.e. the chromosomes, of the second generation, the chromosome of the best individual in the first generation is copied twice, and the remaining individuals of the second generation are formed through the process of parent selection followed by crossover and finally mutation as described in the Appendix. When all new individuals have been generated, the genetic material of the first generation is deleted and the evaluation of the second generation can begin. After the second generation has been evaluated, the third generation is formed, all its constituent individuals are evaluated, etc.

In this process, individuals with high fitness values
will have a greater chance of spreading their genetic material to the next generation than
individuals with low fitness values. To those unfamiliar with GAs, this process may not,
at a first glance, seem to be very efficient. However, as many authors have concluded
(e.g. Charbonneau 1995) and as we shall see in Sects.
4 and 5, GAs are in fact *very*
efficient for solving optimization problems for which the search spaces are large.

A typical 100 generation computer run with 1000 particles per galaxy and requires 12.3 CPU hours on a Sun Enterprise 2.

The units used are such that *G*, the gravitational constant, is equal to 1. The unit
of length can be taken to be 1 kpc, the unit of time 1.05 Myr, and the unit of
mass . The unit of velocity is then equal to 931
km s^{-1}. Other scalings to physical units can, of course, be used as
well. In the genetic encoding scheme, the galaxy masses range from 0.01 to
99.99, the separation along the line of sight () ranges from
-999.9 to 999.9, and the relative velocities ( and ) range from -9.999 to 9.999. The spins can take the values -1 and
1. The total number of combinations is (!).

Often, however, the full ranges just described need not be used. For instance, if two galaxies are strongly interacting and connected by a bridge, one does not expect them to be almost a thousand kpc apart along the line of sight! If, for instance, the values of can be restricted to the interval -50 to 50, say, the number of possible values of is reduced from 20000 to 1000. Even with such reductions, the search space will still be very large, and an efficient search method is therefore needed.

In all runs, the crossover rate (see Appendix) was equal to 1.0, and elitism was used.

Copyright The European Southern Observatory (ESO)