As unknown variables I have taken the relative (systemic) velocities between the galaxies in the plane of the sky ( $\Delta v_{x}$ and $\Delta v_{y}$ ), the separation along the line of sight ( $\Delta z$ ), the masses (m₁ and m₂), and the spins (s₁ and s₂) of the two galaxies. The encoding of the unknown parameters in the strings used by the GA is described in Sect. 3.3.

Observations provide values of the relative systemic velocities along the line of sight ( $\Delta v_{z}$ ) and the separations in the plane of the sky ( $\Delta x$ and $\Delta y$ ).

For each simulation, values of the unknown variables are obtained as described in Sect. 3.3 below, and the orbital parameters can then be determined from the masses (m₁ and m₂), the three components of the separation vector ( $\Delta x$ , $\Delta y$ , and $\Delta z$ ), and the three components of the relative velocity ( $\Delta v_{x}$ , $\Delta v_{y}$ , and $\Delta v_{z}$ ). The orbital parameters are a (semi-major axis), e (eccentricity), i (inclination), $\Omega$ (longitude of the ascending node), $\omega$ (the argument of pericentron), and T (the time of the pericentre passage). The transformations from cartesian coordinates and velocities to the orbital parameters are straightforward and well known, and will not be given here. For a discussion of the transformations involved, see e.g. Danby (1988) or Deutsch (1963).

The orbital parameters are of course not absolutely necessary for orbit integration - the cartesian coordinates and velocities would suffice - but the orbital parameters have the advantage of giving information which is easier to visualize. For instance, the value of e immediately shows whether an orbit is elliptical or hyperbolic.

When the orbital parameters have been found, i.e. decoded from the chromosome (see Sect. 3.3), the positions of the two galaxies are integrated backwards in time. During the backwards integration, the galaxies are represented by point particles moving on a two-body orbit. Starting at time t₀ the backward integration lasts for $N_{\rm step}$ steps. At each step j, the new time is computed as $t = t_{0} - j {\rm d}t$ , where ${\rm d}t$ is the length of the time step, and Kepler's equation is then solved to yield the eccentric anomaly E (for elliptic orbits) or the `hyperbolic' anomaly F (for hyperbolic orbits). The case e=1 exactly, i.e. parabolic orbits, is neglected. However, e can be arbitrarily close to 1.

I use sign conventions such that a < 0 for hyperbolic orbits. Using the orbital parameters, the relative positions and velocities of the two galaxies can be obtained, through the transformations discussed in e.g. Danby (1988) or Deutsch (1963).

At the end of the backward integration, a disc of particles is added to each of the galaxies. Before the first simulation is carried out, the program reads two copies of a standard disc of unit mass and unit scale length. However, the masses and scale lengths of the two galaxies are generally different from unity, and therefore the positions and velocities of the particles in each disc are, for each simulation, scaled to appropriate values. The directions of rotation of the two discs are determined by the values of the spin parameters, s₁ and s₂, as discussed in Sect. 3.3 below.

Then, the orbit is integrated forward in time until the final step (corresponding to the time of the observation) is reached, at which point the position and velocity data are stored in the manner described below, and the next simulation can begin.

For hyperbolic orbits the backward integration is straightforward and can be terminated when the galaxies are at sufficient distance (a few galactic radii, say) from each other. The situation for elliptical orbits is more difficult: If the duration of the backward integration is badly chosen, the galaxies may not be sufficiently separated at the start of the forward integration. To avoid this problem, the value of $N_{\rm step}$ can be chosen individually for each orbit in such a way that the backward integration proceeds until the apocentre is reached. Furthermore, the problems encountered for elliptical orbits are more general than that, since previous encounters may have damaged the discs of the galaxies, especially for short period orbits. However, it would be unfortunate to completely neglect elliptical orbits, and thus in Sect. 5 some cases of interacting galaxies on such orbits will be considered as well.

3.2 Evaluation of the simulations

Since I only use artificial observations here, I will use as data the amount of mass at each pixel rather than the amount of light. When real data is used, a scale factor (the mass-to-light ratio) must be introduced.

Thus, at the end of each run a grid is superposed on the two galaxies, and the amount of mass in each grid cell is stored, each particle contributing $m_{1}/N_{{\rm p}, 1}$ if it belongs to the first galaxy or $m_{2}/N_{\rm p, 2}$ if it belongs to the second, where $N_{\rm p,1}$ and $N_{\rm p,2}$ are the number of particles used for the first and the second galaxy, respectively.

The data from the orbit integration which provides the (artificial) observation is stored in the same way, and is read by the program before the first simulation is carried out.

The number of grid points in the x- and y-directions, denoted n_x and n_y, as well as the size, $L \times L$ , of the (quadratic) grid cells are input parameters to the program, and should obviously be chosen in such a way that the relevant parts of the two galaxies are contained within the grid. For the data presented in Fig. 1, the x- and y-coordinates ranged from -14.0 to 14.0 in program units, so if the grid parameters are taken to be, for example, n_x = n_y = 4 and L = 7.0, the corresponding data will be the matrix of mass values in the right of Fig. 1.

In order to evaluate a given simulation, the deviation between its results and the observational data should be measured. The deviation measure can be defined in different ways, and it will here be defined as

$\begin{displaymath} \delta = \frac{n_{\rm typ}^2}{n_{x} n_{y}} \sum_{i,j}\vert m... ...j} - m^{\rm obs}_{i,j}\vert/(m_{\epsilon} + m^{\rm obs}_{i,j}),\end{displaymath}$

(1)

where m_i,j is the mass in cell (i,j) obtained from the simulation, $m^{\rm obs}_{i,j}$ is the same quantity obtained from the observational data, and the sum extends over the whole grid. The $m_{\epsilon}$ in the denominator is needed to prevent a divergence in the cases where $m^{\rm obs}_{i,j}$ is zero, and its value is taken to be the typical mass of a particle in a galaxy of unit mass, i.e. $1/N_{\rm p}$ where $N_{\rm p}$ is the number of particles used in the simulations. The factor in front of the summation sign is a normalization factor. Its sole purpose is to make $\delta$ independent of the number of grid points used in the data comparison. Thus, $n_{\rm typ}^2$ is a typical value of the number of pixels, here taken to be 49, and n_x n_y is the number of pixels used in the computer run. In the runs discussed in Sects. 4 to 6, n_x = n_y = 7, yielding a normalization factor equal to 1. This choice is somewhat arbitrary. However, the values of n_x and n_y must neither be too small nor too large since, in the former case, the tidal features used by the GA will not be resolved and, in the latter case, the data from the observation will be unnecessarily detailed and thereby more sensitive to the noise which is always present in real data.

When real data is used, it is the amount of light at each grid point, rather than the mass, that is detected, and the deviation measure could instead be defined as, for example,

$\begin{displaymath} \delta \propto \sum_{i,j}\vert g_{i,j} - g^{\rm obs}_{i,j}\vert/(\nu + g^{\rm obs}_{i,j}),\end{displaymath}$

(2)

where g denotes the shading of pixel (i,j). If the g values were normalized to lie between 0 (black, no light) and 1 (white, maximum light), the value of $\nu$ could be, say, 0.001. The deviation measures just defined punish strongly those simulations which try to put many particles in regions which are only supposed to contain a few.

Note that, with the deviation measure defined in Eqs. (1) and (2), the GA does not make use of the radial velocity field of the two interacting galaxies: The only velocity information used are the systemic velocities of the two galaxies. It is important that the algorithm should be able to function without knowledge of the complete radial velocity field of the interacting system since, in many cases, one does not have access to both accurate position and velocity data. However, if the velocity information is available, it should of course be used as it may help the GA to distinguish between orbits for which the position data appear similar. Thus, in analogy with Eq. (1), a velocity deviation measure can be defined as

$\begin{displaymath} \delta_{v} = \frac{n_{\rm typ}^2}{n_{x} n_{y}} \sum_{i,j}\ve... ...overline{v}}^{\rm obs}_{i,j}\vert + {\overline{v}}_{\epsilon}),\end{displaymath}$

(3)

where ${\overline{v}}_{i,j}$ and ${\overline{v}}^{\rm obs}_{i,j}$ denote the average radial velocities in cell (i,j) obtained from the simulation and from observational data, respectively. The actual value of ${\overline{v}}_{\epsilon}$ ,which serves the same purpose as $m_{\epsilon}$ in Eq. (1), is not of great importance as long as it is not too large, and it can be taken equal to the systemic velocity of the larger of the two galaxies. If the systemic velocity is very large, a typical rotation velocity of either of the galaxies can be used instead.

3.3 The genetic algorithm

$\begin{figure} \psfig {figure=ds7239f2.ps,height=6.0cm}\end{figure}$

Figure 2: Top: The encoding scheme for the seven unknown variables. Genes 1-4 encode the mass of the first galaxy such that m₁ = g₁*10¹ + g₂*10⁰ + g₃*10^-1 + g₄*10^-2, where g_i denotes the value of gene i. The other variables are encoded in a similar fashion. Genes 9, 14, and 19 encode the signs of $\Delta z$ , $\Delta v_{x}$ , and $\Delta v_{y}$ , respectively, such that the sign is negative if the corresponding value is odd and positive if it is even. The spin of the first galaxy is -1 (clockwise) if the value of gene 24 is odd, and +1 if it is even. The spin of the second galaxy is encoded, in the same way, in gene 25. Bottom: an example of a chromosome. Decoding the chromosome, the values m₁ = 1.21, m₂ = 0.85 etc. are obtained

When the chromosomes have been initialized, the program loops through all of the $N_{\rm pop}$ individuals in the first generation. For each individual, the chromosome is decoded and the orbital parameters are computed as outlined in Sect. 3.1. Then, the two orbit integrations (backward and forward) are carried out, the result is compared with the observational data in the manner described in Sect. 3.2, and the resulting value of the deviation is computed and stored. In cases where only position data is available, the function

$\begin{displaymath} f = \frac{1}{1+\delta}.\end{displaymath}$

(4)

is used as a fitness measure. The constant in the denominator is introduced to prevent a divergence in the (unlikely) case of $\delta$ being equal to zero. If both position and velocity data are available, the fitness measure is instead defined as

$\begin{displaymath} f = \frac{1}{\sqrt{(1+\delta)(1+\delta_{v})}}.\end{displaymath}$

(5)

Thus, with the fitness functions just defined, f = 1 corresponds to a perfect match. When all individuals of the first generation have been evaluated, the fitnesses are ranked using linear fitness ranking as described in the Appendix. Thus, as an example, for the fitness measure defined in Eq. (4), the exact form of the mapping from $\delta$ to f will be of no importance when new generations are formed, and any fitness function for which the fitness increases with decreasing $\delta$ could have been used. However, as discussed in Sect. 4.2 below, the fitness values play an important role when the quality (i.e. the goodness of fit) of a set of orbital parameters is evaluated.

In order to produce the genetic material, i.e. the chromosomes, of the second generation, the chromosome of the best individual in the first generation is copied twice, and the remaining $N_{\rm pop}-2$ individuals of the second generation are formed through the process of parent selection followed by crossover and finally mutation as described in the Appendix. When all $N_{\rm pop}$ new individuals have been generated, the genetic material of the first generation is deleted and the evaluation of the second generation can begin. After the second generation has been evaluated, the third generation is formed, all its constituent individuals are evaluated, etc.

In this process, individuals with high fitness values will have a greater chance of spreading their genetic material to the next generation than individuals with low fitness values. To those unfamiliar with GAs, this process may not, at a first glance, seem to be very efficient. However, as many authors have concluded (e.g. Charbonneau 1995) and as we shall see in Sects. 4 and 5, GAs are in fact very efficient for solving optimization problems for which the search spaces are large.

3.4 Notes on the simulations

A typical 100 generation computer run with 1000 particles per galaxy and $N_{\rm pop} = 500$ requires 12.3 CPU hours on a Sun Enterprise 2.

The units used are such that G, the gravitational constant, is equal to 1. The unit of length can be taken to be 1 kpc, the unit of time 1.05 Myr, and the unit of mass $2\ 10^{11}M_{\odot}$ . The unit of velocity is then equal to 931 km s^-1. Other scalings to physical units can, of course, be used as well. In the genetic encoding scheme, the galaxy masses range from 0.01 to 99.99, the separation along the line of sight ( $\Delta z$ ) ranges from -999.9 to 999.9, and the relative velocities ( $\Delta v_{x}$ and $\Delta v_{y}$ ) range from -9.999 to 9.999. The spins can take the values -1 and 1. The total number of combinations is $3.2 \times 10^{21}$ (!).

Often, however, the full ranges just described need not be used. For instance, if two galaxies are strongly interacting and connected by a bridge, one does not expect them to be almost a thousand kpc apart along the line of sight! If, for instance, the values of $\Delta z$ can be restricted to the interval -50 to 50, say, the number of possible values of $\Delta z$ is reduced from 20000 to 1000. Even with such reductions, the search space will still be very large, and an efficient search method is therefore needed.

In all runs, the crossover rate (see Appendix) was equal to 1.0, and elitism was used.

3 The method

3.1 The simulations

3.2 Evaluation of the simulations

3.3 The genetic algorithm

3.4 Notes on the simulations