3 Automated void search and analysis

To facilitate quantitative studies of voids, we have developed an Automated Void Search and Analysis System (AVSAS). Applications of earlier versions of this system are given in Stavrev ([1990a, 1990b, 1990c, 1991, 1998]).

AVSAS is a programme package of about 50 modules executing the following functions: (1) preliminary data analysis - homogenization, comparison and combination of data from different sources; (2) local analysis of galaxy and cluster samples - construction and visualization of the nearest-neighbour distance field, search for and parameterization of voids (generation of void catalogues), visualization of the 2-D and 3-D distributions of voids, comparison of void catalogues, identification of the void population and the shell population, construction of void profiles; (3) statistical analysis - statistical comparison of observed samples with random samples, statistical analysis of void catalogues.

3.1 The void-search algorithm

The void-search algorithm in AVSAS is based on a definition of the voids as space regions completely devoid of a certain type of objects (clusters, groups, galaxies).

To construct the search algorithm we introduce the concept of a distance field (Stavrev [1990a]). Such an approach has also been applied by Frisch et al. ([1995]), Lindner et al. ([1995]), and recently by Aikio & Mähönen ([1998]).

Let S be a sample of n objects A_i, i = 1, ..., n, with Cartesian coordinates x_i, y_i, z_i in a 3-D coordinate system as described in Sect. 2.1. We construct in the spatial volume V_S of sample S a cubic grid G, oriented along the axes $x,\ y,\ z$ , with grid constant $k\ [h^{-1}$ Mpc] and grid nodes g_j with coordinates $x_j,\ y_j,\ z_j$ , where j = 1, ..., m. For each node the distance to its nearest neighbouring object is computed from

$\begin{displaymath}d_j = \min_{i} \left\{\left[(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2\right]^{1/2}\right\}.\end{displaymath}$

The values d_j define a scalar distance field (d-field) on the grid. The local maxima of the d-field indicate regions devoid of objects, while the local minima indicate regions populated by objects. Thus, the task of identifying voids is reduced to the task of determining the local maxima of a 3-D field.
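The construction of the d-field can be illustrated with a minimal Python sketch. It is not the AVSAS code: the function name `distance_field`, the box-shaped sample volume given by the corners `lo` and `hi`, and the brute-force search over all objects are assumptions made only for this example.

```python
import math
from itertools import product


def distance_field(objects, lo, hi, k):
    """Return {node: d}, where d is the distance from a grid node to its nearest object.

    objects : list of (x, y, z) positions
    lo, hi  : opposite corners of a box-shaped volume (an assumption of this sketch;
              the real sample volume V_S need not be a box)
    k       : grid constant, in the same units as the coordinates
    """
    axes = []
    for a, b in zip(lo, hi):
        # Number of nodes along this axis, including both ends of the interval.
        n = int(math.floor((b - a) / k + 1e-9)) + 1
        axes.append([a + i * k for i in range(n)])

    field = {}
    for node in product(*axes):
        # Brute-force nearest-neighbour distance: O(n) work per node.
        field[node] = min(math.dist(node, obj) for obj in objects)
    return field


if __name__ == "__main__":
    sample = [(0, 0, 0), (10, 0, 0)]
    for node, d in distance_field(sample, (0, 0, 0), (10, 0, 0), 5).items():
        print(node, d)
```

For large samples the inner minimum would normally be replaced by a spatial index, but the brute-force form follows the defining equation term by term.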

The local maxima (LM) of the d-field correspond in real space to the centres of the largest empty spheres embedded in the voids. A better approximation of a void (its position, dimension, shape) can be achieved by a system of crossing (overlapping) empty spheres, i.e. by a group of grid nodes defining a local enhancement of the d-field (Stavrev [1991]; see also Stavrev [1990c]). Similar methods for void search have been developed by El-Ad et al. ([1996]) and Aikio & Mähönen ([1998]). The approximation of a void by more than one sphere is shown schematically in Fig. 6.

To determine the LM we first introduce a threshold value d_min corresponding to the radius of the smallest voids which are to be identified, i.e. the search for the LM is restricted to the points of the d-field with $d \geq d_\mathrm{min}$ . The neighbourhood of a point of the d-field with value d_i in which a LM is sought is a sphere with radius r_i equal to d_i. This means that the search procedure uses a variable neighbourhood which depends on the local properties of the distribution of objects: the neighbourhood is larger in the more sparsely populated regions and smaller in the denser regions. This feature makes the procedure suitable for processing spatial distributions of incomplete samples of objects, in which the number density varies strongly with distance or galactic latitude.

AVSAS offers two different methods for determining the LM: (1) by direct comparison of the d-field values in the neighbourhood of the d-field points, and (2) by a criterion $\nu\sigma$ for the standard deviation of the d-field values in the neighbourhood of the d-field points.

Let $d_{i}^{\,0} \geq d_\mathrm{min}$ be the value of the d-field in the running node g_i⁰ which has neighbourhood V_i⁰ with radius $d_{i}^{\,0}$ , where i = 1, ..., m₁, $m_1 \leq m$ (m₁ is the number of nodes with $d \geq d_\mathrm{min}$ ). Let V_i⁰ contain N nodes with values d_j, j = 1, ..., N. We define $d_{i}^{\,0}$ as a local maximum point (i.e. a point in the region of a local maximum of the d-field) in V_i⁰ if for all j

$\begin{displaymath}d_{i}^{\,0} \geq (1 - p)\,d_j.\end{displaymath}$

If the parameter p = 0 the above condition reduces to $d_{i}^{\,0} \geq d_j$ for all j. In this case $d_{i}^{\,0}$ is an absolute maximum in V_i⁰ and corresponds to the largest empty sphere embedded in the void.

If p > 0 the algorithm identifies $d_{i}^{\,0}$ as a local maximum point not only when it is an absolute maximum in V_i⁰ but also when it is smaller by less than $p\times100$ % than the maximum value of the d-field in V_i⁰. In this case the algorithm describes the LM of the d-field by a larger number of points than for p = 0. Thus, a void is identified (in the next stages of the procedure) as a group of LM points, i.e. the void is approximated by a system of crossing empty spheres.

We shall call hereafter the empty spheres constituting a void constituent spheres (CS). For convenience, the method for the determination of the LM points by direct comparison using the parameter p will be called p-method.
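The p-method can be sketched in a few lines of Python. This is an illustration, not the AVSAS implementation: the function name `lm_points_p_method` and the dictionary representation of the d-field are hypothetical, and the selection rule is written as d ≥ (1 − p) × (maximum of the d-field in the neighbourhood), which is our reading of the description above.

```python
import math


def lm_points_p_method(field, d_min, p=0.0):
    """Select local-maximum (LM) points of a d-field by direct comparison.

    field : {node: d}, node = (x, y, z), d = nearest-neighbour distance
    d_min : threshold; only nodes with d >= d_min are examined
    p     : relative tolerance (p = 0 keeps only absolute maxima)

    Assumed criterion: a node is kept if d >= (1 - p) * d_j for every node j
    lying within a sphere of radius d around it.
    """
    selected = []
    for node, d in field.items():
        if d < d_min:
            continue
        # Variable neighbourhood: all grid nodes within distance d of this node
        # (the node itself is always included, so the maximum is well defined).
        neighbourhood_max = max(
            dj for other, dj in field.items() if math.dist(node, other) <= d
        )
        if d >= (1.0 - p) * neighbourhood_max:
            selected.append(node)
    return selected


if __name__ == "__main__":
    toy = {(0, 0, 0): 1.0, (1, 0, 0): 2.0, (2, 0, 0): 1.9, (3, 0, 0): 1.0}
    print(lm_points_p_method(toy, d_min=1.5, p=0.0))
    print(lm_points_p_method(toy, d_min=1.5, p=0.1))
```

In the toy field the node with d = 1.9 is rejected for p = 0 and accepted for p = 0.1, which shows how a larger p describes one local maximum by several points.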

Figure 6: A schematic representation of the multi-sphere method for the void search. A cross-section of the distance field defined on a grid with constant k is shown with three local maximum points $d_1,\ d_2,\ d_3$ and their nearest neighbours ${A_1,\ A_2,\ A_3}$ , defining three constituent spheres of a void, one of which is the largest empty sphere.

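The second method optionally used in AVSAS for the determination of the LM points is the following. First, we calculate the mean of the d-field in the neighbourhood V_i⁰ of the running node g_i⁰,

$\begin{displaymath}\bar{d}_{i}^{\,0} = \frac{1}{N}\sum_{j=1}^{N} d_{j},\end{displaymath}$

and the standard deviation

$\begin{displaymath}\sigma_{i}^{0} = \left\{\frac{1}{N-1}\left[\sum_{j=1}^{N} d_{j}^{\,2} - \frac{1}{N} \left(\sum_{j=1}^{N} d_{j}\right)^{2}\right]\right\}^{1/2}.\end{displaymath}$

Then $d_{i}^{\,0}$ is defined as a local maximum point if it exceeds the mean by at least $\nu$ standard deviations, i.e. if $d_{i}^{\,0} \geq \bar{d}_{i}^{\,0} + \nu\,\sigma_{i}^{0}$. For convenience, this method will be called the $\nu$ -method.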
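A minimal Python sketch of the second method follows. It assumes that a node is selected when its d-value exceeds the neighbourhood mean by at least ν standard deviations; this form of the νσ criterion, the function name `lm_points_nu_method` and the dictionary representation of the d-field are assumptions of the example, not a transcription of AVSAS.

```python
import math


def lm_points_nu_method(field, d_min, nu):
    """Select LM points with a nu-sigma criterion on the d-field.

    field : {node: d}, node = (x, y, z)
    d_min : threshold; only nodes with d >= d_min are examined
    nu    : number of standard deviations above the neighbourhood mean

    Assumed criterion: keep a node if d >= mean + nu * sigma, where mean and
    sigma are taken over all nodes within a sphere of radius d around it.
    """
    selected = []
    for node, d in field.items():
        if d < d_min:
            continue
        values = [dj for other, dj in field.items() if math.dist(node, other) <= d]
        n = len(values)
        if n < 2:
            continue  # sample standard deviation is undefined for one value
        total = sum(values)
        total_sq = sum(v * v for v in values)
        mean = total / n
        # Sample variance in the "sum of squares" form used in the text;
        # max(..., 0) guards against tiny negative values from rounding.
        variance = max((total_sq - total * total / n) / (n - 1), 0.0)
        sigma = math.sqrt(variance)
        if d >= mean + nu * sigma:
            selected.append(node)
    return selected


if __name__ == "__main__":
    toy = {(x, 0, 0): 1.0 for x in range(5)}
    toy[(2, 0, 0)] = 3.0
    print(lm_points_nu_method(toy, d_min=2.0, nu=1.5))
    print(lm_points_nu_method(toy, d_min=2.0, nu=2.0))
```

In the toy field the central node lies about 1.8σ above the mean of its neighbourhood, so it is selected for ν = 1.5 and rejected for ν = 2.0: lowering ν relaxes the criterion.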

Applying either of the two described methods to all $d_i \geq d_\mathrm{min}$ , the search procedure produces a set of LM points (or CS). Some of them lie at the boundaries of the examined space volume. These peripheral LM points can optionally be removed from further analysis if it is judged that they are of low significance because of boundary effects and/or sample incompleteness.

In the next stage, the search algorithm groups the LM points on the basis of criteria for their mutual positions. Each defined group of LM points constitutes a separate void in the spatial distribution of the sampled objects. The process of grouping of the LM points is complicated by the percolation of the neighbouring voids, especially when samples of rare tracers of the LSS, such as the rich clusters of galaxies, are used. Often the percolation is strengthened by the sample incompleteness.

Because of the uncertainty in delimiting neighbouring voids, the search algorithm allows for a certain degree of overlap among them. It offers a choice between three options for grouping the LM points: (1) compact grouping, (2) medium-compact grouping, and (3) loose grouping. According to the first option, two LM points belong to the same group if the distance between their positions is $\Delta r = k$ , where k is the grid constant. This option leads to compact groups of a comparatively small number of LM points, and consequently to a large number of voids with a high degree of percolation of neighbouring voids. The criterion for the medium-compact grouping is $\Delta r \leq \sqrt{3} \, k$ (i.e. a distance not exceeding the diagonal of the elementary grid cube). This option produces voids with more complicated configurations of CS and less overlap with neighbouring voids than the first option. The criterion for the third option, loose grouping, unlike the first two, does not depend on k: two LM points with values d₁ and d₂ are grouped together if $\Delta r < \max(d_1, d_2)$ . This criterion ensures that the centre of at least one of the two CS lies within the other CS. The voids produced by loose grouping of LM points may have complicated (irregular) shapes and a large number of CS, i.e. large sizes. However, the overlap of neighbouring voids is smaller (relative to void size) than with the first two options.

The voids identified with option 1 can be considered as substructures of the voids identified with options 2 and 3, and vice versa - the voids identified with option 3 can be considered as unifications of the voids identified with options 1 and 2.
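The three grouping options can be compared with a short Python sketch. It assumes that grouping is transitive (two LM points end up in the same void if they are connected through a chain of pairwise links), which is our interpretation rather than a documented property of AVSAS; the function name `group_lm_points` and the option labels are hypothetical.

```python
import math


def group_lm_points(points, k, option):
    """Group LM points into voids; returns a list of lists of point indices.

    points : list of ((x, y, z), d), i.e. CS centre and radius
    k      : grid constant
    option : "compact"  -> link if dr <= k (on a grid of distinct nodes
                           this is the same as dr = k)
             "medium"   -> link if dr <= sqrt(3) * k
             "loose"    -> link if dr < max(d1, d2)
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    eps = 1e-9 * k  # tolerance for floating-point comparison of grid distances
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (ci, di), (cj, dj) = points[i], points[j]
            dr = math.dist(ci, cj)
            if option == "compact":
                linked = dr <= k + eps
            elif option == "medium":
                linked = dr <= math.sqrt(3.0) * k + eps
            elif option == "loose":
                linked = dr < max(di, dj)
            else:
                raise ValueError("unknown grouping option: %r" % option)
            if linked:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


if __name__ == "__main__":
    pts = [((0, 0, 0), 30.0), ((10, 0, 0), 30.0),
           ((20, 10, 10), 30.0), ((100, 0, 0), 25.0)]
    for opt in ("compact", "medium", "loose"):
        print(opt, group_lm_points(pts, 10.0, opt))
```

In this example the third point is a diagonal neighbour of the second, so it forms a separate void under compact grouping but joins the first two under the medium-compact and loose options.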

Moving from grouping option 1 to option 3 acts in the same direction as choosing a higher value of the parameter p or a lower value of the parameter $\nu$ , i.e. towards larger voids with more complicated shapes.

As seen from the above, the void-search algorithm requires suitable values for the following free parameters: $k,\ d_\mathrm{min},\ p/\nu$ , and the grouping option. The question arises: given the other values, what values of p and $\nu$ , respectively, are most suitable for the void search? To answer this question a number of tests of the p-method and the $\nu$ -method have been carried out over two test samples (AR/L and AR/N, limited to volume V2, see Sect. 2.3) using two values for k (10 and 20 h^-1 Mpc), two values for the minimum void diameter $D_\mathrm{min} = 2\,d_\mathrm{min}$ (50 and 80 h^-1 Mpc), and two grouping options (medium-compact and loose grouping). Then, for each combination of $k,\ D_\mathrm{min}$ , and the grouping option, we check how the number of identified voids and CS changes as a function of p or $\nu$ .

The values of D_min have been chosen to be roughly equal to or larger than the mean separation of clusters which is $\sim$ 53 h^-1 Mpc and $\sim$ 42 h^-1 Mpc for the $R \geq 1$ and $R \geq 0$ Abell clusters, respectively (Batuski et al. [1991]). With these D_min the values of k are chosen to be sufficiently small, so that the LM corresponding to the smallest identified voids can be determined from statistics over at least several dozens of d-field points. The only combination for which this requirement is not fulfilled is the one with D_min = 50 h^-1 Mpc, k = 20 h^-1 Mpc.

The choice of a sufficiently small value of k is also important for the accuracy of the determined void parameters. It follows from elementary considerations that the error of the positions of the CS centres due to the discreteness of the d-field can be estimated as $\frac{\sqrt{3}}{4} \, k$ . For the two chosen values of k (20 and 10 h^-1 Mpc) the error is 8.66 and 4.33 h^-1 Mpc, respectively. In the latter case the error is comparable to the diameter of Abell clusters (3 h^-1 Mpc), and it is less than 10% of the diameter of the smallest identified voids (50 h^-1 Mpc). Let us also note that the background contamination, the large peculiar velocities of the cluster members, and the use of estimated redshifts may lead to positional errors larger than both chosen values for k.

Figure 7: Number of voids as a function of the parameters p (panels a and b) and $\nu$ (panels c and d) for medium-compact (left panels) and loose (right panels) grouping of the local maximum points, for two test samples (AR/L and AR/N) and four combinations of the grid constant k and the minimum void dimension D_min.

The results from the tests are shown in Fig. 7. Meaningful values of p and $\nu$ lie in the ranges p $\stackrel{\textstyle <}{\sim}$ 0.20 and $\nu$ $\stackrel{\textstyle >}{\sim}$ 1.0. Outside these ranges the search procedure produces too large a number of LM points, and this causes difficulties in delimiting neighbouring voids because of increased overlap among them. For the medium-compact grouping (Fig. 7, left panels) the number of voids increases continuously with increasing p and decreasing $\nu$ , while for the loose grouping (Fig. 7, right panels) the curves show well defined maxima in the number of voids. For values of $p \approx$ 0 and $\nu \approx$ 2.6-2.7 the search algorithm identifies only the most significant voids: deep, spherical and well isolated regions, approximated by single empty spheres. An increase of p, or respectively a decrease of $\nu$ , relaxes the criterion for the selection of LM points, hence the number of voids grows. At the same time, voids enlarge and start to overlap more due to the increased number of CS. This causes the merging of neighbouring voids and consequently a decrease of the number of voids. In the case of the loose grouping of the LM points, for certain values of the parameters p and $\nu$ the process of merging overtakes the increase of the number of LM points, hence the number of voids decreases. If p is left to grow very large or $\nu$ to become very small, the number of voids tends to 1: a single, continuous network structure is formed. For the medium-compact option the process of merging is not strong enough to stop the growth of the number of voids with the growth of p, or respectively with the decrease of $\nu$ . (This is also true for the compact grouping of the LM points.)

The test curves for the p-method, loose grouping (Fig. 7b) show maxima approximately for the same value p_m = 0.10-0.12. The $\nu$ -method, however, shows two distinct maxima depending on the value of k: $\nu_\mathrm{m} =$ 1.8 for k = 10 h^-1 Mpc and $\nu_\mathrm{m} = 1.4-1.5$ for k = 20 h^-1 Mpc. This stronger dependence of the $\nu$ -method on k in comparison with the p-method is well outlined in Fig. 7c. On the other hand, it is seen from Figs. 7a-d that the $\nu$ -method is less dependent on D_min than the p-method.

Figure 7b shows that for the p-method the combination k = 20 h^-1 Mpc, D_min = 50 h^-1 Mpc leads to a substantially higher number of voids than all other combinations. This case differs from the other three combinations by its very low ratio D_min/k = 2.5. As a result, some of the LM points are defined in neighbourhoods V_i⁰ containing a very small number of grid points. This probably has the effect of relaxing the criterion for the selection of voids, hence increasing the number of voids. The $\nu$ -method is less sensitive to this effect. We conclude that values of D_min/k lower than 4-5 should be avoided, especially when the p-method is used.

If we exclude this peculiar case and compare the number of voids for all other cases at p_m (Fig. 7b) with the number of voids at $\nu_\mathrm{m}$ (Fig. 7d) we see that for both methods the number of voids is in the same range of about 20 - 30 depending slightly on the combination (k, D_min), i.e. the agreement between the two methods is good.

In order to check how p_m and $\nu_\mathrm{m}$ depend on the number density we have used test samples with growing numbers of objects in the same volume. These tests show that p_m decreases slightly with increasing number density, while $\nu_\mathrm{m}$ is almost independent of it. The reason for this difference between the two methods is that the $\nu$ -method is more efficient than the p-method at rejecting the smaller voids (near D_min) in the denser regions as insignificant fluctuations of the d-field. For the p-method the increase of the sample density leads to an increase of the number and the overlap of the CS, hence the number of voids decreases. This effect is compensated by a smaller value of p_m which tightens the criterion for selection of LM points.

The tests of the void-search procedure suggest that it can be optimized by choosing values of p = p_m and $\nu = \nu_\mathrm{m}$ . In the case of the loose grouping of the LM points, this leads to the detection of the maximum number of voids at a low degree of overlap between neighbouring voids. These values of p and $\nu$ can also be applied with the other two options (compact and medium-compact grouping): although the number of detected voids is then not maximized, the degree of overlap between neighbouring voids is kept low, and the voids are therefore better delimited.

3.2 Determination of the void parameters

After voids are identified, AVSAS analyses their properties and defines the following parameters:

(1) Cartesian, equatorial and galactic coordinates of the CS centres and of the void centre. The latter is defined as the centre of the largest CS of the void and, alternatively, as the centroid of all CS of the void. The centroid is determined as the centre of gravity of the system of CS. If a void is composed of n_cs > 1 CS, each one with volume V_i, i = 1, ..., n_cs, and weight

$\begin{displaymath}w_i = V_i \Big/ \sum_{j=1}^{n_{\mathrm{cs}}} V_j,\end{displaymath}$

then the coordinates of the centroid are

$\begin{displaymath}x_{\mathrm{c}} = \sum_{i=1}^{n_{\mathrm{cs}}} {w_i x_i},\ \ y_{\mathrm{c}} = \sum_{i=1}^{n_{\mathrm{cs}}} {w_i y_i},\ \ z_{\mathrm{c}} = \sum_{i=1}^{n_{\mathrm{cs}}} {w_i z_i},\end{displaymath}$

where $x_i,\ y_i,\ z_i$ are the coordinates of the centre of the i-th CS.

The Cartesian coordinates $x_{\mathrm {L}},\ y_{\mathrm {L}}$ in a Lambert equal-area projection of the CS centres and of the centroid are computed, too, as described in Sect. 2.1.
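The centroid computation can be sketched as follows. The sketch assumes that the weight of each CS is its volume divided by the total volume of all CS of the void (overlaps are not subtracted), which is the natural reading of "centre of gravity" but is stated here as an assumption; the function name `cs_centroid` is hypothetical.

```python
import math


def cs_centroid(spheres):
    """Volume-weighted centroid of the constituent spheres (CS) of a void.

    spheres : list of ((x, y, z), d), i.e. CS centre and radius
    Assumed weights: w_i = V_i / sum_j V_j with V_i = (4/3) * pi * d_i**3,
    so that the weights sum to 1.
    """
    volumes = [4.0 / 3.0 * math.pi * d ** 3 for _, d in spheres]
    total = sum(volumes)
    weights = [v / total for v in volumes]
    return tuple(
        sum(w * centre[axis] for w, (centre, _) in zip(weights, spheres))
        for axis in range(3)
    )


if __name__ == "__main__":
    # Radii 1 and 2 give weights 1/9 and 8/9, so the centroid lies at x = 8.
    print(cs_centroid([((0.0, 0.0, 0.0), 1.0), ((9.0, 0.0, 0.0), 2.0)]))
```

Because the weights scale with the cube of the radius, the centroid is pulled strongly towards the largest CS of the void.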

(2) Distances are computed to the centres of all CS, as well as to the CS centroid from the Cartesian coordinates:

$\begin{displaymath}r_i = \sqrt{x_i^2 + y_i^2 + z_i^2}\ \ \ [h^{-1}\,\mathrm{Mpc}],\end{displaymath}$

$\begin{displaymath}r_\mathrm{c} = \sqrt{x_{\mathrm{c}}^2 + y_{\mathrm{c}}^2 + z_{\mathrm{c}}^2}\ \ \ [h^{-1}\,\mathrm{Mpc}].\end{displaymath}$

(3) Dimensions.

(a) Diameters of the CS are computed directly from the values of the d-field as $D_i = 2 \, d_i \ \ [h^{-1}\,\mathrm{Mpc}]$ .

(b) Void dimensions along $x,\ y,\ z$ axes are determined from the projected distances between the centres of those two CS of the void, which are the farthest apart along each axis, plus their radii. Thus, for the x axis

$\begin{displaymath}D_x = \Delta_x + d_\mathrm{A} + d_\mathrm{B} \ \ [h^{-1}\,\mathrm{Mpc}]. \end{displaymath}$

(c) The maximum void dimension is determined from the distance between the centres of the two most widely separated CS of the void plus the radii of these two CS:

$\begin{displaymath}D_\mathrm{max} = \Delta_\mathrm{max} + d_\mathrm{A} + d_\mathrm{B} \ \ [h^{-1}\,\mathrm{Mpc}]. \end{displaymath}$

(d) The equivalent void diameter is the diameter of the sphere whose volume is equal to the total volume V_T of the void (see below):

$\begin{displaymath}D_\mathrm{e} = \left( \frac{6}{\pi}\,V_\mathrm{T}\right)^{1/3} \ \ [h^{-1}\,\mathrm{Mpc}].\end{displaymath}$
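The dimension parameters (b) to (d) can be illustrated with a short Python sketch. The function names are hypothetical, and the treatment of a void with a single CS (both "extreme" spheres are the same sphere, so the dimension equals its diameter) is an assumption of this example.

```python
import math
from itertools import combinations


def axis_dimension(spheres, axis):
    """D_x, D_y or D_z: separation along one axis of the two CS centres that
    are farthest apart along that axis, plus the radii of these two CS.

    spheres : list of ((x, y, z), d); axis : 0, 1 or 2.
    If all centres share the same coordinate, the same sphere is picked twice
    and the result is its diameter.
    """
    a = min(spheres, key=lambda s: s[0][axis])
    b = max(spheres, key=lambda s: s[0][axis])
    return (b[0][axis] - a[0][axis]) + a[1] + b[1]


def max_dimension(spheres):
    """D_max: distance between the two most widely separated CS centres
    plus the radii of these two CS."""
    if len(spheres) == 1:
        return 2.0 * spheres[0][1]
    (ca, da), (cb, db) = max(
        combinations(spheres, 2),
        key=lambda pair: math.dist(pair[0][0], pair[1][0]),
    )
    return math.dist(ca, cb) + da + db


def equivalent_diameter(v_total):
    """D_e: diameter of the sphere whose volume equals the total void volume."""
    return (6.0 * v_total / math.pi) ** (1.0 / 3.0)


if __name__ == "__main__":
    void = [((0.0, 0.0, 0.0), 5.0), ((6.0, 8.0, 0.0), 4.0)]
    print([axis_dimension(void, a) for a in range(3)], max_dimension(void))
    print(equivalent_diameter(4.0 / 3.0 * math.pi * 3.0 ** 3))
```

For the two-sphere example the centres are 10 units apart, so D_max = 10 + 5 + 4 = 19, while D_x = 15 and D_y = 17.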

(4) Sphericity s. This parameter is introduced for a characterization of the void shape. It is defined as

(5) Volumes.

(a) The volume of each CS is computed from its radius $d_i$:

$\begin{displaymath}V_i = \frac{4}{3}\,\pi\,d_{i}^{\,3} \ \ [h^{-3}\,\mathrm{Mpc^3}].\end{displaymath}$

(b) The total void volume V_T is computed numerically because of the complicated void shape (especially when the void is composed of a large number of CS), by counting the number of elementary grid cubes n_g in the void (strictly speaking, the cubes whose centres are inside the void). Then

$\begin{displaymath}V_\mathrm{T} = n_{\mathrm{g}}k^3 \ \ [h^{-3}\,\mathrm{Mpc^3}],\end{displaymath}$

where k is the grid constant.
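The numerical volume estimate can be sketched in Python as below. The sketch treats the void as the union of its CS and assumes elementary cubes anchored at the coordinate origin, with centres at (i + 1/2) k along each axis; this placement and the function name `total_void_volume` are assumptions of the example.

```python
import math
from itertools import product


def total_void_volume(spheres, k):
    """Estimate V_T = n_g * k**3 by counting elementary grid cubes.

    spheres : list of ((x, y, z), d), the CS of one void
    k       : grid constant
    A cube is counted if its centre lies inside at least one CS, so regions
    where spheres overlap are counted only once.
    """
    lo = [min(c[a] - d for c, d in spheres) for a in range(3)]
    hi = [max(c[a] + d for c, d in spheres) for a in range(3)]
    # Index ranges of the cubes covering the bounding box of the void.
    ranges = [
        range(math.floor(l / k), math.floor(h / k) + 1) for l, h in zip(lo, hi)
    ]
    n_g = 0
    for idx in product(*ranges):
        centre = tuple((i + 0.5) * k for i in idx)
        if any(math.dist(centre, c) < d for c, d in spheres):
            n_g += 1
    return n_g * k ** 3


if __name__ == "__main__":
    sphere = ((0.0, 0.0, 0.0), 10.0)
    print(total_void_volume([sphere], 1.0), 4.0 / 3.0 * math.pi * 10.0 ** 3)
```

For a single sphere the counted volume can be compared directly with the analytic value, which gives a quick check of how the discretization error depends on the ratio of the void size to k.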

(6) The objects surrounding the void are defined as the nearest neighbours to the centres of the CS of the void. (See Sect. 6 for a more complete identification of the objects surrounding a void.)

(7) The neighbouring voids of the void are determined from a condition for overlapping of at least one CS of the void with at least one CS of the neighbouring void. If d₁ and d₂ are the radii of two CS belonging to two different voids, they overlap if $\Delta r < d_1 + d_2$ , where $\Delta r$ is the distance between the centres of the two CS. (Note the difference between this criterion and the criterion for loose grouping of the LM points in Sect. 3.1.)
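The neighbour test for two voids reduces to a pairwise overlap check between their CS, as in the following minimal Python sketch (the function name `are_neighbouring_voids` and the list representation of a void are hypothetical).

```python
import math


def are_neighbouring_voids(void_a, void_b):
    """Return True if at least one CS of void_a overlaps a CS of void_b.

    Each void is a list of ((x, y, z), d). Two CS overlap if the distance
    between their centres is smaller than the sum of their radii.
    """
    return any(
        math.dist(ca, cb) < da + db
        for ca, da in void_a
        for cb, db in void_b
    )


if __name__ == "__main__":
    a = [((0.0, 0.0, 0.0), 10.0)]
    b = [((15.0, 0.0, 0.0), 6.0)]
    c = [((30.0, 0.0, 0.0), 5.0)]
    print(are_neighbouring_voids(a, b), are_neighbouring_voids(a, c))
```

In the example, the spheres of voids a and b overlap (15 < 10 + 6) although neither centre lies inside the other sphere (15 > max(10, 6)), so the two voids are neighbours but their LM points would not have been merged by loose grouping.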
