2. The core-sampling method

2.1. Basis of core-sampling method

The mathematical basis for this method was described by Buryak et al. (1991) and Buryak et al. (1994). A recent detailed discussion can be found in Doroshkevich et al. (1996).

First let us give the salient points of the core-sampling method.

The distribution of structure elements along a random straight line is assumed to be Poissonian. Thus, a 1D cluster analysis is utilized to discriminate the structure elements among the sample of points and to find the number and the mean separation of structure elements.
For the following cluster analysis the fields are then all organized into an "equivalent single field" by combining the separate 1D distributions one after the other along a line, with the first point of a field placed on top of the last point of the preceding field.
The dependence of the number and mean separation of structure elements on the diameter of the core allows a rough discrimination between the filament and sheet-like populations of the structure elements and yields the fundamental characteristics of the structure, namely, the surface density of filaments and the linear density of sheets.
The sample of points under consideration is reduced by rejecting poorer and sparser structure elements. In this manner, the mean characteristics can be found as a function of the threshold richness.

Let us emphasize that in practice the one-dimensional analysis is very convenient in many respects. Here and in the numerical simulations we use cylinders around straight lines, but these can easily be replaced by cones for observational surveys. Thus, the core-sampling method can be applied directly to pencil-beam surveys, to almost two-dimensional samples such as the slices of the deep Las Campanas survey, and to genuinely three-dimensional surveys. Moreover, for two- and three-dimensional samples it allows structure parameters to be measured in different directions, for instance both along the line of sight and in the transverse direction. Thus, redshift space distortions can be extracted (Doroshkevich et al. 1996).

The assumption of a functional form for the distribution of structure elements along the core is essential for our method, because it enables transformation of the point distribution into a distribution of structure elements. Once this transformation is accomplished, the appropriateness of the assumed functional form can be tested and the values of the functional parameters can be determined.

As stated above, we assume a 1D Poissonian law for the distribution of structure elements (not galaxies) along the axis of the cylinder. It is a simple distribution, and we shall see that it is also a reasonable assumption. Indeed, the Poissonian distribution arises naturally for some theoretical models (White 1979; Buryak et al. 1991) when the mean separation of structure elements exceeds the correlation length. The validity of this assumption, however, cannot be tested a priori, and a possible difference between the assumed and the actual distributions depends on the sample in question, thus limiting the precision of the final results. This assumption was valid for all cases in previous investigations. Here we test the method with Voronoi tessellations, for which the assumption has to be tested as well.

2.2. Sample preparation

The first step of the core-sampling analysis is the preparation of a sample with suitable parameters. This sample depends on both the radius of the cylinder and the linking length of the culling procedure, and it is the basis for all further analysis.

The diameter of the cylinder must be a few times smaller than the expected size of the cells in order to avoid a masking effect due to possible overlapping of projected elements. On the other hand, the cylinder must be wide enough to allow the radius to be reduced sequentially during further analysis by a factor of at least 2, preferably more, because the radius is the diagnostic parameter that allows one to distinguish between filaments and sheets. The number of particles has to be large enough to guarantee stable and reliable results even for the smallest radius used in the analysis. In practice one observes that this can be achieved if a structure element contains, on average, at least 5 particles, and preferably more.

The final step of the sample preparation is sample reduction, which is performed sequentially during the analysis. Sample reduction provides an additional method of distinguishing between different populations of filaments and sheets according to the mean density or, in other words, according to the multiplicity of the structure elements. It has to be performed before any further steps of analysis.

To perform the sample reduction, a 1D cluster analysis of each field is carried out with decreasing linking length of the culling procedure. All clusters with fewer members than a constant threshold multiplicity are rejected. The number of particles remaining in the sample then depends on the current linking length and the threshold multiplicity. For each linking length a certain fraction f of the total number of particles remains. It is convenient to use the fraction f, together with the threshold multiplicity, as the parameters characterizing the reduced sample.

This procedure reduces the full sample, rejecting more and more points associated with poorer or sparse clumps. Only the tighter clumps are retained for further analysis. This approach emphasizes the local density within the structure elements. It allows the rejection of the low-density haloes from the structure elements and sparse structure elements as a whole.
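
The culling procedure described above can be sketched as follows. This is a minimal illustration only, assuming each field is given as an array of 1D coordinates along the cylinder axis and that a "cluster" is a run of points whose neighbouring gaps do not exceed the linking length. The function names `clusters_1d` and `reduce_sample` and their parameters are hypothetical and are not taken from the original analysis code.

```python
import numpy as np

def clusters_1d(x, linking_length):
    """Split 1D coordinates into clusters (assumed definition: neighbouring
    points separated by at most linking_length share a cluster)."""
    x = np.sort(np.asarray(x, dtype=float))
    if x.size == 0:
        return []
    # Indices at which a gap larger than the linking length starts a new cluster.
    breaks = np.flatnonzero(np.diff(x) > linking_length) + 1
    return np.split(x, breaks)

def reduce_sample(fields, linking_length, min_members):
    """Reject clusters with fewer than min_members points in every field.

    Returns the reduced fields and the retained fraction f of all points.
    """
    total = sum(len(f) for f in fields)
    reduced = []
    for f in fields:
        kept = [c for c in clusters_1d(f, linking_length) if len(c) >= min_members]
        reduced.append(np.concatenate(kept) if kept else np.empty(0))
    retained = sum(len(f) for f in reduced)
    return reduced, (retained / total if total else 0.0)
```

Calling `reduce_sample` repeatedly with decreasing `linking_length` at a fixed `min_members` gives a sequence of reduced samples, each labelled by its retained fraction f.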

2.3. Detection of the mean separation of structure elements

The next step of the analysis is the detection of the mean separation of structure elements. First, the axes of all cylinders are combined in random order into one line by identifying the last point of one axis with the first point of the next. On this line a 1D cluster analysis with increasing linking length R is performed. Assuming a Poisson distribution, the number of clusters as a function of R is given by the relation

For a true Poissonian sample, these values are related to the length of the sample, defined by the nearest and the farthest points in an average field, via the relationship

Thus, the difference between the linear density of structure elements and their mean separation obtained from Eqs. (1) and (2) can be considered a measure of the systematic error due to the deviation of the actual from the assumed (Poissonian) distribution of structure elements along the analyzed cylinder. To decrease this error we use an automatic procedure to find the optimal interval of R for the fit to Eq. (1). The upper limit of this interval is fixed by the input of a minimum number, whereas the lower limit can vary and is defined by the condition

where is a desirable precision of fitting parameters. In general, a reasonable precision () can be achieved in the analysis.

If the desired precision cannot be achieved, the usual reason is that the distribution is far from Poissonian, and the sample under consideration must be changed (e.g., by changing the cylinder radius).
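
The procedure of this subsection can be illustrated with a short sketch. It rests on a standard property of a 1D Poisson process rather than on the exact form of the equations above: the gaps between neighbouring points are exponentially distributed, so the number of clusters found with linking length R falls off as N0 exp(-R/D), where D is the mean separation. The names `concatenate_fields`, `count_clusters` and `fit_poisson_law`, the symbols N0 and D, and the least-squares fit in log space are all assumptions made for illustration. The automatic choice of the fitting interval is not reproduced.

```python
import numpy as np

def concatenate_fields(fields, rng=None):
    """Join 1D fields in random order into one line, placing the first point
    of each field on top of the last point of the preceding field."""
    rng = np.random.default_rng() if rng is None else rng
    line, offset = [], 0.0
    for i in rng.permutation(len(fields)):
        x = np.sort(np.asarray(fields[i], dtype=float))
        if x.size == 0:
            continue
        shifted = x - x[0] + offset
        # The junction point is shared, so drop its duplicate after the first field.
        line.append(shifted if not line else shifted[1:])
        offset = shifted[-1]
    return np.concatenate(line) if line else np.empty(0)

def count_clusters(line, linking_length):
    """Number of 1D clusters: one more than the number of gaps exceeding
    the linking length."""
    line = np.sort(np.asarray(line, dtype=float))
    if line.size == 0:
        return 0
    return int(np.count_nonzero(np.diff(line) > linking_length)) + 1

def fit_poisson_law(line, linking_lengths):
    """Least-squares fit of log N_cl(R) = log N0 - R / D (assumed model).

    Returns the fitted amplitude N0 and mean separation D.
    """
    r = np.asarray(linking_lengths, dtype=float)
    counts = np.array([count_clusters(line, ri) for ri in r], dtype=float)
    slope, intercept = np.polyfit(r, np.log(counts), 1)
    return np.exp(intercept), -1.0 / slope
```

For a sample that is far from Poissonian, log N_cl is not linear in R, and the fitted D changes noticeably with the chosen range of R. This is one simple way to see the systematic error discussed above.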

2.4. Identification of structure elements

The final step of analysis is the discrimination of filaments and sheet-like structure elements and determination of the parameters and for both populations. To this end, we use a simple geometrical model for the structure elements (Buryak et al. 1994). According to this model, in each narrow cylinder the structure can be considered as a system of randomly distributed lines (filaments) and planes (sheets) which contain all points. In this model, the filaments are considered straight lines, and the sheets are considered flat planes. Of course, this approach is limited, and it cannot be used as an accurate description of the true matter distribution on large scales; one can best characterize it as an intermediate step between the local description of the matter distribution with density and velocity fields and the global description obtained by the topology or Minkowski techniques.

As mentioned above, we characterize the random distribution of straight lines (filaments) by their surface density, i.e., the mean number of lines intersecting a unit area of arbitrary orientation. The random distribution of planes can be characterized by the linear density, i.e., the mean number density of planes (sheets) crossing an arbitrary straight line. Equivalently, we can use the values and as typical measures of the mean separation of structure elements. To characterize the structure as a whole we can estimate the mean separation of structure elements (filaments and sheets combined) by , which can be thought of as the diameter of a sphere containing, on average, two structure elements,

The main characteristics of the structure can be obtained by fits to the radial dependence of the linear density of clusters, , and to the radial dependence of the mean separation of structure elements, ,

and

over the range of variation of core radius . Here is the mean depth of field.

Clearly, these parameters also depend on the sample being studied. In our analysis, Eqs. (5) and (6) were fitted by a maximum likelihood technique, and the resulting mean values of and were accepted as the final estimates of the structure parameters. The difference between the estimates from Eqs. (5) and (6) was included in the errors of the final values.
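
The idea behind the radial fit can be sketched under the idealized geometrical model. Planes cross the cylinder axis at a rate that does not depend on the core radius, while the number of straight lines intersecting a thin cylinder grows in proportion to its radius. The linear density of clusters is therefore expected to be roughly linear in the radius, with the intercept reflecting the sheets and the slope the filaments. The sketch below fits such a line by ordinary least squares, which is swapped in here for the maximum likelihood technique used in the analysis. The function name `fit_radial_dependence` is hypothetical, and the geometric factor that converts the slope into a surface density is deliberately left out because it is fixed by Eqs. (5) and (6), which are not reproduced here.

```python
import numpy as np

def fit_radial_dependence(radii, linear_density):
    """Fit linear_density ~ intercept + slope * radius (assumed linear model).

    intercept: radius-independent part, attributed to sheets in the idealized model.
    slope:     part growing with radius, proportional to the filament surface
               density up to a geometric factor not specified here.
    """
    r = np.asarray(radii, dtype=float)
    n = np.asarray(linear_density, dtype=float)
    slope, intercept = np.polyfit(r, n, 1)
    return intercept, slope
```

Repeating the fit over different ranges of the core radius gives a rough check of how stable the two contributions are.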

2.5. Methodological remarks

For the idealized model considered above, the core-sampling method would be expected to reproduce well the characteristics of the sample under consideration. In reality, however, several factors may distort the final results. For example, filaments may be misidentified as sheets if they pass through the center of the core. The noise of randomly distributed particles must also be taken into account. Therefore, a specific technique must be used to obtain stable results.

The analysis of the Las Campanas Redshift Survey (Doroshkevich et al. 1996) has shown that the stability of core-sampling with respect to various accidental or systematic variations of density is very high. During the sequential random rejection of particles, the final estimates of mean separation of structure elements increased only as . In general, the precision and stability of the core-sampling method depend on the density contrast in structure elements relative to the mean number density of points. In the case of a pure Poissonian point distribution the "structure" parameters have been found with errors of about 50% (Buryak et al. 1994). Similar errors were found for the Soneira-Peebles model. In contrast, for the observed Las Campanas Redshift Survey the errors were found to be only about 10%.

Sample preparation.
During the first step of the analysis, when the sample is prepared, one has to check whether the resulting 1D cluster distribution is Poissonian as assumed. This requirement has to be ensured by varying the radius of the cylinder and the range of the fit in Eq. (1). During the further analysis, the range of core radii used for the fit of Eqs. (5) and (6) is important for bringing the two values into agreement. These questions have to be settled before the final analysis. This means that during each of the intermediate steps leading to Figs. 1 to 3, the validity of all assumptions must be checked. Otherwise, the final result will be meaningless.
The surface density of filaments.
The richness of a filament in the core depends both on the actual properties of the point distribution analyzed and the geometry, i.e. the position and orientation of the filament relative to the core. Only the first piece of information is of interest; the geometry can be characterized analytically and used to improve the stability of the method.
The surface density of filaments depends on the fraction of particles retained in the sample and the multiplicity threshold. The main characteristic of filamentary structure is the full surface density of filaments, . However, at this point the noise is usually high and it is desirable to improve the estimate of this value.
Based on the geometrical model described above, one can calculate the distribution function for the length of intersection of identical filaments with the core. Thus, calculating the dependence of both the fraction f and the surface density on this length, we find in linear approximation

This relation estimates the fraction of poor filaments for the ideal case without noise. The linear fit of Eq. (7) over some range of f also allows one to improve the estimate of for a moderately noisy sample. However, in the case of very noisy samples with a significant fraction of filaments consisting of only one point, it can only be used for small linking lengths of the culling procedure. Indeed, if the main part of noise particles will be rejected (see Fig. 8, right). Evidently, for smaller values of f, the function resembles the actual properties of the filaments (see, e.g., Doroshkevich et al. 1996).
The linear density of sheets.
The richness of the elements of the sheet-like population, both in observational data and in simulations, is usually quite high. Thus, the linear density of sheets can be estimated for a wide range of the fraction f and the threshold multiplicity, providing more reliable results. However, even in this case the final results are sometimes distorted because during reduction the fields are shortened as a result of the appearance of empty edges. To avoid these distortions, a correction procedure (see Doroshkevich et al. 1996) was used.
The next problem is a filament population masquerading as sheets. These apparent sheets are detected by the program if filaments cross the cylinder near its axis. Obviously, this effect will be proportional to the thickness of the filaments. The surface density of these apparent sheets is , where is the radius of the filament (Buryak et al. 1994). Usually the influence of this effect is negligible, but it can become important for thick filaments. Moreover, apparent sheets can be generated by the branch points (knots) of filamentary structure or other clumps in the point distribution if their size is comparable to the diameter of the core. The contribution of these knots can be estimated as , where and are respectively the volume density and the effective radius of the clump. For a random network structure, the volume density of the knots is approximately .
