Up: Density estimation with

5. Conclusions

In this paper we have studied the efficiency of three recent density estimators, namely the adaptive kernel method introduced by Pisani (1993), the maximum penalized likelihood (MPL) method described by Merritt & Tremblay (1994), and our own wavelet-based technique. Wavelets have already been used to recover density estimates from a discrete data set (Pinheiro & Vidakovic 1995), but with a thresholding strategy involving the average energy of the wavelet coefficients at a given scale. Here the thresholding is defined with respect to the local information content, which enables us to obtain a better estimate from the statistical point of view. Several dedicated examples were used to compare these methods by means of extensive numerical simulations. These tests were chosen to cover several cases of astronomical interest (cluster identification, subclustering quantification, detection of voids, etc.).

Both experimental and "noiseless" simulations indicate that the kernel and the wavelet methods can be used with reliable results in most cases. Nevertheless, it appears that the best solution is always provided by the wavelet-based estimate when few data points are available. The situation is more intricate when the number of points is large. Whereas the adaptive kernel estimator fails to clearly detect a small broad structure superimposed on a larger one, it can yield better results for separating two close, similar structures. As regards void detection, the wavelet estimate gives more reliable results, but exhibits wider tails and higher spurious bumps on both sides of the underdensity.

Properly accounting for the genuine voids in the experimental distribution appears to be the main reason for the differences between the two approaches. The kernel method associates a smoothing function with each data point, and the information coming from gaps in the data is not explicitly used for recovering the density function. In contrast, the wavelet transform is able to detect overdensities and underdensities in the same way. This approach is therefore more efficient in analyzing data sets where both kinds of highly contrasted features occur, which is especially the case in poor samples. When the contrast is reduced owing to an increase in the number of data points, both methods give similar estimates.
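To make concrete the statement that a kernel estimator attaches a smoothing function to each data point, the following is a minimal sketch of a generic adaptive Gaussian kernel estimator in one dimension. It follows the common textbook recipe (a fixed-bandwidth pilot estimate, then local bandwidths scaled by the pilot density relative to its geometric mean) and is not claimed to reproduce the exact procedure of Pisani (1993). The function names, the pilot bandwidth `h`, and the sensitivity parameter `alpha` are illustrative assumptions.

```python
import numpy as np


def fixed_kernel_estimate(x_eval, data, h):
    """Fixed-bandwidth Gaussian kernel estimate evaluated at x_eval."""
    x_eval = np.asarray(x_eval, dtype=float)
    data = np.asarray(data, dtype=float)
    # One Gaussian of width h centred on every data point.
    u = (x_eval[:, None] - data[None, :]) / h
    kernels = np.exp(-0.5 * u**2) / (h * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)


def adaptive_kernel_estimate(x_eval, data, h, alpha=0.5):
    """Adaptive Gaussian kernel estimate (generic sketch, assumed recipe).

    h     : pilot bandwidth (assumed to be chosen by the user)
    alpha : sensitivity of the local bandwidths to the pilot density
    """
    x_eval = np.asarray(x_eval, dtype=float)
    data = np.asarray(data, dtype=float)
    # Pilot density at the data points themselves.
    pilot = fixed_kernel_estimate(data, data, h)
    # Geometric mean of the pilot values normalises the local factors.
    g = np.exp(np.mean(np.log(pilot)))
    # Wider kernels where the pilot density is low, narrower where it is high.
    h_local = h * (pilot / g) ** (-alpha)
    u = (x_eval[:, None] - data[None, :]) / h_local[None, :]
    kernels = np.exp(-0.5 * u**2) / (h_local[None, :] * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)
```

In this sketch the estimate inside a gap between two groups of points is built only from the tails of the kernels centred on the neighbouring data; nothing in the construction uses the absence of points directly, which is the behaviour described above.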

The MPL method performs as well as the kernel- and wavelet-based approaches, as indicated by the "noiseless" simulations. Its results appear to be somewhat intermediate between those obtained by means of the other two methods. However, it suffers strongly from the computational cost of the minimization algorithm adopted, which prevents its use for large data sets.

The three methods were applied to two redshift catalogues of galaxies which had already been used to check the efficiency of the kernel method and of another wavelet approach, respectively. The bimodality of the A3526 galaxy cluster is displayed by all the methods, as well as the existence of a background group of galaxies. Both results confirm the previous claims. A redshift sample from a survey of the Corona Borealis region was also analyzed. There too, all the estimates are consistent, mainly indicating a more intricate bimodality than in the A3526 sample. When compared to the alternative wavelet-based algorithm proposed by Pinheiro & Vidakovic (1995), our solutions indicate that the wavelet approach we have developed performs better from the point of view of density estimation.

In conclusion, taking into account the computational inefficiencies of the MPL method, both the kernel and wavelet methods can be used to obtain reliable estimates of the underlying density related to discrete data samples. Wavelet solutions are to be preferred in searching for subclustering, especially in the case of few data points. Kernel estimates are more robust and perhaps easier to implement. Hence, the kernel approach appears to be very useful for arriving at reliable solutions, if it does not matter that some small-scale details may not be detected. However, only the wavelet approach enables one to naturally decompose the restored density function in terms of single structures. Such a decomposition is one of the main goals to be achieved for a deeper understanding of the dynamical status of galaxy clusters.

Acknowledgements

We are grateful to Frederic Rué for many stimulating discussions about the subtleties of the wavelet restoration algorithm. F.D. wishes to thank the Observatoire de la Côte d'Azur for its kind hospitality and Prof. F. Mardirossian for his friendly support.



Copyright by the European Southern Observatory (ESO)