Up: The use of minimal distribution



2 The MST: Theory and calibration

 

2.1 Theory

 The MST is a geometrical construction taken from graph theory; the definitions used here are given in Dussert (1988). Very briefly, it is a tree joining all the points of a given set, without loops and with minimal total length; each point is visited by the tree only once. The key property here is the uniqueness of such a construction. For a given set of points, there may be more than one MST, but the histogram H of the MST edge lengths is unique. This is fundamental because it is then possible to completely characterize a set of points with H.
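As an illustration of this uniqueness property, the following minimal sketch builds a Euclidean MST and returns its sorted edge lengths. It assumes Prim's algorithm in its simple $O(N^2)$ form; the text does not name the algorithm actually used, and the function name `mst_edge_lengths` is hypothetical.

```python
import math


def mst_edge_lengths(points):
    """Return the sorted N-1 edge lengths of a Euclidean MST.

    Assumption: plain O(N^2) Prim's algorithm, chosen for clarity only.
    `points` is a list of (x, y) tuples.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    # best[j] = shortest distance found so far from point j to the tree
    best = [math.inf] * n
    best[0] = 0.0
    lengths = []
    for step in range(n):
        # Select the point outside the tree that is closest to it.
        i = min((j for j in range(n) if not in_tree[j]), key=best.__getitem__)
        in_tree[i] = True
        if step > 0:
            lengths.append(best[i])  # length of the edge that attached point i
        xi, yi = points[i]
        # Update the distances of the remaining points to the enlarged tree.
        for j in range(n):
            if not in_tree[j]:
                d = math.hypot(points[j][0] - xi, points[j][1] - yi)
                if d < best[j]:
                    best[j] = d
    return sorted(lengths)


# The four corners of a unit square admit four different MSTs
# (any three of the four sides), but all share the same edge lengths.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(mst_edge_lengths(square))
```

Binning the returned lengths gives the histogram H; because the list does not depend on which MST is found, neither does H.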

Traditionally, only the first two moments of H, the mean m and the dispersion $\sigma $, are used. These two parameters are sufficient to describe a Gaussian distribution. To characterize non-Gaussian distributions, we have to use higher-order moments such as the skewness s and/or the kurtosis c. We test the use of these parameters in the following. Below, we describe the methodology:

This algorithm is designed to construct the MST of a given set of points as fast as possible. We normalize the lengths of the MST edges following the Beardwood et al. (1959) study. For a random set of N points in an area S, the total length of the MST scales as $\sqrt{S\times N}$, so a good approximation of the mean length of its $N-1$ edges is $\frac{\sqrt{S\times N}}{N-1}$. We therefore divide all the lengths by this factor (where S is the area of the maximum rectangle of the point set). Finally, we calculate the mean m, the dispersion $\sigma $, the skewness s and the kurtosis c of H.
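The normalization and the four statistics can be sketched as follows. This is only an illustration under stated assumptions: the moments are computed with the population ($1/k$) convention, the skewness is the standardized third moment, and the kurtosis is taken as the excess kurtosis (standardized fourth moment minus 3). The text does not specify these conventions, and the function name `normalized_moments` is hypothetical.

```python
import math


def normalized_moments(edge_lengths, n_points, area):
    """Return (m, sigma, s, c) of the normalized MST edge lengths.

    Each length is divided by sqrt(S * N) / (N - 1), as in the text.
    Assumptions: population moments, excess kurtosis (minus 3).
    """
    scale = math.sqrt(area * n_points) / (n_points - 1)
    x = [length / scale for length in edge_lengths]
    k = len(x)
    m = sum(x) / k
    sigma = math.sqrt(sum((v - m) ** 2 for v in x) / k)
    if sigma == 0.0:
        # All edges equal: skewness and kurtosis are undefined.
        return m, 0.0, float("nan"), float("nan")
    s = sum((v - m) ** 3 for v in x) / (k * sigma ** 3)
    c = sum((v - m) ** 4 for v in x) / (k * sigma ** 4) - 3.0
    return m, sigma, s, c


# Toy input: three edges of a 4-point tree in an area S = 9,
# so the normalization factor is sqrt(9 * 4) / 3 = 2.
m, sigma, s, c = normalized_moments([1.0, 2.0, 3.0], n_points=4, area=9.0)
print(m, sigma, s, c)
```

In practice `area` would be computed from the extent of the point set; taking it as the area of the bounding rectangle is one possible reading of the "maximum rectangle" above.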

2.2 Calibrations of the method

  We test our algorithm using simulations. We calculate m, $\sigma $, s, c for different sets of simulated points. The points are generated in 500$\times $500 boxes. Since we normalize the distances, the unit of the box size is not important.

In order to characterize the degree of cuspiness of the distributions, we use three kinds of 2D density profiles: points randomly distributed (Poisson distribution), points distributed with a centered King profile (flat profile in the center) and points distributed with a centered NFW profile (cusped profile in the center: Navarro et al. 1995). We note here that the NFW expression was derived for a 3D distribution. Applying it to a 2D set of points generates a more cusped profile than the original 3D NFW. However, we will speak of ``NFW profiles'' hereafter. The way we generate sets of points with a given profile is described in Adami (1998) and is related to the techniques described in Press et al. (1992). If $\rho $ is the density and r the radius, we have:

\begin{displaymath}
\rho _{\rm 2D\_King}(r)=\frac 1{1+(\frac r{r_{\rm c}})^2}\end{displaymath}

and

\begin{displaymath}
\rho _{\rm 2D\_NFW}(r)=\left(\frac 1{\frac r{r_{\rm c}}\left(1+\frac r{r_{\rm c}}\right)^2}\right)^{2/3}.\end{displaymath}

Hereafter, we call $r_{\rm c}$ the characteristic radius of a given profile. For the King profile, it is the core radius; for the NFW profile, which has no core, it is simply a characteristic radius. We simulate 8 sets of random distributions: with 10, 25, 60, 125, 250, 500, 750 and 1000 points.
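One possible way of drawing points that follow these two profiles is sketched below. It is not the procedure of Adami (1998), which is not reproduced here; it assumes that $\rho $ is a surface density (so that the number of points in an annulus is proportional to $r\,\rho (r)\,dr$), that the profile is truncated at a radius `r_max` chosen so the disk fits in the 500$\times $500 box, and that radii are drawn by inverting a tabulated cumulative distribution. All function and parameter names are hypothetical.

```python
import bisect
import math
import random


def king_2d(r, r_c):
    """2D King profile as written in the text."""
    return 1.0 / (1.0 + (r / r_c) ** 2)


def nfw_2d(r, r_c):
    """2D 'NFW' profile as written in the text (diverges at r = 0)."""
    x = r / r_c
    return (1.0 / (x * (1.0 + x) ** 2)) ** (2.0 / 3.0)


def sample_profile(profile, r_c, n_points, r_max=250.0,
                   center=(250.0, 250.0), n_grid=2000, rng=None):
    """Draw n_points (x, y) positions following a centered radial profile.

    Assumptions: rho is a surface density, truncated at r_max; radii are
    drawn from a tabulated cumulative distribution of r * rho(r).
    """
    rng = rng or random.Random()
    dr = r_max / n_grid
    # Bin midpoints avoid r = 0, where the NFW expression diverges.
    cdf = []
    total = 0.0
    for i in range(n_grid):
        r_mid = (i + 0.5) * dr
        total += r_mid * profile(r_mid, r_c) * dr
        cdf.append(total)
    points = []
    for _ in range(n_points):
        u = rng.random() * total
        i = min(bisect.bisect_left(cdf, u), n_grid - 1)
        r = (i + rng.random()) * dr          # uniform jitter inside the bin
        theta = rng.uniform(0.0, 2.0 * math.pi)
        points.append((center[0] + r * math.cos(theta),
                       center[1] + r * math.sin(theta)))
    return points


# Example: 250 points with a King profile of core radius 100.
king_points = sample_profile(king_2d, 100.0, 250, rng=random.Random(1))
print(len(king_points))
```

Passing a seeded `random.Random` instance makes a realization reproducible, which is convenient when many realizations of the same configuration are compared.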

For each set of points with a given profile and a given size, we perform 100 realizations and hence 100 calculations of m, $\sigma $, s, c. From these data, we compute the mean value and the dispersion of each parameter m, $\sigma $, s or c.

2.2.1 Poisson distributions

  We plot in Fig. 1 (m, $\sigma $) and (s, c) for a Poisson distribution.

  
\begin{figure}
\includegraphics[angle=-90,width=8.8cm,clip]{fig1.ps}\end{figure} Figure 1: Variation with the size N of the sample of: top: (m, $\sigma $), m is the upper line and $\sigma $ is the lower line and bottom: (s, c), s is the line with error bars and c is the line without error bars

The parameters m and $\sigma $ are asymptotically equal to 0.66 $\pm $ 0.02 and 0.31 $\pm $ 0.02, in perfect agreement with Dussert (1988). The error bars are 3% of the mean value. The final value is reached when the number N of points in the simulation is greater than 125.

The skewness s is well defined for $N\geq 250$ with a final value of 0.29 $\pm $ 0.14. The errors are greater: about 50% of the mean value.

The kurtosis c is well defined only for $N \geq 750$, with very large error bars ($\sim $100% of the mean value). The final value is 0.33.

2.2.2 King and NFW profiles

 
  
\begin{figure}
\includegraphics[angle=-90,width=17cm,clip]{fig2.ps}\end{figure} Figure 2: Variation with the characteristic radius of the King distribution of: upper left: (m, $\sigma $) for 250 points (the increasing line is m and the decreasing line is $\sigma $), upper right: (s, c) for 250 points (s is the line with error bars and c is the line without error bars), lower left: (m, $\sigma $) for 500 points (the increasing line is m and the decreasing line is $\sigma $), lower right: (s, c) for 500 points (s is the line with error bars and c is the line without error bars)

We calculate (m, $\sigma $, s, c) for different characteristic radii $r_{\rm c}$ of King and NFW profiles. We use $r_{\rm c}=50$, 75, 100, 125, 150, 175, 200, 225, 250, 275 and 300 kpc. For a given profile and a given characteristic radius, we simulate 2 sets of points: 250 and 500. We plot the variation of (m, $\sigma $) and (s, c) with $r_{\rm c}$ for these 2 sets of points in Fig. 2 for the King profiles. The parameters vary significantly with $r_{\rm c}$ at the 3 $\sigma $ level. The size of the errors is similar to that for the Poisson case: very small for (m, $\sigma $), moderate for s and very large for c. The parameters (m, $\sigma $) are not significantly different from the Poisson case for large characteristic radii ($\geq $225 kpc). The skewness is significantly different at the 1 $\sigma $ level from the Poisson case whatever the characteristic radius. The mean value of the kurtosis is also different, but not significantly because of the large error bars.

  
\begin{figure}
\includegraphics[angle=-90,width=17cm,clip]{fig3.ps}\end{figure} Figure 3: Variation with the characteristic radius of the NFW distribution of: upper left: (m, $\sigma $) for 250 points (the increasing line is m and the decreasing line is $\sigma $), upper right: (s, c) for 250 points (s is the line with error bars and c is the line without error bars), lower left: (m, $\sigma $) for 500 points (the increasing line is m and the decreasing line is $\sigma $), lower right: (s, c) for 500 points (s is the line with error bars and c is the line without error bars)

In Fig. 3 we plot the variations for the NFW profiles; the trends are very different. All the parameters (m, $\sigma $, s, c) differ significantly at the 1 $\sigma $ level from the Poisson case. We also notice a strong degeneracy between m and $\sigma $.

2.2.3 Discrimination between the 3 profiles

 We want to determine a parameter based on m, $\sigma $, s and c which is able to discriminate between the three profiles. We test the distance in an n-dimensional space, with n=2 if we use (m, $\sigma $), n=3 if we use (m, $\sigma $, s) and n=4 if we use (m, $\sigma $, s, c). More generally, the distance in a space of n dimensions between $(p_1, p_2, ..., p_n)$ and $(q_1, q_2, ..., q_n)$ is

\begin{displaymath}
\Delta =\sqrt{\sum_{i=1}^n(p_i-q_i)^2} .\end{displaymath}

The error on such a distance is obtained by differentiation:

\begin{displaymath}
d\Delta =\frac{\sum_{i=1}^n(p_i-q_i)(dp_i+dq_i)}\Delta \end{displaymath}

where $dp_i$ and $dq_i$ are the errors on $p_i$ and $q_i$.

Therefore, we define three distances: $\Delta _{m,\sigma }$, $\Delta _{m,\sigma ,s}$, and $\Delta _{m,\sigma ,s,c}$.
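The distance and its error can be evaluated with a few lines of code. The sketch below follows the two expressions above, with one deviation that is an assumption on our part: the differences $p_i-q_i$ enter the error through their absolute values, so that the contributions of the individual parameters cannot cancel. The function name `distance_with_error` is hypothetical.

```python
import math


def distance_with_error(p, q, dp, dq):
    """Return (Delta, dDelta) between parameter vectors p and q.

    p, q   : sequences of parameters, e.g. (m, sigma, s)
    dp, dq : sequences of the (positive) errors on p and q
    Assumption: |p_i - q_i| is used in the error so that terms add up.
    """
    delta = math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))
    if delta == 0.0:
        # The error formula divides by Delta and is undefined here.
        return 0.0, float("nan")
    err = sum(abs(pi - qi) * (dpi + dqi)
              for pi, qi, dpi, dqi in zip(p, q, dp, dq)) / delta
    return delta, err


# Toy example in n = 2 dimensions: Delta = 5, dDelta = (3*0.2 + 4*0.2) / 5.
delta, err = distance_with_error((3.0, 0.0), (0.0, 4.0), (0.1, 0.1), (0.1, 0.1))
print(delta, err)
```

The same function covers the three distances: passing vectors of length 2, 3 or 4 gives $\Delta _{m,\sigma }$, $\Delta _{m,\sigma ,s}$ or $\Delta _{m,\sigma ,s,c}$ respectively.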

We calculate these distances between the Poisson distribution and the King and NFW profiles for 3 different sets of points (50, 125 and 500) and for all characteristic radii. We plot these distances (with errors) in Figs. 4, 5 and 6 as a function of the characteristic radius. We show the null distance as a solid line and we plot the errors on the determination of the parameters of the Poisson distributions.

  
\begin{figure}
\includegraphics[angle=-90,width=17cm,clip]{fig4.ps}\end{figure} Figure 4: Variation with the characteristic radius of the 3 tested distances between the Poisson and the King distribution (crosses) and the Poisson and the NFW distribution (circles). We have 50 points in the samples. We plot the error bar on each point and we symbolize the error on the parameters of the Poisson distribution with the two horizontal dashed lines. The horizontal solid line symbolizes the null distance to the Poisson distribution. The left part of the figure is for the distance $\Delta _{m,\sigma }$, the lower right part is for $\Delta _{m,\sigma ,s}$ and the upper right part is for $\Delta _{m,\sigma ,s,c}$

  
\begin{figure}
\includegraphics[angle=-90,width=17cm,clip]{fig5.ps}\end{figure} Figure 5: Variation with the characteristic radius of the 3 tested distances between the Poisson and the King distribution (crosses) and the Poisson and the NFW distribution (circles). We have 125 points in the samples. We plot the error bar on each point and we symbolize the error on the parameters of the Poisson distribution with the two horizontal dashed lines. The horizontal solid line symbolizes the null distance to the Poisson distribution. The left part of the figure is for the distance $\Delta _{m,\sigma }$, the lower right part is for $\Delta _{m,\sigma ,s}$ and the upper right part is for $\Delta _{m,\sigma ,s,c}$

  
\begin{figure}
\includegraphics[angle=-90,width=17cm,clip]{fig6.ps}\end{figure} Figure 6: Variation with the characteristic radius of the 3 tested distances between the Poisson and the King distribution (crosses) and the Poisson and the NFW distribution (circles). We have 500 points in the samples. We plot the error bar on each point and we symbolize the error on the parameters of the Poisson distribution with the two horizontal dashed lines. The horizontal solid line symbolizes the null distance to the Poisson distribution. The left part of the figure is for the distance $\Delta _{m,\sigma }$, the lower right part is for $\Delta _{m,\sigma ,s}$ and the upper right part is for $\Delta _{m,\sigma ,s,c}$

From these figures, we notice the following:

We therefore choose $\Delta _{m,\sigma ,s}$ to discriminate between the degrees of aggregation of sets of points. The limiting factor is that the number of objects must be greater than 125. We note here that, whatever the distance used, we are not able to discriminate between different characteristic radii for a given profile, but this is not the goal of this work. The NFW profiles are more distant from the Poisson distributions than the King ones (whatever the characteristic radius). Given the cusped shape of the NFW profiles and the non-cusped shape of the King profiles, we can say that sets of points (with a given number of points) with a large distance $\Delta _{m,\sigma ,s}$ from a Poisson distribution with the same number of points are more concentrated than those with a small $\Delta _{m,\sigma ,s}$.



Copyright The European Southern Observatory (ESO)