

5 The Edgeworth asymptotic expansion

 A random variable X can be normalized to unit variance by dividing by its standard deviation $\sigma$. The Edgeworth expansion is a true asymptotic expansion of the PDF of the normalized variable $X/\sigma$ in powers of the parameter $\sigma$, whereas the Gram-Charlier series is not. This difference between the Gram-Charlier series and the Edgeworth expansion was pointed out by Juszkiewicz et al. (1995), and in independent work by Bernardeau & Kofman (1995), followed by Amendola (1994) and Colombi (1994).

Juszkiewicz et al. (1995) and Bernardeau & Kofman (1995) use two or three terms of the Edgeworth expansion, derived e.g. in Cramér (1957). We note that the full explicit expansion for arbitrary order s was obtained already by Petrov (1962); see also Petrov (1972, 1987).

Petrov derived a powerful generalization of the Edgeworth expansion for a sum of random variables. In this section we give a simplified derivation of the Edgeworth series, following Petrov (1972). This derivation, for an arbitrary order, is somewhat simpler than the derivation given for example by Bernardeau & Kofman (1995) for the third order only of the Edgeworth expansion.

The characteristic function $\Phi(t)$ of a random variable X is the expectation ${\sf E}\exp(itX)$ as a function of t,  
 \begin{displaymath}
\Phi(t) \equiv \int_{-\infty}^{\infty} {\rm e}^{itx} {\rm d}F(x) \; ,
 \end{displaymath} (25)
that is, the Fourier transform of p(x) if the probability density $p(x)={{\rm d}F(x)/{\rm d}x}$ exists. The definition (25) implies that if the moment $\alpha_k$ (2) of X exists,  
 \begin{displaymath}
\Phi^{(k)}(0)=i^k\alpha_k \; .
 \end{displaymath} (26)
Hence the Taylor series for $\Phi(t)$ is given by  
 \begin{displaymath}
\Phi(t) \sim 1 +
 \sum_{k=1}^{\infty} {\alpha_k \over k!} (it)^k\; .
 \end{displaymath} (27)
A similar series for $\ln \Phi(t)$,  
 \begin{displaymath}
\ln \Phi(t) \sim
 \sum_{n=1}^{\infty} {\kappa_n \over n!} (it)^n\; ,
 \end{displaymath} (28)
involves cumulants (semi-invariants) $\kappa_n$ defined by  
 \begin{displaymath}
\kappa_n \equiv {1\over i^n}
 \Bigl[{ {\rm d}^n \over {\rm d}t^n} \ln\Phi(t) \Bigr]_{t=0} \; .
 \end{displaymath} (29)
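For orientation, consider the Gaussian case: if X is normal with mean $\mu$ and variance $\sigma^2$, then $\Phi(t)=\exp(i\mu t-\sigma^2 t^2/2)$, so that  
 \begin{displaymath}
\ln \Phi(t) = i\mu t - {\sigma^2 t^2 \over 2} \; ,
 \end{displaymath}
and (29) gives $\kappa_1=\mu$, $\kappa_2=\sigma^2$ and $\kappa_n=0$ for all $n\ge 3$. The higher cumulants thus measure the deviation from Gaussianity, which is why the cumulant series is the natural starting point for nearly Gaussian PDFs.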

In Appendix A we prove a fundamental lemma of calculus (Faà di Bruno's formula) for the n-th derivative of a composite function $f\!\circ\! g(x) \equiv f(g(x))$, which reads
   \begin{eqnarray}
\lefteqn{{{\rm d}^n \over {\rm d}x^n} f(g(x)) =} \nonumber \\
 & & n! \sum_{\{k_m\}} f^{(r)}(g(x)) \prod_{m=1}^n {1 \over k_m!}
 \left({1 \over m!} g^{(m)}(x)\right)^{k_m} \; ,
 \end{eqnarray}
(30)
where $r=k_1+k_2+\ldots+k_n$ and the set $\{k_m\}$ consists of all non-negative integer solutions of the Diophantine equation  
 \begin{displaymath}
k_1+2k_2+\ldots+n k_n=n \; .
 \end{displaymath} (31)

Using (30), in Appendix B we derive a useful relation (Petrov 1987) between the cumulants $\kappa_n$ and the moments $\alpha_k$ of a PDF,  
 \begin{displaymath}
\kappa_n =
 n! \sum_{\{k_m\}}(-1)^{r-1}(r-1)!
 \prod_{m=1}^n {1 \over k_m!}
 \left({\alpha_m \over m!} \right)^{k_m} \; .
 \end{displaymath} (32)
Here the summation extends over all non-negative integers $\{k_m\}$ satisfying (31), and $r=k_1+k_2+\ldots+k_n$. We describe a simple algorithm for obtaining all solutions of Eq. (31) in Appendix C.
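The algorithm of Appendix C is not reproduced here; the following is a minimal sketch (our own, with our own function names) that enumerates the solutions of (31) by simple recursion and then evaluates the cumulant-moment relation (32):

```python
from math import factorial

def solutions(n):
    """Yield all non-negative integer tuples (k_1, ..., k_n) solving
    the Diophantine equation k_1 + 2*k_2 + ... + n*k_n = n (Eq. 31)."""
    def rec(m, rem):
        # enumerate (k_1, ..., k_m) with k_1 + 2*k_2 + ... + m*k_m = rem
        if m == 0:
            if rem == 0:
                yield ()
            return
        for k in range(rem // m + 1):
            for head in rec(m - 1, rem - m * k):
                yield head + (k,)
    return rec(n, n)

def cumulant(n, alpha):
    """kappa_n from the raw moments alpha[0..n] (alpha[0] = 1) via Eq. (32)."""
    total = 0.0
    for ks in solutions(n):
        r = sum(ks)
        term = (-1.0) ** (r - 1) * factorial(r - 1)
        for m, k in enumerate(ks, start=1):
            term *= (alpha[m] / factorial(m)) ** k / factorial(k)
        total += term
    return factorial(n) * total
```

For the standard normal moments $\alpha$ = (1, 0, 1, 0, 3) this gives $\kappa_2=1$ and $\kappa_3=\kappa_4=0$; for the unit exponential distribution ($\alpha_n=n!$) it gives $\kappa_n=(n-1)!$.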

Now we are ready to begin the derivation of the Edgeworth expansion. Consider a random variable X with ${\sf E}X=0$ (this can always be achieved by an appropriate choice of origin), and let X have dispersion $\sigma^2$. If X has the characteristic function $\Phi(t)$, then the normalized random variable $X/\sigma$ has the characteristic function $\varphi(t)=\Phi(t/\sigma)$. Therefore we have from Eqs. (28) and (29) that  
 \begin{displaymath}
\ln \varphi(t)=\ln \Phi(t/\sigma) \sim
 \sum_{n=2}^{\infty} {\kappa_n \over \sigma^n n!} (it)^n\; .
 \end{displaymath} (33)
Here the sum starts at n=2 because ${\sf E}X=0$. Moreover, since $\kappa_2=\sigma^2$ (see Eq. (32)) we obtain  
 \begin{displaymath}
\varphi(t) \sim {\rm e}^{-t^2/2} \exp \left\{
 \sum_{n=3}^{\infty} { S_n \sigma^{n-2} \over n!} (it)^n
 \right\} \; ,
 \end{displaymath} (34)
with  
 \begin{displaymath}
S_n \equiv \kappa_n / \sigma^{2n-2} \; .
 \end{displaymath} (35)
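In particular, $S_3\sigma = \kappa_3/\sigma^3$ and $S_4\sigma^2 = \kappa_4/\sigma^4$ are the usual skewness and kurtosis excess, so the combinations $S_n\sigma^{n-2}$ appearing in (34) are just the normalized cumulants of the distribution, which are small for a nearly Gaussian PDF.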
Let us write the exponential function in (34) as a formal series in powers of $\sigma$,  
 \begin{displaymath}
\exp \left\{
 \sum_{r=1}^{\infty} { S_{r+2} \sigma^r \over (r+2)!} (it)^{r+2}
 \right\} \sim 1 + \sum_{s=1}^\infty
 {\cal P}_s (it)\,\sigma^s \; ,
 \end{displaymath} (36)
where the coefficient of the power $\sigma^s$ is a function ${\cal P}_s(it)$. Now, using $g(x) \equiv \sum_{r=1}^{\infty}\{S_{r+2}(it)^{r+2} x^r /(r+2)!\}$ and $f\equiv \exp$ in (30), we find that
   \begin{eqnarray}
\lefteqn{ {\cal P}_s(it) \equiv { 1 \over s!}
 \left[ {{\rm d}^s \over {\rm d}x^s} \exp \left\{
 \sum_{r=1}^{\infty} { S_{r+2} (it)^{r+2} \over (r+2)!}\, x^r
 \right\} \right]_{x=0} } \nonumber \\
 & & = \sum_{\{k_m\}} \prod_{m=1}^s {1 \over k_m!}
 \left({S_{m+2} (it)^{m+2} \over (m+2)!}\right)^{k_m} \; ,
 \end{eqnarray}
(37)
where the summation extends again over all non-negative integers $\{k_m\}$ satisfying (31) (with n=s). Thus the function ${\cal P}_s$ is just a polynomial in (it).
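Written out explicitly from (37), the two lowest-order polynomials are  
 \begin{displaymath}
{\cal P}_1(it) = {S_3 \over 3!}\,(it)^3 \; , \qquad
{\cal P}_2(it) = {S_4 \over 4!}\,(it)^4 + {S_3^2 \over 2!\,(3!)^2}\,(it)^6 \; ,
 \end{displaymath}
corresponding to the solutions $(k_1)=(1)$ of $k_1=1$ and $(k_1,k_2)\in\{(0,1),(2,0)\}$ of $k_1+2k_2=2$, respectively.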

Suppose that the probability density p(x) of a random variable X exists. Then the PDF for $X/\sigma$ is $q(x) \equiv \sigma p(\sigma x)$, and it is the inverse Fourier transform of the characteristic function $\varphi$:  
 \begin{displaymath}
q(x) =
 {1\over 2\pi} \int_{-\infty}^{\infty} {\rm e}^{-itx}\varphi(t) {\rm
 d}t \; .
 \end{displaymath} (38)
If $\Phi(t)$ is the Fourier transform of a function p(x), then $(-it)^n\Phi(t)$ is the transform of the n-th derivative of p(x),  
 \begin{displaymath}
{{\rm d}^n \over {\rm d}x^n} p(x) = {1\over 2\pi}
 \int_{-\infty}^{\infty} {\rm e}^{-itx}(-it)^n \Phi(t) {\rm d}t \;
 .
 \end{displaymath} (39)
The Fourier transform of the Gaussian distribution Z(x) in (5) is $\exp(-t^2/2)$, see e.g. Bateman & Erdélyi (1954). Therefore each $(it)^n$, multiplied by $\exp(-t^2/2)$ in the expansion of $\varphi$ (see Eqs. (34) to (37)), generates (according to Eq. 38) the n-th derivative of Z(x),  
 \begin{displaymath}
(-1)^n {{\rm d}^n \over {\rm d}x^n} Z(x) = {1\over 2\pi}
 \int_{-\infty}^{\infty} {\rm e}^{-itx}(it)^n \exp(-t^2/2) {\rm d}t
 \; ,
 \end{displaymath} (40)
in the corresponding expansion for q(x),
   \begin{eqnarray}
\lefteqn{ q(x)= Z(x) + \sum_{s=1}^\infty \sigma^s } \nonumber \\
 & & \times \sum_{\{k_m\}} \left\{ \prod_{m=1}^s {1 \over k_m!}
 \left({S_{m+2} \over (m+2)!}\right)^{k_m}
 \left(- {{\rm d}^{m+2} \over {\rm d}x^{m+2}} \right)^{k_m}\! Z(x) \right\}.
 \end{eqnarray}
(41)
Here the set $\{k_m\}$ in the sum consists of all non-negative integer solutions of the equation  
 \begin{displaymath}
k_1+2k_2+\ldots+s k_s=s \; .
 \end{displaymath} (42)
Using (10) and $r=k_1+k_2+\ldots+k_s$ we can rewrite (41) in terms of the Chebyshev-Hermite polynomials:
   \begin{eqnarray}
\lefteqn{ q(x) = \sigma p(\sigma x) = Z(x)
 \left\{ 1 + \sum_{s=1}^\infty \sigma^s \sum_{\{k_m\}}
 He_{s+2r}(x) \right. } \nonumber \\
 & & \left. \times \prod_{m=1}^s {1 \over k_m!} \left({S_{m+2} \over (m+2)!}
 \right)^{k_m} \right\} \; .
 \end{eqnarray}
(43)

This is the Edgeworth expansion for arbitrary order s. See Petrov (1972, 1987) for a more general form of the expansion (for non-smooth cumulative distribution functions F(x) and for a sum of random variables) and for the proof that the series (43) is asymptotic (see also the classical references Cramér 1957 and Feller 1966). This means that if the first N terms are retained in the sum over s, then the difference between q(x) and the partial sum is of a lower order than the N-th term in the sum (Erdélyi 1956; Evgrafov 1961). Convergence plays no role in the definition of the asymptotic series.
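Expansion (43) is also straightforward to evaluate numerically. The following sketch (ours, not from the original) computes the Chebyshev-Hermite polynomials by the standard recurrence $He_{n+1}(x)=x\,He_n(x)-n\,He_{n-1}(x)$ and sums (43) over the solutions of (42); in the usage example below it we rely on the known $\chi^2_\nu$ cumulants $\kappa_n = 2^{n-1}(n-1)!\,\nu$:

```python
from math import exp, factorial, pi, sqrt

def hermite_che(n, x):
    """Chebyshev-Hermite (probabilists') polynomial He_n(x) by recurrence."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

def solutions(n):
    """Non-negative integer tuples (k_1, ..., k_n) with k_1 + 2*k_2 + ... + n*k_n = n."""
    def rec(m, rem):
        if m == 0:
            if rem == 0:
                yield ()
            return
        for k in range(rem // m + 1):
            for head in rec(m - 1, rem - m * k):
                yield head + (k,)
    return rec(n, n)

def edgeworth(x, sigma, S, N):
    """Eq. (43) truncated after the sigma^N term.

    S[n] must hold S_n = kappa_n / sigma**(2n-2) for n = 3 .. N+2."""
    Z = exp(-x * x / 2.0) / sqrt(2.0 * pi)
    total = 1.0
    for s in range(1, N + 1):
        for ks in solutions(s):
            r = sum(ks)
            coef = 1.0
            for m, k in enumerate(ks, start=1):
                coef *= (S[m + 2] / factorial(m + 2)) ** k / factorial(k)
            total += sigma ** s * coef * hermite_che(s + 2 * r, x)
    return Z * total
```

For the normalized $\chi^2_\nu$ distribution with $\nu=20$ (i.e. $\sigma^2=2\nu$), a four-term truncation already agrees with the exact PDF to better than one part in $10^3$ near the peak, consistent with Fig. 8.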

Strictly speaking, Petrov (1972) proves the asymptotic theorems for sums of $\nu$ independent random variables only when $\sigma \sim 1/\nu^{1/2}$, not for arbitrary $\sigma$ as used in our derivation. But in all practical applications where nearly Gaussian PDFs occur (including all applications considered in the present work), those PDFs are essentially sums of random variables, and the proofs of the asymptotic theorems apply. In the next section we show how the theory works in practice.

  
\begin{figure}
\resizebox {8cm}{!}{\includegraphics{H0596F7.ps}}\end{figure} Figure 7: The normalized $\chi^2$ PDF (15) for $\nu=5$ (dashed line) and its Edgeworth-Petrov approximations with 12 terms in the expansion (solid)
  
\begin{figure}
\resizebox {8cm}{!}{\includegraphics{H0596F8a.ps}}

 
\resizebox {8cm}{!}{\includegraphics{H0596F8b.ps}}\end{figure} Figure 8: The normalized $\chi^2$ PDF (15) for $\nu=20$ (dashed line), and its Edgeworth-Petrov approximations with 2 and 4 terms in the expansion (solid)

Figures 5-8 show some examples of the Edgeworth expansion for the $\chi^2$ distribution. It is clear that for strongly non-Gaussian cases, like $\chi^2_\nu$ for $\nu=2$, the expansion has a very small domain of applicability, since it diverges like the Gram-Charlier series when many terms are retained (Fig. 5). But already in this case one can check that the order of the last term retained gives the order of the error correctly, so one can truncate the expansion when the last term becomes unacceptably large. For nearly Gaussian distributions the situation is much better: compare the cases $\nu=5$ and $\nu=20$ in Figs. 6, 7 and 8.

  
\begin{figure}
\resizebox {8cm}{!}{\includegraphics{H0596F9.ps}}

 \vspace{4mm}\end{figure} Figure 9: Edgeworth expansion up to 10th order for the PDF of peculiar velocities from cosmic strings, within an analytical model for the string network, for two values of the number of strings per Hubble volume (Moessner 1995)
  
\begin{figure}
\resizebox {8cm}{!}{\includegraphics{H0596F10.ps}}

 \vspace{4mm}\end{figure} Figure 10: Relative deviation $\Sigma_N$ from a normal distribution of the Edgeworth expansion up to Nth order of the PDF of peculiar velocities from cosmic strings, for ns=1. Also shown is the error tN of this deviation associated with the expansion
  
\begin{figure}
\resizebox {8cm}{!}{\includegraphics{H0596F11.ps}}\end{figure} Figure 11: Relative error $t_N/(1+\Sigma_N)$ in the Edgeworth expansion of q(x)/Z(x) for the PDF of peculiar velocities from cosmic strings, for two values of N and ns


Copyright The European Southern Observatory (ESO)