Up: Expansions for nearly Gaussian
A random variable $X$ can be normalized to unit variance by dividing by its
standard deviation $\sigma$. The Edgeworth expansion is a true asymptotic
expansion of the PDF of the normalized variable $X/\sigma$ in powers of the
parameter $\sigma$, whereas the Gram-Charlier series is not.
This difference between the Gram-Charlier series and the
Edgeworth expansion was pointed out by Juszkiewicz et al.
(1995), and in independent work by Bernardeau &
Kofman (1995), followed by Amendola (1994) and
Colombi (1994).
Juszkiewicz et al. (1995) and Bernardeau & Kofman (1995) use two or three
terms of the Edgeworth expansion, derived e.g. in Cramér (1957). We note
that the full explicit expansion for arbitrary order s was already obtained
by Petrov (1962); see also Petrov (1972, 1987).
Petrov derived a powerful generalization of the Edgeworth expansion for a
sum of random variables. In this section we give a simplified derivation of
the Edgeworth series, following Petrov (1972).
This derivation, valid for arbitrary order, is somewhat simpler than the
one given, for example, by Bernardeau & Kofman (1995) for the third order
of the Edgeworth expansion only.
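The characteristic function $\Phi(t)$ of a random variable $X$ is
the expectation of $e^{itX}$ as a function of $t$,
$$\Phi(t) \equiv \mathsf{E}\,e^{itX} = \int_{-\infty}^{\infty} e^{itx}\, p(x)\, {\rm d}x , \tag{25}$$
that is, the Fourier transform of $p(x)$ if the probability density
exists. The definition (25) implies
that if the $n$-th moment $\alpha_n$ (2) of $X$ exists, then
$$\alpha_n = \frac{1}{i^n}\left[\frac{{\rm d}^n}{{\rm d}t^n}\,\Phi(t)\right]_{t=0} . \tag{26}$$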
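Hence the Taylor series for $\Phi(t)$ is given by
$$\Phi(t) = 1 + \sum_{n=1}^{\infty} \frac{\alpha_n}{n!}\,(it)^n . \tag{27}$$
A similar series for $\ln\Phi(t)$,
$$\ln\Phi(t) = \sum_{n=1}^{\infty} \frac{\kappa_n}{n!}\,(it)^n , \tag{28}$$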
involves cumulants (semi-invariants) $\kappa_n$ defined by
$$\kappa_n \equiv \frac{1}{i^n}\left[\frac{{\rm d}^n}{{\rm d}t^n}\,\ln\Phi(t)\right]_{t=0} . \tag{29}$$
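In Appendix A we prove a fundamental lemma of calculus for the
$n$-th derivative of a composite function $f(g(x))$, which reads
$$\frac{{\rm d}^n}{{\rm d}x^n}\, f(g(x)) = n! \sum_{\{k_m\}} f^{(r)}(y)\Big|_{y=g(x)} \prod_{m=1}^{n} \frac{1}{k_m!}\left(\frac{1}{m!}\, g^{(m)}(x)\right)^{k_m} , \tag{30}$$
where $r=k_1+k_2+\dots+k_n$ and the set $\{k_m\}$ consists of all
non-negative integer solutions of the Diophantine equation
$$k_1 + 2k_2 + \dots + n k_n = n . \tag{31}$$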
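As a quick check of the lemma, here is a minimal worked example. It assumes that (30) has the standard Faà di Bruno form, $\frac{{\rm d}^n}{{\rm d}x^n} f(g(x)) = n! \sum f^{(r)}(g(x)) \prod_{m=1}^{n} \frac{1}{k_m!}\bigl(g^{(m)}(x)/m!\bigr)^{k_m}$. For $n=2$ the solutions of (31) are $(k_1,k_2)=(2,0)$ and $(0,1)$; for $n=3$ they are $(k_1,k_2,k_3)=(3,0,0)$, $(1,1,0)$ and $(0,0,1)$. Inserting them term by term gives the familiar chain-rule results:

```latex
\begin{align*}
n=2:\quad \frac{{\rm d}^2}{{\rm d}x^2} f(g(x))
  &= \underbrace{f''(g)\,(g')^2}_{(2,0),\ r=2}
   + \underbrace{f'(g)\,g''}_{(0,1),\ r=1} , \\
n=3:\quad \frac{{\rm d}^3}{{\rm d}x^3} f(g(x))
  &= \underbrace{f'''(g)\,(g')^3}_{(3,0,0),\ r=3}
   + \underbrace{3 f''(g)\,g'\,g''}_{(1,1,0),\ r=2}
   + \underbrace{f'(g)\,g'''}_{(0,0,1),\ r=1} .
\end{align*}
```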
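Using (30) we derive in Appendix B a useful relation (Petrov 1987)
between the cumulants $\kappa_n$ and the moments $\alpha_n$ of a PDF,
$$\kappa_n = n! \sum_{\{k_m\}} (-1)^{r-1} (r-1)! \prod_{m=1}^{n} \frac{1}{k_m!}\left(\frac{\alpha_m}{m!}\right)^{k_m} . \tag{32}$$
Here the summation extends over all non-negative integers $\{k_m\}$
satisfying (31), and $r=k_1+k_2+\dots+k_n$.
We describe a simple algorithm for obtaining all solutions of
Eq. (31) in Appendix C.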
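The enumeration of the solutions of Eq. (31) and the moment-to-cumulant relation can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `solutions` and `cumulant` are hypothetical, the recursive enumeration is a simple choice of ours and not necessarily the algorithm of Appendix C, and (32) is assumed to read $\kappa_n = n! \sum (-1)^{r-1}(r-1)! \prod_m \frac{1}{k_m!}(\alpha_m/m!)^{k_m}$.

```python
from fractions import Fraction
from math import factorial


def solutions(n):
    """Yield every tuple (k_1, ..., k_n) of non-negative integers with
    k_1 + 2*k_2 + ... + n*k_n = n, i.e. all solutions of Eq. (31)."""
    def fill(m, remainder, ks):
        # m is the index being chosen, counting down from n to 1;
        # ks collects k_n, k_{n-1}, ... in that order.
        if m == 0:
            if remainder == 0:
                yield tuple(reversed(ks))
            return
        for k in range(remainder // m + 1):
            yield from fill(m - 1, remainder - m * k, ks + [k])
    yield from fill(n, n, [])


def cumulant(n, alpha):
    """Cumulant kappa_n from the moments alpha[1], ..., alpha[n]
    (alpha[0] is unused), using the assumed form of Eq. (32)."""
    total = Fraction(0)
    for ks in solutions(n):
        r = sum(ks)
        term = Fraction((-1) ** (r - 1) * factorial(r - 1))
        for m, k in enumerate(ks, start=1):
            term *= Fraction(alpha[m]) ** k / (factorial(k) * factorial(m) ** k)
        total += term
    return factorial(n) * total


# Moments of a standard Gaussian: only kappa_2 = 1 should survive.
gauss_moments = [1, 0, 1, 0, 3, 0, 15]
print([cumulant(n, gauss_moments) for n in range(1, 7)])
```

Exact rational arithmetic (`Fraction`) is used because the alternating sum in (32) involves strong cancellations, which would lose accuracy in floating point at high order.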
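Now we are ready to begin the derivation of the Edgeworth
expansion. Consider a random variable $X$ with $\mathsf{E}X = 0$
(this can always be achieved by an appropriate choice of origin), and
let $X$ have dispersion $\sigma^2$. If $X$ has the characteristic function
$\Phi(t)$, then the normalized random variable $X/\sigma$
has the characteristic function $\varphi(t) = \Phi(t/\sigma)$.
Therefore we have from Eqs. (28) and (29) that
$$\ln\varphi(t) = \sum_{n=2}^{\infty} \frac{\kappa_n}{\sigma^n\, n!}\,(it)^n . \tag{33}$$
Here the sum starts at $n=2$ because $\mathsf{E}X = \alpha_1 = \kappa_1 = 0$.
Moreover, since $\kappa_2 = \alpha_2 - \alpha_1^2 = \sigma^2$ (see Eq. (32)),
we obtain
$$\varphi(t) = e^{-t^2/2} \exp\left\{\sum_{n=1}^{\infty} \frac{S_{n+2}\,\sigma^n}{(n+2)!}\,(it)^{n+2}\right\} , \tag{34}$$
with
$$S_n \equiv \frac{\kappa_n}{\sigma^{2n-2}} . \tag{35}$$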
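Let us write the exponential function in (34) as a
formal series in powers of $\sigma$,
$$\exp\left\{\sum_{n=1}^{\infty} \frac{S_{n+2}\,\sigma^n}{(n+2)!}\,(it)^{n+2}\right\} = 1 + \sum_{s=1}^{\infty} P_s(it)\,\sigma^s , \tag{36}$$
where the coefficient of the power $s$ is a function $P_s(it)$. Now, using
$f = \exp$ and $g = \sum_{n=1}^{\infty} S_{n+2}\,\sigma^n (it)^{n+2}/(n+2)!$
in (30), with the derivatives taken with respect to $\sigma$ at $\sigma=0$,
we find that
$$P_s(it) = \sum_{\{k_m\}} \prod_{m=1}^{s} \frac{1}{k_m!}\left(\frac{S_{m+2}\,(it)^{m+2}}{(m+2)!}\right)^{k_m} , \tag{37}$$
where the summation extends again over all non-negative integers $\{k_m\}$
satisfying (31) with $n=s$. Thus the function $P_s(it)$ is just
a polynomial.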
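To make the structure of these polynomials concrete, the first three can be written out explicitly. This is a worked sketch assuming that (37) has the form $P_s(it) = \sum \prod_{m=1}^{s} \frac{1}{k_m!}\bigl(S_{m+2}(it)^{m+2}/(m+2)!\bigr)^{k_m}$, with the sum over the solutions of $k_1+2k_2+\dots+sk_s=s$. These are $(1)$ for $s=1$; $(2,0)$ and $(0,1)$ for $s=2$; and $(3,0,0)$, $(1,1,0)$, $(0,0,1)$ for $s=3$:

```latex
\begin{align*}
P_1(it) &= \frac{S_3}{6}\,(it)^3 , \\
P_2(it) &= \frac{S_4}{24}\,(it)^4 + \frac{S_3^2}{72}\,(it)^6 , \\
P_3(it) &= \frac{S_5}{120}\,(it)^5 + \frac{S_3 S_4}{144}\,(it)^7
          + \frac{S_3^3}{1296}\,(it)^9 .
\end{align*}
```

Each term of $P_s$ carries the power $(it)^{s+2r}$ with $r=k_1+\dots+k_s$, so $P_s$ contains powers from $(it)^{s+2}$ up to $(it)^{3s}$.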
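Suppose that the probability density $p(x)$ of a random variable $X$
exists. Then the PDF for the normalized variable $X/\sigma$ is
$q(x) = \sigma\, p(\sigma x)$, and
it is the inverse Fourier transform of the characteristic function
$\varphi(t)$:
$$q(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx}\, \varphi(t)\, {\rm d}t . \tag{38}$$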
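If $\Phi(t)$ is the Fourier transform of a function $p(x)$,
then $(-it)^n \Phi(t)$ is the transform of the $n$-th
derivative of $p(x)$,
$$\frac{{\rm d}^n}{{\rm d}x^n}\, p(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} (-it)^n\, e^{-itx}\, \Phi(t)\, {\rm d}t . \tag{39}$$
The Fourier transform of the Gaussian
distribution $Z(x)$ in (5) is $\exp(-t^2/2)$,
see e.g. Bateman & Erdélyi (1954).
Therefore each $(it)^n$, multiplied by $\exp(-t^2/2)$ in the
expansion of $\varphi(t)$ (see Eqs. (34) to
(37)), generates (according to Eq. 38) the $n$-th
derivative of $Z(x)$,
$$(-1)^n \frac{{\rm d}^n}{{\rm d}x^n}\, Z(x) \tag{40}$$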
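in the corresponding expansion for $q(x)$,
$$q(x) = Z(x) + \sum_{s=1}^{\infty} \sigma^s \sum_{\{k_m\}} \left[\prod_{m=1}^{s} \frac{1}{k_m!}\left(\frac{S_{m+2}}{(m+2)!}\right)^{k_m} \left(-\frac{{\rm d}}{{\rm d}x}\right)^{(m+2)k_m}\right] Z(x) . \tag{41}$$
Here the set $\{k_m\}$ in the sum consists of all non-negative
integer solutions of the equation
$$k_1 + 2k_2 + \dots + s k_s = s . \tag{42}$$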
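Using (10) and $r=k_1+k_2+\dots+k_s$ we can rewrite
(41) in terms of the Chebyshev-Hermite polynomials $He_n(x)$:
$$q(x) = Z(x)\left\{1 + \sum_{s=1}^{\infty} \sigma^s \sum_{\{k_m\}} He_{s+2r}(x) \prod_{m=1}^{s} \frac{1}{k_m!}\left(\frac{S_{m+2}}{(m+2)!}\right)^{k_m}\right\} . \tag{43}$$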
This is the Edgeworth expansion for arbitrary order
s. See Petrov (1972, 1987) for a more
general form of the expansion (for non-smooth cumulative distribution
functions F(x) and for a sum of random variables) and for the proof that
the series (43) is asymptotic (see also the classical references
Cramér 1957 and Feller 1966). This means
that if the first $N$ terms are retained in the sum over $s$, then the
difference between $q(x)$ and the partial sum is of lower order
than the $N$-th term in the sum, i.e. it becomes negligible compared
with that term as $\sigma \to 0$ (Erdélyi 1956; Evgrafov
1961). Convergence
plays no role in the definition of an asymptotic series.
Strictly speaking, Petrov (1972)
proves the asymptotic theorems for
sums of $n$ independent random variables only when
$\sigma \sim 1/\sqrt{n}$, and not for any
$\sigma$, which we used in our derivation.
But in all practical applications where nearly Gaussian PDFs occur (and in
all applications that we consider in the present work), those PDFs are
essentially PDFs of sums of random variables, so the proofs of the asymptotic
theorems are relevant. In the next section we show how the theory
works in practice.
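A partial sum of the Edgeworth series can be evaluated numerically along the following lines. This is a minimal sketch, not a definitive implementation. It assumes that (43) reads $q(x) = Z(x)\{1 + \sum_s \sigma^s \sum_{\{k_m\}} He_{s+2r}(x) \prod_{m=1}^{s} \frac{1}{k_m!}(S_{m+2}/(m+2)!)^{k_m}\}$ and that $He_n$ obeys the standard recurrence $He_{n+1}(x) = x\,He_n(x) - n\,He_{n-1}(x)$. The function names (`solutions`, `hermite_he`, `edgeworth_pdf`) and the numerical values in the usage lines are hypothetical.

```python
from math import exp, factorial, pi, sqrt


def solutions(s):
    """Yield all tuples (k_1, ..., k_s) of non-negative integers with
    k_1 + 2*k_2 + ... + s*k_s = s (Eq. 42)."""
    def fill(m, remainder, ks):
        if m == 0:
            if remainder == 0:
                yield tuple(reversed(ks))
            return
        for k in range(remainder // m + 1):
            yield from fill(m - 1, remainder - m * k, ks + [k])
    yield from fill(s, s, [])


def hermite_he(n, x):
    """Chebyshev-Hermite polynomial He_n(x) from the recurrence
    He_{k+1}(x) = x*He_k(x) - k*He_{k-1}(x), with He_0 = 1, He_1 = x."""
    prev, cur = 1.0, x
    if n == 0:
        return prev
    for k in range(1, n):
        prev, cur = cur, x * cur - k * prev
    return cur


def edgeworth_pdf(x, sigma, S, order):
    """Partial sum of the assumed Eq. (43) with s = 1, ..., order.
    S maps n -> S_n = kappa_n / sigma**(2n - 2) for n = 3, ..., order + 2."""
    z = exp(-0.5 * x * x) / sqrt(2.0 * pi)  # Gaussian Z(x)
    bracket = 1.0
    for s in range(1, order + 1):
        coeff = 0.0
        for ks in solutions(s):
            r = sum(ks)
            term = hermite_he(s + 2 * r, x)
            for m, k in enumerate(ks, start=1):
                term *= (S[m + 2] / factorial(m + 2)) ** k / factorial(k)
            coeff += term
        bracket += sigma ** s * coeff
    return z * bracket


# Hypothetical values of S_3, S_4 and sigma, for illustration only.
S_example = {3: 0.7, 4: 1.3}
print(edgeworth_pdf(0.4, 0.2, S_example, order=2))
```

For `order=2` the loop reproduces the explicit form $Z(x)\{1 + \sigma \frac{S_3}{6} He_3 + \sigma^2 (\frac{S_4}{24} He_4 + \frac{S_3^2}{72} He_6)\}$, which is a convenient check on any implementation.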
 |
Figure 7:
The normalized PDF
(15) for (dashed line) and
its Edgeworth-Petrov approximations
with 12 terms in the expansion (solid) |
 |
Figure 8:
The normalized PDF
(15) for (dashed line), and
its Edgeworth-Petrov approximations
with 2 and 4 terms in the expansion (solid) |
Figures 5-8 show some examples of the Edgeworth expansion for the PDF (15).
It is clear that for strongly non-Gaussian cases the expansion has a very
small domain of applicability in practice, since, like the Gram-Charlier
series, it diverges for a large number of terms (Fig. 5).
But even in this case one can check that the order of the last term retained
gives the order of the error correctly, and one can truncate the
expansion when the last term becomes unacceptably large.
For nearly Gaussian distributions the situation is much better: compare
the cases shown in Figs. 6, 7 and 8.
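The truncation rule mentioned above can be sketched as follows. This is a simple illustration under our own assumptions: the helper name `truncated_sum`, the threshold `max_term` and the sample numbers are hypothetical, and the correction terms (the terms of the sum over s at a fixed x) are supplied by the caller.

```python
def truncated_sum(terms, max_term):
    """Add successive correction terms, stopping before the first term
    whose magnitude exceeds max_term.

    Returns (partial_sum, terms_used, error_estimate), where the error
    estimate is the magnitude of the last term retained."""
    total, used, last = 0.0, 0, 0.0
    for term in terms:
        if abs(term) > max_term:
            break  # the series has started to diverge: stop here
        total += term
        last = abs(term)
        used += 1
    return total, used, last


# Hypothetical correction terms that first decrease and then grow.
corrections = [0.1, 0.02, 0.05, 0.4, 3.0]
print(truncated_sum(corrections, max_term=0.2))
```

Taking the last retained term as the error estimate mirrors the statement that the order of the last term gives the order of the error; the threshold itself is a judgment call for each application.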
 |
Figure 9:
Edgeworth expansion up to 10th order for the PDF of peculiar
velocities from cosmic strings, within an analytical model
for the string network, for two values of the
number of strings per Hubble volume (Moessner
1995) |
 |
Figure 10:
Relative deviation from a normal distribution of
the Edgeworth expansion up to Nth order of the PDF of peculiar
velocities from cosmic strings, for ns=1. Also shown is the error
tN of this deviation associated with the expansion |
 |
Figure 11:
Relative error in the Edgeworth expansion
of q(x)/Z(x) for the PDF of peculiar velocities from cosmic strings,
for two values of N and ns |
Copyright The European Southern Observatory (ESO)