Up: The art of fitting
Some of the properties of MLE were given by Toutain & Appourchaux
(1994). We repeat them here for completeness. We also address 2
issues that were not covered in their
paper: are MLE biased, and how significant are the estimated parameters?
The aim of this section is to introduce some definitions and properties of
MLE. A comprehensive study of this area of statistics can be found, e.g.
in Kendall & Stuart (1967).
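Given a random variable x with a probability distribution $f(x, \vec{\lambda})$,
where $\vec{\lambda}$ is a vector of p parameters,
we define the logarithmic likelihood function $\ell$ of N independent
measurements $x_k$ of x as
$$\ell(\vec{\lambda}) = -\ln L = -\sum_{k=1}^{N} \ln f(x_k, \vec{\lambda}) \qquad (1)$$
where L is the likelihood. The main property of $\ell$ is that the position of its minimum in
the $\vec{\lambda}$-space gives an estimate of the most likely value of $\vec{\lambda}$, denoted hereafter as $\tilde{\vec{\lambda}}$. Hence $\tilde{\vec{\lambda}}$ is the solution of
the set of p simultaneous equations:
$$\frac{\partial \ell}{\partial \lambda_i} = 0, \qquad i = 1, \ldots, p \qquad (2)$$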
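As an illustration of locating the minimum of the logarithmic likelihood, the sketch below treats a one-parameter case. The exponential distribution $f(x, \lambda) = e^{-x/\lambda}/\lambda$ is an assumption chosen here only because its minimum is known analytically (the sample mean); the helper names `neg_log_likelihood` and `minimize_golden`, the sample size and the search interval are likewise illustrative, not taken from the paper.

```python
import math
import random

def neg_log_likelihood(lam, xs):
    """ell(lambda) = -ln L for the assumed model f(x, lambda) = exp(-x / lambda) / lambda."""
    return sum(math.log(lam) + x / lam for x in xs)

def minimize_golden(func, lo, hi, tol=1e-8):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) < func(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 0.5 * (a + b)

# Synthetic data: N = 500 independent draws with true parameter lambda = 2 (assumed values).
rng = random.Random(0)
xs = [rng.expovariate(1.0 / 2.0) for _ in range(500)]

lam_mle = minimize_golden(lambda lam: neg_log_likelihood(lam, xs), 0.1, 20.0)
sample_mean = sum(xs) / len(xs)
print(lam_mle, sample_mean)  # the two values should agree closely
```

For this model the condition $\partial \ell / \partial \lambda = 0$ can be solved by hand, which makes the numerical minimum easy to check; for a multi-parameter line profile a multidimensional minimizer would be needed instead.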
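Moreover, in the limit of a very large sample ($N \to \infty$), the estimator $\tilde{\vec{\lambda}}$
tends to have a multi-normal probability distribution. In this case, this
estimator is asymptotically unbiased with minimum variance, which implies that its
expectation and variance are respectively:
$$E(\tilde{\lambda}_i) = \lambda_i \qquad (3)$$
$$\sigma^2_{\tilde{\lambda}_i} = c_{ii} \qquad (4)$$
where $c_{ii}$ are the diagonal elements of the inverse of the Hessian
matrix h, with elements:
$$h_{ij} = E\left(\frac{\partial^2 \ell}{\partial \lambda_i \, \partial \lambda_j}\right) \qquad (5)$$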
The covariances between any 2 components of $\tilde{\vec{\lambda}}$ are given by
the corresponding off-diagonal elements of the inverse matrix. Equation
(5) is used when computing the so-called formal error bars on
$\tilde{\vec{\lambda}}$; as a matter of fact, according to the Cramér-Rao theorem,
Eq. (5) gives only a lower bound to the
error bars (Kendall & Stuart 1967, and references therein).
Toutain & Appourchaux (1994) showed that Eq. (5) is
valid for most purposes in helioseismology.
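The formal error bar can be sketched numerically for a one-parameter case. The example below again assumes an exponential distribution $f(x, \lambda) = e^{-x/\lambda}/\lambda$ (not a model from the paper), and it replaces the expectation in the Hessian by the curvature of the logarithmic likelihood evaluated at its minimum, estimated with a central finite difference. That substitution, the step size and the helper names are assumptions made for illustration.

```python
import math
import random

def neg_log_likelihood(lam, xs):
    """-ln L for the assumed model f(x, lambda) = exp(-x / lambda) / lambda."""
    return sum(math.log(lam) + x / lam for x in xs)

def second_derivative(func, x0, step):
    """Central finite-difference estimate of the second derivative of func at x0."""
    return (func(x0 + step) - 2.0 * func(x0) + func(x0 - step)) / step ** 2

# Synthetic data: 500 draws with true parameter lambda = 2 (assumed values).
rng = random.Random(1)
xs = [rng.expovariate(1.0 / 2.0) for _ in range(500)]

# For this model the minimum of -ln L is the sample mean.
lam_mle = sum(xs) / len(xs)

# 1 x 1 "Hessian": curvature of -ln L at the minimum.
h = second_derivative(lambda lam: neg_log_likelihood(lam, xs), lam_mle, 1e-3 * lam_mle)

# Formal error bar: square root of the (single) diagonal element of the inverse Hessian.
formal_error = math.sqrt(1.0 / h)

# Analytic value for this model: curvature N / lambda^2, hence error lambda / sqrt(N).
analytic_error = lam_mle / math.sqrt(len(xs))
print(formal_error, analytic_error)
```

With several parameters the same idea applies to the full matrix: the variances are read from the diagonal of its inverse and the covariances from the off-diagonal elements.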
The fact that MLE are asymptotically unbiased does not necessarily mean that
this property is kept for a finite amount of data. As an
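example, it is well known that an estimator of the standard deviation ($\sigma$)
of N measurements of a normally distributed random variable x is
$$\hat{\sigma}^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2 \qquad (6)$$
where $x_i$ is the i-th measurement of the random variable x and $\bar{x}$ is an estimate
of the mean. It is well known that the $\hat{\sigma}^2$ of Eq.
(6) is unbiased. In this case, MLE would give the following:
$$\tilde{\sigma}^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2 \qquad (7)$$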
Clearly the MLE expression gives a bias that vanishes asymptotically for
an infinite number of points. It is often difficult to derive
an explicit relation, similar to Eq. (7), between the estimator
and the finite number of data points. When an analytical expression
cannot be found, we advise using Monte-Carlo simulations to verify
the unbiasedness; an example for l = 1 splittings is given in Chang
(1996) and Appourchaux et al. (1997).
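A Monte-Carlo check of this kind can be sketched for the normal-variance example, where the answer is known: dividing the sum of squared deviations by N instead of N - 1 biases the variance estimate by a factor (N - 1)/N. The number of points, the number of trials, the seed and the function names below are arbitrary illustrative choices, not values from the cited papers.

```python
import random

def variance_estimates(xs):
    """Return (unbiased, MLE) variance estimates: sums of squares over N - 1 and over N."""
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)
    return ss / (n - 1), ss / n

def monte_carlo_bias(n_points, n_trials, sigma=1.0, seed=0):
    """Average both estimators over many simulated data sets of n_points normal deviates."""
    rng = random.Random(seed)
    sum_unbiased = 0.0
    sum_mle = 0.0
    for _ in range(n_trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n_points)]
        unbiased, mle = variance_estimates(xs)
        sum_unbiased += unbiased
        sum_mle += mle
    return sum_unbiased / n_trials, sum_mle / n_trials

# With N = 5 and true variance 1, the expected averages are 1 and (N - 1) / N = 0.8.
mean_unbiased, mean_mle = monte_carlo_bias(n_points=5, n_trials=20000)
print(mean_unbiased, mean_mle)
```

The same recipe carries over to estimators with no closed-form bias: simulate many realizations with known input parameters, fit each one, and compare the average of the fitted values with the input.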
In any case MLE are intrinsically biased estimators because they are
also minimum variance estimators (Kendall & Stuart 1967). It
may be useful to find other estimators that do not bias the estimates
(Quenouille 1956); they might not necessarily have minimum
variance. These estimators are yet to be found.
When one uses Least Squares for fitting data, one can test the
significance of the fitted parameters using the so-called R test.
For MLE, a useful alternative is the likelihood ratio test.
It was first used by Appourchaux et al. (1994).
This method requires maximizing the likelihood $e^{-\ell(\vec{\lambda}_p)}$ of a
given event, where p parameters are used to describe the line profile.
If one wants to describe the same event with n additional parameters, the
likelihood $e^{-\ell(\vec{\lambda}_{p+n})}$ will have to be maximized. The likelihood ratio test consists in taking the ratio
of the two likelihoods (Brownlee 1965). Using the logarithmic
likelihood, we can define the ratio $\Lambda$ as:
$$\ln \Lambda = \ell(\tilde{\vec{\lambda}}_{p+n}) - \ell(\tilde{\vec{\lambda}}_p)$$
If $\Lambda$ is close to 1, it means that there is no improvement in the maximized
likelihood and that the additional parameters are not significant. On the other
hand, if $\Lambda \ll 1$, it means that $e^{-\ell(\tilde{\vec{\lambda}}_{p+n})} \gg e^{-\ell(\tilde{\vec{\lambda}}_p)}$ and that
the additional parameters are very significant. In order to define a significance for the n additional parameters, we need to know the statistics of $\ln \Lambda$ under the null hypothesis, i.e. when the n additional parameters are not significant. For this null hypothesis,
Wilks (1938) showed that for large sample size the distribution
of $-2 \ln \Lambda$ tends to the $\chi^2$ distribution with n degrees of freedom.
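The test can be sketched as follows, assuming the two fits have already been performed and that their minimized values of the negative logarithmic likelihood are available. The function names and the numerical values in the usage lines are hypothetical; the $\chi^2$ tail probability is computed with the standard recurrence for the regularized incomplete gamma function so that only the Python standard library is needed.

```python
import math

def chi2_survival(x, dof):
    """P(chi^2 with `dof` degrees of freedom > x), for integer dof >= 1."""
    if x <= 0.0:
        return 1.0
    z = 0.5 * x
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-z)              # closed form for dof = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # closed form for dof = 1
    # Recurrence Q(a + 1, z) = Q(a, z) + z^a e^(-z) / Gamma(a + 1), up to a = dof / 2.
    while 2.0 * a < dof:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def likelihood_ratio_test(ell_p, ell_p_plus_n, n_extra):
    """Return (-2 ln Lambda, significance) given the minimized -ln L of both fits.

    ell_p        : minimum of -ln L for the fit with p parameters
    ell_p_plus_n : minimum of -ln L for the fit with p + n parameters
    n_extra      : number n of additional parameters
    """
    log_ratio = ell_p_plus_n - ell_p          # ln Lambda
    statistic = -2.0 * log_ratio
    # Probability of a statistic at least this large if the extra parameters are not significant.
    return statistic, chi2_survival(statistic, n_extra)

# Hypothetical minimized values, for illustration only.
stat, p_value = likelihood_ratio_test(ell_p=1250.0, ell_p_plus_n=1244.0, n_extra=2)
print(stat, p_value)
```

A small returned probability means that an improvement this large would rarely arise by chance under the null hypothesis, so the additional parameters are likely significant. The result relies on the large-sample limit quoted from Wilks (1938).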
Copyright The European Southern Observatory (ESO)