Up: The art of fitting
Subsections
Before applying Eq. (32) to real data, it is always advisable
to test the power of MLE on synthetic data, i.e. performing
Monte-Carlo simulations. They are not merely for playing games; these
simulations are real tools for understanding what we fit and how we fit it.
Assuming that the statistics of the real solar spectra is known, performing Monte-Carlo
is useful for the following reasons:
- Assessing the model of the mode and noise covariance
- Assessing the statistical distribution of the parameters
- Assessing the precision of parameters.
First, the model of the covariances can be imperfect. The effect of
an imperfect knowledge of the covariance can help us understand how
these will influence the determination of the parameters, i.e. deriving the sensitivity of the systematic errors to this imperfect
knowledge.
Second, the parameters derived by the MLE should have the desirable
properties of having a normal distribution; if not we advise to apply
a change of fitted parameters. For example, as we will see later on,
we do not fit the linewidth itself but the log of the linewidth. A
normal distribution is necessary to derive meaningful error bars,
this is the assumption behind Eq. (5).
Third, in order to be able to derive a good estimate of the error
bars using one realization, the standard deviation of a large sample of fitted
parameters should be equal to the mean of formal errors return by the
fit (See Eq. (4)).
The performance of this instrument has been described in Appourchaux et al.
(1997). Briefly, it is a small instrument made of 12 pixels for
detecting solar intensity fluctuations. The p-mode signals were generated
in the Fourier spectra by using the following:
| |
(39) |
where is the observed vector of 2l+1 Fourier spectra,
is the leakage matrix given by Eq.
(19), is a complex random vector with 2l+1 components (each
component represents the signal of an l,m mode, with uncorrelated real and imaginary
part), are computed as in Eq. (20)
using spherical harmonics, and pi is the
noise for a given pixel i. The variance of the real or imaginary part
of the m-th
component of is given by ; the mean of is 0. The function
describes the profile of each m which is displaced from m by an
amount which is given by:
| |
(40) |
where the are derived from the Clebsch-Gordan
coefficients, the expression of which can be found in Ritzwoller &
Lavelly (1991); they are normalized such that
. Here we assumed a common linewidth for the l,n mode,
and different amplitudes for the 2l+1 components.
The profile are symmetrical in the shape of a lorentzian.
The variance of the pixel noise is assumed to be the same for the
pixels with the same shape. The mean of the pixel noise is 0.
For the LOI with its 12 pixels, there are 3 different shapes giving 3
independent noises.
After generating the synthetic signals according to Eq.
(39), the data are fitted by minimizing the likelihood of
Eq. (32). Figure 1 shows an example of Fourier
spectra generated synthetically. The typical signal-to-noise ratio
in the power spectra is about 20-30.
The frequency resolution is equivalent to 4 months of data. We
performed 1000 simulations of the spectra.
|
Figure 1:
(Left) Power spectra of a synthetic l=1 as it would be
observed by the LOI. The frequency resolution corresponds to 4 months
of data. The signal-to-noise ratio is about 20-30. The traces from
bottom to top corresponds to m=-1, 0, +1. (Right) Fourier spectra for
l=1 (same data). The first, third and fifth traces from the bottom
represents the real part of the spectrum of m=-1,0 and 1,
respectively; the other traces are the imaginary parts. The leakage
between m=-1 and m=+1 is 0.45 in the Fourier spectra |
The data are fitted assuming a perfect knowledge of the leakage and noise
covariance matrices, i.e. we know what we fit. Figure 2
shows the distribution of the fitted parameters: the central frequency,
splitting, (linewidth), 3(amplitude),
3(pixel noise). For the last 7 parameters, we fit the of the
parameter because this transformation give a statistical distribution
closer to a normal distribution (or log-normal distribution). It can
be observed that the parameters derived are in most cases unbiased.
Figure 3 shows the distribution of the error bars returned
by the fit. In most case the mean of the error bars (returned by the
fit) is not very
different from the 1- deviation of the parameter distribution.
Similar simulations have been performed for various degree (up l=3).
They show the same typical results as for Figs. 2 and
3, i.e. the fitted parameters are not, or weakly, biased, and
the error bars returned by the fit give a good estimate of the real
error bars.
|
Figure 2:
Histograms for the fitted parameters: (Plain line) Data,
(Dashed line) Normal distribution with the same mean and as
the fitted parameters. (Top) Frequency (in Hz),
splitting a1 (in Hz), ( in
Hz); (Middle) (Amplitude) for
m=-1, 0, 1; (Bottom) (pixel noise). For each histogram, the
target value, the mean fitted value and the 1- fitted valued
are displayed. The Kolmogorov-Smirnov test (Kol.) is displayed for each
histogram; a number close to 0 show that the distribution is not normal |
|
Figure 3:
Histograms for the error bars: (Plain line) Data,
(Dashed line) Normal distribution with the same mean and . (Top)
Frequency error (in Hz),
splitting a1 error (in Hz), error ( in Hz); (Middle)
(Amplitude)
error for
m=-1, 0, 1; (Bottom) (pixel noise) error. For each histogram, the
target value, the mean fitted value and the 1- fitted valued
are displayed. The Kolmogorov-Smirnov test is displayed for each
histogram; a number close to 0 show that the distribution is not normal |
As was shown by Eq. (36), fitting p-mode spectra for which
the leakage matrix is explicitly diagonal is equivalent to fitting
p-mode spectra for which the matrix is not diagonal. Of
course, it is always possible to construct data with a purely diagonal
leakage matrix using Eq. (33), but we do so assuming that we
know the leakage matrix . As a matter of fact, what
matters is not to have the identity matrix as leakage matrix, but
more the knowledge of the latter.
Hereafter, we have investigated the
influence of a wrongly assumed leakage matrix on the fitted parameters
of l = 1. We made 100 realizations and change the leakage parameter
between m=-1 and m=+1 by % from a nominal value for the LOI of 0.45.
Figure 4 shows the influence of varying the assumed leakage
element on the fitted parameters. It is quite interesting to note
that the inferred central frequency is insensitive to mistakes in the
leakage matrix. The linewidth becomes underestimated when the error
is larger than 20%, while the amplitudes become overestimated. The most
important result is the fact that the systematic error made on the
splitting is not linear but quadratic. This systematic error can become as
large as the error bars. For example, with 1 year of LOI data and
averaging over 10 modes, the error bars on the mean splitting is about
15 nHz; this should be compared to a systematic error of 10 nHz for an
error of 10% of the l=1 leakage elements.
Another test similar to that of the l=1
was performed with the l=2 mode. We have assumed that
all the off-diagonal elements of the leakage matrix were wrong by the
same fixed amount. Figure
5 shows the results only for the splitting coefficients (from a1 to
a4). The other parameters
linewidth, amplitudes and noises behave in the same manner as for
l=1. The systematic error on the splitting has also the same
quadratic dependence as for l=1. For l=2 the splitting error bars
are typically smaller than for l=1. In this case the
systematic errors become larger than the error bars, and therefore
start to influence the inverted solar rotation.
It means that
it is quite easy to underestimate the splitting whenever we under- or
overestimate the leakage element. As a matter
of fact, this behaviour was also found in the GONG data for l=1 and
2 (Rabello-Soares & Appourchaux 1998, in preparation).
On the other hand, errors in the
leakage matrix will not result in overestimating the splitting. If
the splitting is overestimated, the most likely source should be the
presence of other degrees not taken into account in the analysis.
We also checked the
correlation of the splitting coefficients derived for l = 2.
Figures 6 and 7 show respectively the variance and
the covariance of the splitting coefficients as a function of the leakage elements error.
It can be concluded that the splitting coefficients become correlated
only when a large overestimation of about 50% is made for the
off-diagonal leakage elements. This result is only valid when
fitting Fourier spectra. For other methods, such as fitting power
spectra, possible correlation amongst the splitting coefficients could
have drastic consequences for the inverted solar rotation profiles.
|
Figure 4:
Influence of the fitted parameters to relative changes of the assumed
leakage element between m=-1 and m=+1 for l=1. (Top) Frequency,
splitting a1, ; (Middle) (Amplitude) for
m=-1, 0, 1; (Bottom) (pixel noise). The target parameters are the
same as for Fig. 2. Please note the parabolic
shape for the splitting |
|
Figure 5:
Influence of the fitted splitting parameters to relative changes of the assumed
of the assumed off-diagonal leakage element for l=2. (Top, left)
a1, target value: 410 nHz; (Top, right) a2, target value: -30
nHz; (Bottom, left) a3, target value: -10 nHz; (Bottom, right)
a4, target value: +50 nHz |
|
Figure 6:
Diagonal elements of the covariance matrix of the splitting
coefficient, for l=2. They are given as a function
of the relative change of the assumed
off-diagonal leakage element. (Top, left) For a1; (Top, right)
For a2; (Bottom, left) For a3; (Bottom, right) For a4 |
|
Figure 7:
Off-diagonal elements of the covariance matrix of the splitting
coefficient, for l=2. They are given as a function
of the relative change of the assumed
off-diagonal leakage element. (Top, left) For a1 and a2; (Top, right)
For a1 and a3; (Middle, left) For a1 and a4;
(Middle, right) For a2 and a3; (Bottom, left) For a2 and
a4; (Bottom, right) For a3 and a4 |
We have also performed Monte-Carlo simulations to compare the different fitting techniques commonly used. For each simulation of a mode, the data were fitted in 3 different ways:
- Assuming that the 2l + 1 power spectra are independent of each other and are not influenced by m leaks, i.e. a single mode is present for a given m. This is the way the GONG data are commonly fitted.
- Assuming that the 2l + 1 power spectra are independent of each other but are influenced by m leaks, i.e. we use only the diagonal of the mode covariance matrix. This is the way the LOI data were fitted by
Appourchaux et al. (1995).
- Assuming that the 2l + 1 Fourier spectra are dependent of each other and are influenced by m leaks. This is the way described in this paper after the work of
Schou (1992).
Figure 8 shows the results of this comparison only for the splitting which is the parameter the most sensitive to the fitting way. The splittings derived by the GONG fitting way are obviously underestimated. This underestimation will eventually disappear with the degree but for l=1 the bias is unacceptable. The splittings derived using the old LOI way are slightly overestimated with a bias of
about 10 nHz for l=1. Although this bias is not substantial for the LOI data, it can be of the same order of magnitude as the error bars for instrument with better signal-to-noise ratio than used in the simulations, such as GONG, and for observation time longer than these simulations. Last but not least, the Fourier spectra fitted according to the guidelines given in this paper provides a splitting without substantial bias and also with smaller error bars. This latter way of fitting will considerably improve the consistency of the splittings measured.
|
Figure 8:
Comparison of low-degree splittings measured using 1000 Monte-Carlo
simulations by 3 different fitting techniques. The nominal splitting is
410 nHz. The + is the technique commonly used by GONG, the asterisk is the
technique used by Appourchaux et al. (1995), the diamond is the
technique described in this paper and also used for the SOI/MDI data. The
error bars are the formal error bars for a single realization |
Up: The art of fitting
Copyright The European Southern Observatory (ESO)