We have already pointed out the main problems when helioseismic data are inverted with standard techniques. The method we have developed, called Optimal Mesh Distribution (OMD), is a mixture of both the aforementioned techniques, although it is based mainly on RLS.
OMD tries to solve the main disadvantages of RLS by taking care of looking for the optimal mesh and defining the smoothing function by a deep analysis of the properties of the basis function of the problem. In particular, a different smoothness to the solution at each point of the mesh is used.
In order to illustrate the method developed here, we will take as di the rotational splittings parameterized as Clebsh-Gordan coefficients, so f(r) will be a measurement of the solar rotational rate (e.g. Ritzwoller & Lavely 1991). To test the method, a set of artificial data has been created. These consist in s=1 Clebsh-Gordan coefficients, obtained from the rotation curve shown in Fig. 6 (click here), for 1380 p-modes with l ranging from 1 to 150 and frequencies in the interval Hz Hz. Noise and error levels are taken from actual observations. We will call this set test 1.
In Sect. 3 we will apply the method to a more complex problem, where instead of one function f(r), we work with two, sound speed and density.
The radial resolution that can be achieved in a given inverse problem depends on the mode data set used through the associated kernels. Since the kernels scan the solar interior in a significantly non-homogeneous way, the resolution strongly depends on r. In fact, one of the main problems when a solution is obtained, is that it can show oscillations due to an excessive number of points in the mesh. But then, it seems a good idea to decrease the number of points according to the spatial resolution; that is, we will have a non-equally spaced mesh point distribution.
The basic idea in obtaining the spatial resolution is as follows. Since the result is obtained by means of linear combinations of kernels, the solution at a given radial point can contain oscillatory patterns only with larger periods than those given for such a linear combination. Thus, in the procedure we will consider a given spatial oscillatory pattern (a sinusoidal function) and search for the combination of kernels that better fit this behaviour. If the fit is good at a given radial point, then the solution at this point is allowed for having such an oscillatory pattern. The smallest period for which a good fit is obtained at a given r gives the cut-off spatial frequency, and hence the spatial resolution. Finally, the mesh is built by taking only one radial point within the interval given by the spatial resolution.
To calculate the spatial resolution given by the kernels, a variant of the
OLA method has been developed.
As a measurement of the width of the averaging kernel, OLA uses a quadratic
form. A variant of OLA, called SOLA (Pijpers & Thompson 1994),
uses the quadratic deviation of the combination of kernels from a "target''
function, in the ideal case a Dirac delta function. In our case, the
"target'' functions are sine waves of different frequencies; namely,
where is the spatial frequency. The amplitude a is
obtained by assigning the area of the sine function as unity.
Let us call b the coefficients of the linear combination of kernels,
the vector of dimension M we want to determine with this analysis.
We want to minimize the quantity
where the vector P is defined by
The corresponding normal Gaussian equations are
where C is the kernel's covariance matrix of dimensions , with
elements cij given by
and y is the "target'' function vector,
Then the unknown vector b can be obtained by inverting
relation (10 (click here)).
Since C is a quasi-singular matrix, we obtain b as in the RLS
technique, by keeping and imposing the constraint
, weighted by a parameter
, i.e. we minimize the
quadratic form s defined by
Therefore, the estimated solution is given by
The parameter is introduced in order to avoid solutions with
undesirable oscillatory components. Nevertheless, its value is very small,
and there are a wide range of values of for which the fit to the
target function is almost invariant and very good.
Let we consider test 1. As illustrative cases, Figs. 1 (click here)a and 1 (click here)b show the best fit of the kernel combinations , for two values of : 0.018 R and 0.11 R. The sine with spatial period of 0.11 R is well fitted by the combination of kernels at almost all the radial points. However for a period of 0.11 R, only points with r/R > 0.75 can be fitted properly.
Figure 1: a) Best fit to a sine with period 0.11 R.
b) The same but for a period of 0.018 R. test 1 has been
considered
Once b is obtained for a given spatial frequency ,
we need to define the goodness of the fit quantitatively.
To do this,
we compute the quadratic deviation as a function of the radial
point ri and for a given :
where and ri=(r1,i+r2,i)/2.
In Fig. 2 (click here) is represented the value of as a function of the spatial period , and for two points with ri=0.3R and ri=0.61R. The fit is good for values of for which is smaller than a given tolerance factor . In particular, we have taken , but as can be seen in Fig. 2 (click here), there is a sharp change in the behaviour of , so there is a wide range of values for for which the result remains almost unchanged.
Figure 2: as a function of spatial period
for ri=0.3R (solid line) and for ri=0.61R
(dashed lines)
We define the cut-off frequency as the value of for which . The spatial resolution at a point ri is defined as the inverse of the cut-off frequency. Figure 3 (click here) shows the spatial resolution obtained for test 1. Since only p modes are used, the resolution is worse at deeper points.
Figure 3: Spatial resolution of the solution for test 1
We can now build the inversion mesh by including only one radial point within the interval given by the spatial resolution. The resulting number of points is too low, in particular for test 1 we have N=89. Therefore the discretisation of relation (1 (click here)) could not be accurate enough. To avoid this problem, we have increased the actual spatial resolution by a factor of four, so that the number of points in the mesh grows in the same amount. With such a mesh, the discretization is very good, but some oscillations will appear in the solution.
As in the RLS method, to avoid oscillations in the solution, it is necessary to define and give a weight to the smoothing constraint. The other novelty of OMD is the way the smoothing function is defined directly from the spatial resolution analysis, and how it is weighted differently for each radial point.
Concerning the penalty function, we know
that the mesh has four times the optimal number of points, so we will
apply a fourth difference smoothing function H. To do this, we have
to minimize the quadratic difference between the value of the solution at
one point fi and the average value of the four surrounding points
Let we now consider the weighting of the penalty function. When the RLS technique was explained, a constant weighting parameter was defined, so the smoothing constraint is applied in the same way to all the solution. But then the weighting parameter is mainly determined by the result at the points that have a higher sensitivity to the kernels and, hence, the smoothing at other points can be dragged. To avoid this problem, we have applied a different weighting parameter at each point in the solution. We give now the general expression for the penalty function and then describe how our method compute .
In general, the minimization of relation (16 (click here))
can be expressed by a matrix , where Q is
defined by
in which case
Notice that since the are included in Q, and hence in H,
Eqs. (4 (click here)) and (5 (click here)) are replaced by
The parameters have been calculated following a method developed
by Ruiz Cobo & del Toro-Iniesta (1994) of inverting the Stokes
polarization line profiles. The matrix A, defined in Eq. (2 (click here)),
can be written by using Spectral Value Decomposition (SVD) as
where is a diagonal matrix that contains the eigenvalues of A.
U and V are orthonormal matrixes, with V containing the eigenvectors
of A.
Each point ri of the mesh has its own contribution to the eigenvalue
matrix . Let us introduce the matrixes defined by
where the elements wijk of Wi are related to the elements vjk
of V by the relation
It is straightforward to demonstrate that
The contribution of each point ri is taken as the maximum
value of the corresponding diagonal of the matrix ,
and the weighting factors are given by
where c is just a proportionality factor. In Fig. 4 (click here), the
contributions of points for test 1 are presented. As expected,
the contributions for outer points are higher than for inner points.
Figure 4: Point contributions for test 1
In summary, we have obtained a method, similar to RLS, that smoothes each point in the solution independently. The form of the smoothing function is given by the spatial resolution, which also gives the distribution of points in the inversion mesh. The only free variable of the method is the proportionality factor c, introduced in Eq. (23 (click here)), which will play the role of the trade-off parameter between error magnification and accuracy of the solution.
Here, we are going to see the performance of the method by inverting artificial data corresponding to test 1. To obtain the solution, we give different values to c, the only free variable of the method, and calculates and the error propagation for each solution. The error propagation of the solution is defined as a quadratic average over all the points of the mesh. Figure 5 (click here) shows versus the error propagation for different values of c. The curve has two trends: for small values of c, is almost invariant, while the error decreases very fast. For high c values the invariant behaviour is obtained for the error propagation (oversmooth solution). We have chosen as the optimal c the one where the two different trends concur (the "elbow''of the curve).
Figure 5: versus error propagation for different values of
the trade-off parameter c
Figure 6 (click here) shows the solution obtained by our code for test 1. The result is quite good and fits the actual solution below the error level.
Figure 6: Result for test 1. The dashed line represent the actual
solution and solid line shows the inverted solution with its error
bars
OMD is a mixture of standard methods and is neither the fastest nor the slowest one. Over of computing time is dedicated to obtain the spatial resolution, while is used by the inversion analysis. The rest of the time is used to read files. The time ratio between RLS (the fastest), OMD and OLA is, approximately 1:12:40 for test 1.