2. Implementations of the SOLA method

 

In the following we summarize the SOLA method. The first step is to introduce a proper matrix formulation of the problem. Assume that the data errors \(e_i\) are described by the covariance matrix \(E\), and that \(E\) has a lower triangular Cholesky factor \(L\), such that \(E = L\,L^T\). In case the data are uncorrelated with equal variance, \(E\) will be a scalar times the identity matrix. If the statistics of the data are unknown it is common to set \(E\) equal to the identity. Also assume that we use a quadrature rule with weights \(w_1,\ldots,w_n\) on the grid \(t_1,\ldots,t_n\) to compute the necessary integrals, i.e.,
\[
  \int g(t)\,dt \approx \sum_{j=1}^{n} w_j\, g(t_j) .
\]
We now define the \(m \times n\) matrix \(K\),
\[
  K_{ij} = K_i(t_j), \qquad i = 1,\ldots,m, \quad j = 1,\ldots,n .
\]
To express the integrals involved we define the \(n \times n\) diagonal matrix \(W\) with diagonal elements
\[
  W_{jj} = w_j^{1/2}, \qquad j = 1,\ldots,n .
\]
Finally we introduce the three \(n\)-vectors \(t\), \(w\) with elements \(t_j\), \(w_j\), and \(\tau\) with elements \(\tau_j = T(t_0, t_j)\), \(j = 1,\ldots,n\), where \(T(t_0,\cdot)\) is the target function centered at \(t_0\).

Now we can write the discretized version of expression (8) subject to the constraint in Eq. (5): the coefficient vector \(c\) is given as the solution to
\[
  \min_{c}\; \bigl\| W\,(K^T c - \tau) \bigr\|_2^2 + \mu\,\|c\|_2^2
  \quad \mbox{subject to} \quad w^T K^T c = 1 , \tag{9}
\]
and we obtain the discretized version of the averaging kernel as
\[
  \hat a = K^T c , \qquad \hat a_j = \sum_{i=1}^{m} c_i\, K_i(t_j) \approx A(t_j) .
\]
The kernels \(K_i\) are often nearly linearly dependent, and consequently the same holds for the rows of \(K\), which are ``samples'' of the kernels. Therefore the matrix \(K\) is usually very ill-conditioned, and this motivates the additional regularization term \(\mu\,\|c\|_2^2\) in Eq. (9).
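The discretized quantities above can be assembled as in the following minimal sketch; the problem sizes, the kernel shapes and the Gaussian target are illustrative choices of ours, not taken from the text.

```python
import numpy as np

# Minimal sketch of the discretization; m, n, the kernel shapes and the
# Gaussian target below are hypothetical stand-ins.
m, n = 50, 20
t = np.linspace(0.0, 1.0, n)                 # grid t_1, ..., t_n

# Trapezoidal quadrature weights w_j:  int g(t) dt ~ sum_j w_j g(t_j).
w = np.full(n, t[1] - t[0])
w[[0, -1]] *= 0.5

# m x n kernel matrix with K[i, j] = K_i(t_j)  (arbitrary smooth kernels here).
idx = np.arange(1, m + 1)[:, None]
K = np.cos(np.pi * idx * t[None, :])

# Target samples tau_j = T(t0, t_j); a Gaussian target is one common choice.
t0, width = 0.5, 0.1
tau = np.exp(-0.5 * ((t - t0) / width) ** 2)
tau /= w @ tau                               # enforce  sum_j w_j tau_j = 1

# For any coefficient vector c, the discretized averaging kernel is K^T c, and
# sum_j w_j (a_hat_j - tau_j)^2  approximates  int (A(t) - T(t))^2 dt.
c = np.ones(m) / m
a_hat = K.T @ c
misfit = np.sum(w * (a_hat - tau) ** 2)
```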

Typically the number of kernels \(m\) is much larger than the number of points \(n\) in the discretization. This means that the system of equations in Eq. (9) is under-determined whenever \(m > n\). Only when we include the statistical information about the data, represented by the covariance matrix \(E\), to regularize the problem do we obtain a unique solution. A convenient way to incorporate the statistical information into the least squares problem is to work with the transformed matrix \(L^{-1} K\) and right-hand side \(L^{-1} d\), where \(L\) is the Cholesky factor of \(E\); the covariance of the errors in \(L^{-1} d\) is then the identity matrix, and the variance of the error in the estimate \(\hat f(t_0) = c^T (L^{-1} d)\) due to the noise is simply \(\|c\|_2^2\). To simplify our presentation, we assume that this transformation to unit covariance form is applied to the data before the SOLA method is applied.
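The transformation to unit covariance form can be sketched as follows; the covariance matrix, kernel matrix and data vector are all synthetic stand-ins, and a dense solve stands in for the triangular solves one would use in practice.

```python
import numpy as np

# Sketch of the transformation to unit covariance form; E, K and d are
# synthetic stand-ins for the quantities in the text.
rng = np.random.default_rng(0)
m = 6
K = rng.standard_normal((m, 10))
d = rng.standard_normal(m)

A = rng.standard_normal((m, m))
E = A @ A.T + m * np.eye(m)          # symmetric positive definite covariance

# Lower triangular Cholesky factor L with E = L L^T.
L = np.linalg.cholesky(E)

# Whitened quantities K~ = L^{-1} K and d~ = L^{-1} d; in practice these are
# computed by triangular solves, not by forming L^{-1}.
K_tilde = np.linalg.solve(L, K)
d_tilde = np.linalg.solve(L, d)

# The errors in d~ now have identity covariance:  L^{-1} E L^{-T} = I, so the
# noise variance of any estimate c^T d~ is simply ||c||_2^2.
cov_tilde = np.linalg.solve(L, np.linalg.solve(L, E).T)
```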

In the following we describe different algorithms for solving Eq. (9). In Sect. 2.1 we describe the algorithm suggested by Pijpers & Thompson (1992, 1994). As we show in Sect. 2.2, another possibility is to reduce Eq. (9) to a standard unconstrained regularized least squares problem, using the method proposed in Lawson & Hanson (1974), Chap. 20. For the unconstrained least squares problem a variety of efficient regularization methods are available; see, e.g., the survey in Hanke & Hansen (1993). As we will demonstrate, the latter approach is superior in terms of computational efficiency, especially when many values of the regularization parameter \(\mu\) are needed. We also discuss an iterative algorithm which is useful when \(K\) is either sparse or structured.

In the derivation of the algorithms below we have restricted ourselves to the case where the estimate of \(f\) is computed for a single point \(t_0\). The extension of the algorithms to multiple points is trivial and is given in the summaries at the end of each section. Finally we note that we measure the work associated with the algorithms by the number of floating point operations (flops) used. A ``flop'' is a simple arithmetic operation such as a floating point multiplication or addition.

2.1. SOLA algorithm using a Lagrange multiplier

 

The standard solution procedure, which is the one used in Pijpers & Thompson (1992), incorporates the constraint using a Lagrange multiplier and solves the augmented normal equations of (9). This means solving the system
\[
  \begin{pmatrix} K\,W^2 K^T + \mu I & K w \\ (K w)^T & 0 \end{pmatrix}
  \begin{pmatrix} c \\ \lambda \end{pmatrix}
  =
  \begin{pmatrix} K\,W^2 \tau \\ 1 \end{pmatrix} ,
\]
where \(\lambda\) is the Lagrange multiplier. This approach has several disadvantages; in particular, the large system must be formed and factorized anew for each value of \(\mu\), and the coefficient matrix is symmetric but indefinite, so a Cholesky factorization cannot be used.

A solution to the last problem is to apply the Bunch-Kaufman algorithm for symmetric indefinite systems, which computes an \(L D L^T\) factorization using diagonal pivoting (see, e.g., Sects. 4.4.4-5 in Golub & Van Loan 1989), requiring about \(m^3/3\) flops, the same as the usual Cholesky factorization.
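Forming and solving the bordered system can be sketched as follows, assuming the notation above; the kernel matrix and target are synthetic, and a dense solve stands in for the Bunch-Kaufman factorization.

```python
import numpy as np

# Sketch of the augmented system for
#   min_c ||W (K^T c - tau)||^2 + mu ||c||^2   subject to   (K w)^T c = 1.
# All data below are synthetic stand-ins for the quantities in the text.
rng = np.random.default_rng(1)
m, n, mu = 8, 5, 1e-3
K = rng.standard_normal((m, n))
w = np.full(n, 1.0 / n)                # quadrature weights (diagonal of W^2)
tau = rng.standard_normal(n)

KW2KT = K @ (w[:, None] * K.T)         # K W^2 K^T
b = K @ w                              # constraint vector K w

# Bordered, symmetric indefinite (m+1) x (m+1) system.
M = np.zeros((m + 1, m + 1))
M[:m, :m] = KW2KT + mu * np.eye(m)
M[:m, m] = M[m, :m] = b
rhs = np.append(K @ (w * tau), 1.0)

# In practice M would be factorized with the Bunch-Kaufman L D L^T algorithm
# (e.g. LAPACK xSYTRF); a dense solve illustrates the result.
sol = np.linalg.solve(M, rhs)
c, lam = sol[:m], sol[m]
```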

To avoid forming the large system of normal equations, it has been suggested by Christensen-Dalsgaard & Thompson (1993) to reduce the dimension of the system by first computing an SVD of the weighted kernel matrix \(K\,W\):
\[
  K\,W = U\,\Sigma\,V^T ,
\]
where \(U \in \mathbb{R}^{m\times n}\) has orthonormal columns, \(\Sigma = \mathrm{diag}(\sigma_1,\ldots,\sigma_n)\) and \(V \in \mathbb{R}^{n\times n}\) is orthogonal. Then one switches to a system in the transformed quantities \(\tilde c = U^T c\) and \(\tilde b = \Sigma V^T W \mathbf{1} = U^T K w\):
\[
  \begin{pmatrix} \Sigma^2 + \mu I & \tilde b \\ \tilde b^T & 0 \end{pmatrix}
  \begin{pmatrix} \tilde c \\ \lambda \end{pmatrix}
  =
  \begin{pmatrix} \Sigma V^T W \tau \\ 1 \end{pmatrix} ,
  \qquad c = U \tilde c .
\]
Working with \(\Sigma\) and \(\tilde c\) instead of \(K\) and \(c\) reduces the order of the bordered system from \(m+1\) to \(n+1\), and hence the cost of each subsequent solve by a factor of roughly \((m/n)^3\). Now the most expensive part is the computation of \(\Sigma\) and \(V\) of the SVD, which requires \(O(mn^2)\) flops (note that \(U\) is not computed explicitly; products such as \(U^T d\) are computed ``on the fly''). Details on how to compute the SVD are given in, e.g., Golub & Van Loan (1989).
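The reduction can be sketched and checked against the full-size bordered system; the data are synthetic, and for clarity the sketch forms the thin factor \(U\) explicitly even though the text notes this can be avoided.

```python
import numpy as np

# Sketch of the SVD-based reduction: with K W = U S V^T and c = U c~, the
# (m+1) x (m+1) bordered system collapses to order n+1.  Synthetic data.
rng = np.random.default_rng(2)
m, n, mu = 40, 6, 1e-2
K = rng.standard_normal((m, n))
w = np.full(n, 1.0 / n)
tau = rng.standard_normal(n)
sw = np.sqrt(w)                              # diagonal of W

U, s, Vt = np.linalg.svd(K * sw, full_matrices=False)   # thin SVD of K W

b = s * (Vt @ sw)                            # Sigma V^T W 1  =  U^T K w
M = np.zeros((n + 1, n + 1))
M[:n, :n] = np.diag(s**2 + mu)               # Sigma^2 + mu I
M[:n, n] = M[n, :n] = b
rhs = np.append(s * (Vt @ (sw * tau)), 1.0)  # (Sigma V^T W tau ; 1)
c = U @ np.linalg.solve(M, rhs)[:n]          # recover c = U c~

# Full-size solve for comparison: both give the same coefficients.
Mf = np.zeros((m + 1, m + 1))
Mf[:m, :m] = K @ (w[:, None] * K.T) + mu * np.eye(m)
Mf[:m, m] = Mf[m, :m] = K @ w
c_full = np.linalg.solve(Mf, np.append(K @ (w * tau), 1.0))[:m]
```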

The algorithm, which is summarized below, describes the general case where the estimate \(\hat f\) is computed at \(q\) points \(t_0^{(1)},\ldots,t_0^{(q)}\). The matrix \(T_q\) below is the matrix whose columns contain ``samples'' of the \(q\) target functions: \((T_q)_{jk} = T(t_0^{(k)}, t_j)\), \(j = 1,\ldots,n\), \(k = 1,\ldots,q\).
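The multi-target case can be sketched under the same assumed notation; the target matrix and all sizes are hypothetical. The SVD is paid once, after which each combination of \(\mu\) and target column needs only an order-\((n+1)\) solve.

```python
import numpy as np

# Sketch of the multi-target case: the SVD of K W is computed once; each
# (mu, target) combination then costs only a small solve.  Synthetic data.
rng = np.random.default_rng(3)
m, n, q = 40, 6, 3
K = rng.standard_normal((m, n))
w = np.full(n, 1.0 / n)
T_mat = rng.standard_normal((n, q))      # column k: samples of target no. k

sw = np.sqrt(w)
U, s, Vt = np.linalg.svd(K * sw, full_matrices=False)   # computed once
b = s * (Vt @ sw)                        # reduced constraint vector

C = np.zeros((m, q))                     # column k: coefficients for target k
for mu in (1e-1, 1e-2):                  # sweep over regularization parameters
    M = np.zeros((n + 1, n + 1))
    M[:n, :n] = np.diag(s**2 + mu)
    M[:n, n] = M[n, :n] = b
    for k in range(q):
        rhs = np.append(s * (Vt @ (sw * T_mat[:, k])), 1.0)
        C[:, k] = U @ np.linalg.solve(M, rhs)[:n]
```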



Copyright by the European Southern Observatory (ESO)