Appendix A: General derivation of the Backus-Gilbert inverse

In Sect. 3, we described the B-G inverse in rather practical terms, taking care to relate the description to the quantities obtained in, and the concerns relevant to, real observations. Such a physical understanding of the method is essential if it is to be used properly, but we can obtain other insights into the method by reexamining it in a more formal way.

Equation (4) above describes an operator $K: P\to D$ mapping an object from a source space $P$ into a data space $D$. Including noise $n\in D$, we have

$\begin{displaymath} f = Ku + n \end{displaymath}$ (A1)

for $f\in D$ and $u\in P$. Here $P$ is a real Hilbert space, parametrised by $r$, with a symmetric inner product

$\begin{displaymath} (a\vert b)_{P}=\int_0^1 a(r)b(r)\,{\rm d}r, \quad \forall a,b\in P,\end{displaymath}$

and D is a finite-dimensional Euclidean space with

$\begin{displaymath} (a\vert b)_{D}= \sum_i a_i b_i.\end{displaymath}$

We wish to make an estimate $\hat u_r\in {\bf R}$ of a single component of the object $u$, based on the data $f$. To this end, we wish to find a $q\in D$ (depending on $r$), such that

$\begin{displaymath} \hat u_r = (q\vert f)_{D}. \end{displaymath}$ (A2)

We find this $q$ as the solution of a minimisation problem. Introducing the adjoint operator $K^*: D\to P$, and assuming $E\left((q\vert n)_{D}\right)=0$, we have

$\begin{displaymath} E\left(\hat u_r\right) = (q\vert Ku)_{D}= (K^*q\vert u)_{P}. \end{displaymath}$ (A3)
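The adjoint identity in Eq. (A3) can be checked numerically on a grid. The following is a minimal sketch, assuming that Eq. (4) has the form $f_i=\int_0^1 K_i(r)u(r)\,{\rm d}r$; the Gaussian kernels, the grid size and all function names are hypothetical choices made for illustration only.

```python
import numpy as np

# Hypothetical discretisation: midpoint grid on [0, 1] and five Gaussian kernels K_i(r).
n_grid = 200
dr = 1.0 / n_grid
r_grid = (np.arange(n_grid) + 0.5) * dr
centres = np.linspace(0.1, 0.9, 5)
kernels = np.exp(-0.5 * ((r_grid[None, :] - centres[:, None]) / 0.15) ** 2)


def inner_P(a, b):
    """Midpoint-rule approximation of (a|b)_P = int_0^1 a(r) b(r) dr."""
    return float(np.sum(a * b) * dr)


def inner_D(a, b):
    """Euclidean inner product (a|b)_D = sum_i a_i b_i."""
    return float(np.dot(a, b))


def apply_K(u):
    """Forward operator K: P -> D, (Ku)_i = int_0^1 K_i(r) u(r) dr (assumed form)."""
    return kernels @ u * dr


def apply_K_adjoint(q):
    """Adjoint K^*: D -> P, (K^* q)(r) = sum_i q_i K_i(r)."""
    return kernels.T @ q


rng = np.random.default_rng(0)
u = np.sin(2.0 * np.pi * r_grid)     # arbitrary test object in P
q = rng.normal(size=centres.size)    # arbitrary coefficient vector in D

# Both sides of (q|Ku)_D = (K^* q|u)_P; they should agree to rounding error.
lhs = inner_D(q, apply_K(u))
rhs = inner_P(apply_K_adjoint(q), u)
adjoint_gap = abs(lhs - rhs)
print(lhs, rhs, adjoint_gap)
```

In this discretisation the adjoint is simply the transpose of the kernel matrix, because the quadrature weight is attached to the inner product on P rather than to the operator.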

Recalling that $(K^*q\vert u)_{P}=\int_0^1(K^*q)(r')u(r')\,{\rm d}r'$, we see that $K^*q\in P$, evaluated at $r'$, can be identified with the averaging kernel $\Delta(r,r')$, and that Eq. (A3) will be a good estimate of $u_r$ when $q$ is transformed by $K^*$ into the basis vector $e_r\in P$ corresponding to the component $r$ of $u$. That is, Eq. (A3) would be exact if $K^*q=e_r$. The object $K^*q$ will instead be a linear combination of basis vectors "close" to $e_r$, and we can measure its "scatter" around $e_r$ with the operator $Q:P\to P$ such that $Qx_{r'}\equiv (r-r')^2x_{r'}, \forall x_{r'}\in P$. Define

$\begin{displaymath} \mathcal{A} \equiv (K^*q\vert QK^*q)_{P}= (q\vert KQK^*q)_{D}= \sum_{i,j} q_i W_{ij} q_j, \end{displaymath}$ (A4)

defining the (self-adjoint) operator $W=KQK^*:D\to D$. We may also define a measure of the stability of $\hat u_r$,

$\begin{displaymath} \mathcal{B} \equiv (q\vert Sq)_{D}, \end{displaymath}$ (A5)

by analogy with Eq. (11), where the operator $S:D\to D$ is such that $(e_i\vert Se_j)_{D}= S_{ij}$, with $S_{ij}$ the positive-definite noise covariance matrix. The demand that $\Delta(r,r')$ have unit area translates into the constraint $(K^*q\vert 1)_{P}=1$, where $1\in P$ is the all-1 vector in $P$. Writing $R\equiv K1\in D$, this is equivalent to the constraint

$\begin{displaymath} (q\vert R)_{D}= 1, \end{displaymath}$ (A6)

restricting $q$ to a hyperplane in $D$, with normal $R$.
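The quantities in Eqs. (A4) to (A6) reduce to small matrices and vectors once P is discretised. The sketch below assumes the same hypothetical Gaussian kernels as a stand-in for Eq. (4), a hypothetical target radius `r_target`, and uncorrelated noise of equal variance for S. It checks that the matrix forms agree with the integrals over the averaging kernel.

```python
import numpy as np

# Hypothetical discretisation: midpoint grid on [0, 1] and five Gaussian kernels K_i(r).
n_grid = 200
dr = 1.0 / n_grid
r_grid = (np.arange(n_grid) + 0.5) * dr
centres = np.linspace(0.1, 0.9, 5)
kernels = np.exp(-0.5 * ((r_grid[None, :] - centres[:, None]) / 0.15) ** 2)

r_target = 0.4                              # hypothetical component r being estimated
spread_weight = (r_target - r_grid) ** 2    # Q acts as multiplication by (r - r')^2

# W = K Q K^*, i.e. W_ij = int (r - r')^2 K_i(r') K_j(r') dr'
W = (kernels * spread_weight) @ kernels.T * dr
# S: noise covariance; here assumed diagonal with equal variances
S = np.diag(np.full(centres.size, 1e-4))
# R = K 1, i.e. R_i = int K_i(r') dr'
R = kernels.sum(axis=1) * dr

q = np.random.default_rng(1).normal(size=centres.size)   # arbitrary trial coefficients
averaging_kernel = kernels.T @ q                          # (K^* q)(r') = Delta(r, r')

# Eq. (A4): spread computed from the averaging kernel and from the matrix W
spread_direct = np.sum(spread_weight * averaging_kernel**2) * dr
spread_matrix = q @ W @ q

# Eq. (A6): area of the averaging kernel and its equivalent form (q|R)_D
area_direct = np.sum(averaging_kernel) * dr
area_matrix = q @ R

# Eq. (A5): noise term
noise_term = q @ S @ q
print(spread_direct, spread_matrix, area_direct, area_matrix, noise_term)
```

The trial `q` here is arbitrary and does not satisfy the unit-area constraint; the point is only that both ways of evaluating each quantity coincide.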

If we now introduce the functional $c: D\to{\bf R}$, such that

$\begin{displaymath} c(q)\equiv \frac12 [\mathcal{A}(q) + \lambda\mathcal{B}(q)] =\frac12 (q\vert(W+\lambda S)q)_{D}, \end{displaymath}$ (A7)

the minimisation problem becomes that of finding the $q$ which minimises $c(q)$, subject to $(q\vert R)_{D}=1$. Considering small variations $\epsilon\in D$ in the hyperplane (that is, $\{\epsilon:(\epsilon \vert R)_{D}=0\}$), and defining the gradient $\nabla c(q)=(W+\lambda S)q$ ($W+\lambda S$ is self-adjoint since both $W$ and $S$ are), we have

$\begin{displaymath} c(q+\epsilon) = c(q) + (\epsilon\vert\nabla c(q))_{D}+ \frac12 (\epsilon\vert(W+\lambda S)\epsilon)_{D}.\end{displaymath}$
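Because $c$ is quadratic, the expansion above is exact rather than a truncated series. A short numerical check, using a random symmetric positive-definite matrix `M` as a hypothetical stand-in for $W+\lambda S$ and an arbitrary vector as a stand-in for $R$, might look like this:

```python
import numpy as np

rng = np.random.default_rng(2)
n_data = 5

# Hypothetical stand-in for W + lambda*S: symmetric and positive-definite by construction.
factor = rng.normal(size=(n_data, n_data))
M = factor @ factor.T + np.eye(n_data)


def c(q):
    """Quadratic functional c(q) = 1/2 (q|M q)_D."""
    return 0.5 * q @ M @ q


def grad_c(q):
    """Gradient of c; equals M q because M is symmetric."""
    return M @ q


q = rng.normal(size=n_data)
R = rng.normal(size=n_data)      # stand-in for the normal vector R = K1

# Draw a variation and project it into the hyperplane (eps|R)_D = 0.
eps = rng.normal(size=n_data)
eps -= (eps @ R) / (R @ R) * R

# Right-hand side of the expansion, compared with a direct evaluation of c(q + eps).
expansion = c(q) + eps @ grad_c(q) + 0.5 * eps @ M @ eps
expansion_gap = abs(c(q + eps) - expansion)
print(expansion_gap, eps @ R)
```

The symmetry of `M` is what allows the two cross terms to combine into the single gradient term.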

This is extremised at $q_\lambda$ such that $(\epsilon\vert\nabla c(q_\lambda))_{D}=0$, and is a minimum if $(\epsilon\vert(W+\lambda S)\epsilon)_{D}\ge 0$. The operator $S$ is positive-definite by definition, and the operator $W$ is non-negative, since $(q\vert Wq)_{D}=(K^*q\vert QK^*q)_{P}\ge 0$ follows from the non-negativity of $Q$. For $\lambda\ge 0$, the functional $c(q)$ is therefore minimised when $(\epsilon\vert(W+\lambda S)q_\lambda)_{D}=0$ for all $\epsilon$ in the hyperplane, or

$\begin{displaymath} (W+\lambda S)q_\lambda = \alpha R, \end{displaymath}$ (A8)

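for some $\alpha\in{\bf R}$, since a gradient orthogonal to every $\epsilon$ in the hyperplane must be parallel to the normal $R$. Then $q_\lambda=\alpha(W+\lambda S)^{-1}R$, and imposing the constraint Eq. (A6) fixes $\alpha=1/(R\vert(W+\lambda S)^{-1}R)_{D}$. We thus find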

$\begin{displaymath} q_\lambda = \frac{(W+\lambda S)^{-1} R}{(R\vert(W+\lambda S)^{-1}R)_{D}}, \end{displaymath}$ (A9)

from which we can obtain Eq. (13), from the definition of $(\cdot\vert\cdot)_{D}$.
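As an illustration of Eq. (A9), the coefficients can be computed with a single linear solve. This is a sketch under stated assumptions rather than the implementation used in the main text: the Gaussian kernels, the target radius, the diagonal S, the value of `lam` and the function name `backus_gilbert_coefficients` are all hypothetical.

```python
import numpy as np


def backus_gilbert_coefficients(W, S, R, lam):
    """Return q_lambda = M^{-1} R / (R|M^{-1} R)_D with M = W + lam * S, as in Eq. (A9)."""
    M = W + lam * S
    M_inv_R = np.linalg.solve(M, R)     # solves M x = R without forming the inverse
    return M_inv_R / (R @ M_inv_R)


# Hypothetical discretisation: midpoint grid on [0, 1] and five Gaussian kernels K_i(r).
n_grid = 200
dr = 1.0 / n_grid
r_grid = (np.arange(n_grid) + 0.5) * dr
centres = np.linspace(0.1, 0.9, 5)
kernels = np.exp(-0.5 * ((r_grid[None, :] - centres[:, None]) / 0.15) ** 2)

r_target = 0.4                                             # hypothetical target radius
W = (kernels * (r_target - r_grid) ** 2) @ kernels.T * dr  # W = K Q K^*
S = np.diag(np.full(centres.size, 1e-4))                   # assumed noise covariance
R = kernels.sum(axis=1) * dr                               # R = K 1
lam = 1.0                                                  # hypothetical trade-off parameter

q_lam = backus_gilbert_coefficients(W, S, R, lam)
M = W + lam * S

# Constraint (A6): the averaging kernel has unit area.
constraint_value = q_lam @ R

# Stationarity (A8): the gradient M q_lambda is parallel to R.
gradient = M @ q_lam
alpha = (gradient @ R) / (R @ R)
parallel_residual = np.linalg.norm(gradient - alpha * R)

# Minimality: variations within the hyperplane (eps|R)_D = 0 should not lower c.
rng = np.random.default_rng(3)
cost = lambda q: 0.5 * q @ M @ q
increases = []
for _ in range(100):
    eps = 0.1 * rng.normal(size=R.size)
    eps -= (eps @ R) / (R @ R) * R      # project into the constraint hyperplane
    increases.append(cost(q_lam + eps) - cost(q_lam))
min_cost_increase = min(increases)
print(constraint_value, parallel_residual, min_cost_increase)
```

Using `np.linalg.solve` instead of an explicit inverse is a numerical convenience only; it evaluates the same expression as Eq. (A9).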
