Up: Inversion of polarimetric data


Appendix A: General derivation of the Backus-Gilbert inverse

In Sect. 3, we described the B-G inverse in rather practical terms, taking care to relate the description to the quantities obtained in, and the concerns relevant to, real observations. Such a physical understanding of the method is essential if it is to be used properly, but we can obtain other insights into the method by reexamining it in a more formal way.

Equation (4) above describes an operator $K: P\to D$ mapping an object from a source space P into a data space D. Including noise $n\in D$, we have

 
\begin{displaymath}
f = Ku + n
\end{displaymath} (A1)
for $f\in D$ and $u\in P$. Here P is a real Hilbert space, parametrised by r, with a symmetric inner product

\begin{displaymath}
(a\vert b)_{P}=\int_0^1 a(r)b(r)\,{\rm d}r, \quad \forall a,b\in P,
\end{displaymath}

and D is a finite-dimensional Euclidean space with

\begin{displaymath}
(a\vert b)_{D}= \sum_i a_i b_i.
\end{displaymath}

We wish to make an estimate $\hat u_r\in \mathbf{R}$ of a single component of the object u, based on the data f. To this end, we seek a $q\in D$ (depending on r), such that
\begin{displaymath}
\hat u_r = (q\vert f)_{D}.
\end{displaymath} (A2)
We find this q as the solution of a minimisation problem. Introducing the adjoint operator $K^*: D\to P$, and assuming $E\left((q\vert n)_{D}\right)=0$, we have
\begin{displaymath}
E\left(\hat u_r\right) = (q\vert Ku)_{D} = (K^*q\vert u)_{P}.
\end{displaymath} (A3)
Recalling that $(K^*q\vert u)_{P}=\int_0^1(K^*q)(r')u(r')\,{\rm d}r'$, we see that the function $K^*q\in P$ can be identified with the averaging kernel $\Delta(r,r')$, and that Eq. (A3) will be a good estimate of $u_r$ when q is transformed by $K^*$ into the basis vector $e_r\in P$ corresponding to the component r of u. That is, Eq. (A3) would be exact if $K^*q=e_r$. The object $K^*q$ will instead be a linear combination of basis vectors ``close'' to $e_r$, and we can measure its ``scatter'' around $e_r$ with the operator $Q:P\to P$ such that $Qx_{r'}\equiv (r-r')^2x_{r'}, \forall x_{r'}\in P$. Define
\begin{displaymath}
\mathcal{A} \equiv (K^*q\vert QK^*q)_{P} = (q\vert KQK^*q)_{D} = \sum_{i,j} q_i W_{ij} q_j,
\end{displaymath} (A4)
defining the (self-adjoint) operator $W=KQK^*:D\to D$. We may also define a measure of the stability of $\hat u_r$, 
\begin{displaymath}
\mathcal{B} \equiv (q\vert Sq)_{D},
\end{displaymath} (A5)
by analogy with Eq. (11), where the operator $S: D\to D$ is such that $(e_i\vert Se_j)_{D}= S_{ij}$, with $S_{ij}$ the positive-definite noise covariance matrix. The demand that $\Delta(r,r')$ have unit area translates into the constraint $(K^*q\vert 1)_{P}=1$, where $1\in P$ is the all-1 vector in P. Writing $R\equiv K1\in D$, this is equivalent to the constraint
\begin{displaymath}
(q\vert R)_{D} = 1,
\end{displaymath} (A6)
restricting q to a hyperplane in D, with normal R.
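The identities behind Eqs. (A4) and (A6) can be checked numerically. The following is a minimal sketch only, assuming that K acts as $(Ku)_i=\int_0^1 K_i(r')u(r')\,{\rm d}r'$ and replacing the integral over P by a trapezoidal sum. The Gaussian kernels, the grid, the target radius `r0` and the trial weights `q` are all hypothetical choices made for illustration, not quantities taken from the paper.

```python
import math

# Uniform grid on [0, 1]; the resolution is an arbitrary choice.
N = 1001
H = 1.0 / (N - 1)
GRID = [k * H for k in range(N)]


def integrate(values):
    """Trapezoidal rule on GRID, standing in for the inner product on P."""
    return H * (sum(values) - 0.5 * (values[0] + values[-1]))


# Hypothetical data kernels K_i(r'): Gaussians, not the kernels of Eq. (4).
CENTRES = [0.2, 0.5, 0.8]
WIDTH = 0.15
KERNELS = [[math.exp(-0.5 * ((rp - c) / WIDTH) ** 2) for rp in GRID]
           for c in CENTRES]


def spread_matrix(r):
    """W_ij = int_0^1 K_i(r') (r - r')^2 K_j(r') dr', i.e. W = K Q K^*."""
    weight = [(r - rp) ** 2 for rp in GRID]
    return [[integrate([ki[k] * weight[k] * kj[k] for k in range(N)])
             for kj in KERNELS] for ki in KERNELS]


def area_vector():
    """R_i = int_0^1 K_i(r') dr', i.e. R = K 1."""
    return [integrate(ki) for ki in KERNELS]


def averaging_kernel(q):
    """Delta(r, r') = (K^* q)(r') = sum_i q_i K_i(r')."""
    return [sum(qi * ki[k] for qi, ki in zip(q, KERNELS)) for k in range(N)]


r0 = 0.5                 # target radius r (hypothetical)
q = [0.3, 1.0, 0.3]      # arbitrary trial weights, not the optimum
W = spread_matrix(r0)
R = area_vector()
delta = averaging_kernel(q)

# Spread: (K^* q | Q K^* q)_P computed directly, and as sum_ij q_i W_ij q_j.
spread_direct = integrate([(r0 - rp) ** 2 * d * d
                           for rp, d in zip(GRID, delta)])
spread_quadratic = sum(q[i] * W[i][j] * q[j]
                       for i in range(3) for j in range(3))

# Area: (K^* q | 1)_P computed directly, and as (q | R)_D.
area_direct = integrate(delta)
area_linear = sum(qi * ri for qi, ri in zip(q, R))

print(spread_direct, spread_quadratic)
print(area_direct, area_linear)
```

The two numbers in each printed pair should agree to rounding error, since both sides are the same finite sum in a different order. The trial weights here do not satisfy the unit-area constraint; rescaling q by `1 / area_linear` would enforce it.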

If we now introduce the functional $c: D\to\mathbf{R}$, such that
\begin{displaymath}
c(q)\equiv \frac12 [\mathcal{A}(q) + \lambda\mathcal{B}(q)]
=\frac12 (q\vert(W+\lambda S)q)_{D},
\end{displaymath} (A7)
the minimisation problem becomes that of finding the q which minimises c(q), subject to $(q\vert R)_{D}=1$. Considering small variations $\epsilon\in D$ in the hyperplane (that is, $\{\epsilon:(\epsilon \vert R)_{D}=0\}$), and defining the gradient $\nabla c(q)=(W+\lambda S)q$ ($W+\lambda S$ is self-adjoint since both W and S are), we have

\begin{displaymath}
c(q+\epsilon) = c(q) + (\epsilon\vert\nabla c(q))_{D} +
\frac12 (\epsilon\vert(W+\lambda S)\epsilon)_{D}.
\end{displaymath}

This is extremised at $q_\lambda$ such that $(\epsilon\vert\nabla c(q_\lambda))_{D}=0$, and is a minimum if $(\epsilon\vert(W+\lambda S)\epsilon)_{D}\ge 0$. The operator S is positive-definite by definition, and the operator W is positive-definite, since Q is. The functional c(q) is therefore minimised when $(\epsilon\vert(W+\lambda S)q_\lambda)_{D}=0$ for all $\epsilon$ with $(\epsilon\vert R)_{D}=0$, that is, when $(W+\lambda S)q_\lambda$ is parallel to R, or
\begin{displaymath}
(W+\lambda S)q_\lambda = \alpha R,
\end{displaymath} (A8)
for some $\alpha\in\mathbf{R}$. Imposing the constraint Eq. (A6), we thus find
\begin{displaymath}
q_\lambda = \frac{(W+\lambda S)^{-1} R}{(R\vert(W+\lambda S)^{-1}R)_{D}},
\end{displaymath} (A9)
from which we can obtain Eq. (13), from the definition of $\relax (\cdot\vert\cdot )_{D}$.
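As an illustration of Eq. (A9), the sketch below solves $(W+\lambda S)x=R$ by Gaussian elimination and normalises the result so that $(q\vert R)_{D}=1$. It is a minimal example only: the matrices `W` and `S`, the vector `R` and the value of `LAM` are hypothetical numbers chosen to be symmetric and positive-definite, and the helper names are ours rather than anything defined in the paper.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def dot(a, b):
    """Euclidean inner product (a|b)_D."""
    return sum(x * y for x, y in zip(a, b))


def matvec(a, v):
    """Matrix-vector product."""
    return [dot(row, v) for row in a]


def combined(w, s, lam):
    """Return the matrix W + lam * S."""
    n = len(w)
    return [[w[i][j] + lam * s[i][j] for j in range(n)] for i in range(n)]


def bg_weights(w, s, r_vec, lam):
    """q_lambda of Eq. (A9): (W + lam S)^{-1} R, normalised so (q|R)_D = 1."""
    x = solve(combined(w, s, lam), r_vec)
    norm = dot(r_vec, x)          # (R | (W + lam S)^{-1} R)_D
    return [xi / norm for xi in x]


def cost(w, s, lam, q):
    """c(q) of Eq. (A7)."""
    return 0.5 * dot(q, matvec(combined(w, s, lam), q))


# Hypothetical inputs: symmetric positive-definite W, unit noise covariance S.
W = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
S = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
R = [1.0, 2.0, 1.0]
LAM = 0.5

q = bg_weights(W, S, R, LAM)
grad = matvec(combined(W, S, LAM), q)     # should be parallel to R, Eq. (A8)

# A variation within the constraint hyperplane: (eps|R)_D = 2 - 2 + 0 = 0.
eps = [2.0, -1.0, 0.0]
q_shifted = [qi + 0.1 * ei for qi, ei in zip(q, eps)]

print("q =", q)
print("(q|R) =", dot(q, R))
print("c(q) =", cost(W, S, LAM, q), " c(q + 0.1 eps) =", cost(W, S, LAM, q_shifted))
```

Solving the linear system avoids forming the inverse explicitly. The ratios `grad[i] / R[i]` give the multiplier $\alpha$ of Eq. (A8), and any shift of q along a direction orthogonal to R should increase c.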



Copyright The European Southern Observatory (ESO)