D. Matrix solution techniques

The methods discussed in this paper require nonlinear minimisation of the variance of a function of several matrices with respect to one of them. One way to handle this is to write out all equations and their derivatives in terms of the real and imaginary parts of all matrix elements and then apply standard nonlinear solution methods (Press et al. 1989). This is the approach used in AIPS++ (1998); it is cumbersome, and the resulting code is complex and difficult to verify (T. Cornwell, private communication).

In the methods described below, entire matrices are the atomic variables. Full advantage is thus taken of the conceptual efficiency of matrix algebra, which in turn is reflected in simple solution algorithms that are easy to code. The algorithms explicitly exploit the structure of the equations; for this reason they may well be more efficient than the general-purpose approach. They also lend themselves to quick experiments in an environment such as AIPS++ (1998), in which matrix operations can be coded directly.

D.1. Differentiation

The problem is that of finding the matrix $\vec{V}$ that minimises the variance of some matrix function $\vec{M}$ of $\vec{V}$. Attacking this problem in the conventional way requires the definition of the derivative of $\mbox{Var}\ \vec{M}(\vec{V})$ with respect to $\vec{V}$. An equivalent approach that requires no new definitions is to consider the differentials themselves rather than their quotient. For a variation $\delta \vec{M}$, the corresponding variation in $\mbox{Var}\ \vec{M}$ is (cf. Eq. 41)

$\displaystyle \delta \,\mbox{Var}\ \vec{M} = \mbox{Tr}\ ( \vec{M}\,\delta\vec{M}^{\dagger}+ \delta\vec{M}\,\vec{M}^{\dagger}) = 2\, \mbox{Tr\,Re\ }( \vec{M}\,\delta\vec{M}^{\dagger}).$

(43)
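Eq. (43) can be checked numerically. The sketch below assumes that $\mbox{Var}\ \vec{M}$ stands for $\mbox{Tr}\ (\vec{M}\,\vec{M}^{\dagger})$, which is the reading that makes the first line of Eq. (43) a plain product rule; the function names `var` and `delta_var` and the random test matrices are illustrative only.

```python
import numpy as np

def var(m):
    # Assumed reading of Var M: Tr(M M^dagger), the sum of squared moduli of the elements.
    return float(np.real(np.trace(m @ m.conj().T)))

def delta_var(m, dm):
    # First-order variation predicted by Eq. (43): 2 Tr Re(M dM^dagger).
    return 2.0 * float(np.real(np.trace(m @ dm.conj().T)))

rng = np.random.default_rng(1)
m = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
dm = 1e-6 * (rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))

exact = var(m + dm) - var(m)      # true change, which also contains the second-order term
predicted = delta_var(m, dm)      # first-order change from Eq. (43)

# Under the assumed definition the two differ by Var(dM), which is of second order.
print(exact - predicted, var(dm))
```

Under this assumption the residual is the second-order term $\mbox{Var}\ \delta\vec{M}$, which is what one expects of a differential.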

In the applications of interest, $\vec{M}\,\delta\vec{M}^{\dagger}$ is a sum of (products of) several other matrices, one of which is $\delta\vec{V}$. In each of these products, we may cyclically permute the factors to move $\delta\vec{V}$ to the trailing position (cf. Appendix C.5). Since the variation must vanish at a minimum, Eq. (43) is thus converted to a condition of the form

$\displaystyle \mbox{Tr\,Re\ }( \vec{Z}\,\delta\vec{V}) = 0.$

(44)

If $\vec{V}$ is constrained, e.g. to being diagonal or unitary, corresponding constraints are to be imposed upon $\delta\vec{V}$.

If, for any permitted variation $\delta\vec{V}$, the variation $i\,\delta\vec{V}$ is also allowed, Eq. (44) can be simplified by omitting the ${\rm Re\,}$ operator. If, moreover, $\delta\vec{V}$ is completely free, Eq. (44) implies that $\vec{Z}$ itself is $\vec {0}$.
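As a small worked illustration of this recipe, take $\vec{M} = \vec{A}\,\vec{V} - \vec{B}$ with $\vec{V}$ completely free. The fixed matrices $\vec{A}$ and $\vec{B}$ are hypothetical and serve only this example. Then $\delta\vec{M}^{\dagger} = \delta\vec{V}^{\dagger}\vec{A}^{\dagger}$ and

$\displaystyle \mbox{Tr\,Re\ }( \vec{M}\,\delta\vec{V}^{\dagger}\vec{A}^{\dagger}) = \mbox{Tr\,Re\ }( \vec{A}^{\dagger}\vec{M}\,\delta\vec{V}^{\dagger}) = \mbox{Tr\,Re\ }( \vec{M}^{\dagger}\vec{A}\,\delta\vec{V}) ,$

where the first step is a cyclic permutation and the second uses the fact that the traces of a matrix and of its Hermitian conjugate have the same real part. In this example $\vec{Z} = \vec{M}^{\dagger}\vec{A}$, and the condition $\vec{Z} = \vec{0}$ reduces to the familiar normal equations $\vec{A}^{\dagger}\vec{A}\,\vec{V} = \vec{A}^{\dagger}\vec{B}$.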

D.2. Self-alignment decomposition

I show a simple least-squares self-alignment algorithm as an example. Given a set of observed coherencies $\vec{W}_{jk \vphantom{'}}$ and a source model $\vec{E}'_{jk \vphantom{'}}$, we seek to fit values $\vec{J}'_{j \vphantom{'}}$ that minimise the noise power at the interferometer inputs:

$\begin{eqnarray*}S = \sum_{jk \vphantom{'}}\mbox{Var}\ ( \vec{J}{_{j}^{'-1}} \ve... ...{'}}\vec{J}{_{k}^{'-1}}^{\dagger}- \vec{E}'_{jk \vphantom{'}}). \end{eqnarray*}$

For a change $\delta\vec{J}{_{j}^{'-1}}$ we have, from Eq. (43)

$\begin{eqnarray*}\begin{array}{ll} \delta S = 2 \mbox{Tr\,Re\ }\big( \sum_k &... ...elta(\vec{J}{_{j}^{'-1}}{^{\dagger}}) \,\big) = 0 . \end{array} \end{eqnarray*}$

Since $\delta(\vec{J}{_{j}^{'-1}}^{\dagger})$ is arbitrary, it follows that

$\begin{eqnarray*}\vec{J}{_{j}^{'-1}} \sum_k \vec{W}'_{jk \vphantom{'}}\vec{J}{_{... ...'}}\vec{J}{_{k}^{'-1}} \vec{W}'_{jk \vphantom{'}}{^{\dagger}} . \end{eqnarray*}$

Given a set of estimates $\vec{J}'{_{k}^{-1}}$, this equation provides the basis for an iterative algorithm by producing a new estimate for $\vec{J}'{_{j}^{-1}}$.
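One way such an iteration might be coded is sketched below. It rests on several assumptions that should be checked against the equations above: $\mbox{Var}\ \vec{M}$ is taken as $\mbox{Tr}\ (\vec{M}\,\vec{M}^{\dagger})$; the sum runs over $k \ne j$; the data obey $\vec{W}_{kj} = \vec{W}_{jk}^{\dagger}$ and $\vec{E}'_{kj} = \vec{E}'_{jk}{^{\dagger}}$; and the update is the condition obtained by setting the variation of $S$ to zero, written with $\vec{X}_j$ for $\vec{J}{_{j}^{'-1}}$ as $\vec{X}_j \sum_k \vec{W}_{jk}\vec{X}_k^{\dagger}\vec{X}_k\vec{W}_{jk}^{\dagger} = \sum_k \vec{E}'_{jk}\vec{X}_k\vec{W}_{jk}^{\dagger}$. The names `var`, `cost`, `update` and `sweep` are illustrative.

```python
import numpy as np

def var(m):
    # Assumed reading of Var M: Tr(M M^dagger).
    return float(np.real(np.trace(m @ m.conj().T)))

def cost(x, w, e):
    """S = sum over j != k of Var(X_j W_jk X_k^dagger - E_jk); X_j stands for J'_j^{-1}."""
    n = len(x)
    return sum(var(x[j] @ w[j, k] @ x[k].conj().T - e[j, k])
               for j in range(n) for k in range(n) if k != j)

def update(j, x, w, e):
    """New estimate of X_j from X_j A = B, with all other X_k held fixed."""
    a = np.zeros((2, 2), dtype=complex)
    b = np.zeros((2, 2), dtype=complex)
    for k in range(len(x)):
        if k == j:
            continue
        t = x[k] @ w[j, k].conj().T   # X_k W_jk^dagger
        a += t.conj().T @ t           # W_jk X_k^dagger X_k W_jk^dagger
        b += e[j, k] @ t              # E_jk X_k W_jk^dagger
    return b @ np.linalg.inv(a)

def sweep(x, w, e):
    """One pass over all elements, replacing each X_j in turn (in place)."""
    for j in range(len(x)):
        x[j] = update(j, x, w, e)
    return x
```

Here `x` is a complex array of shape `(n, 2, 2)` and `w`, `e` have shape `(n, n, 2, 2)`. Because each update solves a linear least-squares problem in a single $\vec{X}_j$, a sweep cannot increase $S$ under the stated symmetry assumption; whether and how fast the iteration converges is not addressed by this sketch.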

Note the similarity of the first three factors on the left-hand side to $\sum \vec{E}'_{jk \vphantom{'}}$. Like dimensional checks in physics, this similarity provides a partial check on the correctness of an equation.

This method is easily generalised to a more proper $\chi^2$ form for the case where the four polarisation channels in each interferometer carry the same noise level. This is probably an adequate assumption in most practical cases.

D.3. Feed-error minimisation

Section 8.2 poses the problem of minimising

$\displaystyle S = \sum_{j \vphantom{'}}\mbox{Var}\ ( \vec{D}'_{j \vphantom{'}}- \mathbf{I}),$

(45)

where

$\displaystyle \vec{D}'_{j \vphantom{'}}= \vec{G}'{_{j}^{-1}} \,\vec{J}'_{j \vphantom{'}}\,\vec{Y}'^{-1} \,\vec{F}_{j}^{-1} ,$

$\vec{G}'_{j \vphantom{'}}$ are unknown diagonal gain matrices and $\vec{Y}'$ is the unknown unitary polrotation matrix (which is not necessarily unimodular). Taking differentials, we have

$\begin{eqnarray*}\delta S = 2\, \sum_{j \vphantom{'}}\mbox{Tr\,Re\ }\big(\, ( \v... ...bf{I}) \,\delta \vec{D}'_{j \vphantom{'}}{^{\dagger}} \,\big) . \end{eqnarray*}$

$\vec{G}'{_{j}^{-1}}$ is found given the current value of $\vec{Y}'$ in the way indicated above. The constraint on $\delta\vec{G}'{_{j}^{-1}}$ is that it be diagonal. The result is

$\begin{eqnarray*}(\vec{G}'{_{j}^{-1}})_{ll} = ( \vec{J}_{j \vphantom{'}}\,\vec{J... ...\,\vec{J}'_{j \vphantom{'}}{^{\dagger}})_{ll} \,,\quad l=1,2 . \end{eqnarray*}$
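The diagonal constraint can be illustrated with a short sketch. It assumes $\mbox{Var}\ \vec{M} = \mbox{Tr}\ (\vec{M}\,\vec{M}^{\dagger})$ and abbreviates $\vec{K} = \vec{J}'_{j}\,\vec{Y}'^{-1}\,\vec{F}_{j}^{-1}$ (a shorthand introduced here only), so that the quantity to be minimised over diagonal $\vec{X} = \vec{G}'{_{j}^{-1}}$ is $\mbox{Var}\ (\vec{X}\vec{K} - \mathbf{I})$. Restricting Eq. (44) to diagonal variations then requires the diagonal of $(\vec{X}\vec{K} - \mathbf{I})\,\vec{K}^{\dagger}$ to vanish, i.e. $X_{ll}\,(\vec{K}\vec{K}^{\dagger})_{ll} = (\vec{K}^{\dagger})_{ll}$. This is a derivation under the stated assumptions, not a transcription of the result above.

```python
import numpy as np

def var(m):
    # Assumed reading of Var M: Tr(M M^dagger).
    return float(np.real(np.trace(m @ m.conj().T)))

def solve_diagonal_gain(k):
    """Diagonal X minimising Var(X K - I).

    X stands for G'_j^{-1}; K is the illustrative shorthand J'_j Y'^{-1} F_j^{-1}.
    """
    kk = k @ k.conj().T                              # K K^dagger, real positive diagonal
    g = np.conj(np.diag(k)) / np.real(np.diag(kk))   # X_ll = (K^dagger)_ll / (K K^dagger)_ll
    return np.diag(g)
```

Each diagonal element is determined independently of the others, which is why no iteration is needed within this step.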

To solve for $\vec{Y}'$ given the current values of the $\vec{G}'{_{j}^{-1}}$, we begin by applying unitary transformations $\vec{F}{_{j}^{-1}}$ to the summands in Eq. (45) to obtain (cf. Appendix A.5)

$\begin{eqnarray*}S = \sum_{j \vphantom{'}}\mbox{Var}\ ( \vec{F}_{j}^{-1}\,\vec{G... ...{-1}} \,\vec{J}'_{j \vphantom{'}}\,\vec{Y}'^{-1} - \mathbf{I}). \end{eqnarray*}$

We may now minimise S by invoking the minimum-variance theorem of Appendix C.5, in combination with the fact that

$\displaystyle \sum_{j \vphantom{'}}\mbox{Var}\ (\, \vec{Z}_{j \vphantom{'}}\vec{Y}'^{-1} - \mathbf{I}\,)$

and

$\displaystyle \mbox{Var}\ \big( \sum_{j \vphantom{'}}(\, \vec{Z}_{j \vphantom{'}}\vec{Y}'^{-1} - \mathbf{I}\,) \big)$

are minimal for the same value of $\vec{Y}'$.
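The equivalence can be seen by expanding both expressions under the assumption $\mbox{Var}\ \vec{M} = \mbox{Tr}\ (\vec{M}\,\vec{M}^{\dagger})$: for unitary $\vec{Y}'$, the only term in either expression that depends on $\vec{Y}'$ is a negative multiple of $\mbox{Tr\,Re\ }( \sum_j \vec{Z}_j\,\vec{Y}'^{-1})$. The sketch below maximises that term with a singular-value decomposition, which is the standard solution of the unitary Procrustes problem; it is offered as one possible realisation and is not claimed to be the form in which Appendix C.5 states the theorem. The function names are illustrative.

```python
import numpy as np

def var(m):
    # Assumed reading of Var M: Tr(M M^dagger).
    return float(np.real(np.trace(m @ m.conj().T)))

def sum_of_vars(z_list, u):
    # First expression: sum_j Var(Z_j U - I), with U standing for Y'^{-1}.
    return sum(var(z @ u - np.eye(2)) for z in z_list)

def var_of_sum(z_list, u):
    # Second expression: Var(sum_j (Z_j U - I)).
    return var(sum(z @ u - np.eye(2) for z in z_list))

def solve_unitary(z_list):
    """Unitary U maximising Re Tr((sum_j Z_j) U), hence minimising both expressions."""
    z = sum(z_list)                   # only the sum of the Z_j enters the U-dependent term
    p, _, qh = np.linalg.svd(z)       # z = p @ diag(s) @ qh
    return qh.conj().T @ p.conj().T   # U = Q P^dagger gives Tr(Z U) = sum of singular values
```

The result is unitary but not necessarily unimodular, in line with the remark on $\vec{Y}'$ above.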
