3 Scalar self-calibration

The instability of our instruments limits the possibility of external calibration against cosmic or man-made standards. The high dynamic ranges that are now the norm depend entirely on self-calibration or selfcal ([Thompson et al. 1986]; [Perley et al. 1994]).

So far, selfcal has been known only in scalar form. Although it is almost universally applied, we have no complete and compelling theory to describe it. In practice it shows a strong propensity to converge to a unique solution, provided it is given enough data. Yet we have no formal proof of this uniqueness.

My aim now is to study matrix selfcal by approaching it as a generalisation of the scalar variant. In doing so, the best one may expect is to reach an analogously incomplete understanding. As we shall see, this is enough for making interesting inferences.

I begin by reviewing scalar selfcal. I ignore the effect of noise, except to note that it introduces an element of uncertainty into the entire process that may in unfavourable cases subvert the apparent uniqueness of our solution. As a rule this does not appear to happen in practice.

3.1 Scalar self-calibration

Scalar selfcal works on the basis of two assumptions:

1. All instrumental effects are antenna-based, i.e. the correlator is error-free. Thus our observed visibility is given by

$\displaystyle v(\vec{r}_{jkt \vphantom{'}}) = g_{jt \vphantom{'}}\,e(\vec{r}_{jkt \vphantom{'}})\, g{_{kt \vphantom{'}}}^*.$

2. The sky is "relatively empty": the source brightness is nonzero only in a minor fraction of the observed field, the source's support. In practice, it turns out that the support need not be known a priori, but can be found and successively refined by inspection of provisional "dirty" images.

Given a set of observations, selfcal seeks to find antenna gains $g'_{jt \vphantom{'}}$ and visibilities $e'(\vec{r}_{jkt \vphantom{'}})$ that are consistent with them:

$\displaystyle v(\vec{r}_{jkt \vphantom{'}}) = g'_{jt \vphantom{'}}\,e'(\vec{r}_{jkt \vphantom{'}})\, g'{_{kt \vphantom{'}}}^*.$

(10)

Obviously, one solution consists in the true gains and visibilities. In addition, Eq. (10) is satisfied by the combination

$\displaystyle g'_{jt \vphantom{'}}= g_{jt \vphantom{'}}\,x^{-1}_{jt}, \qquad e'(\vec{r}_{jkt \vphantom{'}}) = x_{jt \vphantom{'}}\ e(\vec{r}_{jkt \vphantom{'}})\ x{_{kt \vphantom{'}}}^*, \qquad j,k = 1, \ldots, N$

(11)
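The invariance expressed by Eqs. (10) and (11) is easy to check numerically. The following is a minimal sketch for a single time sample, not part of the original derivation: the antenna count, the random gains and visibilities, and the helper name `observe` are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5  # number of antennas (illustrative value)


def random_complex(*shape):
    """Complex numbers with modulus in [0.5, 1.5) and random phase (never zero)."""
    return (0.5 + rng.random(shape)) * np.exp(2j * np.pi * rng.random(shape))


g = random_complex(N)      # "true" antenna gains g_j at one time t
e = random_complex(N, N)   # "true" visibilities e(r_jk), arbitrary test values
x = random_complex(N)      # arbitrary nonzero multipliers x_j


def observe(gains, vis):
    """Observed visibilities v_jk = g_j * e_jk * conj(g_k)."""
    return gains[:, None] * vis * np.conj(gains)[None, :]


# Alternative solution of Eq. (11): g'_j = g_j / x_j, e'_jk = x_j e_jk conj(x_k)
g_alt = g / x
e_alt = x[:, None] * e * np.conj(x)[None, :]

v_true = observe(g, e)
v_alt = observe(g_alt, e_alt)
max_diff = np.max(np.abs(v_true - v_alt))
print("largest difference between the two sets of observed visibilities:", max_diff)
```

The two sets of observed visibilities agree to rounding error, so the observations alone cannot distinguish the true gains and visibilities from the transformed ones.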

for any conceivable set of multipliers $x_{jt \vphantom{'}}$. For each of these, the visibilities $e'(\vec{r}_{jkt \vphantom{'}})$ in turn correspond to a source model B' according to Eq. (4):

$\displaystyle x_{jt \vphantom{'}}\ e(\vec{r}_{jkt \vphantom{'}})\ x{_{kt \vphantom{'}}}^* = \sum_{\vec{l} \vphantom{'}}w(\vec{l},\vec{r}_{jkt \vphantom{'}})\ B'\,(\vec{l}).$

(12)

If the source support is limited as assumed, the sum contains only a limited number L of terms for which $B'(\vec{l})$ can differ from zero. For a properly conditioned observation, the number of visibility samples (of order MN(N-1)/2) is much greater than the number of unknowns: MN values of the $x_{jt \vphantom{'}}$ plus L values of $B'(\vec{l})$.
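To give a feeling for the numbers, the count can be written out as a small worked example. This is only a sketch: I read M as the number of time samples, and the values N = 27, M = 100 and L = 500 are hypothetical, chosen purely for illustration.

```python
def selfcal_counts(n_antennas, n_times, n_support):
    """Return (number of visibility samples, number of unknowns).

    Samples:  M * N * (N - 1) / 2  baseline-time combinations.
    Unknowns: M * N multipliers x_jt plus L brightness values B'(l).
    """
    n_vis = n_times * n_antennas * (n_antennas - 1) // 2
    n_unknowns = n_times * n_antennas + n_support
    return n_vis, n_unknowns


# Hypothetical observation: 27 antennas, 100 time samples, 500 support pixels
n_vis, n_unknowns = selfcal_counts(n_antennas=27, n_times=100, n_support=500)
print(n_vis, n_unknowns)  # 35100 samples against 3200 unknowns
```

The excess of samples over unknowns grows roughly in proportion to N, which is why arrays with many antennas are the most favourable case.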

The system is now overdetermined, but we have already seen that it admits at least one solution. It is not unique, however. Indeed, if all the $x_{jt \vphantom{'}}$ equal one value x, Eq. (12) can be rewritten as

$\displaystyle x\ e(\vec{r}_{jkt \vphantom{'}})\ x^* = \sum_{\vec{l} \vphantom{'}}w(\vec{l},\vec{r}_{jkt \vphantom{'}})\ x\ B(\vec{l})\ x^*$

which defines a brightness solution

$\displaystyle B'(\vec{l}) = x\ B(\vec{l})\ x^*.$

(13)

Obviously, B' is confined to the support of B. Actually it is an exact but scaled replica. Other solutions are unlikely to exist. If we should allow the $x_{jt \vphantom{'}}$ to take independent values, this results in scattering of brightness away from the source to other parts of the image. It is reasonable to conjecture that it is impossible for any "wild" combination of $x_{jt \vphantom{'}}$ values to produce a false brightness image that nonetheless vanishes everywhere outside the source support. Practical experience of two decades supports this conjecture, but I repeat that a formal proof is lacking and the solution may not always be robust against the effect of noise.
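The role of the support can be illustrated with a toy experiment. The sketch below is an illustration under stated assumptions, not a proof and not the procedure of any particular package: it assumes a one-dimensional array, takes the weights of Eq. (4) to be the Fourier kernel $w(l, r) = \exp(-2\pi i\, l r)$, uses a single time sample, and picks antenna positions, support pixels and brightness values arbitrarily. For a given set of multipliers it asks whether the modified visibilities can still be reproduced by a brightness confined to the support.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 10                                    # antennas (illustrative)
u = rng.uniform(0.0, 100.0, N)            # assumed 1-D antenna positions, in wavelengths
j, k = np.triu_indices(N, k=1)            # all antenna pairs with j < k
r = u[j] - u[k]                           # baselines r_jk

l_support = np.array([-0.02, 0.0, 0.03])  # assumed support pixels (direction cosines)
B = np.array([1.0, 0.5, 2.0])             # assumed true brightness on the support

# Assumed Fourier kernel w(l, r) = exp(-2 pi i l r), one column per support pixel
A = np.exp(-2j * np.pi * np.outer(r, l_support))
e = A @ B                                 # true visibilities


def fit_on_support(x):
    """Least-squares fit of a support-limited B' to x_j e_jk conj(x_k)."""
    e_alt = x[j] * e * np.conj(x[k])
    B_fit, *_ = np.linalg.lstsq(A, e_alt, rcond=None)
    misfit = np.linalg.norm(A @ B_fit - e_alt) / np.linalg.norm(e_alt)
    return B_fit, misfit


# Case 1: all multipliers equal one value x, as in Eq. (13)
x_common = np.full(N, 0.8 * np.exp(0.7j))
B_common, misfit_common = fit_on_support(x_common)

# Case 2: independent ("wild") multipliers
x_wild = (0.5 + rng.random(N)) * np.exp(2j * np.pi * rng.random(N))
B_wild, misfit_wild = fit_on_support(x_wild)

print("common x: relative misfit", misfit_common, " B'/B =", (B_common / B).real)
print("wild x:   relative misfit", misfit_wild)
```

With a common x the fit is exact and returns B scaled by $xx^*$; with independent multipliers no brightness on the support reproduces the data, and the remaining misfit is what would show up as brightness scattered outside the support. One random draw of course demonstrates nothing in general; it merely makes the conjecture tangible.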

The above argument pinpoints the support limitation as the agent that makes selfcal work. This idea does not seem to have been systematically exploited before, but Leppänen et al. (1995) advance it in discussing the construction of the polarized part for a source model whose total intensity is already available (cf. Sect. 7.2).

3.2 Calibration versus alignment

Equation (13) does not represent a complete calibration: out of the infinite number of solutions that mutually differ by their positive scale factors $xx^*$, the selfcal procedure arbitrarily selects one. This non-uniqueness is fundamental. On the basis of selfcal alone, we have no way of knowing what value x has. We must fix the brightness scale afterwards by other means.

What selfcal does achieve is to reduce all the errors $x_{jt \vphantom{'}}$ in the individual visibility measurements to a single value x: it lines up the measurements, forcing them all to conform to one common scale factor. As a result, extremely high dynamic ranges can be attained even though the absolute brightness scale is unknown. Strictly, the calibration is incomplete and we ought to replace "self-calibration" by the more precise term "self-alignment". The distinction is a bit academic here, but will become crucially important when we explore matrix selfcal.

It is not immediately clear from the present discussion that the absolute sky position is also lost in self-calibration. To establish this, one must consider the properties of the Fourier-transform relation Eq. (12). The effect is not directly relevant here, but it should not be forgotten.
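The loss of absolute position can be made explicit in the same kind of toy setting. The sketch below assumes a one-dimensional array with baselines $r_{jk} = u_j - u_k$ and the Fourier kernel $w(l, r) = \exp(-2\pi i\, l r)$ as a stand-in for Eq. (4); the positions, brightness values and the size of the shift are arbitrary. Unit-modulus multipliers whose phase is linear in antenna position then turn the visibilities of a source into those of the same source displaced on the sky.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 8                                   # antennas (illustrative)
u = rng.uniform(0.0, 100.0, N)          # assumed 1-D antenna positions, in wavelengths
j, k = np.triu_indices(N, k=1)          # antenna pairs with j < k
r = u[j] - u[k]                         # baselines r_jk = u_j - u_k


def visibilities(l, brightness):
    """Visibilities for point sources at positions l (assumed Fourier kernel)."""
    return np.exp(-2j * np.pi * np.outer(r, l)) @ brightness


l = np.array([-0.02, 0.0, 0.03])        # assumed source positions
B = np.array([1.0, 0.5, 2.0])           # assumed brightness values
shift = 0.004                           # arbitrary position offset

e = visibilities(l, B)

# Multipliers with unit modulus and a phase proportional to antenna position
x = np.exp(-2j * np.pi * u * shift)
e_alt = x[j] * e * np.conj(x[k])

# Visibilities of the same source moved by `shift`
e_shifted = visibilities(l + shift, B)
max_diff = np.max(np.abs(e_alt - e_shifted))
print("largest difference:", max_diff)
```

The two sets of visibilities coincide, which is the shift theorem of the Fourier transform at work: such a set of multipliers moves the source without distorting it.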
