5 Polarization data inversion

One strength of the Backus-Gilbert method is that we can do a lot of the analysis before we have any data. This gives us an understanding of the limitations of our analysis, and allows us to pick an optimal value for the smoothing parameter $\lambda$.

5.1 Theoretical limits on polarization inversions

The Backus-Gilbert method is one of many inverse-problem methods in which one functional of the recovered solution, $\mathcal{A}$, is minimised subject to regulation by another functional $\mathcal{B}$. In this case, the functional $\mathcal{A}$ is the width of the averaging kernel, and it is regulated by the variance of the estimate. In other inverse-problem methods, the functional $\mathcal{A}$ is some measure of the goodness of fit between the data and the forward problem, such as a $\chi^2$, regulated by the demand that the solution be smooth, or that some non-linear functional of the solution, such as its negentropy, be minimised.
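If the width functional of Eq. (9) is the usual second-moment spread, the construction takes the standard Backus-Gilbert form (a sketch of the textbook formulation, not a quotation of the paper's own equations): writing the estimate as a linear combination of the data,

\begin{displaymath}
\hat u(r) = \sum_i q_i(r)\,f_i, \qquad
\Delta(r,r') = \sum_i q_i(r)\,K(r';s_i),
\end{displaymath}

and minimising $\mathcal{A}+\lambda\mathcal{B} = q^{\rm T}W(r)\,q + \lambda\,q^{\rm T}C\,q$ subject to the normalisation $\int\Delta(r,r')\,{\rm d}r' = 1$ gives

\begin{displaymath}
q(r) = \frac{[W(r)+\lambda C]^{-1}R}{R^{\rm T}[W(r)+\lambda C]^{-1}R}, \qquad
R_i = \int K(r';s_i)\,{\rm d}r',
\end{displaymath}

where $W_{ij}(r) = \int (r'-r)^2 K(r';s_i)\,K(r';s_j)\,{\rm d}r'$ is the spread matrix and C is the data covariance.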

The Backus-Gilbert scheme produces a one-dimensional family of solutions, parametrised by the smoothing parameter $\lambda$. We can represent this as a solution curve on a graph of the standard deviation of the recovered $\hat u$ against the chosen measure of the width of the resolution function: the resolution is inversely related to the width, so such a curve illustrates the trade-off between accuracy and bias in the recovered solution.
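The construction above is straightforward to implement. The following minimal numerical sketch does so with numpy; note that the Gaussian kernel, its width, and all other numerical values here are invented stand-ins for illustration, not the paper's eclipse kernel $A_\mathrm{Q}$ of Eq. (21).

\begin{verbatim}
import numpy as np

def kernel(s, rp, width=0.05):
    """Stand-in kernel K(r'; s_i): a Gaussian of invented width, NOT the
    paper's eclipse kernel A_Q of Eq. (21)."""
    return np.exp(-0.5 * ((rp - s) / width) ** 2)

def bg_weights(r, s, rp, sigma, lam):
    """Backus-Gilbert weight vector q(r).

    Minimises q^T (W + lam C) q, with W the spread matrix about r and
    C = sigma^2 I, subject to unit normalisation of Delta(r, .).
    """
    drp = rp[1] - rp[0]
    K = kernel(s[:, None], rp[None, :])        # K[i, j] = K(rp_j; s_i)
    R = K.sum(axis=1) * drp                    # R_i = integral of K_i
    W = ((rp - r) ** 2 * K[:, None, :] * K[None, :, :]).sum(axis=2) * drp
    S = W + lam * sigma ** 2 * np.eye(len(s))
    q = np.linalg.solve(S, R)
    return q / (R @ q)                         # enforce normalisation

def averaging_kernel(q, s, rp):
    """Delta(r, r') = sum_i q_i K(r'; s_i), evaluated on the rp grid."""
    return q @ kernel(s[:, None], rp[None, :])
\end{verbatim}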

Figure 2: An example of a trade-off curve for the Backus-Gilbert inversion scheme.

One generally considers intermediate values of $\lambda$, around the turning point between the two extremes. Precisely which $\lambda$ one chooses depends on the relative importance of stability and resolution in the particular problem.

In the present case, we are interested in measuring the polarization at the limb, and it is clear from Fig. 2 that the resolution we need can only be obtained at the cost of a high standard deviation in the recovered value. If we try to increase the accuracy, we end up no longer measuring the limb polarization proper, but rather a "blurred" value of the polarization, smoothed by convolution with the averaging kernel. If the polarization is a maximum at the limb, the bias introduced by this smoothing will reduce the estimated polarization: the more sharply the polarization falls away from the limb, the greater this biasing will be.

Looking at the low-$\lambda$ (i.e., no-noise) limit, we find that the $\Delta(r,r')$ function is sharply peaked at r=1, so that in principle we can extract a well-resolved limb polarization. In fact, the quality of this peak degrades substantially for r<1: the method as presented cannot reasonably resolve polarizations on the disk. If we wished to recover these, we might use our knowledge of the kernels and include in the sum in Eq. (7) only those kernels which are non-zero in [0,r]. Such a procedure would give no improvement for r=1.
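Continuing the sketch above, this behaviour can be reproduced qualitatively by placing the (invented) sample positions so that the kernels overlap the disk only near the limb; the averaging kernel then peaks at the limb regardless of the target radius r:

\begin{verbatim}
# Hypothetical geometry: sample positions straddle the limb, so the
# kernels are non-zero on the disk only for r' close to 1.
s = np.linspace(0.8, 1.2, 60)      # invented sample positions s_i
rp = np.linspace(0.0, 1.0, 400)    # radii r' across the stellar disk
sigma = 0.01                       # assumed noise level on each f_i

for r in (1.0, 0.5):
    q = bg_weights(r, s, rp, sigma, lam=1e-2)
    delta = averaging_kernel(q, s, rp)
    print(r, rp[np.argmax(np.abs(delta))])  # where Delta(r, .) actually peaks
\end{verbatim}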

Figure 3: The shape of the averaging kernel $\Delta(r,r')$ as a function of r' for three different cases. The increase in the width of $\Delta(r,r')$ as $\lambda$ increases is clear. Note the poor resolution at r = 0.5, even when $\lambda$ is small.

As we increase $\lambda$, the peak broadens, but the variance of the recovered $\hat u_\lambda(r)$ decreases.

Figure 4: The width of $\Delta(r,r')$, $\log_{10}\mathcal{A}$, versus $\log_{10}\mathcal{B}=\log_{10}\mathop{\rm Var}[\hat u]$, for a variety of values of $\lambda$ and a variety of sample sizes n. Along each line, $\log_{10}\lambda$ rises from -2 in the bottom right corner to +2 in the top left, with integer values of $\log_{10}\lambda$ plotted with a diamond on the two n=60 lines. The resolution improves as we move down the plot, and the recovery becomes more stable as we move to the left; both the stability and the resolution improve as we use more data points. These curves are for n=20, 60, 112, as indicated; for the discussion of the line marked "n=60 (uneven)", see Sect. 5.3. The box in the bottom left encloses those combinations of $\lambda$ and n which will produce an estimator $\hat u$ which is adequately stable and well-resolved.
In Fig. 4 we display the width $\mathcal{A}$ and variance $\mathcal{B}$ we obtain when we recover P(1) using the kernel $A_\mathrm{Q}$, for various values of $\lambda$ from $10^{-2}$ to $10^{2}$. Here we can clearly see the trade-off between the well-resolved but unstable recovery in the bottom right corner, and the stable but badly-resolved recovery in the top left.
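In the toy model introduced above, the analogous curve can be traced by sweeping $\lambda$ and recording the spread and variance of the estimator (the numbers, like the kernels, are illustrative only):

\begin{verbatim}
drp = rp[1] - rp[0]
for lam in np.logspace(-2, 2, 9):
    q = bg_weights(1.0, s, rp, sigma, lam)
    delta = averaging_kernel(q, s, rp)
    A = ((rp - 1.0) ** 2 * delta ** 2).sum() * drp  # spread of Delta(1, .)
    B = sigma ** 2 * (q @ q)                        # Var[u_hat(1)]
    print(f"log10(lam) = {np.log10(lam):+.1f}  A = {A:.3g}  B = {B:.3g}")
\end{verbatim}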

This diagram in a sense completes the Backus-Gilbert analysis of the inverse problem, as represented by the kernel in Eq. (21), and we are now in a position to move on to invert real or simulated data. Before we can do that, however, we must decide what value of $\lambda$ to use. To make that decision, we must consider the level and approximate functional form of the polarization P(r), and use this to set the scale for the resolution and standard deviation we need to achieve. In turn, this fixes the number of data points n we require and the value of the parameter $\lambda$ we must choose in our inversion. Although we are invoking a particular model at this point, we emphasise that this introduces no practical model dependence: we use the approximate model purely to decide what counts as "sufficiently stable" or "sufficiently well resolved", and once that decision is made the numbers we recover remain model-independent measurements, rather than the outputs of a parameter fit.

Firstly, Chandrasekhar suggests that the limb polarization is of the order of P(1)=0.1; we therefore need a variance which is at least as small as this, requiring $\log_{10}\mathop{\rm Var}[\hat u]<-1$. Secondly, if we are not to have an overly biased result, our resolution function must be narrow compared with the width of the underlying function P(r). The resolution we need is therefore of order $\mathop{\rm Width}[P(r)]$, with the width functional used in Eq. (9). Taking $P(r)\sim\exp(10r)$ as representative, we find we need a resolution better than $\log_{10}\mathop{\rm Width}[\Delta]=-1.6$. Comparison with Fig. 4 indicates that $\lambda=1$ and $n\gtrsim 60$ should therefore give us a satisfactory recovery of P(1). The maximum accuracy achievable for a given resolution (or vice versa) can be read off from the graph.

Figure 5: The recovered values of P(1) for three realisations of noisy simulated data with the same parameters. The horizontal line is the correct value of P(1).

Figure 5 shows the recovery of the limb polarization from three sets of simulated noisy data, as a function of the smoothing parameter. It is clear that the solution becomes more stable as greater smoothing is imposed. However, there comes a point where further smoothing does nothing to improve the solution, but merely degrades the resolution of the recovery.
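A recovery of this kind is easy to mimic in the toy model: synthesise noisy data from an assumed true profile (here the representative $P(r)\sim\exp(10r)$ used above, normalised so that P(1)=0.1), then watch the estimate stabilise as $\lambda$ grows:

\begin{verbatim}
rng = np.random.default_rng(0)
u_true = np.exp(10 * rp)
u_true *= 0.1 / u_true[-1]                          # normalise: P(1) = 0.1
f = kernel(s[:, None], rp[None, :]) @ u_true * drp  # noiseless data f_i
f = f + rng.normal(0.0, sigma, size=f.shape)        # add observational noise

for lam in (1e-2, 1.0, 1e2):
    q = bg_weights(1.0, s, rp, sigma, lam)
    print(f"lambda = {lam:g}: recovered P(1) = {q @ f:.3f}")
\end{verbatim}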

5.2 The Algol system: A worked example  

As a more concrete illustration, consider the Algol system. Given the system parameters (from Söderhjelm 1980), we can analyse the observational prospects as in the previous section. The Algol system is more complex than the spherically symmetric cases we have considered, but our simplified analysis will provide an upper bound on the resolution achievable with the real system.

The graphs presented by Kemp et al. (1983) have 25 data points in the eclipse phase. The same paper states that, in the Algol primary star, 50% of the polarized flux comes from an annulus on the limb of width less than 0.005 $R_\mathrm{disc}$.

Our Backus-Gilbert analysis of a spherically symmetric model system with the Algol orbital and luminosity parameters indicates that this limb polarization profile cannot be resolved in the Algol system. For the eclipse coverage reported in Kemp et al. (1983), the minimum width $\mathcal{A}_\mathrm{min}$ of the averaging kernel at the limb is 0.0199 $R_\mathrm{disc}$, even in the zero-noise limit.

This indicates that the limb polarization cannot be truly resolved with this data if the polarization profile is really as sharp as the stellar models cited by Kemp et al. predict. Thus, while the data does indicate the detection of limb polarization, it cannot reasonably be used (as was attempted by Wilson & Liou 1993) to make a quantitative estimate of the polarization profile.

The precise dependence of the resolution on the number of data points is beyond the scope of the present paper, but we note here that increasing the number of points from 25 to as many as 1000 does not reduce the kernel width sufficiently (from 0.0199 $R_\mathrm{disc}$ to only 0.0144 $R_\mathrm{disc}$). For practical purposes, then, the inadequate resolution is intrinsic to the Algol system and will not be alleviated by improved observations.

5.3 Observational strategies  

Given a set of measurements $f_i$ at positions $s_i$, the Backus-Gilbert method can give us a well-controlled recovery of the underlying function u(r). We can do better than this, however, as the method can suggest how to adjust our observational strategy to improve the resolution of the recovery. Clearly, increasing the total number of measurements we take, or reducing the noise on each measurement, will improve the quality of our recovery. Equally clearly, simply binning the data (so that the noise per binned point improves as $\sqrt n$) demonstrably improves nothing (this can be thought of as conservation of information).

Even if we assume, however, that we cannot change the number or quality of our data points (telescope time and equipment are limited, after all), we might still have the freedom to adjust the positions $s_i$ (i.e., the times) at which we make our measurements.

In Eq. (7), we average the data points $f_i$ with a weight vector $q_i$, so the averaging kernel $\Delta(r,r')$ is built from the kernels $K(r;s_i)$ weighted by the $q_i$. Where the weights spread each $f_i$ over a range of $s_i$, features in the kernels $K(r;s_i)$ are smeared out. If we can identify prominent features in the vector $q_i$, and cluster our measurements around the $s_i$ they correspond to, we should be able to decrease the spread of $\Delta(r,r')$ and so increase the resolution of the recovery.

Our simulations show that this rather informal argument is valid for our case. The line in Fig. 4 labelled "n=60 (uneven)" has the same number of data points as the line "n=60", but with the values $s_i$ chosen so that the "data rate" in a band $s=1.85\pm0.15$ is double that outside the band. This does not significantly improve the variance of the result (we are not gathering any more information than before), but it does noticeably improve its resolution.
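The same experiment can be run in the toy model (the band edges here are invented to echo, not reproduce, the paper's $s=1.85\pm0.15$ band, since the toy geometry differs):

\begin{verbatim}
s_even = np.linspace(0.8, 1.2, 60)
s_uneven = np.sort(np.r_[np.linspace(0.8, 1.2, 40),
                         np.linspace(0.95, 1.05, 20)])  # doubled rate in a band

for name, si in (("even", s_even), ("uneven", s_uneven)):
    q = bg_weights(1.0, si, rp, sigma, 1.0)
    delta = averaging_kernel(q, si, rp)
    A = ((rp - 1.0) ** 2 * delta ** 2).sum() * drp      # spread at the limb
    print(f"{name}: spread A = {A:.3g}")
\end{verbatim}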

This improvement in our procedure corresponds to concentrating our measurements around the beginning of the eclipse. We might have guessed that this would be a reasonable strategy to adopt, but the Backus-Gilbert method has justified our guess, and would substitute for a lack of intuition in a more obscure situation.

