next previous
Up: A spatial and

2. The spatial and spectral MEM algorithm

2.1. The concept

Although the data used in subsequent sections were obtained with a linear array of three antennas as noted above, our ultimate goal for the algorithm is to apply it to data from the 5-element solar interferometer at Owens Valley Radio Observatory (OVRO). This array currently consists of two 27-m antennas and three 2-m antennas with frequency-agile receivers, and is capable of generating 2-dimensional maps of solar sources at 45 frequencies in the range tex2html_wrap_inline1483. See Gary & Hurford (1994) for an example of observations with the complete five-element array. The algorithm is thus being developed to handle fully three-dimensional data (one spectral and two spatial dimensions). Since this is the more general case, in this subsection we conceptually describe the complete 3-d algorithm. We implement and apply it to the simpler case of one spatial dimension in subsequent sections.

A typical interferometer array measures the spatial Fourier components of the brightness distribution at each frequency, tex2html_wrap_inline1485, and the observations consist of data pairs of flux amplitude, tex2html_wrap_inline1487, and phase, tex2html_wrap_inline1489, as a function of frequency, tex2html_wrap_inline1491, and baseline, b. Each data pair represents one point in a complex visibility plane:
equation212

The image of the observed radio source or its ``true" map, tex2html_wrap_inline1495, as a function of spatial position and frequency has to be positive everywhere (continuum emission), and is related to the data via a Fourier transform.
eqnarray217
with
eqnarray228

eqnarray234
where k is the Boltzmann constant, x and y are the spatial coordinates and tex2html_wrap_inline1503 and tex2html_wrap_inline1505 are the corresponding spatial frequencies, which are functions of observing frequency, tex2html_wrap_inline1507, and projected baseline lengths, tex2html_wrap_inline1509 and tex2html_wrap_inline1511. For spatially 1-d data (v=0), this relation reduces to
equation240

equation245

The measurement consists of a finite number of generally unevenly spaced points in the Fourier domain. To reconstruct an image of the radio source, unmeasured Fourier components have to be taken into account. If they are simply set to zero, the resulting ``dirty map" will show strong sidelobes, which can make it difficult to interpret the image. Thus, our objective is to approximate the true image with more likely values than zero for the unmeasured components, but without violating the data constraints; a general discussion of the imaging problem for radio data can be found in Christiansen & Hogbom (1985) and Cornwell & Braun (1989). Since the antennas of the OVRO solar array are equipped with frequency-agile receivers and existing algorithms such as CLEAN (Högbom 1974) or standard MEM do not take advantage of the spatial information available at adjacent frequencies, we wish to develop an algorithm that obtains a global solution to the visibilities in both the spatial and spectral domains.

For this task, following the usual procedure for a MEM algorithm, we chose a maximum entropy method which allows us to reconstruct a temperature map consistent with the data with the least amount of added a priori information or with the fewest assumptions. Narayan & Nityananda (1986) give a good description of MEM and its application to radio data, for a more general description of MEM see, for example, Bevensee (1993). The standard least-squares MEM ensures that the resulting image is spatially smooth and that it is positive everywhere. We add to it the information or assumption that a spectrum at a given spatial position changes only smoothly with frequency. Thus, our goal can be defined as finding the temperature map, tex2html_wrap_inline1515, which maximizes the following objective function:


equation256
where tex2html_wrap_inline1517 and tex2html_wrap_inline1519 are the spatial and the spectral entropy terms, tex2html_wrap_inline1521 is the data constraint, tex2html_wrap_inline1523 and tex2html_wrap_inline1525 are two (adjustable) Lagrange multipliers and n is the iteration index.

As in standard MEM, the data constraint is expressed as a least-squares term
equation269
where tex2html_wrap_inline1529 and tex2html_wrap_inline1531 are the measured data and tex2html_wrap_inline1533 and tex2html_wrap_inline1535 are calculated from the iterated temperature map using Eqs. (3) and (4), tex2html_wrap_inline1537 is the number of frequencies and tex2html_wrap_inline1539 is the number of baselines. The squared deviation is weighted by the measurement error tex2html_wrap_inline1541.

The spatial MEM term is defined, again as in standard MEM, as follows
equation289
with map temperature, T, being a function of spatial position and frequency.

The traditional least-squares MEM maximizes the spatial entropy term, tex2html_wrap_inline1545, subject to the data constraints, tex2html_wrap_inline1547. In our modified algorithm, we include a spectral entropy term, tex2html_wrap_inline1549, which we define in analogy to the spatial term
equation303
with
equation310

The interpolated temperature, tex2html_wrap_inline1551, is a function of spatial position and frequency and is logarithmically interpolated from the temperatures at the two neighboring frequencies tex2html_wrap_inline1553 and tex2html_wrap_inline1555 at the same spatial position
equation318
This can be alternatively written as
eqnarray331
or further simplified to
equation346
In our implementation, we use Eq. (13) to calculate tex2html_wrap_inline1557 and as can be easily seen in Eq. (14) the exponents are only frequency dependent and can be calculated outside the iteration loop. The difference tex2html_wrap_inline1559 (Eq. (11)) is small if the map is smooth in frequency and large otherwise, thus ensuring spectral smoothness. Since this difference can be exactly zero, we added a constant of unity so that tex2html_wrap_inline1561 is zero when tex2html_wrap_inline1563 equals tex2html_wrap_inline1565.

To maximize the objective function tex2html_wrap_inline1567 we have to calculate its derivative with respect to the temperature, T, at each spatial position and frequency. This equation is the central part of the algorithm. We like to point out that throughout the rest of this paper, as in the actual algorithm, we use an equivalent formalism and minimize the ``negentropy", -S, instead of maximizing the entropy.
equation366

The data constraint derivative has the following form
eqnarray382
with tex2html_wrap_inline1573 and tex2html_wrap_inline1575 being the pixel size in the x- and y-direction. For 1-d data (v = 0), the data constraint derivative reduces to
eqnarray396

The derivative of the spatial entropy term is simply
equation410

The derivative of the spectral entropy term consists of three terms similar to Eq. (18) since T contributes to tex2html_wrap_inline1585, tex2html_wrap_inline1587, and tex2html_wrap_inline1589. The terms tex2html_wrap_inline1591 and tex2html_wrap_inline1593 are easily derived from Eq. (14). The superscript (n) is omitted for clarity.
eqnarray426
with
equation438

eqnarray443

eqnarray455
where tex2html_wrap_inline1609 is one frequency up from tex2html_wrap_inline1611 and tex2html_wrap_inline1613 is one frequency down from tex2html_wrap_inline1615. This means that while tex2html_wrap_inline1617 is determined from its nearest two neighbors in frequency, tex2html_wrap_inline1619 is calculated using the nearest four neighbors. At the ends of the frequency range, the highest and lowest frequencies, tex2html_wrap_inline1621 and tex2html_wrap_inline1623, have to be treated separately. The ratios of logarithms of frequencies in Eqs. (21) and (22) are about tex2html_wrap_inline1625 and are easily extrapolated for the highest and the lowest frequencies by assuming that the spacing between frequencies stays the same. At the upper boundary, we substitute the values of tex2html_wrap_inline1627 and tex2html_wrap_inline1629 at tex2html_wrap_inline1631 for the ones at the frequencies tex2html_wrap_inline1633 and tex2html_wrap_inline1635, and at the lower boundary, we substitute their values at tex2html_wrap_inline1637 for the ones at the frequencies tex2html_wrap_inline1639 and tex2html_wrap_inline1641.

In standard MEM, the choice of Lagrange multiplier tex2html_wrap_inline1643 is problematic, and generally must be done with trial and error for a particular application. Our algorithm compounds the problem by introducing an additional parameter tex2html_wrap_inline1645. We now discuss the choice of these Lagrange multipliers for this application. The reconstruction starts with an initial temperature map, such as a ``flat" map or a model of the source, and ends with a final map which is the reconstructed image. The Lagrange multipliers, tex2html_wrap_inline1647 and tex2html_wrap_inline1649, have to balance the three terms of tex2html_wrap_inline1651 in Eq. (15). The spatial entropy term, tex2html_wrap_inline1653, is always positive, while the spectral entropy term, tex2html_wrap_inline1655, can be either positive, negative or zero. The data term, tex2html_wrap_inline1657, is negative if the initial map is a flat map. Thus, the temperature, T, will increase if tex2html_wrap_inline1661 is negative and decrease if tex2html_wrap_inline1663 is positive. The initial map is chosen to be smooth everywhere. Therefore, the spectral entropy term will be zero at the first iteration and will remain very small at the first few iterations, and, as a consequence, tex2html_wrap_inline1665 can be determined from the initial map and the data constraint.
equation497

The parameter, tex2html_wrap_inline1667, has to be small enough to allow the temperature to increase from the initial value at any point in the map, where the source of interest is present, while tex2html_wrap_inline1669 should also be large enough to ensure that the temperature stays at the initial value or increases only slightly at other points. If the initial map is a flat map characterized by a single temperature tex2html_wrap_inline1671, this requirement leads to the following relation
equation511
for all points in the map where the source of interest is present. This relation provides an upper limit for tex2html_wrap_inline1673 and since the objective is to reproduce the source and at the same time to reduce ripples due to unmeasured Fourier components, this is the optimum value to use.

For the parameter tex2html_wrap_inline1675, an analogous relation to Eq. (24) is not useful because most of the iterated maps would have to be taken into account, not just the initial one. To determine the best value of tex2html_wrap_inline1677, we reconstructed several maps using different values of tex2html_wrap_inline1679 with the same tex2html_wrap_inline1681, derived from Eq. (24), and then compared the spectra of the different maps. As an initial value of tex2html_wrap_inline1683, we note that the term tex2html_wrap_inline1685 is most likely smaller than tex2html_wrap_inline1687 since tex2html_wrap_inline1689 is the difference of two temperatures, thus tex2html_wrap_inline1691 is a good lower limit. For a small tex2html_wrap_inline1693, the reconstruction will not differ much from a reconstruction using the spatial entropy term alone. With increasing tex2html_wrap_inline1695, the sidelobes will be steadily reduced, and the spectra will appear steadily smoother, but the overall spectral shape should remain the same. When tex2html_wrap_inline1697 gets too large, however, the spectral slopes begin to decrease due to overly weighting the spectral smoothness parameter. We have found that this change in spectral character is very pronounced and easily discerned. The optimum value of tex2html_wrap_inline1699 is then the largest value before which this artificial flattening occurs. We do not yet have enough experience to say whether the optimum value of tex2html_wrap_inline1701 will be similar for all sources (and spectral shapes), or whether the above procedure must be followed for each individual data set.

2.2. The 1-d data

We apply our 1-d algorithm to observations taken with the OVRO frequency-agile interferometer of active region AR 5417 near the solar limb on March 20, 1989 (vernal equinox) using the two 27-m antennas and the 40-m antenna arranged in a linear east-west array with the two 27-m dishes at 122 and 488 m east and the 40-m dish at 1066 m east. The array was operated with an observing sequence that sampled 45 frequencies in the range 1 - 18 GHz, in both the right- and left-handed circular polarization every 10 s. The amplitudes and phases were calibrated with respect to our primary flux calibrator, 3C 84, which in turn was calibrated relative to the flux standard 3C 286. Unfortunately, the receiver on the 40-m antenna experienced a hardware problem that resulted in no phase lock above 10 GHz, and there was external interference at 1.0 GHz, so the data presented here range from tex2html_wrap_inline1705.

For each of the three baselines, we analyze the observations at the 31 frequencies in the range tex2html_wrap_inline1707 taken between 21:05 UT and 23:58 UT. The projected baselines change during this time due to Earth rotation, resulting in fringe spacings that range from 6.8'' at 10 GHz to 276'' at 1.2 GHz. By dividing the data into 80 time samples, we get tex2html_wrap_inline1713 Fourier points at each observed frequency. To adequately map the active region, we use a map size of 256'' (with a shift of 23'' from the nominal phase center to bring the solar limb to the center of the map) and a spatial resolution or pixel size of 1''. The lowest frequency determines the necessary map size, while the highest frequency determines the required spatial resolution.

Since the active region was located close to the solar limb, we have to correct for the interferometer response to the solar limb. For this purpose, we calculated its response to a uniform disk of the size of the Sun when pointed to the edge of the disk. We used the quiet Sun temperatures measured by Zirin et al. (1991) as the frequency-dependent disk temperature. The resulting amplitudes were generally less than 0.1 sfu (solar flux units) which is within the assumed measurement error (see Eq. (27) below). We subtracted the limb response from the observed data and used the corrected data pairs for the reconstruction.

2.3. The 1-d implementation

In this section, we describe the current implementation of our MEM algorithm and give parameter values used to reconstruct an image of active region AR 5417. Different implementations of the MEM algorithm have their advantages and disadvantages, as discussed by Narayan & Nityananda (1986). After some testing of different schemes, we chose the following method for its speed.

We start with a uniform ``flat" map which has the same temperature tex2html_wrap_inline1721 at all frequencies and spatial positions. For each subsequent iteration n, we evaluate the gradient tex2html_wrap_inline1725 at each point of the map (Eq. (15)) and according to its sign we either increase or decrease the temperature tex2html_wrap_inline1727 by a fraction of its value
equation532

The gain, tex2html_wrap_inline1733, is a function of iteration n only, while tex2html_wrap_inline1737 and tex2html_wrap_inline1739 are, in addition, functions of spatial position x and frequency tex2html_wrap_inline1743. We decrease tex2html_wrap_inline1745 exponentially with n to make sure that changes in T are getting smaller the closer T gets to the actual temperature value.
eqnarray553
where tex2html_wrap_inline1753, tex2html_wrap_inline1755, and tex2html_wrap_inline1757 are parameters to be specified.

The parameters have to be chosen so that any real solar temperature is within the range of reconstructable temperatures; i.e the initial temperature, tex2html_wrap_inline1759, has to be much smaller than any solar temperature and the largest possible iterated temperature has to be much larger than the largest solar temperature. In addition, the parameters have to be chosen so that the increase at the beginning is not too large in order not to ``freeze in" an early iteration; a large tex2html_wrap_inline1761 requires a small tex2html_wrap_inline1763. We set the final gain, tex2html_wrap_inline1765, to 0.001 and we use the following additional parameter values of tex2html_wrap_inline1767, tex2html_wrap_inline1769, tex2html_wrap_inline1771 to fulfill the constraints.

For the measurement error, tex2html_wrap_inline1773 (in sfu), we use the following relation
equation572
with tex2html_wrap_inline1775 being the amplitude.

To estimate the Lagrange multiplier tex2html_wrap_inline1777 for the OVRO data of AR 5417, we used Eq. (24), calculated tex2html_wrap_inline1779 for the whole map, and from a dirty map reconstruction estimated the position of the source in the map. This led to the following relation at spatial positions where a source is present:
eqnarray580
The chosen flat map temperature tex2html_wrap_inline1781 adds the factor tex2html_wrap_inline1783. The estimated value of tex2html_wrap_inline1785 is then
equation594

With this value of tex2html_wrap_inline1787, we calculated several reconstructions using different values of tex2html_wrap_inline1789. We found that tex2html_wrap_inline1791 was too small to yield improvements beyond those for a reconstruction using the spatial entropy term alone and that tex2html_wrap_inline1793 eliminated all ripples but flattened the spectra at high frequencies. Thus, the ``best" value of tex2html_wrap_inline1795 for the analyzed data was chosen to be
equation598

After the MEM reconstruction was done, the reconstructed map was convolved with the so-called ``clean" beam, a Gaussian fit to the inner part of the actual beam, to limit the effect of ``superresolution" (e.g. Cornwell & Braun 1989). All imaging algorithms produce some degree of superresolution since they estimate unmeasured Fourier components; i.e. the reconstructed image shows structures on spatial scales smaller than the fringe spacing of the largest baseline. A superresolution of a factor two or less is generally acceptable (cf. Narayan & Nityananda 1986). In our case, we have an additional complication due to the added spectral term which causes the reconstructed map to be spatially not as smooth as it would be without this term, which produces ``signal" at high spatial frequencies. We compared spatial power spectra of the convolved restored map with power spectra of the corresponding dirty map and found that the largest spatial frequency present in the reconstructed map is comparable to the one in the dirty map within a factor of 1.5 or less. This implies that convolving with the clean beam sufficiently limits the influence of superresolution and that at the same time its effect is limited enough to preserve the spatial information present in the original data.

In our initial algorithm, used to create the maps in this paper, we directly compute the Fourier transform (Eqs. (5) and (6)) which has the advantage of simplicity over the Fast Fourier Transform (FFT) in that it does not require gridding of the visibilities (see Thompson et al. 1991). However, in looking ahead to the 2-d version of the algorithm, execution time becomes more important than simplicity. Therefore, we have extended our 1-d algorithm to perform gridding and employ the FFT. Because gridding of multifrequency data introduces special problems, we briefly outline the gridding process here. The gridding process is described mathematically in terms of convolution and resampling, where an appropriate convolution function has to be chosen. A simple choice for a convolution function, and the one we employ, is a rectangular function (cell averaging); the (uv) plane is divided into rectangular cells of dimension tex2html_wrap_inline1799 by tex2html_wrap_inline1801 centered on the grid point and all measurements within each cell are assigned to the grid point.

  figure603
Figure 1: A white-light image and a magnetogram of active region AR 5417 near the solar limb on March 20, 1989 (vernal equinox) from BBSO. The radio image at 3.6 GHz (covering 256'') is superposed for comparison

  figure608
Figure 2: The CLEAN reconstruction. The spatial information encoded in the spectrum is not used. The contours are at logarithmic intervals to bring out the lower level sidelobes. (contour 4: tex2html_wrap_inline1805, contour 5: tex2html_wrap_inline1807, contour 6: tex2html_wrap_inline1809)

  figure613
Figure 3: The spatial MEM reconstruction. The spatial information encoded in the spectrum is not used. The contours are at logarithmic intervals to bring out the lower level sidelobes. (contour 4: tex2html_wrap_inline1811, contour 5: tex2html_wrap_inline1813, contour 6: tex2html_wrap_inline1815)

  figure618
Figure 4: The reconstructed map using the spatial/spectral MEM algorithm. The contours are at logarithmic intervals to show an improvement in sidelobe level in comparison with Figs. 2 and 3. (contour 4: tex2html_wrap_inline1817, contour 5: tex2html_wrap_inline1819, contour 6: tex2html_wrap_inline1821)

  figure623
Figure 5: The brightness temperature maps at frequencies a) 2.0 GHz, b) 5.0 GHz, c) 7.0 GHz, and d) 9.0 GHz of the spatial/spectral MEM reconstruction (filled gray areas), the spatial MEM (solid line) and the CLEAN reconstruction (dotted line)

The problem introduced with gridding of multifrequency data lies in its smoothing effect on the visibilities. For example, if we divide the spatial frequencies between 0 and the Nyquist frequency, tex2html_wrap_inline1823, into 512 grid points and use a pixel size of 1'' (tex2html_wrap_inline1827), the tex2html_wrap_inline1829 observations occupy 106 grid points at 10 GHz and only 15 at tex2html_wrap_inline1831. To reduce the amount of smoothing at lower frequencies, we divide our frequency range tex2html_wrap_inline1833 into three ranges with different spatial resolution and grid each range separately. We choose a low-frequency range (tex2html_wrap_inline1835) with a pixel size of 4'', a mid-frequency range (tex2html_wrap_inline1839) with 2'', and a high-frequency range (tex2html_wrap_inline1843) with 1'' to ensure that the smoothing introduced by the gridding process remains within a factor of two to five. The combined effect of using FFT instead of direct Fourier transform, gridding and dividing into ranges with different spatial resolution reduces the execution time of the MEM algorithm by a factor of tex2html_wrap_inline1847, for the same 1-d algorithm. The gain is even greater when extended to 2-d.


next previous
Up: A spatial and

Copyright by the European Southern Observatory (ESO)
web@ed-phys.fr