Although the data used in subsequent sections were obtained with a
linear array of three antennas as noted above, our ultimate goal for
the algorithm is to apply it to data from the 5-element solar
interferometer at Owens Valley Radio Observatory (OVRO). This array
currently consists of two 27-m antennas and three 2-m antennas with
frequency-agile receivers, and is capable of generating 2-dimensional
maps of solar sources at 45 frequencies in the range . See
Gary & Hurford (1994) for an example of observations with
the complete five-element array. The algorithm is thus being developed to
handle fully three-dimensional data (one spectral and two spatial
dimensions). Since this is the more general case, in this subsection
we conceptually describe the complete 3-d algorithm. We implement and
apply it to the simpler case of one spatial dimension in subsequent
sections.
A typical interferometer array measures the spatial Fourier components
of the brightness distribution at each frequency, , and the
observations consist of data pairs of flux amplitude,
, and
phase,
, as a function of frequency,
, and
baseline, b. Each data pair represents one point in a complex
visibility plane:
The image of the observed radio source or its ``true" map, ,
as a function of spatial position and frequency has to be positive everywhere
(continuum emission), and is related to the data via a Fourier transform.
with
where k is the Boltzmann constant, x and y are the spatial coordinates
and and
are the corresponding spatial
frequencies, which are functions of observing frequency,
, and projected
baseline lengths,
and
. For spatially 1-d data (v=0), this relation reduces to
The measurement consists of a finite number of generally unevenly spaced points in the Fourier domain. To reconstruct an image of the radio source, unmeasured Fourier components have to be taken into account. If they are simply set to zero, the resulting ``dirty map" will show strong sidelobes, which can make it difficult to interpret the image. Thus, our objective is to approximate the true image with more likely values than zero for the unmeasured components, but without violating the data constraints; a general discussion of the imaging problem for radio data can be found in Christiansen & Hogbom (1985) and Cornwell & Braun (1989). Since the antennas of the OVRO solar array are equipped with frequency-agile receivers and existing algorithms such as CLEAN (Högbom 1974) or standard MEM do not take advantage of the spatial information available at adjacent frequencies, we wish to develop an algorithm that obtains a global solution to the visibilities in both the spatial and spectral domains.
For this task, following the usual procedure for a MEM algorithm, we chose
a maximum entropy method which allows us to reconstruct a temperature map
consistent with the data with the least amount of added a priori
information or with the fewest assumptions. Narayan &
Nityananda (1986) give a good description of MEM and its
application to radio data, for a more general description of MEM see, for
example, Bevensee (1993). The standard least-squares MEM
ensures that the resulting image is spatially smooth and that it is
positive everywhere. We add to it the information or assumption that a
spectrum at a given spatial position changes only smoothly with frequency.
Thus, our goal can be defined as finding the temperature map, , which maximizes the following objective function:
where and
are the spatial and the
spectral entropy terms,
is the data constraint,
and
are two (adjustable) Lagrange multipliers and n is the iteration index.
As in standard MEM, the data constraint is expressed as a least-squares term
where and
are the measured data and
and
are calculated from the
iterated temperature map using Eqs. (3) and (4),
is the
number of frequencies and
is the number of baselines. The squared
deviation is weighted by the measurement error
.
The spatial MEM term is defined, again as in standard MEM, as follows
with map temperature, T, being a function of spatial position and frequency.
The traditional least-squares MEM maximizes the spatial entropy term,
, subject to the data constraints,
. In our
modified algorithm, we include a spectral entropy term,
,
which we define in analogy to the spatial term
with
The interpolated temperature, , is a function of spatial
position and frequency and is logarithmically interpolated from the
temperatures at the two neighboring frequencies
and
at the same spatial position
This can be alternatively written as
or further simplified to
In our implementation, we use Eq. (13) to calculate and
as can be easily seen in Eq. (14) the exponents are only frequency
dependent and can be calculated outside the iteration loop. The difference
(Eq. (11)) is small if the map
is smooth in frequency and large otherwise, thus ensuring spectral smoothness.
Since this difference can be exactly zero, we added a constant of unity so
that
is zero when
equals
.
To maximize the objective function we have to calculate its
derivative with respect to the temperature, T, at each spatial
position and frequency. This equation is the central part of the
algorithm. We like to point out that throughout the rest of this
paper, as in the actual algorithm, we use an equivalent formalism and
minimize the ``negentropy", -S, instead of maximizing the entropy.
The data constraint derivative has the following form
with and
being the pixel size in the x- and
y-direction. For 1-d data (v = 0), the data constraint
derivative reduces to
The derivative of the spatial entropy term is simply
The derivative of the spectral entropy term consists of three terms
similar to Eq. (18) since T contributes to ,
, and
. The terms
and
are easily derived from Eq. (14). The superscript (n) is
omitted for clarity.
with
where is one frequency up from
and
is
one frequency down from
. This means that while
is
determined from its nearest two neighbors in frequency,
is calculated
using the nearest four neighbors. At the ends of the frequency range, the
highest and lowest frequencies,
and
, have to be
treated separately. The ratios of logarithms of frequencies in
Eqs. (21) and (22) are about
and are easily
extrapolated for the highest and the lowest frequencies by assuming
that the spacing between frequencies stays the same. At the upper
boundary, we substitute the values of
and
at
for the ones at the frequencies
and
, and at
the lower boundary, we substitute their values at
for the
ones at the frequencies
and
.
In standard MEM, the choice of Lagrange multiplier is problematic,
and generally must be done with trial and error for a particular application.
Our algorithm compounds the problem by introducing an additional parameter
. We now discuss the choice of these Lagrange multipliers for this
application. The reconstruction starts with an initial temperature map,
such as a ``flat" map or a model of the source, and ends with a final map
which is the reconstructed image. The Lagrange multipliers,
and
, have to balance the three terms of
in
Eq. (15). The spatial entropy term,
,
is always positive, while the spectral entropy term,
, can be either positive, negative or zero. The data term,
, is negative if the initial map is a flat
map. Thus, the temperature, T, will increase if
is negative and decrease if
is positive.
The initial map is chosen to be smooth everywhere. Therefore, the spectral
entropy term will be zero at the first iteration and will remain very small at
the first few iterations, and, as a consequence,
can be determined
from the initial map and the data constraint.
The parameter, , has to be small enough to allow the
temperature to increase from the initial value at any point in the map,
where the source of interest is present, while
should also be
large enough to ensure that the temperature stays at the initial value or
increases only slightly at other points.
If the initial map is a flat map characterized by a single temperature
, this requirement leads to the following relation
for all points in the map where the source of interest is present.
This relation provides an upper limit for and since the
objective is to reproduce the source and at the same time to reduce
ripples due to unmeasured Fourier components, this is the optimum
value to use.
For the parameter , an analogous relation to Eq. (24) is not
useful because most of the iterated maps would have to be taken into
account, not just the initial one. To determine the best value of
, we reconstructed several maps using different values of
with the same
, derived from Eq. (24), and then compared
the spectra of the different maps. As an initial value of
, we
note that the term
is most likely
smaller than
since
is the
difference of two temperatures, thus
is a good lower
limit. For a small
, the reconstruction will not differ much
from a reconstruction using the spatial entropy term alone. With
increasing
, the sidelobes will be steadily reduced, and the
spectra will appear steadily smoother, but the overall spectral shape
should remain the same. When
gets too large, however, the
spectral slopes begin to decrease due to overly weighting the spectral
smoothness parameter. We have found that this change in spectral
character is very pronounced and easily discerned. The optimum value
of
is then the largest value before which this artificial
flattening occurs. We do not yet have enough experience to say whether
the optimum value of
will be similar for all sources (and
spectral shapes), or whether the above procedure must be followed for
each individual data set.
We apply our 1-d algorithm to observations taken with the OVRO
frequency-agile interferometer of active region AR 5417 near the solar
limb on March 20, 1989 (vernal equinox) using the two 27-m antennas and
the 40-m antenna arranged in a linear east-west array with the two 27-m
dishes at 122 and 488 m east and the 40-m dish at 1066 m east. The
array was operated with an observing sequence that sampled 45
frequencies in the range 1 - 18 GHz, in both the right- and left-handed
circular polarization every 10 s. The amplitudes and phases were
calibrated with respect to our primary flux calibrator, 3C 84, which in
turn was calibrated relative to the flux standard 3C 286.
Unfortunately, the receiver on the 40-m antenna experienced a hardware
problem that resulted in no phase lock above 10 GHz, and there was
external interference at 1.0 GHz, so the data presented here range from
.
For each of the three baselines, we analyze the observations at the 31
frequencies in the range taken between 21:05 UT and 23:58 UT.
The projected baselines change during this time due to Earth rotation,
resulting in fringe spacings that range from 6.8'' at 10 GHz to 276''
at 1.2 GHz. By dividing the data into 80 time
samples, we get
Fourier points at each observed frequency.
To adequately map the active region, we use a map size of 256'' (with
a shift of 23'' from the nominal phase center to bring the solar limb
to the center of the map) and a spatial resolution or pixel size of
1''. The lowest frequency determines the necessary map size, while
the highest frequency determines the required spatial resolution.
Since the active region was located close to the solar limb, we have to correct for the interferometer response to the solar limb. For this purpose, we calculated its response to a uniform disk of the size of the Sun when pointed to the edge of the disk. We used the quiet Sun temperatures measured by Zirin et al. (1991) as the frequency-dependent disk temperature. The resulting amplitudes were generally less than 0.1 sfu (solar flux units) which is within the assumed measurement error (see Eq. (27) below). We subtracted the limb response from the observed data and used the corrected data pairs for the reconstruction.
In this section, we describe the current implementation of our MEM algorithm and give parameter values used to reconstruct an image of active region AR 5417. Different implementations of the MEM algorithm have their advantages and disadvantages, as discussed by Narayan & Nityananda (1986). After some testing of different schemes, we chose the following method for its speed.
We start with a uniform ``flat" map which has the same temperature
at all frequencies and spatial positions. For each
subsequent iteration n, we evaluate the gradient
at each point of the map (Eq. (15)) and according to its
sign we either increase or decrease the temperature
by a
fraction of its value
The gain, , is a function of iteration n only, while
and
are, in addition, functions
of spatial position x and frequency
. We decrease
exponentially with n to make sure that changes in T are getting
smaller the closer T gets to the actual temperature value.
where ,
, and
are parameters
to be specified.
The parameters have to be chosen so that any real solar temperature is
within the range of reconstructable temperatures; i.e the initial
temperature, , has to be much smaller than any solar
temperature and the largest possible iterated temperature has to be
much larger than the largest solar temperature. In addition, the
parameters have to be chosen so that the increase at the beginning is
not too large in order not to ``freeze in" an early iteration; a large
requires a small
. We set the final gain,
, to 0.001 and we use the following additional
parameter values of
,
,
to fulfill the constraints.
For the measurement error, (in sfu),
we use the following relation
with being the amplitude.
To estimate the Lagrange multiplier for the OVRO data of
AR 5417, we used Eq. (24), calculated
for the whole map, and from a dirty map reconstruction
estimated the position of the source in the map. This led to the
following relation at spatial positions where a source is present:
The chosen flat map temperature adds the factor
.
The estimated value of
is then
With this value of , we calculated several reconstructions
using different values of
. We found that
was
too small to yield improvements beyond those for a reconstruction using
the spatial entropy term alone and that
eliminated all ripples but flattened the spectra at high frequencies.
Thus, the ``best" value of
for the analyzed data was chosen to be
After the MEM reconstruction was done, the reconstructed map was convolved with the so-called ``clean" beam, a Gaussian fit to the inner part of the actual beam, to limit the effect of ``superresolution" (e.g. Cornwell & Braun 1989). All imaging algorithms produce some degree of superresolution since they estimate unmeasured Fourier components; i.e. the reconstructed image shows structures on spatial scales smaller than the fringe spacing of the largest baseline. A superresolution of a factor two or less is generally acceptable (cf. Narayan & Nityananda 1986). In our case, we have an additional complication due to the added spectral term which causes the reconstructed map to be spatially not as smooth as it would be without this term, which produces ``signal" at high spatial frequencies. We compared spatial power spectra of the convolved restored map with power spectra of the corresponding dirty map and found that the largest spatial frequency present in the reconstructed map is comparable to the one in the dirty map within a factor of 1.5 or less. This implies that convolving with the clean beam sufficiently limits the influence of superresolution and that at the same time its effect is limited enough to preserve the spatial information present in the original data.
In our initial algorithm, used to create the maps in this paper, we
directly compute the Fourier transform (Eqs. (5) and (6)) which has
the advantage of simplicity over the Fast Fourier Transform (FFT) in
that it does not require gridding of the visibilities (see
Thompson et al. 1991). However, in looking ahead to the 2-d
version of the algorithm, execution time becomes more important than
simplicity. Therefore, we have extended our 1-d algorithm to perform
gridding and employ the FFT. Because gridding of multifrequency data
introduces special problems, we briefly outline the gridding process here.
The gridding process is described mathematically in terms of convolution
and resampling, where an appropriate convolution function has to be
chosen. A simple choice for a convolution function, and the one we
employ, is a rectangular function (cell averaging); the (uv) plane is
divided into rectangular cells of dimension by
centered on the grid point and all measurements within each cell are
assigned to the grid point.
Figure 1: A white-light image and a magnetogram of active region
AR 5417 near the solar limb on March 20, 1989 (vernal
equinox) from BBSO. The radio image at 3.6 GHz
(covering 256'') is superposed for comparison
Figure 2: The CLEAN reconstruction. The spatial information encoded
in the spectrum is not used. The contours are at
logarithmic intervals to bring out the lower level
sidelobes. (contour 4: , contour 5:
, contour 6:
)
Figure 3: The spatial MEM reconstruction. The spatial information
encoded in the spectrum is not used. The contours are at
logarithmic intervals to bring out the lower level
sidelobes. (contour 4: , contour 5:
, contour 6:
)
Figure 4: The reconstructed map using the spatial/spectral MEM
algorithm. The contours are at logarithmic intervals
to show an improvement in sidelobe level in comparison
with Figs. 2 and 3. (contour 4: ,
contour 5:
, contour 6:
)
Figure 5: The brightness temperature maps at frequencies a)
2.0 GHz, b) 5.0 GHz, c) 7.0 GHz, and d) 9.0 GHz
of the spatial/spectral MEM reconstruction (filled gray
areas), the spatial MEM (solid line) and the CLEAN
reconstruction (dotted line)
The problem introduced with gridding of multifrequency data lies in its
smoothing effect on the visibilities. For example, if we divide the
spatial frequencies between 0 and the Nyquist frequency, , into
512 grid points and use a pixel size of 1'' (
),
the
observations occupy 106 grid points at 10 GHz and only
15 at
. To reduce the amount of smoothing at lower frequencies,
we divide our frequency range
into three ranges with
different spatial resolution and grid each range separately. We choose
a low-frequency range (
) with a pixel size of 4'', a
mid-frequency range (
) with 2'', and a high-frequency
range (
) with 1'' to ensure that the smoothing
introduced by the gridding process remains within a factor of two to
five. The combined effect of using FFT instead of direct Fourier
transform, gridding and dividing into ranges with different spatial
resolution reduces the execution time of the MEM algorithm by a factor
of
, for the same 1-d algorithm. The gain is even greater
when extended to 2-d.