The basic principle of the first method, named Least Significant Bits Packing (LSBP), is to send only those bits of the 16 bits output from the ADC which are affected by the signal and the noise. This is effective for the nominal mission since, with the planned quantization step of 0.3 mK/adu, the noise at one sigma will fill about 21 levels; this will require at least 5 bits out of 16, and it is reasonable to expect a final data flow equivalent to $C_{\rm r} \simeq 16/5 = 3.2$.
It is not possible to improve much the compression rate by compressing the resulting 5 bits data stream, since its entropy would be $H < 5.4$ bits and hence $C_{\rm r} \simeq 16/H \simeq 3$.
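These figures can be checked against the entropy of quantized Gaussian noise. A short sketch follows, assuming (our reading) that "about 21 levels at one sigma" means that the $\pm 1\sigma$ interval spans about 21 quantization steps of width $q$, i.e. $\sigma/q \simeq 10.5$:

% Entropy of Gaussian noise quantized with step q (sketch; assumes
% sigma/q ~ 10.5, i.e. about 21 levels within +/- 1 sigma):
\begin{eqnarray}
H & \simeq & \log_2\!\left(\sqrt{2\pi e}\,\frac{\sigma}{q}\right)
  \simeq \log_2(10.5) + 2.05 \simeq 5.4\ {\rm bits}, \\
C_{\rm r} & \simeq & \frac{16\ {\rm bits}}{H} \simeq 3 .
\end{eqnarray}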
In order to ensure the compression to be lossless, all the samples exceeding the allowed (5 bits) range have to be sent separately, coding at the same time their position (address) in the stream vector and their value. So, for $N_{\rm bits}$ bits, corresponding to a threshold $x_{\rm th}$, each group of $N_{\rm samples}$ samples stored into a packet is partitioned into two classes according to their value $x$:
Regular Samples (RS) $\stackrel{\rm def}{=}$ all those samples for which $x \le x_{\rm th}$;
Spike Samples (SS) $\stackrel{\rm def}{=}$ all those samples for which $x > x_{\rm th}$.
The coding process then consists of two main steps: i) to split the data stream into Regular and Spike Samples, preserving the original ordering in the stream of Regular Samples; ii) to store (send) the first $N_{\rm bits}$ bits of the regular samples and, in a separate area, the 16 bits values and the locations in the original data stream of the Spike Samples, i.e. Spike Samples will require more space to be stored than regular ones. The decoding process will be the reverse of this packing process.
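As an illustration of the splitting step, a minimal sketch in C follows; all names (lsbp_split, N_BITS, spike_t), the byte-per-regular-sample storage and the unsigned-residual convention are our own simplifications, not the actual on-board implementation, which would bit-pack the regular samples into contiguous $N_{\rm bits}$ wide fields.

/* Minimal sketch of the LSBP splitting step (illustrative only: names, types
 * and conventions are assumptions, not the actual on-board code).  Samples
 * are taken to be non-negative 16-bit residuals; those below the threshold
 * 2^N_BITS are "regular", the others are "spikes" sent with their address.
 * For clarity each regular sample is stored in one byte here. */
#include <stdint.h>
#include <stdio.h>

#define N_BITS    5u                       /* bits kept per regular sample    */
#define THRESHOLD (1u << N_BITS)           /* regular samples satisfy x < 2^5 */

typedef struct {
    uint16_t value;                        /* full 16-bit value of the spike  */
    uint16_t index;                        /* position in the original stream */
} spike_t;

/* Split a stream into regular samples (low N_BITS only, original order kept)
 * and spike samples (full value plus address); returns the number of spikes. */
static size_t lsbp_split(const uint16_t *in, size_t n,
                         uint8_t *regular, spike_t *spike)
{
    size_t n_reg = 0, n_spk = 0;
    for (size_t i = 0; i < n; ++i) {
        if (in[i] < THRESHOLD) {
            regular[n_reg++] = (uint8_t)(in[i] & (THRESHOLD - 1u));
        } else {
            spike[n_spk].value = in[i];
            spike[n_spk].index = (uint16_t)i;
            ++n_spk;
        }
    }
    return n_spk;
}

int main(void)
{
    uint16_t stream[8] = { 3, 17, 40, 12, 0, 31, 1000, 25 };
    uint8_t  regular[8];
    spike_t  spike[8];

    size_t n_spk = lsbp_split(stream, 8, regular, spike);
    printf("%zu spike(s); first: value %u at index %u\n",
           n_spk, (unsigned)spike[0].value, (unsigned)spike[0].index);
    return 0;
}

The decoder simply walks the original sample indices, filling the spike positions from the stored (value, address) pairs and all the remaining positions, in order, from the regular stream.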
In this scheme each packet will be divided into two main areas: the Regular Samples Area (RSA), which holds the stream of Regular Samples, and the Spike Samples Area (SSA), which holds the stream of Spike Samples, plus a number of fields which will contain packing parameters such as: the number of samples, the number of regular samples, the offset, etc.
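A possible in-memory layout of such a packet is sketched below; the field names and widths are illustrative assumptions only, not the actual telemetry format.

/* Illustrative LSBP packet layout (field names and widths are assumptions).
 * The Regular Samples Area and the Spike Samples Area share one payload
 * buffer, filled from opposite ends, preceded by the packing parameters. */
#include <stdint.h>

#define PACKET_BYTES 512u                  /* example packet size             */

typedef struct {
    uint16_t n_samples;                    /* total samples in the packet     */
    uint16_t n_regular;                    /* number of Regular Samples       */
    uint16_t offset;                       /* subtracted integer average      */
    uint8_t  n_bits;                       /* bits per Regular Sample         */
    uint8_t  spare;                        /* padding / future use            */
    uint8_t  payload[PACKET_BYTES - 8u];   /* RSA (front) + SSA (back)        */
} lsbp_packet_t;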
Since the number of samples in each area will change randomly, it will not be possible to completely fill a packet. The filling process will leave, on average, a small empty area in the packet.
In Maris (1999a) a first evaluation for the 30 GHz channel is given, assuming that the signal is composed only of white noise plus the CMB dipole. As noticed in Sect. 7.2, the cosmological dipole affects the compression efficiency, reducing it by a small amount. To deal with it, a possible solution would be to subdivide each data stream into packets, subtract from each measure of a given packet the integer average of the samples (computed as a 16 bits integer number) and then compress the residuals. Each integer average will be sent to Earth together with the related packet, where the operation will be reversed. Since all the numbers are coded as 16 bits integers, all the operations are fully reversible and no round-off error occurs. However, it cannot be excluded that the computational cost of such an operation will offset the gain in $C_{\rm r}$.
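A minimal sketch of this reversible offset removal, in C and with our own naming, is given below; working throughout in 16 bit unsigned (modular) arithmetic guarantees that the ground segment recovers the original samples exactly.

/* Sketch of the reversible offset removal discussed above (names are ours).
 * All operations are carried out in 16-bit integer arithmetic, so subtracting
 * the integer average on board and adding it back on ground is an exact
 * identity: no round-off error is introduced. */
#include <stdint.h>
#include <stddef.h>

uint16_t integer_average(const uint16_t *x, size_t n)
{
    uint32_t sum = 0;                      /* fits 32 bits for packet-sized n */
    for (size_t i = 0; i < n; ++i)
        sum += x[i];
    return (uint16_t)(sum / n);            /* truncated integer mean          */
}

/* On board: replace each sample by its residual (modular 16-bit arithmetic,
 * so possible wrap-around is undone exactly when the offset is added back). */
void remove_offset(uint16_t *x, size_t n, uint16_t offset)
{
    for (size_t i = 0; i < n; ++i)
        x[i] = (uint16_t)(x[i] - offset);
}

/* On ground: restore the original samples. */
void restore_offset(uint16_t *x, size_t n, uint16_t offset)
{
    for (size_t i = 0; i < n; ++i)
        x[i] = (uint16_t)(x[i] + offset);
}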
Two schemes are proposed to perform the cosmological dipole self-adaptation. In Scheme A the average of the $N_{\rm samples}$ samples in the packet is subtracted before coding and then sent separately. In Scheme B the threshold $x_{\rm th}$ is varied proportionally to the dipole contribution. Both of them assume that the dipole contribution is about constant over a packet length. From this assumption follows a limit on the number of samples per packet, i.e. $L_{\rm packet} < 512$ bytes, since for $L_{\rm packet} > 512$ bytes the cosmic dipole contribution can not be considered as a time constant. For larger packets a better modeling (i.e. more parameters) will be required in order not to degrade the compression efficiency.
A critical point is to fix the best $N_{\rm bits}$, i.e. the best $x_{\rm th}$, for a given signal statistics, coding scheme and packet length $L_{\rm packet}$. Even here $C_{\rm r}$ grows with the packet length, but it does not change monotonically with $N_{\rm bits}$. An increase in $N_{\rm bits}$ (i.e. in $x_{\rm th}$) decreases the number of spike samples but increases the size of each regular sample, while the opposite occurs when $N_{\rm bits}$ is decreased, and $C_{\rm r} \to 1$ when $N_{\rm bits} \to 16$ bits.
For both the schemes the optimality is reached at the same value of $N_{\rm bits}$, but Scheme A is better than B.
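The trade-off driving this optimum can be made explicit by writing the expected cost per sample as a function of $N_{\rm bits}$. The sketch below does so for purely Gaussian residuals, charging each Spike Sample 16 bits of value plus 16 bits of address; the dispersion, the spike cost, the signed-residual convention and all names are our own assumptions, so the printed numbers only indicate the shape of the trade-off, not definitive figures.

/* Sketch of the N_bits trade-off: expected bits per sample for Gaussian
 * residuals of dispersion sigma, charging 16 bits of value plus 16 bits of
 * address per Spike Sample (a simplifying assumption), and the resulting
 * expected compression rate Cr = 16 / <bits per sample>. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const double sigma      = 10.5;        /* adu: ~21 levels within +/-1 sigma */
    const double spike_cost = 16.0 + 16.0; /* value + address, in bits          */

    int    best_n  = 0;
    double best_cr = 0.0;

    for (int n = 1; n <= 16; ++n) {
        double threshold = ldexp(1.0, n - 1);            /* 2^(n-1), signed residuals */
        double p_spike   = erfc(threshold / (sigma * sqrt(2.0)));
        double bits      = (1.0 - p_spike) * n + p_spike * spike_cost;
        double cr        = 16.0 / bits;
        if (cr > best_cr) { best_cr = cr; best_n = n; }
    }
    printf("optimum: N_bits = %d, expected Cr = %.2f\n", best_n, best_cr);
    return 0;
}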
Compared with arith-n1, this compression rate is somewhat worse. This is due to two reasons: i) coding by a threshold cut is less effective than applying an optimized compressor; ii) the results reported in Tables 6-9 refer to the compression of a full circle of data instead of a small packet, resulting in a higher efficiency. However, the efficiency of this coding method is similar to the efficiency of the bulk of the other true loss-less compressors tested up to now and, when the need to send a decoding table is considered, it is even higher.
A compression scheme based on the same principle, but with a different organization of fields, has also been proposed by Guzzi & Silvestri (1999), who report a similar compression efficiency.
The second possible solution to the packeting problem is to use one or more standardized coding tables for the compression scheme of choice (Maris 1999b). In this case the coding table would be loaded into the on-board computer before launch, or time by time in flight, and the table should be known in advance at Earth. Major advantages would be: 1. the coding table does not have to be sent to Earth; 2. the compression operator will be reduced to a mapping operator, which may be implemented as a tabular search driven by the input 8 or 16 bits word to be compressed (a sketch is given after this paragraph); 3. any compression scheme (Huffman, arithmetic, etc.) may be implemented by replacing the coding table without changes to the compression program; 4. the compression procedure may be easily written in C or in the native assembler language for the on-board computer or, alternatively, a simple, dedicated hardware may be implemented and interfaced to the on-board computer. The disadvantages of this scheme are: 1. each table must reside permanently in the central computer memory, unless a dedicated hardware is interfaced to it; 2. it is difficult to use adaptive schemes in order to tune the compressor to the input signal; as a consequence the $C_{\rm r}$ may be somewhat smaller than in the case of a true self-adapting compressor code.
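The mapping operator of point 2 above reduces the compression step to a single lookup per input symbol. A toy sketch in C follows; the (code word, length) table format and all names are assumptions, the real table being a Huffman or arithmetic code prepared on ground.

/* Toy table-driven encoder illustrating the mapping operator above: each
 * 8-bit input symbol is replaced by a precomputed (code word, length) pair.
 * The table contents would be loaded before launch or in flight.
 * The output buffer must be zero-initialized by the caller. */
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint16_t code;                         /* code word, right-aligned    */
    uint8_t  length;                       /* code length in bits (1..16) */
} code_entry_t;

/* One entry per 8-bit symbol: 256 entries of a few bytes each, i.e. a table
 * of roughly 1 Kbyte, so several tables fit easily in the on-board memory. */
static code_entry_t table[256];

/* Append 'length' bits of 'code' (most significant first) at bit position
 * 'bitpos' of the output buffer; returns the new bit position. */
static size_t put_bits(uint8_t *out, size_t bitpos, uint16_t code, uint8_t length)
{
    for (int i = length - 1; i >= 0; --i) {
        if ((code >> i) & 1u)
            out[bitpos >> 3] |= (uint8_t)(0x80u >> (bitpos & 7u));
        ++bitpos;
    }
    return bitpos;
}

/* Compress n input bytes with a single pass of table lookups;
 * returns the length of the compressed stream in bits. */
size_t table_encode(const uint8_t *in, size_t n, uint8_t *out)
{
    size_t bitpos = 0;
    for (size_t i = 0; i < n; ++i)
        bitpos = put_bits(out, bitpos, table[in[i]].code, table[in[i]].length);
    return bitpos;
}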
The first problem may be circumvented by limiting the length of the words to be compressed. In our case the data streams may be divided into chunks of 8 bits, and the typical table size would be about 1 Kbyte. Precomputed coding tables may be accurately optimized by Monte-Carlo simulations on ground, or using signals from ground tests of the actual hardware.
The second problem may be overcome by using a preconditioning stage, reducing the statistics of the input signal to the statistics for which the pre-calculated table is optimized. In addition, more than one table may reside in the computer memory, the proper one being selected according to the signal statistics. With a simple reversible statistical preconditioner, about ten tables per frequency channel would be stored in the computer memory, so that the total memory occupation would be less than about 40 Kbytes. It cannot be excluded that the two methods outlined here can be merged.
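As a rough consistency check of this memory budget, assuming about 1 Kbyte per 256-entry table and four frequency channels (both figures being our assumptions):

\begin{displaymath}
10\ {\rm tables/channel} \times 4\ {\rm channels} \times 1\ {\rm Kbyte/table}
\simeq 40\ {\rm Kbytes}.
\end{displaymath}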