The expected data rate from the PLANCK Low Frequency Instrument (LFI) exceeds the bandwidth currently allocated for the scientific data download. Assuming an equal subdivision of the bandwidth between the two instruments on board PLANCK, an overall compression rate of a factor 8.7 is required to download all the data.
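Stated as a formula (the symbols $\dot D_{\rm LFI}$ for the LFI data rate and $B_{\rm sci}$ for the allocated science bandwidth are introduced here only for illustration), the requirement reads
$$
C_r^{\rm req} \;=\; \frac{\dot D_{\rm LFI}}{B_{\rm sci}/2} \;\simeq\; 8.7 .
$$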
In this work we performed a full analysis of realistically simulated data streams for the 30 GHz and 100 GHz channels, in order to determine the maximum compression rate achievable by lossless compression methods. We did not explicitly consider other constraints, such as the power of the on-board Data Processing Unit or the requirements on packet length and packet independence, but we took into account all the instrumental features relevant to data acquisition, i.e. the quantization process, the temperature/voltage conversion, the number of quantization bits and the signal composition.
As a complement to this experimental analysis, we performed in parallel a theoretical analysis of the maximum compression rate. This analysis is based on the statistical properties of the simulated signal and is able to explain quantitatively most of the experimental results.
Our conclusions about the statistical analysis of the quantized signal are: I) the nominally quantized signal has a well-defined entropy per sample at 30 GHz and at 100 GHz, which fixes a theoretical upper limit for the compression rate at each frequency. II) Quantization may introduce some distortion in the signal statistics, but the subject requires a deeper analysis.
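For reference, the upper limit follows from the standard entropy bound for a memoryless source: if each sample is quantized over $N_{\rm bits}$ bits (for instance, $N_{\rm bits}=16$ for a 16-bit acquisition chain) and the per-sample entropy is $H$ bits, an ideal lossless coder cannot do better than
$$
C_r^{\rm max} \;=\; \frac{N_{\rm bits}}{H} .
$$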
Our conclusions about the compression rate $C_r$ are summarized as follows:

I) The compression rate $C_r$ is affected by the quantization step: the larger the quantization step, the higher $C_r$ (but the worse the measurement accuracy).

II) $C_r$ is also affected by the stream length: longer streams (more circles) are compressed better than shorter ones (fewer circles).

III) The dependencies on the quantization step and on the stream length for each compressor may be summarized by the empirical formula (12). A reduced compression rate is correspondingly defined.

IV) $C_r$ is affected by the signal composition, in particular by the white-noise rms and by the dipole contribution, the former being the dominant parameter and the latter having only a minor influence. The inclusion of the dipole contribution reduces the overall compression rate. The other components (1/f noise, CMB fluctuations, the Galaxy, extragalactic sources) have little or no effect on $C_r$. In conclusion, for the sake of compression rate estimation, the signal may be safely represented by a sinusoidal signal plus white noise (see the illustrative sketch after this list).

V) Since the noise rms increases with frequency, the compression rate $C_r$ decreases with frequency across the LFI channels.

VI) The expected random rms scatter in the overall compression rate is small.

VII) We tested a large number of off-the-shelf compressors, with many combinations of control parameters, so as to cover every conceivable compression method. The best performing compressor is the arithmetic compression scheme of order 1, arith-n1, the final $C_r$ being 2.83 at 30 GHz and 2.61 at 100 GHz. This is significantly less than the bare theoretical compression rate of Eq. (9), but when the quantization process is taken properly into account in the theoretical analysis, this discrepancy is largely reduced.

VIII) Taking into account the data flow distribution among the different compressors, the overall compression rate for arith-n1 is about 2.65.
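As an illustration of points I), II) and IV), the following minimal sketch (not the simulation pipeline used in this work; the sample counts, amplitudes and the use of zlib in place of arith-n1 are arbitrary assumptions made here for illustration) quantizes a sinusoid-plus-white-noise stream and measures the lossless compression rate actually achieved:

import numpy as np
import zlib

samples_per_circle = 8700      # hypothetical samples per circle (illustrative only)
n_circles = 60                 # stream length in circles (illustrative only)
sigma = 1.0                    # white-noise rms, arbitrary units
dipole_amp = 3.0               # dipole-like sinusoid amplitude (assumed)
q = 0.5 * sigma                # quantization step, assumed fraction of the noise rms

n = samples_per_circle * n_circles
phase = 2.0 * np.pi * np.arange(n) / samples_per_circle
signal = dipole_amp * np.sin(phase) + np.random.normal(0.0, sigma, n)

# Quantize on 16-bit integers with step q, mimicking a nominal acquisition chain.
quantized = np.round(signal / q).astype(np.int16)

raw = quantized.tobytes()
compressed = zlib.compress(raw, 9)
print("achieved compression rate C_r = %.2f" % (len(raw) / len(compressed)))

Varying q, the stream length or the noise rms in this toy example reproduces qualitatively the trends of points I), II) and IV).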
An overall lossless compression rate of about 2.65 is therefore well below the required factor of 8.7. Possible solutions involve the application of lossy compression methods, such as on-board averaging, data rebinning, or averaging of signals from duplicated detectors, in order to reach an overall lossy compression of about a factor 3.4, which, coupled with the overall lossless compression rate of about 2.65, should allow the required final compression rate to be reached. However, each of these solutions introduces severe constraints and significant performance reductions in the final mission design, so that careful, in-depth studies will be required in order to choose the best one.
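Explicitly, combining the two factors quoted above gives
$$
3.4 \times 2.65 \;\simeq\; 9.0 \;>\; 8.7 .
$$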
Another solution to the bandwidth problem would be to apply a coarser quantization step. This, however, has the drawback of reducing the resolution on the measured signal.
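As a rough guide (standard approximations for the fine quantization of a Gaussian signal, not derived in this paper), for a quantization step $q$ the per-sample entropy decreases by about one bit for every doubling of $q$, while the quantization noise grows proportionally,
$$
H(q) \;\simeq\; H(q_0) - \log_2\!\frac{q}{q_0},
\qquad
\sigma_q \;=\; \frac{q}{\sqrt{12}},
$$
so any gain in $C_r^{\rm max} = N_{\rm bits}/H$ is paid for directly in measurement accuracy.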
Lastly, the choice of a given compressor cannot be based only on its efficiency as measured on simulated data, but also on the available on-board CPU and on the official ESA space qualification: tests with this hardware platform and with other compressors will be made during the project development. Moreover, we are confident that the experience which will be gained within the CMB community in developing ground-based, balloon-borne and space-based experiments, as well as in the development of full prototypes of the on-board electronics, will provide a solid base to test and improve compression algorithms. In addition, the final compression scheme will have to cope with requirements on packet length and packet independence. We briefly discuss these problems, recalling two proposals (Maris 1999a, 1999b) which suggest solutions to cope with these constraints.
Acknowledgements
We warmly acknowledge a number of people who actively supported this work with fruitful discussions, in particular F. Argüeso, M. Bersanelli, L. Danese, G. De Zotti, E. Gaztañaga, J. Herrera, N. Mandolesi, P. Platania, A. Romeo, M. Seiffert and L. Toffolatti; K. Gorski and all the people involved in the construction of the Healpix pixelisation tools, largely employed in this work; G. Lombardi from Siemens - Bocholt and G. Maris from ETNOTEAM - Milano, for fruitful discussions about compression principles and their practical application; and P. Guzzi and R. Silvestri from LABEN - Milano, for explanations and suggestions about the PLANCK-LFI data acquisition electronics. Finally, we acknowledge the referee, Miguel A. Albrecht, for the useful suggestions and corrections, which significantly improved the readability and accuracy of the text.