An informal theoretical analysis is helpful to evaluate the maximum lossless compression efficiency expected for LFI and to discuss the behaviour of the different compressors. For further details we refer the reader to Nelson & Gailly (1996).

Data compression is based on partitioning a stream of bits into short chunks, represented by strings of bits of fixed length $N_{{\rm bits}}$ , and coding each input string $S_{{\rm In}}$ into another string $S_{{\rm Out}}$ whose length $N_{{\rm bits}}^{{\rm out}}$ is variable and, in principle, shorter than that of $S_{{\rm In}}$ . In this scheme, when the stream of bits represents a message, the possible combinations of bits in $S_{{\rm In}}$ represent the symbols by which the message is encoded. From this description, the compression operation is equivalent to mapping the input string set $\{S_{{\rm In}}\}$ into an output string set $\{S_{{\rm Out}}\}$ through a compressing function ${\mathcal{F}}_{{\rm Comp}}$ . A compression algorithm is called lossless when it is possible to reverse the compression process, reconstructing the $S_{{\rm In}}$ string from $S_{{\rm Out}}$ through a decompression algorithm. The condition for a compression program to be lossless is therefore that the related ${\mathcal{F}}_{{\rm Comp}}$ is a one-to-one mapping of $\{S_{{\rm In}}\}$ into $\{S_{{\rm Out}}\}$ . In this case the decompressing algorithm is the inverse function of ${\mathcal{F}}_{{\rm Comp}}$ . Of course, in the general case it is not possible to have at the same time lossless compression and $\mbox{$N_{\rm bits}$ } > \mbox{$N_{{\rm bits}}^{{\rm out}}$ }$ for every string in the input set, since there are fewer distinct strings shorter than $N_{{\rm bits}}$ bits than the $2^{N_{{\rm bits}}}$ possible input strings. The problem is solved by assuming that the discrete distribution $P(\mbox{$S_{{\rm In}}$ })$ of the strings in the input stream is not flat, but that a most probable string exists. A good ${\mathcal{F}}_{{\rm Comp}}$ will then assign the shortest $S_{{\rm Out}}$ to the most probable $S_{{\rm In}}$ and, the less probable the input string, the longer the output string. In the worst case, output strings longer than the input string will be assigned to the least probable strings of $\{S_{{\rm In}}\}$ . 
With this statistical tuning of the compression function, the final length of the compressed stream will be shorter than the original length, the average length of $\mbox{$S_{{\rm Out}}$ }$ being:

$\begin{displaymath}% \mbox{$\overline{N_{{\rm bits}}^{{\rm out}}}$ } = \sum_{\mbox{$S_{{\rm In}}$ }} P(\mbox{$S_{{\rm In}}$ })\, N_{{\rm bits}}^{{\rm out}}(\mbox{${\mathcal{F}}_{{\rm Comp}}$ }(\mbox{$S_{{\rm In}}$ })). \end{displaymath}$

(7)
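As a toy illustration of the average output length, the following sketch evaluates the sum over input strings for a hypothetical four-symbol alphabet ($N_{{\rm bits}} = 2$). The probabilities and the coding table `f_comp` are assumptions chosen for the example and are not LFI data.

```python
# Toy evaluation of the average output length for a 2-bit input alphabet.
N_BITS = 2

# Assumed non-flat distribution P(S_In); purely illustrative.
prob = {"00": 0.70, "01": 0.15, "10": 0.10, "11": 0.05}

# Hypothetical prefix-free coding table F_Comp: the most probable input
# string receives the shortest output string.
f_comp = {"00": "0", "01": "10", "10": "110", "11": "111"}


def mean_output_length(prob, table):
    """Average number of output bits per input chunk."""
    return sum(p * len(table[s]) for s, p in prob.items())


mean_len = mean_output_length(prob, f_comp)

# The same table applied to a flat distribution gives no gain.
flat = {s: 0.25 for s in prob}
mean_len_flat = mean_output_length(flat, f_comp)

print(f"non-flat: {mean_len:.2f} bits, flat: {mean_len_flat:.2f} bits "
      f"(input: {N_BITS} bits)")
```

In this example the two least probable strings are coded with three bits, more than the two input bits, yet the average stays below $N_{{\rm bits}}$ because those strings are rare. With the flat distribution the same table yields an average above $N_{{\rm bits}}$ .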

Several factors affect the efficiency of a given compressor. In particular, the best performance is obtained when the compression algorithm is tuned to the specific distribution of symbols. Since the symbol distribution depends on $N_{{\rm bits}}$ and on the specific input stream, an ideal general-purpose self-adapting compressor should be able to perform the following operations: i) acquire the full bit stream (under the hypothesis that it has a finite length) and divide it into chunks of length $N_{{\rm bits}}$ , ii) perform a frequency analysis of the various symbols, iii) create an optimized coding table which associates a specific $S_{{\rm Out}}$ to each $S_{{\rm In}}$ , iv) perform the compression according to the optimized coding table, v) send the coding table to the uncompressing program together with the compressed bit stream. The uncompressing program will restore the original bit stream using the associated optimized coding table.
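The five operations listed above can be sketched as follows. Huffman coding is used here only as one possible way to build an optimized coding table; it is a choice made for this example and is not prescribed by the text. The function names and the use of one-byte chunks are likewise assumptions, and the compressed stream is kept as a string of `0`/`1` characters for readability.

```python
import heapq
from collections import Counter


def build_table(chunks):
    """Steps ii-iii: frequency analysis, then a Huffman coding table."""
    freq = Counter(chunks)
    if not freq:
        return {}
    if len(freq) == 1:  # a single symbol still needs a one-bit code
        return {next(iter(freq)): "0"}
    # Heap entries: (count, unique tie-breaker, {symbol: partial code}).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        n0, _, t0 = heapq.heappop(heap)
        n1, _, t1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t0.items()}
        merged.update({s: "1" + c for s, c in t1.items()})
        heapq.heappush(heap, (n0 + n1, next_id, merged))
        next_id += 1
    return heap[0][2]


def compress(data, chunk_bytes=1):
    """Steps i and iv; the returned table stands for step v."""
    chunks = [data[i:i + chunk_bytes]
              for i in range(0, len(data), chunk_bytes)]
    table = build_table(chunks)
    bits = "".join(table[c] for c in chunks)
    return table, bits


def decompress(table, bits):
    """Restore the original stream from the coding table and the bits."""
    inverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:  # prefix-free codes: first match is the symbol
            out.append(inverse[current])
            current = ""
    return b"".join(out)


data = b"aaaaaaabbbc" * 10
table, bits = compress(data)
restored = decompress(table, bits)
print(len(data) * 8, "input bits ->", len(bits), "output bits")
```

The sketch ignores the cost of transmitting the coding table, which the text identifies as one of the practical limits of this ideal scheme.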

In practice, in most cases the chunk size $N_{{\rm bits}}$ is hardwired into the compressing code (typically $\mbox{$N_{\rm bits}$ } = 8$ or 16 bits). Moreover, the fine tuning of the coding table for each specific bit stream is too expensive in terms of computer resources to be performed in this way, and the same holds for the transmission of the coding table. There are therefore compressors which work as if the coding table or, equivalently, the compression function were fixed. In this way the bit stream may be compressed chunk by chunk by the compressing algorithm, which acts as a filter. Other compressors perform the statistical tuning on a small set of chunks taken at the beginning of the stream, and then apply the same coding table to the full input stream. In this case the compression efficiency will be sensitive to the presence of correlations between different parts of the input stream. In this respect self-adaptive codes may be more effective than non-adaptive ones, if their adapting strategy is sensitive to the kind of correlations present in the input stream.

On the other hand, other solutions may be adopted to obtain a good compromise between computer resources and compression optimization. All of the previous compressors are called static, since the coding table is fixed in one way or another at the beginning of the compression process and then used over the whole input stream. Another large class of self-adaptive codes is represented by dynamical self-adaptive compressors, which gain statistical knowledge about the signal as the compression proceeds and update the coding table accordingly. Of course, these codes compress worse at the beginning and better at the end of the data stream, provided its statistical properties are stationary. They are also able to adapt to substantial changes in the characteristics of the input stream, but only if these changes can be sensed by the adapting code; otherwise the compressor will behave worse than a well-tuned static compressor. Moreover, if the signal changes frequently, the advantage of dynamical self-adaptability may be offset by the number of messages added to the output stream to inform the decompressing algorithm of the changes to the coding table. Last but not least, if some error occurs during the transmission of the compressed stream and the messages about changes in the coding table are lost, it will be impossible to correctly restore the stream at the receiving station. This problem may be less severe for a static compressor since, for example, it is possible to split the output stream into packets delimited by stop codes and to store the coding table on-board until the receiving station confirms the correct transmission.
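The statement that a dynamical compressor performs worse at the beginning and better at the end of a stationary stream can be illustrated with a simplified model. The sketch below assumes a count-based adaptive table that starts flat and is updated after every symbol, and it measures the mean ideal code length $-\log_2 p$ that such a table would give for the true distribution. The alphabet, the weights and the stream length are hypothetical, and no table-update messages are modelled.

```python
import math
import random


def expected_bits(counts, probs):
    """Mean ideal code length (bits/symbol) for a table built from counts."""
    total = sum(counts)
    return sum(p * -math.log2(c / total) for c, p in zip(counts, probs))


# Assumed stationary, non-flat symbol distribution (8 symbols).
weights = [64, 16, 8, 4, 2, 2, 1, 1]
probs = [w / sum(weights) for w in weights]
entropy = -sum(p * math.log2(p) for p in probs)

rng = random.Random(0)
stream = rng.choices(range(len(probs)), weights=weights, k=4000)

counts = [1] * len(probs)           # flat coding table at the start
cost_start = expected_bits(counts, probs)

checkpoints = {}
for n, symbol in enumerate(stream, start=1):
    counts[symbol] += 1             # adapt the table as the stream proceeds
    if n in (10, 100, 1000, 4000):
        checkpoints[n] = expected_bits(counts, probs)

cost_end = checkpoints[4000]
print(f"start: {cost_start:.3f} bits/symbol, entropy: {entropy:.3f}")
for n, cost in checkpoints.items():
    print(f"after {n:5d} symbols: {cost:.3f} bits/symbol")
```

Under these assumptions the cost starts at $\log_2 8 = 3$ bits per symbol and approaches the entropy of the distribution as more of the stream is seen.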

It is then clear that each specific compression algorithm is statistically optimized for a given kind of input stream with its own statistical properties. So to obtain an optimized compressor for LFI it is important to properly characterize the statistics of the signal to be compressed and to test different existing compressors in order to map the behaviour of different compression schemes using realistically simulated signals and, as soon as possible, the true signals produced by the LFI electrical model.

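In order to evaluate the performance of different compression schemes we considered the compression rate $C_{{\rm r}}$ , defined as the ratio between the length $L_{{\rm u}}$ of the uncompressed stream and the length $L_{{\rm c}}$ of the compressed stream: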

$\begin{displaymath}% \mbox{$C_{{\rm r}}$ } = \frac{L_{{\rm u}}}{L_{{\rm c}}} \end{displaymath}$

(8)
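A minimal sketch of how $C_{{\rm r}}$ can be measured in practice is given below. It uses the `zlib` module of the Python standard library as a stand-in compressor, which is an assumption of this example and not necessarily one of the compressors considered in this work. The simulated stream (Gaussian noise with an rms of 6.5 quantization levels, stored as 16-bit integers) is also hypothetical.

```python
import random
import struct
import zlib


def compression_rate(raw: bytes, level: int = 9) -> float:
    """C_r = L_u / L_c, with both lengths measured in bytes."""
    return len(raw) / len(zlib.compress(raw, level))


rng = random.Random(1)

# Hypothetical scan circle: 8640 quantized samples of zero-mean white noise.
samples = [round(rng.gauss(0.0, 6.5)) for _ in range(8640)]
raw = struct.pack(f"<{len(samples)}h", *samples)  # 16-bit signed integers

cr_noise = compression_rate(raw)
cr_constant = compression_rate(b"\x00" * len(raw))

print(f"white noise: C_r = {cr_noise:.2f}")
print(f"constant   : C_r = {cr_constant:.2f}")
```

A constant stream gives a very large $C_{{\rm r}}$ , while the noise-dominated stream is compressed by a much smaller factor, since the compressor can only exploit the non-flat distribution of the quantized levels.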

Each of the 8640 samples which form one scan circle is a measurement dominated by white noise, the rms $\sigma_{{\rm T}}$ being about a factor of ten higher than the CMB fluctuation signal. To first approximation it is therefore possible to treat the digitized data stream from the front-end electronics as a stationary time series of independent samples produced by a normally distributed white noise generator. In this situation the symbols are represented by the quantized signal levels, it is easy to infer the best coding table, and information theory readily provides the expected compression rate for an optimized compressor (Gaztañaga et al. 1998). In our notation, for a zero average signal:

$\begin{displaymath}% \mbox{$C_{{\rm r}}^{{\rm Th}}$ } = \frac{\mbox{$N_{{\rm bits}}$ } \ln 2}{\ln(\sqrt{2\pi {\rm e}}\; \mbox{$\sigma_{{\rm l}}$ }/\mbox{${\rm adu}$ }) +\ln \mbox{${{\rm VOT}}$ }} \end{displaymath}$

(9)

From Eq. (9) it is possible to infer that the higher the ${{\rm VOT}}$ (i.e. the higher the $\Delta T$ resolution), the worse the compression rate, as already observed in Maris et al. (1998) and Maris et al. (1999). The reason is that, as ${{\rm VOT}}$ is increased, the number of quantization levels (i.e. of symbols) to be coded increases and their distribution becomes flatter, increasing $\overline{N_{{\rm bits}}^{{\rm out}}}$ . Assuming that all the white noise is thermal in origin, $\mbox{$\sigma_{{\rm l}}$ } \approx \mbox{$\sigma_{{\rm T}}$ } \approx 2 \times 10^{-3}$ K. With the ${\rm adu}$ defined in Eq. (5), together with the typical values of $V_{{\rm min}}$ and $V_{{\rm max}}$ assumed therein, and $\mbox{$N_{{\rm bits}}$ } = 16$ bits, we have $\mbox{$C_{{\rm r}}^{{\rm Th}}$ } \sim 11.09/(3.30+\ln\mbox{${{\rm VOT}}$ })$ . In conclusion, for $\mbox{${{\rm VOT}}$ } = 0.5$ , 1.0, 1.5 V/K the $C_{{\rm r}}^{{\rm Th}}$ is respectively 4.26, 3.36, 3.00. In addition, Fig. 2 shows the effect of a reduction of $N_{{\rm bits}}$ on $C_{{\rm r}}^{{\rm Th}}$ , compared to $C_{{\rm r}}^{{\rm Th}}$ for $\mbox{$N_{{\rm bits}}$ } = 16$ .
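The numerical values quoted above can be checked with a short calculation. The sketch below assumes that the numerator 11.09 corresponds to $N_{{\rm bits}} \ln 2$ , that the constant 3.30 corresponds to $\ln(\sqrt{2\pi {\rm e}}\, \sigma_{{\rm l}}/{\rm adu})$ , and that the ${\rm adu}$ of Eq. (5) is $(V_{{\rm max}} - V_{{\rm min}})/2^{N_{{\rm bits}}}$ with $V_{{\rm min}} = -10$ V and $V_{{\rm max}} = +10$ V. These assumptions reproduce the two constants, but they are inferred from the quoted numbers rather than taken from the text.

```python
import math


def cr_theoretical(vot, n_bits=16, sigma_l=2e-3, v_min=-10.0, v_max=10.0):
    """Theoretical compression rate for quantized Gaussian white noise.

    vot     : conversion factor in V/K
    sigma_l : rms of the noise in K
    The quantization step adu = (v_max - v_min) / 2**n_bits is an assumed
    form of Eq. (5), in volts per level.
    """
    adu = (v_max - v_min) / 2 ** n_bits
    numerator = n_bits * math.log(2)
    constant = math.log(math.sqrt(2 * math.pi * math.e) * sigma_l / adu)
    return numerator / (constant + math.log(vot))


numerator_16 = 16 * math.log(2)
constant_16 = math.log(math.sqrt(2 * math.pi * math.e) * 2e-3 / (20.0 / 2 ** 16))
print(f"numerator = {numerator_16:.2f}, constant = {constant_16:.2f}")

rates = {vot: cr_theoretical(vot) for vot in (0.5, 1.0, 1.5)}
for vot, cr in rates.items():
    print(f"VOT = {vot:.1f} V/K -> C_r^Th = {cr:.3f}")
```

Under these assumptions the computed rates agree with the quoted 4.26, 3.36 and 3.00 to within about 0.01.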

$\begin{figure} \par\includegraphics[width=7cm,clip]{H2223F2.eps} \end{figure}$

Figure 2: $C_{{\rm r}}^{{\rm Th}}$ as a function of ${{\rm VOT}}$ and $N_{{\rm bits}}$ . It is assumed that $\mbox{$V_{{\rm min}}$ } = -10$ V, $\mbox{$V_{{\rm max}}$ } = +10$ V and $\mbox{$\sigma_{{\rm T}}$ } = 2 \times 10^{-3}$ K. The curve for 12 bits is scaled by a factor 0.1 to allow a better comparison with the 16 bits curve.

4 An informal theoretical analysis about the compression efficiency