2 Multiscale vision model

2.1 Introduction

The wavelet transform of an image by the à trous algorithm produces, at each scale j, a set of coefficients $\{w_j\}$ with the same number of pixels as the image. The original image $c_0$ can be expressed as the sum of all the wavelet scales and the smoothed array $c_{\rm p}$: $c_0 = c_{\rm p} + \sum_{j=1}^{\rm p} w_j$. Equivalently, a pixel at position (x,y) is the sum of all the wavelet coefficients at this position, plus the smoothed array: $c_0(x,y) = c_{\rm p}(x,y) + \sum_{j=1}^{\rm p} w_j(x,y)$.
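The decomposition and the reconstruction identity can be checked numerically. The following is a minimal sketch, assuming a B3-spline smoothing kernel and mirrored image borders (neither is specified in this section); the function name `atrous_transform` is ours and purely illustrative.

```python
import numpy as np
from scipy.ndimage import convolve1d

# B3-spline smoothing kernel: a common choice for the a trous algorithm,
# assumed here because the text does not fix the filter.
B3 = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0


def atrous_transform(image, n_scales):
    """Return ([w_1, ..., w_p], c_p) for a 2D image (hypothetical helper)."""
    c = np.asarray(image, dtype=float)
    wavelets = []
    for j in range(n_scales):
        step = 2 ** j
        kernel = np.zeros(4 * step + 1)
        kernel[::step] = B3  # 2**j - 1 zeros ("holes") between the taps
        smoothed = c
        for axis in (0, 1):  # separable smoothing along rows and columns
            smoothed = convolve1d(smoothed, kernel, axis=axis, mode="mirror")
        wavelets.append(c - smoothed)  # wavelet scale = difference of smoothings
        c = smoothed
    return wavelets, c


rng = np.random.default_rng(0)
c0 = rng.normal(size=(64, 64))
w, cp = atrous_transform(c0, n_scales=4)

# c_0 = c_p + sum_j w_j, pixel by pixel
reconstructed = cp + sum(w)
```

Every array in `w` has the shape of the input image, and `reconstructed` agrees with `c0` up to rounding errors, whatever the kernel or border handling, because the wavelet scales are successive differences that telescope.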

After applying the wavelet transform to the image, we have to detect, extract, measure and recognize the significant structures. This is done by first computing the multiresolution support of the image, and then applying a segmentation scale by scale. The wavelet space of a 2D direct space is 3D (two spatial coordinates plus scale), and an object has to be defined in this space. A general idea for object definition lies in the connectivity property: an object occupies a physical region, and any pixel in this region can be joined to any other. Connectivity in direct space has to be transported to wavelet transform space (WTS). In order to define the objects, we have to identify the WTS pixels that we can attribute to them. We describe in this section the different steps of this method.

2.2 Definition

The Multiscale Vision Model (MVM) (Bijaoui & Rué 1995; Rué & Bijaoui 1997) describes an object as a hierarchical set of structures. It uses the following definitions:

significant wavelet coefficient: a wavelet coefficient is significant when its absolute value is above a given detection limit. The detection limit depends on the noise model (Gaussian noise, Poisson noise, and so on). See Starck et al. (1998) for a full description of the noise modelling;
structure: a structure ${\cal S}_j$ is a set of significant connected wavelet coefficients at the same scale j;
object: an object is a set of structures;
object scale: the scale of an object is given by the scale of the maximum of its wavelet coefficients;
interscale relation: the criterion allowing us to connect two structures into a single object is called the "interscale relation";
sub-object: a sub-object is a part of an object. It appears when an object has a local wavelet maximum. Hence, an object can be composed of several sub-objects. Each sub-object can also be analysed.

2.3 The multiresolution support and its segmentation

A multiresolution support of an image describes in a logical or Boolean way whether an image I contains information at a given scale j and at a given position (x,y). If $M^{(I)}(j,x,y) = 1$ (or $= \mbox{true}$), then I contains information at scale j and at the position (x,y). M depends on several parameters:

The input image;
The algorithm used for the multiresolution decomposition;
The noise;
All additional constraints we want the support to satisfy.

Such a support results from the data, the treatment (noise estimation, etc.), and from knowledge on our part of the objects contained in the data (size of objects, linearity, etc.). In the most general case, a priori information is not available to us.

The multiresolution support of an image is computed in several steps:

Computation of the wavelet transform of the image;
Binarization of each scale, which yields the multiresolution support (binarizing an image consists of assigning to each pixel a value equal to either 0 or 1);
Optional modification of the support to introduce a priori knowledge.

This last step depends on the knowledge we have of our images. For instance, if we know there is no interesting object smaller or larger than a given size in our image, we can suppress, in the support, anything which is due to that kind of object. This can often be done conveniently by the use of mathematical morphology. In the most general setting, we naturally have no information to add to the multiresolution support.
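As an illustration of how mathematical morphology can modify a support, the sketch below removes, at every scale, any group of pixels that cannot contain a square of a given size. It assumes the support is stored as a Boolean array indexed by (j, x, y) and uses a binary opening from `scipy.ndimage`; the function name `remove_small_structures` and the 3 x 3 footprint are our own choices, not part of the method described here.

```python
import numpy as np
from scipy import ndimage


def remove_small_structures(support, size=3):
    """Apply a binary opening to each scale of a Boolean support M[j, x, y].

    Hypothetical helper: groups of pixels too small to contain a
    size x size square are erased, larger ones are kept.
    """
    footprint = np.ones((size, size), dtype=bool)
    return np.stack(
        [ndimage.binary_opening(plane, structure=footprint) for plane in support]
    )


# One scale containing a 3 x 3 block and an isolated pixel.
plane = np.zeros((7, 7), dtype=bool)
plane[2:5, 2:5] = True
plane[0, 6] = True
support = plane[np.newaxis]

cleaned = remove_small_structures(support)
```

In this example the 3 x 3 block survives the opening while the isolated pixel at (0, 6) is suppressed.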

Figure 2: Example of connectivity in wavelet space: contiguous significant wavelet coefficients form a structure and, following an interscale relation, a set of structures forms an object. Two structures $S_j$, $S_{j+1}$ at two successive scales belong to the same object if the pixel position of the maximum wavelet coefficient value of $S_j$ is included in $S_{j+1}$.

The multiresolution support is obtained by detecting the significant coefficients at each scale. It is defined by:

$\begin{displaymath} M(j,x,y) = \left\{ \begin{array}{ll} 1 & \mbox{if } w_j(x,y) \mbox{ is significant}, \\ 0 & \mbox{if } w_j(x,y) \mbox{ is not significant}. \end{array} \right. \end{displaymath}$

(1)

In the case of Gaussian noise, it suffices to compare the wavelet coefficients w_j(x,y) to a threshold level t_j. The threshold t_j is generally taken equal to $k \sigma_j$, where $\sigma_j$ is the noise standard deviation at scale j, and k is chosen between 3 and 5. The value k = 3 corresponds to a probability of false detection of 0.0027 for Gaussian statistics. If the absolute value of w_j(x,y) is below t_j, the coefficient is not significant and could be due to noise; otherwise it is significant:

$\begin{displaymath} \begin{array}{l} \mbox{if } \mid w_j \mid \ \geq \ t_j \ \ \mbox{then } w_j \mbox{ is significant}, \\ \mbox{if } \mid w_j \mid \ < \ t_j \ \ \mbox{then } w_j \mbox{ is not significant.} \end{array} \end{displaymath}$

(2)

Many other kinds of noise can be considered in the wavelet space. See Starck et al. (1998) for a review.
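For the Gaussian case, Eqs. (1) and (2) amount to a scale-by-scale comparison with $k \sigma_j$. The sketch below assumes that the wavelet scales and the noise standard deviations $\sigma_j$ are already available (how $\sigma_j$ is estimated is outside its scope); the function name `multiresolution_support` and the toy values are illustrative only.

```python
import numpy as np


def multiresolution_support(wavelets, sigmas, k=3.0):
    """Boolean array M[j, x, y], True where |w_j(x, y)| >= k * sigma_j.

    `wavelets` is a sequence of 2D arrays (one per scale) and `sigmas`
    the matching noise standard deviations; both are supplied by the caller.
    """
    return np.stack([np.abs(w) >= k * s for w, s in zip(wavelets, sigmas)])


# Toy coefficients at two scales.
w1 = np.array([[0.5, -4.0], [2.9, 3.0]])
w2 = np.array([[1.0, 0.2], [-0.1, 0.0]])

M = multiresolution_support([w1, w2], sigmas=[1.0, 0.25], k=3.0)
```

With these values the thresholds are 3.0 at the first scale and 0.75 at the second, so the negative coefficient -4.0 is flagged as significant (the test is on the absolute value) while 2.9 is not.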

Multiresolution support segmentation

The segmentation consists of labelling a Boolean image (0 or 1). Each group of connected pixels having a value of 1 gets a label value between 1 and $L_{\rm max}$, $L_{\rm max}$ being the number of groups. This process is repeated at each scale of the multiresolution support. We define a "structure" $S_j^i$ as the group of connected significant pixels which has the label i at a given scale j.
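A possible implementation of this labelling relies on `scipy.ndimage.label`, applied independently to each scale. The sketch below assumes 4-connectivity (the default of `ndimage.label` in 2D), which the text does not specify; the function name `segment_support` is hypothetical.

```python
import numpy as np
from scipy import ndimage


def segment_support(support, structure=None):
    """Label connected groups of 1-pixels at each scale of M[j, x, y].

    Returns (labels, counts): labels[j] holds values 0..L_max for scale j
    and counts[j] is L_max. `structure` is passed to ndimage.label; None
    means 4-connectivity in 2D (an assumption, not fixed by the text).
    """
    labels, counts = [], []
    for plane in support:
        lab, n = ndimage.label(plane, structure=structure)
        labels.append(lab)
        counts.append(n)
    return np.stack(labels), counts


support = np.array(
    [
        [[1, 1, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]],  # scale 1
        [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]],  # scale 2
    ],
    dtype=bool,
)
labels, counts = segment_support(support)
```

At the second scale the two pixels touch only diagonally, so they form two separate structures under 4-connectivity; passing `structure=np.ones((3, 3))` would merge them into one.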

2.4 Interscale connectivity graph

An object is described as a hierarchical set of structures. The rule which allows us to connect two structures into a single object is called the "interscale relation". Figure 2 shows how several structures at different scales are linked together and form objects. We now have to define the interscale relation: let us consider two structures at two successive scales, $S_j^k$ and $S_{j+1}^l$. Each structure is located in one of the individual images of the decomposition and corresponds to a region in this image where the signal is significant. Denoting by $p_{\rm m}$ the pixel position of the maximum wavelet coefficient value of $S_j^k$, $S_j^k$ is said to be connected to $S_{j+1}^l$ if $S_{j+1}^l$ contains the pixel position $p_{\rm m}$ (i.e. the maximum position of the structure $S_j^k$ must also be contained in the structure $S_{j+1}^l$). Several structures appearing in successive wavelet coefficient images can be connected in this way; the resulting set of linked structures is what we call an object in the interscale connectivity graph.
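The interscale relation can be sketched as follows for a pair of successive scales. The code assumes that each scale has already been labelled with consecutive integers 1..L_max (0 meaning "not significant") and takes the maximum of the coefficient value, as stated above, rather than of its absolute value; the function name `interscale_links` is hypothetical.

```python
import numpy as np


def interscale_links(w_j, labels_j, labels_next):
    """Map each structure label k at scale j to the label l at scale j+1
    that contains p_m, the position of the maximum coefficient of S_j^k.

    A value of 0 means that no structure at scale j+1 contains p_m.
    """
    links = {}
    for k in range(1, int(labels_j.max()) + 1):
        # Restrict the search for the maximum to the pixels of structure k.
        masked = np.where(labels_j == k, w_j, -np.inf)
        p_m = np.unravel_index(np.argmax(masked), w_j.shape)
        links[k] = int(labels_next[p_m])
    return links


w_j = np.array([[5.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 3.0], [0.0, 0.0, 0.0, 2.0]])
labels_j = np.array([[1, 1, 0, 0], [0, 0, 0, 2], [0, 0, 0, 2]])
labels_next = np.array([[1, 1, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0]])

links = interscale_links(w_j, labels_j, labels_next)
```

Here structure 1 has its maximum at (0, 0), which lies inside structure 1 of the next scale, so the two are connected; structure 2 has its maximum at (1, 3), where nothing is significant at the next scale, so it is linked to no structure. Applying the function to every pair of successive scales gives the edges of the interscale connectivity graph.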

2.5 Reconstruction

Hence, a set of structures defines an object $W = \{S_j^k, \ldots, S_{j'}^{k'}\}$ which can be reconstructed separately from other objects. The coaddition of all reconstructed objects is a filtered version of the input data.

The reconstruction problem consists of searching for a signal O such that its wavelet coefficients are the same as those of the detected structures. If $\cal T$ describes the wavelet transform operator, and $P_{\rm w}$ the projection operator in the subspace of the detected coefficients (i.e. having set to zero all coefficients at scales and positions where nothing was detected), the solution is found by minimization of

$\begin{eqnarray*}J(O) = \parallel W - (P_{\rm w} \circ {\cal T}) O \parallel \end{eqnarray*}$

where W represents the detected wavelet coefficients of the data. More details can be found in Bijaoui & Rué (1995).
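To make the minimization concrete, the sketch below applies a plain Landweber (fixed-step gradient) iteration to $J(O)$. This is an illustrative stand-in, not necessarily the scheme of Bijaoui & Rué (1995). It assumes a B3-spline à trous transform with periodic borders, so that the adjoint of $\cal T$ can be written explicitly, and a step size of 1; all function names are ours.

```python
import numpy as np
from scipy.ndimage import convolve1d

B3 = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # assumed smoothing kernel


def smooth(c, j):
    """B3-spline smoothing with 2**j - 1 holes and periodic borders."""
    kernel = np.zeros(4 * 2 ** j + 1)
    kernel[:: 2 ** j] = B3
    for axis in (0, 1):
        c = convolve1d(c, kernel, axis=axis, mode="wrap")
    return c


def forward(image, n_scales):
    """Wavelet scales of the image (operator T, smoothed array left out)."""
    c, scales = image, []
    for j in range(n_scales):
        s = smooth(c, j)
        scales.append(c - s)
        c = s
    return np.stack(scales)


def adjoint(coeffs):
    """Adjoint of `forward`, valid because each smoothing is symmetric."""
    acc = np.zeros_like(coeffs[0])
    for j in reversed(range(len(coeffs))):
        acc = coeffs[j] - smooth(coeffs[j], j) + smooth(acc, j)
    return acc


def reconstruct(W, mask, n_iter=20, step=1.0):
    """Landweber iteration on J(O) = || W - (P_w o T) O ||."""
    O = np.zeros_like(W[0])
    history = []
    for _ in range(n_iter):
        residual = W - mask * forward(O, len(W))
        history.append(float(np.linalg.norm(residual)))  # J before the update
        O = O + step * adjoint(mask * residual)
    return O, history


# Toy object: a Gaussian blob, detected where |w_j| exceeds an arbitrary level.
y, x = np.mgrid[:32, :32]
image = np.exp(-((x - 16.0) ** 2 + (y - 16.0) ** 2) / 18.0)
w = forward(image, 3)
mask = np.abs(w) >= 0.01
W = mask * w
O, history = reconstruct(W, mask)
```

The list `history` records $J(O)$ at each iteration and should not increase from one iteration to the next under these assumptions. A conjugate gradient scheme would typically converge faster, at the cost of a slightly longer implementation.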
