Optica Publishing Group

A Bayesian approach for polarimetric data reduction: the Mueller imaging case

Open Access

Abstract

In this paper, we extend to the Mueller imaging framework a formerly introduced Bayesian approach dealing with polarimetric data reduction and robust clustering of polarization encoded images in the piecewise constant case. The extension was made possible by a suitable formulation of the observation model in the Mueller context, relying on the system's coherency matrix and its Cholesky decomposition, such that the admissibility constraints are easily captured. This generalization comes at the cost of nonlinearity with respect to the parameters to be estimated. The estimation-clustering problem is tackled in a Bayesian framework where a hierarchical stochastic model based on the Markov random field proposed by Potts is used. This fully unsupervised approach is extensively tested over synthetic data as well as real Mueller images.

©2008 Optical Society of America

1. Introduction

In a previous paper [1], we reported on a new method addressing polarimetric data reduction in the Stokes imaging framework. Our main contribution there concerned the use of a Bayesian approach ensuring the physical authenticity of polarization encoded images obtained from raw intensity acquisitions while providing a robust clustering of the imaged scene. We also showed that our approach preserved the Stokes channels from image noises and errors induced by the imaging system. The problem of the more complex Mueller imaging case remained pending; it is addressed in the present article.

The importance of Mueller imaging comes from the rich set of information it provides about the local nature of the target, which may be of great use in studying a number of physical phenomena that arise in many fields of science. Hence, an increasing interest is devoted to the development of new imaging systems that record spatially dependent patterns of polarized light diffusely scattered or transmitted by scenes or samples.

Mueller imaging is defined as the distributed measurement of Mueller matrices attached to each pixel. This is made possible by the use of effective polarization state analyzers (PSAs) that acquire the Stokes vectors corresponding to each pixel in the image while a source lights up the scene through a polarization state generator (PSG). Different configurations using different polarization modulation techniques are used in imaging polarimetry. Whatever the technique used, the principle remains the same: a set of raw intensity images have to be inverted to yield polarization encoded images. This procedure called “polarimetric data reduction” (PDR) is achieved through a linear observation model that relates the two sets of images via the system dependent PSA and PSG matrices.

It is worth noting that a proper choice of the PSA and the PSG matrices plays an important role in PDR [2]. It ensures a good conditioning that allows the inversion of the system and minimizes the contamination of polarimetric channels by image noise. However, if one uses the classical PDR techniques, even for well behaved imaging polarimeters [2], the image noise propagates into the estimated Mueller matrices and may prevent their physical admissibility. Indeed, a 4×4 real matrix must satisfy some physical constraints to be bona fide.

The contribution of the present work is twofold. Firstly, we propose an original parameterization of the forward model in order to account for the physical admissibility conditions. This leads to a nonlinear observation model. Secondly, we propose a fully unsupervised Bayesian framework (model and algorithm) addressing the estimation problem. We suppose that the sought object is piecewise constant, which corresponds to many situations of practical interest. The Markov random field model originally proposed by Potts will account for the hidden segmentation map holding the spatial prior knowledge. We will also assume the Mueller matrices to be class wise constant. Estimation will take place in the generalized maximum likelihood framework where the a posteriori probability law is optimized jointly with respect to the parameters and the hyper-parameters.

The article is organized as follows: the nonlinear parameterization of the forward problem and the stochastic model are presented in section 2. The estimation procedure is detailed in section 3. Simulation results and real data processing are presented and discussed in section 4. Conclusion and perspectives are given in section 5.

2. Problem statement

2.1. Observation model in the Mueller imaging polarimetry framework

A Mueller imaging polarimeter allows the indirect measurement of the Mueller matrix attached to each pixel in the image. This is achieved by illuminating the object with q independent polarization states (q≥4) through a PSG. Each of these states interacts with the object to yield an outgoing Stokes vector that is analyzed through the PSA with p independent probing states (p≥4). For each pixel location (jy, jx) in the image, the p×q intensity measurements I are related to the pixel's Mueller matrix M by the following linear equation:

$$I(j_y,j_x) = A\,M(j_y,j_x)\,W, \tag{1}$$

where A is a p×4 PSA matrix whose rows constitute the p probing states and W is the 4×q PSG matrix whose columns represent the q incident polarization states.

It can be shown [2] that Eq. (1) can be written in the following form

$$\underline{I}(j_y,j_x) = (W^{t} \otimes A)\,\underline{M}(j_y,j_x) = P(\eta)\,\underline{M}(j_y,j_x), \tag{2}$$

where the underline symbol stands for the operator that maps a p×q matrix into a pq×1 vector formed by stacking up its columns, and ⊗ denotes the Kronecker product. The known pq×16 matrix P, also called the “Polarization Measurement Matrix” (PMM), depends on a parameter vector η that controls the combined actions of the PSG and PSA, and linearly relates the Mueller parameter vector to the raw measured intensity vector. The multi-component nature of the Mueller image is treated here by viewing it as the output of a multi-channel imaging system, the information being conveyed to the user as multi-channel image data. These channels will be indexed in the following, where appropriate, by jn ∈ [0,15].
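The vectorized form of Eq. (2) rests on the standard identity vec(AMW) = (Wᵗ ⊗ A) vec(M) with column stacking. A minimal numerical check of this identity, using toy random matrices (the sizes p = q = 6 and the values are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 6, 6                       # p probing states, q incident states
A = rng.normal(size=(p, 4))       # PSA matrix (rows: probing states)
W = rng.normal(size=(4, q))       # PSG matrix (columns: incident states)
M = rng.normal(size=(4, 4))       # a (not necessarily admissible) Mueller matrix

I = A @ M @ W                     # the p x q intensity measurements of Eq. (1)

P = np.kron(W.T, A)               # pq x 16 polarization measurement matrix
I_vec = P @ M.flatten(order="F")  # column stacking of M

# the stacked intensities match the stacked product A M W
assert np.allclose(I_vec, I.flatten(order="F"))
```

The same check works for any p, q ≥ 4; well-conditioned A and W are what allows the inversion discussed below.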

Regardless of the physical constraints that must be satisfied by the elements of M, we have the same starting point as in the Stokes imaging framework [1]: the two problems share the same observation model. Systematic calibration errors as well as noises that arise from the use of a CCD photosensitive device as a detector (readout noise, thermal noise, dark current noise, and photon noise among others) are accounted for in the same way as in [1], which leads to the complete observation model:

$$\underline{I}(j_y,j_x) = (1+\rho)\,P(\eta)\,\underline{M}(j_y,j_x) + \varepsilon(j_y,j_x), \tag{3}$$

where ρ accounts for the kinds of noise and errors whose variances are directly proportional to the measured intensity (see [1] for more details), while ε encompasses all remaining noise terms.

The problem is now to estimate M from the observations while ensuring the physical admissibility of the retrieved Mueller operators attached to each pixel. One can easily understand that the classical approach relying on the PMM pseudo-inverse matrix is not suited to this task: it necessitates a pixel-wise post-check of Mueller validity, which is time consuming and gives no proper estimate for pathological pixels.

We propose a new parameterization of the problem, yielding an observation model with simple constraints that allows tackling the estimation in a Bayesian framework. This comes at the cost of nonlinearity. The parameterization is achieved by observing that there is a one-to-one correspondence between a bona fide Mueller matrix M and a Hermitian positive semidefinite matrix H called “system coherency matrix”

$$H = \sum_{k,l=0}^{3} m_{kl}\,\big(\sigma_k \otimes \sigma_l^{*}\big), \tag{4}$$

where $\{\sigma_k\}_{k=0}^{3}$, comprising the identity matrix and the Pauli matrices, form a basis of $\mathbb{C}^{2\times 2}$, and $m_{kl}$ are the elements of M [3]. The linear relation (4) leads to the following expression:

$$\underline{M} = T\,\underline{H}, \tag{5}$$

where T is a 16×16 invertible complex transformation matrix relating the stacked Mueller matrix to the stacked system’s coherency matrix.

It was stated that M is physically acceptable if and only if H is positive semi-definite [3]. Furthermore, if H is positive semidefinite, it can be written in terms of its Cholesky decomposition as [4]

$$H = \Lambda\,\Lambda^{\dagger}, \tag{6}$$

where Λ is a lower triangular matrix composed of 16 real parameters (λi)

$$\Lambda = \begin{pmatrix} \lambda_0 & 0 & 0 & 0 \\ \lambda_4 + i\lambda_5 & \lambda_1 & 0 & 0 \\ \lambda_{10} + i\lambda_{11} & \lambda_6 + i\lambda_7 & \lambda_2 & 0 \\ \lambda_{14} + i\lambda_{15} & \lambda_{12} + i\lambda_{13} & \lambda_8 + i\lambda_9 & \lambda_3 \end{pmatrix}. \tag{7}$$

It is now possible to parameterize the elements of H by the {λi} and to substitute the expression of M̄ from Eq. (5) into the observation model of Eq. (3) to obtain

$$\underline{I}(j_y,j_x) = (1+\rho)\,P(\eta)\,T\,\underline{H}\big(\{\lambda_i\};\, j_y,j_x\big) + \varepsilon(j_y,j_x). \tag{8}$$

We must note that the observation model is now a nonlinear function of {λi} and that {λj, j∈[0,3]} must obey a positivity constraint to respect the positive semidefiniteness of H. The estimation problem now becomes the estimation of the real parameters {λi} used to construct H, hence yielding a physically acceptable Mueller matrix. We note also that the generalization to the case where the matrix P depends on the site (jy, jx) is straightforward.
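The construction above can be sketched numerically: any real 16-vector {λi} with λ0,…,λ3 ≥ 0 yields, through the lower-triangular factor Λ defined above, a coherency matrix H that is Hermitian positive semidefinite by construction. A minimal sketch with arbitrary parameter values:

```python
import numpy as np

def cholesky_factor(lmbda):
    """Lower-triangular factor built from 16 real parameters, laid out as above."""
    L = np.zeros((4, 4), dtype=complex)
    L[0, 0], L[1, 1], L[2, 2], L[3, 3] = lmbda[0:4]   # diagonal: must be >= 0
    L[1, 0] = lmbda[4]  + 1j * lmbda[5]
    L[2, 1] = lmbda[6]  + 1j * lmbda[7]
    L[3, 2] = lmbda[8]  + 1j * lmbda[9]
    L[2, 0] = lmbda[10] + 1j * lmbda[11]
    L[3, 1] = lmbda[12] + 1j * lmbda[13]
    L[3, 0] = lmbda[14] + 1j * lmbda[15]
    return L

rng = np.random.default_rng(1)
lmbda = rng.normal(size=16)
lmbda[:4] = np.abs(lmbda[:4])          # enforce the positivity constraint
L = cholesky_factor(lmbda)
H = L @ L.conj().T                     # coherency matrix: Hermitian PSD by construction

assert np.allclose(H, H.conj().T)                # Hermitian
assert np.all(np.linalg.eigvalsh(H) >= -1e-12)   # positive semidefinite
```

This is precisely why estimating the λi directly, instead of the Mueller elements, removes the need for any admissibility post-check.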

2.2. Underlying segmentation model

The underlying segmentation map will be modeled as a Markov random field, more precisely using Potts' model [5]. Let

$$I_0 = \big\{\, I_0(j_y,j_x)\;;\; I_0(j_y,j_x) \in [1,K],\; j_y \in [1,n],\; j_x \in [1,m] \,\big\} \tag{9}$$

denote the segmentation image, where K is the number of classes. The prior probability associated with I0 reads

$$p(I_0 \mid \beta) = \frac{1}{Z(\beta)}\,\exp\big(-\beta\,U(I_0)\big), \tag{10}$$

where

$$U(I_0) = -\sum_{t \sim s} \mathbb{I}\big(I_0(t) = I_0(s)\big) \tag{11}$$

is the abstract energy associated with the field and Z the partition function. Variables t and s denote pixel sites of the image, t ∼ s denotes neighboring sites (we used eight neighbors in the present work), and 𝕀(·) is the indicator function taking value 1 or 0 according to whether the condition between parentheses is true or false. This model favors piecewise constant images, the size of the patches being governed by the abstract inverse temperature parameter β. A key problem in using Eq. (10) as a penalty term is the computation of Z as a function of β. We rely here on thermodynamic integration, i.e., on the relation

$$\frac{d \log Z}{d\beta} = \frac{1}{Z}\frac{dZ}{d\beta} = \sum_{I_0} \frac{-U(I_0)\,\exp\big(-\beta\,U(I_0)\big)}{Z} = E\big[-U(I_0)\big], \tag{12}$$

where the expectation is taken with respect to p(I 0|β).

From a theoretical point of view, we have

$$\log Z = nm \log K + \int_0^{\beta} E\big[-U(I_0)\big]\, d\beta'. \tag{13}$$

From a practical point of view, values of E[-U(I0)] are computed using Markov chain Monte Carlo approximations for different values of β. The corresponding set of points is linearly interpolated, thus defining an approximation of d log Z/dβ [6], which is then integrated to give an approximation of log Z. Let us mention that the samples of I0 are generated by alternating Gibbs sampling (pixel-flipping) raster scans of the image with Swendsen-Wang iterations (cluster-flipping); the role of the latter is to improve the mixing properties of the Markov chain. We will not further detail this standard procedure, the reader being referred to [5].
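The thermodynamic-integration identity can be checked exactly on a toy grid, where the partition function is computable by brute-force enumeration instead of MCMC. This sketch uses a 2×2 image with K = 2 labels and the 8-neighborhood; all sizes are illustrative, not those used in the paper:

```python
import numpy as np
from itertools import product

K, nm = 2, 4   # 2 labels, 2x2 grid

def U(I0):
    """U(I0) = -sum over neighboring pairs (8-neighborhood) of 1[equal labels]."""
    u  = np.sum(I0[:, :-1] == I0[:, 1:])      # horizontal pairs
    u += np.sum(I0[:-1, :] == I0[1:, :])      # vertical pairs
    u += np.sum(I0[:-1, :-1] == I0[1:, 1:])   # diagonal pairs
    u += np.sum(I0[:-1, 1:] == I0[1:, :-1])
    return -int(u)

# enumerate all K^(nm) = 16 configurations exactly
configs = [np.array(c).reshape(2, 2) for c in product(range(K), repeat=nm)]
energies = np.array([U(c) for c in configs], dtype=float)

def logZ_exact(beta):
    return float(np.log(np.sum(np.exp(-beta * energies))))

def E_minus_U(beta):                          # d log Z / d beta, computed exactly
    w = np.exp(-beta * energies)
    return float(np.sum(-energies * w) / np.sum(w))

# log Z(beta) = nm log K + integral_0^beta E[-U] dbeta' (trapezoidal rule)
betas = np.linspace(0.0, 1.0, 201)
vals = np.array([E_minus_U(b) for b in betas])
logZ_ti = nm * np.log(K) + np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(betas))

assert abs(logZ_ti - logZ_exact(1.0)) < 1e-3
```

In the paper the expectations E[-U] come from MCMC samples, so the integrand carries Monte Carlo noise; the principle is identical.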

Let Φ be the parameter matrix defined as

$$\Phi = \begin{pmatrix} \lambda_0(1) & \cdots & \lambda_0(K) \\ \vdots & & \vdots \\ \lambda_{15}(1) & \cdots & \lambda_{15}(K) \end{pmatrix}, \tag{14}$$

where the numbers between parentheses stand for the class index. In the estimation procedure to be described, the role of the segmentation map I0 will be to index the appropriate column k of candidate Mueller class parameters {λi(k)} for pixels labeled as class k: I0(jy, jx) = k. This straightforwardly yields the parameter vector attached to such a pixel.

2.3. Noise model

The noise will be supposed to be independent and Gaussian distributed according to

$$\varepsilon(j_y,j_x,j_n) \mid I_0(j_y,j_x) \;\sim\; \mathcal{N}\!\left(\mu^{(j_n)}_{I_0(j_y,j_x)},\; \big(\sigma^{(j_n)}_{I_0(j_y,j_x)}\big)^2\right), \tag{15}$$

where $(\mu_k^{(j_n)}, \sigma_k^{(j_n)})$ are the noise parameters (mean and standard deviation) affecting the kth class in the jn-th channel image.

Let us define Θ as the set Θ={Θ(jn); jn∈[0,15]}, where

$$\Theta(j_n) = \begin{pmatrix} \mu_1^{(j_n)} & \mu_2^{(j_n)} & \cdots & \mu_K^{(j_n)} \\ \sigma_1^{(j_n)} & \sigma_2^{(j_n)} & \cdots & \sigma_K^{(j_n)} \end{pmatrix}. \tag{16}$$

All these parameters are unknown and will have to be estimated. This general model will in the sequel be restricted to the case of vanishing means. From a modelling point of view, the model with nonvanishing means suffers from an identifiability problem: modifying some or all of the λi's amounts to modifying the μj's, and we have, at least for now, no information (prior or measured) on which to base a decision. Furthermore, this assumption does not degrade the algorithm's performance, as observed in practice.

2.4. General probabilistic model

The overall probabilistic model may be depicted using a Directed Acyclic Graph (DAG) (see Fig. 1).

For the sake of clarity, we drop the dependency on K, since the only role of K is in defining dimensions. Considering the DAG, we may write:

$$p(I, \Phi, I_0 \mid \rho, \Theta, \beta, \sigma_\lambda) = p(I \mid \Phi, I_0, \rho, \Theta)\; p(\Phi \mid \sigma_\lambda)\; p(I_0 \mid \beta). \tag{17}$$

The component terms on the right-hand side are specified as follows:

• p(I0|β) is established by Eq. (10) and section 2.2;


Fig. 1. The directed acyclic graph representing the dependence relations between the variables. According to usual conventions, square boxes represent fixed or observed quantities whereas circles represent quantities to be estimated. More precisely, β is the abstract inverse temperature parameter of the Markov random field, σλ is the standard deviation of the λi's, ρ is related to the quantum efficiency of the camera, I0 is the underlying class segmentation map, Φ is the Mueller parameter matrix encompassing the λi(k)'s, Θ is the matrix encompassing the parameters of the noise distribution, and I is the intensity observations matrix.


• p(Φ|σλ) is defined in a class-independent manner as follows:

$$\begin{cases} \lambda_i \sim \mathcal{N}(0, \sigma_\lambda^2)\,\mathbb{I}(\lambda_i \ge 0), & i \in [0,3], \\ \lambda_i \sim \mathcal{N}(0, \sigma_\lambda^2), & i \in [4,15]. \end{cases} \tag{18}$$

The variance σλ² is chosen sufficiently large to have no influence. In other words, the prior p(Φ|σλ) is flat compared to the likelihood p(I|Φ, I0, ρ, Θ) when the λi(k) vary. Thus the prior p(Φ|σλ) has no influence besides imposing the positivity of λ0,…,λ3 when estimating Φ. The variance σλ² may be estimated using physical considerations or from a set of I's (from this set, we compute the H's in a pseudo-inverse framework and estimate the λi's by minimizing ‖H − ΛΛ†‖²). This would give a set of λi's from which σλ² could be derived;

• the data-dependent term p(I|Φ, I0, ρ, Θ) is obtained by solving Eq. (8) for ε and then using Eq. (15) to express the (joint) probability density function of ε, assuming that the pixels I(jy, jx) are independent conditioned on I0:

$$p(I \mid \Phi, I_0, \rho, \Theta) = p(\varepsilon) = p\Big(\underline{I} - (1+\rho)\,P(\eta)\,T\,\underline{H}(\{\lambda_i\})\Big) = \prod_{j_y,j_x,j_n} \mathcal{N}\!\left(\zeta(j_y,j_x,j_n)\,;\; \big(\sigma^{(j_n)}_{I_0(j_y,j_x)}\big)^2\right), \tag{19}$$

where

$$\zeta(j_y,j_x,j_n) = I(j_y,j_x,j_n) - \Big[(1+\rho)\,P(\eta)\,T\,\underline{H}\big(\Phi[I_0(j_y,j_x)];\, j_y,j_x\big)\Big]_{j_n}. \tag{20}$$

In other words, the λi(k) are indexed from Φ (per Eq. (14)) based on the class index of the pixel, which in turn is looked up from the segmentation map I0.
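The lookup and residual computation can be sketched as follows. Toy sizes, random stand-ins for P(η)T and the observations, and the helper `h_bar` (which mirrors the Cholesky construction of section 2.1) are all illustrative, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
K, n, m = 3, 4, 4                          # classes, toy image size
pq = 36                                    # p*q intensity measurements per pixel

Phi = rng.normal(size=(16, K))             # candidate class parameters (toy values)
I0 = rng.integers(1, K + 1, size=(n, m))   # segmentation map, labels in [1, K]
PT = rng.normal(size=(pq, 16))             # real-valued stand-in for P(eta) @ T
rho = 0.02
I = rng.normal(size=(n, m, pq))            # observed intensities (toy)

def h_bar(lmbda):
    """Column-stacked coherency matrix built from the 16 Cholesky parameters."""
    L = np.zeros((4, 4), dtype=complex)
    L[0, 0], L[1, 1], L[2, 2], L[3, 3] = lmbda[0:4]
    L[1, 0] = lmbda[4]  + 1j * lmbda[5]
    L[2, 1] = lmbda[6]  + 1j * lmbda[7]
    L[3, 2] = lmbda[8]  + 1j * lmbda[9]
    L[2, 0] = lmbda[10] + 1j * lmbda[11]
    L[3, 1] = lmbda[12] + 1j * lmbda[13]
    L[3, 0] = lmbda[14] + 1j * lmbda[15]
    return (L @ L.conj().T).flatten(order="F")

zeta = np.empty((n, m, pq))
for jy in range(n):
    for jx in range(m):
        k = I0[jy, jx]                                        # class label lookup
        model = (1 + rho) * (PT @ h_bar(Phi[:, k - 1])).real  # predicted intensities
        zeta[jy, jx] = I[jy, jx] - model                      # residual
```

Pixels sharing a label share the same model term, which is what makes the class-wise parameterization so economical.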

3. Estimation procedure

All available information is contained in distribution (17), from which we will derive the estimates Φ̂, Î0, ρ̂, Θ̂, β̂. Let us emphasize that this method is fully unsupervised: there is no free parameter to be chosen or tuned by the user, aside from the number K of classes (see discussion below).

Let {ρ,Θ,β} be the set of hyperparameters (i.e., the parameters which are not directly of interest), whereas the parameters are I 0 and Φ. The parameters and hyperparameters will be obtained simultaneously by joint optimization of distribution (17), thus corresponding to generalized maximum likelihood estimation whereas maximum a posteriori would correspond to optimizing the distribution with respect to the parameters for given values of the hyperparameters.

To address the optimization problem, we resort to a Gauss-Seidel scheme: in the global loop, we consider all variables in turn and optimize −log p(I, Φ, I0 | ρ, Θ, β, σλ) with respect to the current variable while the others are held fixed. The update of I0 is achieved site by site in a raster scan. The update of Φ is achieved component by component (i.e., we loop on i and k and optimize with respect to λi(k)). Convergence is reached when I0 is not modified by an entire raster scan (in fact, we monitor that I0 is held constant over several raster scans, 10 in the present case, see the counter<10 condition in Algorithm 2-b: even when I0 is held constant, the other variables may still vary, if only a little, which in turn may induce a modification of I0).

Optimization with respect to λi(k) deserves particular comment. This optimization involves a fourth-degree polynomial obtained by extracting the corresponding terms from −log p(I|Φ, I0, ρ, Θ) − log p(Φ|σλ). This is handled using symbolic computation: the coefficients of the polynomial are precomputed analytically and the local minima are computed given the numerical values of the coefficients. The updated value of λi(k) is sought among a) the arguments of the local minima when no positivity constraint is involved, or b) the positive arguments of the local minima and zero when a positivity constraint is involved.
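The coordinate update can be sketched as follows: minimizing a quartic with positive leading coefficient amounts to evaluating it at the real roots of its cubic derivative (plus zero under a positivity constraint). The helper name and the test coefficients below are illustrative, not taken from the paper:

```python
import numpy as np

def minimize_quartic(coeffs, positive=False):
    """Minimize c0 + c1*x + c2*x^2 + c3*x^3 + c4*x^4 (c4 > 0) in closed form.

    Candidate minimizers are the real roots of the cubic derivative;
    under a positivity constraint, only nonnegative candidates and 0 are kept.
    """
    poly = np.polynomial.Polynomial(coeffs)
    roots = poly.deriv().roots()
    cands = [r.real for r in roots if abs(r.imag) < 1e-10]
    if positive:
        cands = [r for r in cands if r >= 0] + [0.0]
    return min(cands, key=poly)   # evaluate the quartic at each candidate

# (x^2 - 1)^2 = 1 - 2 x^2 + x^4: global minima at x = +/- 1
x = minimize_quartic([1.0, 0.0, -2.0, 0.0, 1.0])
assert abs(abs(x) - 1.0) < 1e-8
# with the positivity constraint, the minimum at +1 is selected
assert abs(minimize_quartic([1.0, 0.0, -2.0, 0.0, 1.0], positive=True) - 1.0) < 1e-8
```

Because the derivative is a cubic, at most three candidates have to be compared, so each coordinate update is cheap and exact.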

The present optimization problem being highly multimodal, particularly with respect to I 0 when the noise level is important, we suggest using the algorithm detailed below which behaves better than straightforward and sole Gauss-Seidel descent.

At first, twenty different runs of the K-means clustering procedure are applied to the image set of m×n intensity channel vectors, yielding different class segmentation maps. The best resulting map, characterized by minimal distances between the cluster members and their centroids, is retained as the initial value of I0 and used to initialize the DAG with the procedure detailed in Algorithm 1-a and Algorithm 1-b. In this phase, a whole scan of I0 may lead to numerous modifications, since the state of the DAG is far from its final (equilibrium) state. We want to avoid such a large movement without updating the rest of the DAG (i.e., all variables but I0), since it may lead to local optima far from the solution. Thus, we update the other variables of the DAG as soon as a given (reduced) number of site updates is achieved (Algorithm 1-a). This allows reaching a state much closer to equilibrium than the initial state. We then switch to entire raster scans (Algorithm 1-b), on which we loop until convergence. The role of the mask is to improve the speed of the procedure: the new mask (see the instruction update the mask) is composed of all sites that have just been modified together with their neighbors. At this stage, the state is not optimal: a better solution (in the sense of a lower criterion value) may exist that cannot be reached by updating single sites at a time. This is why we resort to a multiresolution optimization scheme (Algorithm 2-a), where the sites of I0 are modified blockwise. The other parameters of the DAG remain fixed in this phase, since they are close to their final values. We conclude with a standard Gauss-Seidel procedure (Algorithm 2-b) until convergence.
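The initialization step can be sketched as a best-of-twenty K-means restart, keeping the map with minimal within-cluster distance. Toy data and a plain Lloyd implementation stand in for the real per-pixel intensity channel vectors:

```python
import numpy as np

def kmeans(X, K, iters=50, rng=None):
    """Plain Lloyd iterations; returns labels and total within-cluster distance."""
    rng = rng or np.random.default_rng()
    centers = X[rng.choice(len(X), size=K, replace=False)]   # random init
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)                            # assignment step
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)     # update step
    inertia = float(np.sum((X - centers[labels]) ** 2))
    return labels, inertia

rng = np.random.default_rng(0)
# toy "intensity channel vectors": two well-separated blobs of 50 pixels each
X = np.vstack([rng.normal(0, 0.1, (50, 3)), rng.normal(5, 0.1, (50, 3))])

# twenty restarts; keep the map with minimal within-cluster distance
runs = [kmeans(X, K=2, rng=rng) for _ in range(20)]
labels, inertia = min(runs, key=lambda r: r[1])
```

The multiple restarts matter because a single Lloyd run can stall in a poor local optimum, exactly the failure mode the initialization phase is designed to avoid.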


Algorithm 1: Optimization phase 1: initialization.


4. Algorithm application

4.1. Simulation results

The aim of this section is to compare the estimated Mueller images resulting from different approaches (K-means, pseudo-inverse, Bayesian) against a ground truth. We synthesized a nontrivial 128-pixels×128-pixels Mueller image using the four-class label map shown in Fig. 2. A different Mueller matrix was assigned to each label, as shown in Table 1. For the Bayesian approach, we considered two Markov random field models: Potts' model and the Markov mesh (unilateral) model considered in our previous work addressing the Stokes imaging case.

The PMM of an optimal rotating-retarder Mueller polarimeter, using the optimal parameter vector ηopt defined in Ref. [2], was used to generate the corresponding intensity images by means of Eq. (2). The hypotheses and arguments about the nature of the image noise introduced in our previous paper [1] remain valid here. We thus assume that ε follows a centered Gaussian distribution: a zero-mean white Gaussian noise of variance σn² was added to each intensity image, providing the necessary inputs to our model, i.e., the intensity images as well as the PMM P. We present simulation results carried out with two different noise variances (σn² = 0.1 and σn² = 0.5), which correspond respectively to signal-to-noise ratios over the mean transmittance image (first Mueller channel) of about 10 dB and 3 dB.


Algorithm 2: Optimization phase 2: refinement. In algorithm (a), all pixels of a given block are forced to the same label.


The 16 channels of the Mueller image obtained by the classical pseudo-inverse solution are shown in Fig. 3. The results given by the different approaches are summarized in Tables 2 and 3. Table 2 shows the percentage of misclassified pixels in the image. Table 3 presents the relative estimation error of the Mueller matrices associated with each class, defined as

$$\mathrm{RE} = \frac{\big\| M_{\text{true}} - M_{\text{estimated}} \big\|}{\big\| M_{\text{true}} \big\|}. \tag{21}$$

The Frobenius norm is used in the definition of RE.
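With the Frobenius norm, RE is a one-liner; a quick check on toy matrices (the values below are illustrative only):

```python
import numpy as np

def relative_error(M_true, M_est):
    """RE = ||M_true - M_est||_F / ||M_true||_F."""
    return np.linalg.norm(M_true - M_est) / np.linalg.norm(M_true)

M_true = np.eye(4)
M_est = np.eye(4) + 0.01 * np.ones((4, 4))
# ||0.01 * ones||_F = 0.01 * 4 = 0.04 and ||I||_F = 2, so RE = 0.02
assert abs(relative_error(M_true, M_est) - 0.02) < 1e-12
```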

We also calculated the mean RE value for the 3 dB case over the whole image for the classical pseudo-inverse PDR approach as well as for the Bayesian one (Potts' model), and obtained 110% for the former and 2.2% for the latter. When the Mueller matrix is estimated with the pseudo-inverse, 90% (resp. 99%) of the pixels are not physically admissible for σn² = 0.1 (resp. σn² = 0.5). Moreover, for the pseudo-inverse estimate, several Mueller channels display no information (see for example column 3 and row 3 in Fig. 3), whereas structural information appears in all channels for the Bayesian estimate.

Fig. 2. 128-pixels×128-pixels mask used to generate the synthetic Mueller images. Each color (white, light gray, heavy gray and black) corresponds to one label.

Table 1. The four different matrices assigned to the classes of the label map of Fig. 2. Matrix (a) was assigned to the black pixels, matrix (b) to the heavy gray pixels, matrix (c) to the light gray pixels and matrix (d) to the white pixels in the label map.

We conclude that the introduced Bayesian approach (Potts' model) gives better results in terms of estimated quantities and structural information (composition) of the imaged scene as compared to the other approaches (pseudo-inverse and K-means). Moreover, the Markov random field prior plays an important role (see Table 2, where the results degrade significantly for β = 0). The two Bayesian variants (Potts' model and the unilateral model) give nearly the same results in terms of RE, but Potts' model is noticeably better in terms of misclassified pixels (Table 2). Besides, further simulations showed that Potts' model is more robust than the unilateral one with respect to increasing noise variance.

In real situations, for some limited scenarios, the optimal number of classes (Kopt) can only be roughly estimated; it usually cannot be inferred by simple observation. Our approach is able to handle this issue. To determine Kopt, the optimum of the law in Eq. (17) is plotted against K (i.e., we plot −log p(I, Φ̂, Î0 | ρ̂, Θ̂, β̂, σλ) + constant against K). We chose for Kopt the value at which the graph levels off, as shown in Fig. 4. In a fully Bayesian framework, we would have Kopt = argmaxK p(I|K); this would require resorting to Markov chain Monte Carlo methods, which is outside the scope of this paper and left for future work.

Fig. 3. The 16 channels of the synthetic Mueller image obtained by the commonly used pseudo-inverse estimate, for noise of variance σn² = 0.5.

Table 2. Mean percentage of misclassified pixels for the Bayesian approach (with (β̂) and without (β = 0) Markov random field prior) and for K-means clustering applied to the intensity images. Averaging has been done over noise realizations. Numbers between parentheses represent standard deviations of the related quantities.

Running the method takes 4 minutes (resp. 30 minutes) for K = 4 (resp. K = 8) classes (128×128 image, 2.8 GHz standard desktop computer, programmed in C). This important variation in CPU time is due to an increase in the number of iterations.

4.2. Results against measured data

The proposed model is applied to real measured images acquired by an inline rotating quarter-wave plate Mueller imaging polarimeter. The object used to carry out these measurements consists of two dichroic patches (Polaroid) and a transparent thin film (cellophane) sandwiched between two coverglasses. The overall mount is imaged through a narrow band interferential filter (10 nm) and a PSA in front of a 12-bit radiometric CCD camera. We point out that the dichroic patches are placed such that their transmission axes are nearly orthogonal. Totally polarized incident light undergoes multiple reflections between the two coverglasses and escapes highly depolarized, such that the polarization signature is expected to be relatively weak. This situation was chosen to represent a hard test case that can be encountered in many practical situations, such as polarization imaging of biological tissues.

Table 3. Mean values of the relative error RE (in percent) of the four Mueller matrices corresponding to the four classes of Fig. 2. Averaging has been done over noise realizations. The first values correspond to σn² = 0.1, the second values to σn² = 0.5. K-means was used to derive a segmentation map over which the pseudo-inverse estimate was class-wise averaged. We ensured the physical admissibility conditions over those averages according to $\hat{\underline{M}}_k = T\,\hat{\underline{H}}_k$, with $\hat{H}_k = \hat\Lambda_k \hat\Lambda_k^{\dagger}$ and $\hat\Lambda_k = \arg\min_\Lambda \big\| \tilde{\underline{M}}_k - T\,\underline{\Lambda\Lambda^{\dagger}} \big\|^2$, the minimization being achieved under the admissibility conditions.

Fig. 4. The optimal number of classes for the simulated data. The graph corresponds to σn² = 0.5.

The Mueller image obtained by classical PDR of this object is shown in Fig. 5. We first observe that while conventional intensity imaging provides little to no information about the object (the m00 channel of the Mueller image), the other Mueller channels provide a rich enhancement over classical imaging. We also observe that the Mueller image is highly contaminated by noise that hampers the physical interpretation of the interaction mechanisms from which the object properties could be inferred.

The algorithm output is given in Fig. 6 and in Table 4. We point out that all retrieved Mueller matrices are physically valid. This can be verified easily by computing the eigenvalue spectra of their corresponding coherency matrices: all the eigenvalues are real and nonnegative. On the other hand, Fig. 7 shows the variation of the optimum of Eq. (17) versus the number of classes. The shape of this curve indicates that the optimal number of classes is here 5, in complete agreement with the a priori knowledge.

Fig. 5. The Mueller image of the test object as obtained from classical pseudo-inverse PDR. The image in the upper left (m00) element corresponds to a conventional intensity image.

Fig. 6. The final map obtained by the Bayesian approach (Potts' model). Each gray level corresponds to a different class associated with a different Mueller matrix. The circular cross section delimits the footprint of the illuminating beam. Classes (2) and (3) are the dichroic patches and class (4) corresponds to the thin cellophane film.

Table 4. The five retrieved Mueller matrices that correspond to the label map output of our algorithm. These matrices have to be interpreted with the class numbering shown in Fig. 6.

Fig. 7. The optimal number of classes for the true image. We observe that the graph is consistent with the a priori knowledge of the object.

Once the Mueller matrices are estimated, one can focus on the physical interpretation of the results. Several analysis tools are available in the literature, among others the coherency matrix decomposition approach [3] and the polar decomposition [7]. For this image we observe that the five obtained matrices depolarize a large amount of any incoming polarized state, which is in agreement with the physical situation. We can also compute the product of the two matrices M2 and M3 corresponding to the second and third classes (which are weak polarizers oriented nearly orthogonally to each other) to obtain

$$M_2 M_3 = \begin{pmatrix} 1 & 0.03 & 0.01 & 0.00 \\ 0.05 & 0.00 & 0.00 & 0.00 \\ 0.02 & 0.00 & 0.00 & 0.00 \\ 0.00 & 0.00 & 0.00 & 0.00 \end{pmatrix}, \tag{22}$$

which is very close to a pure depolarizer in accordance with the physics of the object (transmitting axes nearly orthogonal).

Finally, let us mention that running the algorithm takes 2 hours for K = 4 classes (256×256 image, 2.8 GHz standard desktop computer, programmed in C).

5. Conclusion and perspectives

In this paper, we present a polarimetric data reduction method addressing piecewise constant images in the Mueller imaging framework. The proposed Bayesian method enforces the physical admissibility conditions while providing a robust clustering of the estimated images. The results obtained are significantly better and more robust than those given by classical pseudo-inverse or K-means estimation. This framework should enable researchers to enhance and deepen their understanding of the object's constituents. Extension of the approach to smoothly varying polarization signatures is under investigation.

The software that processes Stokes and Mueller images is available on request for interested readers wishing to process their own images.

Acknowledgement

The authors are indebted to two anonymous reviewers for helpful comments and suggestions and acknowledge the financial support of the Agence Nationale de la Recherche (ANR Polarimetry).

References and links

1. J. Zallat and Ch. Heinrich, “Polarimetric data reduction: a Bayesian approach,” Opt. Express 15, 83–96 (2007), http://www.opticsinfobase.org/abstract.cfm?URI=oe-15-1-83.

2. J. Zallat, S. Aïnouz, and M.-P. Stoll, “Optimal configurations for imaging polarimeters: impact of image noise and systematic errors,” J. Opt. A: Pure Appl. Opt. 8, 807–814 (2006).

3. S. Cloude and E. Pottier, “Concept of polarization entropy in optical scattering: polarization analysis and measurement,” Opt. Eng. 34, 1599–1610 (1995).

4. G. Golub and C. Van Loan, Matrix Computations, 3rd ed. (The Johns Hopkins University Press, 1996).

5. G. Winkler, Image Analysis, Random Fields and Markov Chain Monte Carlo Methods: A Mathematical Introduction, 2nd ed. (Springer, 2006).

6. P. Green and S. Richardson, “Hidden Markov models and disease mapping,” J. Am. Stat. Assoc. 97, 1055–1070 (2002).

7. S.-Y. Lu and R. Chipman, “Interpretation of Mueller matrices based on polar decomposition,” J. Opt. Soc. Am. A 13, 1106–1113 (1996).

Cited By

Optica participates in Crossref's Cited-By Linking service. Citing articles from Optica Publishing Group journals and other participating publishers are listed here.

Alert me when this article is cited.


Figures (9)

Fig. 1. The directed acyclic graph representing the dependence relations between the variables. Following the usual conventions, square boxes represent fixed or observed quantities, whereas circles represent quantities to be estimated. More precisely, $\beta$ is the abstract inverse-temperature parameter of the Markov random field, $\sigma_\lambda$ is the standard deviation of the $\lambda_i$'s, $\rho$ is related to the quantum efficiency of the camera, $I_0$ is the underlying class segmentation map, $\Phi$ is the Mueller parameter matrix encompassing the $\lambda_i(k)$'s, $\Theta$ is the matrix encompassing the parameters of the noise distribution, and $I$ is the matrix of intensity observations.

Algorithm 1: Optimization phase 1: initialization.

Algorithm 2: Optimization phase 2: refinement. In algorithm (a), all pixels of a given block are forced to the same label.

Fig. 2. The 128 × 128 pixel mask used to generate the synthetic Mueller images. Each color (white, light gray, dark gray, and black) corresponds to one label.

Fig. 3. The 16 channels of the synthetic Mueller image obtained with the commonly used pseudo-inverse estimate, for noise of variance $\sigma_n^2 = 0.5$.

Fig. 4. The optimal number of classes for the simulated data. The graph corresponds to $\sigma_n^2 = 0.5$.

Fig. 5. The Mueller image of the test object as obtained from classical pseudo-inverse PDR. The upper-left element ($m_{00}$) corresponds to a conventional intensity image.

Fig. 6. The final map obtained by the Bayesian approach (Potts model). Each gray level corresponds to a different class, associated with a different Mueller matrix. The circular cross section delimits the footprint of the illuminating beam. Classes (2) and (3) are the dichroic patches, and class (4) corresponds to the thin cellophane film.

Fig. 7. The optimal number of classes for the real image. The graph is consistent with the a priori knowledge of the object.

Tables (4)

Table 1. The four different Mueller matrices assigned to the classes of the label map of Fig. 2. Matrix (a) was assigned to the black pixels, matrix (b) to the dark gray pixels, matrix (c) to the light gray pixels, and matrix (d) to the white pixels in the label map.

Table 2. Mean percentage of misclassified pixels for the Bayesian approach (with ($\hat\beta$) and without ($\beta = 0$) the Markov random field prior) and for K-means clustering applied to the intensity images. Averaging was done over noise realizations. Numbers in parentheses are the standard deviations of the related quantities.

Table 3. Mean values of the relative error RE (in percent) of the four Mueller matrices corresponding to the four classes of Fig. 2. Averaging was done over noise realizations. The first values correspond to $\sigma_n^2 = 0.1$, the second to $\sigma_n^2 = 0.5$. K-means clustering was used to derive a segmentation map over which the pseudo-inverse estimate was class-wise averaged, yielding the $\tilde{\bar M}_k$'s. Physical admissibility was enforced on those averages according to $\hat{\bar M}_k = T \hat{\bar H}_k$, with $\hat H_k = \hat\Lambda_k \hat\Lambda_k^\dagger$ and $\hat\Lambda_k = \arg\min_\Lambda \| \tilde{\bar M}_k - T\, \overline{\Lambda \Lambda^\dagger} \|^2$, the minimization being achieved under the admissibility conditions.

Table 4. The five retrieved Mueller matrices corresponding to the label map output by our algorithm. These matrices are to be interpreted with the class numbering shown in Fig. 6.

Equations (22)


$$I(j_y, j_x) = A\, M(j_y, j_x)\, W,$$
$$\bar I(j_y, j_x) = (W^t \otimes A)\, \bar M(j_y, j_x) = P(\eta)\, \bar M(j_y, j_x),$$
$$\bar I(j_y, j_x) = (1+\rho)\, P(\eta)\, \bar M(j_y, j_x) + \varepsilon(j_y, j_x),$$
$$H = \sum_{k,l=0}^{3} m_{kl}\, \big[\sigma_k \otimes \sigma_l^{*}\big],$$
$$\bar M = T \bar H,$$
$$H = \Lambda \Lambda^{\dagger},$$
$$\Lambda = \begin{bmatrix}
\lambda_0 & 0 & 0 & 0 \\
\lambda_4 + i\lambda_5 & \lambda_1 & 0 & 0 \\
\lambda_{10} + i\lambda_{11} & \lambda_6 + i\lambda_7 & \lambda_2 & 0 \\
\lambda_{14} + i\lambda_{15} & \lambda_{12} + i\lambda_{13} & \lambda_8 + i\lambda_9 & \lambda_3
\end{bmatrix}.$$
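The point of this Cholesky writing is that any choice of the sixteen real $\lambda_i$'s (with non-negative diagonal entries) yields, by construction, a Hermitian positive semidefinite coherency matrix, so the admissibility constraints need not be enforced explicitly. A minimal NumPy sketch of the parameterization (the ordering of the $\lambda_i$'s follows the matrix above; the helper name is ours):

```python
import numpy as np

def coherency_from_lambdas(lam):
    """Build the lower-triangular Cholesky factor Lambda from the 16 real
    parameters lambda_0..lambda_15 (ordered as in the matrix above) and
    return the coherency matrix H = Lambda Lambda^dagger."""
    l = list(lam)
    Lam = np.array([
        [l[0],             0,                0,               0],
        [l[4] + 1j*l[5],   l[1],             0,               0],
        [l[10] + 1j*l[11], l[6] + 1j*l[7],   l[2],            0],
        [l[14] + 1j*l[15], l[12] + 1j*l[13], l[8] + 1j*l[9],  l[3]],
    ])
    return Lam @ Lam.conj().T

# Any real lambda vector (diagonal entries non-negative) gives an
# admissible, i.e. Hermitian positive semidefinite, coherency matrix.
H = coherency_from_lambdas(np.arange(16, dtype=float))
assert np.allclose(H, H.conj().T)              # Hermitian
assert np.min(np.linalg.eigvalsh(H)) >= -1e-9  # positive semidefinite
```

This is why the estimation can be carried out over unconstrained (up to sign on the diagonal) real parameters, at the cost of the nonlinearity mentioned in the abstract.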
$$\bar I(j_y, j_x) = (1+\rho)\, P(\eta)\, T\, \bar H(\{\lambda_i\};\, j_y, j_x) + \varepsilon(j_y, j_x).$$
$$I_0 = \big\{\, I_0(j_y, j_x)\; ;\; I_0(j_y, j_x) \in [1, K],\; j_y \in [1, n],\; j_x \in [1, m] \,\big\}$$
$$p(I_0 \mid \beta) = \frac{1}{Z(\beta)} \exp\big(\beta\, U(I_0)\big),$$
$$U(I_0) = \sum_{t \sim s} \mathbb{1}\big(I_0(t) = I_0(s)\big)$$
$$\frac{d \log Z}{d\beta} = \frac{1}{Z}\frac{dZ}{d\beta} = \frac{\sum_{I_0} U(I_0)\, \exp\big(\beta\, U(I_0)\big)}{Z} = E[U(I_0)],$$
$$\log Z = nm \log K + \int_0^\beta E[U(I_0)]\, d\beta'.$$
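The Potts energy simply counts the neighboring pixel pairs that carry the same label. A short sketch, assuming the usual 4-connectivity neighborhood system (the exact neighborhood convention used in the paper is an assumption here):

```python
import numpy as np

def potts_energy(labels):
    """U(I0): number of neighbouring pixel pairs t~s carrying the same
    label, i.e. the sum of indicators in the Potts energy above."""
    same_h = np.sum(labels[:, 1:] == labels[:, :-1])  # horizontal pairs
    same_v = np.sum(labels[1:, :] == labels[:-1, :])  # vertical pairs
    return int(same_h + same_v)

I0 = np.array([[1, 1, 2],
               [1, 2, 2]])
print(potts_energy(I0))  # → 4 (two horizontal and two vertical equal pairs)
```

Larger $U(I_0)$ means more homogeneous label maps, so increasing $\beta$ favors piecewise constant segmentations.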
$$\Phi = \begin{bmatrix}
\lambda_0(1) & \cdots & \lambda_0(K) \\
\vdots & & \vdots \\
\lambda_{15}(1) & \cdots & \lambda_{15}(K)
\end{bmatrix},$$
$$\varepsilon(j_y, j_x, j_n) \mid I_0(j_y, j_x) \sim \mathcal{N}\big(\mu_{I_0(j_y,j_x)}(j_n),\, (\sigma_{I_0(j_y,j_x)}(j_n))^2\big),$$
$$\Theta(j_n) = \begin{bmatrix}
\mu_1(j_n) & \mu_2(j_n) & \cdots & \mu_K(j_n) \\
\sigma_1(j_n) & \sigma_2(j_n) & \cdots & \sigma_K(j_n)
\end{bmatrix}.$$
$$p(I, \Phi, I_0 \mid \rho, \Theta, \beta, \sigma_\lambda) = p(I \mid \Phi, I_0, \rho, \Theta)\, p(\Phi \mid \sigma_\lambda)\, p(I_0 \mid \beta).$$
$$\begin{cases}
\lambda_i \sim \mathcal{N}(0, \sigma_\lambda^2)\, \mathbb{1}(\lambda_i \geq 0) & i \in [0, 3], \\
\lambda_i \sim \mathcal{N}(0, \sigma_\lambda^2) & i \in [4, 15].
\end{cases}$$
$$p(I \mid \Phi, I_0, \rho, \Theta) = p(\varepsilon) = p\big(\bar I - (1+\rho)\, P(\eta)\, T\, \bar H(\{\lambda_i\})\big) = \prod_{j_y, j_x, j_n} \mathcal{N}\big(\zeta(j_y, j_x, j_n),\, (\sigma_{I_0(j_y,j_x)}(j_n))^2\big),$$
$$\zeta(j_y, j_x, j_n) = I(j_y, j_x, j_n) - \big[(1+\rho)\, P(\eta)\, T\, \bar H(\Phi[I_0(j_y, j_x)];\, j_y, j_x)\big][j_n].$$
$$\mathrm{RE} = \frac{\| M_{\mathrm{true}} - M_{\mathrm{estimated}} \|}{\| M_{\mathrm{true}} \|}.$$
$$M_2 - M_3 = \begin{bmatrix}
1 & 0.03 & 0.01 & 0.00 \\
0.05 & 0.00 & 0.00 & 0.00 \\
0.02 & 0.00 & 0.00 & 0.00 \\
0.00 & 0.00 & 0.00 & 0.00
\end{bmatrix},$$
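As a quick illustration of the error measure RE, assuming a Frobenius matrix norm (the specific norm is our assumption) and a hypothetical 2% uniform perturbation:

```python
import numpy as np

def relative_error(M_true, M_est):
    """RE = ||M_true - M_est|| / ||M_true||, with the Frobenius norm."""
    return np.linalg.norm(M_true - M_est) / np.linalg.norm(M_true)

# Hypothetical example: a 2% uniform perturbation of the identity matrix.
M_true = np.eye(4)
M_est = M_true + 0.02 * np.ones((4, 4))
print(f"RE = {100 * relative_error(M_true, M_est):.1f} %")  # → RE = 4.0 %
```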