## Abstract

In this work, an alternative route to analyze a set of coherency matrices associated to a medium is addressed by means of the Independent Component Analysis (*ICA*) technique. We highlight the possibility of extracting an underlying structure of the medium in relation to a model of constituent components. The medium is considered as a mixture of unknown constituent components weighted by unknown but statistically independent random coefficients of *thickness.* The *ICA* technique can determine the number of components necessary to characterize a set of sample of the medium. An estimate of the value of these components and their respective weights is also determined. Analysis of random matrices generated by multiplying random diattenuators and depolarizers is presented to illustrate the proposed approach and demonstrate its capabilities.

© 2011 OSA

## 1 - Introduction

Mueller matrix as linear mapping between the input and output Stokes vectors of light interacting with media, is a very powerful tool for polarimetric characterization of linear media. Though Mueller matrix may be viewed as a full representation of the polarimetric properties of a material medium, extracting these properties from the expression of the rough Mueller matrix, is not straightforward. This the reason while the Mueller matrices are usually interpreted by decomposing it into elementary components related to basic optical quantities (diattenuation, retardance and depolarizance for instance).

The most popularized technique is a factorization one based on the polar-decomposition. This decomposition of an arbitrary Mueller matrix was first proposed by Lu and Chipman [1]. Since this seminal article, a lot of papers have been published completing this approach. Some limitations of this decomposition have been underlined [2] and the *forward* and *reverse* decompositions concept has been proposed [3] to solve this question. Others works have tried to limit the “asymmetry” of the initial polar decomposition. A *symmetric* decomposition has been proposed [4]. This approach leads to a straightforward Mueller matrix interpretation in terms of an arrangement of five factors: a diagonal depolarizer stacked between two retarder and diattenuator pairs. Most of these works are in fact related to the question of the influence of the order in which the elementary matrices are multiplied, since they do not commute. Finding different values for the basic optical properties estimated from a single Mueller matrix is a straightforward consequence of this non commuting product. It follows that this set of decompositions is well appropriate to synthesize an optical system with a given Mueller matrix from classically available optical components. The issue of analysing the basic optical properties with this approach seems to be more problematic. There is no reason to presume that the birefringence effect for instance is placed first or even it is localised at a specific place in the medium without a priori knowledge on this medium under analysis.

A second approach to decompose a Mueller matrix into elementary components may be related to the work of Cloude [5] who demonstrated the possibility to decompose such a matrix as a convex sum of up to four Mueller-Jones matrices. This demonstration is related to the spectral decomposition theorem applied to the coherency matrix associated to the Mueller matrix. Coherency matrix **H** is related to the Mueller matrix elements m_{ij} and Kronecker product of Pauli matrices (**σ**_{i}) by:

It is worth noticing that we assimilate **H** to the *coherency matrix* although the classical definition is related to a matrix which differs from **H** by an unitary transformation. In fact, as demonstrated by Aiello [6], the definition of **H** is related to the basis of Χ^{4x4} that is considered.

Coherency matrix is thus another possibility of representation of the polarimetric properties of a material medium. But this parallel *incoherent* decomposition (the emerging light is a sum of several incoherent contributions) of coherency matrices is not the only one like it. Other parallel decompositions may be found in the review paper of Gil [7] for instance. Once again, this tool does not lead to a straightforward characterisation of optical properties of the medium since we only access to an equivalent parallel model.

The third approach to analyse the polarimetric properties of a medium is chronologically the first proposed method since it is related to the layered-medium interpretation proposed by Jones in the seventh paper of his series [8]. Introducing the notion of *N matrices* based on this layered-medium interpretation, Jones described the state of the light at every point *z* in an optical system along the light path but characterized the optical properties of the medium too. Later, Azzam extended this approach [9] to the partially polarized light propagating through non depolarizing media and introduced the *differential Mueller matrix* **m** related to the Mueller matrix at distance *z* into the medium by:

Azzam also derived the relations between the entries of **N** and **m** differential matrices for non depolarizing media. However, the formal relation between these both matrices was formulated by Barakat [10] from the concept of exponential versions of the Mueller-Jones matrices.

In a recent paper [11], we addressed the question of interpolation of the coherency matrix. For the denoted *Log-Euclidean* (*LE*) process, we demonstrated the physical meaning of this interpolation process is founded on a very similar approach to the layered-medium interpretation proposed by Jones. The medium associated to the coherency matrix **H**(*t*) is considered as a sandwich of thin laminae. Interpolating from **H**_{1}=**H**(0) to **H**_{2}=**H**(1) consists in changing the proportion of both the elements **D**_{1}=log(**H**_{1}) and **D**_{2}=log(**H**_{2}) constituting the thin laminae. The basic point that we wish to stress is that **D**_{1} and **D**_{2} may be considered as *constituent* components of the set of matrices **H**(*t*) and thus of the related media. This point is developed in the following and the questions of how determine these components and how many components are needed to describe a set of coherency matrices are addressed by means of *Independent Component Analysis* (*ICA*) technique.

The paper is thus organized as follows. After a reminder of definitions and relations associated with coherency matrix and *LE* process, we derive the hypothesis of our approach and describe the *ICA* processing over the space of coherency matrices. The case of random matrices generated by multiplying random diattenuators and depolarizers is eventually presented for illustrating the proposed approach.

## 2 – Coherency matrix space and LE metrics

#### 2.1. Mueller and coherency matrix definition

Following [12] [13], we define a Mueller matrix **M** as a *convex* sum of so-called Mueller-Jones matrices. From this definition, it is possible to extract from **M** a semi definite Hermitian matrix **H** that is related to the Mueller matrix elements m_{ij} by Eq. (1). If the set of coherency matrices is restricted to *HPD*(4), the manifold of Hermitian Positive Definite matrices of dimension 4, a Lie group structure can be associated to *HPD*(4) (see [14] for demonstration). The group product is defined by:

*exp*and

*log*are the classical matrix exponential and logarithm mappings respectively. It is clear that neither the Mueller-Jones matrices nor the Mueller matrices corresponding to a coherency matrix with 1 or 2 null eigenvalues are elements of

*HPD*(4). So, we do not take them into account in this paper.

#### 2.2. Log-Euclidean distance on coherency matrix space

When considering *HPD*(4) matrices as a smooth manifold, it is also possible to define a Riemannian framework on this set of matrices. This is the way used by Arsigny [14] to define the *LE* distance on Symmetric Positive Definite matrices. Proving that this distance can easily be extended to *HPD*(4) matrices and is well adapted to coherency matrix interpolation was done in [11].

Following the definition of the *LE* distance, interpolating from **H**_{1}=**H**(0) to **H**_{2}=**H**(1) consists in changing the proportion of both the elements **D**_{1}=log(**H**_{1}) and **D**_{2}=log(**H**_{2}) constituting the thin laminae (see [11] for more details). This medium is thus characterized by an infinitesimal generator **D** = (1-*t*).**D**_{1}+*t*.**D**_{2}. This is a sandwich of both the previous media with proportions (1-*t*) and *t* respectively. It is worth noticing that this result is independent of the order in which both the sections are placed since matrix summation is commutative. Thus constituent components are not localised into the medium but may be regarded as distributed contributions along the light path.

This approach can be extended when defining the *logarithmic scalar multiplication* of a *HPD* matrix **H** by a scalar α∈Ρ in the same way as Arsigny proposed for symmetric matrix [14]. This multiplication is defined by:

With both the operations defined by Eq. (3) and (4), **V** = (*HPD*(4), ⊕, •) can be viewed as a vector space when a *HDP* matrix is identified with its logarithm. The reader is referred to [14] for the demonstration of this property with Symmetric Positive Definite matrices.

A straightforward consequence is that **H** can be written as a *linear mixtures* of vectors { **G**_{1}, **G _{2}**, …,

**G**

_{N}} of this space

**V**.

Following the same approach used to exhibit the physical meaning of *LE* interpolation process, **H** may be considered a sandwich of constituent components **D**_{i} = log(**G**_{i}) with the corresponding thickness α_{i}. Decomposing **H** over a set of basis vectors of **V** could be a first available solution. It could be also possible to choose the constituent components **D**_{i} in such a way that the corresponding coefficients α_{i} are homogeneous to optical quantities like for **N** matrix defined by Jones.

However, we propose to explore an alternative route by exhibiting polarimetric structure of the medium directly from the data in relation to a predefined analysis model. If we consider the medium to be described by the coherency matrix **H** at any point in the space (or any instant of time for temporal variations), we wish to be able to identify the presence of a set of constituent components **D**_{i}. This set of constituent components may be considered a signature of the medium. The fluctuations of **H** are thus supposed to be only related to variations of the *thickness* α_{i} of these components. The problem are then: How many constituent components **D**_{i} are needed to characterize the medium and what are these constituent components and their weighting coefficients ?

## 3 – Analysis of a medium by a constituent components model

Without any supplementary information, we are just able to tell that N = 16 components are obviously needed to represent the most general medium since **H** has 16 degrees of freedom.

Nevertheless, one approach to solving this problem could be to assume some statistical properties of the quantities to estimate. If the medium may be considered as a mixture of unknown constituent components weighted by unknown but statistically independent random coefficients of *thickness* α_{i}, the *Independent Component Analysis* (*ICA*) technique can be used to estimate the **D**_{i}.

#### 3.1. ICA model used for coherency matrix of a medium

It is beyond the scope of this paper to develop a general presentation of this method and the reader is referred to the literature on this topic, see [15] for instance. We just develop the classical formalism of *ICA* technique and explain how this approach can be used for our problem

From Eq. (5) at any point in the space or any instant of time, we have a coherency matrix **H** and we can write:

It is worth noticing that it is equivalent to speak about **H** a Coherency matrix, **LH** the logarithm of this matrix or **M** the associated Mueller matrix. There is no ambiguity between the three notations.

Equation (6) defined α_{j} as the realization of a random variable. The random thickness α_{i} are the latent variables of our problem since they cannot be directly observed and the **D**_{j} may be considered as mixing elements. It is possible to derive a more explicit expression of these mixtures. Since **H** ∈ *HPD*(4), **D**_{j} is necessarily a Hermitian matrix and we have only 16 unknown entries for each **D**_{j} matrix:

A similar description is available for the **LH** matrices with 16 estimated entries lh_{i}. The matrix **A** = (a_{ij}) with a_{ij} = d^{j}_{i} is thus a 16 rows and N columns matrix. **X =** (lh_{i}) is a 16 rows and 1 column matrix and **S** = (α_{j}) a N rows and 1 column matrix. We may write using a vector-matrix notation **X** = **A S**. For the classical formalism of *ICA* technique, **A** stands for the mixing matrix. The problem is then to recover the **S** components using the input data **X**. When **A** is known, we can solve this linear equation by classical methods. If the mixing matrix is assumed to be unknown too, this problem can nevertheless be solved but assuming that the random components α_{j} are statistically independent variables. This is the starting point for *ICA* method. All we observe is the random vector **X** and *ICA* algorithm performs an unsupervised estimation of N, the number of independent components and a joint estimation of **A** and **S**. How this algorithm works can be related to a minimization of mutual information process. Mutual information is a natural information measure of the independence of random variables. The *ICA* of the random vector **X** is defined as the invertible transformation **S** = **WX** where the matrix **W** is determined so that the mutual information of the **S** components is minimized [15]. Various algorithms have been derived in the literature to perform this task. FastICA [16] is one of these algorithms and it will be used below. This algorithm assumes that the data is preprocessed by centering. Consequently, we have the following relation between the input data **X** and the estimated **S** and **A**:

(See Appendix A for the proof) where <.> denotes the ensemble average.

It is worth noticing that <**X**> represents the mean value of **LH** matrices and corresponds exactly to the mean value of coherency matrices according to the *LE* distance:

See [11] [14] for more details on this definition. The **A** matrix of Eq. (8) and the corresponding constituent components **D**_{i} may be viewed as a characterization of fluctuations around this mean value. Thus we are exhibiting polarimetric structure of the medium directly from the data itself in which case the feature extraction process can be regarded as well suited to data which is being processed.

Both **S** and **A** being unknown, any scalar multiplier in one of the thickness α_{i} could be cancelled by dividing the corresponding column of **A** by the same scalar. This is a well-known ambiguity of the *ICA* model. Normalizing the **D**_{j} matrices of constituent components, is the way we choose to fix the magnitudes of the thickness α_{j}. After this step of normalization, Tr(**D**_{j} **D**_{j}^{†}) = 1 where † stands for a Hermitian conjugate and Tr(.) denotes the trace of the matrix. Nevertheless, there is always the ambiguity of the sign. Since α_{i} are presumed to be homogeneous to thickness, we can always multiply the α_{j} and the corresponding **D**_{j} matrix by −1 without affecting the model in order to have positive quantities for the thickness.

Obviously, the major assumption of this approach is on the actual existence of statistically independent coefficients α_{j}. This is not an so unrealistic assumption in many cases. Considering a Mueller matrix as a mixing of independent physical quantities (birefringence or dichroïsm for instance) is one example. Since these quantities can have different values from one moment to another or from one location to another, a set of matrices with independent latent variables can be measured and analyzed according to the proposed approach. Illustrating this issue is exactly what we will do with the example below.

#### 3.2. The example of matrices with random diattenuation and depolarization values

We consider the example of a medium characterized by its Mueller matrix **M**. According to Lu and Chipman [1], **M** can be decomposed as **M _{R}M_{D}M_{DEP}** where

**M**stands for a retardance matrix,

_{R}**M**stands for the diattenuation one and

_{D}**M**for the depolarizing one. For the sake of simplicity we will suppose

_{DEP}**M**to be the identity matrix in order to limit the number of independent components, but this is absolutely not a limitation of the approach.

_{R}The **M _{DEP}** matrix can be expressed as:

We assume *a*, *b*, *p* uniformly distributed random variables with <*a*> = 0.3, <*b*> = 0.4 and <*p*> = 0.1. *c*, *q* and *r* are fixed at 0.12, 0 and 0 respectively. Similarly, a **M _{D}** matrix is completely defined by its diattenuation vector

**DV**=[

*f g h*].

*f*and

*g*are fixed to 0.2 and −0.14 respectively and

*h*is assumed uniformly distributed random variable with <

*h*> = 0.12. All these values are arbitrarily chosen but in order to have physical Mueller matrices. A set of random Mueller matrices is then generated by multiplying these random diattenuators and depolarizers

Two examples of the diattenuation and depolarization matrices are listed in Tab. 1
with the corresponding generated matrices **M**.

All these 4 random variables are supposed to be independent. This is not an unrealistic assumption since diattenuation and depolarization can be regarded as derived from different physical properties and their components are specifically introduced as degrees of freedom of Mueller matrices in the Lu and Chipman decomposition.

A set of 1000 matrices is built following the procedure previously described. The corresponding coherency matrices are computed and this set of matrices is the input data analyzed by *ICA* method. The algorithm estimates a number of independent components equal to 4. The mean value of **LH** matrices and the four components are also estimated. The Mueller matrix **M**_{mean} corresponding to this mean value of **LH** is listed in Tab. 2
with its polar decomposition. This is exactly the Mueller matrix with the mean values of the random variables used to generate the set of input data. It could be argued that if a Lu and Chipman decomposition was first applied on each random matrix and the classical Euclidean mean computed on the set of **M**_{D} and **M**_{DEP} respectively, the good mean value is then obtained by the multiplication of both these mean matrices. This is true but is only possible because the order of the decomposition is a priori known which is not generally the case in experimental measurements.

In order to analyze the constituent components estimated by the algorithm, we compute the Mueller matrices denoted **M**_{c}(i), associated to the estimated constituent components when all the α_{j} except one are set to 0. These matrices are thus computed from the N coherency matrices given for i∈{1;N} by:

Tab. 3
shows an example of these N=4 Mueller matrices computed when the α_{j} are the estimated values corresponding to the Mueller matrix at the first line of Tab. 1. For each of these **M**_{c}(i) matrices, the polar decomposition is computed and corresponding **M**_{D} and **M**_{DEP} matrices are shown.

For each of the **M**_{D}(i) and **M**_{DEP}(i) matrices of Tab. 3, the matrix elements that differ from the average matrix elements of Tab. 2 are set in bold. It is thus clear that each of the matrices **M**_{C}(i) carries the information corresponding to one and only one random variables. **M**_{C}(1), **M**_{C}(2) and **M**_{C}(3) are related to three random variables we have introduced in the matrix of depolarization and **M**_{C}(4) is on the fourth random variable that we introduced in the matrix of diattenuation. The α_{j} we used in this example are the estimated values corresponding to the Mueller matrix at the first line of Tab. 1. It is worth noticing that the entries (in bold font) of the **M**_{D}(i) and **M**_{DEP}(i) matrices corresponding to the action of the random variables, have the values of the entries (in bold font) of the depolarization and diattenuation matrices at the first line of Tab. 1. Similar results are obviously obtained with all matrices of the original set. Constituent components **D**_{i} therefore characterized the polarimetric properties of the set of matrices under analysis. By *ICA* approach, it is possible to isolate these polarimetric features, each being carried by a constituent component **D**_{i}. These examples demonstrate the potential of the method in terms of feature extraction of a set of Mueller or coherency matrices.

## 4 - Conclusion

In this work, an alternative route to analyze a set of coherency matrices associated to a medium is addressed by means of the Independent Component Analysis technique. We highlight the possibility of extracting an underlying structure of the medium in relation to a model of constituent components. If we consider the medium to be described by the coherency matrix at any point in the space (or any instant of time for temporal variations), we are able to identify the presence of a set of constituent components. This set of constituent components may be considered a signature of the medium. The fluctuations of the coherency matrix are thus supposed to be only related to variations of the *thickness* of these components. The samples are considered as a mixture of unknown constituent components weighted by unknown but statistically independent random coefficients. The *ICA* technique can determine the number of components necessary to characterize this set of samples of the medium. An estimate of the value of these components and their respective weights is also determined.

The case of random matrices generated by multiplying random diattenuators and depolarizers was presented to illustrate the proposed approach and demonstrate its capabilities. Applications of this method to the case of coherency matrix from experimental data are being analyzed.

## Appendix A

From the *ICA* technique, we have an estimation of **A** and **W** in such a way that **WA** = **Id** but it is worth noticing that **AW** is not equal to the identity matrix. The algorithm assumes that the data is preprocessed by centering so we actually estimate:

then

But from (A1), we have by a left multiplication by **W:**

from (A3) and (A2) we have: **S** = **W X** and then:

Thus from (A1), (A2) and (A4) we eventually have **X** - <**X**> = **A** (**S** - <**S**>) that gives Eq. (8).

## References and Links

**1. **S. Y. Lu and R. A. Chipman, “Interpretation of Mueller matrices based on polar decomposition,” J. Opt. Soc. Am. A **13**(5), 1106–1113 (1996). [CrossRef]

**2. **J. Morio and F. Goudail, “Influence of the order of diattenuator, retarder, and polarizer in polar decomposition of Mueller matrices,” Opt. Lett. **29**(19), 2234–2236 (2004). [CrossRef] [PubMed]

**3. **R. Ossikovski, A. De Martino, and S. Guyot, “Forward and reverse product decompositions of depolarizing Mueller matrices,” Opt. Lett. **32**(6), 689–691 (2007). [CrossRef] [PubMed]

**4. **R. Ossikovski, “Analysis of depolarizing Mueller matrices through a symmetric decomposition,” J. Opt. Soc. Am. A **26**(5), 1109–1118 (2009). [CrossRef] [PubMed]

**5. **S. R. Cloude, “Group theory and polarisation algebra,” Optik (Stuttg.) **75**, 26–36 (1986).

**6. **A. Aiello and J. P. Woerdman, arXiv.org e-print archive math-ph/0412061 (2004)

**7. **J. J. Gil, “Polarimetric characterization of light and media,” Eur. Phys. J. Appl. Phys. **40**(1), 1–47 (2007), doi:. [CrossRef]

**8. **R. C. Jones, “A new calculus for the treatement of optical systems. VII properties of the N-matrices,” J. Opt. Soc. Am. **38**(8), 671–685 (1948). [CrossRef]

**9. **R. M. A. Azzam, “Propagation of partially polarized light through anisotropic media with or without depolarization: A differential 4x4 matrix calculus,” J. Opt. Soc. Am. **68**(12), 1756–1767 (1978). [CrossRef]

**10. **R. Barakat, “Exponential versions of the Jones and Mueller-Jones polarization matrices,” J. Opt. Soc. Am. A **13**(1), 158–163 (1996). [CrossRef]

**11. **V. Devlaminck, “Mueller matrix interpolation in polarization optics,” J. Opt. Soc. Am. A **27**(7), 1529–1534 (2010). [CrossRef] [PubMed]

**12. **K. Kim, L. Mandel, and E. Wolf, “Relationship between Jones and Mueller matrices for random media,” J. Opt. Soc. Am. A **4**(3), 433–437 (1987). [CrossRef]

**13. **B. N. Simon, S. Simon, N. Mukunda, F. Gori, M. Santarsiero, R. Borghi, and R. Simon, “A complete characterization of pre-Mueller and Mueller matrices in polarization optics,” J. Opt. Soc. Am. A **27**(2), 188–199 (2010). [CrossRef] [PubMed]

**14. **V. Arsigny, P. Fillard, X. Pennec, and N. Ayache, “Geometric means in a novel vector space structure on symmetric positive-definite matrices,” SIAM J. Matrix Anal. Appl. **29**(1), 328–347 (2007). [CrossRef]

**15. **A. Hyvärinen and E. Oja, “Independent component analysis: algorithms and applications,” Neural Netw. **13**(4-5), 411–430 (2000). [CrossRef] [PubMed]

**16. **A. Hyvärinen, “Fast and robust fixed-point algorithms for independent component analysis,” IEEE Trans. Neural Netw. **10**(3), 626–634 (1999). [CrossRef] [PubMed]