## Abstract

We analyze estimation error as a function of spectral bandwidth for division-of-amplitude (DoAm) Stokes polarimeters. Our approach allows quantitative assessment of the competing effects of noise and deterministic error, or bias, as bandwidth is varied. We use the signal-to-rms error (SRR) as a metric. Rather than calculating the SRR of the estimated Stokes parameters themselves, we use the singular-value decomposition to calculate the SRRs of the coefficients of the measured data vector projected onto the measurement matrix left singular vectors. We argue that calculating the SRRs for left singular vector coefficients will allow development of reconstruction filters to minimize Stokes estimation error. For the example case of a source with constant polarization over a relatively wide band, we show that as the spectral filter bandwidth is increased to include wavelengths significantly different than the design wavelength, the SRRs of the estimated left singular vector coefficients will a.) increase monotonically if relatively few photo-detection events (PDEs) are recorded, b.) after a sharp peak close to the design wavelength, decrease monotonically if relatively many PDEs are recorded, and c.) have well-defined maxima for nominal PDE counts. Given some idea of the source brightness relative to detector noise, one can specify a spectral filter bandwidth minimizing the variance and bias effects and optimizing Stokes parameter estimation. Our approach also allows one to specify the bandwidth over which the response of “achromatic” optics must be reasonably invariant with wavelength for rms Stokes estimation error to remain below some desired maximum. Finally, we point out that our method can be generalized not only to other types of polarimeters, but also to any sensing scheme that can be represented by a linear system for limiting values of a certain parameter.

© 2010 Optical Society of America

## 1. Introduction

Since Azzam [1] first proposed the division-of-amplitude (DoAm) Stokes polarimeter, optimization and noise properties of these and similar instruments have received considerable attention. Data from a DoAm polarimeter can be used to estimate the Stokes parameters of light reflected from or emitted by a target or scene. For relatively narrow-band incident light, the relation between a vector of measured data **d** and the Stokes vector **s** for the incident light is described by a linear system:

$$\mathbf{d} = \mathbf{W}\,\mathbf{s} + \mathbf{n}, \tag{1}$$

where **n** is the measurement noise and **W** is called the “forward” or “measurement” matrix and is defined by how light is distributed among the detectors by the optical components (retarders, polarizing beam-splitters, TIR prisms) in the various beam paths of the instrument. The detectors could be point detectors or imaging focal plane arrays. The Stokes parameters are estimated by inverting the above relation:

$$\tilde{\mathbf{s}} = \mathbf{W}^{-1}\mathbf{d}, \tag{2}$$

with **W**^{−1} called the “synthesis” or “reconstruction” matrix.

We note, not insignificantly, that the elements of **W** are actually *estimated* from calibration measurements (see, e.g., Chipman [2]), with the polarimeter illuminated by a polarization state generator (PSG) source. Very often, the system of calibration equations is overdetermined by using more calibration states than formally necessary; this means the synthesis matrix is generally (**W**^{T}**W**)^{−1}**W**^{T}, the least-squares inverse of **W**. Still, measurement noise, systematic errors in polarimeter components and the PSG, and finite-precision arithmetic all mean the “true” matrix **W** can never be known exactly. To avoid confusion regarding our main points, we assume such calibration errors are insignificant and denote the synthesis matrix simply as **W**^{−1}, whether or not **W** is square.

Because the DoAm polarimeter can be described using linear algebra, it’s natural to study the instrument in terms of the matrix properties of **W**. Ambirajan and Look [3, 4] linked the optimal rotation angles for a polarimeter consisting of a 1/4-wave plate and a linear polarizer in tandem, each of which is free to rotate, to the condition number and determinant of the relevant matrix. Sabatke, et al. [5, 6] introduced the equally-weighted variance (EWV) metric (the sum of the inverse-square singular values of **W**) to optimally choose the retardance in a retarder-polarizer instrument. Tyo [7] showed that minimizing the EWV is equivalent to equalizing the lengths (L2 norms) of the rows of the synthesis matrix. Tyo also showed [8] that minimizing the measurement matrix condition number was crucial to limiting the effect of systematic errors in the instrument, such as miscalibration and misalignment of optical components.

Treatments based on physics and estimation theory are also available. Gamiz and Belsher [9] derived expressions for the variances associated with estimating the intensities in orthogonal polarizations and the relative phase between them. Foreman, et al. [10] developed the Fisher Information matrix (equivalently, the Cramer-Rao Lower Bound, or CRLB) for Stokes parameter estimation. The CRLB formalism allowed the inclusion of prior knowledge about the system, allowing an optimal Stokes polarimeter to be designed for such cases as measurement of a known polarization signal against an unpolarized background, or detection of strictly linear polarized light.

Implicit in these valuable results is the assumption of a relatively narrow-band spectral filter. This is not without good reason; the concept of polarization is itself a narrow-band phenomenon, intimately related to temporal coherence [11]. Previous workers have emphasized minimizing noise amplification in the Stokes estimation process because remote-sensing Stokes measurements are often made at low signal levels, owing to the use of narrow-band filters with natural illumination (in contrast with Mueller polarimetry, where a polarization state generator usually includes a very bright illumination source).

As pointed out by Tyo, et al. (2006) [12], polarization can be a slowly-varying function of wavelength for many sources; however, increasing the spectral filter bandwidth to decrease Stokes estimation variance is seldom considered because of bandwidth-induced estimation bias, as pointed out by Boger, et al. [13]. Bias increases with measurement bandwidth for two reasons: First, the polarization state of the source will in general vary with wavelength, so any attempt to reduce estimation noise by increasing the filter bandwidth unavoidably biases estimation of the source polarization in a relatively narrow band about any particular wavelength. This statement of course holds for any kind of polarimeter. Second, bias will generally increase with bandwidth even if the Stokes parameters of the incident light remain relatively constant over the band. This is because incident light is distributed among the detectors according to **W**, and wavelength-varying properties of the sensor mean the elements of **W** are generally wavelength-dependent.

The wavelength dependence of **W** is due principally to wavelength-varying retardance in the polarimeter itself, quantitatively affecting estimation of the linear-polarization Stokes parameters *s*_{1} and *s*_{2} as well as the circular polarization parameter *s*_{3}. Further, if polarizing beam splitters are used, component coatings will generally cause wavelength-dependent transmission and reflection. In principle, polarizer efficiency also varies with wavelength; in practice, however, this effect can be made negligible with very little trouble. Diattenuation may also vary with wavelength, but our experience is that retardance and coating effects are the dominant factors in measurement matrix variation with wavelength. Given the components for the appropriate PSG, a useful synthesis matrix can of course be calculated for a relatively narrow band about any particular wavelength; further, one might suppose a “broadband” measurement matrix could be calculated using a suitable illumination source for calibration. However, even if one overlooks the complication of wavelength-dependent retardance in the PSG, it’s easy to show that the synthesis equation Eq. (2) is formally correct *only for a single wavelength*.

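The SRR and signal-to-noise ratio (SNR) for an estimator *θ͂* of some quantity *θ* compare as follows. The SNR is the ratio of the mean to the standard deviation of *θ͂*:

$$\mathrm{SNR} = \frac{\langle\tilde{\theta}\rangle}{\sigma_{\tilde{\theta}}}. \tag{3}$$

By contrast, the SRR is given by

$$\mathrm{SRR} = \frac{\langle\tilde{\theta}\rangle}{\sqrt{E\{(\tilde{\theta}-\theta)^{2}\}}}, \tag{4}$$

where 〈*θ͂*〉 is the expected value (mean) of *θ͂*, *σ*^{2}_{θ͂} is the variance, and *E*{*} indicates the expectation operator. Note the denominator of Eq. (4) can be written

$$\sqrt{E\{(\tilde{\theta}-\theta)^{2}\}} = \sqrt{\sigma_{\tilde{\theta}}^{2} + \left(\langle\tilde{\theta}\rangle - \theta\right)^{2}}, \tag{5}$$

with the second term under the radical being the squared bias associated with the estimator *θ͂*.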
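The distinction between the two metrics can be checked numerically. The following is a minimal sketch, assuming a Gaussian estimator with an artificially added offset; the function names `snr` and `srr`, the true value, and the offset are illustrative choices, not quantities from this paper.

```python
import numpy as np

def snr(samples):
    """Signal-to-noise ratio: mean over standard deviation of the estimates."""
    return np.mean(samples) / np.std(samples)

def srr(samples, truth):
    """Signal-to-rms-error ratio: mean over rms error about the true value."""
    rms_error = np.sqrt(np.mean((samples - truth) ** 2))
    return np.mean(samples) / rms_error

rng = np.random.default_rng(0)
truth = 10.0                                   # assumed true parameter value
unbiased = truth + rng.normal(0.0, 1.0, size=100_000)
biased = truth + 2.0 + rng.normal(0.0, 1.0, size=100_000)  # assumed bias of 2

# The mean-square error splits into variance plus squared bias.
mse = np.mean((biased - truth) ** 2)
decomposition = np.var(biased) + (np.mean(biased) - truth) ** 2

print("unbiased: SNR", snr(unbiased), "SRR", srr(unbiased, truth))
print("biased:   SNR", snr(biased), "SRR", srr(biased, truth))
```

For the unbiased samples the two ratios nearly coincide; for the biased samples the SNR is blind to the offset while the SRR is reduced by it.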

Our SRR is developed as a function of spectral bandpass to explicitly allow consideration of a particular bias-variance tradeoff in Stokes estimation. Such tradeoffs are classical estimation problems. One can always drive variance to an arbitrarily small level simply by increasing bias; for example, one could define a zero-variance estimator that always yields the same Stokes vector regardless of the input measurements. Of course, such an estimator is of little practical value. One definition of an *optimal* estimator is one where the rms error, given by the square root of the sum of the variance and the squared bias, is minimized; by extension, an optimal solution is given when the SRR is maximized. Using our approach and dividing the spectral band of the incident light into sub-bands small enough that **W** is approximately constant over each sub-band, the changes in modal noise and Stokes estimation noise and bias can be observed as the bandwidth is increased. The behavior of both bias and variance can be clearly observed and optimal-estimation spectral filter widths calculated.

We note that retarders can be fabricated to have reasonably constant retardance over significant ∆*λ*/*λ*, reducing the wavelength dependence of the instrument response. It’s therefore legitimate to wonder if bias issues can be rendered moot by simply designing the polarimeter using such components. “Achromatic” retarders, for example, are made by combining materials with opposing refractive index dispersions. However, disadvantages remain: First, the fast axis of an achromatic retarder is generally wavelength dependent, introducing a wavelength-dependent response for off-axis rays; this is of special importance for imaging instruments. Second, the combined-material index can vary more strongly with temperature than that of a single material, imposing the need for environmental control. For beam splitter coatings, maintaining constant transmission and reflection coefficients over a significant ∆*λ*/*λ* means additional cost. Whether polarimeter design involves retarders, beam splitters, or birefringent optics, our approach can be used to specify the bandwidth over which components must have a relatively flat response. Finally, we note again that Stokes estimation from measurements with significant ∆*λ* will be biased regardless of the polarimeter components used if the source is polarized over only a relatively small band.

In this work, we exploit the singular value decomposition (SVD) to define an expression for the signal-to-rms error ratio, or SRR, of the coefficients of the measured data vector projected onto the orthogonal basis vectors spanning the column space of the measurement matrix. These vectors are the matrix *left singular vectors*, with the right singular vectors being given by the orthogonal basis of matrix row space [14]. For brevity’s sake, we refer to the left singular vector coefficients as the “modal,” “matrix mode,” or “mode-space” representation of the incident and estimated Stokes parameters. We show that generally, each of these modal coefficients couples to more than one Stokes parameter. From this, we infer the possibility of minimizing Stokes estimation error by optimally weighting, or filtering, the modal coefficients based on their respective SRRs.

In Section 2, we briefly discuss the singular-value decomposition of the measurement matrix **W** and its interpretation; this discussion allows us to develop expressions for matrix mode signal, noise, and bias in Section 3. These expressions are developed as a function of spectral filter bandwidth. In Section 4, we present example SRR calculations using the measurement matrix from a numerical model of a DoAm polarimeter similar to one developed at Lockheed Martin’s Advanced Technology Center. We present a short discussion of how these results might be applied and speculate briefly on how modal filtering might be exploited in Section 5. Finally, our work is summarized and conclusions presented in Section 6.

## 2. Singular value decomposition of the measurement matrix

Equation (1) describes the relation between measured data and the Stokes vector for either *K* point detectors or for a resolution element common to *K* imaging array detectors. For an imaging array, the element *s*_{0} in each Stokes vector **s** gives the energy incident at any particular pixel from the field angle subtended by that pixel and for the relevant exposure time.

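For any eigenvector **x**_{i} of **W**^{T}**W**,

$$\mathbf{W}^{T}\mathbf{W}\,\mathbf{x}_{i} = u_{i}^{2}\,\mathbf{x}_{i} \tag{6}$$

and

$$|\mathbf{W}\mathbf{x}_{i}| = u_{i}\,|\mathbf{x}_{i}|, \tag{7}$$

where *u*_{i} is the *i*-th singular value of **W**. Note that since **W**^{T}**W** is symmetric, its eigenvectors are orthogonal.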

As we mentioned earlier, **W** is full rank for the Stokes polarimeter; that is, no vector **s** exists with |**s**| > 0 that will yield a null measurement (|**d**| = 0) from the device. Since **W**^{T}**W** is symmetric, the full-rank condition implies the eigenvectors **x**_{i} form an orthonormal basis spanning the domain (row) space of **W**. Further, all four singular values of **W** are non-zero.

To proceed further, consider how the singular-value decomposition allows us to see into the heart of the transformation **W**. The SVD of **W** is given by

$$\mathbf{W} = \mathbf{U}\,\mathbf{S}\,\mathbf{V}^{T}, \tag{8}$$

where the rows of **V**^{T} are the eigenvectors of **W**^{T}**W** and the columns of **U** are the eigenvectors of **WW**^{T}. As mentioned previously, these are the right and left singular vectors, respectively. The matrix factor **S** is diagonal, with the singular values as the diagonal elements (the singular values are actually returned in a vector by most numerical library SVDs). Seen this way, it’s clear the action of **W** on the Stokes vector **s** is as follows:

1. **p** = **V**^{T}**s**. The Stokes vector **s** is projected onto an orthonormal basis for the row space of **W**, forming the 4 × 1 vector **p** with elements equal to the inner products between **s** and the basis vectors.
2. **g** = **S p**. The elements of **p** are each weighted by their respective singular values, forming the 4 × 1 vector **g**. This is easy to see because **S** is a diagonal matrix. The singular values can be thought of as the gain associated with each mode. We refer to **g** as the modal representation of the incident Stokes vector.
3. **d** = **U g** + **n**. The gain-weighted elements of **g** now become coefficients to form the *K* × 1 vector **d** as a linear combination of the columns of **U** plus noise. Just as the rows of **V**^{T} form an orthogonal basis for the row space of **W**, the columns of **U** form an orthogonal basis for the range (column space) of **W**.

Once the SVD factors are calculated, the synthesis matrix, **W**^{−1}, may be written down. Because the factors **V**^{T} and **U** are orthogonal, their inverses are given by their transposes. Further, **S** is diagonal, so **S**^{−1} is simply another diagonal matrix with each element given by the inverse of the corresponding diagonal element in **S**. The synthesis matrix can be written down in terms of the SVD factors:

$$\mathbf{W}^{-1} = \mathbf{V}\,\mathbf{S}^{-1}\,\mathbf{U}^{T}, \tag{9}$$

with the estimated Stokes vector written as

$$\tilde{\mathbf{s}} = \mathbf{V}\,\mathbf{S}^{-1}\,\mathbf{U}^{T}\mathbf{d}. \tag{10}$$

It follows that we can also define a modal representation for a sample function of the noisy measured data:

$$\hat{\mathbf{g}} = \mathbf{U}^{T}\mathbf{d}. \tag{11}$$

The SRR we develop here is for the statistics of the coefficients of **ĝ**. Note that squaring the above yields a result differing only by the inverse singular value scaling from the “power spectrum” presented by Kadrmas, et al. [15].
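The factor-by-factor action of **W** and of its synthesis matrix can be traced in a few lines of NumPy. This is a minimal sketch: the 4 × 4 matrix `W` and the Stokes vector `s` are hypothetical values chosen for illustration, not the measurement matrix of the instrument discussed in this paper.

```python
import numpy as np

# Hypothetical 4-detector measurement matrix (rows: detectors, columns: s0..s3).
a = 1.0 / np.sqrt(3.0)
W = 0.25 * np.array([
    [1.0,  a,  a,  a],
    [1.0,  a, -a, -a],
    [1.0, -a,  a, -a],
    [1.0, -a, -a,  a],
])
s = np.array([1000.0, 200.0, -100.0, 50.0])   # assumed incident Stokes vector

# NumPy returns the singular values as a vector, not as a diagonal matrix.
U, sv, Vt = np.linalg.svd(W, full_matrices=False)

p = Vt @ s            # projection of s onto the row-space basis
g = sv * p            # gain-weighted modal representation of s
d = U @ g             # noise-free data vector; equals W @ s

W_inv = Vt.T @ np.diag(1.0 / sv) @ U.T   # synthesis matrix V S^-1 U^T
g_hat = U.T @ d                          # modal representation of the data
s_est = W_inv @ d                        # estimated Stokes vector
```

With noise-free data, `g_hat` reproduces `g` and `s_est` reproduces `s`; adding a noise vector to `d` before the last two lines shows how the noise enters each modal coefficient.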

## 3. Matrix mode representations of signal, noise, and bias

In the following sections, we discuss representing polarimeter signal, bias, and noise in terms of the relevant singular vectors to enable optimal Stokes estimation via modal filtering. To that end, we develop bandwidth-dependent expressions for the modal signal, noise, and bias.

#### 3.1. Modal Filtering

If the left singular vectors of **W** are accessible via the SVD and expressions for the modal signal, noise, and bias are available, one can consider filtering the modes as a function of the expected SRR to optimize Stokes estimation. Rather than estimate the Stokes vector using Eq. (10), we write

$$\tilde{\mathbf{s}} = \mathbf{V}\,\mathbf{S}^{-1}\,\mathbf{F}\,\mathbf{U}^{T}\mathbf{d}, \tag{12}$$

where **F** is a diagonal matrix with elements *F*_{ii} ≤ 1, chosen to give the most weight to the singular modes with the largest SRR. The value of such filtering becomes clear by considering the **V** matrix factor in **W**^{−1} as a function of wavelength. As an example, we calculate **W** for the polarimeter described in Phenis, et al. [16] and de Leon, et al. [17]. The optical parameters are chosen to optimize **W**, as described in Sec. 4.1, at a wavelength of 633 nm. Factoring **W**, we see that at the reference wavelength, **V** is within a column permutation of the identity matrix **I**. The values produced using a double-precision library SVD are

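In keeping with the notation from the SVD discussion in Section 2, we define a vector **p̂** = **S**^{−1}**ĝ**. From this definition,

$$\tilde{\mathbf{s}} = \mathbf{V}\,\mathbf{S}^{-1}\hat{\mathbf{g}} = \mathbf{V}\,\hat{\mathbf{p}}. \tag{13}$$

It follows that

$$\tilde{s}_{i} = \sum_{j} V_{ij}\,\hat{p}_{j}. \tag{14}$$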

For this optimal **W**, not only does the associated **V** map each modal coefficient to a single Stokes parameter in Eq. (2), the estimated Stokes vector *is* the modal vector **p̂**; that is,

$$\tilde{\mathbf{s}} = \mathbf{P}\,\hat{\mathbf{p}}, \tag{15}$$

to within the permutation of the Stokes parameters imposed by the permutation matrix **P**. Sabatke, et al. [5] point out that in general, the above relation doesn’t hold; but for an ideal, optimal polarimeter, it is indeed true that **s̃** = **P p̂**. In this case, there is nothing to be gained by calculating the modal SRRs and applying an appropriate weight; one could do as well calculating the SRRs for the Stokes parameters themselves. However, **V** is very sensitive to wavelength and rapidly becomes non-diagonal for wavelengths different from the design wavelength of 633 nm. Recalculating **W** with the same optical component values and a wavelength just 5 nm higher, for example, the SVD yields

Clearly, each **p̂** coefficient now couples to multiple Stokes parameters, just as, e.g., individual Fourier components of a discretely-sampled image or temporal signal couple to all pixels or time samples. The same degree of coupling can be observed by applying the SVD to the average of **W** over a finite wavelength band; for example, averaging the wavelength-dependent **W** matrices used in this paper over 85 nm and then factoring yields

We note the practical utility of the wavelength-averaged **W** is a matter for further study. As we have previously pointed out, the inverse of such a matrix does not yield a formally correct solution for **s**; however, similar approaches have been used to find approximate solutions in other applications, e.g., deconvolution of wavelength-averaged point-spread functions from turbulence-degraded images [18].

The reason for the observed coupling behavior is that, as discussed in the references we have cited, optimizing the design of a DoAm polarimeter means minimizing the condition number of **W**. In Fig. 1, we plot the L2-norm condition number of **W** matrices calculated over a range of wavelengths using optics designed to produce the optimum **W** at 633 nm; note the clear minimum at 633 nm.

The condition number associated with the L2 norm of any matrix is the ratio of the largest and smallest singular values. While Hansen [19] points out that the following property is perhaps impossible to prove, it is easily observed that as singular values decrease relative to the largest, the associated singular vectors become more oscillatory; that is, there are relatively more sign changes from element to element in the singular vectors. This draws one to think of the singular vectors heuristically as frequency eigenmodes; however, as the condition number decreases, the singular values become degenerate and the singular vectors all tend toward the same, small number of element-wise sign changes. At the limit point of this process (the optimally-conditioned matrix), many or all of the singular values are degenerate, and the associated singular vectors have only one non-negligible element. Arranged as matrix columns, these vectors form a **V** matrix that is diagonal to within a column permutation. For any more realistic scenario, however, including data acquisition off the design wavelength or over a finite bandwidth, the condition number is not minimized. The singular modes of **W** then couple to multiple Stokes parameters, and modal weights (or “filters”) can be developed as functionals of the modal SRR for optimal Stokes parameter estimation.
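Both quantities discussed here, the L2 condition number and a modally filtered Stokes estimate, follow directly from the SVD factors. The sketch below is illustrative only: the matrix `W`, the data vector `d`, and the helper names `condition_number` and `filtered_estimate` are assumptions, and the filter weights are simply passed in rather than derived from modal SRRs.

```python
import numpy as np

def condition_number(W):
    """L2 condition number: ratio of largest to smallest singular value."""
    sv = np.linalg.svd(W, compute_uv=False)   # returned in descending order
    return sv[0] / sv[-1]

def filtered_estimate(W, d, weights):
    """Stokes estimate V S^-1 F U^T d with diagonal modal filter F = diag(weights)."""
    U, sv, Vt = np.linalg.svd(W, full_matrices=False)
    g_hat = U.T @ d                           # modal coefficients of the data
    return Vt.T @ ((weights / sv) * g_hat)

# Hypothetical measurement matrix and noise-free data, for illustration only.
a = 1.0 / np.sqrt(3.0)
W = 0.25 * np.array([
    [1.0,  a,  a,  a],
    [1.0,  a, -a, -a],
    [1.0, -a,  a, -a],
    [1.0, -a, -a,  a],
])
d = W @ np.array([1000.0, 200.0, -100.0, 50.0])

print("condition number:", condition_number(W))
print("unfiltered estimate:", filtered_estimate(W, d, np.ones(4)))
print("last mode down-weighted:", filtered_estimate(W, d, np.array([1.0, 1.0, 1.0, 0.5])))
```

With all weights equal to one the filtered estimate reduces to the ordinary least-squares inverse; weights below one suppress the corresponding modes at the cost of added bias.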

#### 3.2. The modal signal

From the above discussions, we may be tempted to write the broad-band mode-space representation of the Stokes vector as

$$\mathbf{g} = \mathbf{S}\,\mathbf{V}^{T}\sum_{n=1}^{N} f_{n}\,\mathbf{s}_{n}, \tag{16}$$

where *n* is the index for *N* spectral filter bands, each centered about some wavelength *λ*_{n} and assumed to be narrow enough that the optimality metric for **W** and the relevant optical transmissions change very little over the band. The factor *f*_{n} is the spectral weighting provided by the detector quantum efficiency and transmission of the filter and optics in the *n*-th band, and **s**_{n} is the Stokes vector for light with wavelength bounded by the limits of the *n*-th band. Note the weighting is defined such that Σ_{n} *f*_{n} = 1, conserving energy.

However, **W** is actually a function of wavelength because of the wavelength-dependent response of optical components; as previously mentioned, the practical implications of this fact were reported by Boger, et al. [13]. We demonstrate them here in Figs. 2 and 3. In Fig. 2, we have plotted the simulated photo-detection events (PDEs) measured by a four-detector DoAm polarimeter exposed to unpolarized light. The *x*-coordinate of each diamond in the plot is the center of a 10-nm band, and the *y*-coordinate is the number of PDEs collected at each detector when the instrument is exposed to unpolarized light in the associated band. The measurement matrices, one for each band, were calculated as described in the next section. The optical components represented by the measurement matrix are again for the Phenis, et al. [16] type of polarimeter. The incident light is at a brightness such that the total number of photons collected by all four detectors is 1000. Note an equal number of PDEs are observed at each of the four detectors only for the band centered at the design wavelength of 633 nm; in all other bands, the number of PDEs at each detector differs substantially. If the source is observed over any band other than the small band about the design wavelength, and the reconstruction matrix for the design wavelength is used to estimate the Stokes parameters, a non-zero degree of polarization would be estimated even though the incident light was actually unpolarized. Finally, note that were we to normalize the plot so the energy at the detectors sums to unity, the plotted points at each wavelength would be the elements in the first column of the measurement matrix associated with that wavelength. This is because for unpolarized light, only the *s*_{0} element of the incident Stokes vector is non-zero.

In Fig. 3, we have plotted the absolute differences between the PDE count on the (arbitrarily labeled) first detector and the other three. Rather than plot the differences for each small band, we now plot the differences as the spectral bandpass is increased 20 nm at a time by successively adding 10-nm sub-bands to both sides of the previous band, starting with a 10 nm band centered at 633 nm. To show the significance of the PDE differences, the noise due to Poisson fluctuations averaged over the four detectors is also shown. Note that because the rms shot noise increases as the square root of the number of PDEs, the PDE differences will become more and more significant relative to noise as the incident brightness increases.

The preceding discussion leads us to conclude that, within the limits of our approximating a continuous spectral band as a sum over small discrete bands, the “broad-band” expressions for **g** and **ĝ** are

$$\mathbf{g} = \mathbf{U}_{r}^{T}\sum_{n=1}^{N} f_{n}\,\mathbf{W}_{n}\,\mathbf{s}_{n} \tag{17}$$

and

$$\hat{\mathbf{g}} = \mathbf{U}_{r}^{T}\left[\sum_{n=1}^{N} f_{n}\,\mathbf{W}_{n}\,\mathbf{s}_{n} + \mathbf{n}\right], \tag{18}$$

where **U**^{T}_{r} is the **U**^{T} factor for the synthesis matrix calculated at a reference wavelength; that is, the wavelength for which the polarimeter is designed or for which **W** is optimized using, for example, the EWV. For the purposes of this paper, we define the “signal” as the ensemble mean of the noisy measurements, back-projected to mode space: 〈**ĝ**〉.

#### 3.3. The modal bias

By “modal bias,” we mean | 〈**ĝ**〉 − **g**|, the absolute difference between the mean of the modal representation of the measured data (the signal) and the modal representation of the actual input Stokes vector. Since the SVD of any matrix is unique, the left and right singular vectors associated with the various matrices **W**_{n} also vary as a function of wavelength. For the broadband SRR expression to be meaningful, the signal and noise expressions must be with reference to a common set of modes. To successfully write the SRR, then, we note that a.) the row and column *dimensions* of **W** don’t change as the input wavelength varies, and b.) all **V**^{T} and **U** matrix factors are orthogonal matrices regardless of wavelength. These facts mean the **V**^{T} and **U** factors at any two wavelengths can differ *at most by a simple rotation*. As a consequence, the factors of the individual measurement matrices can be easily re-written to express the signal at a particular wavelength in terms of the matrix modes at another.

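Consider the matrix **W**_{r}, the measurement matrix again at a reference wavelength *λ*_{r}. Now we define *N*−1 forward rotation matrices **R**^{(4)T}_{n}, each 4 × 4, to rotate the four rows of the matrix factor **V**^{T}_{n} to align with the rows of **V**^{T}_{r}. Since for any rotation matrix **R**^{−1} = **R**^{T}, we can write the matrix **W**_{n} as

$$\mathbf{W}_{n} = \mathbf{U}_{n}\,\mathbf{S}_{n}\,\mathbf{V}_{n}^{T} = \mathbf{U}_{n}\left(\mathbf{S}_{n}\mathbf{R}_{n}^{(4)}\right)\left(\mathbf{R}_{n}^{(4)T}\mathbf{V}_{n}^{T}\right) = \mathbf{U}_{n}\,\mathbf{G}_{n}\left(\mathbf{V}_{n}\mathbf{R}_{n}^{(4)}\right)^{T}. \tag{19}$$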

Note because **R**^{(4)T}_{n} rotates the rows of **V**^{T}_{n} to align with the rows of **V**^{T}_{r}, the Stokes vector **s**_{n} is now projected onto the rows of **V**^{T}_{r} instead of the rows of **V**^{T}_{n}. Note also the singular value matrix **S**_{n} is also “rotated” and is here denoted **G**_{n}. The rotated gain matrix **G**_{n} is not generally diagonal because the right singular vectors of **W**_{r}, onto which the input has been projected, are generally linear combinations of the modes of **W**_{n}; gain on one mode of **W**_{r} is generally gain on more than one mode of **W**_{n}. The mapping applied by **W**_{n} can thus be alternatively interpreted using the rotation matrices as follows:

1. **p**_{nr} = (**V**_{n}**R**^{(4)}_{n})^{T} *f*_{n}**s**_{n}. The filter-weighted input vector *f*_{n}**s**_{n} in the *n*-th spectral band is projected onto the right singular vectors of **W**_{r}, the measurement matrix at the reference wavelength.
2. **g**_{n} = **G**_{n}**p**_{nr}. The elements of **p**_{nr} are each weighted by the rotated gains to properly scale the projections in terms of the modes of **W**_{n}, the measurement matrix for the *n*-th band. Put another way, the projections are rotated back onto the modes of **W**_{n} before being scaled by the singular values in **S**_{n}.
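The two steps above can be verified numerically. In this minimal sketch the matrices `W_r` and `W_n` are random stand-ins for the reference and *n*-th band measurement matrices, and the rotation is taken as **R** = **V**_{n}^{T}**V**_{r}, which is one way (assumed here) to satisfy **V**_{n}**R** = **V**_{r} when **V**_{n} is square and orthogonal.

```python
import numpy as np

rng = np.random.default_rng(1)
W_r = rng.uniform(0.1, 1.0, size=(4, 4))        # hypothetical reference matrix
W_n = W_r + 0.05 * rng.normal(size=(4, 4))      # hypothetical n-th band matrix

U_r, sv_r, Vt_r = np.linalg.svd(W_r)
U_n, sv_n, Vt_n = np.linalg.svd(W_n)

R = Vt_n @ Vt_r.T             # R = V_n^T V_r, so that V_n R = V_r
G_n = np.diag(sv_n) @ R       # rotated gain matrix; generally not diagonal

fs_n = np.array([1.0, 0.3, -0.2, 0.1])          # assumed filter-weighted Stokes vector
p_nr = (Vt_n.T @ R).T @ fs_n  # step 1: project onto the reference right singular vectors
g_n = G_n @ p_nr              # step 2: apply the rotated gains

print("G_n:\n", np.round(G_n, 3))
print("max |U_n g_n - W_n fs_n|:", np.max(np.abs(U_n @ g_n - W_n @ fs_n)))
```

The printed residual confirms that the rotated factorization reproduces the action of the *n*-th band measurement matrix, while the off-diagonal entries of `G_n` show the coupling between reference modes.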

Rather than rotate the inputs back onto the modes for **W**_{n}, however, we desire to express the signal from all bands in terms of the singular modes for **W**_{r}. This means we write the broadband gain-weighted signal **g**_{b} in modal space as the sum of all **p**_{nr} vectors, weighted by the *reference-wavelength* singular values:

$$\mathbf{g}_{b} = \mathbf{S}_{r}\sum_{n=1}^{N}\mathbf{p}_{nr} = \mathbf{S}_{r}\sum_{n=1}^{N}\left(\mathbf{V}_{n}\mathbf{R}_{n}^{(4)}\right)^{T} f_{n}\,\mathbf{s}_{n}. \tag{20}$$

This result leads directly to an expression for the *j*-th element of the bias vector **b**:

$$b_{j} = \left|\,\langle\hat{g}_{j}\rangle - g_{b,j}\,\right|, \tag{21}$$

where the first term is the ensemble average of Eq. (18) and the noise is zero-mean about the signal.

#### 3.4. The modal noise

The vector of detector noise variances in the modal or singular-value space is given by the diagonal elements of the modal covariance matrix **C**_{m}:

$$\mathbf{C}_{m} = \left\langle\left(\mathbf{U}^{T}\mathbf{n}\right)\left(\mathbf{U}^{T}\mathbf{n}\right)^{T}\right\rangle - \left\langle\mathbf{U}^{T}\mathbf{n}\right\rangle\left\langle\mathbf{U}^{T}\mathbf{n}\right\rangle^{T}, \tag{22}$$

where **n** is a *K*-element vector of noise realizations and 〈 * 〉 indicates ensemble expectation. The matrix **U**^{T} projects the noise vector back into modal space. In all polarimeter configurations we can think of, the noise processes will very likely be statistically independent between detectors (or pixels on *K* focal planes); however, the off-diagonal elements of **C**_{m} are not generally zero because of the correlation induced by **U**^{T}. The variances in the estimated mode coefficients themselves are represented by the diagonal elements, and the covariance between modes by the off-diagonal elements.

##### 3.4.1. Read noise

For zero-mean noise like CCD read noise, the second term in Eq. (22) is zero, so the diagonal elements of **C**_{m} are

$$\left[\mathbf{C}_{m}\right]_{jj} = \left\langle\left[\left(\mathbf{U}^{T}\mathbf{n}\right)_{j}\right]^{2}\right\rangle. \tag{23}$$

This leads to

$$\left[\mathbf{C}_{m}\right]_{jj} = \left\langle\left[U_{j1}^{T}n_{1}+U_{j2}^{T}n_{2}+\dots+U_{jK}^{T}n_{K}\right]\left[U_{j1}^{T}n_{1}+U_{j2}^{T}n_{2}+\dots+U_{jK}^{T}n_{K}\right]\right\rangle. \tag{24}$$

With statistically independent noise realizations at each detector, this simplifies to

$$\left[\mathbf{C}_{m}\right]_{jj} = \sum_{k=1}^{K}\left(U_{jk}^{T}\right)^{2}\sigma_{k}^{2}, \tag{25}$$

where *σ*^{2}_{k} is the variance in the *k*-th detector.

##### 3.4.2. Shot noise

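For shot noise, Eq. (22) can still be simplified to the form given by Eq. (25) as follows. The underlying random process is now Poisson, and the noise is not zero-mean. This means the diagonal elements of **C**_{m} are now

$$\left[\mathbf{C}_{m}\right]_{jj} = \left\langle\sum_{k=1}^{K}U_{jk}^{T}n_{k}\sum_{q=1}^{K}U_{jq}^{T}n_{q}\right\rangle - \left\langle\sum_{k=1}^{K}U_{jk}^{T}n_{k}\right\rangle\left\langle\sum_{m=1}^{K}U_{jm}^{T}n_{m}\right\rangle. \tag{26}$$

Poisson noise fluctuations remain statistically independent between detectors, so the cross-terms (with *k* ≠ *q*) in the first term above are canceled by the corresponding *k* ≠ *m* cross-terms in the second term. This leaves

$$\left[\mathbf{C}_{m}\right]_{jj} = \sum_{k=1}^{K}\left(U_{jk}^{T}\right)^{2}\left[\langle n_{k}^{2}\rangle - \langle n_{k}\rangle^{2}\right] = \sum_{k=1}^{K}\left(U_{jk}^{T}\right)^{2}\langle d_{k}\rangle, \tag{27}$$

where 〈*d*_{k}〉 = 〈(**W s**)_{k}〉 is the mean (true) value of the signal in units of photo-detection events (PDEs) at the *k*-th detector, assuming the randomness in the source itself is not significant. For broadband data, again modeled as *N* narrow bands, we have

$$\left[\mathbf{C}_{m}\right]_{jj} = \sum_{k=1}^{K}\left(U_{jk}^{T}\right)^{2}\sum_{n=1}^{N} f_{n}\left(\mathbf{W}_{n}\mathbf{s}_{n}\right)_{k}. \tag{28}$$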

If *s*_{0} is a random variable (due either to source fluctuations or fluctuations in the propagation medium), this extra noise source must be included. This scenario can be handled by treating the measured energy as a doubly-stochastic random variable, but we don’t need to include this treatment to understand the effects of varying the filter bandwidth and plan to discuss this effect in further work.
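The read-noise and shot-noise results for independent detectors can be checked against a Monte Carlo simulation. The sketch below is illustrative: the matrix `W`, the Stokes vector, the read-noise variance, and the helper name `modal_variance` are assumptions, and the source itself is taken to be non-random.

```python
import numpy as np

def modal_variance(U, read_var, mean_counts):
    """Diagonal of the modal covariance for statistically independent detectors.

    U           : K x 4 matrix of left singular vectors
    read_var    : length-K read-noise variances
    mean_counts : length-K mean PDE counts (Poisson variance equals the mean)
    """
    return (U.T ** 2) @ (read_var + mean_counts)

# Hypothetical measurement matrix and source, for illustration only.
a = 1.0 / np.sqrt(3.0)
W = 0.25 * np.array([
    [1.0,  a,  a,  a],
    [1.0,  a, -a, -a],
    [1.0, -a,  a, -a],
    [1.0, -a, -a,  a],
])
U, sv, Vt = np.linalg.svd(W, full_matrices=False)
mean_counts = W @ np.array([1000.0, 200.0, -100.0, 50.0])
read_var = np.full(4, 25.0)                     # assumed read-noise variance per detector

predicted = modal_variance(U, read_var, mean_counts)

# Monte Carlo: Poisson shot noise plus Gaussian read noise at each detector.
rng = np.random.default_rng(2)
trials = 200_000
d = rng.poisson(mean_counts, size=(trials, 4)) \
    + rng.normal(0.0, np.sqrt(read_var), size=(trials, 4))
empirical = np.var(d @ U, axis=0)               # each row of d @ U is U^T d for one trial

print("predicted modal variances:", predicted)
print("empirical modal variances:", empirical)
```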

#### 3.5. Modal SRR expression

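Combining Eqs. (18), (25), and (28), we obtain the expression for the SRR of the estimated coefficient for the *j*-th (of 4 possible) mode of **W**:

$$\mathrm{SRR}_{j} = \frac{\langle\hat{g}_{j}\rangle}{\sqrt{\sum_{k=1}^{K}\left(U_{jk}^{T}\right)^{2}\left[\sigma_{k}^{2} + \sum_{n=1}^{N} f_{n}\left(\mathbf{W}_{n}\mathbf{s}_{n}\right)_{k}\right] + b_{j}^{2}}}, \tag{29}$$

where **U**^{T} is evaluated at the reference wavelength and *b*_{j} is the modal bias of Section 3.3. Equivalently, we say that Eq. (29) gives the SRR for the *j*-th coefficient of **ĝ**. Note this expression would be evaluated for all 4 modes to obtain the SRR for the full singular value spectrum of **W**. Again, *N* is the number of spectral bands. The read noise, independent of wavelength, goes equally into all spectral bins. The variances in the denominator account for shot noise and read noise fluctuations in the selected mode of the reconstruction matrix.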
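To see how the modal SRR responds to bandwidth, the pieces developed in this section can be assembled into a short calculation. The following is a sketch under stated assumptions, not the code used for the results in this paper: the wavelength-dependent matrix `toy_W` is a hypothetical linear drift about a reference matrix, `fs_list` holds the filter-weighted Stokes vectors of the sub-bands, and the SRR is taken as the magnitude of the mean modal coefficient over the root of variance plus squared bias.

```python
import numpy as np

def toy_W(wavelength_nm, ref_nm=633.0):
    """Hypothetical wavelength-dependent measurement matrix (illustrative only)."""
    a = 1.0 / np.sqrt(3.0)
    W_ref = 0.25 * np.array([
        [1.0,  a,  a,  a],
        [1.0,  a, -a, -a],
        [1.0, -a,  a, -a],
        [1.0, -a, -a,  a],
    ])
    drift = np.array([
        [0.0,  1.0, -1.0,  0.5],
        [0.0, -1.0,  0.5,  1.0],
        [0.0,  0.5,  1.0, -1.0],
        [0.0, -0.5, -0.5, -0.5],
    ])
    return W_ref + 0.25 * (wavelength_nm - ref_nm) / ref_nm * drift

def modal_srr(W_list, W_ref, fs_list, read_var):
    """Modal SRR for sub-band matrices W_list and filter-weighted inputs fs_list."""
    U_r, sv_r, Vt_r = np.linalg.svd(W_ref, full_matrices=False)
    mean_d = sum(W_n @ fs for W_n, fs in zip(W_list, fs_list))  # mean detector counts
    signal = U_r.T @ mean_d                    # ensemble mean of the modal coefficients
    truth = sv_r * (Vt_r @ sum(fs_list))       # input expressed on the reference modes
    bias = np.abs(signal - truth)
    variance = (U_r.T ** 2) @ (read_var + mean_d)   # read noise plus shot noise
    # The absolute value removes the sign ambiguity of the singular vectors.
    return np.abs(signal) / np.sqrt(variance + bias ** 2)

W_ref = toy_W(633.0)
centers = 633.0 + 10.0 * np.arange(-3, 4)      # seven assumed 10-nm sub-bands
s_band = np.array([100.0, 20.0, -10.0, 5.0])   # assumed PDEs per sub-band, constant polarization
read_var = np.full(4, 25.0)                    # assumed read-noise variance per detector

for half_width in range(4):
    selected = centers[3 - half_width: 4 + half_width]
    W_list = [toy_W(c) for c in selected]
    fs_list = [s_band] * len(selected)
    print(10 * len(selected), "nm:", np.round(modal_srr(W_list, W_ref, fs_list, read_var), 2))
```

Scaling `s_band` up or down mimics brighter or dimmer sources, which changes the balance between the variance and bias terms in the denominator as the band is widened.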

#### 3.6. Calculating the rotation matrices

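The rotation matrix for the *n*-th band is calculated by solving

$$\mathbf{R}_{n}^{(4)T}\,\mathbf{V}_{n}^{T} = \mathbf{V}_{r}^{T}. \tag{30}$$

Transposing both sides of the above relation and using the identity (**A B**)^{T} = **B**^{T}**A**^{T} yields

$$\mathbf{V}_{n}\,\mathbf{R}_{n}^{(4)} = \mathbf{V}_{r}, \quad\text{so that}\quad \mathbf{R}_{n}^{(4)} = \mathbf{V}_{n}^{T}\,\mathbf{V}_{r}. \tag{31}$$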

#### 3.7. Noise in the estimated Stokes vector

To see the effect of modal filtering on Stokes estimation, we will find it useful in subsequent work to evaluate the noise in the estimated Stokes parameters and compare it to the bias introduced by increasing the spectral bandwidth. To do this, we calculate the Stokes parameter noise in terms of the modal covariance matrix **C**_{mr}, where **C**_{mr} is evaluated using the reference-wavelength matrix **U**^{T}_{r} and the noise contributions from both read noise and shot noise from incident light summed over all *N* bands. Following the arguments presented in the preceding noise discussions, the modal covariance matrix elements are seen to be

where *σ*^{2}_{k(total)} is the noise at the *k*-th detector summed over all bands from both shot and read noise sources:

Using Eq. (22) with the reference-wavelength synthesis matrix **W**^{−1}_{r} in its factored form **V**_{r}**S**^{−1}**U**^{T}_{r} and again using the identity (**A B**)^{T} = **B**^{T}**A**^{T}, the covariance matrix for the estimated Stokes parameters is:

The diagonal elements of **C**_{S} give the variance associated with the four estimated Stokes parameters. This result can be easily generalized to include the effect of a modal filter:
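As a rough numerical illustration of the covariance propagation described in this subsection, the following sketch builds a modal covariance matrix from per-detector noise variances and maps it to the Stokes domain through the factored inverse **V**_{r}**S**^{−1}**U**^{T}_{r}. It is a sketch under stated assumptions, not the authors' code. The function name `stokes_covariance`, the matrix `W_ref`, and the variance values are hypothetical, and the modal filter is assumed to be a diagonal weighting applied to the modal coefficients before the **S**^{−1} scaling.

```python
import numpy as np

def stokes_covariance(W_ref, sigma2_total, mode_filter=None):
    """Return (C_mr, C_S) for independent detector noise.

    W_ref        : (4, 4) reference-wavelength measurement matrix.
    sigma2_total : (4,) total (shot + read) noise variance per detector.
    mode_filter  : optional (4,) diagonal filter weights (an assumption here).
    """
    U, sv, Vt = np.linalg.svd(W_ref)
    # Modal covariance: (C_mr)_ij = sum_k U_ki U_kj sigma2_k
    C_mr = U.T @ np.diag(sigma2_total) @ U
    F = np.eye(len(sv)) if mode_filter is None else np.diag(mode_filter)
    # Map from modal coefficients to Stokes estimates: V S^-1 F
    A = Vt.T @ np.diag(1.0 / sv) @ F
    return C_mr, A @ C_mr @ A.T

# Hypothetical example values (not taken from the paper).
W_ref = 0.25 * np.array([[1, 1, 0, 0],
                         [1, -1, 0, 0],
                         [1, 0, 1, 0],
                         [1, 0, 0, 1]], dtype=float)
sigma2 = np.array([400.0, 100.0, 325.0, 275.0]) + 25.0   # shot + read variance

C_mr, C_S = stokes_covariance(W_ref, sigma2)
_, C_S_filtered = stokes_covariance(W_ref, sigma2, mode_filter=[1, 1, 1, 0])

print(np.diag(C_S))            # variances of the four Stokes estimates
print(np.diag(C_S_filtered))   # same, with the weakest mode suppressed
```

With 0/1 filter weights the total Stokes variance (the trace of the covariance matrix) cannot increase, which is the variance side of the trade-off. The bias introduced by discarding a mode is not modeled in this sketch.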

## 4. Example SRR calculations

#### 4.1. Numerical codes

To demonstrate the competing effects of bias and variance as the spectral filter width is varied, we plot the modal SRR against filter width for four values of source brightness (or four different exposure times). To evaluate the expressions derived here, we wrote computer codes to a.) calculate the required **W** matrices at 24 wavelengths over a 120-nm band; b.) calculate the SVD factors and rotation matrices; and c.) calculate the SRR. The matrices are assumed to represent the (approximately constant) response of the polarimeter over the various 5-nm bands.

To calculate the required measurement matrices at various wavelengths, we wrote a simulation of a DoAm polarimeter similar to one described in Phenis, et al. [16]. In addition to the elements for each **W** matrix, the code also evaluates the Sabatke, et al. [6] metric as a function of the beam-splitter assembly (BSA) [16] parameters. For the reference wavelength of 633 nm, optimal values for the beam splitter S- and P-polarization transmission/reflection ratios and the fast axes and retardances for both wave-plates (6 parameters) were chosen by varying the value of one parameter at a time (“line search”) to minimize the Sabatke, et al. metric. The non-reference matrices were then calculated by varying the wavelength while holding the BSA parameter values constant. For a detailed description of the optimization procedure, see Mudge, et al. [20].

The SRR code loops through the 24 sub-bands, modeling the distribution of light among the detectors using the appropriate **W** matrix. As the simulated bandwidth is increased, the cumulative signal is calculated using **U**^{T}_{r} as in Eq. (18), and the cumulative noise is calculated using Eqs. (25) and (28). The matrices **V**^{T}_{n} and **R**^{T}_{n} are used in Eq. (21) to calculate the cumulative bias. Once the loop is finished, the modal and Stokes covariance matrices are calculated using Eqs. (32) and (34).
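The structure of such a loop can be sketched in a few lines of NumPy. The code below is a toy model, not the code used to generate our figures, and it makes several simplifying assumptions that should be read as such: the matrices `W_ref` and `D` and the linear drift rate are invented for illustration; sub-bands are added one at a time starting from the reference band; the modal bias is taken directly as the difference between the cumulative modal coefficients and the values the reference matrix alone would give, rather than through the rotation matrices of Eq. (21); and the SRR is taken as signal divided by the square root of variance plus squared bias.

```python
import numpy as np

def modal_srr(W_bands, W_ref, s, read_var=0.0):
    """Cumulative modal SRR as sub-bands are added, reference band first.

    W_bands  : (N, 4, 4) measurement matrices in the order bands are added.
    W_ref    : (4, 4) reference-wavelength matrix used for reconstruction.
    s        : (4,) Stokes vector in PDEs per sub-band (constant over bands).
    read_var : read-noise variance per detector per sub-band.
    Returns an (N, 4) array; row n holds the SRR of each mode for n+1 bands.
    """
    U, _, _ = np.linalg.svd(W_ref)
    n = np.arange(1, len(W_bands) + 1)[:, None]   # number of bands included
    d_cum = np.cumsum(W_bands @ s, axis=0)        # cumulative mean detector counts
    g_meas = d_cum @ U                            # U_r^T applied to cumulative data
    g_ideal = n * (U.T @ W_ref @ s)               # coefficients if W did not vary
    bias = g_meas - g_ideal
    var = (d_cum + n * read_var) @ U**2           # shot + read noise on each mode
    return np.abs(g_ideal) / np.sqrt(var + bias**2)

N = 24                                            # number of 5-nm sub-bands
W_ref = 0.25 * np.array([[1, 1, 0, 0], [1, -1, 0, 0],
                         [1, 0, 1, 0], [1, 0, 0, 1]], dtype=float)
D = 0.25 * np.array([[0, 1, -1, 0], [0, -1, 0, 1],
                     [0, 0, 1, -1], [0, 1, 0, -1]], dtype=float)
drift = 0.02 * np.arange(N)[:, None, None]        # hypothetical linear drift
W_drift = W_ref + drift * D                       # (N, 4, 4) drifting matrices
W_flat = np.broadcast_to(W_ref, (N, 4, 4))        # perfectly achromatic case
s_unit = np.array([1.0, 0.6, 0.3, 0.1])

srr_flat = modal_srr(W_flat, W_ref, 1e3 * s_unit)
srr_bright = modal_srr(W_drift, W_ref, 1e6 * s_unit)
srr_faint = modal_srr(W_drift, W_ref, 1e2 * s_unit, read_var=625.0)

print(np.argmax(srr_bright, axis=0))   # band index of the SRR peak, per mode
print(np.argmax(srr_faint, axis=0))
```

In the achromatic case the bias vanishes and, with no read noise, the SRR grows as the square root of the number of bands. With drift and a bright source, the squared bias eventually dominates the variance and at least one mode peaks before the full bandwidth is reached.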

#### 4.2. SRR results

Here, we evaluate the modal SRR (the SRR for the coefficients of **ĝ**) vs. spectral filter width for the case of the DoAm polarimeter described by Phenis, et al. [16]. The reference wavelength is again 633 nm. For all but one of our examples, the polarization of and energy from the modeled source are both constant for each sub-band of the full 120-nm spectrum, with the Stokes vector of the incident light proportional to

Referring to Sec. 3.1, one can verify that for the reference wavelength, this input Stokes vector couples equal energy to three of the four right singular vectors of the measurement matrix.

We plot the SRR only for three of the modal coefficients (denoted as modes 2, 3, and 4) noting that to a very good approximation, it’s generally true that the “first” coefficient (associated with the largest singular value) couples only to *s*_{0}, the total intensity. One may confirm this by inspecting the top rows of the matrices in Sec. 3.1. Estimation of total intensity isn’t biased by the wavelength dependence of **W**, so the SRR of “mode 1” simply increases monotonically with filter width. We note that high-SRR estimation of *s*_{0} can be used to constrain estimation of the other Stokes parameters, but there is otherwise little interesting in the behavior of the mode 1 SRR.

The first scenario is one for which the source is bright enough or the exposure time long enough to record 1 × 10^{5} photo-detection events (PDEs) over the 5-nm reference band. The results are shown in Fig. 4. Note the SRR maxima occur at filter widths of 20 and 15 nm for modes 2 and 3, respectively. The SRR maxima are relatively narrow, as well. The SRR for the mode 4 coefficient decreases monotonically with filter width. This behavior, with the optimum width decreasing as the mode index increases, implies that for a relatively large number of PDEs, the effect of bias on the SRR is larger for increasing mode index; *viz.* for modes corresponding to smaller singular values. We argue that this can be understood by considering Eqs. (17), (18), and (28). We observe that if a large number of PDEs are measured relative to read noise, a smaller singular value means both a lower SRR and a lower variance on the corresponding mode. Further, the signal and variance both increase more slowly for modes with smaller singular values than for those with larger singular values. These facts imply that for this high-PDE scenario, the effect of bias becomes more significant as the mode index increases, consistent with our results. The same effect can be observed in Fig. 5, where the per-band PDE count is still very large (5 × 10^{4}), but has been reduced enough that the mode 4 SRR increases with filter width before finding a maximum at 10 nm.

In Fig. 6, the reference-band PDE count has been reduced to 1000. Comparing these plots to those in Fig. 5, one observes the SRR maxima have decreased and the optimum filter widths have increased for all modes as the PDE count decreases. One also observes that the SRR curves for the mode 3 and mode 4 coefficients cross over. For 5 × 10^{4} PDEs, the crossover occurs at a width of about 102 nm; for 1000 PDEs, about 107 nm. For contrast, we show in Fig. 7 the SRR plots for the same per-sub-band PDE count but with all light incident unpolarized for wavelengths outside the small band centered at 633 nm. Here, large bias on all modes, caused by the abrupt and marked polarization difference between light inside and outside the narrow reference band, causes the SRR to rapidly decrease with increasing filter width.

Finally, the SRR curves are plotted for a per-band PDE count of 100 and an rms read noise of 25 electrons in Fig. 8. Here, SRR curves for the mode 2 and mode 4 coefficients appear to have very broad maxima at 110 nm and 90 nm, respectively. Interestingly, the mode 3 SRR curve has a somewhat sharper maximum at a filter width of 80 nm (increased from 50 nm for the 1000 PDE case); however, all three curves continue the trend of having broader maxima at larger filter widths as the number of PDEs in the measurement decreases.

## 5. Applications of the SRR metric

These results imply that *a priori* hypotheses (such as the size of the band over which the source polarization might reasonably be slowly-varying) and knowledge (such as the brightness of the source) can play a significant role in selecting DoAm polarimeter operating parameters for optimal Stokes estimation. For example, if the source being studied is known to be polarized over only a very narrow band, our techniques, along with knowledge of instrument transmission and noise characteristics, can be used to establish the exposure time required to achieve a required degree of estimation fidelity in terms of the SRR and how much increased bandwidth can be traded against a shorter exposure time. Also, the expected signal, noise, and bias can all be evaluated vs. spectral filter width and exposure time using the SRR expressions. Further, optical components can be selected for the polarimeter design based on how they affect the SRR as a function of spectral width; for example, we mentioned previously that using these techniques, one could specify the bandwidth over which the response of retarders and polarizing beam splitters must be relatively invariant to achieve a required estimation fidelity. Finally, our approach allows one to make optimal use of wavelength- and filter-width-selective optics when observing different sources with a variety of brightness and polarization properties. Examples of such components range from a simple filter wheel to continuously-tunable filters, like stacked Fabry-Perot or Michelson interferometers.

Our results, showing SRR maxima at different filter widths for the three matrix modes, lead rather naturally to the questions of estimation metrics and of modal filtering. One can readily envision techniques to exploit the modal SRR representation. For example, given that the three SRR maxima generally occur for different spectral filter widths, it’s legitimate to ask which SRR should be maximized by our choice of spectral width. To answer that question, we can examine the columns of the relevant **V** matrix to understand the coupling of modal coefficients to Stokes parameters. One way of quantifying this coupling is by calculating the kurtosis of the columns. Kurtosis is a measure of spread and “peakedness” in a distribution relative to that of a Gaussian distribution; here, increasingly negative kurtosis (“platykurtic,” or flat relative to a Gaussian) indicates higher coupling between the singular vector and the various Stokes parameters. In Sec. 3.1, we showed a **V** matrix obtained by factoring the average, over 85 nm, of the reconstruction matrices calculated for the Phenis, et al., polarimeter. Calculating the kurtosis for the second, third, and fourth columns yields -1.696, -1.999, and -2.01, respectively (note all three kurtosis values were calculated using the absolute values of the column elements). Given the SRR plots shown in Fig. 6, then, one might recognize the significance of the larger-magnitude (more negative) column kurtosis of modes 3 and 4 by selecting a filter width of, say, 40 nm; this is mid-way between the mode 4 maximum at 35 nm and the mode 3 maximum at 45 nm and well short of the mode 2 maximum at 55 nm.
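The column-kurtosis calculation can be sketched as follows. The estimator used to obtain the values quoted above is not specified here, so this example assumes the simple moment-based excess kurtosis (fourth central moment over squared variance, minus 3) applied to the absolute values of each column. The helper name `column_kurtosis` and the two-column matrix `V_demo` are hypothetical and chosen only to show the trend.

```python
import numpy as np

def column_kurtosis(V):
    """Moment-based excess kurtosis of |V|, computed column by column."""
    a = np.abs(V)
    dev = a - a.mean(axis=0)
    m2 = (dev**2).mean(axis=0)          # second central moment
    m4 = (dev**4).mean(axis=0)          # fourth central moment
    return m4 / m2**2 - 3.0

# Hypothetical singular-vector columns (not the V matrix of Sec. 3.1):
# column 0 couples to a single Stokes parameter,
# column 1 couples equally to two Stokes parameters.
V_demo = np.array([[0.0, 0.0],
                   [0.0, 0.0],
                   [0.0, 1.0 / np.sqrt(2.0)],
                   [1.0, 1.0 / np.sqrt(2.0)]])

k = column_kurtosis(V_demo)
print(k)
```

Under this estimator the single-parameter column gives −2/3 and the evenly shared column gives −2, consistent with the reading that more negative kurtosis indicates broader coupling. A column whose absolute values are all equal has zero variance, so this estimator is undefined for it.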

Other applications are also possible. For example, Goudail and Bénière [21] propose an active imaging scheme where backscattered laser illumination is analyzed with polarizing optics prior to image formation. The polarization states of the illuminator and analyzer are chosen to optimize the contrast between regions of the target characterized by different Mueller matrices. Rather than use a single, optimal analyzer state for data collection, one could use a complete polarimeter and filter the associated matrix modes to optimize the contrast; it is reasonable to suppose that an optimally-weighted, linear combination of polarimeter modes would yield higher contrast, and therefore more effective image segmentation, than measurements using any single analyzer state.

Finally, we note that Tyo, et al. (2010) [22] discuss optimization of a partial Mueller polarimeter (PMP) for detection, discrimination, classification, or identification tasks. Using *N* polarizer/analyzer states to probe an unknown Mueller matrix yields a reconstruction matrix to invert the measured data. For cases where only partial knowledge of the Mueller matrix is needed, or where temporal evolution of the scene or other factors limit the number of measurements, Tyo, et al. (2010) define “scene” and “sensor” vector spaces as well as metrics to quantify the distance between these spaces. These developments allow one to optimally select PMP components or some number of measurements less than the 16 required to estimate the full Mueller matrix. As an additional practical metric, Tyo, et al. (2010) present an SNR analysis rather similar to our own modal SRR derivations, defining a signal-to-noise for the scene space basis vectors according to their geometric relation to the sensor space and induced noise correlations. As with our approach, the Tyo, et al. (2010) approach allows post-detection filtering to further optimize reconstruction processing or develop an *a priori* experiment design.

## 6. Summary and conclusions

We have developed a technique to quantitatively assess the competing effects of noise and deterministic error, or bias, as bandwidth is varied. As a metric, we used the signal-to-rms error (SRR) associated with estimating coefficients for the projection of the Stokes vectors onto the right singular vectors of the measurement matrix. To facilitate our treatment, we developed “broad band” expressions for the signal, bias, and noise in the singular vector space of a DoAm polarimeter. We then presented modal SRR plots for a DoAm polarimeter similar to one built at the Lockheed Martin Advanced Technology Center [16, 20].

Taken together, the SRR results indicate that for a high-DoP source with polarization varying slowly over a relatively wide spectral band (∆*λ*/*λ* ≥ 0.05), the SRR plots exhibit a 3-region (“high,” “low,” and “nominal” PDE counts) behavior:

- For PDE counts in each small band large enough to yield relatively high SNR, the bias resulting from increasing the spectral filter bandwidth begins to degrade SRR on modes 2, 3, and 4 at very small filter widths. The difference in optimum filter width for the three modes is relatively small.
- For relatively small per-band PDE counts, the reduction in variance for larger filter widths means the SRR on modes 2, 3, and 4 continues to increase even for relatively large widths.
- There is a region of “nominal” PDE counts giving well-defined maxima on modes 2, 3, and 4.

These results can be used to specify optical filter widths or exposure times given hypotheses or prior knowledge regarding the source brightness or expected width of the band over which the source polarization is slowly-varying. The behavior of SRR plots will of course vary not only with brightness, but also with the distribution of polarization with wavelength, the bandwidth over which the source has significant brightness, and detector noise and quantum efficiency properties. Finally, we argue that calculating the SRRs for the measurement matrix “modal” coefficients, rather than for the estimated Stokes parameters, allows development of reconstruction filters to minimize Stokes estimation error.

We draw the reader’s attention to the fact that the techniques presented here could be used to develop optimal-estimation inverse filter matrices and study the effect of bandwidth on almost any type of polarimeter (e.g., division-of-time, division-of-aperture) as long as the operation of the instrument can be modeled as a linear system for relatively narrow-band light. Further, these concepts can be extended to any linear system where the matrix elements are a function of some relevant parameter. For example, the point-spread function (PSF) of any electro-optical imaging system will vary with wavelength due to diffraction, dispersion, and wavelength-dependent aberrations. For any particular wavelength, the PSF can be described by a matrix (see, e.g., Nagy, et al. [23]). The elements of the PSF matrix would depend, as with our polarimeter, on wavelength. If an observer desired to deconvolve the PSF corresponding to a single wavelength from an image acquired using a relatively wide spectral filter, our technique would enable bias and variance to be calculated as a function of filter bandwidth; an optimal filter matrix **F** can then be designed using the associated SRRs.

## Acknowledgments

We both gratefully acknowledge Internal Research & Development funding from the Lockheed Martin Space Systems Company Advanced Technology Center. DWT also warmly acknowledges clarifying conversations with Robert Plemmons, Reynolds Professor of Mathematics and Computer Science at Wake Forest University; and Sudhakar Prasad, Professor of Physics and Astronomy at the University of New Mexico. Bob was also gracious enough to review an early draft of the manuscript and offer several helpful suggestions. Finally, we are both pleased to acknowledge the contributions of our anonymous referees, from whose contributions this paper benefited significantly.

## References and links

**1. **R. M. A. Azzam, “Division-of-amplitude photo-polarimeter for the simultaneous measurement of all four Stokes parameters of light,” J. Mod. Opt. **29**, 685–689 (1982).

**2. **R. A. Chipman, “Data reduction for light-measuring polarimeters,” in *Handbook of Optics*, 3rd ed., Vol. 1, Ch. 15, Sec. 20, McGraw-Hill (1995).

**3. **A. Ambirajan and D. C. Look, “Optimum angles for a polarimeter: Part I,” Opt. Engr. **34**, 1651–1655 (1995).

**4. **A. Ambirajan and D. C. Look, “Optimum angles for a polarimeter: Part II,” Opt. Engr. **34**, 1656–1658 (1995).

**5. **D. S. Sabatke, A. M. Locke, M. R. Descour, W. C. Sweatt, J. P. Garcia, E. L. Dereniak, S. A. Kemme, and G. S. Phipps, “Figures of merit for complete Stokes polarimeter optimization,” in *Polarization Analysis, Measurement, and Remote Sensing III*, D. B. Chenault, M. J. Duggin, W. G. Egan, and D. H. Goldstein, eds., Proc. SPIE **4133**, 75–81 (2000).

**6. **D. S. Sabatke, M. R. Descour, E. L. Dereniak, W. C. Sweatt, S. A. Kemme, and G. S. Phipps, “Optimization of retardance for a complete Stokes polarimeter,” Opt. Lett. **25**, 802–804 (2000).

**7. **J. S. Tyo, “Noise equalization in Stokes parameter images obtained by use of variable-retardance polarimeters,” Opt. Lett. **25**, 1198–1200 (2000).

**8. **J. S. Tyo, “Design of optimal polarimeters: Maximization of signal-to-noise and minimization of deterministic error,” Appl. Opt. **41**, 619–630 (2002).

**9. **V. L. Gamiz and J. F. Belsher, “Performance limitations of a four-channel polarimeter in the presence of detection noise,” Opt. Engr. **41**, 973–980 (2002).

**10. **M. R. Foreman, C. M. Romero, and P. Török, “A priori information and optimization in polarimetry,” Opt. Express **16**, 15212–15226 (2008). http://www.opticsexpress.org/abstract.cfm?URI=oe-16-19-15212

**11. **E. Wolf, “Unified theory of polarization and coherence,” in *Introduction to the Theory of Coherence and Polarization of Light* (Cambridge University Press, 2007), pp. 174–201.

**12. **J. S. Tyo, D. L. Goldstein, D. B. Chenault, and J. A. Shaw, “Review of passive imaging polarimetry for remote sensing applications,” Appl. Opt. **45**(22), 5453–5469 (2006).

**13. **J. Boger, D. Bowers, M. P. Fetrow, and K. Bishop, “Issues in a broadband 4-channel reduced Stokes polarimeter,” in *Polarization Analysis, Measurement, and Remote Sensing IV*, D. H. Goldstein, D. B. Chenault, W. G. Egan, and M. J. Duggin, eds., Proc. SPIE **4481**, 311–321 (2002).

**14. **G. H. Golub and C. F. Van Loan, “Orthogonality and the SVD,” in *Matrix Computations*, 3rd ed., Johns Hopkins University Press, p. 70 (1996).

**15. **D. J. Kadrmas, E. C. Frey, and B. M. W. Tsui, “An SVD investigation of modeling scatter in multiple energy windows for improved SPECT images,” IEEE Trans. Nuc. Sci. **43**, 2275–2284 (1996).

**16. **A. M. Phenis, M. Virgen, and E. de Leon, “Achromatic instantaneous Stokes imaging polarimeter,” in *Novel Optical Systems Design and Optimization VIII*, J. Sasian, R. Koshen, and R. Juergen, eds., Proc. SPIE **5875**, 587502-1–587502-8 (2005).

**17. **E. de Leon, R. Brandt, A. Phenis, and M. Virgen, “Initial results of a simultaneous Stokes imaging polarimeter,” in *Polarization Science and Remote Sensing III*, J. Shaw and J. S. Tyo, eds., Proc. SPIE **6682**, 668215–668215-9 (2007).

**18. **M. C. Roggemann, D. W. Tyler, and M. F. Bilmont, “Linear reconstruction of compensated images: Theory and experimental results,” Appl. Opt. **31**, 7429–7441 (1992).

**19. **P. C. Hansen, “The smoothing property of the kernel,” in *Rank-Deficient and Discrete Ill-Posed Problems*, SIAM Press, Philadelphia, p. 8 (1998).

**20. **J. D. Mudge, M. A. Virgen, and P. Dean, “Near-infrared simultaneous Stokes imaging polarimeter,” in *Polarization Science and Remote Sensing IV*, J. A. Shaw and J. S. Tyo, eds., Proc. SPIE **7461**, 74610L (2009).

**21. **F. Goudail and A. Bénière, “Optimization of the contrast in polarimetric scalar images,” Opt. Lett. **34**(9), 1471–1473 (2009).

**22. **J. S. Tyo, Z. Wang, S. J. Johnson, and B. G. Hoover, “Design and optimization of partial Mueller polarimeters,” Appl. Opt. **49**, 2326–2333 (2010).

**23. **J. G. Nagy, R. J. Plemmons, and T. C. Torgersen, “Iterative image restoration using approximate inverse preconditioning,” IEEE Trans. Image Proc. **5**, 1151–1162 (1996).