Neural network calibration of a snapshot birefringent Fourier transform spectrometer with periodic phase errors

David Luo; Michael W. Kudenov

doi:10.1364/OE.24.011266

1. Introduction

Phase error correction has been described by Mertz [1] and Forman [2], and their two techniques are the most popular methods of correcting phase errors in the field of Fourier transform spectroscopy. In the Mertz method, phase correction is performed in the frequency domain, while the Forman method applies the phase correction in the spatial domain. A combination of the two methods, using an iterative FFT shift theorem, has also been described [3]. In a typical interferometer, phase errors can contain both systematic and random contributions. Systematic phase errors are generally reproducible and occur predictably, e.g., phase errors caused by dispersion within a Michelson interferometer [4]. Meanwhile, random phase errors are manifested by noise and can include random variations in sampling error in, for example, a step-scan Fourier transform infrared spectrometer (FT-IR) [5]. Failure to apply a proper phase correction algorithm to an asymmetric interferogram leads to increased error in the reconstructed spectra. For spatial heterodyne spectrometers without a moving mirror or scanning requirements [6], systematic phase errors are caused by refractive index dispersion or lens distortion. Correcting these effects can be accomplished by measuring a monochromatic interferogram, which can be used to correct later measured polychromatic interferograms, assuming that the phase error is frequency independent [7]. For frequency dependent phase error, the phase correction procedure is applied in the spectral domain by multiplying the Fourier transformed interferogram with pre-calculated phase errors obtained from a multiline source [8]. However, application of these phase correction techniques creates additional computational overhead. For instance, the Mertz phase correction technique often requires several forward and inverse fast Fourier transforms (FFTs), in addition to interpolation. Performing this procedure on each interferogram from a high resolution imaging Fourier transform spectrometer can be computationally overwhelming for high throughput measurements [9].

In this paper, we present a method that processes many parallel interferograms with greater speed while simultaneously accounting for phase errors. Neural networks (NNs) are well known for their ability to identify statistical significance as applied to pattern recognition [10]. However, an additional and less studied application of NNs is their use in providing a method to empirically calibrate various sensors and systems to overcome systematic errors [11–13]. In the case of sensor calibration, where the sampling rate is fixed and the phase errors are systematic and reproducible, NNs can offer great advantages for realizing real-time operation. Such is the case with birefringent interferometers. In Section 2, we describe the theoretical model and the nature of the phase errors. Section 3 presents our proof-of-concept experimental configuration. Section 4 describes the calibration methods that were studied to reconstruct measured data. Lastly, Sections 5 and 6 compare the results of the proposed NN-based spectral calibration against both linear and Fourier transform-based approaches.

2. Theoretical model of systematic phase errors

Random phase errors, caused by variations in scanning and sampling, have been described in previous work [14]. However, in a birefringent interferometer, phase errors of this nature are uncommon since the optical path difference is generally fixed with respect to the detector (i.e., there is no scanning, nor are there any moving parts). In our previously described birefringent interferometer, it is often the case that many interferogram segments are collected by multiple apertures or across multiple regions of a focal plane array (FPA) [9, 15]. In the case of our SHIFT spectrometer [9], the phase error $ϕ_{s} (x)$ is step-like and occurs primarily when transitioning from column to column. In such a system, a discrete superposition of monochromatic interference patterns $I (x)$ can be described by

$$I (x) = \frac{1}{2} \sum_{m = 1}^{M} [1 + \cos (2 π \, \mathrm{OPD} (x) \, σ_{m} + ϕ_{s} (x))], \tag{1}$$

where $\mathrm{OPD} (x)$ is the optical path difference, $σ_{m}$ is the wavenumber ($σ_{m} = 1 / λ_{m}$), and $M$ is the total number of sampling points. The step-like phase error can be expressed as a square wave such that

$$ϕ_{s} (x) = \frac{d N}{2} \{1 + \mathrm{sgn} [\sin (π x / P)]\}, \tag{2}$$

where $d N$ is the peak phase error, in radians, and $P$ is half the period of the square wave. This type of phase error generates a reconstruction artifact that appears as a periodic error after Fourier transformation. Simulated white light spectra, calculated for wavelengths spanning 450 nm to 700 nm, demonstrate the phase error’s effect on the reconstructed spectra when an FFT is used. In this case, spectra are calculated from the double-sided interferograms [16] as

$$I (σ) = | \mathrm{FFT} (I (x)) |, \tag{3}$$

where $\mathrm{FFT}$ represents the numerical fast Fourier transformation. The interferograms affected by three different step-like phase functions, with $d N$ equal to 0π, π/2, and π, are shown in Fig. 1(a)-1(c), respectively.

Fig. 1 The interferogram ($I$) contaminated with three different phase step ($ϕ_{s}$) functions of $d N$ at (a) 0π, (b) π/2, and (c) π.
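The model above can be sketched numerically. The following is a minimal illustration, not the authors' simulation code: the pixel count, OPD range, number of spectral samples, and the value of the half period are assumed values chosen only to make the example run.

```python
import numpy as np

def square_wave_phase(x, d_n, half_period):
    """Step-like phase error toggling between 0 and d_n (square wave)."""
    return 0.5 * d_n * (1.0 + np.sign(np.sin(np.pi * x / half_period)))

def interferogram(opd, x, wavenumbers, d_n, half_period):
    """Sum of monochromatic fringes that share one phase error."""
    phase = square_wave_phase(x, d_n, half_period)
    arg = 2.0 * np.pi * np.outer(opd, wavenumbers) + phase[:, None]
    return 0.5 * np.sum(1.0 + np.cos(arg), axis=1)

# Assumed sampling: 512 pixels, double-sided OPD of +/-10 um (in cm).
n_pix = 512
x = np.arange(n_pix) + 0.5                      # pixel centres, never on a step edge
opd = np.linspace(-10e-4, 10e-4, n_pix)
sigma = 1.0 / np.linspace(450e-7, 700e-7, 64)   # 450-700 nm, in cm^-1

ideal = interferogram(opd, x, sigma, 0.0, half_period=16)
cont = interferogram(opd, x, sigma, np.pi, half_period=16)

# Magnitude spectrum of the double-sided interferogram.
spec_ideal = np.abs(np.fft.fft(ideal))
spec_cont = np.abs(np.fft.fft(cont))
```

Because the phase error enters only through the cosine, a step of 2π leaves the interferogram unchanged, while a step of π inverts the fringes in every other region.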

The root mean square (RMS) error is calculated between the reconstructed spectra with and without phase error as

$$\mathrm{RMS} = \frac{1}{N_{sim}} \sqrt{\sum_{n = 1}^{N_{sim}} {(I_{ideal} (σ_{n}) - I_{cont} (σ_{n}))}^{2}}, \tag{4}$$

where $I_{ideal} (σ)$ refers to the reconstructed spectra without phase error, $I_{cont} (σ)$ denotes the reconstructed spectra contaminated with phase error, and $N_{sim}$ is the length of the simulated interferogram. Figures 2(b)-2(d) illustrate the reconstructed spectra, corresponding to points on the RMS error plot in Fig. 2(a), when $d N$ is 0π, π/2, and π, respectively. Due to the nature of the cosine term in the interferogram per Eq. (1), when $d N$ is 0π or 2π, ideal recovery of the contaminated spectra is obtained. Conversely, the worst error occurs when $d N$ is approximately $π \pm 0.09$ rad.

Fig. 2 (a) The RMS error plot of the reconstructed spectra calculated with and without phase error. Ideal and contaminated spectra are shown in (b-d) for $d N$ equal to (b) 0π, (c) π/2, and (d) π.
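A sweep of the RMS error over the peak phase error can be sketched as follows. This is an illustrative reconstruction under assumed sampling parameters (pixel count, OPD range, half period), so the numerical values will not match Fig. 2; it only demonstrates that the error vanishes at 0 and 2π.

```python
import numpy as np

def spectrum(d_n, n_pix=512, half_period=16):
    """|FFT| of a white-light interferogram with square-wave phase error."""
    x = np.arange(n_pix) + 0.5
    opd = np.linspace(-10e-4, 10e-4, n_pix)          # cm, assumed range
    sigma = 1.0 / np.linspace(450e-7, 700e-7, 64)    # cm^-1
    phase = 0.5 * d_n * (1.0 + np.sign(np.sin(np.pi * x / half_period)))
    arg = 2.0 * np.pi * np.outer(opd, sigma) + phase[:, None]
    igram = 0.5 * np.sum(1.0 + np.cos(arg), axis=1)
    return np.abs(np.fft.fft(igram))

def rms_error(ideal, cont):
    """RMS error as written in the text: the 1/N factor is outside the root."""
    return np.sqrt(np.sum((ideal - cont) ** 2)) / ideal.size

ideal = spectrum(0.0)
steps = np.linspace(0.0, 2.0 * np.pi, 9)             # 0, pi/4, ..., 2*pi
errors = np.array([rms_error(ideal, spectrum(d)) for d in steps])
```

A finer grid of `steps` around π would be needed to locate the worst-case phase error for a given set of parameters.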

3. Experimental configuration

While these phase errors are common in the 2-dimensional (2D) SHIFT spectrometer, to simplify the presented studies, a representative 1-dimensional (1D) experiment was configured. A schematic of this 1D Birefringent Fourier Transform Spectrometer (BFTS) is depicted in Fig. 3 [17]. It consists of a generating polarizer (G) which linearly polarizes incident light at 135 degrees relative to the x axis. A quartz Wollaston prism (WP), which has a wedge angle of 6.2 degrees, splits the light into two orthogonal components that then transmit through an achromatic quarter wave plate (QWP) oriented with its fast axis at 45 degrees. This converts the two orthogonal linear polarizations, exiting the Wollaston prism, into two circular components. A relay lens (L4) then re-localizes the prism’s interference onto a phase mask (PM). This mask, which is created as a louvered waveplate [18], enables the experimental simulation of piecewise linear phase errors across the WP’s interference pattern. After transmission through the phase mask, the light is then analyzed by a linear polarizer (A) with a transmission axis of 0 degrees with respect to the x axis. Finally, the light is relayed from the phase mask onto a focal plane array (FPA) by relay lens L5.

Fig. 3 Schematic of the BFTS used for validating the calibration procedures.

3.1 Phase mask

The polarization phase mask (PM) consists of a single $25 \times 25$ mm glass substrate, which was coated with a polymerized liquid crystal-based half wave plate (HWP) optimized for a 600 nm wavelength. As illustrated in Fig. 4(a), the phase mask was divided into two regions, A and B. For region A, the HWP’s fast axis orientation varied periodically between 0° and 45°, while for region B, its fast axis was varied between 0° and 22.5° along the x axis. The period (Λ) for both regions is 1 mm. This periodicity creates a geometric phase shift, along the x axis, that acts like a step function when the Wollaston prism’s interference fringes are imaged onto it, enabling us to experimentally model the systematic phase error. The phase, which has been modulated into a simulated monochromatic interferogram, is shown in Fig. 4(b). We will refer to these two phase patterns as “phase A” and “phase B” for the remainder of this manuscript.

Fig. 4 (a) The half wave plate PM configuration showing the “phase A” and “phase B” regions with periodic fast axis orientation along x. (b) The phase ( $ϕ (m)$ ) step function that is generated, by the geometric phase shift, within regions A and B. This creates monochromatic interferograms which appear as (c) $I_{P M A}$ and (d) $I_{P M B}$ for regions A and B, respectively. These are compared to the ideal interferogram $I_{i d e a l}$ and plotted alongside the PM’s fast axis orientation to illustrate its relation to the geometric phase.

BFTS interferograms, without the PM, were simulated at a wavelength of 633 nm, by using the Mueller matrix formalism. The BFTS’s Mueller matrix, without the PM, can be expressed as

$$S_{out} = M_{A} M_{QWP} M_{WP} M_{G} S_{in}, \tag{5}$$

where $S_{in}$ and $S_{out}$ are the Stokes vectors that describe the input and output light of the system, while $M_{G}$, $M_{A}$, $M_{QWP}$, and $M_{WP}$ are the Mueller matrices of the generator, analyzer, QWP, and WP, respectively. Assuming that $S_{in} = {[\begin{matrix} 1 & 0 & 0 & 0 \end{matrix}]}^{T}$ and $S_{out} = {[\begin{matrix} S_{0} & S_{1} & S_{2} & S_{3} \end{matrix}]}^{T}$, where $T$ indicates the matrix transpose, the light intensity detected by the FPA can be obtained by calculating the output from Eq. (5) and extracting the $S_{0}$ component of $S_{out}$. Hence, the intensity takes the general form

$$I = \frac{1}{2} [1 + \cos (4 π B \tan (α) σ x)], \tag{6}$$

where $B$ and $α$ are the birefringence and wedge angle of the WP, respectively. Similarly, the BFTS’s Mueller matrix, with PM inserted, can be written as

$$S_{out} = M_{A} M_{HWP} (θ) M_{QWP} M_{WP} M_{G} S_{in}, \tag{7}$$

where $M_{HWP}$ is the Mueller matrix of the PM and $θ$ is the fast axis orientation angle. Using Eq. (7), the interferogram with phase contamination can be expressed as

$$I (θ) = \frac{1}{2} [1 - \sin (4 θ) \cos (4 π B \tan (α) σ x) + \cos (4 θ) \sin (4 π B \tan (α) σ x)], \tag{8}$$

where $θ$ is the fast axis orientation angle of the PM and is implicitly a function of x. Hence, when the fast axis of the PM transitions from 0° to 45°, there is a total geometric phase shift of π rad observed within the phase A region. Meanwhile, a smaller geometric phase shift of π/2 rad is observed in the phase B region. The simulation results, based on Eq. (6) and Eq. (8), are illustrated in Fig. 4(c) and 4(d), respectively.
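The Mueller calculus described above can be sketched numerically. This is a minimal illustration under stated assumptions, not the authors' simulation: the Wollaston prism is modeled as a linear retarder at 0° whose retardance equals the fringe phase, the quartz birefringence `B = 0.009` is an assumed value, and the retarder matrix uses one common sign convention. The absolute fringe phase and overall intensity scale depend on those conventions and need not match the expressions in the text; what the sketch shows is that rotating the half wave plate by θ shifts the fringe phase by 4θ.

```python
import numpy as np

def polarizer(theta):
    """Mueller matrix of an ideal linear polarizer, axis at theta (rad)."""
    c, s = np.cos(2 * theta), np.sin(2 * theta)
    return 0.5 * np.array([[1, c, s, 0],
                           [c, c * c, c * s, 0],
                           [s, c * s, s * s, 0],
                           [0, 0, 0, 0]])

def retarder(delta, theta):
    """Mueller matrix of a linear retarder (one common sign convention)."""
    c, s = np.cos(2 * theta), np.sin(2 * theta)
    cd, sd = np.cos(delta), np.sin(delta)
    return np.array([[1, 0, 0, 0],
                     [0, c * c + s * s * cd, c * s * (1 - cd), s * sd],
                     [0, c * s * (1 - cd), s * s + c * c * cd, -c * sd],
                     [0, -s * sd, c * sd, cd]])

def intensity(fringe_phase, hwp_angle=None):
    """S0 at the detector; hwp_angle=None removes the phase mask."""
    s_in = np.array([1.0, 0.0, 0.0, 0.0])
    m = retarder(fringe_phase, 0.0) @ polarizer(np.deg2rad(135))  # G, then WP
    m = retarder(np.pi / 2, np.deg2rad(45)) @ m                   # QWP at 45 deg
    if hwp_angle is not None:
        m = retarder(np.pi, hwp_angle) @ m                        # phase mask
    return (polarizer(0.0) @ m @ s_in)[0]                         # analyzer at 0

B, alpha, wavelength = 0.009, np.deg2rad(6.2), 633e-9   # B is an assumed value
x = np.linspace(0.0, 2e-3, 200)                         # metres across the prism
phi = 4 * np.pi * B * np.tan(alpha) * x / wavelength
no_mask = np.array([intensity(p) for p in phi])
region_a = np.array([intensity(p, np.deg2rad(45)) for p in phi])
```

With the fast axis at 45°, `region_a` is the fringe pattern of `no_mask` shifted by π, and a 22.5° orientation gives a π/2 shift, in line with the phase A and phase B regions described above.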

3.2 Spectral sources for calibration

Calibration spectra were generated using a spectrum generator, based on a Digital Light Processing (DLP) projector module [19, 20]. A schematic of the BFTS, integrated into the spectral calibration setup, is depicted in Fig. 5. The BFTS’s field of view is filled by a 100 mm diameter integrating sphere. This sphere is illuminated by either (1) the DLP-based spectrum generator; or (2) a monochromator. The DLP-based spectrum generator consists of a 75 W xenon arc lamp that is reimaged onto a slit (S) by lens L0. A longpass filter (F), located between the slit and L0, is used to block ultraviolet (UV) light, with wavelengths less than 400 nm, from the lamp. Light from the slit is then collimated by lens L1 through a polarization grating (PG) [21]. This PG takes the relatively unpolarized light from the lamp and diffracts it, with high efficiency, into a +1st and −1st diffraction order that are orthogonally polarized. The +1st order diffracted beam is then imaged onto the DLP chip directly by lens L2, and a mirror (M) is used to capture and redirect the −1st order onto the DLP chip. A final lens L3 is used to collect both beams after reflection from the DLP, and the beams are diverted into the integrating sphere’s entrance port. Ultimately, this configuration maps light of different colors onto the DLP. Based on the image that is loaded onto the DLP, the DLP’s micro-mirrors redirect the desired wavelengths into the integrating sphere and reject undesired wavelengths, creating an arbitrary spectrum once the light is homogenized inside the integrating sphere. The interferometer can thus be illuminated with arbitrary spectra up to the maximum spectral resolution of the DLP illuminator, which is approximately 20 nm.

Fig. 5 The calibration system’s optical schematic. Light from a xenon light source illuminates a grating, which disperses light onto a DLP. This light is homogenized within an integrating sphere for generating calibration and testing spectra for the interferometer.

While the DLP enables the generation of continuous spectra, its relatively low spectral resolution can be a limitation for calibrating the system using H-Matrix (or measurement matrix) based techniques. Thus, a Horiba Micro-HR monochromator, illuminated by a tungsten-halogen light source, was used to create variable narrow spectral band illumination. Light from this monochromator was guided into the sphere through a fiber light guide. It should be mentioned that either the DLP-based light source or the monochromator-based light source was used to illuminate the sphere; the two sources were never used simultaneously to generate spectra. A central processing unit (CPU) was used to synchronize the DLP or the monochromator to the FPA’s measurements. Finally, “truth” or target spectra were recorded by a fiber-based Ocean Optics USB 4000 spectrometer. The target spectra were used for the neural network training in the output layer and for calculating the RMS error between the various calibration techniques.

4. Sensor calibration methods

Implementing neural networks for the BFTS’s calibration provides several advantages. First, NNs can potentially minimize the post-processing that is required to calculate a spectrum from the measured interferogram. For example, in Fourier-based reconstruction techniques, compensating for phase errors requires several forward and inverse FFTs for deconvolution, interpolation, upsampling, etc. These calculations have to be performed for each interferogram, increasing the computational burden. The number of operations (NOA) required for the NN, FFT, and linear-unmixing (H-Matrix) techniques were estimated to be 19379, 454877, and 8446, respectively, for our data. While the NOA required for the H-Matrix reconstruction technique is two times smaller than the NN, the NN’s performance, as will soon be demonstrated, exceeds that of the H-Matrix. Meanwhile, the NOA of the NN reconstruction method is 23 times smaller than the FFT’s, primarily due to interpolation steps that were required for phase correction in the FFT technique. Conversely, provided appropriate training, an NN can be established to perform these operations directly. Additionally, once the neural network’s architecture is determined, the parallel processing capability, often associated with FFTs, is preserved. However, advantages of using NNs come at the cost of increased complexity associated with the calibration equipment or procedures needed for NN training, which can be more costly than conventional methods of calibration. Furthermore, one must be careful to avoid over-fitting the solution [22]. Still, the possibility of transferring some of the post-processing burden away from the field and into the laboratory is an enticing aspect for real-time performance.

Three kinds of calibration approaches were applied to calibrate the spectrometer: (1) conventional Fourier transforms with Mertz phase correction; (2) a linear systems model, or “H-Matrix”; and (3) an artificial Neural Network (NN). In this section, we will briefly overview the theory and procedures of performing the conventional Fourier and H-Matrix calibrations. Furthermore, the NN’s architecture, training data preparation, and the calibration steps will also be detailed.

4.1 Fourier transform

Given an interferogram without periodic phase error, we could perform Mertz phase correction directly. However, when the artificial phase error is introduced into the interferogram, additional phase compensation and post-processing are required to reconstruct the true spectra. For instance, the phase mask does not have a perfectly abrupt transition in the HWP’s fast axis orientation between adjacent periods. To avoid sampling this transition, the 1D interferogram is downsampled onto a sparser uniform sampling grid. This maintains the Nyquist sampling rate while avoiding spurious phase measurements caused by samples lying within the transition region. To correct the periodic phase error, phase corrections are performed in the frequency domain to compensate the phase change within each period of the phase mask [23]. Thus, numerical phase correction is applied only on the artificial phase contaminated interferogram, as illustrated in Fig. 6(a). Assuming monochromatic light with a wavenumber $σ_{0}$, the interferogram, with periodic phase modulation, can be expressed using Eq. (1). Fourier transformation of one period ($2 P$) of this function yields

$$I (σ) = \frac{1}{2} [δ (σ + σ_{0}) e^{i d N} + δ (σ - σ_{0}) e^{- i d N}], \tag{9}$$

where $δ$ is the Dirac delta function. Hence, the phase term can be eliminated by multiplying the Fourier transform of $I (x)$ with the conjugate of its corresponding phase error, which can be written as

$$I^{'} (σ) = I (σ) e^{i k ϕ_{p} (m)}, \quad k = \begin{cases} - 1 & \text{if } σ < 0, \\ 1 & \text{if } σ > 0, \end{cases} \tag{10}$$

where $ϕ_{p} (m)$ is the phase step function and the integer $m$ specifies the phase transition from column to column along the x axis as depicted in Fig. 4(b).
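The per-section compensation described above can be sketched as follows. This is a minimal illustration assuming equal-length sections with a known, constant phase step in each; the function names are hypothetical. NumPy's forward FFT uses the opposite exponent sign to the expression in the text, so the conjugate factor appears here as exp(−ikϕ) rather than exp(+ikϕ); the operation is the same.

```python
import numpy as np

def compensate_segment(segment, phase_step):
    """Remove a constant phase offset from one interferogram section."""
    spec = np.fft.fft(segment)
    k = np.sign(np.fft.fftfreq(segment.size))   # +1, -1, or 0 at DC
    # With NumPy's convention, positive frequencies carry exp(+i*phase),
    # so multiplying by exp(-i*k*phase) cancels the offset on both sides.
    spec = spec * np.exp(-1j * k * phase_step)
    return np.fft.ifft(spec).real

def compensate(interferogram, phase_steps, segment_len):
    """Apply compensate_segment to consecutive equal-length sections."""
    out = np.array(interferogram, dtype=float)
    for m, phi in enumerate(phase_steps):
        sl = slice(m * segment_len, (m + 1) * segment_len)
        out[sl] = compensate_segment(out[sl], phi)
    return out

# Synthetic check: 4 sections of 64 samples, 8 fringes per section.
n = np.arange(256)
steps = np.array([0.0, np.pi / 2, 0.0, np.pi / 2])
clean = np.cos(2 * np.pi * 8 * n / 64)
dirty = np.cos(2 * np.pi * 8 * n / 64 + np.repeat(steps, 64))
restored = compensate(dirty, steps, 64)
```

The synthetic test uses a whole number of fringes per section so that each section is periodic; for measured data with fractional fringes, spectral leakage makes the compensation approximate.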

Fig. 6 The Fourier Transform calibration procedure. (a) The artificial phase contaminated interferogram applied with the numerical phase correction technique on each section of the interferogram, (b) the numerical phase compensated interferogram, (c) the upsampled, zero-padded interferogram with three different apodization filters applied, and (d) the phase corrected as well as double-sided symmetric interferogram.

The resulting interferogram, after inverse Fourier transforming the numerically phase-corrected spectra, is what we refer to as a phase compensated (but not phase corrected) interferogram, as illustrated in Fig. 6(b). Since the interferogram was asymmetric due to dispersion within the prism, phase correction was incorporated via the Mertz phase correction algorithm [1]. The procedure for applying Mertz phase correction is summarized in Fig. 6(c). The post-processed 1D interferogram is first upsampled by a factor of 8. A ramp function (Apo2) is then applied to the upsampled interferogram. The result is centered via zero-padding and apodized using the triangular function, illustrated as Apo3. Finally, the interferogram’s center burst is isolated and apodized by Apo1 to extract the low frequency phase errors. The corrected spectra are obtained by multiplying the uncorrected spectra with the phase angle obtained from the short double-sided interferogram [1, 24]. Inverse Fourier transformation of these spectra yields the symmetric double-sided interferogram, illustrated in Fig. 6(d).
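The core of a Mertz-style correction can be sketched as below. This is a simplified illustration, not the procedure of Fig. 6: it omits the factor-of-8 upsampling and the ramp function, assumes the zero path difference index is known, and uses a hypothetical function name and window width.

```python
import numpy as np

def mertz_correct(interferogram, zpd_index, half_width):
    """Simplified Mertz correction; returns the phase-corrected spectrum."""
    n = interferogram.size
    centred = np.roll(interferogram, -zpd_index)    # ZPD moved to index 0
    full_spec = np.fft.fft(centred)
    # Short double-sided interferogram: triangular window about ZPD.
    dist = np.minimum(np.arange(n), n - np.arange(n))
    window = np.clip(1.0 - dist / half_width, 0.0, None)
    low_res_phase = np.angle(np.fft.fft(centred * window))
    # Rotate each spectral element by its estimated phase, keep real part.
    return (full_spec * np.exp(-1j * low_res_phase)).real

n = np.arange(256)
igram = np.cos(2 * np.pi * 40 * n / 256 + 1.0)      # 1 rad constant phase error
corrected = mertz_correct(igram, zpd_index=0, half_width=32)
uncorrected = np.fft.fft(igram).real
```

In this synthetic case the real part of the uncorrected spectrum loses amplitude at the line position because of the phase error, while the corrected spectrum recovers the full line strength.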

4.2 H-matrix

The H-Matrix used in our calibration procedure is constructed from 20 post-processed 1D interferograms, acquired using a monochromator. Interferograms were sampled uniformly in wavenumber from 13,889 to 25,000 cm⁻¹ (720 nm to 400 nm). Thus, an arbitrary interferogram can be represented as a weighted sum of the interferograms contained within the H-Matrix as [25]

$$I = H f, \tag{11}$$

where $H$ is the system transfer matrix, $f$ is the input spectrum that must be calculated, and $I$ is the measured interferogram. Hence, the input spectrum can be calculated as

$$f = W I, \tag{12}$$

where $W$ is the pseudo-inverse of $H$. For our results, the pseudo-inverse was calculated using the built-in MATLAB function, which uses singular value decomposition.
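The linear unmixing step can be sketched with NumPy, whose `numpy.linalg.pinv` also computes the pseudo-inverse through singular value decomposition. The H-Matrix below is a synthetic stand-in built from cosines, not measured data, and the interferogram length of 136 samples is borrowed from the post-processed size quoted later in the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_bands = 136, 20                 # interferogram length, H columns
x = np.arange(n_samples) / n_samples
# Stand-in for the measured monochromatic interferograms (one per column).
cycles = np.arange(5, 5 + n_bands)
H = np.cos(2 * np.pi * np.outer(x, cycles))

W = np.linalg.pinv(H)                        # SVD-based pseudo-inverse
f_true = rng.random(n_bands)                 # hypothetical input spectrum
I_meas = H @ f_true                          # forward model: I = H f
f_est = W @ I_meas                           # reconstruction: f = W I
```

Since `W` is computed once, each subsequent reconstruction is a single matrix-vector product, which is what keeps the operation count of this method low.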

4.3 Neural network

A supervised multilayer feed-forward back-propagation neural network, also known as a multilayer perceptron (MLP) [26], was used in our NN-based calibration approach. NN training was carried out with the MATLAB NN toolbox [27] and involved two steps. First, to enable convergence towards an optimal neural network, the training data set must be representative of many different classes of spectra. Second, once the training data set is created, the optimal NN architecture must be determined to enable spectral calculation without over-fitting.

4.3.1 Training data preparation

Identifying the target characteristics to include in the training data is important to ensure proper NN training. Since we are using a NN to model an experimental Fourier transform process, input data are 1D interferograms (i.e., intensity versus optical path difference) and output, or target, data are spectra (i.e., intensity versus wavelength). The training set comprised monochromatic, dichromatic, and random spectra. A representative spectrum from each of these data sets is presented in Fig. 7.

Fig. 7 Representation of a monochromatic, dichromatic, and random continuous spectrum used to generate the NN training data set.

To ensure that the training data set was statistically significant, and to ensure that it included both spectral and intensity variations, we created the data set using the following procedure. First, 50 monochromatic spectra were generated using the monochromator, sampled linearly in wavenumber from 13,889 to 25,000 cm⁻¹ (or 400-720 nm). These spectra were represented in the training data set to present the NN with high spectral resolution (10.3 nm) monochromatic inputs. Next, three sets of these monochromatic spectra (or 150 total spectra) were acquired at different intensity levels: the monochromator’s maximum light output, 73% brightness, and 49% brightness. This enables the NN to properly identify signals with differing illumination levels. Similarly, 56 monochromatic spectra (spectral bandwidth of 20 nm), were generated by the DLP for wavelengths spanning 400-720 nm. Similar to the monochromator data, three different intensity levels (or 168 spectra total) were generated corresponding to the DLP’s maximum light output, 77% brightness, and 58% brightness. Finally, the total number of dichromatic and random continuous spectra were 332 and 511, respectively. In these data sets, both the brightness of any particular wavelength and the wavelength’s spectral position were randomized. Thus, there were a total of 1161 training pairs, consisting of BFTS interferograms and Ocean Optics spectra. Of these data, 1011 pairs were used for NN training, and 150 pairs were used for validation.
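The composition of the data set can be checked with a short tally. The variable names are illustrative; the counts are those quoted in the paragraph above.

```python
# Training-pair tally from the counts quoted in the text.
monochromator = 50 * 3      # 50 wavelengths x 3 intensity levels
dlp_narrowband = 56 * 3     # 56 wavelengths x 3 intensity levels
dichromatic = 332
random_continuous = 511

total = monochromator + dlp_narrowband + dichromatic + random_continuous
n_train, n_validation = 1011, 150
assert total == n_train + n_validation == 1161
```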

4.3.2 NN architecture determination

The size of the input and output layers, as well as the number of hidden nodes and layers, influences the NN’s training speed and the risk of over-fitting [28]. Generally, the size of the input and output layers should be kept small to accelerate the training time. Furthermore, as described later in Sections 4.4.1 and 4.4.2, some post-processing was performed on the input interferogram and target spectra to improve fitting performance. Ultimately, the BFTS’s post-processed interferogram and the Ocean Optics (OO) spectrometer’s post-processed spectrum contained 136 and 70 data points, respectively. The number of samples contained within the OO’s spectrum was determined by matching the OO’s spectral resolution to that of the BFTS, while the number of samples representing the BFTS’s interferogram was determined using the Nyquist limit.

NN training was performed using the Levenberg-Marquardt algorithm, and a mean squared error cost function was used to determine the training direction. The cost function can be written as

$$e = \frac{1}{N_{h}} \sum_{k = 1}^{N_{h}} {(t (k) - s (k))}^{2}, \tag{13}$$

where $N_{h}$ is the total number of nodes at the output layer, $t (k)$ is the target vector, and $s (k)$ is the updated output from the output layer. The transfer function within the hidden layer was the hyperbolic tangent sigmoid function, whose mathematical expression is

$$y (v) = \tanh (v), \tag{14}$$

where $v$ is the weighted sum of the neuron’s inputs plus its bias and $y (v)$ is the neuron’s output, which is non-linear but differentiable everywhere. The output layer transfer function is linear and unbounded, ranging from negative infinity to infinity, and has the form [27]

$$y (v) = v . \tag{15}$$

The selected NN architecture was a feed-forward design which contained 136 input nodes, 20 nodes within a hidden layer, and 70 nodes in the output layer. Figure 8 illustrates the Neural Network’s topology in which the aforementioned post-processed interferograms serve as inputs to be matched to their corresponding OO spectra.

Fig. 8 The Neural Network architecture showing one input-output pair. BFTS interferograms and OO spectra were matched between the input and output layers, respectively.
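The forward pass of this 136-20-70 architecture, together with the mean squared error cost, can be sketched in a few lines. The weights below are random placeholders rather than trained values, and the Levenberg-Marquardt training used in the text is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 136, 20, 70

# Untrained placeholder weights; a real calibration would fit these.
W1 = rng.normal(scale=0.1, size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_out, n_hidden))
b2 = np.zeros(n_out)

def forward(interferogram):
    """136 inputs -> tanh hidden layer -> linear output of 70 points."""
    hidden = np.tanh(W1 @ interferogram + b1)   # hidden-layer transfer function
    return W2 @ hidden + b2                     # linear output layer

def mse(target, output):
    """Mean squared error over the output nodes."""
    return np.mean((target - output) ** 2)

spectrum = forward(rng.random(n_in))
```

Once trained, reconstructing a spectrum costs two small matrix-vector products and 20 `tanh` evaluations, which is consistent with the low operation count reported for the NN.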

4.4 Data acquisition and post-processing common to all methods

4.4.1 Interferogram preparation for calibration

The 2D interferograms were first captured by an 8-bit FPA with a sensor size of $1280 \times 960$ pixels. All 2D interferogram frames were first divided by a measured flatfield to remove illumination nonuniformity across the camera [17]. Dark frames were also taken to remove dark noise. After manually choosing the highest quality row of pixels (the intensity patterns only varied on the horizontal axis) on the 2D interferogram, each point was averaged between the vertical $\pm 20$ pixels to create a 1D interferogram. Finally, the mean intensity was removed from the 1D interferogram. The outcome after applying the procedures mentioned above is shown in Fig. 9(a).

Fig. 9 The NN and H-Matrix calibration procedure. The interferogram is depicted (a) after averaging and mean value removal; (b) after appending zeros, alongside the triangular apodization filter; and (c) after apodization.

In order to make the interferogram symmetric around ZOPD, the interferogram was zero-padded. It should be mentioned that the ZOPD position was set to the pixel containing the interferogram’s maximum intensity value. Finally, the interferogram was apodized using the triangular function illustrated in Fig. 9(b), which produced the final post-processed interferogram per Fig. 9(c).
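The preparation steps above can be sketched as a single function. This is an illustration under assumptions: the function name is hypothetical, the dark frame is subtracted from both the data and the flatfield before dividing (the text does not specify the order), and the selected row is passed in rather than chosen by inspection.

```python
import numpy as np

def prepare_interferogram(frame, flat, dark, row, half_rows=20):
    """Flat-field, average rows, remove mean, zero-pad, apodize."""
    corrected = (frame - dark) / np.maximum(flat - dark, 1e-6)
    band = corrected[row - half_rows: row + half_rows + 1]
    line = band.mean(axis=0)
    line = line - line.mean()
    # ZOPD is taken as the brightest pixel; pad the short side with zeros
    # so that ZOPD sits at the centre of a symmetric array.
    zopd = int(np.argmax(line))
    half = max(zopd, line.size - 1 - zopd)
    padded = np.zeros(2 * half + 1)
    padded[half - zopd: half - zopd + line.size] = line
    # Triangular apodization: 1 at ZOPD, falling linearly to 0 at the ends.
    window = 1.0 - np.abs(np.arange(-half, half + 1)) / half
    return padded * window

# Synthetic frame: fringes vary along columns only, centre burst at column 30.
cols = np.arange(200)
fringes = 100 + 50 * np.exp(-((cols - 30) / 40.0) ** 2) * np.cos(0.5 * (cols - 30))
frame = np.tile(fringes, (60, 1))
out = prepare_interferogram(frame, np.ones_like(frame), np.zeros_like(frame), row=30)
```

For this single-sided example the output has 339 samples with the centre burst at index 169.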

4.4.2 Spectra preparation for calibration

Spectra, taken from the OO spectrometer, were also processed before using them for validation and calibration testing. Dark spectra were acquired and subtracted from each measurement to remove dark noise and the OO spectra were filtered to match the BFTS’s spectral resolution. First, the BFTS’s spectral resolution, with a triangular apodization filter, can be calculated by

$$Δ σ = 1 / 2 \, \mathrm{OPD}_{\max} . \tag{16}$$

For our BFTS, $\mathrm{OPD}_{\max} = 12.7$ microns, yielding a spectral resolution of 1571 cm⁻¹. In order to match the OO’s spectral resolution to that of the BFTS, OO spectra $I (λ)$ were first linearly sampled in wavenumber to produce a new spectrum $I^{'} (σ_{n})$. After interpolation, a double-sided spectrum $I_{m} (σ)$ was created by mirroring the interpolated spectrum to negative wavenumbers by

$$I_{m} (σ_{n}) = \begin{cases} I^{'} (σ_{n}) & \text{if } σ_{n} \geq 0, \\ I^{'} (- σ_{n}) & \text{if } σ_{n} < 0. \end{cases} \tag{17}$$

This mirrored OO spectrum was then inverse Fourier transformed, apodized with a triangular function with a full width of $2 \, \mathrm{OPD}_{\max} = 12.7$ microns, and forward Fourier transformed to create a spectrum with comparable spectral resolution to the BFTS measurements.
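The resolution-matching procedure above (mirror, inverse transform, triangular apodization, forward transform) can be sketched as follows. The function name, the wavenumber grid, and the value passed as `opd_max` are illustrative assumptions; the sketch also assumes the spectrum has already been resampled onto a uniform wavenumber grid that starts at zero.

```python
import numpy as np

def match_resolution(spectrum, d_sigma, opd_max):
    """Degrade a spectrum sampled at sigma = 0, d_sigma, 2*d_sigma, ...

    irfft treats its input as the non-negative half of a spectrum that is
    mirrored to negative wavenumbers, which performs the mirroring step.
    """
    n = 2 * (spectrum.size - 1)
    igram = np.fft.irfft(spectrum, n=n)          # even, real interferogram
    opd = np.fft.fftfreq(n, d=d_sigma)           # OPD axis, zero at index 0
    window = np.clip(1.0 - np.abs(opd) / opd_max, 0.0, None)
    return np.fft.rfft(igram * window).real

# Hypothetical grid: 0 to 30,000 cm^-1 in 50 cm^-1 steps, one narrow line.
sigma = np.arange(0, 30001, 50.0)
line = np.zeros_like(sigma)
line[400] = 1.0                                   # line at 20,000 cm^-1
smoothed = match_resolution(line, d_sigma=50.0, opd_max=6.35e-4)  # cm, illustrative
```

The narrow line is broadened into the instrument line shape of a triangularly apodized interferogram while its position is preserved.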

5. Results

The results of the three calibration techniques will be discussed in detail here. These techniques, described previously in section 4.0, were applied to the post-processed interferograms contained within the calibration and validation data sets. Due to numerical scaling differences between the three techniques, comparison was performed by calculating the RMS error between transmission measurements.

5.1 Transmission measurements

In order to enable comparison between the three different techniques, the RMS error was calculated on transmission measurements that were calculated, from the DLP data, by

$$T (λ) = \frac{I_{Sample} (λ)}{I_{WhiteLight} (λ)}, \tag{18}$$

where $I_{Sample}$ and $I_{WhiteLight}$ are the sample and white light reference spectra, respectively, calculated after implementing the FFT, H-Matrix, or NN calibration procedure. In this case, sample spectra were selected from amongst the 150 validation spectra in the data set, while the white light spectrum was generated using the DLP by configuring it to reflect all incident light into the sphere. The RMS error of each technique was calculated by

$$\mathrm{RMS} = \frac{1}{N_{T}} \sqrt{\sum_{n = 1}^{N_{T}} {(T (λ_{n}) - T_{OO} (λ_{n}))}^{2}}, \tag{19}$$

where $T$ is the DLP’s transmission, as calculated by one of the three aforementioned techniques, $N_{T}$ is the number of samples contained within the transmission measurements, and $T_{OO}$ is the transmission calculated using only the OO’s spectrum. These quantities were calculated for the three phase error cases: (1) no phase error (ideal, no phase mask); (2) 90 degree phase shift (phase B); and (3) 180 degree phase shift (phase A). In our experiments, the RMS error was calculated for wavelengths spanning 500-700 nm to match the error calculation to the optimal operating regime of the phase mask.
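The transmission and RMS error calculations can be sketched as below. The spectra are synthetic placeholders, and the small `floor` guard against division by zero is an added assumption that is not mentioned in the text.

```python
import numpy as np

def transmission(sample, white, floor=1e-12):
    """Ratio of sample to white-light reference spectra."""
    return sample / np.maximum(white, floor)

def rms_error(t_est, t_ref):
    """RMS error as written in the text: sqrt(sum of squares) / N_T."""
    return np.sqrt(np.sum((t_est - t_ref) ** 2)) / t_est.size

wavelengths = np.linspace(500.0, 700.0, 70)          # nm, evaluation band
white = np.full(70, 2.0)
sample = white * np.exp(-((wavelengths - 600.0) / 30.0) ** 2)
t_ref = transmission(sample, white)
t_est = t_ref + 0.01                                  # constant offset of 0.01
err = rms_error(t_est, t_ref)
```

Because transmission is a ratio, any overall scale factor introduced by a given reconstruction technique cancels, which is why the comparison between techniques is made on transmission rather than on raw spectra.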

5.2 Reconstructed spectra

The results of the transmission measurements and the calculated absolute error, after applying the NN training, FFT, and H-Matrix calibration procedures, are shown in the following sections, as compared alongside the OO (truth) transmission spectra. All of the reconstructed spectra were interpolated onto the same wavelength axis for direct comparison. A more detailed discussion of the RMS error is reserved for section 6.0.

5.2.1 Case I: without phase error

When no artificial phase error was introduced into the system, all three techniques reconstructed the spectra well. Representative spectra from this configuration are provided in Fig. 10, illustrating the outcome from the three algorithms given a monochromatic, dichromatic, and random input spectrum. Meanwhile, the RMS error, calculated using all 150 validation spectra, yielded the results depicted in Fig. 11.

Fig. 10 The monochromatic, dichromatic, and random reconstructed spectra obtained by all the three calibration techniques and the calculated absolute error for case I.

Fig. 11 RMS error for Case I (no phase error) for each spectral type.

5.2.2 Case II: phase B

With the 90 degree (Phase B) phase mask inserted into the interferometer, most of the reconstructed spectra obtained from the FFT had lower accuracy. Representative spectra for this Phase B error case are illustrated in Fig. 12 while the RMS error is presented in Fig. 13 for monochromatic, dichromatic, and random input spectra.

Fig. 12 The monochromatic, dichromatic, and random reconstructed spectra obtained by all the three calibration techniques and the calculated absolute error for case II.

Fig. 13 RMS error for Case II (90 degree phase error) for each spectral type.

5.2.3 Case III: phase A

Finally, the 180 degree (Phase A) phase mask was inserted into the interferometer. As in the previous sections, representative spectra for the Phase A error case are illustrated in Fig. 14, while the RMS error is presented in Fig. 15 for monochromatic, dichromatic, and random input spectra.

Fig. 14 The monochromatic, dichromatic, and random reconstructed spectra obtained by all three calibration techniques and the calculated absolute error for case III.

Fig. 15 RMS error for Case III (180 degree phase error) for each spectral type.

6. Discussion

Generally, the three calibration methods succeeded in all three cases. For cases I and II, the H-Matrix method had the highest RMS error, followed by the FFT-based calibration, with the NN performing best of the three. For the H-Matrix technique, the calibration process was intuitive and easy to configure; however, its mean RMS error for case I (without phase error) was approximately 2.3 times greater than that of the Fourier transform technique. Since the H-Matrix was constructed from the monochromatic spectra, which were linearly sampled in wavenumber, it yielded lower RMS error on the monochromatic spectra. However, when calibrating broadband or continuous spectra, the RMS error increased by nearly a factor of 2. This is likely caused by cross-talk in the matrix during data reduction. Meanwhile, for the FFT technique, the RMS error increased when going from case II to case III. With reference to Fig. 15, the FFT’s RMS error in case III was approximately 2.6 times larger than in case II and 3.3 times larger than in case I. Additionally, for case III, the H-Matrix performed better than the FFT, the reverse of cases I and II, in which the FFT performed better than the H-Matrix technique. Finally, the NN-based calibration provided low RMS error for all three cases and was relatively independent of the phase mask’s presence in the system. For case II, the NN’s RMS error was over 6 and 3.6 times smaller than that of the H-Matrix and FFT, respectively. Furthermore, the differences between RMS errors were greatest in case III, where using the NN to calibrate the system gave a six- and nine-fold decrease in RMS error relative to the H-Matrix and FFT, respectively.
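The "times smaller" and "fold" comparisons above are ratios of mean RMS errors. A minimal sketch of that arithmetic is given below; the helper `fold_improvement` is hypothetical, and the numbers in `placeholder_rms` are arbitrary placeholders chosen for illustration, not the measured values from Figs. 11, 13, or 15.

```python
def fold_improvement(rms_baseline, rms_candidate):
    """Ratio of a baseline RMS error to a candidate RMS error.

    A value of 2.0 means the candidate's RMS error is half the baseline's.
    """
    if rms_candidate <= 0:
        raise ValueError("candidate RMS error must be positive")
    return rms_baseline / rms_candidate

# Arbitrary placeholder values for illustration only (NOT measured results).
placeholder_rms = {"H-Matrix": 0.04, "FFT": 0.02, "NN": 0.01}

# Improvement of the NN relative to each of the other methods.
ratios = {
    method: fold_improvement(rms, placeholder_rms["NN"])
    for method, rms in placeholder_rms.items()
    if method != "NN"
}
```

Reporting the ratio rather than the absolute difference makes the comparison independent of the overall error scale in each phase error case.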

7. Conclusion

We have successfully demonstrated the first experimental application of NNs to the calibration of a birefringent Fourier transform spectrometer with systematic phase errors. With the described technique, we experimentally achieved an RMS error of approximately 0.0046 when artificial systematic phase error was present in the system: a 4.76- and 10-fold improvement over the FFT-based reconstruction method and a 6- and 11.7-fold improvement over the H-Matrix approach for periodic π/2 (phase B) and π (phase A) phase errors, respectively. These promising results suggest that NNs could also be used to calibrate more complex systems, such as hyperspectral imagers or heterodyne spectrometers with multiple aperture arrays. Future work will focus on optimizing the NN’s parameters and applying them to imaging architectures.

Acknowledgments

The authors would like to thank Matthew Miskiewicz and Michael J. Escuti for providing the patterned phase mask that was used in this experimental work.

References and links

1. L. Mertz, “Auxiliary computation for Fourier spectrometry,” Infrared Phys. 7(1), 17–23 (1967).

2. M. L. Forman, W. H. Steel, and G. A. Vanasse, “Correction of Asymmetric Interferograms Obtained in Fourier Spectroscopy,” J. Opt. Soc. Am. 56(1), 59–61 (1966).

3. A. Ben-David and A. Ifarraguerri, “Computation of a spectrum from a single-beam Fourier-transform infrared interferogram,” Appl. Opt. 41(6), 1181–1189 (2002).

4. V. Saptari, Fourier Transform Spectroscopy Instrumentation Engineering (SPIE Press, 2004), Vol. 61.

5. J. A. De Haseth, “Stability of Rapid Scanning Interferometers,” Appl. Spectrosc. 36(5), 544–552 (1982).

6. M. W. Kudenov, M. E. L. Jungwirth, E. L. Dereniak, and G. R. Gerhart, “White light Sagnac interferometer for snapshot linear polarimetric imaging,” Opt. Express 17(25), 22520–22534 (2009).

7. J. M. Harlander, H. Tran, F. L. Roesler, K. P. Jaehnig, S. M. Seo, W. T. Sanders III, and R. J. Reynolds, “Field-widened spatial heterodyne spectroscopy: correcting for optical defects and new vacuum ultraviolet performance tests,” Proc. SPIE 2280, 310–319 (1994).

8. C. R. Englert, J. M. Harlander, J. G. Cardon, and F. L. Roesler, “Correction of phase distortion in spatial heterodyne spectroscopy,” Appl. Opt. 43(36), 6680–6687 (2004).

9. M. W. Kudenov and E. L. Dereniak, “Compact real-time birefringent imaging spectrometer,” Opt. Express 20(16), 17973–17986 (2012).

10. H. Yang, “A back-propagation neural network for mineralogical mapping from AVIRIS data,” Int. J. Remote Sens. 20(1), 97–110 (1999).

11. T. Fearn, “REVIEW: Standardisation and calibration transfer for near infrared instruments: a review,” J. Near Infrared Spectrosc. 9(1), 229 (2001).

12. L. Duponchel, C. Ruckebusch, J. P. Huvenne, and P. Legrand, “Standardisation of near-IR spectrometers using artificial neural networks,” J. Mol. Struct. 480, 551–556 (1999).

13. C. A. Osorio-Gómez, E. Mejía-Ospino, and J. E. Guerrero-Bermúdez, “Spectral reflectance curves for multispectral imaging, combining different techniques and a neural network,” Rev. Mex. Fis. 55, 120–124 (2009).

14. D. A. Naylor, T. R. Fulton, P. W. Davis, I. M. Chapman, B. G. Gom, L. D. Spencer, J. V. Lindner, N. E. Nelson-Fitzpatrick, M. K. Tahic, and G. R. Davis, “Data processing pipeline for a time-sampled imaging Fourier transform spectrometer,” in Optical Science and Technology, the SPIE 49th Annual Meeting (International Society for Optics and Photonics, 2004), pp. 61–72.

15. J. Craven-Jones, M. W. Kudenov, M. G. Stapelbroek, and E. L. Dereniak, “Infrared hyperspectral imaging polarimeter using birefringent prisms,” Appl. Opt. 50(8), 1170–1185 (2011).

16. P. Connes and G. Michel, “Astronomical Fourier spectrometer,” Appl. Opt. 14(9), 2067–2084 (1975).

17. M. W. Kudenov, M. N. Miskiewicz, M. J. Escuti, and J. F. Coward, “Polarization spatial heterodyne interferometer: model and calibration,” Opt. Eng. 53(4), 044104 (2014).

18. J. Kim, R. K. Komanduri, K. F. Lawler, D. J. Kekas, and M. J. Escuti, “Efficient and monolithic polarization conversion system based on a polarization grating,” Appl. Opt. 51(20), 4852–4857 (2012).

19. S. W. B. Joseph and P. Rice, “A hyperspectral image projector for hyperspectral imagers,” (2007).

20. L. J. Hornbeck, “Digital light processing and MEMS: Timely convergence for a bright future,” Proc. SPIE 2642, 2 (1995).

21. C. Oh and M. J. Escuti, “Achromatic diffraction from polarization gratings with high efficiency,” Opt. Lett. 33(20), 2287–2289 (2008).

22. S. Geman, E. Bienenstock, and R. Doursat, “Neural Networks and the Bias/Variance Dilemma,” Neural Comput. 4(1), 1–58 (1992).

23. V. C. Chan, M. Kudenov, and E. Dereniak, “Phase correction algorithms for a snapshot hyperspectral imaging system,” Proc. SPIE 9611, 961111 (2015).

24. P. R. Griffiths and J. A. De Haseth, Fourier Transform Infrared Spectrometry, Chemical Analysis, Vol. 83 (Wiley, 1986).

25. D. C. Heinz and C.-I. Chang, “Fully constrained least squares linear spectral mixture analysis method for material quantification in hyperspectral imagery,” IEEE Trans. Geosci. Rem. Sens. 39(3), 529–545 (2001).

26. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” Nature 323(6088), 533–536 (1986).

27. H. Demuth, M. Beale, and M. Hagan, “Neural Network Toolbox User’s Guide,” The MathWorks, Inc., Natick, USA (2009).

28. E. B. Baum and D. Haussler, “What Size Net Gives Valid Generalization?” Neural Comput. 1(1), 151–160 (1989).

Optics Express