In this paper we introduce new variables that can be used to retrieve the atmospheric continuum emission in the inversion of remote sensing measurements. This modification tackles the so-called sloppy model problem. We test this approach on an extensive set of real measurements from the Michelson Interferometer for Passive Atmospheric Sounding. The newly introduced variables permit to achieve a more stable inversion and a smaller value of the minimum of the cost function.
© 2013 OSA
For the retrieval of atmospheric constituents and state variables such as pressure and temperature from remote sensing spectroscopic measurements it is necessary to model the radiative transfer through an inhomogeneous medium. The inhomogeneouos atmosphere is usually discretized in small elements which are assumed to be homogeneous. The law governing absorption in an homogeneous medium is the Beer-Lambert lawEquation (1) has the general solution I(x) = cexp(−Kσx). As a consequence the radiance reaching the instrument is modeled as a linear combination of exponential terms. Disregarding scattering, which is justified in the mid-to-far infrared spectral region in clear sky conditions, the radiance S(σ) at frequency σ reaching the instrument for a fixed observation geometry is described by the radiative transfer equation  2, 3]. Equation (2) may then be rewritten as 4, 5]
The inversion of the measured spectra is obtained by optimizing the values of the retrieval variables. In the weighted least-squares approach the cost function χ2 is defined as ℓ2 norm of the differences between observed and simulated radiances (the so-called residuals) weighted with the inverse of the error covariance matrix of the measurements.
While the xi,m have a physical meaning, their selection as retrieval variables makes the forward model a sloppy model, as described in . The main drawback of such a model is that the model manifold includes wide regions where the residuals of the reconstructed spectrum are large, and long, narrow and tortuous valleys where the residuals are small. As a consequence many different sets of retrieval variables may lead to similar χ2 values close to the minimum.
There are many methods to deal with sloppy models. However, most of them are based on a line search substep to calculate the increment along a descent direction . Those methods are not suitable for atmospheric retrievals, because the evaluation of the simulated spectrum (the forward model) is a time-consuming operation. Our solution is to modify the retrieval variables to obtain a better dependency in the forward model and, ultimately, a better convergence towards the minimum of the χ2. In the theory Section 2 we present the theoretical background of the modification. In Section 3 we present the results of some test retrievals applied to MI-PAS (Michelson Interferometer for Passive Atmospheric Sounding) observations. Finally in Section 4 we draw the conclusions.
2.1. The continuum variables
Conventionally the cross-sections kσ,m appearing in Eq. (4) are computed including only the contributions of transitions not farther than 25 cm−1 from σ[4, 5]. In restricted spectral intervals, such as the microwindows (MWs) used in our retrievals (less than 3 cm−1) , the contribution of the lines beyond the 25 cm−1 threshold is modeled with a frequency-independent term, the so-called continuum, which is normally MW and altitude dependent. This term may also account for unmodeled contributions from aerosols or residual instrument calibration errors. In the forward model this term is usually implemented as an additional gas, with VMR equal to 1. As a consequence the columns of the continuum depend only on the observation geometry and the density of the layers. The retrieval variables model the continuum cross-section vertical profile. This profile can be approximated with a curve with nodes zi (the retrieval grid) and values ki = k(zi). The discretization used for the radiative transfer calculation (3) may be finer than the retrieval grid. However, for each layer l the continuum kl in that layer will depend on at most two values ki and ki−1 with consecutive indeces. Thus we have
The air columns can be explicitly calculated, but again we obtain a sloppy model because the contribution of the continuum in the productory (4) is
For the minimization required by the inversion, most techniques, such as the widely used Gauss-Newton method, require the derivatives of the simulated spectrum with respect to the retrieval variables. The derivatives with respect to the ki continuum variables are calculated as6), for each l there are only two non vanishing derivatives, and . With the new approach the derivatives become 9) the only non vanishing derivatives are
2.2. The VMR variables
In principle this approach can be extended also to the VMR variables, however there is one important difference. In Eq. (4) the quantity holding the dependence on the continuum retrieval variables is the cross-section kl. On the other hand, the cross-sections do not depend on the VMR retrieval variables, the dependence being via the gas columns cl,m.
If the VMR profile is modeled with a piecewise-linear function with nodes zi and values xi = x(zi), then there are constants
The derivatives of the spectrum with respect to these new variables are:15) the only non vanishing terms are
3. Test of the method on MIPAS measurements
Starting from March 2002, MIPAS  has been sounding for ten years the mid-infrared atmospheric limb-emission spectrum from the polar satellite ENVISAT, developed by the European Space Agency (ESA). The measured spectral region contains the signatures of many atmospheric constituents. ESA set up a retrieval algorithm [4, 5, 7] to determine pressure at tangent points and geolocated profiles of temperature and of VMR of some atmospheric constituents. The retrieved constituents in version 6.0  of the ESA operational processor are H2O, O3, HNO3, CH4, N2O, NO2, CFC-11, CFC-12, ClONO2 and N2O5. The ESA operational retrieval algorithm V6.0 is based on the Gauss-Newton method, modified with the regularizing Levenberg-Marquardt (LM) technique (see  and the references therein). Due to the computational cost of the forward model, no search methods are viable for the LM constant. The strategy adopted for the evolution of the LM constant is the damping-undamping strategy (see ).
For the tests presented in this paper we used more conservative convergence thresholds with respect to those selected in V6.0. The V6.0 thresholds were optimized for near real-time processing of MIPAS data, taking into consideration the stringent runtime requirements. Since April 2012 the instrument is no longer operational, and computing time is no longer a primary issue, provided it remains into reasonable bounds. In the next reprocessing of the whole MIPAS mission convergence thresholds similar to those used for our tests will be adopted, in order to decrease the convergence error at the expenses of an increase in computing time. Moreover, we implemented the following selectable options.
- In addition to the nominal (NOM) version, we implemented the change of retrieval variables (as explained in Section 2), either for the continuum variables only (CONT version) or for the continuum and VMR variables (POLY version). The better conditioning of the inversion obtained in the CONT and POLY approaches permits to start the retrieval with a smaller LM damping parameter for the new retrieval variables. The advantages of this modification will be discussed in Section 3.1.
We introduce the following scalar quantifiers to characterize the performance of the retrieval.
- The average (on the set of processed limb scans) number of retrieval iterations .
- The consistency of the final simulated spectra with the observations is measured by . This is the arithmetic mean (on the set of processed scans) of the normalized chi-square (see ) related to individual profile retrievals.
- We consider the number of degrees of freedom (DoF) of the retrieval, calculated as the trace of the averaging kernel matrix . This quantifier represents the overall reduction of the vertical resolution as a consequence of the applied constraints. We divide this number by the number n of points of the retrieved profile, since this latter can vary from scan to scan due to cloud contamination. We then take the arithmetic mean on the set of processed scans. Note that measures the fraction of the vertical resolution lost because of the combined effect of the LM technique and the a-posteriori regularization.
- Finally, to evaluate the profile smoothness we use the oscillation quantifier Ω2. For any profile xi = x(zi), i = 1,..., n, Ω2 is defined as
3.1. Tests on a set of 12 MIPAS orbits
To assess the improvements of the change of variables described in Section 2, we first tested the retrieval on a limited set of real observations. We selected 12 orbits of MIPAS measurements from the year 2007. The measurements cover the four seasons to ensure that the test includes sufficient atmospheric variability. For this test we compared the NOM, CONT and POLY approaches, both without regularization and with the IVS regularization (NOM_R, CONT_R, POLY_R).
Figure 1 shows the impact of the proposed change of the retrieval variables on the algorithm. We calculated the spectral condition number of the GN approximation of the Hessian of the cost function, modified with the LM method. This is the matrix which needs to be inverted for the calculation of the next iterate of the GN sequence. In the figure we report the probability density of this condition number calculated from our NOM_R and CONT_R samples at the first and the final iteration. We noticed that the new approach generally led to a significant reduction of the condition number of the matrix. For some weak species like CFC-11 (cfr. right panel of Fig. 1) this reduction also solved the numerical problems of the inversion caused by a condition number greater than ∼ 1017 (the inverse of the machine precision). Furthermore we observed that this conditioning was less sensitive to the LM damping parameter. Hence we were able to decrease by a factor of 10 the initial value of this parameter for the new retrieval variables. This reduction is possible also thanks to the reduced non-linearity of the forward model with respect to the new continuum variables. The smaller value of the LM damping parameter leads to larger steps in the GN sequence. As a consequence we are able to get closer to the true minimum of the cost function, with fewer iterations. We found that the results of the retrieval do not depend critically on the initial value of the LM parameter for the new variables. Thus a fine tuning of this parameter on a per-target basis would not lead to significant improvements.
The results are summarized in Table 1. For each test and each target parameter, the table reports the quantifiers described above. The last block of the table contains the percentage variation of the quantifiers with respect to the NOM approach, averaged over the retrieval targets.
Generally, all the retrieval targets exhibit similar behaviors, with two notable exceptions. The NO2 VMR is retrieved only above 24 km, while the continuum parameters are fitted only below 30 km. Thus, only a small number of continuum parameters are retrieved, in an altitude range where the opacity of the atmosphere is small. Hence the sensitivity of the retrieval to continuum parameters is good also in the NOM approach. In the retrieval of N2O5, the selected MWs span a limited spectral range (approximately 27 cm−1), therefore a single continuum profile is retrieved to make the inversion better conditioned. Hence the effect of the CONT approach on these two species is small.
From the last block of Table 1 we see that the CONT approach is able to reduce the final of about 3% and the number of iterations of about 15% with respect to the NOM approach, at the expenses of a 5% increase of the Ω̄2. The regularizing effect of the LM method in this case is weaker due to the smaller initial value of the LM damping parameter. This is confirmed by the 5% increase in . In the CONT_R approach the regularization is able to reduce the amplitude of the oscillations to the level of the NOM_R approach, while maintaining the 3% reduction of the .
We tested the statistical significance of the reduction in the number of iterations and in the with a per-target Welch’s t-test. Given the relatively large sample size, all the improvements of the CONT_R with respect to the NOM_R method are significant with a probability threshold of 0.01, except for the aforementioned NO2 and N2O5 species.
As pointed out in Section 2, the new approach changes the search domain for the continuum variables from the semi-unbounded interval [0, +∞) to the bounded domain [0, 1]. The LM method does not constrain the retrieval variables within their domain, thus a check on their value is implemented in the program, similarly to the approach already adopted for the 0 bound in V6.0 . This check however intervenes only in a very marginal number of retrievals, usually corresponding to difficult observational conditions.
The outcome of the POLY and POLY_R tests shows no advantage with respect to the CONT and CONT_R approaches. We attribute this result to the fact that the cross-sections kl,m(σ) of the retrieved gas depend on the frequency within each MW. On the other hand the cross-sections kl,c for the continuum are constant within each MW. The same argument also applies to the transparencies τσ,l,m and τl,c. In the NOM approach a sudden large increase of a continuum parameter may easily lead to a fully opaque layer τl,c ≈ 0 for all the frequencies within the involved MW. As a consequence, any emission from a layer beneath l is blocked by the fully opaque layer. On the other hand, a large increase in the VMR parameter hardly leads to a full MW with τσ,l,m ≈ 0, due to the dependence on σ. Also in the CONT approach we can get an opaque layer with τl,c ≈ 0. However, while in the NOM case we have when kj → ∞, in the CONT case when ξj → 0, because the exponents in Eq. (9) are less than 1. Consequently, in this case the sensitivity to the continuum parameters is not lost in the CONT approach.
3.2. Tests on a whole month of orbits
To make sure that the results of the previous test are not biased by the relatively small sample of orbits, we extended the test to a whole month of MIPAS orbits. We selected the measurements acquired in October 2007, consisting of 279 orbits, for a total of approximately 21400 scans. In order to limit the necessary computations, we compared only the most promising CONT_R approach to the standard NOM_R approach. We collected the results in Table 2, which has the same format of Table 1. The numbers of Table 2 fully confirm the conclusions of the previous test, i.e. reduced number of iterations, smaller and slightly smaller Ω̄2 for the CONT_R over the NOM_R approach.
In Fig. 2 we report the behavior of the versus the latitude of the limb scans. The graph is obtained by binning the in 5 deg. latitude intervals, and then averaging the data within each bin. The four panels refer to different retrieval targets. We do not report all target species, because they all show very similar behaviors. We note that the reduction of obtained by the CONT_R approach is visible at all latitudes. For the reasons explained in Section 3.1, the change of variable shows no advantage in the retrieval of N2O5, see bottom right panel of the figure.
Despite the smaller achieved, we found that the average differences between the NOM_R and CONT_R profiles show no evident systematic features above the noise error.
Normally in the inversion of remote sensing atmospheric measurements, the atmospheric continuum is retrieved as an additional molecular cross-section, constant in limited spectral intervals. The selection of such retrieval variables however leads to a so called sloppy model. In this case the numerical search for the minimum of the cost function may be difficult.
In this paper we propose a different set of retrieval variables for the atmospheric continuum. Tests with real MIPAS observations show that a more stable retrieval is obtained with the new variables. As a consequence we obtain a reduction in the number of iterations, a smaller minimum of the cost function, and slightly less oscillating profiles. The same modification can be applied also to the VMR retrieval variables. Our tests show however that – at least in the present formulation – we get no real advantage connected with this additional change.
The technique can be applied to any inverse problem whenever the forward model depends exponentially on the inversion variables. Its effectiveness however should be evaluated case by case on the basis of numerical tests.
We thank the IFAC-CNR institute in Firenze for making available computing and technical facilities through associate contract to M.R., in the frame of the research activity TA.P06.002. This study was supported by the ESA-ESRIN contract 21719/08/I-OL.
References and links
1. T. von Clarmann, M. Hpfner, B. Funke, M. López-Puertas, A. Dudhia, V. Jay, F. Schreier, M. Ridolfi, S. Ceccherini, B. J. Kerridge, J. Reburn, R. Siddans, and J.-M. Flaud, “Modelling of atmospheric mid-infrared radiative transfer: The AMIL2DA algorithm intercomparison experiment,” J. Quant. Spectrosc. Radiat. Transfer 78(3–4), 381–407 (2003) [CrossRef] .
2. A. R. Curtis, “Discussion of a statistical model for water vapor absorption,” Q. J. R. Meteorol. Soc. 78, 638–640 (1952) [CrossRef] .
3. W. L. Godson, “The evaluation of infrared radiative fluxes due to atmospheric water vapour,” Q. J. R. Meteorol. Soc. 79, 367–379 (1953) [CrossRef] .
4. M. Ridolfi, B. Carli, M. Carlotti, T. von Clarmann, B. M. Dinelli, A. Dudhia, J.-M. Flaud, M. Höpfner, P. E. Morris, P. Raspollini, G. Stiller, and R. J. Wells, “Optimized forward model and retrieval scheme for MIPAS near-real-time data processing,” Appl. Opt. 39, 1323–1340 (2000) [CrossRef] .
5. P. Raspollini, C. Belotti, A. Burgess, B. Carli, M. Carlotti, S. Ceccherini, B. M. Dinelli, A. Dudhia, J.-M. Flaud, B. Funke, M. Höpfner, M. López-Puertas, V. Payne, C. Piccolo, J. J. Remedios, M. Ridolfi, and R. Spang, “MI-PAS level 2 operational analysis,” Atmos. Chem. Phys. 6, 5605–5630 (2006) [CrossRef] .
6. M. K. Transtrum, B. B. Machta, and J. P. Sethna, “Geometry of nonlinear least squares with applications to sloppy models and optimization,” Phys. Rev. E 83, 036701 (2011) [CrossRef] .
7. P. Raspollini, B. Carli, M. Carlotti, S. Ceccherini, A. Dehn, B.M. Dinelli, A. Dudhia, J.-M. Flaud, M. López-Puertas, F. Niro, J.J. Remedios, M. Ridolfi, H. Sembhi, L. Sgheri, and T. von Clarmann, “Ten years of MIPAS measurements with ESA Level 2 processor V6 – Part I: retrieval algorithm and diagnostics of the products,” Atmos. Meas. Tech. Discuss. 6, 461–518 (2013) [CrossRef] .
8. H. Fischer, M. Birk, C. Blom, B. Carli, M. Carlotti, T. von Clarmann, L. Delbouille, A. Dudhia, D. Ehhalt, M. Endemann, J. M. Flaud, R. Gessner, A. Kleinert, R. Koopman, J. Langen, M. López-Puertas, P. Mosner, H. Nett, H. Oelhaf, G. Perron, J. Remedios, M. Ridolfi, G. Stiller, and R. Zander, “MIPAS: an instrument for atmospheric and climate research,” Atmos. Chem. Phys. 8, 2151–2188 (2008) [CrossRef] .
9. M. Lampton, “Damping-undamping strategies for the Levenberg-Marquardt nonlinear least- squares method,” Comput. Phys. 11, 110–115 (1997) [CrossRef] .
10. M. Ridolfi and L. Sgheri, “A self-adapting and altitude-dependent regularization method for atmospheric profile retrievals,” Atmos. Chem. Phys. 9, 1883–1897 (2009) [CrossRef] .
11. M. Ridolfi and L. Sgheri, “Iterative approach to self-adapting and altitude-dependent regularization for atmospheric profile retrievals,” Opt. Express , 19, 26696–26709 (2011) [CrossRef] .
12. P. R. Bevington and D. K. Robinson, Data reduction and error analysis for the physical sciences, 3rd ed. (McGraw–Hill, 2003).
13. C. D. Rodgers, Inverse methods for atmospheric sounding: Theory and practice Atmospheric, Oceanic and Planetary Physics, (World Scientific, 2000).