The lidar signal-to-noise ratio decreases rapidly with an increase in range, which severely affects the retrieval accuracy and the effective measure range of a lidar based on the Fernald method. To avoid this issue, an alternative approach is proposed to simultaneously retrieve lidar data accurately and obtain a de-noised signal as a by-product by combining the ensemble Kalman filter and the Fernald method. The dynamical model of the new algorithm is generated according to the lidar equation to forecast backscatter coefficients. In this paper, we use the ensemble sizes as 60 and the factor δ1/2 as 1.2 after being weighed against the accuracy and the time cost based on the performance function we define. The retrieval and de-noising results of both simulated and real signals demonstrate that our method is practical and effective. An extensive application of our method can be useful for the long-term determining of the aerosol optical properties.
© 2013 OSA
Lidar is a powerful active remote sensing apparatus used in atmospheric exploration. For instance, it can be used to detect clouds and aerosols, planetary boundary layer, stratospheric ozone . Lidar has become an extensively used tool due to its unique capability of measuring and illustrating the atmospheric vertical profile. There is an abundance of lidar networks and satellite lidar systems such as the Micro-pulse Lidar Network and Cloud–Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) which play significant roles in the study of local and global climatology .
Elastic lidar measurements are common due to relatively inexpensive operational design and cost as well as being a by-product of a more advanced lidar. A general single-scattering lidar equation for a single-wavelength elastic lidar can be written as [1, 3]4]), β(r) and α(r) are the atmospheric backscattering and extinction coefficient, respectively, the subscript 1 and 2 denote aerosol and molecule, respectively, and e(r) is the noise which can be considered as Gaussian.
Since the 1960s, many methods have been proposed to retrieve the atmospheric parameters from a lidar signal [1, 3, 5–8]. In the 1980s, Klett and Fernald presented stable inversion methods by using one-component and two-component atmospheric models, respectively [3, 6]. These methods have been considered as a standard inversion method and are hence widely used. The Fernald two-component algorithm is mathematically equivalent to the Klett one-component algorithm but can treat aerosol and molecular components separately which is important when neither component is dominant. Thus, we chose to use the Fernald two-component algorithm in this study.
In spite of the contributions made by Klett and Fernald, there are three major error sources involved in the retrieval [2, 6, 9]: (1) the backscatter calibration, (2) the prior knowledge of the lidar ratio (i.e. extinction to backscatter ratio) and its profile, and (3) the noise. In the retrieval, the backscatter calibration can be performed at the tropopause where we assume few aerosols are present. The lidar ratio can be obtained by auxiliary instruments such as a sun-photometer. However, the signal-to-noise ratio (SNR), which decreases rapidly with the increase of the range and attenuation of laser beam energy, can strongly affect the retrieval results, since the effective measure range of a lidar system critically depends on the SNR. Nevertheless, the measuring range of a lidar can be very short, especially in daytime measurements.
In order to increase the SNR and effective measuring range, averages over replications are generally applied, however, this presents the issue of diminishing returns. Halving the standard error requires an average over four replications with the next several halvings requiring 16, 64, and 256 replications. This rapidly reduces the time or vertical resolution of the lidar measurement. In order to avoid the expensive averaging technique, many methods have been presented to de-noise the lidar signal with low-pass filters, such as the Monte Carlo average, Fourier, wavelet and empirical mode decomposition methods [10–12]. Because P(r) is a grossly nonlinear non-stationary signal (due to the strong attenuation caused by the 1/r2 effect), the range of variability can be as large as seven or more orders of magnitude. The moving average and Fourier methods are not well suited for the de-noising of a nonlinear non-stationary lidar signal [11, 12]. Furthermore, a wavelet analysis has issues of cutoff frequency and basic wavelet function selection, while empirical mode decomposition is in an early development phase . Thus, most of the de-noise methods prove difficult to use for the de-noising of P(r). The magnitude range of the range-corrected signal X(r) [i.e. P(r)r2] is smaller than that of P(r). However, the noise level of X(r) is equivalent to e(r)r2, which rapidly increases with increasing range causing de-noising methods based on X(r) to yield undesirable results. Thus, using results based upon the de-noised signal by the Fernald method can be seriously distorted.
We consider an alternative approach which can simultaneously obtain an accurate de-noised lidar signal and a corresponding retrieval result by combining the ensemble Kalman filter and the Fernald method. The ensemble Kalman filter is based on a Monte Carlo random sampling, which can avoid the limitations of the regular and extended Kalman filters where the limitations are from problems of assumptions of linear model and filter divergence, respectively. In this approach, the dynamical model, which is for forecasting the backscatter coefficients, is generated according to the lidar equation. In order to obtain optimal results of our method, we select the parameters based on the performance function we defined. We select the ensemble sizes as 60 and δ1/2 as 1.2 after weighed against the accuracy and the time cost. The results of both the simulated and real signal demonstrate that our method is practical and effective, even at a far range where the SNR is low. Furthermore, our study suggests that de-noise is unnecessary in the near range because the SNR is very high.
2. Principles and methods
2.1 Flow of the retrieval and de-noising algorithm
A alternative algorithm is proposed for the retrieval of optical properties from a lidar signal. In the algorithm, overlap factor is considered as constant. The algorithm contains two similar parts: the forward and backward parts, however we discuss only the backward part in this study while the forward part can be extrapolated correspondingly. The backward part of the algorithm comprises three key steps as shown in Fig. 1 , which can be described in detail as follows:
According to the spatial correlation of the aerosol backscatter coefficient, we use β1(i + n)... β1(i + 1), β1(i) and Eq. (1) to forecast Xratio(i). Xratio(i) is defined as the ratio of the range-corrected signal of two adjacent range bins, which can be written as
In this step, we utilize the ensemble Kalman filter to reduce the noise of . We use , X(i-1) and Std·r2(i-1) as the input parameters in order to update in the ensemble Kalman filter. Then, we use the mean of the to update and obtain a more accurate X(i-1).
We retrieve β1(i-1) based on X(i) and the updated X(i-1) by the normal Fernald method, and then iterate the calculation until the whole β1 profile have been retrieved.
2.2 De-noising by ensemble Kalman filter
The Kalman filter is an optimal linear filter based on Gaussian distribution, and it can obtain the optimal estimation of model state in a dynamical system with prior information of two-order moments . The covariance of model state error decreases after each iteration in a Kalman filter. From this property, we can use it to reduce the difference between two adjacent states in a dynamical model as well as obtain the forecast and analysis signals one by one with the model and filter. The Kalman filter is based on assumptions of a linear model and linear observation operator with a Gaussian distributional error, while the extended Kalman filter was developed by using a tangent linearization to deal with the nonlinear model and measure equations. However, the estimated accuracy of the extended Kalman filter was two-order because of its linear approximation which can cause filter divergence due to the strong nonlinearities of model equations .
The ensemble Kalman filter (EnKF) (based on Monte Carlo random sampling) was proposed to avoid this problem. In the EnKF, forecast error covariance was estimated from the ensemble samples of the forecast of model state instead of the prior information in the Kalman filter . In this study, the dynamical model is generated according to the Fernald method to forecast the backscatter coefficients.
For a discrete dynamical system with model state Xk and random model error εk at time k, we can express the model equation as
where M(·) is the model operator for forecasting model states from time k-1 to k, and is the model error with zero-mean. We assume the model error is contained in the model forecast, i.e. the model operator is not perfect. Furthermore, the measurement equation is given by
where Yk is the observation at time k, H(·) is the observation operator, and is the observational error with zero-mean and covariance In the de-noising, the observation operator is identical and
The EnKF Algorithm was proposed to approximate the Kalman filter, which can be described as follows :
In the forecast step, at time k can be obtained from model state at time k-1, where the superscript f and a denote the forecast and analysis scheme, and the superscript (i) denotes the i-th sample of the variables, is the mean of N samples of . In the de-noising algorithm, we use and Xratio(i) to forecast approximately. Next, the forecast error are calculated which is used to obtain the analysis scheme at time k along with the Kalman gain matrix, In order to get Kalman gain, we should calculate the forecast error covariance matrices , , as below.
Finally, can be calculated as the estimator of the model state Xk and can be regarded as the estimation of analysis error. In the de-noising process, after a few iterations, the error covariance matrix would be under-represented due to the use of a small ensemble, hence updating the model states with observations would prove useless. This phenomenon is called filter divergence. We use covariance inflation to obtain a new error covariance matrix (multiplied by a factor of δ>1, or ) as a simple method to avoid filter divergence along with an additional filtering technique . The implementation of covariance inflation for ensemble method can be given by
In this paper, the magnitude of δ1/2 and N will be discussed and selected based on the performance function in Section 3.
2.3 Data retrieval by the Fernald method
For a single-wavelength elastic lidar, aerosol backscatter profiles can be calculated by the Fernald method only if assumptions are made for the aerosol lidar ratio S1(i) and the backscatter coefficient at a calibration range. Under these assumptions, the total backscatter coefficients at range bin i of the ground-based lidars can be derived based on the two-component Fernald method. The two-component method is the most common inversion method because it can distinguish the contributions between molecules and aerosols. The two-component Fernald method contains backward and forward solutions, which can be respectively given as [3, 6]
The backward solution:
The forward solution:2], but we assume it is constant with a value of 50 sr according to a previous study of long-term observations in an atmospheric background period . Calibration altitude c can be chosen at about 10 km, and we can assume β(c)/β2(c) = 1.05β2(c) for the real data retrieval.
In a retrieval combining the ensemble Kalman filter and the Fernald method, there are two selective parameters (N and δ1/2) involved. In order to optimize these parameters, we construct a performance function F(N, δ1/2) based on the final values of β1(i) retrieved with N and δ1/2 as
3. Results and discussion
The purpose of this section is to report the findings of the layer detection scheme defined in the previous section as applied to a variety of signals. This purpose will be achieved by testing the scheme with simulated and real signals with 7.5 m as the vertical resolution.
3.1 Testing with simulated signals
To test the algorithm, a single scattering signal of ground based lidar at a 532 nm wavelength with 50 sr as the aerosol lidar ratio has been simulated under standard conditions of pressure and temperature are illustrated in Fig. 2(a) . A typical optically thick multi-layer and planetary boundary layer were simulated at 4.4-6.8 km and below 2 km, respectively. Figure 2(a) shows the true value along with a signal contaminated by Gaussian noise with zero-mean. Figure 2(b) shows the true value and retrieved aerosol backscatter coefficient profiles by the normal Fernald method. Figures 2(a) and 2(b) also shows the de-noised signals and the aerosol backscatter coefficient retrieved by our method with the optimized parameters we will discuss, respectively. Generally, the aerosol backscatter coefficient retrieved by the normal Fernald method has a much larger uncertainty than that of our method. Furthermore, the noise of the lidar signal has been significantly suppressed by our method.
In order to obtain optimal results from using our method, we need to optimize parameters based upon the performance function we defined. The values of the performance function with the changes of N and δ1/2 are shown in Fig. 3 , respectively. The values of the performance function with a same δ1/2 but different ensemble sizes decrease with the ensemble size, and tended to be stable after 20. After weighing the de-noising effect and the calculation time, we select 60 as the ensemble size. Furthermore, the performance function with the same ensemble size but different δ1/2 has a minimum when δ1/2 approaches 1.2 which coincides with results from a previous study . Thus, we applied δ1/2 = 1.2 and N = 60 for the retrieval.
We simulated 200 noise contaminated profiles as shown in Fig. 2 and calculated the mean and standard deviation of the signals and retrieval results at different height as shown in Fig. 4 for investigating the performance of the Fernald method and our method. In the range below the multi-layer, the standard deviations of both methods are comparable and negligible. This suggests that both methods work well in this range since the SNR is very high. In the range of the multi-peaks layer, though there are biases in the result of our method, the standard deviations of our method are much lower than that of the Fernald method. Generally, the errors of our method are smaller than that of the Fernald method in this range, and our method can keep layer’s structure and distinguish close layers well. In the range above the multi-layer, the standard deviations of our method are distinctly less than that of the Fernald method. This suggests that the noise should be suppressed in this range because of the low SNR.
3.2 Testing with real signals
To further verify the practical utility and de-noising effect of our method, we test our algorithm in real cases measured by our ground-based lidar (532 nm) on 30th November 2008 at Wuhan University in Wuhan, China (30.5° N and 114.3° E). The vertical and time resolutions of the measurement are 7.5 m and 1 min, respectively. The overlap of our lidar is unity at ranges beyond 300 m, and the signals in the overlap affected range are not shown in this paper.
Figures 5(a) and 5(b) are the sequential aerosol backscatter coefficients retrieved by the normal Fernald method and our method, respectively. The patterns of Figs. 5(a) and 5(b) are the same, which means our method is as stable as the classic Fernald method and is suited for long-term determining of the aerosol optical properties. Figure 5(c) shows the mean and standard deviation of the observation signals and the de-noised signals, respectively. Figure 5(c) illustrate that the noise of the lidar signal has been suppressed by our method. However, the relative differences of the statistics information are not as significant as those of the simulated case, that because the variations of real signals are also contributed by the variation of the backscatter coefficient and the two-way transmittance of the atmosphere as shown in Eq. (1). However, in the simulated case, the backscatter coefficient and the two-way transmittance are constants. Figure 5(d) shows the mean and standard deviation of the aerosol backscatter coefficients retrieved by the normal Fernald method and our method. In the range below 4 km, the standard deviations of both methods are comparable. On the contrary, in the range above 4 km, the standard deviations of our method are distinctly less than that of the Fernald method, which conforms with the conclusion of the simulated case.
To further demonstrate the performance the performance of our method, we show six cases of results of one-minute signals retrieved by the two methods, as well as results of 64 minutes averaged signals, which are centered on the one-minute signal, retrieved by the Fernald method. The six one-minute signals are the signals at 22:00 to 5:30 with a 1:30 interval. As Figs. 6(a) -6(f), for one-minute signals, smooth signals can be observed below 4 km, however, the signals are seriously contaminated by noise above 4 km due to decreasing of the signal intensity and SNR caused by the 1/r2 effect and the attenuation of laser beam energy. The aerosol backscatter coefficient retrieved by our method not only has smaller uncertainty above 4 km, but also fits the profile of the normal Fernald method below 4 km well. The results of our method are consistent with that of the 64 minutes averaged signals. That means the effect of our method is approximately equivalent to that of averaging 64 replications in our retrieval, which can halve the standard error of the signal three times.
As seen in Figs. 7(a) -7(f), which are enlarged subsection views of Figs. 6(a)-6(f), respectively, the aerosol backscatter coefficient in the near range changes largely due to the dynamics of the boundary layer. In comparison with the retrieved result of the one-minute profile, both the results of our method and the 64 minutes averaged signal shown an over-smoothing in the peak information of the inflection points in many locations. For our method, over-smoothing was due to the poor forecast caused by the rapid change of the tendency of aerosol variation. Our method can provide acceptable results, but lead to a shift of a few bins sometimes because the effects of the poor forecast which can be still residual in the retrieval of a few lower bins. In the high-quality range below 2 km, the bias caused by the shift can be larger than the suppression impact for the random noise as shown in Figs. 7(a)-7(c), so the de-noise process is unnecessary. In the range between 0.3 and 2 km, assuming the results of the Fernald method are the “true” values and the random error of the results of our methods is negligible, we found that the relative bias (shift) of the results of our method is 5.7% on average. In the range between 2 and 4 km where the signal quality is still acceptable, our method starts to benefit the retrieval as shown in Figs. 7(d)-7(f). Figures 7(d)-7(f) also illustrate that our method can keep layer’s structure and distinguish close layers well under the noise condition. In the range above 4 km, the suppression impact for the random noise is much more significant than the bias caused by the shift of a few bins, so the de-noising effect is more remarkable as seen in Figs. 6(a)-6(f).
In the future, we can use a layer detection algorithm (e.g. the simple multiscale algorithm ) to find the base, peak and top of the aerosol and cloud layers, and then apply the normal Fernald method to retrieve the backscatter coefficient of the bins near the base, peak and top to avoid the effects of a poor forecast. Moreover, we can develop a forecast strategy for the layer base, peak and top. Note that we are not emphasizing the accuracies of the lidar ratio and the backscatter calibration, but emphasizing the effect of anti-noise in the retrieval. In this study, we consider a same constant lidar ratio and backscatter calibration in all data retrieval, which acts to cause consistent biases where the biases can offset one another in the comparisons. The effect of the lidar ratio and the backscatter calibration in the retrieval can refer to the error analysis of the Klett-Fernald algorithm which we based upon .
We propose an alternative method for retrieving the aerosol lidar backscatter coefficient profile by combining the ensemble Kalman filter and the Fernald methods. We test our method by using complex simulated and real signals, respectively. In order to obtain optimal results of our method, we select parameters based on the performance function we defined. We select the δ1/2 as 1.2 as well as the ensemble sizes as 60 in our retrieval. The retrieval results show that our method can obtain accurate backscatter coefficient. The results of both simulated and real signal demonstrate that the effectiveness and efficiency of our method are satisfactory. In particular, the experimental results of real signals show that our method can work at a far range where the signal-to-noise ratio is low. For our lidar data set, the effect of the new method is approximately equivalent to an average of 64 replications, which can halve the standard error three times. Furthermore, this study suggests that a de-noise scheme is unnecessary in the near range because the SNR is very high.
In this study, we only consider single-scattering in the lidar equation. However, the multiple scattering effects should be taken into account in future studies. Moreover, we can develop a forecast strategy for the layer base, peak and top to avoid over-smoothing due to the poor forecast caused by the tendency change of aerosol variation. In essence, further verification and improvement as well as extensive application of our method by long-term observations can be useful for the determining of the aerosol optical properties.
This work was supported by 973 Programs (2009CB723905, 2011CB707106), the NSFC (10978003). Additional thanks goes to Tim Logan and Matthew Gibbons for their insightful and helpful comments for this manuscript.
References and links
1. V. A. Kovalev and W. E. Eichinger, Elastic lidar: theory, practice, and analysis methods (Wiley-Interscience, 2004).
4. W. Gong, F. Mao, and J. Li, “OFLID: Simple method of overlap factor calculation with laser intensity distribution for biaxial lidar,” Opt. Commun. 284(12), 2966–2971 (2011). [CrossRef]
5. R. Collis, “Lidar: a new atmospheric probe,” Q. J. R. Meteorol. Soc. 92(392), 220–230 (1966). [CrossRef]
7. F. Rocadenbosch, C. Soriano, A. Comerón, and J. M. Baldasano, “Lidar inversion of atmospheric backscatter and extinction-to-backscatter ratios by use of a Kalman filter,” Appl. Opt. 38(15), 3175–3189 (1999). [CrossRef]
9. F. Rocadenbosch, M. N. Reba, M. Sicard, and A. Comerón, “Practical analytical backscatter error bars for elastic one-component lidar inversion algorithm,” Appl. Opt. 49(17), 3380–3393 (2010). [CrossRef]
10. H. T. Fang and D. S. Huang, “Noise reduction in lidar signal based on discrete wavelet transform,” Opt. Commun. 233(1-3), 67–76 (2004). [CrossRef]
12. W. Gong, J. Li, F. Mao, and J. Zhang, “Comparison of simultaneous signals obtained from a dual-field-of-view lidar and its application to noise reduction based on empirical mode decomposition,” Chin. Opt. Lett. 9(5), 050101–050104 (2011). [CrossRef]
13. R. E. Kalman, “A new approach to linear filtering and prediction problems,” J. Basic Eng. 82(1), 35–45 (1960). [CrossRef]
14. R. N. Miller, M. Ghil, and F. Gauthiez, “Advanced data assimilation in strongly nonlinear dynamical systems,” J. Atmos. Sci. 51(8), 1037–1056 (1994). [CrossRef]
15. G. Evensen, “Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics,” J. Geophys. Res. 99(C5), 143–162 (1994). [CrossRef]
16. G. Evensen, “The ensemble Kalman filter: Theoretical formulation and practical implementation,” Ocean Dyn. 53(4), 343–367 (2003). [CrossRef]
17. J. L. Anderson and S. L. Anderson, “A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts,” Mon. Weather Rev. 127(12), 2741–2758 (1999). [CrossRef]
18. Y. Sasano, “Tropospheric aerosol extinction coefficient profiles derived from scanning lidar measurements over Tsukuba, Japan, from 1990 to 1993,” Appl. Opt. 35(24), 4941–4952 (1996). [CrossRef]
19. P. Sakov and P. R. Oke, “A deterministic formulation of the ensemble Kalman filter: an alternative to ensemble square root filters,” Tellus, Ser. A, Dyn. Meterol. Oceanogr. 60(2), 361–371 (2008). [CrossRef]