## Abstract

A procedure for the optimal utilization and representation of the information retrieved from atmospheric observations is discussed. We show that the “measurement-space solution” exclusively contains the information present in the observations, but is not suitable for a representation of the retrieved profile in the form of a graph. The new method of the “null-space regularization”, which provides in a complement space an external constraint and makes the solution compliant with this representation, is presented. In this way the measurement and the constraint are separately determined and are given in a vertical grid that can be freely chosen as fine as desired. The method is applied to simulated measurements and is used to define a procedure for data fusion.

© 2009 Optical Society of America

## 1. Introduction

Several techniques exist for the retrieval of atmospheric parameters from remote sensing observations [1, 2]. A typical retrieved quantity is the vertical distribution of an atmospheric constituent. Since the observations do not provide enough information to retrieve the continuous function of the vertical profile, the problem arises of how to represent, in terms of which and how many parameters, the finite number of independent pieces of information provided by the observations. The first choice is that of the vertical grid on which to represent the profile. This vertical grid should be as fine as possible in order both to make sure that all the information included in the observations is adequately represented by the profile and to limit the need for interpolation in the subsequent applications of the data. Indeed, operations of interpolation do introduce a loss of information [3]. However, whenever the information obtained from the observations is not sufficient to determine as many parameters as the grid points, a too fine grid makes the inversion problem ill-posed or ill-conditioned. This problem is generally overcome using some external information. In the optimal estimation method [1], according to the Bayesian approach, an a-priori probability distribution function of the atmospheric state is used to fill the gaps present in the observations. In this case the solution is represented by the state corresponding to the maximum of the a-posteriori probability distribution function, which combines the a-priori information with that of the observations. An alternative method consists in applying a smoothness constraint to the profile. This can be done adding to the chi-square function minimized by the retrieval a term that increases when the profile oscillates [4–9]. In this case the information not coming from the observations is in the smoothness constraint. A further method that allows to make the problem well conditioned consists in writing the profile as a linear combination of a finite number of continuous functions (generally triangular functions, boxcars, polynomials, sine and cosine functions etc…) and to determine the coefficients of these functions fitting the observations [1]. In this case the information not coming from the observations consists in the number and in the kind of the chosen functions. In the following discussion the external information not coming from the observations is referred to as “a-priori information” in all cases, even when it is applied in the form of a constraint.

A-priori information makes it possible to obtain a vertical profile on a fine grid, but generates other types of problems in utilizing the data. In validation activities, when the compared profiles use different a-priori information, it is necessary to assess the error due to this difference [10]. The estimation of the error component due to the use of different a-priori information is a difficult task because it requires a reliable knowledge of the climatological variability of the quantity to be compared, which often is not available. Also in data assimilation, which is the combining of diverse data sampled at different times and different locations into atmospheric models, the measurements containing a-priori information pose some problems [1]. Profiles in different geolocations obtained using the same a-priori information are correlated among themselves as well as profiles obtained using nearby retrieved profiles as a-priori. If not taken into account, these correlations can produce a bias in the products of the assimilation. Also in the case of data fusion, which consists in the synergetic use of more than one measurement of the same atmospheric state obtained with different instruments, the presence of a-priori information in the original measurements can produce a bias in the final product.

In order to overcome the problems posed by the a-priori information included in the retrieved profiles some approaches have been proposed. S. Ceccherini et al. [11] described a method to perform comparison of measurements that uses the intersection space of the spaces of the averaging kernels of the measurements to be compared. This approach does not involve degradation of the available information and properly accounts for the a-priori information included in the profiles to be compared. Joiner et al. [12] proposed two methods that allow to reduce or eliminate the a-priori information from the retrieved quantities: the null-space filtering of the retrievals and the partial eigen-decomposition retrieval. Both methods involve the singular value decomposition (SVD) of appropriate retrieval operators and provide the coefficients of the singular vectors as quantities to be assimilated. Clarmann et al. [13] proposed a method that eliminates the a-priori information by means of a resampling on a coarser vertical grid providing a profile with averaging kernel matrix equal to the unity matrix.

In this paper a procedure is discussed for the optimal exploitation and representation of the vertical profile information retrieved from atmospheric observations. With optimal exploitation it is here intended the use of all the information present in the observations without addition of external information and, therefore, implies the use of a retrieval grid as fine as needed without any a-priori information. This objective has already been attained by Joiner et al. [12], but they only obtain an expression suitable for data assimilation which, as we shall see in Section 3, is not suitable for a representation of the retrieved profile in the form of a graph. The combination of an optimal exploitation with an optimal representation of the profile is the first objective of this paper. The second objective is to extend the optimal exploitation and representation procedure to the case of data fusion.

In Section 2 the theory of the proposed new method for the representation of the retrieved profile is described and in Section 3 it is applied to a simulated case. Section 4 describes the proposed procedure for data fusion problem.

## 2. Theory

#### 2.1 Terminology and notation

Consistently with the notations adopted by Rodgers [1], we represent the observations (radiances) with a vector **y** of *m* elements and the vertical profile of the unknown atmospheric parameter with a vector **x** of *n* elements corresponding to a predefined altitude grid. The forward model is a function **F**(**x**) that provides the value of the observations when the profile **x** is known. Therefore, the relationship between the vectors **x** and **y** is

where **ε** is the vector containing the experimental errors of the observations, characterized by a variance-covariance matrix (VCM) **S _{y}** given by the mean value of the product of

**ε**times its transposed.

We shall consider the case of a nearly linear relationship between **y** and **x**. In this case **F**(**x**) can be expanded up to the first order around a specific value of **x**, identified by **x _{0}** and referred to as the

*linearization point*, and Eq. (1) becomes equal to:

where **K** is the Jacobian matrix (that is the partial derivatives of **F**(**x**) with respect to the elements of **x**) calculated at **x _{0}**.

Equation (2) implies that the components of **y**-**F**(**x _{0}**) are the scalar products between

**x**-

**x**and the rows of

_{0}**K**plus the error components. It follows that the knowledge of

**y**-

**F**(

**x**) determines the knowledge of the component of

_{0}**x**-

**x**that lies in the space generated by the rows of

_{0}**K**. In agreement with [11], and with the conventions that

**y**are the observations and

**x**are the aimed measurements, we shall refer to this space as the

*measurement space*. Notice that this is not the terminology adopted in [1] where this space is called

*row space*and the expression

*measurement space*is instead used to indicate the space of

*m*dimensions to which the observations

**y**belong.

With the inversion of Eq. (2) one obtains the correction **x**-**x _{0}** that has to be applied to

**x**in order to determine

_{0}**x**. Since the correction is made in the measurement space and no information is acquired about the components of

**x**-

**x**in the orthogonal space that is complementary to the measurement space, also the determination of

_{0}**x**is to be considered as only made in the measurement space. However, this statement requires some further explanation. In fact, if

**x**has a component in the orthogonal space, when

_{0}**x**is combined with

_{0}**x**-

**x**to give

_{0}**x**also

**x**acquires this component. However such a component cannot be considered to be a result of the retrieval process because it is a contingent quantity that depends on the procedure rather than on the observations. On the basis of these considerations it is correct to say that since

**x**-

**x**is determined in the measurement space, also

_{0}**x**is determined in this space. The orthogonal complement space to the measurement space is referred to as the

*null space*[1].

In order to weigh the observations with their errors and to avoid the complication of correlated errors it is useful to consider the quantities **S _{y}**

^{-1/2}

**y**instead of the observations

**y**. Multiplying both terms of Eq. (2) on the left by

**S**

_{y}^{-1/2}we obtain:

It is easy to verify that **S _{y}**

^{-1/2}is characterized by a VCM that is the unity matrix.

#### 2.2 Representation of the profile

The unknown profile **x** is a vector of n elements and belongs, therefore, to the space ℝ^{n} which can be decomposed in the direct sum of the measurement space and of the null space. If we indicate with *p* the dimension of the measurement space then the dimension of the null space is *n*-*p*. Since each vector of ℝ^{n} can be decomposed as the sum of a vector belonging to the measurement space and a vector belonging to the null space, we can write:

where **x _{a}** and

**x**, respectively, belong to the measurement space and to the null space. They can be expressed as:

_{b}where **V** is a matrix whose columns are an orthonormal basis of the measurement space, **W** is a matrix whose columns are an orthonormal basis of the null space and **a** and **b** are the projections of **x** on these orthornormal bases:

where the superscript^{T} denotes transposed matrices.

#### 2.3 Estimation of the components in the measurement space

The component of **x** in the measurement space is the only quantity that can be derived from the observations. In order to find **x _{a}** from Eq. (5) we need to identify

**V**and

**a**. To this purpose we perform the SVD of

**S**

_{y}^{-1/2}

**K**:

where **U** is a matrix of dimension *mxp* whose columns (referred to as left singular vectors) are an orthonormal basis of the space generated by the columns of **S _{y}**

^{-1/2}

**K**,

**Λ**is a nonsingular diagonal matrix of dimension

*pxp*and

**V**is a matrix of dimension

*nxp*whose columns (referred to as right singular vectors) are an orthonormal basis of the space generated by the rows of

**S**

_{y}^{-1/2}

**K**. Since

**S**

_{y}^{-1/2}is a nonsingular matrix the space generated by the rows of

**S**

_{y}^{-1/2}

**K**coincides with the space generated by the rows of

**K**, therefore the columns of

**V**are an orthonormal basis of the measurement space and, among all the possible orthonormal bases of the measurement space, it can be chosen for representing

**x**with Eq. (5). We can now determine the components of

_{a}**x**relative to this orthonormal basis.

_{a}Substituting Eq. (9) in Eq. (3), multiplying both terms on the left by **Λ**
^{-1}
**U**
* ^{T}* and using Eq. (7), after some rearrangements, we obtain that

**â**, i.e. the estimation of

**a**deduced from the observations, is given by:

where

is the error that we make taking **â** as the estimation of **a** and is characterized by the diagonal VCM:

From Eq. (12) we see that the components of the vector **â** are independent of each other and are characterized by variances given by the inverse of the squared singular values of **S _{y}**

^{-1/2}

**K**. Therefore, components corresponding to large singular values are well determined while components corresponding to small singular values are poorly determined.

The definition of **x _{a}** in terms of

**V**and

**a**, determined respectively by Eq. (9) and by Eq. (10), will be referred to as the

*measurement-space solution*(MSS). This solution is obtained exploiting all the information coming from the observations without any a-priori information. It is, therefore, the optimal quantity to be used for further post-retrieval processing. This conclusion is consistent with the results obtained by [12] where a solution equal to that of Eq. (10) is used for data assimilation.

The MSS, while optimal for further processing because comprehensive, unbiased and with a diagonal VCM, is not suitable for a representation of the retrieved profile in the form of a graph (i.e. for a graphical representation). Indeed, if the dimension *n* is chosen with the desirable redundancy, components of the MSS that correspond to small singular values are poorly determined and make this solution ill-conditioned. On the other hand, if the poorly determined components are removed, the size of the null space grows at the expenses of the measurement space and, as it will be shown in Section 3, the retrieved profile acquires a rather unphysical shape.

As discussed in the introduction, these difficulties have forced alternative solutions, such as the use of a coarse and instrument dependent altitude grid and the use of constraints in the retrieval (regularization [4–9] and optimal estimation [1]).

#### 2.4 Estimation of the components in the null space

In general the information provided by the observations is not sufficient for the selection of a single profile, indeed it selects a set of possible profiles given by Eq. (4) where **x _{a}** is determined by the observations and

**x**is given by Eq. (6) with an arbitrary value of the components

_{b}**b**. The problem is here how to choose these components in the null space. When further processing of the data is planned it is adequate to provide the measurement information in a sub-space and taking advantage of this opportunity it is better to avoid any choice about

**x**. Indeed, the further processing may provide some estimation about the null-space components and any constraint (even the condition

_{b}**b**=0) does introduce a bias. However, if a graphical representation of the profile is requested different considerations apply. In a graphical representation the result is to be presented in a complete space. The representation of the

**x**profile with the MSS corresponds to selecting the value zero for

**b**. This is a particular choice among the infinitive possible ones, which does not necessarily provide the best graphical representation, as it will be shown in Section 3.

Since we do not expect that real profiles show unphysical oscillations, a good graphical representation can be obtained selecting the smoothest profile compatible with the observations. This can be done choosing a well conditioned MSS (obtained selecting only the well measured components of **a**) and determining in the null space (that now includes the null space enlarged with the poorly measured components rejected from the measurement space) the value of **b** that minimizes the oscillations. To this purpose, we minimize the quantity:

as a function of **b**, where **L _{1}** is the first-derivative matrix [7] with respect to altitude and Eqs. (4)–(6) have been used for the definition of

**x**. In these equations the matrices

**V**and

**W**do not necessarily coincide with the matrices used in Sections 2.2 and 2.3; they are now, respectively, the matrices containing the bases of the well measured subspace of the measurement space and of the new complementary null-space. Setting equal to zero the derivatives of the quantity in Eq. (13) with respect the components of

**b**and substituting for

**a**our estimation

**â**it results that the estimation of

**b**is given by:

where **R**=**L**
_{1}
^{T}
**L _{1}**. This calculation of the components of

**x**in the new null space allows obtaining the smoothest profile compatible with the observations and corresponds to a regularization of the profile; accordingly, it will be referred to as the

*null- space regularization*(NSR).

An important feature that makes this operation different from the other constraints that are applied to the retrieved profile is that in the case of the NSR the constraint is only applied to the null-space component and does not affect the measured component.

#### 2.5 Solution and its characterization

An estimation **x**̂ of the profile **x** that makes use of both the MSS and the NSR is obtained from Eqs. (4)–(6) and (14) and is equal to:

where **â** is given by Eq. (10). As it will be shown in Section 3, the solution **x**̂ given by Eq. (15) provides a useful graphical representation of the retrieved profile and it will be referred to as *regularized measurement-space solution* (RMSS). The equations that provide the RMSS can be used to determine its VCM and its averaging kernel matrix [1].

From Eqs. (12) and (15) we obtain that the VCM of **x**̂ is equal to:

In order to calculate the averaging kernel matrix we have to differentiate **x**̂, given by Eq. (15), with respect to the true profile **x** [1]. Since from Eqs. (2), (9) and (10) it results that:

from Eq. (15) we obtain that the averaging kernel matrix of the estimation of the profile is equal to:

## 3. Application of the method to a simulated case

In this section we provide some examples of the new procedure in the case of simulated measurements of the MIPAS (Michelson Interferometer for Passive Atmospheric Sounding) instrument [14] that flies aboard Envisat (ENVIronmental SATellite). MIPAS is a Fourier-transform spectrometer operating in the middle infrared that observes the atmospheric emission at the limb for the retrieval of the vertical profiles of several minor atmospheric constituents. The code adopted by the European Space Agency for the operational retrieval [15,16] uses a non-linear least-squares fit of the observed spectra with forward model simulations to retrieve the vertical profiles of pressure, temperature, water vapor, ozone, nitric acid, methane, nitrous oxide and nitrogen dioxide between 6 and 68 km of altitude.

The simulation is made in the case of a North pole climatological atmosphere of July with a modified ozone volume mixing ratio (VMR) profile **x**. The tangent altitudes for which the spectra are simulated are those corresponding to the MIPAS measurement scenario adopted between July 2002 and March 2004, that is with a 3 km step between 6 and 42 km, with a 5 km step between 42 and 52 km and with an 8 km step between 52 and 68 km (for a total of 17 tangent altitudes). The simulated observations **y** are obtained adding a realistic random noise to the radiances calculated with the forward model. The microwindow approach, described in [17], is adopted for the retrieval and of the 17 spectra only a subset of 4231 spectral points that contain the maximum information on ozone profile is used. The variances of the random noise and the apodization that is applied to the observed spectra allow building the VCM **S _{y}**.

The method described in Section 2 requires the definition of a predefined vertical grid on which to represent the ozone profile **x** and of a linearization point **x _{0}** close enough to the true profile in such a way that the linear approximation of the forward model is appropriate. The predefined grid can be chosen as fine as wished and, thanks to this freedom, it can be determined on the basis of the application rather than according to the vertical resolution of the measurements. In this example we have chosen a vertical grid of 1 km step between 0 and 100 km. The linearization point

**x**has been obtained interpolating at the predefined grid the ozone profile retrieved by the operational retrieval at the tangent altitudes grid. Accordingly, we have calculated the forward model and the Jacobian in the linearization point (respectively

_{o}**F**(

**x**) and

_{0}**K**). With these quantities we have all the ingredients needed to apply the method described in Section 2.

The SVD of **S _{y}**

^{-1/2}

**K**(a matrix of dimension 4231×101) determines 98 singular values different from zero that indicate a measurement space of dimension

*p*=98. The large rank of the matrix comes out from the SVD operation; however only few components correspond to large singular values (we recall that, according to Eq. (12), components corresponding to small singular values are determined with a large error). All the 98 measured components with their errors can be considered with no harm in successive data assimilation or data fusion operations (because in these applications the components will be weighted according to their errors), however if we want to make a graphical representation of the profile it is preferable not to include components with large errors because they introduce large uncertainties in the values of the profile. In this case, as suggested in Subsection 2.4, it is preferable to reduce the dimension

*p*of the measurement space and to consider the components affected by a large error as belonging to the null space.

The result of this operation is reported in panel (a) of Fig. 1 which shows the **X**̂ profile (red line) obtained with the procedure described in Section 2 in the case of considering only the 18 largest singular values. In the same panel the two components **X̂ _{a}** (blue line) and

**X̂**(green line) in the measurement space and in the null space, respectively, of

_{b}**X̂**are shown, illustrating at each altitude the relative importance of MSS and NSR contributions. In panel (b) the differences between

**X̂**and the

*true profile*

**x**

_{true}used to generate the observations (red line) and between

**X̂**and

_{a}**x**

_{true}(blue line) are shown on an expanded scale.

The **X̂ _{a}** profile obtained with the MSS method is characterized by oscillations and the addition of the

**X̂**component obtained with the NSR method is able to remove them and to decrease significantly the difference between the retrieved profile and the true profile.

_{b}As expected, the contribution of **X̂ _{b}** is predominant at altitudes larger than 68 km (the highest measured tangent altitude), below 6 km (the lowest measured tangent altitude) and at some altitudes between 52 and 68 km where the step of the measured tangent altitudes is the largest (8 km).

The largest values of the residual difference are observed at around 24 km of altitude, where a better vertical resolution would be needed in order to resolve the variation of the profile that occurs on a smaller scale with respect to both the step of the measurement tangent altitudes and the field of view of the instrument (that are equal to about 3 km).

In order to identify the optimal number of the largest singular values to consider for a good graphical representation of the profile the trade-off between the noise error and the smoothing error [1] has to be analyzed. A too large number of singular values implies that the noise affecting the observations can propagate with a large amplification into the retrieved profile, with the consequence of a too large retrieval error. A too small number of singular values reduces the capability of the retrieved profile to reproduce the shape of the real profile. These effects are verified in Fig. 2 where the true ozone profile (black line) is compared with the profile **X̂** obtained with the procedure described in Section 2 in the case of considering the 10 (blue line) and the 30 (red line) largest singular values.

The use of only the 10 largest singular values does not allow reproducing the feature at 24 km of altitude and also the correct value of the maximum of ozone, on the other hand the use of the 30 largest singular values determines the presence of large oscillations due to the noise amplification introduced by the ill-conditioned components.

The trade-off associated with the choice of how many largest singular values to maintain in the MSS solution is further analyzed in Fig. 3 which shows the comparison between the noise error (black line) and the smoothing error (red line), averaged on all the altitudes, as a function of the number of considered singular values. The noise errors have been calculated taking the root square of the diagonal elements of **S _{x}** given by Eq. (16), while the smoothing errors have been calculated as the absolute value of the differences between the profile obtained with the procedure described in Section 2, where

**a**is calculated projecting directly the true profile on the basis represented by the columns of

**V**, and the true profile.

As expected, the noise error increases with the number of considered singular values while the smoothing error decreases with it. Above a certain value of the number of singular values a slow decrease of the smoothing error is observed as a consequence of the fact that in the true profile there are features at high altitude that are not caught by the singular values considered in the plot. A number of considered singular values that corresponds to a good compromise between the two errors is that for which the sum of the two errors is minimum. In our case the minimum is obtained considering 18 singular values and the corresponding **X̂** profile is shown in Fig. 1. It is interesting to notice that the value 18 is larger than the number of measured tangent altitudes, which is equal to 17, but not significantly different. This confirms that also in the operational retrieval [15, 16] a good optimization has been made with the choice of a retrieval grid with points on the 17 tangent altitudes.

In Fig. 4 the difference between **X̂** and **x**
_{true} (black line) is compared with the difference between the profile obtained by the operational retrieval interpolated on the predefined grid (that in our case coincides with the linearization point **x _{0}**) and

**x**

_{true}(red line). The two methods (RMSS and operational retrieval) have comparable quality even if the first better reproduces the values at high altitude and the feature at 24 km while the second does better reproduce the values below 20 km. The differences are due to the fact that the two methods provide a slightly different compromise between vertical resolution and precision; however the general agreement shows that the NSR has provided an efficient regularization at least as good as the constraint of a limited number of retrieval points.

In Subsection 2.3 we have seen that, when the retrieved information is used for subsequent operations, it is better not to provide information about the null-space components, while no constraint (apart from the opportunity of limiting the amount of data) exists as far as the number of provided measurement-space components is concerned. In this case the MSS solution is the ideal product, since it contains all the needed information without useless null space assumptions. On the other hand, as shown in Fig. 3, when the retrieved information is used for the graphical representation of the profile, the conflicting requirements of representation in a complete space and of minimum error impose that some criterion is used for the determination of the optimal number of measurement-space components and that the NSR is applied to the complementary null space.

With the algebra discussed in Section 2 a single procedure can be used to provide the optimal profile for further operation (MSS) as well as the optimal profile for a graphical representation (RMSS). Both profiles can be represented on the same predefined grid and no measurement constraint exists about the choice of this grid. The possibility of using a fine predetermined grid is a major advantage, because it can prevent the approximations that are introduced by interpolations.

## 4. Application to data fusion

The theory developed in Section 2 also provides a framework for the handling of the data fusion problem. When several independent measurements of the same profile are available there is the problem of how to combine the different measurements in order to exploit all the available information for a more comprehensive and accurate description of the atmospheric state. Generally the different measurements are represented on different grids (defined by the different observation and retrieval methods) and their combination implies some interpolation that produces a loss of information [3]. Furthermore, it is possible that the measurements contain a-priori information that may introduce a bias in the fused data. Interpolations and biases are the two main problems encountered in data fusion. The method described in Section 2 removes these two problems. Indeed, the MSS provides a representation of the profile without a bias due to a-priori information and on a predetermined grid as fine as one likes so that interpolation is no longer needed. A further advantage of the MSS solution is that it does not depend on procedure-dependent quantities, such as **x _{0}**, so that it is not necessary to maintain memory of them.

In the light of these advantages of the MSS, in this Section we analyze how the MSS can be used for the fusion of two independent indirect measurements of the same profile **x**. The results of this section can be easily extended to an arbitrary number of independent measurements.

Since the predetermined grid can be chosen freely and be common to the two measurements, we can assume that the two measurement spaces are subspaces of the same space ℝ^{n}. The two MSSs are characterized by the matrices **V _{1}** and

**V**that identify the measurement spaces (of dimensions

_{2}*p*and

_{1}*p*respectively) and by the two vectors

_{2}**â**and

_{1}**â**(with their VCM

_{2}**S**and

_{a1}**S**). Each MSS provides the elements of the relationship shown in Eq. (10). We can write the two relationships in the compact form:

_{a2}where the notation $\left(\begin{array}{c}\mathbf{P}\\ \mathbf{Q}\end{array}\right)$ means the matrix (vector) obtained arranging the rows of the matrix (vector) **Q** below the rows of the matrix (vector) **P** and **α _{a1}** and

**α**contain the errors with which

_{a2}**a**=

_{1}**V**

_{1}^{T}

**x**and

**a**=

_{2}**V**

_{2}^{T}

**x**are estimated by

**â**and

_{1}**â**.

_{2}Eq. (19) implies that $\left(\begin{array}{c}{\hat{\mathbf{a}}}_{\mathbf{1}}\\ {\hat{\mathbf{a}}}_{\mathbf{2}}\end{array}\right)$ are the measurements of **x** in the space generated by the columns of **V _{1}** and of

**V**. This space is in the union space of the two measurement spaces and the profile that determined by these measurements is the result of the data fusion. This problem is equal to that faced in Section 2 consisting in finding a representation of the profile

_{2}**x**of ℝ

^{n}when the component in a subspace is measured; accordingly, also the same procedure is followed in order to determine the solution. We can represent the profile

**x**in the form of the summation of a vector of the union space of the two measurement spaces and of a vector of the orthogonal complement space to this space (which coincides with the intersection space of the two null spaces of the original measurements). The vector in the union space is the MSS of the data fusion problem and its identification implies the determination of the elements of the following relationship:

where **â**
_{12} is the estimation of the components of **x** in the basis of the union space represented by the matrix **V _{12}**, and

**ε**is its error.

_{a12}As in Section 2 we transform the vector $\left(\begin{array}{c}{\hat{\mathbf{a}}}_{\mathbf{1}}\\ {\hat{\mathbf{a}}}_{\mathbf{2}}\end{array}\right)$ in such a way that the VCM becomes the unity matrix. This can be obtained multiplying Eq. (19) by the matrix $\left(\begin{array}{cc}{{\mathbf{S}}_{{\mathbf{a}}_{\mathbf{1}}}}^{-1/2}& \mathbf{0}\\ \mathbf{0}& {{\mathbf{S}}_{{\mathbf{a}}_{\mathbf{2}}}}^{-1/2}\end{array}\right),$, obtaining:

If we perform the SVD of the kernel of Eq. (21) we obtain:

where **U _{12}** is a matrix of dimension (

*p*+

_{1}*p*)x×

_{2}*p*, with

_{12}*p*≤(

_{12}*p*+

_{1}*p*),

_{2}**Λ**is a nonsingular diagonal matrix of dimension

_{12}*p*×

_{12}*p*and

_{12}**V**is a matrix of dimension

_{12}*n*×

*p*whose columns are an orthonormal basis of the space generated by the rows of $\left(\begin{array}{cc}{{\mathbf{S}}_{{\mathbf{a}}_{\mathbf{1}}}}^{-1/2}& {\mathbf{V}}_{\mathbf{1}}^{T}\\ {{\mathbf{S}}_{{\mathbf{a}}_{\mathbf{2}}}}^{-1/2}& {\mathbf{V}}_{\mathbf{2}}^{T}\end{array}\right)$. Since

_{12}**S**

_{a1}^{-1/2}and

**S**

_{a2}^{-1/2}are nonsingular matrices, the space generated by the rows of $\left(\begin{array}{cc}{{\mathbf{S}}_{{\mathbf{a}}_{\mathbf{1}}}}^{-1/2}& {\mathbf{V}}_{\mathbf{1}}^{T}\\ {{\mathbf{S}}_{{\mathbf{a}}_{\mathbf{2}}}}^{-1/2}& {\mathbf{V}}_{\mathbf{2}}^{T}\end{array}\right)$ coincides with the space generated by the columns of

**V**and of

_{1}**V**. Consequently the columns of

_{2}**V**are an orthonormal basis of the union space of the two measurement spaces.

_{12}The knowledge of **V _{12}** can be used to determine the vector

**â**

_{12}. Substituting Eq. (22) in Eq. (21) and multiplying both terms on the left by

**Λ**

_{12}^{-1}

**U**

_{12}*we obtain:*

^{T}where **ε _{a12}** is given by:

and is characterized by the VCM:

The solution of the data fusion problem consists in the determination of the components of x in the union space of the two measurement spaces and in performing a regularisation in the orthogonal complement space of the union space. Eq. (23) and **V _{12}** obtained from Eq. (22) provide the MSS of

**x**in the union space of the two measurements. The undetermined component of

**x**lies in the orthogonal complement space of the union space, and can be determined following the procedure described in subsection 2.4. Finally the complete expression of the solution and its characterisation can be obtained as described in subsection 2.5.

Since the space characterized by **V _{12}** exclusively includes the two original measurement spaces, the MSS in the union space has been obtained by utilizing all the information provided by the two sets of observations and without any a-priori information. This characteristic makes the proposed method for data fusion optimal and equivalent to the simultaneous analysis of the two sets of observations.

## 5. Conclusions

A procedure for the optimal exploitation and representation of the vertical profile retrieved from atmospheric observations has been discussed. The optimal exploitation implies to extract all the information contained in the observations and to make it fully available to subsequent application, such as data comparison, data assimilation and data fusion. To this purpose the retrieved profile must be without any a-priori information and be represented on a vertical grid as fine as needed both for its exhaustive description and for the prevention of subsequent interpolation approximations. We have shown how this result is obtained with the *measurement-space solution* (MSS) which gives the retrieved profile in terms of its components in the subspace defined by the rows of the jacobian matrix. The determination of the MSS coincides with the method proposed by Joiner et al. [12] for data assimilation and leaves completely undetermined the component of the retrieved profile in the null space (that is the complement space to the measurement space).

The MSS, while being optimal for further processing because comprehensive, unbiased by a-priori information and with a diagonal VCM, is not suitable for a graphical representation of the retrieved profile. Indeed, using simulations applied to a test case of ozone retrieval with the MIPAS instrument, we have shown that if the dimension of the vertical retrieval grid is chosen with the desirable redundancy, components of the MSS that correspond to small singular values are poorly determined and the retrieval problem may become ill-conditioned. On the other hand, if the poorly determined components are removed, the size of the null space grows at the expenses of the measurement space and the retrieved profile acquires a rather unphysical shape. This unphysical shape is frequently observed in retrieval techniques and is due to the fact that in a graphical representation the result is presented in a complete space. In a complete space the representation of the profile with only the MSS component corresponds to the selection of the value zero for the null-space component. This is a particular choice among the infinitive possible ones, which does not necessarily provide the best graphical representation.

The above considerations underline the importance of choosing an adequate null-space component when a graphical representation is made. To this purpose we have presented the *null-space regularization* (NSR) method for the calculation of the null-space component which allows to obtain the smoothest profile compatible with the observations. This is a regularization and is different from the other regularizations since it does not affect the measured component and only determines the one that has not been measured. The fact that two components (the MSS and the NSR) are determined respectively in the measurement space and in the null space provides a clear and easily traceable distinction between the measurement and the constraint. Simulations have shown that MSS plus NSR, that is referred to as the *regularized measurement-space solution* (RMSS), provides a good graphical representation of the retrieved profile.

Finally, the procedure developed for the calculation of the RMSS has been applied to data fusion problems. A method has been presented that allows, from the MSS of two or more independent measurements, the calculation of the fused profile with the determination of its MSS in the union space (that is the space obtained merging the measurement spaces of the original measurements). The components of the profile in the orthogonal complement of the union space can be calculated with the NSR method. The sum of the MSS component with the NSR component provides the RMSS of the fused profile. This RMSS is the optimal data fusion in the sense that the result is the same as it would have been obtained with the simultaneous analysis of the measurements.

These results show that the MSS, combined with the NSR, provides both an optimal exploitation and an optimal representation of the vertical profile retrieved from atmospheric observations.

## References and links

**1. **C. D. Rodgers, *Inverse Methods for Atmospheric Sounding: Theory and Practice*, Vol. 2 of Series on Atmospheric, Oceanic and Planetary Physics (World Scientific, 2000). [CrossRef]

**2. **S. Twomey, *Introduction to the Mathematics of Inversion in Remote Sensing and Indirect Measurements* (Elsevier, New York, 1977).

**3. **B. Carli, P. Raspollini, M. Ridolfi, and B. M. Dinelli, “Discrete representation and resampling in limb-sounding measurements,” Appl. Opt. **40**, 1261–1268 (2001). [CrossRef]

**4. **A. Tikhonov, “On the solution of incorrectly stated problems and method of regularization,” Dokl. Akad. Nauk. SSSR **151**, 501–504 (1963).

**5. **S. Twomey, “On the numerical solution of Fredholm integral equations of the first kind by the inversion of the linear system produced by quadrature,” J. Assoc. Comput. Mach. **10**, 97–101 (1963).

**6. **B. Schimpf and F. Schreier, “Robust and efficient inversion of vertical sounding atmospheric high-resolution spectra by means of regularization,” J. Geophys. Res. **102**, 16037–16055 (1997). [CrossRef]

**7. **T. Steck, “Methods for determining regularization for atmospheric retrieval problems,” Appl. Opt. **41**, 1788–1797 (2002). [CrossRef] [PubMed]

**8. **S. Ceccherini, “Analytical determination of the regularization parameter in the retrieval of atmospheric vertical profiles,” Opt. Lett. **30**, 2554–2556 (2005). [CrossRef] [PubMed]

**9. **M. Ridolfi and L. Sgheri, “A self-adapting and altitude-dependent regularization method for atmospheric profile retrievals,” Atmos. Chem. Phys. Discuss **8**, 18007–18037 (2008). [CrossRef]

**10. **C. D. Rodgers and B. J. Connor, “Intercomparison of remote sounding instruments,” J. Geophys. Res. **108**, 4116–4130 (2003). [CrossRef]

**11. **S. Ceccherini, B. Carli, E. Pascale, M. Prosperi, P. Raspollini, and B. M. Dinelli, “Comparison of measurements made with two different instruments of the same atmospheric vertical profile,” Appl. Opt. **42**, 6465–6473 (2003). [CrossRef] [PubMed]

**12. **J. Joiner and A. M. da Silva, “Efficient methods to assimilate remotely sensed data based on information content,” Q. J. R. Meteorol. Soc. **124**, 1669–1694 (1998). [CrossRef]

**13. **T. von Clarmann and U. Grabowski, “Elimination of hidden a priori information from remotely sensed profile data,” Atmos. Chem. Phys. **7**, 397–408 (2007). [CrossRef]

**14. **H. Fischer, M. Birk, C. Blom, B. Carli, M. Carlotti, T. von Clarmann, L. Delbouille, A. Dudhia, D. Ehhalt, M. Endemann, J. M. Flaud, R. Gessner, A. Kleinert, R. Koopman, J. Langen, M. López-Puertas, P. Mosner, H. Nett, H. Oelhaf, G. Perron, J. Remedios, M. Ridolfi, G. Stiller, and R. Zander, “MIPAS: an instrument for atmospheric and climate research,” Atmos. Chem. Phys. **8**, 2151–2188 (2008). [CrossRef]

**15. **M. Ridolfi, B. Carli, M. Carlotti, T. v. Clarmann, B. M. Dinelli, A. Dudhia, J.-M. Flaud, M. Höpfner, P. E. Morris, P. Raspollini, G. Stiller, and R. J. Wells, “Optimized forward model and retrieval scheme for MIPAS near-real-time data processing,” Appl. Opt. **39**, 1323–1340 (2000). [CrossRef]

**16. **P. Raspollini, C. Belotti, A. Burgess, B. Carli, M. Carlotti, S. Ceccherini, B. M. Dinelli, A. Dudhia, J. M. Flaud, B. Funke, M. Höpfner, M. López-Puertas, V. Payne, C. Piccolo, J. J. Remedios, M. Ridolfi, and R. Spang, “MIPAS level 2 operational analysis,” Atmos. Chem. Phys. **6**, 5605–5630 (2006). [CrossRef]

**17. **A. Dudhia, V. L. Jay, and C. D. Rodgers, “Microwindow selection for high-spectral-resolution sounders,” Appl. Opt. **1**, 3665–3673 (2002). [CrossRef]