## Abstract

Estimation of the bulk optical properties of turbid samples from spatially resolved reflectance measurements remains challenging, as the relation between the bulk optical properties and the acquired spatially resolved reflectance profiles is influenced by wavelength-dependent properties of the measurement system. The resulting measurement noise is apparent in the estimation of the bulk optical properties. In this study, a constrained inverse metamodeling approach is proposed to overcome these problems. First, a metamodel has been trained on a set of intralipid phantoms covering a wide range of optical properties to link the acquired spatially resolved reflectance profiles to the respective combinations of bulk optical properties (absorption coefficient and reduced scattering coefficient). In this metamodel, the wavelength (500 – 1700 nm) is considered as a third input parameter for the model to account for the wavelength dependent effects introduced by the measurement system. Secondly, a smoothness constraint on the reduced scattering coefficient spectra was implemented in the iterative inverse estimation procedure to robustify it against measurement noise and increase the reliability of the obtained bulk absorption and reduced scattering coefficient spectra. As the estimated values in some regions may be more reliable than others, the difference between simulated and measured values as a function of the evaluated absorption and scattering coefficients was combined in a 2D cost function. This cost function was used as a weight in the fitting procedure to find the parameters of the *µ _{s}*’ function giving the lowest cost over all the wavelengths together. In accordance with previous research, an exponential function was considered to represent the

*µ*’ spectra of intralipid phantoms. The fitting procedure also provides an absorption coefficient spectrum which is in accordance with the measurements and the estimated parameters of the exponential function. This robust inverse estimation algorithm was validated on an independent set of intralipid

_{s}^{®}phantoms and its performance was also compared to that of a classical single-wavelength inverse estimation algorithm. While its performance in estimating

*µ*was comparable (

_{a}*R*

^{2}of 0.844 vs. 0.862), it resulted in a large improvement in the estimation of

*µ*’ (

_{s}*R*

^{2}of 0.987 vs. 0.681). The change in performance is more apparent in the improvement of RMSE of

*µ*’, which decreases from 10.36 cm

_{s}^{−1}to 2.10 cm

^{−1}. The SRS profiles change more sensitively as a function of

*µ*. As a result, there is a large range of

_{a}*µ*’ and a small range of

_{s}*µ*resulting in a good fit between measurement and simulation. The robust inverse estimator incorporates information over the different wavelengths, to increase the accuracy of

_{a}*µ*’estimations and robustify the estimation process.

_{s}© 2015 Optical Society of America

## 1. Introduction

Visible and Near-Infrared (Vis/NIR) spectroscopy is widely used in the biomedical domain (e.g., early cancer detection) [1,2] and in the food and agricultural domain (quality assessment of fruits and vegetables, determining moisture content in food, fat content in meat, etc.) [1–3]. Optical measurements on biological tissues are challenging due to the presence of multiple layers (e.g., a skin around the fruit flesh, a skin over the subcutaneous tissue) and their high turbidity due to multiple scattering by the structured tissues [1–9]. In turbid media, light undergoes many directional changes as it interacts with the tissue.

The interaction of light with illuminated samples typically involves two important phenomena: scattering by physical microstructures with a refractive index which is different from the surrounding medium (e.g. cells, cell organelles, fat globules, air pores, fibers, starch granules, etc.), and absorption by chemical molecules. As spectra acquired from these samples are the result of both scattering and absorption, they can be used to detect changes in both the microstructure and chemical composition. As both scattering and absorption phenomena influence the diffuse reflectance spectra, a combination of multiple acquisitions has been suggested as a means to effectively separate these properties. The multiple acquisitions can differ in the distance from the point of illumination (Spatially Resolved Spectroscopy or SRS) [10–13], in the time after illumination with a short laser pulse (Time Resolved Spectroscopy) [14–16] or in frequency (Frequency Resolved Spectroscopy) [17,18]. From these three methods, SRS is the most cost-efficient and robust [1,2], and can be easily implemented in a dedicated reflectance probe design where multiple detector fibers are positioned at different distances from the illumination fiber(s). Accordingly, this technique is very suitable for on-line measurements on highly scattering media such as biological tissues and fluids. The design of this probe could be optimized in function of the sample type by changing the number and diameter of the detection fibers, number and diameter of the illumination fibers, source-detector distances, etc [19].

Light propagation models are necessary to translate the bulk optical properties (BOP) of a tissue into the corresponding optical measurements. Likewise, inverse light propagation models are required to deduce the optical properties from these measurements. When BOP are estimated, the absorption properties are represented by the absorption coefficient *µ _{a}*, while the scattering properties are summarized into an anisotropy factor

*g*and scattering coefficient

*µ*, both often combined into a single reduced scattering coefficient

_{s}*µ*=

_{s}’*µ*∙(1-

_{s}*g*) [1,2]. A powerful light propagation methodology is the Monte Carlo (MC) method, that has the potential of simulating the light propagation in multi-layered, complex structures. The drawback of this methodology is, however, that its stochastic nature can destabilize the inverse estimation process if the effects are not sufficiently averaged out by tracing a large number of photons. However, tracing a sufficiently large number of photons, typically > 10

^{6}, requires long computation times. This makes this method less suitable for use in an iterative estimation scheme.

Stochastic data-based surrogate models, which are referred to as ‘metamodels’, have been proposed to reduce this computational cost and enable tasks such as visualization, design space exploration, sensitivity analysis and optimization [20,21]. Such surrogate models are typically used to solve forward problems [21], where the metamodel serves as a high-way bridge between the design space (input parameters) and the performance space (output parameters). Stochastic data-based surrogate models could either be built on MC simulations or on SRS measurements [22]. Building themetamodels on MC simulations would provide high flexibility. However, generating a database requires long computation times.

Altough MC simulations are capable of providing a valid theoretical reflectance profile as a function of the BOP, the geometrical relation between the sample and the detection system (e.g., imaging or probe) still needs to be implemented separately [9]. Generally, the efficiency of the detection system depends on both the wavelength and the entrance angle of the detected photons. As a result, identical BOP at different wavelengths can results in different measured SRS profiles. Accounting for these effects in theoretical light propagation modeling (e.g. MC simulations) is very challenging as the wavelength-dependent properties of the detection system cannot be easily characterized. Creating a direct link between the BOP and actual SRS measurements seriously simplifies and robustifies the estimation procedure as both the light propagation characteristics, as well as the measurement geometry is accounted for in a single model. Nevertheless, the SRS measurements need to cover the entire search space of absorption and scattering properties which is relevant for the specific application of the system. In order to meet these demands, optical phantoms can be designed by mixing a scattering agent with a specific absorber and a dilution agent (reference). The wavelength-dependent absorption and scattering properties of the optical phantom are the primary design factors.

Solving the reverse problem - retrieving the BOP from measurements - is more focused on exploring the search space as efficiently as possible. This is done by an iterative estimation of the BOP, for which the spatially resolved reflectance profiles simulated with the metamodels match best with the measured profiles. Due to the complexity of the light propagation model, the separation of the BOP (absorption and scattering coefficients) is usually performed in a wavelength-by-wavelength manner, considering a measured spectrum as a set of *x* uncorrelated variables. At each individual wavelength, the set of BOP is sought, for which the light propagation model provides simulated values which agree best with the measured ones [24]. As the detector noise on each of the considered measurements cannot be removed, the estimation procedure can become unstable, resulting in a suboptimal separation of the BOP. In other words, the noise on the multiple measurements accumulates in the estimated absorption and scattering coefficients, especially at the major NIR absorption bands where the acquired transmitted or reflected intensities are lower [25]. This is not so problematic when the BOP of different product types are compared [24], or when the major variability is studied within a product type [26]. However, it becomes an essential problem when the obtained BOP are used to derive microstructural information or predict the concentration of absorbing constituents.

As biological tissues are highly polydisperse, the scattering coefficient is expected to be smooth in function of the wavelength [24–26]. This can be established by studying the BOP of polydisperse particles in an absorbing medium and evaluating the effect of the wavelength on the scattering efficiency [27]. This prior knowledge could be exploited to robustify the inverse estimation and increase the reliability of the obtained absorption and scattering coefficient spectra. Several researchers have proposed to apply a spectrally constrained approach to improve the robustness of the fitting procedures [10,28–33]. For example, it was proposed to fit an exponential function to the scattering coefficient spectra and a linear combination of the *µ _{a}* -spectra of the pure sample components to the absorption coefficient spectra [10,35]. This approach works well when the scattering particles are sufficiently small (Rayleigh scattering), while each constituent of interest represents a significant share in the sample’s

*µ*spectrum (high concentration and/or high molecular absorption) [33,35]. Nevertheless, the fitting operations were performed after applying the inverse light propagation model to separate the BOP wavelength-by-wavelength. As a result, any subtle absorption information identified as scattering due to a slightly unstable separation will be removed from the data by the fitting procedure. On the other hand, if some scattering information is erroneously contributed to the absorption coefficient, it may still be recognized as an additional absorption peak when fitting the pure absorption coefficient spectra of the potential sample components.

_{a}Therefore, the objective of this study was to robustify the extraction of the bulk absorption and reduced scattering coefficient (*µ _{s}*’) spectra from optical measurements, by including expert knowledge on the scattering coefficient spectra. Moreover, information from adjacent wavelengths is taken into account in the BOP estimation at a certain wavelength in order to obtain a smooth

*µ*’-function in the end. Finally, this approach is validated on an independent test set of simulated profiles and its performance is compared to that of a traditional wavelength-by-wavelength iterative inversion of the metamodel.

_{s}## 2. Materials and methods

#### 2.1 Optical characterization of liquid phantoms

Fourty-two liquid calibration phantoms (seven scattering levels and six absorption levels) and nine validation phantoms were prepared as aqueous mixtures of *Intralipid ^{®} 20%* (batch 10FH1726, Fresenius Kabi, Germany), serving as scatterer and Naphtol Blue Black (Sigma-Aldrich, Missouri, USA) 5 mM stock solution, acting as absorber in the range around 618 nm. The composition of the phantoms, necessary to obtain the different scattering and absorption levels, is summarized in Table 1. The scattering levels are designed by adjusting the concentration level of the intralipid in the phantom. Theoretically, a pure intralipid solution results in a µ

_{s}’ of 298.54 cm

^{−1}at 600 nm [25]. Decreasing the scattering coefficient is possible by reducing the intralipid concentration and adding water (influencing the absorption profile). The absorption levels are obtained by adjusting the concentration of Naphtol Blue Black. A 1% stock solution results in a µ

_{a}of 5.3 cm

^{−1}at 620 nm. The resulting µ

_{a}concentration is the result of the absorption by the Napthol Blue Black, water and intralipid mixture. The absorption levels are a guideline for the optical phantom design, but the exact µ

_{a}values are determined with the DIS-setup. The BOP displayed in Table 1 are on one hand the µ

_{a}at 620 nm and the a µ

_{s}’ at 600 nm. The absorption coefficient

*μ*and the reduced scattering coefficient

_{a}*μ*’ over a large wavelength range were estimated from double integrating sphere (DIS) and unscattered transmittance measurements [23]. This resulted in a profile of BOP as a function of the wavelength (550 nm – 1700 nm, step size of five nm).

_{s}The SRS setup consists of two 600 µm fibers (Romack, Williamsburg, USA), positioned at a center-to-center distance varying from 1.1 mm to 4 mm in steps of 0.1 mm (30 different distances). The first fiber serves as an illumination fiber and is connected to an Avalight 10 W halogen light source (Avantes, Eerbeek, Netherlands). The second fiber is used for detection and is mounted on a motorized Thorlabs 25 mm XYZ-translation stage (Thorlabs, New Jersey, USA). A spatially resolved diffuse reflectance profile is obtained by moving the detection fiber, thus obtaining reflectance measurements at different source-detector distances. An optical switch is present between the illumination fiber and the light source, in order to automatically switch between sample and system reference measurements. The latter is done by coupling the light source directly to the detector through an OD 3 neutral density filter (Qioptiq Limited, Luxembourg).

Both the sample and system reference profiles are measured with a Zeis Corona Vis/NIR 1.7 diode array spectrophotometer (Carl Zeiss, Jena, Germany) with a spectral range between 400 and 1700 nm. The tip of the illumination fiber and detection fiber are consistently positioned just below the upper surface of the liquid phantom. Before measuring the actual sample, a similar measurement is performed with both fibers positioned at the entrance of an integrating sphere (AvaSphere-50, Avantes, Eerbeek, The Netherlands) to obtain a white reference measurement. The entrance port of the integrating sphere is equipped with a baffle to prevent detection of the light which only experienced single reflection on the bottom of the sphere.

Every measurement is repeated 3 times. These three measurements are executed in direct succession, to obtain an estimate of the variability introduced by the measurement set-up.

#### 2.2 Forward light propagation model

A Stochastic Kriging surrogate model has been developed using the *ooDACE* toolbox [36–38], which is a versatile Matlab toolbox that implements the popular Gaussian Process based Kriging surrogate models. As 3 replicate measurements were available for each Intralipid phantom, information was available on the measurement noise as a function of the BOP. The toolbox uses the mean fiber scores (mean diffuse reflectance values of the detection fibers) and the intrinsic covariance matrix, representing the variance of the output values. More details on the use of Stochastic Kriging surrogate models for inverse BOP estimation can be found in [22].

As a result, the Stochastic Kriging surrogate model links the absorption coefficient *µ _{a}* and the reduced scattering coefficient

*µ*to a spatially resolved reflectance profile. In principle, the link between a set of BOP (

_{s}’*µ*and

_{a}*µ*’) and the respective SRS reflectance profile is independent of the wavelength. However, the angular collection efficiency of the system (detection fiber + spectrophotometer) depends on the refractive index of both the tissue and the fiber material and the numerical aperture and angular efficiency of the sensor. As these properties are typically dependent on the radiation wavelength, this needs to be accounted for in the model. Therefore, wavelength is a third input parameter of the metamodel, guaranteeing that the wavelength-dependent characteristics of the measurement set-up are correctly incorporated into the model.

_{s}Two sets of liquid phantoms have been developed, to obtain a a calibration and validation set. The calibration samples were used to construct the metamodel, while the validation set was adopted to evaluate the model performance. The large number of phantoms, combined with the large wavelength range over which the measurements have been performed, resulted in a large set of BOP that could be used for training the metamodels. Nevertheless, not all the wavelengths of the calibration samples were considered. Moreover, as the BOP at neigbouring wavelengths of the same sample are strongly intercorrelated, while the BOP of different phantoms can overlap, only 20% of the BOP were selected from the calibration database to train the model. This was done to reduce the time needed for the construction of the metamodels without reducing their performance. The datapoints were chosen as such that they follow the most evenly sparsed distribution, resulting in better metamodeling performance [36–38]. Because of the strongly intercorrelated effects, this large omission will not cause any problems [22].

#### 2.3 Inverse light propagation model

In a classical inverse estimation of BOP, an optimization procedure is used to find the set of BOP for which the simulated diffuse reflectance values at different distances most closely match the values measured for the fibers postioned at different source-detector distances [22]. The optimization is typically a wavelength-by-wavelength approach where the cost function is minimized for each individual wavelength. Generally, the sum of the squared relative errors is used as the cost function *F* which needs to be minimized [22]:

*F*approaches zero, the simulated profile approaches the measured one .

*I*and

_{i,meas}*I*respectively represent the measured and simulated intensity at log

_{i,sim}_{10}-scale for the

*i*fiber, and

^{th}*N*is the number of detection fibers. The log-transformation of the cost function is usefull to obtain a more robust convergence in the optimization process.

Compared to *µ _{a}*, an absolute change in

*µ*has a much smaller impact on the change in diffuse reflectance at a particular distance [22]. As a result, a relatively large range of

_{s}’*µ*’-values results in a similar value for the cost function. As one can see from Fig. 1, by strictly minimizing this cost function at each wavelength individually, a noisy

_{s}*µ*spectrum is obtained, especially if measurement noise is significant [22]. This phenomenon is also reflected in a sub-optimal

_{s}’*µ*spectrum.

_{a}The *µ _{s}*’-spectra are expected to follow a smooth pattern as function of the radiation wavelength [24–26]. This expert knowledge could be exploited to robustify the inverse estimation and increase the reliability of the obtained bulk

*µ*-and

_{a}*µ*’-spectra. Therefore, an alternative optimization is suggested, where the cost function is minimized for the entire range of wavelengths simultaneously. This approach consist of the following steps:

_{s}First, a grid search is performed for every wavelength independently. This involves evaluation of the cost function from Eq. (1) for the simulated spatially resolved reflectance profiles corresponding to all combinations of 721 *µ _{s}*’ values equally spaced on a logarithmic scale over the valid range of the metamodel (2.50 cm

^{−1}to 158.50 cm

^{−1}) and (initially) 8

*µ*values equally spaced on a logarithmic scale over the valid range of the metamodel (0.01 cm

_{a}^{−1}to 31.60 cm

^{−1}). Only 8

*µ*values are initially evaluated, since this number will be increased when required – as illustrated in Fig. 2.

_{a}This results in the initial evaluation of the forward metamodel and the calculation of the metamodel for 721x8 different combinations of BOP. The number of evaluated *µ _{a}* -coefficients is then iteratively increased, until the cost function does not change significantly as a function of

*µ*. This is determinded by setting a threshold value, which in this case is defined as 10

_{a}^{0.2}. This concept is illustrated in Fig. 2: as long as the difference in cost function at two consecutive points is larger than 10

^{0.2}, the algorithm will evaluate the cost function for an extra 8 intermediate

*µ*-values. As a result, the density of evaluated

_{a}*µ*values increases, as the cost function converges towards a (local) minimum. Consequently, at the end of the procedure, a large number (> 8) of

_{a}*µ*values is evaluated for each

_{a}*µ*value. The exact number varies as a function of the cost function complexity – as illustrated in Fig. 2. The resulting cost function in Fig. 2 illustrates the large reduction in the number of BOP combinations that need to be evaluated by the algorithm, ensuing a better computational performance. Lowering the threshold value (lower than 10

_{s}’^{0.2}) improves the accuracy of the methodology: more

*µ*values will be evaluated, and the methodology will converge more closely towards an accurate

_{a}*µ*. Increasing the number of

_{a}*µ*values will also improve the accuracy, but the impact of obtaining more

_{s}’*µ*values is larger with the current treshold value. However, it is recommended prioritize computation performance in this initial stage, as the BOP of a large set of phantoms need to be evaluated.

_{a}In the second step, the *µ _{a}* value giving the lowest value for the cost function is retained for each combination of wavelength and evaluated

*µ*’, together with the corresponding cost. This results in a 2D cost function and 2D

_{s}*µ*function. The 2D cost function –illustrated in Fig. 3 is then used as a weight in the fitting procedure to find the parameters of a smooth, parametric

_{a}*µ*’ function which corresponds to the lowest cumulative cost over all wavelengths together. By entering the selected

_{s}*µ*’ values for every wavelength into the 2D

_{s}*µ*function the corresponding

_{a}*µ*values can be obtained. In this way, a

_{a}*µ*spectrum is retrieved which is in accordance with the measurements and the estimated parameters of the smooth, parametric

_{a}*µ*’ function. Moreover, the optimizer minimizes the sum of the cost functions over all the wavelengths, while imposing a parametric function on the

_{s}*µ*’-profile. While this concept could work for any smooth, parametric

_{s}*µ*’ function, an exponential function was considered in this study, as this has been reported to resemble well the

_{s}*µ*’ spectrum of intralipid phantoms [23]:

_{s}In Eq. (2), *λ* is the wavelength, expressed in nanometers. The variables *p _{1}*,

*p*and

_{2}*p*are the fitting parameters, used to describe the exponential

_{3}*µ*’ function. The 2D cost function for a specific phantom sample is illustrated in Fig. 3. The minimum cost at each individual wavelength is indicated by the bold red line, representing the noisy

_{s}*µ*’ spectrum that would be obtained with the classical wavelength-by-wavelength approach. The goal of the proposed approach is to find the set of parameters (

_{s}*p*), minimizing the global cost function, i.e. the sum of all the costs at the different wavelengths.

_{1}, p_{2}, p_{3}The Nelder-Mead algorithm is a popular nonlinear optimization method [22], which is used here to minimize the global cost. It uses *N + 1* vertices in *N* dimensions, forming a triangle in 2 dimensions or a tetrahedron in 3 dimensions. The method approximates a local optimum of a problem when the objective function varies smoothly. The methodology evaluates the objective function at all *N + 1* points and generates a new test position by extrapolating the behavior of the objective function measured at each of the *N + 1* points arranged as a simplex. The algorithm then chooses to replace one of these test points with the new test point and starts over. The process is repeated with 8 starting points, which are evenly distributed over the search space. Therefore, even though the objective function is not strictly convex, at least one of the starting points will lead to the correct answer – the BOP providing fiber scores similar/identical to the measured ones. This procedure has been implemented in Matlab (v2010a, The MathWorks Inc., Natick, Massachusetts).

#### 2.4 Validation of inverse estimator

The robust inverse estimator described in section 2.3 was evaluated on the 42 phantoms of the calibration set and the 9 phantoms of the independent test set. The inverse estimation is repeated with a point-by-point approach, where the algorithm retrieves the combination of BOP at each wavelength resulting in the lowest value for the local cost function. The local cost function is described by Eq. (1), and is identical to the one used in the robust estimator. The BOP that have been found can be compared to the ones that were retrieved with the reference DIS methododology. The performance of both methodologies is quantified by computing the Root Mean Square Error of Prediction (RMSE) and the determination coefficient *R*^{2}. These parameters are an indication of how good the results of the inverse estimators – both with and without expert knowledge or spectral assumptions – correspond to the original BOP.

#### 2.5 Wavelength-dependency

As described in section 2.2., the metamodel links *µ _{a}*,

*µ*and

_{s}’*λ*to the SRS profiles, the signal measured at the 30 different source-detector distances. As this metamodel was trained on intralipid phantoms, there is a strong correlation between the wavelength and

*µ*values in the NIR range due to the dominant absorption by water in this range. The variation in the 1380-1700 nm wavelength range was limited to such an extent that it has been omitted from the inverse estimation procedure, because any inverse estimation would automatically result into

_{a}*µ*values situated into this small range. Another problem inherent to this set of intralipid phantoms, is that

_{a}*µ*and

_{a}*µ*are negatively correlated at larger wavelengths. Strongly scattering optical phantoms contain more intralipid, and thus less water, lowering

_{s}’*µ*from 900 to 1700 nm. Although it is possible to use the metamodel outside the boundaries of the calibration set, Kriging metamodels do not function well for extrapolating data. Therefore, the results outside the calibration range are not reliable. Furthermore, it has been reported before that the

_{a}*µ*’-spectra for intralipid phantoms typically follow an exponential decay with increasing wavelength. As a result, one might argue that this would result in an inherent bias in the estimation of

_{s}*µ*’-profiles.

_{s}The first issue could be solved in two ways: in the first approach, the wavelength dependency in the relation between the BOP and the acquired spatially resolved reflectance profiles should be eliminated, such that the metamodels are only function of the BOP, while they are valid for all wavelengths. The second approach would involve developping a much larger calibration set with different types of absorbers, active in different wavelength ranges, and different type of scatterers such that there is minimal correlation between the wavelength and the BOP values. This would, however, increase the required number of phantoms at least with a factor 5 to 10. Both solutions are very difficult to realize in practice and were therefore outside the scope of this study.

The second issue can be tested by generating arbitrary SRS-profiles with the metamodel and adding a realistic noise level to these profiles. The goal of a Kriging interpolator is to estimate function values for data that are not available in the calibration set. It is a statistical model that produces the most likely output value as a function of the BOP, as well as the uncertainty associated with those estimates. This uncertainty estimation is in first instance defined by the variance of the SRS profiles in the calibration set corresponding to the same BOP. Twenty-two independent *µ _{s}*’ spectra have been combined with twenty-two arbitrary

*µ*spectra. The

_{a}*µ*’ spectra are the result of superposing randomly generated 3rd order polynomials onto spectra of the calibration set. The

_{s}*µ*spectra are obtained by the superposition of a random linear trend on the spectra of the calibration set. In order to limit the total computation time, an arbitrary selection was made, resulting in twenty-two noisy SRS-profiles. The purpose of implementing these randomly generated polynomials is (1) to make sure the estimation procedure does not introduce an exponential decay bias, present in the optical phantoms used for developing the metamodel. Another important reason is (2) to demonstrate the applicability of the methodology on systems with more complex

_{a}*µ*’-profiles [41]. These simulated SRS profiles were then evaluated with the inverse estimation algorithm to see how accurately the corresponding BOP spectra could be retrieved. Since the function of Eq. (2) will no longer be suitable to describe the

_{s}*µ*’-profiles in the test set, the exponential function has been expanded with a polynomial of the 3th degree. As a result, the optimizer will try to find the parameters of the following smooth – yet more complex -

_{s}*µ*’ function by minimizing the global cost function Eq. (3) over all wavelengths:

_{s}## 3. Results

#### 3.1. Bulk optical properties of liquid phantoms

The combinations of BOP estimated from DIS measurements for the liquid phantoms are illustrated in Fig. 4 (calibration set and validation set). The measurements were originally obtained in the wavelength range of 400 and 1800 nm, and were selectively trimmed (550 – 1700 nm) to prevent significant detector noise. The optical phantoms have originally been developed to ensure an optimal coverage of the search space of BOP. The *µ _{a}* coefficients are situated within values of 10

^{−2}and 35 cm

^{−1}, while the

*µ*’-values are in the range of 1 - 10

_{s}^{2}cm

^{−1}– as illustrated in Fig. 4. This wide range was considered to obtain a metamodel which is suitable for a wide range of applications.

The range of the *µ _{a}* and

*µ*’ coefficients at every wavelength for the validation set are mostly situated within the boundaries of the calibration set. From Fig. 5, it can be seen that the range of the

_{s}*µ*values is limited for wavelengths larger than 1380 nm. As a result, the developed forward metamodel is only applicable for a limited range of

_{a}*µ*coefficients in that wavelength range. The

_{a}*µ*spectra of biological tissues in this wavelength range are dominated by water, and will most likely fall within this range. The inverse estimation procedure was only tested up to 1380 nm, to avoid any ambiguity due to this limited range.

_{a}#### 3.2. Results of the inverse estimation on the calibration and validation set

The SRS measurements on the 42 phantoms of the calibration set, have been used to develop a metamodel, linking the inputs (*µ _{a}*,

*µ*and

_{s}’*λ*) to the SRS signals. The inverse estimation procedure, explained in paragraph 2.3 and Fig. 3, has been tested on these 42 calibration phantoms and on the 9 validation phantoms which had not been used for building the metamodel.

In Fig. 6 the estimated BOP are plotted as a function of the reference BOP for the validation set. As mentioned higher, wavelengths larger than 1380 nm have not been included in the analysis. The data have been displayed in a single scatter plot, providing a summary of the performance of the estimation procedure over all phantoms. As one can see from Fig. 6, the estimated *µ _{s}*’-profiles are in close agreement with the original measurements. This results in an

*R*

^{2}-value of 0.987 and an RMSE of 2.10 cm

^{−1}. The error on the estimated

*µ*-profiles is larger, resulting in an

_{a}*R*

^{2}-value of 0.847 and an RMSE of 2.33 cm

^{−1}.

An overview of the results using the robust inverse estimator, both for the validation and calibration set, can be found in Table 2. One can see in Table 2 that the error in estimation of *µ _{a}*, increases from 0.719 cm

^{−1}in the calibration, to 2.330 cm

^{−1}in the validation process. Since the information grid is developed with equally spaced

*µ*and

_{a}*µ*values on a log

_{s}’_{10}-scale, the results are better capable of converging towards the actual result for smaller µa values (e.g., a deviation of 0.1 results in a different error if

*µ*equals to –2, compared to a

_{a}*µ*of 1). The reasoning for this approach, is that a larger error is acceptable for larger BOP, as long as the relative error remains constant. The relative error in

_{a}*µ*is more or less constant, visualized in the trend of Fig. 6. However, due to the larger absorption values found in the validation set –illustrated in Fig. 4 the average absolute error increases.

_{a}To investigate the added value of the presented methodology, the BOP have been estimated a second time, using a point-by-point approach. In this case, the algorithm searches for the combination of BOP that results in the smallest cost function value at each wavelength. Since no expert knowledge is used for establishing the BOP-profiles, these can be noisy, as previously illustrated in Fig. 1. This noisiness is clearly visible in Fig. 7. A large number of estimated *µ _{s}*’ values for the validation set are randomly scattered around the target line - red line in Fig. 7. This results in an

*R*

^{2}value of 0.681 and an RMSE of 10.361 cm

^{−1}. The general trend of the

*µ*estimations remains visible in Fig. 7. However, the variation of the estimations around the 1:1 line increases, resulting in a rise of RMSE. After comparison with these results, it is clear that the presented inverse estimator is very robust for the estimation of the

_{s}’*µ*’ values. The estimated

_{s}*µ*values are of similar accuracy as those obtained with the robust inverse estimation procedure – summarized in Fig. 6. The corresponding

_{a}*R*

^{2}value of 0.862 and an RMSE of 2.22 cm

^{−1}are even slightly better.

An overview of the results using the point-by-point inverse estimator, both for the validation and calibration set, can be found in Table 2. Even though the robust inverse estimator does not result in an improved *R*^{2} or RMSE for *µ _{a}*, the improvements in the estimation of

*µ*is very clear.

_{s}’Although the estimation of the *µ _{s}*’-profiles produces better results with the robust inverse estimator, the estimation of

*µ*did not improve. To investigate this in more detail, the estimated BOP spectra for all phantoms were plotted together with their corresponding reference BOP spectra, as obtained from DIS. Here, we focus on phantoms 32 and 33 of the calibration set, as these are quite characteristics for the problems which were identified in this way. In Fig. 8, the estimated and reference BOP of phantom 33 are illustrated. The estimation is clearly successful for the

_{a}*µ*’-coefficient spectrum. However, small artefacts can be observed in the estimated

_{s}*µ*spectrum.

_{a}The general trends are captured by the estimator, but the estimated profile is not as smooth as the one obtained from DIS. A first explanation can be found in a shift from noisy *µ _{s}’* spectra, to (extra) noise in the

*µ*spectra. However, because of the different sensitivity, this noise is less emphatic. Secondly, the discretization process during the generation of the information grid causes approximation errors, which can be reduced by lowering the threshold value in the development process of the information grid. The number of evaluated

_{a}*µ*-coefficients is iteratively increased, until the cost function does not change significantly as a function of

_{a}*µ*. This means that the accuracy of

_{a}*µ*is limited, due to limitations that have been set to achieve computational performance.

_{a}As illustrated in Fig. 9, the estimation of the BOP for phantom 32 of the calibration set is even more problematic. This phantom had the same intralipid concentration as phantom 33, resulting in the same scattering properties, which are also estimated correctly by the algorithm. However, the estimation of the *µ _{a}* values is less succesful. In the 1050–1300 nm wavelength range, the algorithm suggests that the

*µ*values are systematically higher than those obtained from the DIS measurements on this phantom. At around 1200 nm, the retrieved

_{a}*µ*exceeds the boundaries illustrated in Fig. 5, indicating a local problem in retrieving the correct BOP. The robust estimator is still capable of retrieving a correct

_{a}*µ*spectrum, because it can use information over the entire wavelength range. The BOP estimates from DIS measurements for this phantom are illustrated in Fig. 10 together with the error margins on these estimates computed from the 3 replicate measurements on the same phantom.

_{s}’It should be noted that the uncertainty on the BOP estimates from DIS is the highest in the 1050 – 1300 nm wavelength range. Although the stochastic Kriging algorithm takes this uncertainty on the BOP into account in the development of the metamodel - by increasing the margins of the interpolator in this wavelength range - this high uncertainty on the input values has a negative impact on the reliability of the metamodel in this range.

#### 3.3. Results of the inverse estimation on the wavelength dependency test

Figure 11 illustrates the BOP values estimated by the robust estimator on the virtual phantoms of the wavelength dependency test, as described in section 2.5. The estimated *µ _{s}*’ values are in close agreement with the original ones, resulting in an

*R*

^{2}value of 0.984 and an RMSE of 0.76 cm

^{−1}. In the case of the

*µ*, the

_{a}*R*

^{2}is 0.985 and the RMSE is 0.10 cm

^{−1}. These results are considerably better than those obtained for the calibration and validation set, since the estimation procedure is run on virtual phantoms, using simulation-based SRS profiles (with realistic noise).

The estimated *µ _{s}*’-spectra in Fig. 11 display a systematic bias. This can be explained by the robustness of the algorithm, used to optimize the global cost function. A more elaborate

*µ*’ function is used [Eq. (4)], however the optimizer is encouraged to describe the polynomial using low fitting parameters. This improves the computational efficiency of the algorithm, but can result in sub-optimal result (when a local minimum is found).

_{s}The point-by-point estimator, as proposed in [22], was also evaluated on these virtual phantoms of the wavelength dependency test. As can be seen from Fig. 12, the correspondence of the estimated *µ _{s}’* values do not fit as well with the designed values (

*R*

^{2}= 0.809, RMSE = 3.24 cm

^{−1}). The estimation of the

*µ*values is significantly worse with an

_{a}*R*

^{2}of 0.846 and an RMSE of 1.52 cm

^{−1}. The low

*R*

^{2}value can be attributed to the presence of some outliers, where the point-by-point estimator found the best fit to the spatially resolved reflectance profile for these combinations of BOP. Such outliers were not observed in Fig. 11, indicating that the inverse estimator proposed in this study is more robust and can avoid these deviating estimates.

## 4. Discussion

Although the validation was successful, it should be noted that the approach followed in this study still has some limitations which can be tackled in future research. Moreover, the efficiency of the process to develop the 2D cost function and to perform the inverse estimation procedure can be futher improved. This will allow the methodology to choose the set of BOP – evaluated in the generation of the information grid – more selectively, and hence improve the accuracy, while limiting the computation of redundant information.

However, this current approach will be perfectly suited for the final goal of this robust inverse estimator. In that study, 30 source-detector distances are availaible and are used for the inverse estimation procedure. The final goal is to evaluate the effect of different fiber configurations on the inverse estimation procedure, a relevant section of the information grid (e.g. 2D cost function) can be used to quickly generate a profile corresponding to a specific sensor design (i.e. a selection of specific fibers). This will allow a very fast inverse estimation for different sensor configurations, which in a later phase can be used to develop a computationally efficient sensor design algorithm.

A more important problem that needs to be addressed in future research, is the characterization of the BOP with the DIS reference methodology. Although the uncertainty on the BOP values obtained from DIS is taken into account by the Kriging model, a higher uncertainty still has a negative impact on the accuracy of the model. Therefore, more accurate estimates for the input BOP values would allow to build a more accurate inverse estimator. DIS measurements are a usefull tool to estimate the BOP of samples, but should serve as a control tool, rather then as input for a databased model. If that issue is not resolved, the accuracy of the robust inverse estimator cannot outperform the accuracy of DIS estimations. Furthermore, it is important to remark that the estimation procedure to obtain the DIS results is executed wavelength-by-wavelength. Consequently, the estimated BOP profiles are subdued to noise, which will influence any estimator that attempts to remove this wavelength dependency.

## 5. Conclusions

A computationally efficient method has been proposed for estimating BOP values from spatially resolved reflectance measurements. This estimator aims at retrieving smooth *µ _{s}*’-spectra from noisy SRS signals by means of a metamodel which describes the spatially resolved reflectance profile as a function of the

*µ*, the

_{a}*µ*’ and the wavelength. The wavelength is used as an input to characterize the wavelength dependent effects in the measurement system. This metamodel is then used in an inverse estimation procedure minimizing a 2D cost function, which quantifies the match between the simulated and measured SRS profile as a function of the reduced scattering coefficient and the wavelength. This cost function allows to constrain the estimated

_{s}*µ*’ spectrum to a smooth parametric function, while still maximizing the resemblance between simulations and measurements.

_{s}This robust inverse estimation algorithm was validated on an independent set of intralipid^{®} phantoms and its performance was also compared to that of a classical single-wavelength inverse estimation algorithm. While its performance in estimating *µ _{a}* was comparable (

*R*

^{2}of 0.844 vs. 0.862), it resulted in a large improvement in the estimation of

*µ*’ (

_{s}*R*

^{2}of 0.987 vs. 0.681). The change in performance is more apparent in the improvement of RMSE of

*µ*’, which decreases from 10.36 cm

_{s}^{−1}to 2.10 cm

^{−1}. The robust inverse estimator incorporates information over the different wavelengths, to increase the accuracy of

*µ*’estimations and robustify the estimation process.

_{s}## Acknowledgments

The authors gratefully acknowledge the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT-Flanders) for the financial support through the GlucoSens project (SB-090053) and the PhD grants of Rodrigo Watté and Robbe Van Beers (grant numbers 101552 and 131777). Ben Aernouts is funded as postdoctoral fellow of the Research Foundation–Flanders (FWO, Brussels, Belgium, grant 12K3916N).

## References and links

**1. **V. V. Tuchin, *Tissue Optics: Light Scattering Methods and Instruments for Medical Diagnosis* (SPIE, 2007).

**2. **V. V. Tuchin, *Handbook of Optical Sensing of Glucose in Biological Fluids and Tissues* (Chemical Rubber Company, 2008).

**3. **B. Nicolaï, K. Beullens, E. Bobelyn, A. Peirs, W. Saeys, K. Theron, and J. Lammertyn, “Nondestructive measurement of fruit and vegetable quality by means of NIR spectroscopy: q review,” Postharvest Biol. Technol. **46**(2), 99–118 (2007). [CrossRef]

**4. **J. Lammertyn, A. Peirs, J. De Baerdemaeker, and B. Nicolaï, “Light penetration properties of NIR radiation in fruit with respect to non-destructive quality assessment,” Postharvest Biol. Technol. **18**(2), 121–132 (2000). [CrossRef]

**5. **D. G. Fraser, R. B. Jordan, R. Künnemeyer, and V. A. McGlone, “Light distribution inside mandarin fruit during internal quality assessment by NIR spectroscopy,” Postharvest Biol. Technol. **27**(2), 185–196 (2003). [CrossRef]

**6. **D. G. Fraser, V. A. McGlone, R. B. Jordan, and R. Künnemeyer, “NIR (Near Infra-Red) light penetration into an apple,” Postharvest Biol. Technol. **22**, 191–194 (2001). [CrossRef]

**7. **E. Zamora-Rojas, A. Garrido-Varo, B. Aernouts, D. Pérez-Marin, W. Saeys, and J. É. Guerrero-Ginel, “Understanding near infrared radiation propagation in pig skin reflectance measurements,” Innov. Food Sci. Emerg. Technol. **22**, 137–146 (2014). [CrossRef]

**8. **H. Cen and R. Lu, “Optimization of the hyperspectral imaging-based spatially-resolved system for measuring the optical properties of biological materials,” Opt. Express **18**(16), 17412–17432 (2010). [CrossRef] [PubMed]

**9. **J. Qin and R. Lu, “Monte Carlo simulation for quantification of light transport features in apples,” Comput. Electron. Agric. **68**(1), 44–51 (2009). [CrossRef]

**10. **R. M. P. Doornbos, R. Lang, M. C. Aalders, F. W. Cross, and H. J. C. M. Sterenborg, “The determination of in vivo human tissue optical properties and absolute chromophore concentrations using spatially resolved steady-state diffuse reflectance spectroscopy,” Phys. Med. Biol. **44**(4), 967–981 (1999). [CrossRef] [PubMed]

**11. **T. S. Leung, N. Aladangady, C. E. Elwell, D. T. Delpy, and K. Costeloe, “A new method for the measurement of cerebral blood volume and total circulating blood volume using Near Infrared spatially resolved spectroscopy and indocyanine green: application and validation in neonates,” Pediatr. Res. **55**(1), 134–141 (2004). [CrossRef] [PubMed]

**12. **D. Arifler, C. MacAulay, M. Follen, and R. Richards-Kortum, “Spatially resolved reflectance spectroscopy for diagnosis of cervical precancer: Monte Carlo modeling and comparison to clinical measurements,” J. Biomed. Opt. **11**(6), 064027 (2006). [CrossRef] [PubMed]

**13. **A. Garcia-Uribe, J. Zou, T.-H. Chang, M. Duvic, V. Prieto, and L. V. Wang, “Oblique-incidence spatially resolved diffuse reflectance spectroscopic diagnosis of skin cancer,” Proc. SPIE **7572**, 75720L (2010). [CrossRef]

**14. **R. Cubeddu, A. Pifferi, P. Taroni, A. Torricelli, and G. Valentini, “Noninvasive absorption and scattering spectroscopy of bulk diffusive media: An application to the optical characterization of human breast,” Appl. Phys. Lett. **74**(6), 874–876 (1999). [CrossRef]

**15. **A. Torricelli, A. Pifferi, P. Taroni, E. Giambattistelli, and R. Cubeddu, “In vivo optical characterization of human tissues from 610 to 1010 nm by time-resolved reflectance spectroscopy,” Phys. Med. Biol. **46**(8), 2227–2237 (2001). [CrossRef] [PubMed]

**16. **A. Torricelli, L. Spinelli, A. Pifferi, P. Taroni, R. Cubeddu, and G. Danesini, “Use of a nonlinear perturbation approach for in vivo breast lesion characterization by multiwavelength time-resolved optical mammography,” Opt. Express **11**(8), 853–867 (2003). [CrossRef] [PubMed]

**17. **B. Guan, Y. Zhang, S. Huang, and B. Chance, “Determination of optical properties using improved frequency-resolved spectroscopy,” Proc. SPIE **3548**, 17–26 (1998). [CrossRef]

**18. **J. Y. Le Pommellec and J. P. L’Huillier, “Determination of the optical properties of breast tissues using frequency-resolved transillumination: basic theory and preliminary results,” Proc. SPIE **4161**, 202–215 (2000). [CrossRef]

**19. **R. Van Beers, B. Aernouts, L. León Gutiérrez, C. Erkinbaev, K. Rutten, A. Schenk, B. Nicolai, and W. Saeys, “Optimal illumination-detection distance and detector size for predicting Braeburn apple maturity from Vis/NIR laser reflectance measurements,” Food Bioprocess Tech.submitted.

**20. **W.-F. Cheong, S. A. Prahl, and A. J. Welch, “A Review of the Optical Properties of Biological Tissues,” IEEE J. Quantum Electron. **26**(12), 2166–2185 (1990). [CrossRef]

**21. **T. W. Simpson, J. D. Poplinsky, P. N. Koch, and J. K. Allen, “Meta-models for computer-based engineering design: survey and recommendations,” Eng. Comput. **17**(2), 129–150 (2001). [CrossRef]

**22. **R. Watté, N. N. Do Trong, B. Aernouts, C. Erkinbaev, J. De Baerdemaeker, B. Nicolaï, and W. Saeys, “Metamodeling approach for efficient estimation of optical properties of turbid media from spatially resolved diffuse reflectance measurements,” Opt. Express **21**(26), 32630–32642 (2013). [CrossRef] [PubMed]

**23. **B. Aernouts, E. Zamora-Rojas, R. Van Beers, R. Watté, L. Wang, M. Tsuta, J. Lammertyn, and W. Saeys, “Supercontinuum laser based optical characterization of Intralipid® phantoms in the 500-2250 nm range,” Opt. Express **21**(26), 32450–32467 (2013). [CrossRef] [PubMed]

**24. **E. Zamora-Rojas, B. Aernouts, A. Garrido-Varo, D. Pérez-Marin, J. Guerrero-Ginel, and W. Saeys, “Double integrating sphere measurements for estimating optical properties of pig subcutaneous adipose tissue,” Innov. Food Sci. Emerg. Technol. **19**, 218–226 (2013). [CrossRef]

**25. **B. Aernouts, R. Van Beers, R. Watté, J. Lammertyn, and W. Saeys, “Dependent scattering in Intralipid® phantoms in the 600-1850 nm range,” Opt. Express **22**(5), 6086–6098 (2014). [CrossRef] [PubMed]

**26. **E. Zamora-Rojas, B. Aernouts, A. Garrido-Varo, W. Saeys, D. Pérez-Marin, and J. Guerrero-Ginel, “Optical properties of pig skin epidermis and dermis estimated with double integrating spheres measurements,” Innov. Food Sci. Emerg. Technol. **20**, 343–349 (2013). [CrossRef]

**27. **B. Aernouts, R. Watté, R. Van Beers, F. Delport, M. Merchiers, J. De Block, J. Lammertyn, and W. Saeys, “Flexible tool for simulating the bulk optical properties of polydisperse spherical particles in an absorbing host: experimental validation,” Opt. Express **22**(17), 20223–20238 (2014). [CrossRef] [PubMed]

**28. **A. Li, Q. Zhang, J. P. Culver, E. L. Miller, and D. A. Boas, “Reconstructing chromosphere concentration images directly by continuous-wave diffuse optical tomography,” Opt. Lett. **29**(3), 256–258 (2004). [CrossRef] [PubMed]

**29. **S. Srinivasan, B. W. Pogue, S. Jiang, H. Dehghani, and K. D. Paulsen, “Spectrally constrained chromophore and scattering near-infrared tomography provides quantitative and robust reconstruction,” Appl. Opt. **44**(10), 1858–1869 (2005). [CrossRef] [PubMed]

**30. **A. Corlu, T. Durduran, R. Choe, M. Schweiger, E. M. C. Hillman, S. R. Arridge, and A. G. Yodh, “Uniqueness and wavelength optimization in continuous-wave multispectral diffuse optical tomography,” Opt. Lett. **28**(23), 2339–2341 (2003). [CrossRef] [PubMed]

**31. **A. Corlu, R. Choe, T. Durduran, K. Lee, M. Schweiger, S. R. Arridge, E. M. Hillman, and A. G. Yodh, “Diffuse optical tomography with spectral constraints and wavelength optimization,” Appl. Opt. **44**(11), 2082–2093 (2005). [CrossRef] [PubMed]

**32. **T. Durduran, C. Zhou, B. L. Edlow, G. Yu, R. Choe, M. N. Kim, B. L. Cucchiara, M. E. Putt, Q. Shah, S. E. Kasner, J. H. Greenberg, A. G. Yodh, and J. A. Detre, “Transcranial optical monitoring of cerebrovascular hemodynamics in acite stroke patients,” Opt. Express **17**, 3884–3902 (2009).

**33. **C. D’Andrea, L. Spinelli, A. Bassi, A. Giusto, D. Contini, J. Swartling, A. Torricelli, and R. Cubeddu, “Time-resolved spectrally constrained method for the quantification of chromophore concentrations and scattering parameters in diffusing media,” Opt. Express **14**(5), 1888–1898 (2006). [CrossRef] [PubMed]

**34. **W. Saeys, B. De Ketelaere, and P. Darius, “Potential applications of functional data analysis in chemometrics,” J. Chemometr. **22**(5), 335–344 (2008). [CrossRef]

**35. **L. Spinelli, A. Torricelli, A. Pifferi, P. Taroni, R. Cubeddu, H. Wabnitz, D. Grosenick, M. Möller, R. Macdonald, J. Swartling, T. Svensson, S. Andersson-Engels, R. van Veen, H. Sterenborg, J.-M. Tualle, H. Nghiem, S. Avrillier, M. Whelan, and H. Stamm, “Performance assesment of photon migration instruments: the MEDPHOT protocol,” J. Biomed. Opt. **9**, 2104–2114 (2005).

**36. **I. Couckuyt, A. Forrester, D. Gorissen, F. De Turck, and T. Dhaene, “Blind Kriging: Implementation and performance analysis,” Adv. Eng. Soft. **49**, 1–13 (2012). [CrossRef]

**37. **I. Couckuyt, K. Crombecq, D. Gorissen, and T. Dhaene, “Automated response surface model generation with sequential design,” in *Proceedings of First International Conference on Soft Computing Technology in Civil, Structural and Environmental Engineering*, Funchal, Portugal, ed. (Civil-Comp. Press, 2009), 52.

**38. **I. Couckuyt, F. Declerq, T. Dhaene, H. Rogier, and L. Knockaert, “Surrogate-Based Infill Optimization Applied to Electromagnetic Problems,” Int. J. RF Microw. C. E. **20**(5), 492–501 (2010). [CrossRef]

**39. **V. V. Tuchin, *Tissue Optics: Light Scattering Methods and Instruments for Medical Diagnosis* (SPIE, 2007), Chap. 2.

**40. **D. Sharma, A. Agrawal, L. S. Matchette, and T. J. Pfefer, “Evaluation of a fiberoptic-based system for measurement of optical properties in highly attenuating turbid media,” Biomed. Eng. Online **5**(1), 49 (2006). [CrossRef] [PubMed]

**41. **B. Aernouts, R. Van Beers, R. Watté, T. Huybrechts, J. Lammertyn, and W. Saeys, “Visible and near-infrared bulk optical properties of raw milk,” J. Dairy Sci. **98**(10), 6727–6738 (2015). [CrossRef] [PubMed]