## Abstract

A metamodeling approach is introduced and applied to efficiently estimate the bulk optical properties of turbid media from spatially resolved spectroscopy (SRS) measurements. The model has been trained on a set of liquid phantoms covering a wide range of optical properties representative for food and agricultural products and was successfully validated in forward and inverse mode on phantoms not used for training the model. With relative prediction errors of 10% for the estimated bulk optical properties the potential of this metamodeling approach for the estimation of the optical properties of turbid media from spatially resolved spectroscopy measurements has been demonstrated.

© 2013 Optical Society of America

## 1. Introduction

The propagation of light (Visible and Near-Infrared) in human and animal tissues has been extensively studied with the aim to develop nondestructive diagnostic techniques (e.g. early cancer detection, *in vivo* detection and monitoring of disorders, etc.) [1,2]. The interaction of the light with the illuminated sample mainly involves two phenomena: scattering caused by the inhomogeneities of the refractive index and absorption by the chemical bonds present in the constituents. Thanks to these interactions, the acquired spectra can be used to detect changes both in the sample microstructure and chemical composition. Since the post-interacting photons are the products of both scattering and absorption, multiple acquisitions of these photons have been suggested as a means to obtain information on the sample microstructure and chemical composition. These multiple acquisitions of the post-interacting photons can be implemented specifically in the domains of space (Spatially Resolved Spectroscopy or SRS) [3–6], time [7–9] or frequency [10,11].

The propagation of electromagnetic waves (photons) in biological media is fully and fundamentally described by Maxwell’s equations in which all the interactions are taken into account. Although this approach is mathematically rigorous and accurate, its employment for modeling the propagation of light in real biological systems has been very limited. It is considered computationally almost impossible, due to the microstructural and compositional complexities of the real biological systems [12].

An alternative and relatively simpler approach is to employ the Radiative Transfer Equation (RTE) describing the propagation of energy in turbid media. The RTE neglects the wavelike nature of the photons, yet remains sufficiently accurate [13]. However, analytical solutions of the RTE are also considered theoretically and mathematically impossible [14]. A common approach for numerically solving the RTE is through Monte Carlo simulations, which have been employed in many researches owing to its high accuracy and flexible applicability for many sample shapes [15,16]. Inspite of those advantages, Monte Carlo methods require large numbers of simulations to obtain an acceptable statistical accuracy [17,18]. Another way of obtaining simplified solutions for the RTE is to apply the diffusion approximation to solve this equation [19–22]. However, this approach is only valid in the cases when the scattering dominates the absorption and the measured spatial locations of the diffuse reflected lights are far enough from the illumination point to allow the light to become sufficiently diffuse [28]. These conditions are in many cases not met, which can lead to large errors in the estimated optical properties [23,24].

Practical applications of diffuse reflectance and transmittance measurements on food and agricultural samples for nondestructive quality assessment require a fast and sufficiently accurate method to extract the optical properties from the measured data. This requirement for a fast optical property extraction is especially important for on-line measurements, process monitoring or optimization in the agrofood industry, where shortening the inspection or production time means saving costs or increasing production efficiency. It is more convenient to use e.g. interpolation techniques, which are capable of giving similar output as the theoretical models, without the computational complexity [25,26].

Because to the computational limitations of the Monte Carlo method and the validity issues of the diffusion theory, different solutions have been developed. Dam et al. [27] proposed a method based on multiple polynomial regression. They developed a system which collects one set of reflectance data from six source–detector distances at four arbitrary wavelengths. Multivariate calibration techniques based on two-dimensional polynomial fitting are subsequently employed to extract and display the absorption and reduced scattering coefficients in real-time mode. A Newton-Ralphson algorithm is used to perform converging iterative calculations. Another proposed method uses a neural network, trained on the results of Monte Carlo simulations [28,29].

The major drawback of these computationally fast methods is their incapability of processing the stochastic noise of Monte Carlo simulations or the measurement noise of SRS measurements. Generating a Monte Carlo database requires long computation times, but allows for a fast calculation after a neural network has been established. However, stochastic simulation noise can influence the simulated output and as a consequence influence the inverse estimation process. SRS measurements with small source-detector distances are susceptible to noise, which should again be incorporated into the approximation method.

Stochastic data-based surrogate models, which are referred to as ‘metamodels’, have become a standard technique to reduce this computational cost and enable tasks such as visualization, design space exploration, sensitivity analysis and optimization [30,31]. Such surrogate models are typically used to solve forward problems [25], where the metamodel serves as a high-way bridge between the design space (input parameters) and the performance space (output parameters). The reverse problem is more focused on exploring the search space as efficiently as possible. As the goal is to estimate the optical properties starting from the spatially resolved diffuse reflectance profiles, it would be logical to attempt to generate a surrogate model, linking the output parameters to the input parameters, a so-called inverse metamodel. However, such inverse problems are typically ill-posed [28], which leads to non-uniqueness and possible instability. The easiest way to avoid these problems is to convert the inverse problem into an iterative estimation of the optical properties for which the spatially resolved reflectance profiles simulated with the metamodels match best with the measured profiles.

Stochastic data-based surrogate models can be built on both Monte Carlo simulations and SRS measurements on optical phantoms. Building them on Monte Carlo simulations provides more flexibility, however generating a database requires long computation times. In order to avoid this difficulty, this study focusses on building a metamodel on SRS measurements in a set of liquid optical phantoms, designed to cover a wide range of scattering and absorption properties.

## 2. Materials and methods

#### 2.1. Concept of metamodeling

Kriging is a popular technique to approximate deterministic noise-free data. It has been originally conceived as a geostatistical estimator that estimates the value for an unobserved location, by using samples from different locations surrounding it (e.g. estimating the altitude at unobserved points on the map). This Gaussian Process based surrogate model is compact and cheap to evaluate, and has proven to be very useful for exploration and optimization [31,33–35]. Kriging is based on the idea that the value at an unknown point should be the average of the known values at its neighbors – weighted by the neighbors’ distance to the unknown point [36]. While the interpolation properties of the Kriging methodology are advantageous for many deterministic simulation problems, it might produce undesirable results when dealing with stochastic simulations. Stochastic Kriging [37] has therefore been suggested as an extension of Kriging for approximation instead of interpolation and is also known as regression Kriging.

A Stochastic Kriging surrogate model has been developed using the *ooDACE* toolbox [28,37], which is a versatile Matlab toolbox that implements the popular Gaussian Process based Kriging surrogate models. Each measurement has been repeated 9 times to provide estimations of the measurement noise as a function of the optical properties. The toolbox uses the mean fiber scores (mean diffuse reflectance values of the detection fibers) and the intrinsic covariance matrix, representing the variance of the output values. There is little information indicating the extent to which the hyperparameters of a Kriging model need to be tuned for the resulting surrogate model to be effective [37]. The construction of surrogate models requires the selection of an optimal set of hyperpaprameters to insure that the model accurately represents the design space [37]. A nice overview of hyperparameter tuning strategies is given by [37], where a two-stage approach is suggested. Because this technique is computationally expensive in higher dimensions, another optimization strategy is used in the OODACE toolbox [33]. A first set of Kriging models is constructed using an initial set of candidate features. These candidates are ranked using Bayesian variable selection method. In the next step, a new promising feature is incorporated into the Kriging model, after which the new Kriging model is scored against a measure [33]. This process is repeated until the accuracy of the Kriging model stops improving [33].

Developing surrogate models has the advantage of being very robust: optical properties can be linked to any type of measurement. However, this robustness needs some caution. Optical properties can be linked to new optical measurements, as long as the measuring methodology remains unchanged.

#### 2.2. Inverse estimation

The inverse estimation of optical properties requires an optimization procedure to find the set of optical properties for which the simulated diffuse reflectance values at different distances most closely resembles the values measured for the fibers. To find this optimal set of optical properties the sum of squared relative errors is minimized:

*F* respresents the cost function, which is to be minimized. When *F* approaches zero, the measured profile has a close resemblance to the simulated one. *I _{i,meas}* and

*I*respectively represent the measured and simulated intensity, at one of the

_{i,sim}*N*measuring points. The main reason for minimizing the relative errors is that the diffuse reflectance values acquired by the different fibers can be of different order of magnitude, such that further fibers would only have a marginal impact on the optimization. The goal of the objective function is to find the set of optical properties producing the best resembling fiber scores.

The Nelder-Mead algorithm is a popular nonlinear optimization method. It uses *N + 1* vertices in N dimensions, forming a triangle in 2 dimensions or a tetrahedron in 3 dimensions. The method approximates a local optimum of a problem when the objective function varies smoothly. The methodology evaluates the objective function at all *N + 1* points and generates a new test position by extrapolating the behavior of the objective function measured at each of the *N + 1* points arranged as a simplex. The algorithm then chooses to replace one of these test points with the new test point and starts over.

This algorithm is used with several starting points, which are evenly distributed over the search space. Therefore, even though the objective function is not strictly convex, at least one of the starting points will lead to the correct answer – the optical properties resulting in similar/identical fiber scores. This procedure has been created in a Matlab algorithm, established in the research group (Matlab 2010a, The MathWorks Inc., Natick, Massachusetts).

#### 2.3. Liquid phantom set

Thirty six liquid phantoms were prepared as aqueous mixtures of intralipid 20% (Fresenius Kabi, Sweden) serving as scatterer and Indian ink (Chartpak Inc., USA) acting as absorber in the range 500 – 1000 nm. Scattering coefficients *μ _{s}* of the intralipid 20% were approximated as the attenuation coefficients determined from collimated transmittance measurements [38] on a series of six diluted solutions with small concentrations of intralipid 20% (from 1 × 10

^{−4}to 6 × 10

^{−4}v/v), taking into account the absorption of water. The anisotropy factor

*g*of the intralipid at each wavelength was assumed to be the same as in [39] and the reduced scattering coefficients

*μ*of the intralipid 20% were computed in the wavelength range 500-1000 nm according to the similarity relation [1]:

_{s}’*μ*.

_{s}’ = μ_{s}(1-g)The attenuation coefficients *μ _{t}* for the Indian ink were also obtained from collimated transmittance measurements for a series of six diluted ink solutions (from 1 × 10

^{−4}to 6 × 10

^{−4}w/w). The absorption coefficients

*μ*of the ink dilutions were calculated as 85% of the attenuation coefficients

_{a}*μ*, because 15% of the attenuation could be attributed to scattering by the carbon particles in the original ink solution [40]. The obtained reduced scattering coefficients

_{t}*μ*of the phantoms were computed as the scattering of intralipid 20% weighted by its concentrations in the phantoms. Analogously, the absorption coefficients

_{s}’*μ*were derived as a combination of the absorption by pure water [41] and ink, each of which was weighted by its concentration in the phantoms.

_{a}Each liquid phantom was labeled by assigning one of the six letters: A, B, C, D, E, and F, indicating 6 nominal reduced scattering coefficient values *μ _{s}’* at 600 nm (10, 15, 30, 50, 90, and 130 cm

^{−1}), and one of the six numbers: 1, 2, 3, 4, 5, and 6, representing 6 nominal absorption coefficient values

*μ*at 600 nm (0, 0.5, 1.0, 1.5, 2.0, and 2.5 cm

_{a}^{−1}). This liquid phantom set was prepared with the intention to cover the wide range of reduced scattering and absorption coefficient values of many agricultural and food products.

#### 2.4. The Spatially Resolved Spectroscopy (SRS) setup

The SRS setup for spatially resolved diffuse reflectance measurements in the 500-1000 nm range, which is the same as the one used in [31, 41], is schematically illustrated in Fig. 1. It consists of a light source, a fiber-optics probe, a spectrograph, a camera, a measurement card and a computer. An AvaLight-DHc (Avantes, Eerbeek, The Netherlands) halogen lamp delivers lights through an optical fiber to illuminate the sample. Five detection fibers placed at various center-to-center distances from the illuminating fiber, ranging approximately from 0.3 to 1.2 mm with steps varying from 0.15 to 0.3 mm, collect the diffuse reflected photons at different distances from the illumination fiber. All the fibers used are Thorlabs multimode silica fibers (FVP-200 PF) with a numerical aperture of 0.22 and a core diameter of 200 μm, and are integrated in a stainless steel probe with an outside diameter of 1.9 mm. The five detection fibers guide the collected diffuse reflected photons to the entrance slit of a CP200 133 g/mm spectrograph (Horiba Jobin-Yvon, New Jersey, USA) which splits the light from each of these fibers into its spectral components and projects the corresponding photons onto a Hamamatsu C7041 CCD camera with a S7041-1008 detector (Hamamatsu, Louvain-La-Neuve, Belgium). The signal from this camera is transferred to a computer through a NI PCI 6251 data acquisition card (National Instruments, TX, USA). The SRS setup and data acquisition are controlled by means of a custom written LabView program (National Instruments, TX, USA).

Each pixel on the spectral dimension of the CCD sensor was linked to the corresponding waveband after wavelength calibration by means of a calibration light source AvaLight-CAL-2000 (Avantes, Eerbeek, The Netherlands). The acquired waveband versus pixel number plot showed a very good linear relation (R^{2} = 0.99) (data not shown).

The relative diffuse reflectance spectra of the detection fibers from a measured sample were computed as the ratios of the dark-corrected intensities to the dark-corrected intensities collected in an integrating sphere (50 mm diameter, Avantes, Eerbeek, The Netherlands) reflecting more than 98% of the light intensity in the range 500-1000 nm. In this way, the measured signals were compensated for dark noise, variation in the light source intensity, differences in the efficiency of different camera pixels and of the detection fibers.

#### 2.5. Phantom measurements

The fiber-optics probe was placed at nine different positions on the surface of each liquid phantom to collect the spatially resolved diffuse reflectance profiles for that phantom. At each position, four scans were averaged. To determine the high signal-to-noise region for each liquid phantom, signals acquired from dark measurements (dark readings) and phantom measurements (intensity readings before dark-correction) of nine different positions of that liquid phantom were utilized. Mean, standard deviation values and 95% confidence intervals of the dark and intensity readings were computed. The wavelength regions which have the lower limits of the intensity readings higher than the upper limits of the dark readings at the furthest detection fiber were selected for each liquid phantom as high signal-to-noise regions. For the whole liquid phantom set, the spatially resolved diffuse reflectance profiles in the high signal-to-noise regions from 600 to 900 nm were retained for further analyses.

#### 2.6. Forward validation of the metamodeling on the liquid phantom set

The thirty six liquid phantoms were separated into 2 groups: a calibration set for building the metamodel and a validation set for evaluating the model performance when used on a sample which has not been used for building it. This separation in a calibration and validation set was repeated 16 times. Each time one phantom was selected out of 16 phantoms in the Table 1 as the validation phantom and the remaining 35 ones were added to the calibration set. The data from the nine different measurement positions on each phantom were averaged, which resulted in 102 spatially resolved diffuse reflectance profiles from 600 to 900 nm with 3 nm waveband step. Five calibration metamodels were constructed for the 5 detection fibers, each of which related 3570 sets of reference optical properties (measured absorption coefficient *µ _{a}* and scattering coefficient

*µ*, and anisotropy factor

_{s}*g*from literature) of the 35 calibration phantoms to the corresponding measured diffuse reflectance values for that detection fiber.

Forward validation evaluated the performance of the calibration metamodels in predicting the diffuse reflectance values of all the detection fibers from the input optical properties of the validation phantoms which were not used in metamodel construction. Sixteen forward validation steps were implemented for the 16 validation phantoms. In each forward validation, the optical properties of both the calibration and validation phantoms were fed into the acquired calibration metamodels to obtain the predicted diffuse reflectance values for all the detection fibers of these phantoms. Finally, the prediction results of the 16 validation steps were merged together. The merged results contained 57120 predicted diffuse reflectance values for each detection fiber of the calibration phantoms and 1632 predicted diffuse reflectance values for each detection fiber of the validation phantoms.

The prediction error, noted as *RMSECV (Root Mean Squared Error of Cross-Validation)* and the calibration error, noted as *RMSEC (Root Mean Squared Error of Calibration)* were calculated for each detection fiber according to Eqs. (2) and (3):

*N*: number of samples used in calibration and cross-validation, respectively_{C}, N_{CV}*R*: measured diffuse reflectance of sample_{i, meas}*i**R*: predicted diffuse reflectance of sample_{i, pred}*i*from the calibration metamodel

#### 2.7. Inverse validation of the metamodeling on the liquid phantom set

To evaluate the performance of the inverse algorithm, the optical properties of the validation phantoms were estimated based on the acquired spatially resolved reflectance spectra. Similar to the forward validation, 16 inverse validation steps were performed where each time one phantom was set aside as validation phantom while the other phantoms were used to build the calibration metamodel. In each inverse validation step, the 102 spatially resolved diffuse reflectance profiles of the validation phantom were used as inputs to the corresponding acquired calibration metamodels using the Nelder-Mead algorithm for estimating its optical properties. Finally, the estimated optical properties of the 16 validation phantoms from the inverse validation steps were again merged together to obtain cross-validation results containing 1632 estimated optical properties.

The average error in the estimated or predicted absorption coefficient *µ _{a}* and reduced scattering coefficient

*µ*values, noted as

_{s}’*RMSECV (Root Mean Squared Error of Cross-Validation)*, was calculated based on the following formula:

*N*: number of samples used in cross-validation_{CV}*µ*: measured optical properties_{i, meas}*(µ*of sample_{a}, µ_{s}’)*i**µ*: predicted optical properties_{i, pred}*(µ*of sample_{a}, µ_{s}’)*i*from the calibration metamodel

## 3. Results and discussion

#### 3.1. Optical properties of the liquid phantoms

The six absorption coefficient spectra of six liquid phantoms with the scattering level A (nominal reduced scattering coefficient *μ _{s}’* at 600 nm is 10 cm

^{−1}) and six reduced scattering coefficient spectra of six liquid phantoms with the absorption level 1 (nominal absorption coefficient

*μ*at 600 nm is 0 cm

_{a}^{−1}) are illustrated in Fig. 2. The absorption coefficient spectra of the phantoms with higher scattering levels (B, C, D, E, F) and the reduced scattering coefficient spectra of the phantoms with the higher absorption levels (2, 3, 4, 5, 6) were very similar to the spectra shown in Fig. 2.

From Fig. 2, it can be clearly observed that the optical properties of the liquid phantoms span a wide range (*µ _{a}* = 0.001 – 2.584 cm

^{−1};

*µ*= 5.673 – 135.326 cm

_{s}’^{−1}), which should cover the majority of food and agricultural products. The absorption spectra of the phantoms with scattering level A in Fig. 2(a) have no peak, which indicates that the ink absorbs quite well at all wavelengths in the range 600 – 900 nm. The reduced scattering spectra of the phantoms with absorption level 1 in Fig. 2(b) show a monotonous decrease of the reduced scattering coefficients with increasing wavelength, which can be attributed to the scattering behavior of the fat globules in the intralipid.

#### 3.2. Spatially resolved diffuse reflectance profiles of the liquid phantoms

The plot of the spatially resolved diffuse reflectance profiles acquired at 800 nm for the 4 phantoms combining the highest and lowest levels of scattering and absorption: A1, A6, F1, and F6 are representatively shown in Fig. 3.

In Fig. 3, the z-axis represents the ratio of the reflectance intensity acquired for the phantom compared to the reference (see section 2.4). The profile decrease with increasing source-detector distance, owing to the fact that photons that have traveled longer path lengths have had more chance to be absorbed or scattered. For higher scattering the spatially resolved diffuse reflectance values are higher and the whole curve is lifted closer to the source. A closer look at the phantoms having the same absorption levels indicates that the effects of scattering on the diffuse reflectance are higher for the detection fibers closer to the light source. For higher absorption the spatially resolved diffuse reflectance are smaller and the entire curve is lowered.

#### 3.3. Forward validation

In Fig. 4 the scatter plots of the predicted versus measured diffuse reflectance values of the 5 detection fibers for both the calibration and validation phantoms are presented. The light grey points are nicely distributed on the black lines, which indicates that the acquired metamodels fitted the calibration data very well (all *R ^{2}* values are 0.999 and all

*RMSEC*values are much smaller than the measured diffuse reflectance values). The predicted and measured diffuse reflectance values for the validation phantoms (black points) are also in good agreement with

*R*values around 0.99 and

^{2}*RMSECV*values which are much smaller than the measured fiber reflectance). From these results it can be concluded that the metamodels were well capable of predicting the measured spatially resolved diffuse reflectance values for new samples which have not been used for building the metamodels. Even for the furthest detection fiber [Fig. 4(e)], where the spatially resolved reflectance measurements are noisier, there was a good agreement between the predicted and measured values. There are some slight deviations of the black points from the lines in figures (a), (b), (c), and (d). However, these are small portions as compared to the total number since they occurred only for some detection fibers of the two validation phantoms D3 (Fiber 2) and E2 (Fiber 1, 3, and 4).

To have a better illustration of this deviation, the measured and predicted (by the metamodel) spatially resolved diffuse reflectance profiles at 700 nm of phantom D3 are illustrated in Fig. 5. There is a slight deviation between the measured and predicted diffuse reflectance values at the detection fiber 2, while those at other detection fibers match quite well. In our opinion, this could have been caused by unexpected factors such as improper cleaning of the probe surface, movement of the fiber connectors, bending of the fibers,… corresponding to these detection fibers during the moments of measuring this liquid phantom; while the calibration metamodel gave the predicted reflectance values based on other points in the data set which could have not been affected by these factors.

#### 3.4. Inverse validation: Estimation of optical properties of liquid phantoms

The scatter plots of the predicted versus measured optical properties values (absorption and reduced scattering coefficients) of the validation phantoms are shown in Fig. 6.

It can be seen in Fig. 6 that the predicted absorption and reduced scattering coefficients of the validation phantoms are very close to the measured optical properties with *R ^{2}* values of 0.968 and 0.978, respectively. The average prediction error for the absorption coefficient values

*µ*is 0.086 cm

_{a}^{−1}, which is very small compared to the average value of 1.023 cm

^{−1}. In Fig. 6(b), in the range below 60 cm

^{−1}, all the predicted reduced scattering coefficients

*µ*are very close to the measured ones, while in the range from 60 to 90 cm

_{s}’^{−1}, considerable deviations can be observed. Even though the

*RMSECV*value appears to be small (3.876 cm

^{−1}), the predicted scattering values can differ significantly from the measured ones. A possible explanation for the lower prediction performance for larger scattering coefficient values can be found in the larger steps between the reduced scattering coefficient values of the high scattering phantoms D, E, and F compared to those between the lower scattering phantoms A, B and C [Fig. 2(b)]. This may have made it more difficult for the metamodels to describe the spatially resolved reflectance profiles corresponding to scattering values in-between these phantoms. It is, therefore, suggested to investigate the effect of the resolution in the designed optical properties of the different phantoms on the performance of the metamodels in a future study.

In general, the acquired prediction errors are good, while the time needed for estimation of the optical properties is relatively fast with regard to the potential for industrial applications. The calculation of a spatially resolved spectroscopy profile based on a set of given optical properties on a desktop computer (AMD Phenom II X6 1100 T Processor 3.30 GHz) with the forward metamodeling approach takes approximately 10 ms, which is about sixty thousand times faster than the corresponding MC simulation with 10 million photons. The inverse metamodeling procedure for estimation of the optical properties with Nelder-Mead optimizer on the same computer for one spatially resolved diffuse reflectance profile takes approximately 10 s, or around 17 min for each liquid phantom over the wavelength range 600 – 900 nm with 3 nm resolution. However, it should be noted that this computation time could be further reduced by using a more powerful computer, selecting a coarser wavelength resolution, or the employment of a more efficient and faster algorithm for the inverse estimation.

## 4. Conclusions

A novel metamodeling concept was introduced and applied to model the relationship between bulk optical properties and spatially resolved diffuse reflectance profiles acquired by means of a fiber-optic SRS setup. This concept was applied to a set of liquid phantoms covering a large range of optical properties representative for a wide range of agricultural and food products. Forward validation results showed good performance of the metamodels in predicting the measured spatially resolved diffuse reflectance values for phantoms which had not been used for building the metamodels (*R ^{2}* = 0.989 - 0.998 for the five detection fibers). These metamodels were incorporated in an iterative inversion scheme for optical properties estimation from acquired spatially resolved diffuse reflectance spectra. The inverse validation results showed good prediction performance for the

*µ*and

_{a}*µ*values of the validation phantoms (

_{s}’*R*= 0.968,

^{2}*RMSECV =*0.086 cm

^{−1}for

*µ*;

_{a}*R*= 0.978,

^{2}*RMSECV =*3.876 cm

^{−1}for

*µ*). This clearly confirms the potential of this inverse metamodeling approach for rapid (sixty thousand times faster than the corresponding Monte Carlo simulation with 10 million photons) and accurate optical properties estimation. However, the absolute prediction accuracy for the phantoms with low scattering values was considerably better than for the phantoms with high scattering values. Therefore, future research should further explore the impact of experimental design on the metamodeling and the capability of the metamodels for handling noisy data sets.

_{s}’## Acknowledgments

This publication has been produced with the financial support of the European Commission (project FP7-226783 - InsideFood). The opinions expressed in this document do by no means reflect the official opinion of the European Union or its representatives. The authors would like to thank Professor Depeursinge and his associates at the Swiss Federal Institute of Technology (EPFL, Lausanne, Switzerland) for the aid in building the setup for spatially resolved spectroscopy. The provision of the *ooDACE* toolbox from the Department of Information Technology (INTEC), Ghent University, Belgium is greatly acknowledged. Rodrigo Watté is a PhD student funded by the Institute for the Promotion of Innovation through Science and Technology (IWT-Vlaanderen). Nghia Nguyen Do Trong is a PhD student funded by the Interfaculty Council for Development Cooperation, KU Leuven (IRO scholarship). Wouter Saeys has been funded as a Postdoctoral Fellow of the Research Foundation – Flanders (FWO).

## References and links

**1. **V. Tuchin, *Tissue Optics – Light Scattering Methods and Instruments for Medical Diagnosis,* 2^{nd} ed. (SPIE Press, 2007).

**2. **V. Tuchin, *Handbook of Optical Sensing of Glucose in Biological Fluids and Tissues* (CRC Press, 2008).

**3. **R. M. P. Doornbos, R. Lang, M. C. Aalders, F. W. Cross, and H. J. C. M. Sterenborg, “The determination of in vivo human tissue optical properties and absolute chromophore concentrations using spatially resolved steady-state diffuse reflectance spectroscopy,” Phys. Med. Biol. **44**(4), 967–981 (1999). [CrossRef] [PubMed]

**4. **T. S. Leung, N. Aladangady, C. E. Elwell, D. T. Delpy, and K. Costeloe, “A new method for the measurement of cerebral blood volume and total circulating blood volume using near infrared spatially resolved spectroscopy and indocyanine green: application and validation in neonates,” Pediatr. Res. **55**(1), 134–141 (2004). [CrossRef] [PubMed]

**5. **D. Arifler, C. MacAulay, M. Follen, and R. Richards-Kortum, “Spatially resolved reflectance spectroscopy for diagnosis of cervical precancer: Monte Carlo modeling and comparison to clinical measurements,” J. Biomed. Opt. **11**(6), 064027 (2006). [CrossRef] [PubMed]

**6. **A. Garcia-Uribe, J. Zou, T.-H. Chang, M. Duvic, V. Prieto, and L. V. Wang, “Oblique-incidence spatially resolved diffuse reflectance spectroscopic diagnosis of skin cancer,” Proc. SPIE **7572**, 75720L (2010). [CrossRef]

**7. **R. Cubeddu, A. Pifferi, P. Taroni, A. Torricelli, and G. Valentini, “Noninvasive absorption and scattering spectroscopy of bulk diffusive media: An application to the optical characterization of human breast,” Appl. Phys. Lett. **74**(6), 874–876 (1999). [CrossRef]

**8. **A. Torricelli, A. Pifferi, P. Taroni, E. Giambattistelli, and R. Cubeddu, “In vivo optical characterization of human tissues from 610 to 1010 nm by time-resolved reflectance spectroscopy,” Phys. Med. Biol. **46**(8), 2227–2237 (2001). [CrossRef] [PubMed]

**9. **A. Torricelli, L. Spinelli, A. Pifferi, P. Taroni, R. Cubeddu, and G. Danesini, “Use of a nonlinear perturbation approach for in vivo breast lesion characterization by multiwavelength time-resolved optical mammography,” Opt. Express **11**(8), 853–867 (2003). [CrossRef] [PubMed]

**10. **B. Guan, Y. Zhang, S. Huang, and B. Chance, “Determination of optical properties using improved frequency-resolved spectroscopy,” Proc. SPIE **3548**, 17–26 (1998). [CrossRef]

**11. **J. Y. Le Pommellec and J. P. L'Huillier, “Determination of the optical properties of breast tissues using frequency-resolved transillumination: basic theory and preliminary results,” Proc. SPIE **4161**, 202–215 (2000). [CrossRef]

**12. **A. Ishimaru, “Theory and application of wave propagation and scattering in random media,” Proc. IEEE **65**(7), 1030–1061 (1977). [CrossRef]

**13. **A. Ishimaru, *Wave propagation and scattering in random media* (Academic, 1978).

**14. **T. Tarvainen, M. Vauhkonen, V. Kolehmainen, and J. P. Kaipio, “Finite element model for the coupled radiative transfer equation and diffusion approximation,” Int. J. Numer. Methods Eng. **65**(3), 383–405 (2006). [CrossRef]

**15. **L. Wang, S. L. Jacques, and L. Zheng, “MCML--Monte Carlo modeling of light transport in multi-layered tissues,” Comput. Methods Programs Biomed. **47**(2), 131–146 (1995). [CrossRef] [PubMed]

**16. **D. Boas, J. Culver, J. Stott, and A. Dunn, “Three dimensional Monte Carlo code for photon migration through complex heterogeneous media including the adult human head,” Opt. Express **10**(3), 159–170 (2002). [CrossRef] [PubMed]

**17. **E. Alerstam, T. Svensson, and S. Andersson-Engels, “Parallel computing with graphics processing units for high-speed Monte Carlo simulation of photon migration,” J. Biomed. Opt. **13**(6), 060504 (2008). [CrossRef] [PubMed]

**18. **Q. Fang and D. A. Boas, “Monte Carlo simulation of photon migration in 3D turbid media accelerated by graphics processing units,” Opt. Express **17**(22), 20178–20190 (2009). [CrossRef] [PubMed]

**19. **A. Ishimaru, “Diffusion of light in turbid material,” Appl. Opt. **28**(12), 2210–2215 (1989). [CrossRef] [PubMed]

**20. **T. J. Farrell, M. S. Patterson, and B. Wilson, “A diffusion theory model of spatially resolved, steady-state diffuse reflectance for the noninvasive determination of tissue optical properties *in vivo*,” Med. Phys. **19**(4), 879–888 (1992). [CrossRef] [PubMed]

**21. **A. Kienle and M. S. Patterson, “Improved solutions of the steady-state and the time-resolved diffusion equations for reflectance from a semi-infinite turbid medium,” J. Opt. Soc. Am. A **14**(1), 246–254 (1997). [CrossRef] [PubMed]

**22. **G. Alexandrakis, T. J. Farrell, and M. S. Patterson, “Accuracy of the diffusion approximation in determining the optical properties of a two-layer turbid medium,” Appl. Opt. **37**(31), 7401–7409 (1998). [CrossRef] [PubMed]

**23. **I. Couckuyt, F. Declerq, T. Dhaene, H. Rogier, and L. Knockaert, “Surrogate-Based Infill Optimization Applied to Electromagnetic Problems,” Int. J. RF Microw. C. E. **20**(5), 492–501 (2010). [CrossRef]

**24. **D. Gorissen, I. Couckuyt, P. Demeester, T. Dhaene, and K. Crombecq, “A Surrogate Modeling and Adaptive Sampling Toolbox for Computer Based Design,” J. Mach. Learn. Res. **11**, 2051–2055 (2010).

**25. **T. W. Simpson, J. D. Poplinsky, P. N. Koch, and J. K. Allen, “Meta-models for computer-based engineering design: survey and recommendations,” Eng. Comput. **17**(2), 129–150 (2001). [CrossRef]

**26. **G. G. Wang and S. Shan, “Review of metamodeling techniques in support of engineering design optimization,” J. Mech. Des. **129**(4), 370–380 (2007). [CrossRef]

**27. **J. S. Dam, C. B. Pedersen, T. Dalgaard, P. E. Fabricius, P. Aruna, and S. Andersson-Engels, “Fiber-optic probe for noninvasive real-time determination of tissue optical properties at multiple wavelengths,” Appl. Opt. **40**(7), 1155–1164 (2001). [CrossRef] [PubMed]

**28. **A. Kienle, L. Lilge, M. S. Patterson, R. Hibst, R. Steiner, and B. C. Wilson, “Spatially resolved absolute diffuse reflectance measurements for noninvasive determination of the optical scattering and absorption coefficients of biological tissue,” Appl. Opt. **35**(13), 2304–2314 (1996). [CrossRef] [PubMed]

**29. **L. Zhang, Z. Wang, and M. Zhou, “Determination of the optical coefficients of biological tissue by neural network,” J. Mod. Opt. **57**(13), 1163–1170 (2010). [CrossRef]

**30. **A. Pifferi, A. Torricelli, A. Bassi, P. Taroni, R. Cubeddu, H. Wabnitz, D. Grosenick, M. Möller, R. Macdonald, J. Swartling, T. Svensson, S. Andersson-Engels, R. L. P. van Veen, H. J. C. M. Sterenborg, J.-M. Tualle, H. L. Nghiem, S. Avrillier, M. Whelan, and H. Stamm, “Performance assessment of photon migration instruments: the MEDPHOT protocol,” Appl. Opt. **44**(11), 2104–2114 (2005). [CrossRef] [PubMed]

**31. **B. W. Pogue and M. S. Patterson, “Review of tissue simulating phantoms for optical spectroscopy, imaging and dosimetry,” J. Biomed. Opt. **11**(4), 041102 (2006). [CrossRef] [PubMed]

**32. **I. Couckuyt, A. Forrester, D. Gorissen, F. De Turck, and T. Dhaene, “Blind Kriging: Implementation and performance analysis,” Adv. Eng. Soft. **49**, 1–13 (2012). [CrossRef]

**33. **D. R. Jones, M. Schonlau, and W. J. Welch, “Efficient global optimization of expensive black-box functions,” J. Glob. Optim. **13**(4), 455–492 (1998). [CrossRef]

**34. **J. Sacks, W. J. Welch, T. Mitchell, and H. P. Wynn, “Design and analysis of computer experiments,” Stat. Sci. **4**(4), 423–435 (1989). [CrossRef]

**35. **J. Staum, “Better simulation metamodeling: The why, what, and how of stochastic Kriging,” in Proceedings of the Winter Simulation Conference (Austin, TX, 2009), pp. 119–133. [CrossRef]

**36. **I. Couckuyt, K. Crombecq, D. Gorissen, and T. Dhaene, “Automated response surface model generation with sequential design,” in Proceedings of First International Conference on Soft Computing Technology in Civil, Structural and Environmental Engineering, (Funchal, 2009), p. 52.

**37. **D. J. J. Toal, N. W. Bressloff, and A. J. Keane, “Kriging Hyperparameter Tuning Strategies,” AIAA J. **46**(5), 1240–1252 (2008). [CrossRef]

**38. **H. J. van Staveren, C. J. M. Moes, J. van Marie, S. A. Prahl, and M. J. C. van Gemert, “Light scattering in Intralipid-10% in the wavelength range of 400-1100 nm,” Appl. Opt. **30**(31), 4507–4514 (1991). [CrossRef] [PubMed]

**39. **P. Di Ninni, F. Martelli, and G. Zaccanti, “The use of India ink in tissue-simulating phantoms,” Opt. Express **18**(26), 26854–26865 (2010). [CrossRef] [PubMed]

**40. **G. M. Hale and M. R. Querry, “Optical constants of water in the 200 nm to 200 μm wavelength region,” Appl. Opt. **12**(3), 555–563 (1973). [CrossRef] [PubMed]

**41. **E. Verhoelst, F. Bamelis, B. De Ketelaere, N. N. Trong, J. De Baerdemaeker, W. Saeys, M. Tsuta, and E. Decuypere, “The potential of spatially resolved spectroscopy for monitoring angiogenesis in the chorioallantoic membrane,” Biotechnol. Prog. **27**(6), 1785–1792 (2011). [CrossRef] [PubMed]