## Abstract

Functional near-infrared spectroscopy (fNIRS) can non-invasively measure hemodynamic responses in the cerebral cortex with a portable apparatus. However, the observation signal in fNIRS measurements is contaminated by the artifact signal from the hemodynamic response in the scalp. In this paper, we propose a method to separate the signals from the cortex and the scalp by estimating both hemodynamic changes by diffuse optical tomography (DOT). In the inverse problem of DOT, we introduce smooth regularization to the hemodynamic change in the scalp and sparse regularization to that in the cortex based on the nature of the hemodynamic responses. These appropriate regularization models, with the spatial information of optical paths of many measurement channels, allow three-dimensional reconstruction of both hemodynamic changes. We validate our proposed method through two-layer phantom experiments and MRI-based head-model simulations. In both experiments, the proposed method simultaneously estimates the superficial smooth activity in the scalp area and the deep localized activity in the cortical area.

© 2013 Optical Society of America

## 1. Introduction

Functional near-infrared spectroscopy (fNIRS) is a technique that non-invasively measures hemodynamic responses to neuronal activation in the cerebral cortex [1–6]. To observe these responses, near-infrared light is emitted into the head from source probes on the scalp, and the light that passes through the head is measured by detector probes on the scalp. Since fNIRS measurement provides such advantages as safety, portability, low cost as well as few physical restrictions, it is expected to extend the application studies of neuroscience as an easy-to-use neuro-monitoring tool.

However, a major problem with fNIRS measurement is the artifact originating from the hemodynamic change in the scalp. The sensitivity of fNIRS measurement is very high near the observation surface, namely the scalp, and it exponentially decreases with increased distance from the scalp. Therefore, a small hemodynamic fluctuation in the scalp results in a large observation signal. Recently, it has been reported that scalp hemodynamic responses show task-related changes, suggesting that the scalp artifact may be misinterpreted as cortical activities in fNIRS measurements [7, 8]. Many methods have been developed for removing the scalp signals. It is empirically known that the hemodynamic change in the superficial layers, including the scalp and the skull, is not localized but instead spatially global [9–13]. To remove or isolate this global artifact, most methods have used temporal information, and often the temporal values of the short-distance measurement channel as a reference of superficial hemodynamic fluctuations for regression [9–15]. However, the inhomogeneity of the temporal patterns in the hemodynamic change of the superficial layers [16, 17] or the temporal correlation between the hemodynamic responses in the scalp and in the cortex [7, 8, 18] negatively impacts such artifact-removal methods based on temporal information. In this study, by contrast, the spatial information of the optical paths of all observation channels, which are not affected by temporal inhomogeneity or correlation, is fully exploited for removing the scalp signals.

We propose a method that separates the signals from the cortex and scalp by estimating the hemodynamic changes in both regions by diffuse optical tomography (DOT), which is an advanced imaging method designed to increase the spatial resolution of fNIRS measurements. Continuous-wave DOT reconstructs the distribution of the changes in optical properties inside the tissue from the changes in measured light intensity using knowledge of the spatial information of the optical paths of measurement channels [6, 19–22]. In our previous study, we proposed a three-dimensional reconstruction method with high spatial resolution in both the depth and horizontal directions [23], where we introduced sparse regularization to improve the depth accuracy and the spatial resolution. In this study, we extend this method and reconstruct the distributions of hemoglobin concentration change in both the cortex and the scalp. Although the introduction of sparse regularization is reasonable for accurate localization of cortical activity, it is not reasonable for estimating the hemodynamic change in the scalp because it is rather global. Therefore, we introduced different types of regularization for the cortex and the scalp. Our proposed method applies sparse regularization to the cortical hemodynamic change and smooth regularization to the scalp’s hemodynamic change. The parameter controlling the smoothness of the scalp regularization is adjusted from the data in the estimation process. Consequently, inhomogeneous scalp hemodynamic change distributions can be appropriately estimated. In this way, our proposed method simultaneously estimates the hemodynamic changes in both the cortex and the scalp.

We conducted two experiments to validate our proposed method. First, we applied it to the data obtained from a two-layer phantom experiment measured with a commercially available near-infrared imaging system. The global absorption change in the scalp and the local absorption change in the cortex were mimicked by a two-layer phantom. Second, we evaluated it by a simulation study using an MRI-based five-layer head model [20]. In both experiments, the proposed method separated the signals from the cortical area and from the scalp area and accurately estimated the three-dimensional position of the local activities in the cortical area.

The following notations are used throughout this paper. Scalars are represented in normal-type letters as *x*. Vectors are represented in lower-case boldface letters as ** x**. Matrices are represented in upper-case boldface letters as

**. The transpose, trace, and determinant of matrix**

*X***are denoted as**

*X*

*X**, tr(*

^{T}**), and |**

*X***|, respectively. A diagonal matrix with diagonal elements**

*X***is denoted as diag(**

*x***). A vector consisting of the diagonal elements of matrix**

*x***is denoted as diag(**

*X***). The**

*X**L*

_{2}norm of vector

**is denoted as ${\Vert \mathit{x}\Vert}_{2}^{2}$. The expectation of**

*x***with respect to a probability**

*X**P*(

**) is denoted as 〈**

*X***〉**

*X*

_{P}_{(}

_{X}_{)}. The mean value $\frac{1}{T}{\mathbf{\Sigma}}_{t=1}^{T}{\mathit{x}}_{t}$ is denoted as mean(

**). The**

*x**N*-dimensional vector consisting of 1s and the identity matrix of size

*N*are denoted as

**1**

*and*

_{N}**I**

*, respectively.*

_{N}*N*(

**;**

*x***,**

*μ***Σ**) denotes that a vector random variable

**follows a multivariate Gaussian distribution with mean vector**

*x***and covariance matrix**

*μ***Σ**. Γ(

*x*;

*μ*,

*γ*) denotes that a scalar random variable

*x*follows a gamma distribution with mean

*μ*and shape parameter

*γ*(note that the scale parameter corresponds to

*μγ*

^{−1}).

## 2. Methods

Here, we introduce our method to reconstruct the three-dimensional distributions of the concentration changes of oxygenated and deoxygenated hemoglobin, ΔHbO_{2} and ΔHbR, in both the cortical and scalp areas. First, we reconstructed the three-dimensional distributions of the absorption changes at multiple wavelengths. Then we converted these absorption change distributions to the ΔHbO_{2} and ΔHbR distributions.

We used the Bayesian approach to reconstruct the distributions of the absorption changes. The forward model of DOT, which relates the absorption changes inside the biological tissue to the observed light intensity change, is formulated as a conditional probability function. The regularization used to resolve the ill-posed nature of the inverse problem of DOT is formulated as a prior probability function. We obtained the estimate by inverting the conditional probability function with Bayes’ theorem.

Our proposed Bayesian estimation method is an extended version of a previous work [23], where we applied the automatic relevance determination (ARD) sparse prior to regularize the cortical activities. Here, we add the Laplacian smooth prior to regularize the scalp activities.

#### 2.1. Forward model

Assuming that the absorption changes are small, we use the Rytov approximation [24–27], which linearly relates the absorption changes and the measured light intensity changes. Here, the target area of the absorption changes is separated into cortical and scalp areas:

where ${\mathit{x}}^{c}={({x}_{1}^{c},\cdots ,{x}_{{N}^{c}}^{c})}^{T}$ and ${\mathit{x}}^{s}={({x}_{1}^{s},\cdots ,{x}_{{N}^{s}}^{s})}^{T}$ represent absorption changes Δ*μ*in the voxels of the cortical and scalp areas,

_{a}**= (**

*y**y*

_{1}, ···,

*y*)

_{M}*represents the measured light intensity changes,*

^{T}

*A**and*

^{c}

*A**represent the sensitivity matrix of the cortical and scalp areas, and*

^{s}**represents the measurement noise.**

*ξ**N*and

^{c}*N*denote the number of voxels in the cortical and scalp areas, and

^{s}*M*denotes the number of measurement logical channels.

In a measurement, we observe multiple wavelength data and a number of time sample data. We summarize them in a matrix form. Observed light intensity changes *y** _{k,t}* of wavelength

*k*(

*k*= 1,···,

*K*) and time sample

*t*(

*t*= 1,···,

*T*) are summarized as

**= (**

*Y*

*Y*_{1}, ···,

*Y**, ···,*

_{k}

*Y**), where*

_{K}

*Y**= (*

_{k}

*y*

_{k}_{,1}, ···,

*y**, ···,*

_{k,t}

*y**). Likewise, the absorption changes are summarized as ${\mathit{X}}^{r}=({\mathit{X}}_{1}^{r},\cdots ,{\mathit{X}}_{k}^{r},\cdots ,{\mathit{X}}_{K}^{r})$, where ${\mathit{X}}_{k}^{r}=({\mathit{x}}_{k,1}^{r},\cdots ,{x}_{k,t}^{r},\cdots ,{x}_{k,T}^{r})$ and*

_{k,T}*r*∈ [

*c*,

*s*].

We constructed a probabilistic forward model assuming that ** ξ** is a Gaussian noise. This assumption is validated by the phantom experimental data. We also assume that the noise covariance matrix in the absorption change measurement (task condition) is proportional to the noise covariance matrix under baseline condition (rest condition)

**Σ**

*. Then the conditional probability for observed light-intensity change*

_{y}**given absorption change**

*Y*

*X**,*

^{c}

*X**is given by*

^{s}*σ*is the scale parameter inversely proportional to the measurement noise variance. The scale parameter is estimated by introducing a hierarchical prior:

#### 2.2. Regularization model

We used different types of regularization for the cortical and scalp areas. We applied the ARD prior and localized smoothing filter to the cortical area because the cortical activities are localized. In contrast, we applied a smoothing Laplacian prior to the scalp area because the hemo-dynamic responses in the scalp are rather global.

For the cortical area, we assume that solution *x** ^{c}* becomes a mixture of a small number of focal activities. Therefore, we represent the solution as

*x**=*

^{c}

*Wz**, where*

^{c}**is a spatially localized smoothing filter. The ARD prior is adopted for**

*W*

*z**. The probability distribution of*

^{c}

*z**is given by a Gaussian distribution with a zero mean vector and a diagonal covariance matrix whose diagonal elements are adjustable hyperparameters:*

^{c}**Λ**= diag(

**) and**

*λ***= (**

*λ**λ*

_{1}, ···,

*λ*

_{Nc})

*represent the hyperparameters that are automatically adjusted from the data. Here, we use the common*

^{T}**value for all of the wavelength data. Equation (4) can be equivalently rewritten as**

*λ***is estimated by introducing a hierarchical prior:**

*λ**λ̄*

_{0i}and

*γ*

_{0i}represent the mean and reliability (or confidence) of the hierarchical prior, respectively [28, 29].

For the scalp area, we assume solution *x** ^{s}* is spatially smooth. That is, the Laplacian (or 2nd derivative) of solution

*x**,*

^{s}

*Lx**, becomes small. Therefore, regularization is applied for*

^{s}

*z**=*

^{s}

*Lx**as*

^{s}*η*is the hyperparameter controlling the degree of smoothness. Equation (7) can be equivalently rewritten as

*η*is estimated by introducing a hierarchical prior: where

*η̄*

_{0}and

*γ*

_{η0}represent the mean and reliability (or confidence) of the hierarchical prior.

#### 2.3. Variational Bayesian method

The posterior probability of unknown *X** ^{c}*,

*X**,*

^{s}**,**

*λ**η*,

*σ*given observation data

**is obtained by Bayes’ theorem,**

*Y**P*(

**,**

*Y*

*X**,*

^{c}

*X**,*

^{s}**,**

*λ**η*,

*σ*) =

*P*(

**|**

*Y*

*X**,*

^{c}

*X**,*

^{s}*σ*)

*P*(

*X**|*

^{c}**)**

*λ**P*(

*X**|*

^{s}*η*)

*P*(

**)**

*λ**P*(

*η*)

*P*(

*σ*) and

*P*(

**) = ∫**

*Y**dX*

*X*

^{c}dX

*X*

^{s}d

*λ**dηdσP*(

**,**

*Y*

*X**,*

^{c}

*X**,*

^{s}**,**

*λ**η*,

*σ*).

Because analytically calculating the posterior is difficult, we converted this posterior calculation problem to a maximization problem of a free energy function. The free energy for trial distribution *Q*(*X** ^{c}*,

*X**,*

^{s}**,**

*λ**η*,

*σ*), which is the approximation of posterior

*P*(

*X**,*

^{c}

*X**,*

^{s}**,**

*λ**η*,

*σ*|

**), is defined by**

*Y**F*(

*Q*) is equivalent to the minimization of the Kullback-Leibler distance between the posterior and the trial distribution. The maximization problem was iteratively solved using a factorization approximation of the trial distribution

*Q*(

*X**,*

^{c}

*X**,*

^{s}**,**

*λ**η*,

*σ*) (see Appendix). We used the solution of the sensitivity-normalized Tikhonov regularization as an initial value of the iteration algorithm (see Appendix). Finally, we obtained the approximated mean value with respect to the posterior,

*X̂**= 〈*

^{c}

*X**〉*

^{c}_{Q(Xc,Xs,λ,η,σ)}≃ 〈

*X**〉*

^{c}_{P(Xc,Xs,λ,η,σ|Y)}and

*X̂**= 〈*

^{s}

*X**〉*

^{s}_{Q(Xc,Xs,λ,η,σ)}≃ 〈

*X**〉*

^{s}_{P(Xc,Xs,λ,η,σ|Y)}.

#### 2.4. Conversion from absorption changes to hemoglobin concentration changes

Estimated absorption changes *x̂* = Δ*μ _{a}* in multiple wavelengths are converted to the concentration changes of the oxygenated and deoxygenated hemoglobin (ΔHbO

_{2}and ΔHbR) in each voxel. These values are connected by the spectral extinction coefficients:

*k*-th wavelength, respectively [30]. The estimated values of ΔHbO

_{2}and ΔHbR are derived using the least-squares method [10]:

## 3. Validation of method by two-layer phantom experiments

In this section, we describe how we validated our proposed method using a two-layer phantom experiment. Here, only one-wavelength data were used, and the positional accuracy of the estimated absorption change was evaluated. Because this experiment was static, the temporal information was not useful and the spatial information was only useful for separating the artifact signals.

#### 3.1. Experimental conditions

We used a two-layer phantom consisting of liquid and silicone layers. The liquid layer was a substitute for the cortex, and the silicone layer was a substitute for the scalp. The measurement was done under two conditions: rest and task. We used the light intensity changes between these two conditions for our analysis. In the rest condition, a tank was filled with the liquid, and a silicone plate was placed on the bottom of the tank. In the task condition, we replaced the silicone plate with another one whose absorption coefficient was slightly higher than that of the rest condition to mimic the global activity in a scalp layer. We also added absorbers to mimic the local cortical activity (Fig. 1(a)). These black spherical absorbers had 5-mm diameters made of polyacetal. The liquid was made of 1.4% intralipid and ink. The silicone plates were 5-mm thick and made of silicone, kaolin powder, and ink. The kaolin powder and ink were stirred uniformly into a silicone liquid and cured to make a uniform silicone plate. Note that uncontrollable inhomogeneity remains in the silicone plate. Intralipid and kaoline powder were used to regulate the scattering coefficients. The ink regulated the absorption coefficients. The estimated absorption and reduced scattering coefficients (*μ _{a}*,

*μ′*) of the liquid, the silicone plate for the rest condition, and the silicone plate for the task condition were (0.018

_{s}*mm*

^{−1 }, 1.1

*mm*

^{−1}), (0.015

*mm*

^{−1}, 0.9

*mm*

^{−1}), and (0.016

*mm*

^{−1}, 0.9

*mm*

^{−1}), respectively. These estimations were based on measurements with a spectral photometer and a near-infrared imaging system.

We used a continuous-wave near-infrared imaging system (FOIRE-3000, Shimadzu Corporation) to acquire data to reconstruct the absorption change. To achieve a sufficient dynamic range for DOT, we customized the system’s amplifier to reach 256 times gain at maximum, which is 4 times higher than the commercially available FOIRE-3000. Here, we only used 805-nm wavelength data. Source and detector probes were alternately placed on the tank’s bottom in a rectangular arrangement (Fig. 1(b)). The sampling rate was 130 ms. Each time, we measured for 20 seconds in the rest and task conditions and tested the estimation results under two probe-interval conditions: *l* = 13 mm and 18.4 mm (
$=13\hspace{0.17em}\text{mm}\times \sqrt{2}$).

We tested the proposed method by one- and two-absorber experiments. First, we evaluated the depth accuracy using one spherical absorber placed at the *x*-*y* center position (i) (0, 0, *d*) at various depths (Fig. 1(b)). The results are summarized in section 3.3. Second, we evaluated the spatial resolution using two spherical absorbers placed in (ii) (
$-\frac{l}{2}$, 0, *d*) and (
$-\frac{l}{2}+h$, 0, *d*) at various depths *d* and various horizontal distances *h* (Fig. 1(b)). The results are summarized in section 3.4. One example of this experiment is shown in section 3.2 with a three-dimensional image to understand the proposed method.

The voxel size was 2.5 *mm* × 2.5 *mm* × 2.5 *mm*. The numbers of voxels in the cortical and scalp areas were *N ^{c}* = 6250 = 25 × 25 × 10 and

*N*= 1250 = 25 × 25 × 2, respectively. The number of measurement logical channels

^{s}*M*was 56 (

*l*= 13 mm, up to 3rd nearest-neighbor (NN) channels) or 48 (

*l*= 18.4 mm, up to 2nd NN channels). We used a 5-mm FWHM Gaussian filter as

*W*. Control constant

*β*of sensitivity-normalized Tikhonov regularization (see Appendix) was set with reference to the maximum value of the sensitivity for a region deeper than 20 mm from the surface:

*β*= max(

*ρ*),

_{i}*i*∈

*I*

_{≥20 mm}, where

*I*

_{≥20 mm}denotes the set of voxel indices deeper than 20 mm.

#### 3.2. Estimation example

In this section, we show the advantage of the proposed method compared to previous methods by applying it to one example of a 3D reconstruction in the two-layer phantom experiment. We obtained the example data by the phantom experiment with an arrangement of 18.4-mm probe intervals. Here, 5-mm-thick silicone plates were used to mimic the global scalp activity, and two spherical absorbers were used to mimic the local cortical activities. Center positions of the absorbers were (−9.2, 0, 15) [mm] and (5.8, 0, 15) [mm] (Fig. 2(a)). The horizontal distance between two absorbers *h* was 15 mm, which was shorter than the probe intervals of 18.4 mm. Note that the coordinate system of Fig. 2 is flipped upside down to that of Fig. 1 so that it resembles the functional neuroimaging in which the probes are usually placed on top of the surface.

Figure 2(b) shows the estimation results of the sensitivity-normalized Tikhonov regularization. Because Tikhonov regularization cannot localize the solution, the two local absorption changes in the liquid layer and the global absorption change in the silicone layer are mixed. Values deeper than 20 mm are suppressed by the introduction of control constant *β* (see Appendix). The following hierarchical Bayesian estimations used this solution as an initial value of their iteration algorithm.

Figure 2(c) shows the estimation results of the hierarchical Bayesian estimation where the ARD sparse prior was applied to both the liquid and silicone areas. The sparse prior mismatches the global absorption changes in the silicone area. As a result, the local absorption changes in the liquid layer and the global absorption change in the silicone layer could not be completely distinguished. Accordingly, the estimated position of the two absorption changes in the liquid area is different from the true position.

Figure 2(d) shows the estimation result of the hierarchical Bayesian estimation where the ARD sparse prior is applied to the liquid area and the Laplacian smooth prior is applied to the silicone area. The two local absorption changes in the liquid layer and the global absorption change in the silicone layer can be accurately distinguished. The positional error of the two local absorption changes in the liquid area is smaller than the voxel size of 2.5 mm. This result is summarized again in section 3.4 (*d* = 15 mm, *h* = 15 mm).

#### 3.3. One-absorber experiment

In this section, we examine whether the depth of a local absorption change can be accurately estimated by our proposed method in spite of the presence of a superficial global absorption change. We tested the proposed method using a one-absorber setting. The spherical absorber mimicking the local cortical activity was placed in the center position (i) (Fig. 1(b)) at various depths: *d* = 10 mm, 12.5 mm, 15 mm, 17.5 mm, and 20 mm. The silicone plates were also used to mimic the global scalp activity. The analysis was done under two conditions of probe intervals: *l* = 18.4 mm and 13 mm.

Figure 3(a) shows the estimation results for various depths under the two probe-interval conditions. Here, the 3D stimation results are represented as the 2D *x*-*z* projection images to clearly visualize the estimation accuracy. In the figure, the maximum intensity of the three central voxel regions in the *y* direction (7.5-mm-thick layer in Fig. 3(b)), including the true position of the 5-mm-diameter spherical absorber, was projected onto the *x*-*z* plane to check the positional accuracy in the *y* direction. If a positional error occurs in the *y* direction, the estimation value in this region becomes nearly zero. The white circles in the projection images in Fig. 3(a) represent the true position of the spherical absorber. We judged the estimation results as successful when the positional error between the center position of the true spherical absorption change and the voxel center of the maximum estimation value in the liquid layer was within one voxel (2.5 mm) in each *x*, *y*, *z* direction and when the maximum estimation value was greater than 0.025 *mm*^{−1}. The estimation image is surrounded by a bold frame if the estimation result was judged as successful. Figure 3(c) represents the depth of the maximum estimation value in the liquid layer exceeding 0.025 *mm*^{−1} as a function of the true depth of the spherical absorption change.

The first and second rows from the top of Fig. 3(a) show the estimation results for the 18.4-and 13-mm probe-interval conditions, respectively. The spherical absorption changes were accurately estimated from *d* = 10 to 17.5 mm in the 18.4-mm probe-interval condition and from *d* = 10 to 20 mm in the 13-mm probe-interval condition, respectively, where the positional errors were within one voxel (see also Fig. 3(c)). The amplitudes of the estimated absorption changes were maintained at a roughly constant level. No absorption change deeper than *d* = 20 and *d* = 22.5 mm could be detected in the 13- and 18.4-mm probe-interval conditions, respectively. These results show that the depth of the local absorption change can be accurately estimated in spite of the presence of the superficial global absorption change if the local absorption change is within the depth reached by an adequate amount of diffuse light.

#### 3.4. Two-absorber experiment

In this section, we evaluate the accuracy of the spatial resolution with respect to the local absorption changes in the presence of a superficial global absorption change. We tested the proposed estimation method using a two-absorber setting. In this phantom experiment, two spherical absorbers were placed in (ii) (
$-\frac{l}{2}$, 0, *d*) and (
$-\frac{l}{2}+h$, 0, *d*) (Fig. 1(b)) to mimic the local cortical activities, where *h* denotes the horizontal distance, *d* denotes the depth of the two absorbers, and *l* denotes the probe interval. Silicone plates were also used to mimic the global scalp activity. We evaluated the estimation results under two probe-interval conditions: 18.4 and 13 mm.

Figure 4 shows the estimation results for the 18.4-mm probe-interval condition. The estimations were tested under four conditions of horizontal distance, *h* = 17.5, 15, 12.5, and 10 mm, and under three depth conditions, *d* = 12.5, 15, and 17.5 mm. The 3D estimation results are represented as the 2D *x*-*z* intensity projection images, as in Fig. 3(a). The white circles represent the true positions of the two absorbers. We judged the estimation results to be successful if the positional errors of both spherical absorbers were within one voxel (2.5 mm) in each *x*, *y*, *z* direction and both maximum estimation values exceeded 0.025 *mm*^{−1}. The estimation image is surrounded by a bold frame if the estimation result was judged successful.

For horizontal distance *h* = 17.5 mm, which is almost the same as probe interval *l* = 18.4 mm, the estimation was accurate up to the depth condition of *d* = 17.5 mm (Fig. 4) (*h* = 17.5 mm, *d* = 17.5 mm). For horizontal distance *h* = 12.5 mm, which is much shorter than the probe interval, the estimation was difficult but successful up to a depth condition of *d* = 12.5 mm (Fig. 4) (*h* = 12.5 mm, *d* = 12.5 mm).

The same analysis was also done under a 13-mm probe-interval condition. For horizontal distances *h* = 17.5, 15, 12.5, and 10 mm, the estimation was accurate up to depth conditions of *d* = 17.5, 15, 12.5, and 12.5 mm, respectively. These results show that even if the observation signal is contaminated by the superficial global absorption change, it is still possible to discriminate the two local absorption changes with distances shorter than the probe interval to a certain depth condition.

## 4. Validation of method by MRI-based head-model simulations

In this section, we explain how we validated our proposed method by a simulation study using an MRI-based head model with a much more complex shape than that used in the two-layer phantom experiment. Time-varying hemoglobin concentration changes were estimated using multiple wavelength data. The scalp hemodynamic change measured in a real experiment with human subjects was used in this simulation study. Note that the results investigated in section 3, such as the depth limits up to which the estimations were successful or the accuracy of the spatial resolution, do not necessarily correspond to those of this simulation study because the optical parameters of the media and the strengths of the observation noise and absorption changes were different.

#### 4.1. Simulation conditions

The head model was constructed based on an anatomical MRI image (Trio 3T, Siemens, Ger-many) of an experimental subject (YO). T1-weighted MPRAGE (TE = 3.06 ms, TR = 2250 ms, flip angle = 9 deg, voxel size = 1 *mm*×1 *mm*×1 *mm*) was segmented to a five-layer head model by FreeSurfer [31] and VBMEG [32], which consists of the scalp, the skull, cerebrospinal fluid (CSF), and the gray and white matter.

We made a DOT measurement of the left motor area during a right-finger-tapping task and put 5 × 5 probes on the left hemisphere to cover the primary motor area (Fig. 5). The probe interval was *l* = 13 mm. The sensitivity was calculated by MCX [33] based on the head model and the probe arrangement. We set the optical parameters summarized in the paper by Fang [34] to each layer of the head model: The absorption coefficient, scattering coefficient, anisotropy, and refractive index (*μ _{a}*,

*μ*,

_{s}*g*,

*n*) of the scalp, the skull, CSF, and the gray and white matter were (0.019

*mm*

^{−1}, 7.8

*mm*

^{−1}, 0.89, 1.37), (0.019

*mm*

^{−1}, 7.8

*mm*

^{−1}, 0.89, 1.37), (0.004

*mm*

^{−1}, 0.009

*mm*

^{−1}, 0.89, 1.37), (0.02

*mm*

^{−1}, 9.0

*mm*

^{−1}, 0.89, 1.37), and (0.08

*mm*

^{−1}, 40.9

*mm*

^{−1}, 0.84, 1.37), respectively. The finger-tapping task continued from 0 to 15 sec and the measurement time was from −5 to 35 sec.

To reduce the amount of calculation, we restricted the area used for our DOT simulation study to an 80 *mm* × 80 *mm* × 60 *mm* region (Fig. 5). For the same reason, 1 *mm* × 1 *mm* × 1 *mm* voxels were merged to 2 *mm* × 2 *mm* × 2 *mm* voxels.

We tested our proposed method by putting one localized hemoglobin concentration change in the gray matter area and the global hemoglobin concentration change in the scalp area. The localized hemoglobin concentration change was placed at 84 various positions near the 5 × 5 probes: every second voxel of the area within a 35-mm depth from the head surface and within the square pole made by the four normal directions of R4, R5, R9, and R10 to the head surface. First, the total amount of changes in hemoglobin concentration of ΔHbO_{2} and ΔHbR was placed in one of these voxels. Then it was smoothed by an 8-mm FWHM Gaussian smoothing filter, *W*, whose value was set to zero outside the gray matter area. The integral of *W* was normalized to 1. This *W* was also used in the estimation process. The peak values of ΔHbO_{2} and ΔHbR were set to +6*μM* and −2*μM*, respectively, on the condition that the Gaussian distribution was all inside the gray matter area. The actual peak values depended on the shape of the gray matter area. The actual mean values of ΔHbO_{2} and ΔHbR for the 84 peaks were +9.71*μM* and −3.24*μM*, respectively. The localized hemoglobin concentration changes of ΔHbO_{2} and ΔHbR vary in time along with the hemodynamic response function; a gamma function of *a* = 6 and *b* = 0.9 sec is convoluted with the finger-tapping task period from 0 to 15 sec [35,36]. The global hemoglobin concentration changes in the scalp area were generated based on the measured data of a human finger-tapping task with the same probe arrangement. For this generation process, we randomly chose one trial of the finger-tapping task from 45 measured trials. The ΔHbO_{2} and ΔHbR of the 1st NN channels calculated by the modified Beer-Lambert law were placed in the mid-points of the channels and smoothed on the scalp by the inverse of Laplacian *L*^{−1} of the scalp area. The strength of the hemoglobin concentration changes in the scalp area was adjusted so that the strengths of the observation signals from the gray matter and the scalp were the same on average as
$\overline{\text{max}\left(\sqrt{{\mathbf{\Sigma}}_{\text{t}}{({\mathit{A}}^{\text{c}}{\mathit{x}}^{\text{c}})}^{2}}\right)}=\overline{\text{max}\left(\sqrt{{\mathbf{\Sigma}}_{\text{t}}{({\mathit{A}}^{\text{s}}{\mathit{x}}^{\text{s}})}^{2}}\right)}$, where the maximum was applied over the channels of all wavelengths and the bar represents the average over all 84 simulations. The observation noise was set to half the strength of the phantom experiment, which corresponds to one-fourth or one-fifth the strength of the human experiments. This means that we used the data from an average of around 20 trials as the observation data in this simulation study. Three values of wavelength data (780, 805, and 830 nm) were used for spectroscopy. The sampling rate was 190 ms. One estimation example of this simulation study is shown in section 4.2. The summarized results of all 84 conditions are shown in section 4.3.

The numbers of voxels in the cortical and scalp areas were *N ^{c}* = 5475 and

*N*= 4119, respectively. The number of measurement logical channels

^{s}*M*was 108 (up to 3rd NN channels). Control constant

*β*of the sensitivity-normalized Tikhonov regularization (see Appendix) was set with reference to the mean value of the sensitivity for the 25-mm deep region from the surface near the probes: log

*β*= mean(log

*ρ*),

_{i}*i*∈

*I*

_{≃25 mm}, where

*I*

_{≃25 mm}denotes the set of voxel indices within a range of 24.5 to 25.5 mm deep and within the square pole made by the four normal directions of R4, R5, R9, and R10.

#### 4.2. Estimation example

We first show one estimation example to understand the simulation conditions. Figure 6(a) shows the true spatial distribution and the true time courses of the hemoglobin concentration changes. Figure 6(b) shows the observation value resulting from the hemoglobin concentration changes shown in Fig. 6(a) and the observation noise. Figure 6(c) shows the spatial distribution and the time courses of the hemoglobin concentration changes estimated from the observation value shown in the top panel of Fig. 6(b).

The top panel of Fig. 6(a) shows the true spatial distribution of the oxygenated hemoglobin concentration change, whose center position in the cortex was (x, y, z) = (21, 41, 19) [mm]. The cross-section view of the *y* = 41 plane is shown. The time average of the local oxygenated hemoglobin concentration change in the cortex within the task response period from 5 to 20 sec is displayed in a color map. The square root of the time-averaged squared value of the global change in the scalp within the measurement period from −5 to 35 sec is also displayed in a color map. Since the mean value of the global change in the scalp area becomes very small, we show the square root of the integrated squared value. The time courses of ΔHbO_{2} and ΔHbR of the peak voxel in the gray matter are shown in the middle panel of Fig. 6(a) and those in the scalp are shown in the bottom panel of Fig. 6(a).

The time courses of observation value ** Y** of all channels are superimposed in the top panel of Fig. 6(b). The observation value consists of cortical signal

*A*

^{c}

*x**, scalp signal*

^{c}

*A*

^{s}

*x**, and observation noise*

^{s}**. The middle and bottom panels of Fig. 6(b) show the time courses of the cortical and scalp signals, respectively. The strengths of the cortical and scalp signals are at the same level, as explained in section 4.1. The strength of the noise is stronger than the strengths of the cortical and scalp signals.**

*ξ*The top panel of Fig. 6(c) shows the spatial distribution of the estimated oxygenated hemoglobin concentration change. The peak position of the estimated oxygenated hemoglobin concentration change in the cortex was (x, y, z) = (21, 41, 17) [mm], and the localization error was 2 mm. The time average of the estimated oxygenated hemoglobin concentration change in the cortex within the task response period is displayed in a color map. The square root of the time-averaged squared value of the estimated oxygenated hemoglobin concentration change in the scalp within the measurement period is also displayed in a color map. The time courses of the estimated ΔHbO_{2} and ΔHbR of the peak voxel in the gray matter are shown in the middle panel of Fig. 6(c) and those in the scalp are shown in the bottom panel of Fig. 6(c). The spatial distribution and the time courses of the hemoglobin concentration changes were accurately estimated by the proposed method.

#### 4.3. Evaluation of estimation errors

To investigate the effect of the complex shape of the five-layer head model on the estimation, we evaluated the estimation errors with 84 various positions of the localized cortical hemoglobin change. In each position, we repeated the simulation five times with different random seeds to generate the observation noise and the global hemoglobin concentration change in the scalp and thus reduce fluctuations in the error evaluations. The following values of indices for evaluating the errors were averaged five times.

Figure 7 summarizes the localization errors defined as the distance between the true and estimated peak voxels of the time-averaged change in oxygenated hemoglobin concentration in the cortex within the task response period. The point’s color in the gray matter area represents the value of the localization error when the center position of the true hemoglobin concentration change was in that point.

Figure 7 shows that the localization error mostly depends on the depth from the head surface and that the positional dependency other than the depth is relatively small. Thus, the complex shape does not have a crucial effect on the estimation. Because the observation signal of the cortical hemoglobin concentration change exponentially decreases with increased depth, the depth has a dominant effect on the estimation errors.

Figure 8 summarizes the estimation errors as functions of the true depth of the cortical hemoglobin concentration change from the head surface. Figure 8(a) summarizes the localization errors shown in Fig. 7 as a function of the true depth. Figure 8(b) shows the estimated depth as a function of the true depth. If the depth is accurately estimated, the value is on the dotted line. Figure 8(c) shows the ratio of the estimated peak value to the true peak value of the time-averaged hemoglobin concentration change in the cortex within the task response period. Figure 8(d) shows the root mean squared errors (RMSE) of the hemoglobin concentration change in the scalp. Here, the squared errors of all of the voxels in the scalp and all of the time samples within the measurement time were averaged.

Figures 8(a) and (b) show that the estimation was successful and not biased toward the depth direction if the true depth was less than 25–26 mm but that the estimation suddenly became impossible above this depth. The averaged localization errors of ΔHbO_{2} and ΔHbR are less than 5 mm up to the 26- and 25-mm depth conditions, respectively, and are much larger than 5 mm in deeper conditions (Fig. 8(a)). The averages of the estimated depths are consistent with the true depths within an error of 1.5 mm up to a 26-mm depth and plateaued beyond it (Fig. 8(b)). This means that the depth estimation became impossible in conditions over a 26-mm depth. Figure 8(c) shows that the hemoglobin concentration changes were underestimated because of the regularization term. The degree of underestimation depends on the depth. In deep conditions, because the regularization effect is more dominant than the observation signal, the estimated hemoglobin concentration changes became nearly zero. Figure 8(d) shows that the estimation errors in the scalp are small and only slightly affected by the position of the hemoglobin concentration change in the gray matter. The RMSE value around 0.05*μM* is sufficiently small compared to the hemoglobin concentration change in the scalp (see bottom panel of Fig. 6(a)). The subtle increase of RMSE in the shallow conditions represents a slight effect from the cortical signal.

To see the effect of the signal-to-noise ratio on the estimation errors, we ran the same simulation except for the observation noise level, which was halved or doubled compared to the standard noise condition shown in Fig. 8 (explained in section 4.1). Figure 9(a) shows the localization errors of ΔHbO_{2} in the cortex. The red solid line is the same as that of Fig. 8(a). The dotted and dashed lines represent the localization errors with the half- and double-strength conditions of the observation noise. The depth limit until which the estimation is successful depends on the signal-to-noise ratio. The localization error is less than 5 mm up to depths of 28, 26, and 23 mm with observation noise levels of half, standard, and double strengths, respectively. Figure 9(b) shows the ratio of the estimated peak value to the true peak value of the time-averaged hemoglobin concentration change in the cortex within the task response period. The dotted, solid, and dashed lines represent the ratio with the half-, standard-, and double-strength conditions of the observation noise. This result shows that the ratio approaches 1 in the high signal-to-noise ratio condition.

Next, we show the advantage of our proposed method over the regression method for removing the scalp artifact by comparing the localization errors. Figure 10 shows the localization errors of ΔHbO_{2} in the cortex. The dotted line represents the estimation error without scalp artifact removal, where only the cortical hemodynamic change was estimated by the hierarchical Bayesian method with the sparse regularization model. The dashed line represents the estimation error with the scalp artifact removal by the regression method, where the cortical hemodynamic change was estimated by the hierarchical Bayesian method after removing scalp signals by the regression method. Here, we used the 1st principal component of the time series of the 1st NN channels for regression removal. The solid line represents the estimation error obtained by the proposed method, where both the cortical and scalp hemodynamic changes were estimated for removing the scalp artifact. This red solid line is the same as that of Fig. 8(a). These results show that the proposed method increased the estimation accuracy to a higher level than did the regression method. This is because the regression method removed the globally coherent scalp signal but it could not remove the spatiotemporally inhomogeneous scalp signal, which was better estimated by the proposed method. The inhomogeneous scalp signal made a difference in the estimation accuracy, even though the scalp hemodynamic changes were spatially very smooth in this simulation study as shown in the top panel of Fig. 6(a).

Finally, we evaluated the robustness of the proposed method, which is important for practical use. We considered two kinds of variabilities. The first one is the optical parameter variability. We tested the case where there was a discrepancy in the optical parameters. Although we generated the data in the same manner described in section 4.1, we estimated the hemodynamic changes using a head model whose optical parameters were different from the true ones. We used the optical parameters summarized in the paper by Okada et al. [37] in the estimation process: The absorption coefficient and reduced scattering coefficient (*μ _{a}*,

*μ′*(=

_{s}*μ*(1 −

_{s}*g*))) of the scalp, the skull, CSF, and the gray and white matter were (0.018

*mm*

^{−1}, 1.9

*mm*

^{−1}), (0.016

*mm*

^{−1}, 1.6

*mm*

^{−1}), (0.002

*mm*

^{−1}, 0.001

*mm*

^{−1}), (0.036

*mm*

^{−1}, 2.2

*mm*

^{−1}), and (0.014

*mm*

^{−1}, 9.1

*mm*

^{−1}), respectively. We used the same anisotropy

*g*and refractive index

*n*summarized in section 4.1. The second kind of variability involves the anatomical structure. We tested the case where there was a discrepancy in the anatomical structure. Although we generated the data in the same manner described in section 4.1, we estimated the hemodynamic changes using a simplified head model that was different from the true head model. The simplified head model had the same external boundary (scalp) but the internal boundaries were given by the parallel surfaces with the external boundary. They were determined by the shortest distance from the external boundary as follows: scalp (0–3 mm), skull (3–12 mm), CSF (12–16mm), and gray matter (16∼ mm). Consequently, there was no folding structure in the tissue boundaries. Figure 11(a) and (b) show the localization error and the estimated depth, respectively, of ΔHbO

_{2}in the cortex as a function of the true depth. The red solid line, which is the same as that of Fig. 8(a) and (b), represents the case where there was no discrepancy. The dotted lines represent the estimation results in the case where there was a discrepancy in the optical parameter. The results show only a slight increase in estimation error. The dashed lines represent the estimation result in the case where there was a discrepancy in the anatomical structure. The results show that the estimation was still possible with a relatively low error level despite the large difference in shape between the simplified head model and the true MRI-based head model. The localization error particularly increased in the shallow area (Fig. 11(a)), where the shape of the gray matter widely varied depending on the location. The increase in localization error in the shallow area was mostly due to the estimation error in the depth direction (Fig. 11(b)).

## 5. Conclusion

We proposed a method to reconstruct the three-dimensional distributions of hemoglobin concentration changes in both cortical and scalp areas by three-dimensional diffuse optical tomography. This method spatially separates the signals from the cortex and the scalp in the estimation process using the spatial information of the optical diffusion process. Therefore, it accurately estimates the position of the cortical hemodynamic change regardless of whether the hemodynamic changes in the cortex and the scalp are temporally correlated. Different types of regularization models were introduced for the cortical and scalp areas depending on the nature of their hemodynamic responses. We applied the sparse regularization model to the cortical area for better localization [23, 38] and a smooth regularization model to the scalp area. Because the parameter controlling the smoothness of scalp regularization is automatically adjusted from the data, our method can estimate the inhomogeneous distribution of the scalp hemodynamic changes. We validated our proposed method by two-layer phantom experiments and MRI-based head-model simulations.

First, we evaluated our proposed method by applying it to the data obtained from the two-layer phantom experiment. The silicone layer of the two-layer phantom mimics the scalp and the liquid layer mimics the cortex. We found from the one-absorber experiment that we could accurately estimate both the horizontal and vertical positions of the local absorption change in the liquid layer even if the observation signal were contaminated by the superficial global absorption change in the silicone layer. Next, we confirmed the method’s high spatial resolution in the two-absorber experiment. The two local absorption changes in the liquid layer with distances shorter than the probe interval could be discriminated in spite of the presence of the superficial global absorption change in the silicone layer.

Second, we evaluated our proposed method by a simulation study using an MRI-based head model to examine the effect of the head model’s complex shape on the estimation. Our results show that the degree of estimation error mostly depends on the depth of the true position and that the positional dependency other than the depth is relatively small. Thus, the complex shape did not have a crucial effect on the estimation. We found that the depth limit up to which the estimation was successful strongly depended on the signal-to-noise ratio. We also confirmed that the proposed method has better accuracy than the regression method for removing the scalp artifact due to its ability to accurately estimate inhomogeneous hemodynamic changes in the scalp.

Finally, we tested the robustness of the proposed method with respect to the two kinds of variabilities. In the first case, we tested the robustness on the optical parameters of the head model because they may vary by individual, and thus the exact values are not known. As a result, we found only a slight increase in estimation error due to discrepancies in the optical parameters. In the second case, we tested the method’s robustness on the anatomical structure. The results showed that estimation was still possible with a relatively low level of error, even though we used a simplified head model instead of the MRI-based head model. This is probably because the diffused light is not so sensitive to the fine anatomical structure. It is also possible to register the head model with the standard MNI space, even when there is no MRI information, by the virtual spatial registration method proposed by Tsuzuki et al. [39]. These suggest that the simplified head model or the standard brain could be used for our proposed DOT method as an alternative to an MRI-based head model if some degree of estimation error were accepted.

## Appendix Details of variational Bayes method

The maximization of free energy can be solved using a factorization approximation that restricts the solution space [40, 41]:

**= (**

*X*

*X**,*

^{cT}

*X**)*

^{sT}*to simplify the expression. Under the factorization assumption, maximum free energy is obtained by alternately maximizing the free energy with respect to*

^{T}*Q*and

_{X}*Q*. In the first step (

_{λ}*X*-step), free energy

*F*(

*Q*) is maximized with respect to

*Q*while

_{X}*Q*is fixed. Solution

_{λ}*Q*(

_{X}**) becomes a normal distribution:**

*X*

*x̄**and*

_{k,t}**Σ**

*denote the expectation value and the covariance matrix of*

_{x,k}

*x**, respectively. In the second step (*

_{k,t}*λ*-step), free energy

*F*(

*Q*) is maximized with respect to

*Q*while

_{λ}*Q*is fixed. Solution

_{X}*Q*(

_{λ}**,**

*λ**η*,

*σ*) becomes a gamma distribution:

*λ̄*and

_{i}*γ*denote the expectation and shape parameter values of

_{i}*λ*,

_{i}*η̄*and

*γ*denote the expectation and shape parameter values of

_{η}*η*, and

*σ̄*and

*γ*denote the expectation and shape parameter values of

_{σ}*σ*. The above

*X*- and

*λ*-steps are repeated until the free energy converges.

In this procedure, the free energy function can be expressed as

*x̄**,*

_{k,t}*λ̄*,

_{i}*η̄*, and

*σ̄*represent the expectation values of

*x**,*

_{k,t}*λ*,

_{i}*η*, and

*σ*with respect to trial distribution

*Q*, respectively.

**Σ**

*represents the covariance matrix of*

_{x,k}

*x**with respect to trial distribution*

_{k,t}*Q*. Here, to simplify the expression, the sensitivity matrix of the cortical and scalp areas is summarized as ${\mathit{A}}_{k}=\left({\mathit{A}}_{k}^{c},{\mathit{A}}_{k}^{s}\right)$.

## Sensitivity-normalized Tikhonov regularization

We used a solution of the sensitivity-normalized Tikhonov regularization [23] as an initial value of the variational Bayes algorithm. The solution, *X̂** _{D}* = min

_{X}*C*(

_{D}**;**

*X**λ*), is obtained by minimizing the following cost function:

*λ*value by minimizing Akaike’s Bayesian information criterion (or maximizing the marginal likelihood) [42, 43].

## Estimation algorithm

- The hierarchical prior and initial values are set.
In this study, we set the hierarchical prior and initial values of the cortical regularization model to

*γ*_{0}=**0**and ${\overline{\mathit{\lambda}}}_{\mathit{init}}={\left(\text{mean}\left({\widehat{\mathit{x}}}_{D}^{c}\right)/\text{max}(\mathit{W})\right)}^{-2}$. We set the hierarchical prior and initial values of the scalp regularization model as*γ*_{η0}= 0 and ${\overline{\eta}}_{\mathit{init}}={\left(\text{mean}\left({\left(\mathit{L}{\widehat{\mathit{x}}}_{D}^{s}\right)}^{2}\right)\right)}^{-1}$. We set the initial value of the noise scale parameter as*σ̄*= 1._{init} - The free energy is maximized by alternately repeating the
*X*-and*λ*-steps. In the following equations, := denotes the assignment operator; the value of the left-hand-side variable is updated by the current value of the right-hand-side equations.We repeated these steps until the iteration count reached 2,000 for the estimation in the two-layer phantom experiments. We repeated these steps until the iteration count reached 10,000 for the estimation in the MRI-based head-model simulations.

[

*X*-step] In the*X*-step, the following values are calculated:$${\mathbf{\Sigma}}_{k}:={\overline{\sigma}}^{-1}{\mathbf{\Sigma}}_{y,k}+{\mathit{A}}_{k}^{c}\mathit{W}{\overline{\mathbf{\Lambda}}}^{-1}{\mathit{W}}^{T}{\mathit{A}}_{k}^{{c}^{T}}+{\mathit{A}}_{k}^{s}{\mathit{L}}^{-1}{\overline{\eta}}^{-1}{\mathit{L}}^{-{1}^{T}}{\mathit{A}}_{k}^{{s}^{T}},$$$${\mathit{h}}^{c}:=\underset{k=1}{\overset{K}{\mathbf{\Sigma}}}\text{diag}\left[{\overline{\mathbf{\Lambda}}}^{-1}{\mathit{W}}^{T}{\mathit{A}}_{k}^{{c}^{T}}{\mathbf{\Sigma}}_{k}^{-1}{\mathit{Y}}_{k}{\mathit{Y}}_{k}^{T}{\mathbf{\Sigma}}_{k}^{-1}{\mathit{A}}_{k}^{c}\mathit{W}+T\left({\mathbf{1}}_{{N}^{c}}-{\overline{\mathbf{\Lambda}}}^{-1}{\mathit{W}}^{T}{\mathit{A}}_{k}^{{c}^{T}}{\mathbf{\Sigma}}_{k}^{-1}{\mathit{A}}_{k}^{c}\mathit{W}\right)\right],$$$${h}^{s}:=\underset{k=1}{\overset{K}{\mathbf{\Sigma}}}\text{tr}\left[{\overline{\eta}}^{-1}{L}^{-{1}^{T}}{\mathit{A}}_{k}^{{s}^{T}}{\mathbf{\Sigma}}_{k}^{-1}{\mathit{Y}}_{k}{\mathit{Y}}_{k}^{T}{\mathbf{\Sigma}}_{k}^{-1}{\mathit{A}}_{k}^{s}{\mathit{L}}^{-1}+T\left({\mathbf{I}}_{{N}^{s}}-{\overline{\eta}}^{-1}{\mathit{L}}^{-{1}^{T}}{\mathit{A}}_{k}^{{s}^{T}}{\mathbf{\Sigma}}_{k}^{-1}{\mathit{A}}_{k}^{s}{\mathit{L}}^{-1}\right)\right],$$$$S:=\underset{k=1}{\overset{K}{\mathbf{\Sigma}}}\text{tr}\left[{\overline{\sigma}}^{-1}{\mathbf{\Sigma}}_{y,k}{\mathbf{\Sigma}}_{k}^{-1}{\mathit{Y}}_{k}{\mathit{Y}}_{k}^{T}{\mathbf{\Sigma}}_{k}^{-1}+T\left({\overline{\mathbf{\Lambda}}}^{-1}{\mathit{W}}^{T}{\mathit{A}}_{k}^{{c}^{T}}{\mathbf{\Sigma}}_{k}^{-1}{\mathit{A}}_{k}^{c}\mathit{W}+{\overline{\eta}}^{-1}{\mathit{L}}^{-{1}^{T}}{\mathit{A}}_{k}^{{s}^{T}}{\mathbf{\Sigma}}_{k}^{-1}{\mathit{A}}_{k}^{s}{\mathit{L}}^{-1}\right)\right].$$The expectation and covariance ofwith respect to*X**Q*can be calculated using these values as ${\overline{\mathit{X}}}_{k}=\left(\begin{array}{c}{\overline{\mathit{X}}}_{k}^{c}\\ {\overline{\mathit{X}}}_{k}^{s}\end{array}\right)={\left(\begin{array}{cc}{\mathit{W}}^{-1T}\overline{\mathbf{\Lambda}}{\mathit{W}}^{-1}& \mathbf{0}\\ \mathbf{0}& {\mathit{L}}^{T}\overline{\eta}\mathit{L}\end{array}\right)}^{-1}\hspace{0.17em}\hspace{0.17em}{\mathit{A}}_{k}^{T}{\mathbf{\Sigma}}_{k}^{-1}{\mathit{Y}}_{k}$ and ${\mathbf{\Sigma}}_{x,k}^{-1}=\overline{\sigma}{\mathit{A}}_{k}^{T}{\mathbf{\Sigma}}_{y,k}^{-1}{\mathit{A}}_{k}+\left(\begin{array}{cc}{\mathit{W}}^{-1T}\overline{\mathbf{\Lambda}}{\mathit{W}}^{-1}& \mathbf{0}\\ \mathbf{0}& {\mathit{L}}^{T}\overline{\eta}\mathit{L}\end{array}\right)$._{X}[

*λ*-step] In the*λ*-step, the expectations and shape parameters of**Λ**,*η*, and*σ*with respect to*Q*are calculated as_{λ}$${\gamma}_{i}:={\gamma}_{0i}+\frac{K}{2}T,\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}{\gamma}_{i}{\overline{\lambda}}_{i}^{-1}:={\gamma}_{0i}{\overline{\lambda}}_{0i}^{-1}+\frac{1}{2}{h}_{i}^{c}{\overline{\lambda}}_{i}^{-1},$$$${\gamma}_{\eta}:={\gamma}_{{\eta}_{0}}+\frac{K}{2}{N}^{s}T,\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}{\gamma}_{\eta}{\overline{\eta}}^{-1}:={\gamma}_{{\eta}_{0}}{\overline{\eta}}_{0}^{-1}+\frac{1}{2}{h}^{s}{\overline{\eta}}^{-1},$$

## Acknowledgments

This research was supported by a contract with the National Institute of Information and Communications Technology entitled, 'Development of network dynamics modeling methods for human brain data simulation systems' and 'Multimodal integration for brain imaging measurements'. A part of this research was supported by SRPBS, MEXT.

## References and links

**1. **F. F. Jöbsis, “Noninvasive, infrared monitoring of cerebral and myocardial oxygen sufficiency and circulatory parameters,” Science **198**, 1264–1267 (1977). [CrossRef] [PubMed]

**2. **Y. Hoshi and M. Tamura, “Detection of dynamic changes in cerebral oxygenation coupled to neuronal function during mental work in man,” Neurosci. Lett. **150**, 5–8 (1993). [CrossRef] [PubMed]

**3. **T. Kato, A. Kamei, S. Takashima, and T. Ozaki, “Human visual cortical function during photic stimulation monitoring by means of near-infrared spectroscopy,” J. Cereb. Blood Flow Metab. **13**, 516–520 (1993). [CrossRef] [PubMed]

**4. **A. Villringer, J. Planck, C. Hock, L. Schleinkofer, and U. Dirnagl, “Near infrared spectroscopy (NIRS): a new tool to study hemodynamic changes during activation of brain function in human adults,” Neurosci. Lett. **154**, 101–104 (1993). [CrossRef] [PubMed]

**5. **B. Chance, Z. Zhuang, C. UnAh, C. Alter, and L. Lipton, “Cognition-activated low-frequency modulation of light absorption in human brain,” Proc. Natl. Acad. Sci. U.S.A. **90**, 3770–3774 (1993). [CrossRef] [PubMed]

**6. **M. Ferrari and V. Quaresima, “A brief review on the history of human functional near-infrared spectroscopy (fNIRS) development and fields of application,” NeuroImage **63**, 921–935 (2012). [CrossRef] [PubMed]

**7. **T. Takahashi, Y. Takikawa, R. Kawagoe, S. Shibuya, T. Iwano, and S. Kitazawa, “Influence of skin blood flow on near-infrared spectroscopy signals measured on the forehead during a verbal fluency task,” NeuroImage **57**, 991–1002 (2011). [CrossRef] [PubMed]

**8. **E. Kirilina, A. Jelzow, A Heine, M. Niessing, H. Wabnitz, R. Brühl, B. Ittermann, A. M. Jacobs, and I. Tachtsidis, “The physiological origin of task-evoked systemic artefacts in functional near infrared spectroscopy,” NeuroImage **61**, 70–81 (2012). [CrossRef] [PubMed]

**9. **Y. Zhang, D. H. Brooks, M. A. Franceschini, and D. A. Boas, “Eigenvector-based spatial filtering for reduction of physiological interference in diffuse optical imaging,” J. Biomed. Opt. **10**, 011014 (2005). [CrossRef]

**10. **S. Kohno, I. Miyai, A. Seiyama, I. Oda, A. Ishikawa, S. Tsuneishi, T. Amita, and K. Shimizu, “Removal of the skin blood flow artifact in functional near-infrared spectroscopic imaging data through independent component analysis,” J. Biomed. Opt. **12**, 062111 (2007). [CrossRef]

**11. **Q. Zhang, E. N. Brown, and G. E. Strangman, “Adaptive filtering for global interference cancellation and real-time recovery of evoked brain activity: a Monte Carlo simulation study,” J. Biomed. Opt. **12**, 044014 (2007). [CrossRef] [PubMed]

**12. **Q. Zhang, G. E. Strangman, and G. Ganis, “Adaptive filtering to reduce global interference in non-invasive NIRS measures of brain activation: how well and when does it work?” NeuroImage **45**, 788–794 (2009). [CrossRef] [PubMed]

**13. **T. Yamada, S. Umeyama, and K. Matsuda, “Multidistance probe arrangement to eliminate artifacts in functional near-infrared spectroscopy,” J. Biomed. Opt. **14**, 064034 (2009). [CrossRef]

**14. **N. M. Gregg, B. R. White, B. W. Zeff, A. J. Berger, and J. P. Culver, “Brain specificity of diffuse optical imaging: improvements from superficial signal regression and tomography,” Front. Neuroenergetics **2**, 1–8 (2010).

**15. **R. B. Saager, N. L. Telleri, and A. J. Berger, “Two-detector Corrected Near Infrared Spectroscopy (C-NIRS) detects hemodynamic activation responses more robustly than single-detector NIRS,” NeuroImage **55**, 1679–1685 (2011). [CrossRef] [PubMed]

**16. **L. Gagnon, R. J. Cooper, M. A. Yücel, K. L. Perdue, D. N. Greve, and D. A. Boas, “Short separation channel location impacts the performance of short channel regression in NIRS,” NeuroImage **59**, 2518–2528 (2012). [CrossRef]

**17. **L. Gagnon, M. A. Yücel, D. A. Boas, and R. J. Cooper, “Further improvement in reducing superficial contamination in NIRS using double short separation measurements,” NeuroImage (in press).

**18. **T. Funane, H. Atsumori, T. Katura, A. N. Obata, H. Sato, Y. Tanikawa, E. Okada, and M. Kiguchi, “Quantitative evaluation of deep and shallow tissue layers’ contribution to fNIRS signal using multi-distance optodes and independent component analysis,” NeuroImage (in press).

**19. **A. P. Gibson, J. C. Hebden, and S. R. Arridge, “Recent advances in diffuse optical imaging,” Phys. Med. Biol. **50**, R1–R43 (2005). [CrossRef] [PubMed]

**20. **D. A. Boas and A. M. Dale, “Simulation study of magnetic resonance imaging-guided cortically constrained diffuse optical tomography of human brain function,” Appl. Opt. **44**, 1957–1968 (2005). [CrossRef] [PubMed]

**21. **A. T. Eggebrecht, B. R. White, S. L. Ferradal, C. Chen, Y. Zhan, A. Z. Snyder, H. Dehghani, and J. P. Culver, “A quantitative spatial comparison of high-density diffuse optical tomography and fMRI cortical mapping,” NeuroImage **61**, 1120–1128 (2012). [CrossRef] [PubMed]

**22. **C. Habermehl, S. Holtze, J. Steinbrink, S. P. Koch, H. Obrig, J. Mehnert, and C. H. Schmitz, “Somatosensory activation of two fingers can be discriminated with ultrahigh-density diffuse optical tomography,” NeuroImage **59**, 3201–3211 (2011). [CrossRef] [PubMed]

**23. **T. Shimokawa, T. Kosaka, O. Yamashita, N. Hiroe, T. Amita, Y. Inoue, and M. Sato, “Hierarchical Bayesian estimation improves depth accuracy and spatial resolution of diffuse optical tomography,” Opt. Express **20**, 20427–20446 (2012). [CrossRef] [PubMed]

**24. **A. C. Kak and M. Slaney, *Principles of Computerized Tomographic Imaging* (IEEE Press, New York, 1988).

**25. **M. A. O’Leary, “Imaging with diffuse photon density waves,” Ph.D. Thesis, University of Pennsylvania (1996).

**26. **S. R. Arridge, “Optical tomography in medical imaging,” Inverse Probl. **15**, R41–R93 (1999). [CrossRef]

**27. **A. Kienle, M. S. Patterson, N. Dögnitz, R. Bays, G. Wagnières, and H. van den Bergh, “Noninvasive determination of the optical properties of two-layered turbid media,” Appl. Opt. **37**, 779–791 (1998). [CrossRef]

**28. **C. M. Bishop, “Variational principal components,” In Proc. of ICANN **1**, 509–514 (1999).

**29. **M. Sato, T. Yoshioka, S. Kajihara, K. Toyama, N. Goda, K. Doya, and M. Kawato, “Hierarchical Bayesian estimation for MEG inverse problem,” NeuroImage **23**, 806–826 (2004). [CrossRef] [PubMed]

**30. **S. J. Matcher, C. E. Elwell, C. E. Cooper, M. Cope, and D. T. Delpy, “Performance comparison of several published tissue near-infrared spectroscopy algorithms,” Anal. Biochem. **227**, 54–68 (1995). [CrossRef] [PubMed]

**31. **“FreeSurfer,” http://surfer.nmr.mgh.harvard.edu/.

**32. **“VBMEG, Variational Bayesian Multimodal EncephaloGraphy,” http://vbmeg.atr.jp/.

**33. **Q. Fang and D. A. Boas, “Monte Carlo simulation of photon migration in 3D turbid media accelerated by graphics processing units,” Opt. Express **17**, 20178–20190 (2009). [CrossRef] [PubMed]

**34. **Q. Fang, “Mesh-based Monte Carlo method using fast ray-tracing in Plücker coordinates,” Biomed. Opt. Express **1**, 165–175 (2010). [CrossRef] [PubMed]

**35. **N. Lange and S. L. Zeger, “Non-linear Fourier time series analysis for human brain mapping by functional magnetic resonance imaging,” Appl. Statist. **46**, 1–29 (1997). [CrossRef]

**36. **K. J. Worsley, C. H. Liao, J. Aston, V. Petre, G. H. Duncan, F. Morales, and A. C. Evans, “A general statistical analysis for fMRI data,” NeuroImage **15**, 1–15 (2002). [CrossRef] [PubMed]

**37. **E. Okada and D. T. Delpy, “Near-infrared light propagation in an adult head model. I. Modeling of low-level scattering in the cerebrospinal fluid layer,” Appl. Opt. **42**, 2906–2914 (2003). [CrossRef] [PubMed]

**38. **V. C. Kavuri, Z. J. Lin, F. Tian, and H. Liu, “Sparsity enhanced spatial resolution and depth localization in diffuse optical tomography,” Biomed. Opt. Express **3**, 943–957 (2012). [CrossRef] [PubMed]

**39. **D. Tsuzuki, V. Jurcak, A. K. Singh, M. Okamoto, E. Watanabe, and I. Dan, “Virtual spatial registration of stand-alone fNIRS data to MNI space,” NeuroImage **34**, 1506–1518 (2007). [CrossRef] [PubMed]

**40. **H. Attias, “Inferring parameters and structure of latent variable models by variational Bayes,” *Proc. 15th Conf. on Uncertainty in Artificial Intelligence*, Morgan Kaufmann, 21–30 (1999).

**41. **M. Sato, “Online model selection based on the variational Bayes,” Neural Comput. **13**, 1649–1681 (2001). [CrossRef]

**42. **H. Akaike, “Likelihood and the Bayes procedure,” in *Bayesian Statistics*,J. M. Bernardo, M. H. De Groot, D. V. Lindley, and A. F. M. Smith, eds. (Univ. Press, Valencia, 1980), 143–166.

**43. **O. Yamashita, “Dynamical EEG inverse problem and causality analysis of fMRI data,” Ph.D. Thesis, Graduate University for Advanced Studies (SOKENDAI) (2004).