Abstract
The development of non-invasive, biomedical optical imaging from time-dependent measurements of near-infrared (NIR) light propagation in tissues depends upon two crucial advances: (i) the instrumental tools to enable photon “time-of-flight” measurement within rapid and clinically realistic times, and (ii) the computational tools enabling the reconstruction of interior tissue optical property maps from exterior measurements of photon “time-of-flight” or photon migration. In this contribution, the image reconstruction algorithm is formulated as an optimization problem in which an interior map of tissue optical properties of absorption and fluorescence lifetime is reconstructed from synthetically generated exterior measurements of frequency-domain photon migration (FDPM). The inverse solution is accomplished using a truncated Newton’s method with trust region to match synthetic fluorescence FDPM measurements with that predicted by the finite element prediction. The computational overhead and error associated with computing the gradient numerically is minimized upon using modified techniques of reverse automatic differentiation.
© Optical Society of America
1. Introduction
Conventional imaging modalities such as magnetic resonance imaging (MRI) and x-ray computer-aided tomography (x-ray CT) provide high-resolution medical imaging enabled by the direct geometric correlation between incident and detected radiation. Yet the high cost of operating hospital MRI facilities, and the inability to detect important diseases through x-ray CT imaging suggests an opportunity for the development of near-infrared (NIR) biomedical optical imaging. NIR biomedical optical imaging, or optical tomography, depends upon the low absorbance, yet high scattering of non-mutagenic near-infrared light in the “therapeutic wavelength window” (600–1000 nm) enabling it to safely propagate through several centimeters of tissues. While the propagation of low-energy NIR light can occur safely, the high scattering properties of tissues renders the direct geometric correlation between incident and detected irradiation invalid. The forward imaging problem, (i.e., prediction of the propagation of NIR light through tissues when a map of the tissue optical properties is known) can be described in terms of the diffusion approximation to radiative transfer. The inverse imaging problem (i.e., prediction of the interior optical properties from measurements of light propagation made at the exterior tissue-air interface positions) requires solution of a series of equations that are non-linear in the optical properties to be estimated. In this contribution, we describe a novel algorithm to estimate a solution to the nonlinear, inverse imaging problem for fluorescence frequency-domain photon migration (FDPM). In the following sections of this introduction, we briefly (i) present the background of frequency-domain photon migration (FDPM) imaging, (ii) compare prior work and our current approach towards solution of its inverse imaging problem, and (iii) introduce the concept of fluorescent contrast agents which can accelerate the convergence of the inverse imaging solution.
1.1 Frequency-domain photon migration: forward and inverse problems
Frequency-domain photon migration, depends upon launching intensity modulated (30-200 MHz) light at the interface of a highly scattering medium (such as tissue), and detecting the intensity-modulated wave that successfully propagates to the detector located a distance ρ away from the incident source. Depending upon the spatial distribution of interior absorption and scattering optical properties, the detected light is both phase-delayed by θ, and amplitude attenuated by factor M when compared to the incident light (see Figure 1). The solution to the forward imaging problem involves predicting the phase-delay, θ, and amplitude demodulation, M as a function of angular modulation frequency, ω, and position, r, along the tissue surface given a known spatial distribution of absorption and scattering optical properties. The absorption coefficient, μ_{ax} , representing the inverse mean absorption path and the isotropic scattering coefficient, μ′_{sx}, equivalent to the inverse isotopic scattering length, govern the propagation of light through scattering media, such as tissues. Since near-infrared light is multiply scattered in tissues, the numerical solution to the diffusion approximation of the radiative transfer equation provides solution to the forward biomedical optical imaging problem. For a propagating intensity modulated wave of light, the optical diffusion equation is written as
where Φ_{x} represents the complex number describing the scalar flux of photons at position r; ω is the modulation frequency; c is the speed of light within the medium; μ_{axi} is the absorption due to chromophores and μ_{axf} is the absorption due to fluorophores (μ_{ax} = μa_{xi} + μ_{axf}). D_{x} is an optical diffusion coefficient equivalent to 1/[3(μ_{ax} + μ′_{sx})]. From numerical solution of Eqn (1) with a Robin boundary condition, the complex scalar flux at the surface, Φ_{x} = (=Me ^{iθ}) and the phase delay, θ, and amplitude attenuation, M, can be determined. It is important to note that the diffusion equation applies when μ_{ax} ≪ μ′_{sx} and D_{x} is approximated by 1/[3(μ′_{sx})].
The successful implementation of biomedical optical imaging involves the effective solution of the inverse imaging problem, i.e., determining the interior optical property map of absorption and scattering given measurements of θ and M along the tissue surface. To date, approaches have focused upon the linearization of the problem using first order Born or Rytov approximations [1, 2]. Non-linear optimization employing iterative Born [3, 4], or distorted Born methods [3,5] require accurate information regarding the normal “background” optical properties which the tissue “heterogeneity” resides. Since the normal tissue background optical properties can be expected to vary greatly, this a priori information is not realistically feasible for biomedical optical imaging.
Other investigators have used the modified Newton-Raphson method with the Levenberg-Marquardt regularization as the central inversion step [6, 7, 8 and 9]. While not requiring accurate “background” optical properties for initial parameter estimate guesses for successful reconstruction, the Newton-Raphson with Levenberg-Marquardt regularization has three distinct disadvantages for biomedical optical imaging. The first disadvantage is that the full Hessian matrix is not fully considered making the method inefficient for highly nonlinear problems such as optical imaging. Numerical experiments by McKeown [10] and Veapucci [111] have shown that while the Gauss-Newton and Levenberg-Marquardt methods solve nonlinear least squares problems efficiently when the residuals are small or zero at the solution, these methods are less efficient than the quasi-Newton method when the residuals are significantly nonlinear. Meyer [12] provides theoretical support of this observation. In the work presented herein, we demonstrate the use of a truncated Newton method with trust region for the efficient solution of the large scale, non-linear optical imaging problem. The second disadvantage to Newton-Raphson approaches is that the methods require expensive numerical calculation of the Jacobian that can be subject to error. In this work, we adapt modifications of the reverse automatic differentiation to compute the gradient defining the search direction efficiently and without numerical error [13, 14]. Finally, the Levenberg-Marquardt method requires storage of the full Jacobian matrix, which represents a limitation for optical imaging in three dimensions and when the number of unknowns increases. Since the truncated Newton method requires only the product of the Hessian with the direction at any point, the method is comparatively more efficient for large problems. Our inversion strategy is presented in its entirety in Section 3.
While the difficulties associated with the nonlinear and large scale of inverse imaging problem can be addressed with our inversion strategy, additional significant challenges exist when attempting to reconstruct absorption and scattering properties that are physiologically feasible. In a study involving actual optical property measurements of normal and diseased tissues, Troy et al. [15] found that the optical properties of absorption and isotropic scattering of normal and diseased breast tissues, for example, may not provide sufficient contrast for successful image reconstruction. Indeed, the small influence of “heterogeneity” absorption and scattering on FDPM measurements measured at the boundary can often be within or close to the measurement error of 1 degree in phase and 1% in amplitude attenuation.
1.2 Fluorescence frequency-domain photon migration: forward and inverse problems
In order to over come the difficulties associated with insufficient contrast and costly nonlinear optimization, we and other investigators have developed fluorescence FDPM imaging approaches and inversion strategies [8, 16, 17, 18]. Fluorescence FDPM physically depends upon the administration of a fluorescent contrast agent, or another agent or gene vector, which results in the expression of a fluorescent protein that emits in the near-infrared wavelength regime. When activated by the absorption of an intensity modulated wave, Φ _{x} (r, ω), an intensity modulated fluorescent wave, Φ _{m} (r, ω), is generated at the same modulation frequency, but is amplitude attenuated and phase-delayed relative to the activating excitation wave. The phase-delay, θ_{m}, and amplitude attenuation, M_{m}, of the generated fluorescent wave can be as high as 90 degrees and a factor of 10’s –100’s of the incident excitation wave owing to the fluorescence properties (lifetime and quantum efficiency) of the dye. Specifically, as the biochemical environment of the fluorophore changes, its lifetime, τ (i.e., the lifetime of the activated fluorophore, or the mean time between absorption of excitation and release of fluorescent light) also alters, providing discrimination of diseased tissues possibly on the basis of biochemical environment. The detected intensity modulated fluorescence wave that has been generated within and which has propagated to the tissue boundary possesses a phase-delay, θ_{m}, and amplitude attenuation, M_{m}, that is exquisitely more sensitive to embedded “heterogeneities” than possible with absorption FDPM imaging described above [16].
Predictions of fluorescence FDPM measurements of phase-delay, θ_{m}, and amplitude attenuation, M_{m} are achieved through the solution of the complex fluorescence fluence, Φ _{m} (r, ω), at the boundary from the fluorescence optical diffusion equation:
where D_{m} is the optical diffusion equation at the emission wavelength (=1/3⌊μ_{am}+μ′_{sm}⌋)≈1/3⌊μ′_{sm}⌋; μ_{am} and μ′_{sm} are the absorption and isotropic scattering coefficients at the fluorescent wavelength; and the right hand term describes the generation of fluorescence within the medium. The term ϕ represents the quantum efficiency of the fluorescence process and the absorption owing to fluorophore is represented by the coefficient μ_{ax→m}. Here we consider first order relaxation decays only. Note that the source term requires coupling with the solution of excitation fluence described by Eqn (1), where by the absorption coefficient at the excitation light, μ_{ax} is now provided by the naturally occurring, non-fluorescent chromophores, μ_{axi}, and the fluorescent contrast agent, μ_{ax→m}. The numerical solution for the excitation fluence distribution Eqn (1) with the Robin boundary condition enables prediction of Φ _{m} from Eqn (2) at the medium boundary and determination of θ_{m} and M_{m}. It is important to note that the diffusion equation applies when μ_{am}<<μ′_{sm}.
With the added unknowns at the emission wavelength, the disadvantages using Newton-Raphson optimization for fluorescence FDPM imaging may become more severe. Paithankar et al., [8] and Troy and Sevick-Muraca [19] employed Newton-Raphson, multi-grid finite difference reconstructions of μ_{ax→m} τ, and ϕμ_{ax→m}, given correct priors on other optical properties. Recently, Jiang [20] performed similar work, employing dual meshing finite element methods instead of finite difference methods. Both Paithankar et al. [8] and Jiang [20] resorted to removal of “spurious” optical property values or “filtering” using empirically chosen parameters to achieve the “correct” image distorted through inefficient numerical computation of the Jacobian. O’Leary et al., [17] and Chang, et al., [18] employed Born and iterative Born which again not only required known “background” optical properties, but which also was unable to handle fluorescence in the “background.” Later Chang et al., [21] demonstrated the ability to reconstruct fluorescence lifetime under conditions of “imperfect” uptake given the spatial distribution of absorption owing to fluorophore; and Lee and Sevick-Muraca [22] employed an iterative Born-type solution of simultaneous absorption and lifetime from fluorescence FDPM synthetic data.
1.3 Fluorescence FDPM imaging: current work
In this contribution, we also focus upon fluorescence FDPM imaging, owing to the significant influence of interior fluorescent heterogeneities on measured signals at the medium boundary and adapt our inversion strategy for the non-linear, large scale optimization required for biomedical optical imaging of absorption and fluorescence lifetime. In the following section, the numerical methodology for solving the FDPM forward imaging problem and the simulator used for generation of synthetic data sets for testing our inversion algorithm are presented. Section 3 describes our formulation of the inverse imaging problem as an optimization problem using truncated Newton’s methods with trust region and the reverse automatic differentiation in order to reduce the computational overhead associated with optimization. The performance of our algorithm using synthetic data sets is presented in Section 4 and the conclusions, prognosis for FDPM imaging and ongoing work are summarized in Section 5.
2.0 Model prediction of FDPM
2.1 Boundary conditions for FDPM forward simulator
Prediction of FDPM measurements on surface dΩ resulting from diffusion of intensity-modulated waves within volume Ω, is accomplished from solution of Eqn (1) and (2) using the Robin boundary condition. We employ the Robin boundary condition of the form given by Ishimaru[23, 24]:
Where S is the complex flux density associated with a point source of excitation light located at r_{s} on dΩ; n is the unit vector normal to the surface, and γ accounts for refractive index mismatch at the boundary which through Snell’s reflection, results in internal reflection at the surface.
Groenhuis et al. [25] have developed an empirical approach to determine the value of the parameter γ, viz.
where
n_{rel} is the relative index of refraction of the scattering medium with respect to the surrounding medium.
2.2 Galerkin finite element formulation for solution of Φ_{x}
We employ the Robin boundary condition in our numerical solution of Φ_{x,m} employing the Galerkin finite element model with linear triangular elements. In our formulation, we assume μ_{ax,m}<<μ_{sx,m}, so that D_{x,m} ≈1/3μ_{sx,m}. The formulation of the Galerkin finite element solution of Φ_{x} begins by multiplying Eqn (1) by a weighting function, w_{j}, and integrating over the domain of interest:
where w_{j} {w_{j} : j = 1,⋯,N} is a set of linearly independent weighting functions. N is the number of unknows.
Using Green’s theorem, equation (6) becomes
where the second term is a line integral along the external boundary and w_{j} is assumed to be continuous over the whole domain. Now introducing equation (3) in the line integral
Upon combing Eqns (8) and (7), once can write
Suppose that the solution domain Ω is divided into M _{1} triangular elements. Now we can obtain the element equation from Eqn (9) as follows:
$$\sum _{\mathrm{el}}^{M}\left[\frac{1}{\gamma}\underset{{\Gamma}^{\mathrm{el}}}{\int}S{w}_{j}\mathrm{d\Gamma}\right]$$
Assuming that ${\mathrm{\Phi}}_{x}^{\mathit{\text{el}}}$ , ${\mathrm{D}}_{x}^{\mathit{\text{el}}}$ and μ^{el}_{axf} vary linearly within each triangular element el and μ_{axi} is constant, we may then write ${\mathrm{\Phi}}_{x}^{\mathit{\text{el}}}$ , ${\mathrm{D}}_{x}^{\mathit{\text{el}}}$ and μ^{el}_{axf} for each element as:
where the L_{i} are the natural co-ordinates for the triangle [26]. According to the Bubnov-Galerkin method, the weighting functions are chosen to be the same as the approximation functions used to represent ${\mathrm{\Phi}}_{x}^{\mathit{\text{el}}}$ , that is,
The equation (10) can be written as:
$$\sum _{\mathrm{el}=1}^{M}\left[\frac{1}{\gamma}\underset{{\Gamma}^{\mathrm{el}}}{\int}S\phantom{\rule{.2em}{0ex}}{L}_{j}\mathrm{d\Gamma}\right]$$
After integration of Eqn (12) and rearrangement, the resulting element stiffness equations become:
Assembly of these element stiffness matrices ${\mathbf{K}}_{i}^{\mathit{\text{el}}}$ and local vector r ^{el} yields the system equations KΦ̄ _{x} =b [26].
Since S denotes the complex point source applied to the surface dΩ. at position r_{s} and S is expressed as:
where r_{s} is the position vector of the point source. By definition, we have
where el is the element of the finite element mesh. From Eqn(12) we have
If the position r_{s} of the point source coincides with the position of node j, when L_{j}(r_{s}) = 1, i.e. we have
2.3 Galerkin formulation of finite element solution Φ_{m}
The forward finite element solution of the emission diffusion equation is formulated similarly to the excitation diffusion equation with the exception that (i) the complex excitation fluence must be first computed, and that (ii) the formulation of fluorescence lifetime poses some additional considerations. As in the formulation of the solution for Φ_{x}, parameters of μ_{ax→m}, τ, and D_{m} are expected to vary linearly within the triangular elements. In order to facilitate the formulation of the Galerkin finite element method, the RHS term of Eqn (2) is expressed in terms of a binomial expansion assuming ωτ<1 and higher order terms are neglected. Hence, the RHS source term of Eqn (2) becomes S_{m} (r,ω)=ϕμ_{ax→m}[1 + iω(r)τ]Φ _{x} (r, ω). This assumption is appropriate, since the phase-delay, θ_{m}, of the generated fluorescent intensity modulated wave reaches a maximum of 90 degrees with respect the local excitation intensity modulated wave and becomes insensitive to fluorescence lifetime when ωτ>1. The formulation of the emission finite element solution for Φ _{m} is identical to that presented in section 2.2 for Φ_{x} with the exception that the parameters Φ _{m} , D_{m}, μ_{ax→m}, and τ are assumed to vary linearly within each element. The finite element formulation of emission diffusion is given below:
$$\sum _{\mathit{el}=1}^{\mathit{el}}\left[\underset{{\Omega}^{\mathit{el}}}{\int}{\Phi}_{x}^{\mathit{el}}{\mu}_{{a}_{x\to m}}^{\mathit{el}}\varphi \left(1+\omega {\tau}^{\mathit{el}}\right){L}_{j}d\Omega \right]$$
It is important to emphasize that the absorption owing to fluorophore is equivalent in our case to μ_{ax→m} and to μ_{axf}. We differentiate between the two parameters when reconstructing absorption from synthetic excitation, μ_{axf}, and absorption from emission data μ_{ax → m}.
2.4 Solution of the system of finite element equations
For both excitation and emission finite element formulations, the local stiffness matrices, ${\mathbf{K}}_{\mathrm{j}}^{\mathit{\text{el}}}$, are formed and then combined into a complex, global stiffness matrix, K, which depends upon the absorption and scattering properties at each wavelength. For solution of excitation fluence, Φ̄_{x}, the vector b reflects the complex source of excitation light from boundary point sources at positions r_{s}, while for solution of the emission fluence, Φ̄_{m}, vector b represents the source generated from within volume Ω.
To solve for Φ̄_{x,m} , the complex set of sparse, linear equations generated by the finite element method,
are efficiently solved using a subroutines ZGBTRF and ZGBTRS (LAPACK subroutines). ZGBTRF computes a LU factorization of a complex M _{2} by N band matrix using partial pivoting with row interchanges. (M _{2} = 2*KL + KU + 1, KL and KU are the numbers of subdiagonals and superdiagonals within the band of K, respectively). ZGBTRS solves a system of linear equations with a general band matrix K using the LU factorization computed by ZGBTRF. The main advantage of this method is that once the decomposition is done, the solution vector can be obtained for any number of right hand side vectors. The accuracy of the solution for Φ̄_{x,m} depends crucially on the mesh size, h, with the accuracy improving with h ^{2}.
3.0 Inverse solution for fluorescence FDPM
The variables in the inverse problems are the absorption coefficients, μ_{axf} (for excitation), μ_{ax→m} (for emission), and the lifetime τ at each nodal point of a finite element model. If there are N nodal points then the number of measurements must be greater than N. The number of measurements made at N_{B} boundary nodes for N_{s} sources, requires that
Due to the reciprocality of sources and detectors, i.e., when a detector position is swapped with a source position equivalent measurements are registered, then we require:
If l =1,…,N_{s} denotes the different distribution of source s and j =1,…, N_{B} denotes the boundary nodes at which measurements are made, the error function for optimization of absorption and lifetime imaging may be taken as
The subscript c denotes the values calculated by the forward simulator problem and the subscript me denotes the experimental (or in this case synthetically generated) fluence values. The superscript * denotes the complex conjugate of the complex function Φ_{x,m}. The subscript l and j denote fluence values that arise between source l and detector j.
It is seen from the equation (20) that the calculation of error function E involves N_{s} solutions of the direct problem. The global stiffness matrix K _{x,m} for excitation and emission remains constant with only the vector b changing with each excitation source. Since the global stiffness matrix K _{x} depends on μ̄_{axf} and b is independent of μ̄_{axf}, LU decomposition of K _{x} for solution of Φ_{x} need only be performed once in each iteration of the optimization problem.
An efficient way of calculating E_{x,m} for excitation and emission equations consists of the following steps:
- Given μ_{axf} calculate the local stiffness matrices K ^{el} (μ̄_{axf})
- $\mathbf{K}=\sum _{\mathrm{el}}{\mathbf{K}}^{\mathrm{el}}$and set F = 0
- Decompose K into the LU decomposition LU = K
For each source l
- calculate b(s)
- Solve, LUΦ̄_{x,m} =b gives (Φ̄_{x,m})
- ${E}_{x,m}=\frac{1}{2}\sum _{j\in d\Omega}{\left(\frac{{\left({\Phi}_{x,m}\right)}_{c}-{\left({\Phi}_{x,m}\right)}_{\mathit{me}}}{{\left({\Phi}_{x,m}\right)}_{\mathit{me}}}\right)}_{j}{\left(\frac{{\left({\Phi}_{x,m}^{*}\right)}_{c}-{\left({\Phi}_{x,m}^{*}\right)}_{\mathit{me}}}{{\left({\Phi}_{x,m}^{*}\right)}_{\mathit{me}}}\right)}_{j}$
- F = F + E_{x,m}
As most of the efficient optimization algorithms, the truncated Newton method requires the function value E_{x,m} and the gradient of the error function, ∇E_{x,m} , in order to efficiently update parameter estimates. The gradients of the error function in Eqn (20) represent the direction in which the error function is increasing/decreasing most rapidly. Since absorption coefficients μ_{axf} μ_{ax→m} and τ are the unknown parameters, the gradients are obtained by taking partial derivatives of Eqn (20) with respect to μ_{axf}, μ_{ax→m} and τ.
〈.,.〉 denotes the inner product of two complex vectors and Re(.) denotes the real part of a complex. number. The gradient of the real valued error function with respect to real valued μ̄_{axf} μ_{ax→m} , and τ remains real valued. At minimum these gradients should be zero, so the real part of the gradients are zero. Therefore, we consider only the real part of the gradients. In the following, the optimization strategy using the truncated Newton method with the gradients computed by reverse automatic differentiation is described.
3.1 Truncated Newton’s method as an optimization strategy for FDPM imaging
The truncated Newton method with trust region is used for non-linear optimization problem. The reader interested in background literature is referred to the references [27, 28].
Newton’s method is based upon approximating the function E_{x,m} (μ̄_{ak} +d) (μ_{a} denotes μ_{axf} {for excitation} or μ_{ax→m} {for emission} in this section) at the kth iteration by the quadratic model:
$$\phantom{\rule{.2em}{0ex}}Q\left(\mathbf{d}\right)={\mathbf{g}}_{k}^{T}\mathbf{d}+\frac{1}{2}{\mathbf{d}}^{T}{\mathbf{G}}_{k}\mathbf{d}$$
In these equations g _{k} =g(μ̄_{ak} =∇E_{x,m} (μ̄_{ak} and G _{k} =∇^{2} E_{x,m} (μ̄_{ak}) denote the gradient vector and Hessian matrix, respectively. The Newton direction is obtained from an exact minimum of above equation; i.e. the search direction d at iteration k is defined by the equation
A sequence of approximate solutions of the above equation is generated by the conjugate gradient method until a vector d _{i} (see Step 3) is obtained for which the following condition on the relative residual is satisfied as suggested by Dembo and Steihaug [29];
where r _{i} = G _{k} d _{i} + g _{k}. The conjugate gradient iteration is then truncated and d _{i} is used as the search direction for the minimization. The relative residual is used as a measure of accuracy and the required level of accuracy improves as the minimization proceeds and as (μa_{k} ) approaches a minimum. It is not necessary to compute the Hessian G as only the product Gd is required. The product is calculated using the finite difference formula given below:
where $\sigma =\frac{\sqrt{\mathit{machine\; precision}}}{\parallel \mathbf{d}\parallel}$. This avoids the calculation and storage of the Hessian but requires an additional gradient evaluation for each minor iteration. The pseudo codes for truncated Newton optimization is as follows:
- Initialization
An initial guess (μ̄_{a0} is made of the solution: The radius R _{1} = 0.01 of the trust region is defined around (μ̄_{a0}), E _{0} = E(μ̄_{a0}), g _{0} = g(μ̄_{a0}) are computed.
- Stopping Criteria
If ∥g _{k}∥ ≤ ε exit, ε is specified by the user
Else continue with next Step 3
- Computing Newton Direction
Apply conjugate gradient method to find an approximate solution d _{k} of the Newton equation G _{k} d _{k} = -g _{k} such that ∥d _{k}∥ ≤ R_{k} and ${\mathbf{d}}_{k}^{T}$ g _{k} ≤ -ε_{1}∥d _{k}∥∥g _{k}∥, typically ε_{1} = 0.001 [30]. The conjugate gradient algorithm ensures that at the j-th iteration d _{j+1} =d _{j} + α_{j} p _{j} remains within the trust region (α_{j} steplength and p_{j} is the conjugate direction)
- Line Search
After computing the truncated Newton direction d _{k}, a line search is used to find an appropriate λ_{k} that reduces the function E_{x,m} along the line μ̄_{ak} + λ_{k} d _{k}. The Armijo line search [31] is used. It calculates λ_{k} such that Wolfe’s [32] conditions 2 and 3 are satisfied.
Condition 2:
E(μ̄_{a} + αd) - E(μ̄_{a}) ≤ ε_{2}αd ^{T} g where ε_{2} is a constant, such that 0.5 > ε_{2} 0 (typically ε_{2} = 0.1).
Condition 3
E(μ̄_{a} + 2αd) > E(μ̄_{a} + ε_{3}αd ^{T} g) where ε_{3} is a constant, such that 0.5 > ε_{3} 0
- New Approximation
Compute μ̄_{ak+1} = μ̄_{ak} + λ_{k} d _{k}, E _{x,mk+1}, g _{k+1}
- Update the trust region
Once λ_{k} has been found the radius of the trust region R _{k+1} is altered as follows:
The truncated Newton search direction is constructed so that it always satisfies Wolfe’s condition 1 (Step 3, ${\mathbf{d}}_{k}^{T}$ g _{k} ≤ -ε_{1}∥∥d _{k} ∥d _{k}∥). Wolfe’s conditions 2 and 3 are satisfied within the line search, as discussed is Step 4. Since all the Wolfe’s conditions have been satisfied, the method is globally convergent, and will terminate in a finite number of iterations.
A similar formula applies for lifetime τ and absorption coefficient μ_{ax→m}.
3.2 Reverse automatic differentiation for computation of ∇E_{x,m}
The reverse automatic differentiation (RAD) method is used to calculate the gradient g = g(μ̅_{a} or τ̅ or μ̅_{ax→m}). The reader interested in background literature is referred to References [28, 33, and 34]. A brief description of the reverse differentiation and analytic implementation of this method is discussed in the following section.
Reverse automatic differentiation is discussed in terms of the Wengert [35] list. This list decomposes complicated function of many variables into a sequence of simpler functions of one or two variables. Functions of two variables are called the “binary function”. Binary functions are either addition, or subtraction or multiplication. Functions of one variable are either reciprocation, raise of power, exponential, logarithm, trigonometric, or hyperbolic. These functions are defined as unary functions.
If f(x) is a function of the n variables x _{1},…,x_{n} then a set of new variables x _{n+1},…x _{P} are introduced, where P is the number of arithmetic operations involved in calculating the function f(x). A Wengert list for the calculation for the calculation of f(x) consists of a list of these unary and binary function processed in a given order;
Reverse differentiation can be explained as follows. Given the Wengert list for the calculation of f a set of adjoint variables x̂ = ∂f /∂x_{i} is introduced and the order of the operations reversed. The method is based on the function of a function rule of calculus that implies
The automatic method involves a pass in the reverse direction through the list after the function has been evaluated in a forward sweep. Hence the method is:
$$\begin{array}{cc}\mathrm{then,\; if}\phantom{\rule{.2em}{0ex}}{F}_{i}\phantom{\rule{.2em}{0ex}}\mathrm{is}\mathrm{binary},& {\hat{x}}_{j}={\hat{x}}_{j}+{\hat{x}}_{i}\frac{\partial {F}_{i}}{\partial {x}_{j}}\phantom{\rule{.2em}{0ex}}i>j\\ \mathrm{and}\phantom{\rule{6.2em}{0ex}}& {\hat{x}}_{k}={\hat{x}}_{k}+{\hat{x}}_{i}\frac{\partial {F}_{i}}{\partial {x}_{k}}\phantom{\rule{.2em}{0ex}}i>j\\ \mathrm{else\; if}\phantom{\rule{.2em}{0ex}}{F}_{i}\phantom{\rule{.2em}{0ex}}\mathrm{is}\mathrm{unary},& {\hat{x}}_{j}={\hat{x}}_{j}+{\hat{x}}_{i}\frac{\partial {F}_{i}}{\partial {x}_{j}}\phantom{\rule{.2em}{0ex}}i>j\\ \mathrm{derivatives}\phantom{\rule{3.em}{0ex}}& {g}_{i}={\hat{x}}_{i}\phantom{\rule{.2em}{0ex}}i=1,\dots ,n\end{array}$$
Griewank [14] showed that it is always possible to calculate any gradient vector of a function in less than 3 times the computational cost of the function by the reverse automatic differentiation method irrespective of the dimension of the problem. The main disadvantage of the automatic code is that it requires large storage and significant data accessing overheads [28, 33, and 34]. In order to overcome these difficulties, the reverse differentiation method is extended and has been performed analytically. This does not require large storage and at the same time gradients are calculated in less than three times the operation count of the function. The gradients of E_{x,m} with respect to parameter μ_{axf} are calculated using the reverse differentiation method as follows.
- ${F}_{x,m}=0,\hat{\mathbf{K}}=\frac{\partial {E}_{x,m}}{\partial K}=0,\hat{\mathbf{b}}=\frac{\partial {E}_{x,m}}{\partial b}=0$
- Calculate element matrix K ^{el} (μ̄_{a})
- Assemble global matrix K
- Decompose K = LU
For each light source s:
- Calculate b(s)
- Solve LU Φ̄ _{x,m} = b(s) to give (Φ̄_{x,m})_{c}
- ${E}_{x,m}=\sum _{j}\frac{1}{2}{\left(\frac{{\left({\Phi}_{x,m}\right)}_{c}-{\left({\Phi}_{x,m}\right)}_{\mathit{me}}}{{\left({\Phi}_{x,m}\right)}_{\mathit{me}}}\right)}_{j}{\left(\frac{{\left({\Phi}_{x,m}^{*}\right)}_{c}-{\left({\Phi}_{x,m}^{*}\right)}_{\mathit{me}}}{{\left({\Phi}_{x,m}^{*}\right)}_{\mathit{me}}}\right)}_{j}$
- F_{x,m} = F_{x,m} + E_{x,m}
- ${\hat{\overline{\Phi}}}_{x,m}=\frac{\partial {E}_{x,m}}{\partial {\Phi}_{x,m}}=\{\begin{array}{c}{\left(\frac{{\left({\Phi}_{x,m}^{*}\right)}_{c}-{\left({\Phi}_{x,m}^{*}\right)}_{\mathit{me}}}{{\left({\Phi}_{x,m}\right)}_{\mathit{me}}{\left({\Phi}_{x,m}^{*}\right)}_{\mathit{me}}}\right)}_{j}\phantom{\rule{.6em}{0ex}}\mathit{for\; all}\phantom{\rule{.2em}{0ex}}j\in d\Omega \\ 0\phantom{\rule{5.2em}{0ex}}\mathit{for\; all}\phantom{\rule{.2em}{0ex}}j\notin d\Omega \end{array}$
- Solve $\mathbf{LU}{\overline{v}}_{x,m}={\hat{\overline{\Phi}}}_{x,m}$
- Update K̂ = K̂ - v${\u0304}_{x\mathit{,}m}^{\mathrm{T}}$Φ̄_{x,m}, b̂ = b̂ + v̄_{x,m}$${\left({\hat{\mu}}_{{a}_{\mathit{xf}}}\right)}_{p}=\frac{\partial {E}_{x,m}}{\partial {\left({\mu}_{{a}_{\mathit{xf}}}\right)}_{p}}=\frac{\partial {E}_{x,m}}{\partial K}\frac{\partial K}{\partial {\left({\mu}_{{a}_{\mathit{xf}}}\right)}_{p}}+\frac{\partial {E}_{x,m}}{\partial b}\frac{\partial b}{\partial {\left({\mu}_{{a}_{\mathit{xf}}}\right)}_{p}}$$
- $$=\sum _{\mathit{el}}\sum _{i,j}{\left({\hat{K}}^{\mathit{el}}\right)}_{i,j}\left(\frac{\partial {K}_{i,j}^{\mathit{el}}}{\partial {\left({{\mu}_{a}}_{\mathit{xf}}\right)}_{p}}\right)+\sum _{\mathit{el}}\sum _{j}{\hat{b}}_{j}\frac{\partial {b}_{j}}{\partial {\left({{\mu}_{a}}_{\mathit{xf}}\right)}_{p}}$$
The above gradients of error function E_{x,m} are calculated using the global stiffness matrices, K, and the matrix b assembled from the excitation and emission diffusion equations as described in Section 2.2 and 2.3. (The reverse differentiation method described above uses the fact that if Φ _{x,m} is the solution of the equation KΦ̅ _{x,m} = b then K̂ = ${-}_{v}^{\mathit{-}T}$ _{x,m} Φ̂ _{x,m} and b = v̅_{x,m} where v̅_{x,m} is the solution of the equation $\mathbf{K}{\overline{v}}_{x,m}={\hat{\overline{\Phi}}}_{x,m}$, see Appendix A). With ∇E_{x,m} and E_{x,m} computed, the Newton directions can be computed as described in Section 3.1. Similar formulas hold for τ̂ and μ̂_{ax→m}
4 Conclusions
We have formulated the solution to the absorption and fluorescence FDPM optical imaging problem as a non-linear optimization scheme. We develop the truncated Newton method for minimizing the sum of errors between measured and synthetic data generated using a simplistic finite element model with a minimal number of sources and detectors. Since the truncated Newton method requires only the product of the Hessian with a direction at any point, it is computationally efficient for large optimization problems. We couple the truncated Newton method with a finite element solver which assumes a polynomial series in absorption and lifetime in order to accurately compute Φ_{x,m} and employ principles of reverse differentiation in order to efficiently compute the gradient of the error function, ∇E_{x,m} . Using synthetic data with simulated measurement noise, we demonstrate in Part II the optimization strategy on a series of studies involving a minimum number of sources and detectors.
Appendix A
Calculation of adjoint variables
In this appendix we derive the expressions for the adjoint variables $\hat{K}=\frac{\partial {E}_{x,m}}{\partial k}$ and $\hat{\mathbf{b}}=\frac{\partial {E}_{x,m}}{\partial b}$.
- Adjoint variable K̂$$\hat{K}=\frac{\partial {E}_{x,m}}{\partial K}=\frac{\partial {E}_{x,m}}{\partial {\overline{\Phi}}_{x,m}}\frac{\partial {\overline{\Phi}}_{x,m}}{\partial K}={\hat{\overline{\Phi}}}_{x,m}\frac{\partial {\overline{\Phi}}_{x,m}}{\partial K}={\hat{\overline{\Phi}}}_{x,m}\frac{\partial {\Phi}_{x,m}}{\partial {\overline{\mu}}_{{a}_{\mathit{xf}}}}\frac{\partial {\overline{\mu}}_{{a}_{\mathit{xf}}}}{\partial K}$$
(since K is a function of μ_{axf}, we differentiate K with respect of μa_{xf} ). Now we use the global equation KΦ̅_{x,m} = b to find $\frac{\partial {\overline{\mu}}_{{a}_{\mathit{xf}}}}{\partial K}$ as follows:
$$\mathbf{K}{\stackrel{\u0304}{\Phi}}_{x,m}=\mathbf{b}$$
$$\frac{\partial K}{\partial {\stackrel{\u0304}{\mu}}_{{a}_{\mathit{xf}}}}{\overline{\Phi}}_{x,m}+K\frac{\partial {\overline{\Phi}}_{\mathit{xf}}}{\partial {\overline{\mu}}_{{a}_{\mathit{xf}}}}=0$$Since the source term b is constant in excitation equation.
$$\frac{\partial K}{\partial {\overline{\mu}}_{{a}_{\mathit{xf}}}}=-\frac{K\frac{\partial {\overline{\Phi}}_{x,m}}{\partial {\overline{\mu}}_{\mathit{xf}}}}{{\overline{\Phi}}_{x,m}}$$
$$\frac{\partial {\overline{\mu}}_{\mathit{xf}}}{\partial K}=-{K}^{-1}\frac{{\overline{\Phi}}_{x,m}}{\frac{\partial {\overline{\Phi}}_{x,m}}{\partial {\overline{\mu}}_{{a}_{x,m}}}}$$On substituting Eqn (A2) in Eqn (A1) we have
$$\hat{K}=-{\hat{\overline{\Phi}}}_{x,m}{K}^{-1}{\Phi}_{x,m}=-{\overline{v}}^{T}{\overline{\Phi}}_{x,m}$$where
$$\overline{v}=\hat{\overline{\Phi}}{K}^{-1}$$
$$\mathbf{K}\overline{v}={\hat{\overline{\Phi}}}_{x,m}$$We can solve the Eqn(A4) and obtain v̅. Finally K̂ is obtained using Eqn(A3).
- Adjoint variable b̂$$\hat{b}=\frac{\partial {E}_{x,m}}{\partial b}=\frac{\partial {E}_{x,m}}{\partial {\overline{\Phi}}_{x,m}}\frac{\partial {\overline{\Phi}}_{x,m}}{\partial b}={\hat{\overline{\Phi}}}_{x,m}\frac{\partial {\overline{\Phi}}_{x,m}}{\partial b}={\hat{\overline{\Phi}}}_{x,m}\frac{\partial {\overline{\Phi}}_{x,m}}{\partial \overline{\tau}}\frac{\partial \overline{\tau}}{\partial b}$$
(since b is a function of τ̅, we differentiate b with respect of τ̅). Now we use the global equation KΦ̅ _{x,m}=b to find $\frac{\partial \overline{\tau}}{\partial b}$ as follows:
$$\mathbf{K}{\overline{\Phi}}_{x,m}=\mathbf{b}$$
$$\mathbf{K}\frac{\partial {\overline{\Phi}}_{x,m}}{\partial \overline{\tau}}=\frac{\partial b}{\partial \overline{\tau}}$$
$$\frac{\partial \overline{\tau}}{\partial b}={\mathbf{K}}^{-1}\frac{1}{\frac{\partial {\overline{\Phi}}_{x,m}}{\partial \overline{\tau}}}$$Substitute Eqn(A6) into Eqn(A5) we have
Eqn(B7) is the same as Eqn (B4). Hence b̂ = v̅
M | amplitude modulation |
r | position |
c | speed of light |
S(r _{s,ω} ) | source at location r_{s} and modulated frequency ω. |
D | optical diffusion coefficient = 1/3 [μ_{a} + μ′_{s}] |
n | normal |
r_{s} | source position |
r_{d} | parameter describing γ |
n_{rel} | relative refractive functions |
W_{j} | independent weighting functions |
L_{j} | co-ordinates for triangular elements |
${\mathbf{K}}_{i}^{\mathit{\text{el}}}$ | element stiffness matrices |
r ^{el} | local vector in FEM stiffness equations |
b | local vector in FEM stiffness equation containing source terms |
Z | variance |
G(0,1) | random number with Gaussian distribution of zero mean and unit variance |
N | number of nodal points |
N_{s} | number of sources |
N_{B} | number of boundary nodes |
ρ | distance between source and detector pair |
θ | phase-delay |
ω | angular modulation frequency |
μ_{a} | absorption coefficient |
μ_{s} | isotropic scattering coefficient |
Φ | complex fluence |
ϕ | quantum efficiency |
τ | fluorescence lifetime |
Ω | volume |
γ | parameter to account for refractive index mismatch |
Γ | surface |
ε | stopping criteria for truncated Newton’s method |
ε | error function for all sources |
F | error function for individual sources |
g _{k} | gradient vector at iteration k = ∇E_{k} |
d | search direction |
G | Hessian matrix |
References
1. D. A. Boas, M. A. O’Leary, B. Chance, and A.G. Yodh, “Scattering of diffuse photon density waves by spherical heterogeneities within turbid media: analytic solutions and applications,” Proc. Natl. Acad. Sci. , 91, 4887’91 (1994). [CrossRef]
2. M. A. O’Leary, D. A. Boas, B. Chance, and A.G. Yodh, “Experimental images of heterogeneous turbid media by frequency-domain diffusion photon tomogrpahy,” Opt. Lett. 20, 426’428 (1995). [CrossRef]
3. R. L. Barbour, H. Graber, Y. Wang, J. Chang, and R. Aronson, “Perturbation approach for optical diffusion tomography using continuous-wave and time-resolved data,” in Medical Optical Tomography: Functional Imaging and Monitoring, G. Muller, B. Chance, R. Alfano, J. Beuthan, E. Gratton, M. Kashke, B. Masters, S. Svanberg, and P. van der Zee, eds. (SPIE Press, Bellingham, WA., 1993), pp 87’120.
4. Y. Yao, Y. Wang, Y. Pei, W. Zhu, and R.L. Barbour, “Frequency-domain optical imaging of absorption and scattering by a Born iterative method,” J. Opt. Soc. Am. A 14, 325’342 (1997). [CrossRef]
5. W. C. Chew and Y. M. Wang, “Reconstruction of two-dimensional permittivity distribution using the distorted Born iterative method,” IEEE Trans. On Medical Imaging. 9, 218’225 (1995). [CrossRef]
6. K. D. Paulsen and H. Jiang, “Spatially varying optical property reconstruction using a finite element diffusion equation approximation,” Med. Phys. 22, 691’701 (1995). [CrossRef]
7. H. Jiang, K. D. Paulsen, U. L. Osterberg, B.W. Pogue, and M. S. Patterson, “Optical image reconstruction using frequency-domain data simulations and experiments,” J. Opt. Soc. Am. A. 13, 253’266 (1996). [CrossRef]
8. D.Y. Paithankar, A. U. Chen, B.W. Pogue, M. S. Patterson, and E. M. Sevick-Muraca, “Imaging of fluorescent yield and lifetime from multiply scattered light re-emitted from tissues and other random media,” Appl. Opt. 36, 2260’2272 (1997). [CrossRef]
9. M. Schweiger, S. R. Arridge, and D. T. Delpy, “Application of the finite-element method for the forward and inverse models in optical tomography,” J. Math. Img. Vision 3, 263’283 (1993). [CrossRef]
10. J. J. McKeown ,“On algorithms for sums of squares problems,” Towards global optimization, edited by L. C. W. Dixon and G. P. Szeeg, (North-Holland Amsterdam, Holland, 1975).
11. M. T. Vespucci, “An efficient code for the minimization of highly nonlinear and large residual least squares functions,” Optimization 18, 825’855 (1987). [CrossRef]
12. R. R. Meyer, “Theoretical and computational aspects of nonlinear regression,” Nonlinear Programming, eds. J. B. Rosen, O. L. Mangasarian, and K. Ritter, (Academic Press, New York, 1970).
13. L. B. Rall, Automatic differentiation: Techniques and application, Lecture notes in computer science (Springer Verlag, 1981) p. 120. [CrossRef]
14. A. Griewank, “On automatic differentiation,” edited M. Iri and K. Tanaka, Mathematical programming: Recent developments and application, (Kluwer Academic Publishers, 1989) pp 83’108.
15. T. L. Troy, D. L. Page, and E. M. Sevick-Muraca, “Optical properties of normal and diseased breast tissues: prognosis for optical mammography,” J. Biomed Opt. 1, 342’355 (1996). [CrossRef]
16. E. M. Sevick-Muraca, G. Lopez, T. L. Troy, J. S. Reynolds, and C. L. Hutchinson, “Fluorescence and absorption contrast mechanisms for biomedical optical imaging using frequency-domain techniques,” Photochem. and Photobio. 6655’64 (1997). [CrossRef]
17. M. A. O’Leary, D. A. Boas, B. Chance, and A.G. Yodh, “Fluorescence lifetime imaging in turbid media,” Opt. Lett. 21, 158’160 (1996). [CrossRef]
18. J. Chang, H. L. Graber, and R.L. Barbour, “Luminescence optical tomography of dense scattering media,” J. Opt. Soc. Am. A. 14, 288’299 (1997). [CrossRef]
19. T. L. Troy and E. M. Sevick-Muraca, “Fluorescence lifetime imaging and spectroscopy in random media,” in Applied Fluorescence in Chemistry, Biology, and Medicine, Rettig, Strehmel, Shrader, and Seifert, eds., Springer Verlag, pp. 3’36 (1999). [CrossRef]
20. H. Jiang, “Frequency-domain fluorescent diffusion tomography: a finite-element based algorithm and simulations,” Appl. Opt. 37, 5337’5343 (1998). [CrossRef]
21. Chang, H. L. Graber, and R.L. Barbour, “Improved reconstruction algorithm for luminescence optical tomography when background luminophore is present,” Appl Opt. 37 , 3547’3552 (1998). [CrossRef]
22. J. Lee and E. M. Sevick-Muraca, “Lifetime and absorption imaging with fluorescence FDPM,” Time-resolved fluorescence spectroscopy and imaging in tissues, E. M. Sevick-Muraca (ed.)., Proc. Soc. Photo-Opt. Instrum. Eng., 3600: (to be published), (1999).
23. A. Ishimaru, Wave propagation and scattering in random media,(Academic Press, New York, 1978).
24. M. Schweiger, S. R. Arridge, M. Hiraka, and D. T. Delpy, “The finite-element method for the propagation of light in scattering media- boundary and source conditions,” Med. Phys. 22, 1779’1792 (1995) [CrossRef]
25. R. A. J. Groenhuis, H. A. Ferwerda, and J. J. Ten Bosch, “Scattering and absorption of turbid material determined from reflection measurements,” Appl. Opt. 22, 2456’2462 (1983). [CrossRef]
26. O. C. Zienkiewcz and R. L. Taylor,The finite element methods in engineering science, (McGraw-Hill, New York, 1989).
27. L. C. W. Dixon and R. C. Price, “Numerical experience with the truncated Newton method for unconstrained optimization,” JOTA , 56, 245’255 (1988). [CrossRef]
28. R. Roy, Image reconstruction from light measurements on biological tissue, Ph. D. thesis, University of Hertfordshire, England,(1996).
29. R. S. Dembo and T. Steihaug, “Truncated Newton algorithms for large-scale unconstrained optimization,” Math Programming 26, 190’212 (1983). [CrossRef]
30. R. C. Price, Sparse matrix optimization using automatic differentiation, Ph. D. thesis, University of Hertfordshire, U. K., (1987).
31. L. Armijo “Minimization of functions having Lipschitz continuous first partial derivatives,” Pacific J. Mathematics 16, 1’3 (1966).
32. P. Wolfe, “Convergence condition for ascent method,” SIAM Rev. , 11226’253 (1969). [CrossRef]
33. B. Christianson, A. J. Davies, L. C. W. Dixon, R. Roy, and P. van der Zee, “Giving reverse differentiation a helping hand,” Opt. Meth. And Software 8, 53’67 (1997). [CrossRef]
34. A. J. Davies, B. Christianson, L. C. W. Dixon, R. Roy, and P. van der Zee, ”Reverse differentiation and the inverse diffusion problem,” Adv. In Eng. Software 28, 217’221 (1997). [CrossRef]
35. R. E. Wengert, “A simple automatic derivative evaluation program,” Comm. A. C. M. , 7, 463’464 (1964).