## Abstract

Fluorescence lifetime imaging microscopy (FLIM) is a powerful imaging tool used to study the molecular environment of flurophores. In time domain FLIM, extracting lifetime from fluorophores signals entails fitting data to a decaying exponential distribution function. However, most existing techniques for this purpose need large amounts of photons at each pixel and a long computation time, thus making it difficult to obtain reliable inference in applications requiring either short acquisition or minimal computation time. In this work, we introduce a new nonparametric empirical Bayesian framework for FLIM data analysis (NEB-FLIM), leading to both improved pixel-wise lifetime estimation and a more robust and computationally efficient integral property inference. This framework is developed based on a newly proposed hierarchical statistical model for FLIM data and adopts a novel nonparametric maximum likelihood estimator to estimate the prior distribution. To demonstrate the merit of the proposed framework, we applied it on both simulated and real biological datasets and compared it with previous classical methods on these datasets.

© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

Fluorescence lifetime imaging microscopy (FLIM) is a widely used technique to reveal the changes in fluorophores’ local environments by measuring fluorophores’ lifetime [1,2]. The application of FLIM includes, but is not limited to, measuring local environmental parameters within cells such as pH or oxygenation state, studying protein interactions by quantifying Förster resonance energy transfer (FRET), and investigating the metabolic state of cells [2]. In particular, due to noninvasiveness and high-resolution, FLIM has been used to monitor the dynamic change in metabolic state of living cells by measuring lifetime of auto-fluorescent properties of reduced nicotinamide adenine dinucleotide (NADH) and flavin adenine dinucleotide (FAD) in cancer research [2–4].

To investigate and compare different types of cells/tissues, the typical analysis workflow for FLIM data follows a two-step procedure [3–5]: 1) pixel-wise lifetime recovery at each pixel: the lifetime of each component and component contribution are extracted from fluorescence signal by fitting data to a single/double decaying exponential distribution function [6–9]; 2) integral property inference: one or several summary statistics of each sample are calculated from all pixel-wise estimations of the previous step, e.g. the mean or standard deviation of lifetime or component contribution. The pixel-wise fitted lifetime and these summary statistics are then used to investigate the spatial change within each sample and the difference across groups of samples, respectively.

To infer pixel-wise lifetime, numerous exponential curve fitting approaches have been proposed [6–18]. Due to easy implementation, pixel-wise analysis has been arguably the most widely used strategy for pixel-wise lifetime recovery, including least-squares fitting and maximum likelihood estimation (MLE) approaches [8–10]. One main obstacle for pixel-wise analysis is that it requires a large number of photons per pixel [19], resulting in long photon collection time, usually more than tens of seconds for the whole image. This time requirement for photon collection prohibits FLIM to be used for acquisition at higher speeds [20]. Despite the recent improvement in fast detector [21], one of the most commonly used computation strategies to alleviate this issue is global analysis, which estimates global fluorescence lifetime by using photons across all pixels and then calculates pixel-wise component contribution [7,12,13]. Although global analysis might provide more robust estimation in low-photon regime, it brings irreversible bias for pixel-wise lifetime estimation due to neglect of spatial change in fluorescence lifetime. Therefore, there is a need for more robust pixel-wise lifetime fitting algorithms that work for low-photon regimes.

On the other hand, the goal of integral property inference in the classical workflow is different from pixel-wise lifetime recovery because only summary information is needed in this step. As described above, the most common way is direct calculation from pixel-wise recovered lifetime. However, this way requires reliable estimation of pixel-wise lifetime, which usually needs many photons at each pixel as we previously discussed. Moreover, it usually takes long computation time, which brings difficulty to analysis in real time monitoring [4] and large scale experiments, especially when there are thousands of datasets to compare in high-throughput screenings [7,22]. The main difficulty lies in the pixel-wise fitting step, as pixel-wise lifetime recovery needs a large number of photons per pixel and thousands of iterative instrumental response function deconvolutions. Therefore, a natural question arises: can we just conduct integral property inference directly and bypass the pixel-wise lifetime recovery step? In this paper, we show this is feasible.

Motivated by these two needs, we introduce a new Nonparametric Empirical Bayesian framework for analyzing FLIM data, referred as NEB-FLIM, to improve both pixel-wise lifetime recovery and integral property inference in the classical workflow. Specifically, we introduce a hierarchical statistical model for FLIM data by assuming that the fluorescence lifetime at each pixel is drawn from some prior distribution. Under this hierarchical model, NEB-FLIM first adopts a non-parametric maximum likelihood estimator (NPMLE) to estimate the prior distribution by using all photons of the image. This estimated prior distribution is then incorporated into subsequent bayesian analysis for pixel-wise lifetime recovery. Through this, NEB-FLIM provides a more accurate and pixel-dependent estimation of fluorescence lifetime. NEB-FLIM uses a plugin estimator of previously estimated prior distribution to conduct integral property inference directly, instead of summarizing from pixel-wise recovered lifetime. In doing so, summary statistics can be computed in a much more computationally efficient and more robust fashion. Thus, it allows its use in applications when low acquisition or computation time is required.

## 2. Methods

In this section, we introduce a hierarchical statistical model for FLIM data in Section 2.1, the nonparametrical estimator of prior distribution in Section 2.2, the pixel-wise bayesian estimator in Section 2.3, and the method for integral property inference in Section 2.4.

#### 2.1 Statistical model for photon-counting FLIM data

In this section, we introduce a statistical model for photon-counting time-domain FLIM data, which is collected by a time-correlated single photon counting system (TCSPC) [1,2]. The form of this statistical model is different from commonly-used physical exponential decay models for fluorophores [1,2], but they are equivalent in terms of data analysis. We adopt this model because it is more convenient for statistical analysis. In the end of this section, we compare these two equivalent models and point out their connections.

In light of fluorophores’ properties [1,2], the decay of the fluorescence intensity follows an exponential decay law $F(t)=F_0e^{-{t/\tau }}$, where $F(t)$ is fluorescence intensity at time $t$, $F_0$ is fluorescence intensity at time $t=0$, and $\tau$ is defined as lifetime of fluorescence, the main parameter of interest in this article. Thus, our statistical model assumes each photon emitted by fluorophores obeys the following exponential distribution $\tau ^{-1}e^{-{t/\tau }}$ when $t>0$. To measure $\tau$ by TCSPC, a pulsed laser is used to excite the sample repeatedly with time period $T_p$, and only the first photon within every period is recorded. Thus, if the photons emitted from previous periods are taken into account, the probability distribution can be expressed as

In the above statistical model, we focus on the situation where all fluorescence distribution have the same lifetime, i.e. mono-exponential component model. In a lot of applications, the fluorescence distribution shows the status of the fluorophore, its confirmations and interactions with its local micro-environment [2]. For example, NADH has different fluorescence lifetime when it is bound and unbound to proteins [23]. In such situations, the decay of the fluorescence intensity follows a double exponential decay law $F(t)=F_0\left (A_1e^{-{t/\tau _1}}+A_2e^{-{t/\tau _2}}\right )$, where $A_k$ is the fraction of the $k$th component, also called component contribution, such that $A_1+A_2=1$ and $\tau _k$ is lifetime of the $k$th component. For convenience of statistical analysis, our statistical model assumes each photon follows a mixture of exponential distributions, i.e.

Following the same conduction in setting of a single type of fluorescence, the distribution function of arriving photon times is thus a mixture distribution $f(t)= af_{\tau _1}(t)+(1-a)f_{\tau _2}(t)$, where $f_{\tau _k}(t)$ has the same form of distribution in Eq. (1). Besides $\boldsymbol {\tau }:=(\tau _1,\tau _2)$, the contribution of each component $a$ is of more interest in many applications. Hence, the lifetime analysis of double exponential components model aims to recover $a$ and $\boldsymbol {\tau }$ from observations $\mathbf {N}=(N_1,\ldots ,N_m)$, which is drawn from $\textrm {Multi}(\sum _{j=1}^mN_j,a\mathbf {P}_{(\tau _1)}+(1-a)\mathbf {P}_{(\tau _2)})$.

To reflect the spatial trend of fluorescence lifetime, the arrival time of photons are recorded at each pixel $i\in {\mathbb {I}}$ through microscopy scanning techniques. More specifically, the observed data at each pixel $i\in {\mathbb {I}}$ is a histogram of photon counts $\mathbf {N}_i=(N_{i1},\ldots ,N_{im})$, and the goal is to study the pixel-wise fluorescence lifetime $\boldsymbol {\tau }_i=(\tau _{i1},\tau _{i2})$ and pixel-wise statistical component contribution $a_i$ of each pixel from the pixel-wise observations $\mathbf {N}_i$s. In other words, the FLIM data can be seen generated from thousands of parallel double exponential models. Figure 1 illustrates the data structure of fluorescence-lifetime imaging microscopy (FLIM). In order to analyze data from all pixels jointly, we further assume $\tau _{i1}$ and $\tau _{i2}$ are independently drawn from two prior distributions: $\pi _1(t)$ and $\pi _2(t)$. The prior distribution $\pi _1(t)$ and $\pi _2(t)$ can be also seen as empirical distribution of $\boldsymbol {\tau }_i$

#### 2.2 Estimation of prior distribution

In most traditional bayesian FLIM analysis methods, the prior distribution of lifetime is usually a predetermined distribution, which is either manually input or uninformative prior [14,15,17,18]. These subjective prior distribution leads to unavoidable bias when misspecified. Thus, we opt to estimate prior distributions by maximizing marginal likelihood distribution.

The model of FLIM data defined in Section 2.1 suggests that, if we pool all photons across pixels of images together, these photons can be seen drawn from a single mixture model

The model in (3) can be written in its equivalent form

To be specific, we assume the support of distribution $\pi ^\ast (t)$ belongs to some known interval $[T_L,T_U]$ and divide $[T_L,T_U]$ into $L$ equal-spaced interval with $L+1$ points $T_L=h_0<\ldots <h_L=T_U$. This bounded support assumption is suitable in most applications, as the knowledge of an roughly lifetime is available in advance. With this grid $h_0,\ldots ,h_L$, we can discretize the distribution $\pi ^\ast (t)$ as a $L$ dimension discrete distribution

where $p_l^\ast =\int _{h_{l-1}}^{h_l}d\pi ^\ast (t)$. To recover $\pi _\Delta ^\ast (t)$, it is sufficient to recover $\boldsymbol {p}^\ast =(p_1^\ast ,\ldots ,p_L^\ast )$. After discretization, the likelihood function of marginal distribution of $\mathbf {M}$ can be written asAfter estimating $\hat {\pi }^{\ast }(t)$, we now segment this distribution to recover $\pi _1(t)$ and $\pi _2(t)$. Generally, it is impossible to recover $\pi _1(t)$ and $\pi _2(t)$ by $\hat {\pi }^{\ast }(t)$ alone because they are not identifiable if there is overlapping area between them. To address this issue, we appeal to the observation that the two prior distributions can be separated very well in many FLIM applications. For example, the two components of NADH, bound and unbound, have lifetimes of roughly $400$ to $500$ picosecond(ps) and $2000$ to $2500$ ps, respectively [3]. Another example in FRET quantification is NowGFP, an improved version of green fluorescent protein. Its two components, close to and far away from acceptor such as mRuby2 or tdTomato, have lifetimes of roughly $2000$ to $3000$ ps and $5000$ ps [28]. Thus, we shall assume $\pi _1(t)$ from $\pi _2(t)$ are identifiable in the following sense

Motivated by this identification assumption (7), we segment $\hat {\pi }^{\ast }(t)$ by minimizing intra-component variance, equivalently maximizing inter-component variance.To be specific, the segmentation threshold $t_T$ can be seen as the solution of the below optimization problem

#### 2.3 Pixel-wise Bayesian analysis

In this section, we show the pixel-wise lifetime recovery benefits from accurate estimated prior distribution as well. To incorporate the estimated prior distribution, we opt to adopt the empirical bayesian framework [26,29–31] to analyze FLIM photon counting data. Under the hierarchical model defined in Section 2.1, the posterior distribution of $\boldsymbol {\tau }_i,a_i$ can be written as

Here, we consider the maximum of posterior distribution as our estimator after plugging in the prior distribution we estimated in Section 2.2. To be specific, the maximum of posterior distribution can be obtained by optimizing the following function

One challenge of EM algorithm in practice is the choice of initial values, $\tau ^{(0)}_{i1}$, $\tau ^{(0)}_{i2}$, and $a^{(0)}_{i}$, as different initial values might lead to different local optimum points. Fortunately, the estimated prior distribution could provide a good guidance for good choices of initial values because the support of $\hat {\pi }_1(t)$ and $\hat {\pi }_2(t)$ can allow us to narrow the search region down. More specifically, we choose $\tau ^{(0)}_{i1}$ as $5\%$ lower quantile of $\hat {\pi }_1(t)$, $\tau ^{(0)}_{i2}$ as $5\%$ upper quantile of $\hat {\pi }_2(t)$, and $a^{(0)}_{i}$ as the estimation $\hat {a}^{\ast }$. When the support of prior distribution lies in a small region, the EM algorithm can be accelerated a lot based on the above choices of initial values. Another challenge of the EM algorithm is slow convergence speed in practice. To accelerate the EM algorithm, we also adopt the acceleration scheme in [34].

#### 2.4 Integral property inference

Different from pixel-wise lifetime recovery, integral property inference aims to estimate/test a functional of pixel-wise parameters. Under the hierarchical model in Section 2.1, a functional of pixel-wise parameters can be written as a functional of prior distribution. Thus, we consider the estimation of linear functional of prior distribution in this section. To be specific, for any given function $g(t)$ defined on $[T_L,T_U]$, the goal is to estimate the following linear functional

Most summarized statistics of interest in FLIM studies can be written in the combination form of linear functional. For example, the mean and variance of lifetime of the $k$th component $\tau _k^\ast$ and $v(\tau _k)$ can be written asTo summarize the pixel-wise information, the most commonly used estimator in practice for $F_k(g)$ is plugin estimator of pixel-wise fitted lifetime in the last section

To illustrate the idea of NPMLE plug-in estimator, we show the explicit expression of five commonly used summarized statistics: mean of lifetime $\tau _1^\ast$ and $\tau _2^\ast$, mean of physical contributions $A_1^\ast$ and $A_2^\ast$, and mean of average lifetime

#### 2.5 Practical considerations

After we combine all components introduced in the previous sections, the new non-parametric empirical bayesian framework for FLIM data (NEB-FLIM) is summarized in Fig. 3. In the step of prior distribution estimation, the core component is the optimization problem in (6). After obtaining data $M_j$ for each bin, the optimization problem in (6) is ready to be solved by ‘REBayes’ R package [35]. To segment the estimated prior distribution, we calculate the object function (8) at each $h_l$, $l=1,\ldots ,L$ and take $h_l$ achieving the maximum of them as the cutting threshold $t_T$.

After estimating the prior distribution, the estimated prior distribution can be then used to conduct integral property inference or pixel-wise Bayesian analysis. The integral property inference can be completed by common used numerical integration algorithms. For simplicity, we just take $\sum _{l=1}^Lg(h_l)p_l$ as $\hat {F}^\textrm {NEB}_{k}(g)$ for any function $g$ and $k=1,2$. Here, $p_l$ is the probability mass of $\hat {\pi }_k(t)$ at $l$th bin between $h_{l-1}$ and $h_l$. The pixel-wise Bayesian analysis is implemented by EM algorithm as we described in previous section. We adopt scheme of [34] to accelerate the EM algorithm and stop the iteration when the object function is increased less than $10^{-2}$ (or any small number) in one iteration or number of iteration reaches some maximum number.

In this NEB-FLIM framework, there are mainly three tuning parameters: the lower bound of lifetime $T_L$, the upper bound of lifetime $T_U$ and the number of intervals $L$. $T_L$ and $T_U$ are chosen according to the specific application. The choice of $L$ is very important, as larger $L$ usually implies more accurate estimation of prior distribution, but more computation time as well. We discuss the choice of $L$ in more details in the later section.

## 3. Results

We now conduct numerical experiments to demonstrate the merits of our nonparametric empirical bayesian FLIM analysis framework (NEB-FLIM) in this section.

#### 3.1 Simulation

The first simulation experiment we consider here is to assess the performance of prior distribution $\pi ^\ast (t)$ estimation. To this end, we simulated FLIM images on a $32\times 32$ square lattice ${\mathbb {I}}$ according to the model described in Section 2.1. We assumed the period of laser excitation, $T_p$, was 10000 ps (or 10 ns), ratio of background photons, $\alpha$, was $0.001$, and $[0,T_p]$ was divided into $m=256$ bins. The IRF $h$ we use in this experiment is gaussian distribution function with mean $1500$ ps and standard deviation $100$ ps. At each pixel $i\in {\mathbb {I}}$, we assumed there were two types of fluorescence, and $\mathbf {N}_i$ was randomly generated from the following bi-exponential model

The pixel-wise lifetime of both components $(\tau _{i1},\tau _{i2})$ and the contribution of the first component $a_{i}$ are shown in Fig. 4. For simplicity, the number of photons at each pixel $i\in {\mathbb {I}}$ is assumed to be equal, i.e. $n_i=n,\ \forall i\in {\mathbb {I}}$.In this simulation experiment, we compare the performance of prior distribution estimation at different numbers of photons per pixel $n$ and different numbers of intervals $L$ in NPMLE. To assess the performance, we calculate the $L_2$ distance between cumulative distribution of the true prior distribution $\pi ^{\ast }(t)$ and our estimator $\hat {\pi }^{\ast }(t)$

We designed the next two simulation experiments to compare NEB-FLIM and previous methods. In particular, we mainly focus on two of the most popular methods: pixel-wise analysis and global analysis. As mentioned before, pixel-wise analysis methods fit the exponential curve only by photons at each pixel [8–10]. In this simulation experiment, we only focus on likelihood based pixel-wise analysis, as it has been shown more efficient than other popular pixel-wise analysis methods [9]. The global analysis estimates lifetime of two components globally and then estimates the components’ contribution at each pixel [7,12,13]. The two experiments are designed to compare the performance in terms of pixel-wise lifetime recovery and integral property inference, respectively.

We now compare performance of pixel-wise lifetime recovery. To this end, we still followed the bi-exponential model and chose the same setting with the previous experiment. $L$ was chosen at $L=1400$. The performance of each method is assessed by the mean square error

We design the next experiment to assess performance of integral property inference. To be specific, we compare four different methods to estimate the mean of lifetime $\tau _1^\ast$ and $\tau _2^\ast$ defined in (11). The four methods we would like to compare are: direct integral property inference in NEB-FLIM(PI-NEB), mean of pixel-wise lifetime estimated by NEB-FLIM(PBA-NEB), mean of pixel-wise lifetime estimated by pixel-wise analysis(PA), and mean of pixel-wise lifetime estimated by global analysis(GA). To compare these methods, we follow the same settings of previous experiments, but consider different sample sizes per pixel: $n=50$, $100$, and $200$. For each $n$, the experiment was repeated 100 times, and for each time, we applied these four methods on the generated FLIM image. We assessed the performance of estimating $\tau _1^\ast$ and $\tau _2^\ast$ by evaluating square root of mean square error

where $H$ is total number of simulation experiments and $\hat {\tau }_{kh}^\ast$ is the estimation $\tau _k^\ast$ at the $h$th simulation experiment. The results are summarized in Table 1. As shown in Table 1, it is clear that direct integral property inference in NEB-FLIM(PI-NEB) has better performance than the other methods.In the last experiment, we evaluate different methods for integral property inference from computation efficiency angle. In particular, we followed the same bi-exponential model in previous experiments and simulated image on $32\times 32$, $64\times 64$ and $128\times 128$ square lattice. The number of photons at each pixel $n$ is chosen as $10^3$. We compare the computation time of 4 different methods to estimate the mean of lifetime $\tau _1^\ast$ and $\tau _2^\ast$: direct integral property inference in NEB-FLIM with $L=400$ and $800$ (NEB-400 and NEB-800), plugin estimator of pixel-wise lifetime estimated by pixel-wise analysis(PA), and plugin estimator of pixel-wise lifetime estimated by global analysis(GA). To make comparison fair, all the algorithms are implemented in R and evaluated in the same desktop (Intel Core i5 @3.4 GHz/16GB). The computing times of all algorithms are reported in Table 2, which is based on 10 runs for each image size. It is clear from Table 2 that direct integral property inference in NEB-FLIM is faster than the other two methods. Moreover, the speed of NEB-FLIM mainly relies on the choice of $L$, but not the image size. It is also worth noting that the computation speed may depend on choice the programming language, computing environment and specific implementation, so all these algorithms might be accelerated under other programming languages or implementation.

#### 3.2 Real data example

Finally, we consider a specific biological dataset examining the metabolic state of cancer/normal living cells by measuring lifetime of reduced nicotinamide adenine dinucleotide (NADH). FLIM has been shown to be able to distinguish between free and protein bound state of NADH, as the two states of NADH have different fluorescence lifetimes [3,36]. The first component refers to free NADH, and the second component refers to the protein-bound NADH. Higher contribution of free NADH and hence lower average lifetime value, $A_1\tau _1+A_2\tau _2$, has been found to correlate with higher glycolytic metabolism. Apart from NADH lifetime imaging, FLIM can also be used to visualize flavin adenine dinucleotide (FAD) lifetime for early detection of cancer and for other micro-environment measurement of viscosity, pH and others [4].

This FLIM data set includes NADH FLIM data of MDA-MB-231 breast cancer cells and MCF10A normal cells. The excitation source was a Ti:Sapphire laser (Spectra Physics; Maitai) tuned to wavelength of 740 nm. The excitation and emission were coupled through an inverted microscope (Nikon; Eclipse TE300) with a 20x objective (Nikon, Plan Fluor, N.A. 0.75). A 450/70-nm band-pass emission filter (Semrock, Rochester. NY) was also used to selectively collect the NADH fluorescence emission signal. For each type of cell, FLIM images were collected at 256x256 resolution at 4 different durations($20$, $60$, $120$, and $240$ seconds) using SPC-150 Photon Counting Electronics (Becker & Hickl GmbH, Berlin, Germany) and Hamamatsu H7422P-40 GaAsP photomultiplier tube (Hamamatsu Photonics, Bridgewater, NJ). Urea crystals were used to measure the Instrumental Response Function (IRF) with a 370/10 bandpass emission filter (Semrock, Rochester. NY). Emission intensity was checked by the photon counts after each imaging session to make sure there was no photobleaching or photodamage of the sample. For each duration and sample, the average numbers of photons per pixel, $\bar {n}$, are summarized in Table 3.

We estimated the prior distribution of lifetime $\pi ^{\ast }(t)$ by setting $T_L=2$, $T_U=4000$, and $L=500$. We applied NEB-FLIM to extract summarized information directly from prior distributions estimated by these 8 FLIM images. In particular, we estimated the mean of statistical component contribution $a^\ast$, mean of lifetime of the first component $\tau _1^\ast$, mean of lifetime of the second component $\tau _2^\ast$, mean of physical component contribution (after normalization) $A_1^\ast$, $A_2^\ast$ and mean of weighted averaged lifetime $\tau _m^\ast$. All these results are summarized in Table 3. To assess the potential uncertain brought by different field-of-views, we randomly chose regions of size $128\times 128$ from the original image and applied intergal property inference of NEB-FLIM to estimate mean of weighted averaged lifetime $\tau _m^\ast$ on each chosen region. The estimated weighted averaged lifetime $\tau _m^\ast$ and corresponding standard deviation are reported in Fig. 7, which are based on 100 runs for each combination of duration and cell type. Through Table 3 and Fig. 7, we can conclude that the integral property inference of NEB-FLIM is relatively stable with respect to the imaging time. In other words, NEB-FLIM provides a robust estimation even in low-photon regime. If we compare these two samples, the results suggest cancer cells MDA-MB-231 have a larger mean of physical component contribution $A_1$ and smaller weighted averaged lifetime $\tau _m$ than normal cell MCF10A cells. This discovery is consistent with results of previous experiments in [4]. The difference between NADH lifetime/cell metabolic state can be easily captured by our new method when the imaging time is 20s ($\sim$30 photons per pixel). It is also worth noting that the processing time of integral property inference of NEB-FLIM on each image is less than 1 second on a common desktop (Intel Core i5 @3.4 GHz/16GB).

To compare performance of pixel-wise curve fitting, we applied NEB-FLIM, pixel-wise analysis (maximum likelihood estimation based), and global analysis on these 8 FLIM images. In particular, we did $3\times 3$ binning at each pixel to accumulate more photons. To make the comparisons fair, the initial values of pixel-wise analysis and global analysis were also guided by the prior distribution as NEB-FLIM, although this way might improve the accuracy of pixel-wise analysis and global analysis. Due to space limits, we only showed the physical component contribution $A_{2}$ and weighted average lifetime $\tau _m$, which are shown in Fig. 8 and Fig. 9 (top right corresponds to MCF10A cells and bottom left corresponds to MDA-MB-231 cells). To summarize the fitting result of estimated lifetime, we also made density plot of pixel-wise estimated lifetime of MCF10A Cell in Fig. 10. Through Fig. 8, Fig. 9 and Fig. 10, our proposed NEB-FLIM framework behaves almost the same with pixel-wise analysis when the number of photons is large (imaging time 240s). Furthermore, if we regard the result of imaging time 240s as a benchmark, we could see that the performance of our NEB-FLIM framework is better than the other two methods when the imaging time is short (e.g. 60s).

Finally, we compare the results of property inference (just as in the third simulation experiment). The lifetimes of images with imaging time 20s and 240s are summarized in Table 4. We followed the same procedure in Fig. 7 to evaluate the potential uncertain arose from choices of field-of-views, which is summarized in Fig. 11. It is clear that all methods can detect the difference of NADH lifetime between two types of cells. However, if we regard the recovery results of pixel-wise analysis (PA) when the imaging time is 240s as the benchmark, the performance of PI-NEB is better than the other methods when the imaging time is 20s (see e.g. MCF10A cell), especially better than pixel-wise analysis, the most popular method. This suggests that NEB-FLIM proposed in this article is able to recover more accurately summarized information and tell subtle differences between cells in both high and low photon regimes.

## 4. Discussion and conclusion

In this paper, we propose a new empirical bayesian framework for fluorescence lifetime imaging microscopy data (NEB-FLIM). Different from previous analysis workflows, our new NEB-FLIM framework first estimates the prior distribution of lifetime non-parametrically by using all photons across the whole image. This empirical prior distribution can either be used to conduct integral property inference directly or be incorporated into bayesian analysis to fit an exponential curve at each pixel. Through this method, the summarized information can be estimated very accurately and efficiently computationally. This leads to its potential usage in applications of FLIM requiring either short acquisition or computation times, such as when previewing the lifetime status of cells/tissues before formal analysis and real-time fluorescence lifetime tracking. Due to incorporation of this empirical distribution, the pixel-wise lifetime recovered by NEB-FLIM combines both global and local information, allowing more robust quantification of lifetime at each pixel.

In this presented paper, we only focus on NEB-FLIM framework within the context of a pixel-wise double exponential lifetime model. However, NEB-FLIM, as a generalized framework, can be extend to multiple exponential lifetime models at each pixel. If we assume there is a large gap between different components of lifetime, we can still apply NEB-FLIM to estimate prior distributions by replacing the binary segmentation method with some clustering method which segments the prior distribution into multiple pieces.

The key component to estimate the prior distribution in NEB-FLIM framework is the deconvolution problem in (5). In NEB-FLIM, we adopt linear programming to solve it after data collection. On the other hand, when data becomes available in a sequential order, this deconvolution problem is still solvable if we adopt some online learning algorithm. In other words, we can estimate the prior distribution at the same time as data acquisition. The prior distribution estimation and integral property inference can be completed just after data collection.

## Funding

U.S. Department of Energy (DE-SC0019013, Morgridge Institute for Research).

## Acknowledgments

We thank Dr. Ellen T. Arena of the University of Wisconsin-Madison for careful reading of an earlier draft that has led to improved presentation.

## Disclosures

The authors declare that there are no conflicts of interest related to this article.

## References

**1. **W. Becker, * Advanced Time-correlated Single Photon Counting Techniques*, vol. 81 (Springer, 2005).

**2. **W. Becker, * Advanced Time-correlated Single Photon Counting Applications*, vol. 111 (Springer, 2015).

**3. **D. K. Bird, L. Yan, K. M. Vrotsos, K. W. Eliceiri, E. M. Vaughan, P. J. Keely, J. G. White, and N. Ramanujam, “Metabolic mapping of MCF10A human breast cells via multiphoton fluorescence lifetime imaging of the coenzyme NADH,” Cancer Res. **65**(19), 8766–8773 (2005). [CrossRef]

**4. **A. J. Walsh, R. S. Cook, H. C. Manning, D. J. Hicks, A. Lafontant, C. L. Arteaga, and M. C. Skala, “Optical metabolic imaging identifies glycolytic levels, subtypes, and early-treatment response in breast cancer,” Cancer Res. **73**(20), 6164–6174 (2013). [CrossRef]

**5. **M. C. Skala, K. M. Riching, D. K. Bird, A. Gendron-Fitzpatrick, J. Eickhoff, K. W. Eliceiri, P. J. Keely, and N. Ramanujam, “In vivo multiphoton fluorescence lifetime imaging of protein-bound and free nicotinamide adenine dinucleotide in normal and precancerous epithelia,” J. Biomed. Opt. **12**(2), 024014 (2007). [CrossRef]

**6. **S. Pelet, M. Previte, L. Laiho, and P. So, “A fast global fitting algorithm for fluorescence lifetime imaging microscopy based on image segmentation,” Biophys. J. **87**(4), 2807–2817 (2004). [CrossRef]

**7. **S. C. Warren, A. Margineanu, D. Alibhai, D. J. Kelly, C. Talbot, Y. Alexandrov, I. Munro, M. Katan, C. Dunsby, and P. French, “Rapid global fitting of large fluorescence lifetime imaging microscopy datasets,” PLoS One **8**(8), e70687 (2013). [CrossRef]

**8. **K. Santra, E. A. Smith, J. W. Petrich, and X. Song, “Photon counting data analysis: Application of the maximum likelihood and related methods for the determination of lifetimes in mixtures of rose bengal and rhodamine b,” J. Phys. Chem. A **121**(1), 122–132 (2017). [CrossRef]

**9. **K. Santra, J. Zhan, X. Song, E. A. Smith, N. Vaswani, and J. W. Petrich, “What is the best method to fit time-resolved data? a comparison of the residual minimization and the maximum likelihood techniques as applied to experimental time-correlated, single-photon counting data,” J. Phys. Chem. B **120**(9), 2484–2490 (2016). [CrossRef]

**10. **M. Maus, M. Cotlet, J. Hofkens, T. Gensch, F. C. De Schryver, J. Schaffer, and C. Seidel, “An experimental comparison of the maximum likelihood estimation and nonlinear least-squares fluorescence lifetime analysis of single molecules,” Anal. Chem. **73**(9), 2078–2086 (2001). [CrossRef]

**11. **D. A. Turton, G. D. Reid, and G. S. Beddard, “Accurate analysis of fluorescence decays from single molecules in photon counting experiments,” Anal. Chem. **75**(16), 4182–4187 (2003). [CrossRef]

**12. **P. J. Verveer and P. Bastiaens, “Evaluation of global analysis algorithms for single frequency fluorescence lifetime imaging microscopy data,” J. Microsc. **209**(1), 1–7 (2003). [CrossRef]

**13. **P. R. Barber, S. M. Ameer-Beg, J. D. Gilbey, R. J. Edens, I. Ezike, and B. Vojnovic, “Global and pixel kinetic data analysis for FRET detection by multi-photon time-domain FLIM,” Proc. SPIE **5700**, 171–181 (2005). [CrossRef]

**14. **P. Barber, S. Ameer-Beg, S. Pathmananthan, M. Rowley, and A. Coolen, “A bayesian method for single molecule, fluorescence burst analysis,” Biomed. Opt. Express **1**(4), 1148–1158 (2010). [CrossRef]

**15. **M. I. Rowley, P. R. Barber, A. C. Coolen, and B. Vojnovic, “Bayesian analysis of fluorescence lifetime imaging data,” Proc. SPIE **7903**, 790325 (2011). [CrossRef]

**16. **J. Kim, J. Seok, H. Lee, and M. Lee, “Penalized maximum likelihood estimation of lifetime and amplitude images from multi-exponentially decaying fluorescence signals,” Opt. Express **21**(17), 20240–20253 (2013). [CrossRef]

**17. **M. I. Rowley, A. Coolen, B. Vojnovic, and P. R. Barber, “Robust bayesian fluorescence lifetime estimation, decay model selection and instrument response determination for low-intensity FLIM imaging,” PLoS One **11**(6), e0158404 (2016). [CrossRef]

**18. **B. Kaye, P. J. Foster, T. Yoo, and D. J. Needleman, “Developing and testing a bayesian analysis of fluorescence lifetime measurements,” PLoS One **12**(1), e0169337 (2017). [CrossRef]

**19. **M. Köllner and J. Wolfrum, “How many photons are necessary for fluorescence-lifetime measurements?” Chem. Phys. Lett. **200**(1-2), 199–204 (1992). [CrossRef]

**20. **M. Raspe, K. M. Kedziora, B. van den Broek, Q. Zhao, S. de Jong, J. Herz, M. Mastop, J. Goedhart, T. W. Gadella, I. T. Young, and K. Jalink, “siFLIM: single-image frequency-domain FLIM provides fast and photon-efficient lifetime data,” Nat. Methods **13**(6), 501–504 (2016). [CrossRef]

**21. **N. Krstajić, S. Poland, J. Levitt, R. Walker, A. Erdogan, S. Ameer-Beg, and R. K. Henderson, “0.5 billion events per second time correlated single photon counting using cmos spad arrays,” Opt. Lett. **40**(18), 4305–4308 (2015). [CrossRef]

**22. **C. Guzmán, C. Oetken-Lindholm, and D. Abankwa, “Automated high-throughput fluorescence lifetime imaging microscopy to detect protein–protein interactions,” J. Lab. Autom. **21**(2), 238–245 (2016). [CrossRef]

**23. **J. R. Lakowicz, * Principles of Fluorescence Spectroscopy* (Springer, 2006).

**24. **J. Kiefer and J. Wolfowitz, “Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters,” Ann. Math. Stat. **27**(4), 887–906 (1956). [CrossRef]

**25. **B. G. Lindsay, “The geometry of mixture likelihoods: a general theory,” Ann. Statist. **11**(1), 86–94 (1983). [CrossRef]

**26. **W. Jiang and C. Zhang, “General maximum likelihood empirical bayes estimation of normal means,” Ann. Statist. **37**(4), 1647–1684 (2009). [CrossRef]

**27. **R. Koenker and I. Mizera, “Convex optimization, shape constraints, compound decisions, and empirical bayes rules,” J. Am. Stat. Assoc. **109**(506), 674–685 (2014). [CrossRef]

**28. **B. G. Abraham, K. S. Sarkisyan, A. S. Mishin, V. Santala, N. V. Tkachenko, and M. Karp, “Fluorescent protein based fret pairs with improved dynamic range for fluorescence lifetime measurements,” PLoS One **10**(8), e0134436 (2015). [CrossRef]

**29. **H. Robinns, “Asymptotically subminimax solutions of compound decision problems,” in Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, vol. 1950, (1951), pp. 131–148.

**30. **C. Zhang, “Compound decision theory and empirical bayes methods,” Ann. Statist. **31**(2), 379–390 (2003). [CrossRef]

**31. **B. Efron, “Two modeling strategies for empirical bayes estimation,” Statist. Sci. **29**(2), 285–301 (2014). [CrossRef]

**32. **B. Kleijn and A. Van der Vaart, “The bernstein-von-mises theorem under misspecification,” Electron. J. Stat. **6**, 354–381 (2012). [CrossRef]

**33. **A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” J. Royal Stat. Soc. Ser. B (methodological) **39**(1), 1–22 (1977). [CrossRef]

**34. **R. Varadhan and C. Roland, “Simple and globally convergent methods for accelerating the convergence of any EM algorithm,” Scand. J. Stat. **35**(2), 335–353 (2008). [CrossRef]

**35. **R. Koenker and J. Gu, “Rebayes: an r package for empirical bayes mixture methods,” Tech. Rep., Journal of Statistical Software **82**(8), 1–30 (2017). [CrossRef]

**36. **J. V. Chacko and K. W. Eliceiri, “Autofluorescence lifetime imaging of cellular metabolism: Sensitivity toward cell density, ph, intracellular, and intercellular heterogeneity,” Cytometry, Part A **95**(1), 56–69 (2019). [CrossRef]