## Abstract

We present a filtering procedure based on singular value decomposition to remove artifacts arising from sample motion during dynamic full field OCT acquisitions. The presented method succeeded in removing artifacts created by environmental noise from data acquired in a clinical setting, including in vivo data. Moreover, we report on a new method based on using the cumulative sum to compute dynamic images from raw signals, leading to a higher signal to noise ratio, and thus enabling dynamic imaging deeper in tissues.

© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

Optical coherence tomography (OCT) is routinely used for 3D imaging of microstructures in tissue and relies on the endogenous backscattering contrast [1, 2]. Full-field optical coherence tomography (FFOCT) is an *en face*, high transverse resolution version of OCT [3, 4]. Using a camera and an incoherent light source, FFOCT acquires *en face* virtual sections of the sample at a given depth and has been used for biology [5] and medicine [6]. Recently, a novel contrast mechanism has been exploited by measuring temporal fluctuations of the backscattered light in a technique called dynamic full field OCT (D-FFOCT) [7]. In ex vivo fresh tissues, these dynamic measurements reveal subcellular structures that are very weak back scatterers and provide contrast based on their intracellular motility [8, 9]. A similar technique is used in regular OCT for retinal angiography called OCTA where speckle variance is analyzed on several B-Scans (typically 8 frames) to produce binary images of the retinal vasculature [10]. Due to the high spatial resolution ($<1\text{}\mu m$) and the number of frames used in D-FFOCT (typically 512), our method is not only sensitive to capillaries but also to intracellular motility signals and produces a contrast that reveals living cells [11]. The penetration depth of D-FFOCT is typically ten times less than FFOCT due to the small cross-section of the moving scatterers leading to weak signals, limiting its use in thick samples. Up to now, using D-FFOCT for in vivo imaging has remained elusive as this technique is sensitive to nanometric axial displacements of the sample. The same problems arise for OCTA and several approaches have been developed to remove bulk motion of the eye [12, 13]. In this paper we propose two methods to overcome the aforementioned limitations.

First, we introduce a framework based on the singular value decomposition (SVD) to filter out the axial displacement of the sample from the local fluctuations linked to intracellular motility, enabling in vivo use of D-FFOCT. SVD based algorithms have been previously applied to OCT data, e.g. for smart OCT where the SVD is applied to the reflection matrix in order to extend the penetration depth [14]. An SVD filtering method for D-FFOCT has been previously proposed for simulated data [15] but does not work on our experimental data mainly because the image formation model is different. Here we propose to find eigenvectors associated with axial motion and filter them out. Similar SVD based algorithms for spatio-temporal filtering have been used effectively in acoustics for Doppler acquisitions [16, 17]. In each case the goal is to use the SVD to transform the initial data in a new basis that is more suitable for filtering and identifying outliers. As opposed to [16, 17], our approach reconstructs the signals in the initial space before computing the dynamic image rather than constructing the image in the SVD space. The main advantage of using the SVD rather than Fourier analysis here is that the filter adapts to the data set which exhibits different amounts of artifacts with random patterns.

Secondly, we present a new operator to compute the local dynamics based on the cumulative sum in order to enhance the non-stationary part of the signal, leading to a great increase in the signal to noise ratio (SNR).

Finally, we report on the first D-FFOCT acquisition in vivo to image the mouse liver at $80\text{}\mu m$ depth where the two proposed algorithms greatly improved the image quality by removing motion artifacts and increasing the SNR by a factor of 3.

## 2. Removing artifacts using SVD

In order to construct a D-FFOCT image, a stack of typically 512 direct images (1440 × 1440 pixels) is acquired with a standard FFOCT using our custom software [18]. The FFOCT setup consists of a Linnik interferometer where both arms contain identical microscope objectives. The reference arm contains a silicon mirror mounted on a piezoelectric translation (PZT) used for phase modulation. In a typical FFOCT experiment, at least two images are acquired with different phase modulations and the FFOCT image is constructed by using appropriate phase-shifting algorithms [4, 11]. For D-FFOCT experiments the PZT position is not modulated, fluctuations arise by scatterers motions inside the coherence volume. In this paper we used data acquired from two different setups. The first one is a laboratory setup shown in Fig. 1 and the second one is a LightCT commercial setup manufactured by LLTech SAS. The characteristics of both of these setups are summarized in Table 1. In the first report on D-FFOCT, the level of the dynamic signal at each pixel was computed using a running standard deviation averaged over the whole acquisition [7] so that each pixel is processed independently. Calculating the standard deviation of the signal in time removes highly scattering stationary structures such as collagen or myelin fibers and reveals cells with a much better contrast. Indeed, strongly backscattering structures can dominate the signal even outside the coherence volume thereby masking weakly scattering structures such as cells. In the laboratory, we succeeded in stabilizing the setup by mounting it on a sturdy optical bench, and carried out ex vivo experiments without motion artifacts. For real life applications however, D-FFOCT devices are currently being used by clinicians in hospitals for imaging biopsied tissues [19] and by an atomo-pathologists in busy environments with vibrations arising from vibrational modes of the building, from people walking around the device and from air conditioning. Mechanical vibrations can lead to sample arm motion or oscillations, creating strong signal fluctuations, especially from highly reflective structures such as collagen fibers.

#### 2.1. Motion artifact model

The intensity recorded by the camera is the sum of the backscattered light from both the sample and the reference arm [11]:

*t*,

*η*is the camera quantum efficiency,

*I*

_{0}is the power LED output impinging on the interferometer considering a 50/50 beam-splitter, ${R}_{ref}$ is the reference mirror reflectivity (i.e. the power reflection coefficient), $R\left(\mathbf{r},t\right)$ is the sample reflectivity (i.e. the power reflection coefficient) at position

**r**and time

*t*, $\mathrm{\Delta}\varphi \left(\mathbf{r},t\right)$ is the phase difference between the reference and sample back-scattered signals at position

**r**and time

*t*, ${I}_{incoh}={R}_{inc}{I}_{0}/4$ is the incoherent light back-scattered by the sample on the camera, mainly due to multiple scattering and reflections out of the coherence volume. The dynamic signal is computed as the average of the running temporal standard deviation and the processed dynamic signal can be written:

*SD*is the standard deviation operator,

*N*is the total number of sub-windows,

*τ*is the sub-windows length so that ${t}_{\left[i,i+\tau \right]}$ is the time corresponding to one sub-window,

*R*and $\mathrm{\Delta}{\varphi}_{s}$ are respectively the reflectivity and phase of the local scatterers that induce the temporal fluctuations that D-FFOCT aims to measure. In the event of small displacements of the entire sample, on the order of the depth of field or smaller, the processed signals will be the sum of the actual local fluctuations and the modulation created by the bulk sample motion creating a global phase shift. The resulting artifacts are therefore proportional to the sample reflectivity, which is orders of magnitude higher than the reflectivity of the scatterers probed by D-FFOCT (e.g. mitochondria and vesicles) leading to strong artifacts on the dynamic image, which mask the signal of interest. In the presence of mechanical noise, the measured fluctuation can be written:

_{s}#### 2.2. Proposed algorithm

In order to remove motion artifacts we want to use the SVD as an adaptive filter that could separate motility signals from motion induced signals. The first step is to unfold the 3D $M\left(x,y,t\right)$ cube of data into a 2D matrix ${M}_{u}\left(\mathbf{r},t\right)$ to perform the decomposition. Higher dimensions of SVD do exist but are not required here as the horizontal *x* and vertical *y* dimension do not differ when considering axial motion artifacts. SVD is the generalization of the eigendecomposition of a positive semidefinite normal matrix, and can be thought of as decomposing a matrix into a weighted, ordered sum of separable matrices which will become handy when reconstructing the SVD-denoised signals:

*U*contains the spatial eigenvectors,

*V*contains the temporal eigenvectors and Σ contains the eigenvalues associated with spatial and temporal eigenvectors. Performing the SVD on an unfolded ${M}_{u}\left(1440\times 1440,512\right)$ dynamic stack takes around 30 seconds on a workstation computer (Intel i7-7820X CPU, 128 Gbyte of DDR4 2666 MHz RAM) and requires 45 GByte of available RAM using LAPACK routine for SVD computation without computing full matrices. Investigating such decomposition for artifact-free data sets we found that spatial eigenvectors related to motion artifacts have particular and easily identifiable associated temporal eigenvectors. Indeed, when looking at the first temporal eigenvectors, we observed sinusoid-like patterns with increasing frequency, see Fig. 2(a). In presence of motion artifacts, temporal eigenvectors appeared with random, high-frequency components that are easy to detect with simple features. Here, the zero crossing rate (the number of times a function crosses

*y*= 0) is used to detect temporal eigenvectors involved in motion artifacts. In presence of motion artifacts some of the firsts temporal eigenvectors present a high zero crossing rate, see Fig. 2(b) and 2(c). In order to detect these outliers we computed the absolute value of the derivative of the zero crossing rate (D-ZRC $=\left|ZR{C}_{i+1}-ZR{C}_{i}\right|$) and applied a threshold: if the D-ZRC is higher than three times the D-ZRC standard deviation then the corresponding eigenvalue

*σ*is set to zero in $\widehat{\mathrm{\Sigma}}$ and the SVD-denoised stack ${\widehat{M}}_{u}$ is reconstructed as:

_{i}The SVD-denoised stack ${\widehat{M}}_{u}$ can then be folded back to its original 3D shape $\widehat{M}$ and the dynamic computation can be performed. Interestingly, the use of an automatic selection of eigenvectors allows a more reproducible analysis. For example, the SVD can be performed on spatial sub-regions without visible artifacts, something very hard to obtain with manual selection of eigenvectors. This can also improve the filtering procedure in the case of sample spatial deformation or if the computation requires too much RAM.

#### 2.3. Results

We tested the proposed SVD filtering on different acquisitions taken on different setups (the one presented in Fig. 1(a) and the LightCT system manufactured by LLTech SAS). When motion artifacts were present, image quality after denoising was greatly improved in each case. In Fig. 3 we present lung biopsy images taken with the LightCT system in a clinic, where imaged tissues were waste tissues from biopsy procedures that were destined to be destroyed, and we imaged them just before destruction. The imaging was carried out according to the tenets of the Declaration of Helsinki and followed international ethic requirements for human tissues. SVD filtering effectively removes motion artifacts from collagen fibers and reveals cells in Fig. 3(b) and 3(e), see Visualization 1 for complete $1260\times 1260\text{}\mu m$ fields of view. D-FFOCT images were also higher contrast after SVD denoising, allowing easier interpretation for clinical applications, e.g. lung tumor detection in the presented images. We imaged fibroblasts with the setup presented in Fig. 1(a). Cells were very flat leading to fringes pattern created by the specular reflection on their surface. These fringes were highly visible on the processed D-FFOCT image preventing the visualization of subcellular structures, see Fig. 4(a). After SVD filtering it is possible to distinguish single subcellular entities and track them, see Fig. 4(b), enabling biological study without the need of a costly optical bench setup.

## 3. Extending penetration depth using non-stationarities

In addition to motion artifacts, a drawback of D-FFOCT compared to standard static FFOCT is the reduced penetration depth. While FFOCT can acquire images as deep as $1\text{}mm$, D-FFOCT is limited to about $100\text{}\mu m$ due to the weak signal level produced by the sample fluctuations we wish to measure. In order to enhance the dynamic signal strength and so improve penetration depth, we propose computation of the dynamic image from the cumulative sum of the signal, rather than the raw signal. Indeed, the model for the dynamic image formation is small scatterers moving in the coherence volume during the acquisition leading to phase and amplitude fluctuations in the conjugated camera pixel. While a pure Brownian motion is stationary, hyper-diffusive displacements are not and we therefore propose to use the cumulative sum to enhance these non-stationarities. Intuitively, summing a centered noise will give a noisy trajectory that stays close to zero whereas if there is a small bias it will be summed for every sample and the cumulative sum will therefore have a slope equal to this bias.

#### 3.1. Theoretical considerations

Let us consider an array of random values drawn from a zero centered Gaussian distribution. If the number of samples is large the mean of the array will be close to zero and equivalently the sum of all the samples will also be close to zero. Taking the cumulative sum of such an array gives a so called *Brownian bridge* (the curve starts and ends close to zero and makes a "bridge" between these two points). Theoretically, Brownian bridges are expected to be maximal close to the edges as the probability distribution of the maximum is the third arcsin law which has a typical U-shape [20]. More importantly, the Brownian bridge maximum follows a Rayleigh distribution. If we consider a Brownian bridge ${W}_{s}\text{}\forall \text{}s\in \left[0,1\right]$:

*W*is the supremum of the bridge and $\mathbb{P}\left[{W}_{M}\le u\right]$ is the probability of the supremum being less than

_{M}*u*. According to the Rayleigh distribution, the Brownian bridge maximum must therefore scale as $\sqrt{t}$ with

*t*scaling asthe number of frames. Now, if there is a bias in the distribution, which is the case if a scatterer is moving with constant velocity in the coherence volume, the cumulative sum will scale as $\frac{t}{2}$ due to the slope introduced by the bias. It will also be either always positive or negative and the maximum will be reached around the center of the bridge. The cumulative sum will therefore exhibit a completely different behavior for centered noise than for actual motility signals, leading to a better signal to noise ratio on dynamic images. Note that for Brownian bridges it is often observed that the function changes sign regularly (the probability of sign changes is also well established [20]), which is not the case when there are non-stationarities.

We simulated an experiment by introducing a linear bias of $\sigma /3$ on a centered Gaussian distribution which is not perceptible on the signals presented in Fig. 5(a). Looking at the cumulative sum the bias is much more obvious as the maximum reached by the bridge is three times higher, hence motility signals are detected with a higher sensitivity using the cumulative sum.

#### 3.2. Results

The dynamic image is computed by taking the average of the maxima of the absolute values of the running cumulative sum:

*N*is the total number of sub-windows,

*τ*is the sub-windows length so that ${t}_{\left[i,i+\tau \right]}$ is the time corresponding to one sub-window and $\overline{I}\left(\mathbf{r},{t}_{\left[i,i+\tau \right]}\right)$ is the signal mean on the sub-window. We tested the proposed method with

*τ*= 50 on the photoreceptor layer of an explanted macaque retina at $85\text{}\mu m$ depth that presents a horizontal gradient of SNR, see Fig. 5(b) and 5(c). In order to quantify the gain in SNR we segmented 192 single cells using Trainable Weka [21] and computed the SNR for each cell (the SNR was computed as the mean intensity of the pixels inside the cells divided by the mean intensity of the background), see Fig. 5(d). In this case the SNR is doubled with the proposed method and the camera column noise is almost completely removed. We tested the proposed algorithm on several acquisitions on tissues and cell cultures and the average SNR improvement factor was 1.9, allowing imaging deeper into tissues with D-FFOCT.

## 4. Applying both methods for in vivo dynamic imaging

The proposed SVD filtering procedure is of great interest for applying D-FFOCT in vivo for removing sample motion such as eye motion for retinal imaging. In order to limit lateral drifts and to maintain contact, a custom head was adapted on the sample arm of a LightCT setup combined with a pump to create a weak suction force. We acquired a stack of images on a living mouse liver at $80\text{}\mu m$ depth. The animal manipulation protocol was approved by our local animal care committee. The mouse (4 week-old C57BL/6 (Janvier Lab, Le Genest Saint Isle, France)), was anesthetized by isoflurane, and sacrificed after the imaging procedure by $C{O}_{2}$ inhalation. The standard D-FFOCT images were very noisy mainly due to the heartbeat and breathing of the mouse, leading to tissue motion that creates artifacts, see Fig. 6(a). On applying the proposed SVD filtering, we were able to remove motion artifacts Fig. 6(b). Nonetheless signals are still very low due to the deep imaging in a strongly scattering organ and applying the cumulative sum algorithm dramatically increased the SNR by a factor of 3 Fig. 6(c), see Visualization 2 for complete $1260\times 1260\text{}\mu m$ fields of view. In the end, there are remaining artifacts produced by the coherence volume axial drift during the acquisition. Indeed, if the coherence volume shifts more than its axial extension, even if motion artifacts are perfectly removed, the probed dynamics would be averaged over several depths leading to an axial blur. Toovercome this issue, the position of the coherence gate inside the sample may be compensated by monitoring the breathing and moving the reference arm with a precision corresponding to the optical sectioning ($1\text{}\mu m$ for the in vivo acquisition here) inorder to compensate for the axial drift.

## 5. Conclusion

We proposed a filtering algorithm based on the SVD to effectively remove motion artifacts from dynamic images. The proposed method adds $\sim 40$ seconds for a (1440, 1440, 512) stack which will require GPU processing in order to speed up the process for real time applications. This method was applied on an in vivo data set and is promising as long as axial motion is smaller than the coherence volume depth. Tracking and compensating methods are currently being investigated in order to acquire D-FFOCT stacks in a completely artifact-free manner for cornea [22] and retina [23, 24]. We also proposed a method based on the cumulative sum to enhance non-stationarities in temporal signals which led to an SNR factor increase of 1.9 on average for ex vivo samples and 3 on our in vivo data set. These general techniques could be applied to any other imaging modality with sub-diffraction phase sensitivity.

## Funding

HELMHOLTZ grant, European Research Council (ERC) (610110).

## Acknowledgments

The author would like to thank LLTech SAS for sharing its raw data, especially Émilie Benoit and Louis Dutheil for carrying out in vivo and clinical experiments. The author is also grateful to Olivier Thouvenin, Pedro Mecê, Kassandra Groux, Viacheslav Mazlin, Mathias Fink, Claude Boccara and Kate Grieve for fruitful discussions and valuable comments regarding this paper. The data and algorithms used during the current study are available from the corresponding author upon reasonable request.

## References

**1. **D. Huang, E. Swanson, C. Lin, J. Schuman, W. Stinson, W. Chang, M. Hee, T. Flotte, K. Gregory, C. Puliafito, and J. G. Fujimoto, “Optical coherence tomography,” Science **254**, 1178–1181 (1991). [CrossRef] [PubMed]

**2. **W. Drexler and J. G. Fujimoto, eds., *Optical coherence tomography: technology and applications* (Springer, 2015), 2nd ed. [CrossRef]

**3. **E. Beaurepaire, A. C. Boccara, M. Lebec, L. Blanchot, and H. Saint-Jalmes, “Full-field optical coherence microscopy,” Opt. Lett. **23**, 244–246 (1998). [CrossRef]

**4. **A. Dubois, K. Grieve, G. Moneron, R. Lecaque, L. Vabre, and C. Boccara, “Ultrahigh-resolution full-field optical coherence tomography,” Appl. Opt. **43**, 2874–2883 (2004). [CrossRef] [PubMed]

**5. **J. Ben Arous, J. Binding, J.-F. Léger, M. Casado, P. Topilko, S. Gigan, A. C. Boccara, and L. Bourdieu, “Single myelin fiber imaging in living rodents without labeling by deep optical coherence microscopy,” Journal of Biomedical Optics **16**, 116012 (2011). [CrossRef] [PubMed]

**6. **M. Jain, N. Shukla, M. Manzoor, S. Nadolny, and S. Mukherjee, “Modified full-field optical coherence tomography: A novel tool for rapid histology of tissues,” J. Pathol. Informatics **2**, 28 (2011). [CrossRef]

**7. **C. Apelian, F. Harms, O. Thouvenin, and A. C. Boccara, “Dynamic full field optical coherence tomography: subcellular metabolic contrast revealed in tissues by interferometric signals temporal analysis,” Biomed. Opt. Express **7**, 1511–1524 (2016). [CrossRef] [PubMed]

**8. **C.-E. Leroux, F. Bertillot, O. Thouvenin, and A.-C. Boccara, “Intracellular dynamics measurements with full field optical coherence tomography suggest hindering effect of actomyosin contractility on organelle transport,” Biomed. Opt. Express **7**, 4501–4513 (2016). [CrossRef] [PubMed]

**9. **O. Thouvenin, C. Boccara, M. Fink, J. Sahel, M. Pâques, and K. Grieve, “Cell motility as contrast agent in retinal explant imaging with full-field optical coherence tomography,” Investig. Opthalmology & Vis. Sci. **58**, 4605 (2017). [CrossRef]

**10. **A. H. Kashani, C.-L. Chen, J. K. Gahm, F. Zheng, G. M. Richter, P. J. Rosenfeld, Y. Shi, and R. K. Wang, “Optical coherence tomography angiography: A comprehensive review of current methods and clinical applications,” Prog. Retin. Eye Res. **60**, 66 – 100 (2017). [CrossRef] [PubMed]

**11. **J. Scholler, V. Mazlin, O. Thouvenin, K. Groux, P. Xiao, J.-A. Sahel, M. Fink, C. Boccara, and K. Grieve, “Probing dynamic processes in the eye at multiple spatial and temporal scales with multimodal full field oct,” Biomed. Opt. Express **10**, 731–746 (2019). [CrossRef] [PubMed]

**12. **Y. Jia, O. Tan, J. Tokayer, B. Potsaid, Y. Wang, J. J. Liu, M. F. Kraus, H. Subhash, J. G. Fujimoto, J. Hornegger, and D. Huang, “Split-spectrum amplitude-decorrelation angiography with optical coherence tomography,” Opt. Express **20**, 4710–4725 (2012). [CrossRef] [PubMed]

**13. **T. E. de Carlo, A. Romano, N. K. Waheed, and J. S. Duker, “A review of optical coherence tomography angiography (OCTA),” Int. J. Retin. Vitreous **1**, 5 (2015). [CrossRef]

**14. **A. Badon, D. Li, G. Lerosey, A. C. Boccara, M. Fink, and A. Aubry, “Smart optical coherence tomography for ultra-deep imaging through highly scattering media,” Sci. Adv. **2**e1600370(2016). [CrossRef] [PubMed]

**15. **H. Ammari, F. Romero, and C. Shi, “A signal separation technique for sub-cellular imaging using dynamic optical coherence tomography,” Multiscale Model. & Simul. **15**, 1155–1175 (2017). [CrossRef]

**16. **C. Demené, T. Deffieux, M. Pernot, B. Osmanski, V. Biran, J. Gennisson, L. Sieu, A. Bergel, S. Franqui, J. Correas, I. Cohen, O. Baud, and M. Tanter, “Spatiotemporal clutter filtering of ultrafast ultrasound data highly increases doppler and fultrasound sensitivity,” IEEE Transactions on Med. Imaging **34**, 2271–2285 (2015). [CrossRef]

**17. **J. Baranger, B. Arnal, F. Perren, O. Baud, M. Tanter, and C. Demené, “Adaptive spatiotemporal svd clutter filtering for ultrafast doppler imaging using similarity of spatial singular vectors,” IEEE Transactions on Med. Imaging **37**, 1574–1586 (2018). [CrossRef]

**18. **J. Scholler, “FFOCT control and acquisition software,” (2019). https://doi.org/10.5281/zenodo.3137245.

**19. **E. B. a la Guillaume, C. Apelian, E. Dalimier, C. Boccara, A. Mansuet-Lupo, G. Chassagnon, and M.-P. Revel, “Lung biopsy assessment with dynamic cell optical imaging,” Proc. SPIE 10470, Endosc. Microsc. XIII **10470**104700C (2018). [CrossRef]

**20. **P. Lévy, “Sur certains processus stochastiques homogènes,” Compos. Math. **7**, 283–339 (1940).

**21. **I. Arganda-Carreras, V. Kaynig, C. Rueden, K. W. Eliceiri, J. Schindelin, A. Cardona, and H. Sebastian Seung, “Trainable Weka Segmentation: a machine learning tool for microscopy pixel classification,” Bioinformatics **33**, 2424–2426 (2017). [CrossRef] [PubMed]

**22. **V. Mazlin, P. Xiao, E. Dalimier, K. Grieve, K. Irsch, J.-A. Sahel, M. Fink, and A. C. Boccara, “In vivo high resolution human corneal imaging using full-field optical coherence tomography,” Biomed. Opt. Express **9**, 557–568 (2018). [CrossRef] [PubMed]

**23. **P. Xiao, V. Mazlin, K. Grieve, J.-A. Sahel, M. Fink, and A. C. Boccara, “In vivo high-resolution human retinal imaging with wavefront-correctionless full-field oct,” Optica **5**, 409–412 (2018). [CrossRef]

**24. **P. Mecê, P. Xiao, V. Mazlin, J. Scholler, K. Grieve, J.-A. Sahel, M. Fink, and C. Boccara, “Towards lens-based wavefront sensorless adaptive optics full-field oct for in-vivo retinal imaging,” Proc. SPIE **10867**1086722 (2019).