## Abstract

In imaging geometries, which employ wavefront-shaping to control the light transport through a multi-mode optical fibre (MMF), this terminal hair-thin optical component acts as a minimally invasive objective lens, enabling high resolution laser-scanning fluorescence microscopy inside living tissues at depths hardly accessible by any other light-based technique. Even in the most advanced systems, the diffraction-limited foci scanning the object across the focal plane are contaminated by a stray optical signal carrying typically few tens of % of the total optical power. The stray illumination takes the shape of a randomised but reproducible speckle, and is unique for each position of the focus. We experimentally demonstrate that the performance of imaging a fluorescent object can be significantly improved, when resulting images are computationally post-processed, utilising records of intensities of all speckle-contaminated foci used in the imaging procedure. We present two algorithms based on a regularised iterative inversion and regularised direct pseudo-inversion respectively which lead to enhancement of the image contrast and resolution.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

Using liquid crystal spatial light modulators, or digital micro-mirror devices (DMDs), the wavefront entering a MMF can be modulated in a way resulting in a single diffraction-limited spot, formed in the output plane of the fibre [1,2]. When a fluorescence sample is exposed to a realisation of the focus at a specific position, a fraction of the fluorescence emission signal generated propagates backwards through the same fibre towards the proximal end (adjacent to the illumination) where it can be spectrally separated from the excitation optical pathway and registered by a bucket detector (e.g. photo-multiplier tube, PMT). With the appropriate light modulation, the spot can be generated anywhere across the image plane, and providing a sequence of such foci essentially forms the basis of imaging a fluorescent object, analogously to the laser-scanning microscope [3–9]. Endoscopes exploiting MMFs recently become promising technological candidates for minimally invasive deep tissue imaging [10–12].

In the most common case where the focal plane is chosen in a close proximity to the distal fibre facet (adjacent to the sample), the focus takes the shape of a high-intensity peak surrounded by series of concentric rings, which gradually blend into a certain level of speckled background covering the whole field of view. These light structures will be referred to as *muddy modes* throughout this paper. When imaging sparse objects with high signal levels, the presence of such contamination can be tolerated as it only brings an almost uniform DC offset over the field of view. Yet for example in the case of *in-vivo* imaging of neurons and their sub-cellular structures (axons, dendrites and spines) residing deep inside the brain tissue, the speckled background may be considerably debilitating. Here, the sample exposure needs to be brought to its lowest possible levels due to photo-damage and photo-bleaching, which lowers the signal to noise level. The low signal levels badly affect the dynamic range, e.g. the ability of observing minuscule processes while having sources of large signals (somata) in the field of view. The fraction of optical power carried by the focus with respect to the total power delivered by the MMF is referred here as the power ratio (PR) and it is frequently used to assess the quality of the focus [11]. A muddy mode with PR of 1 thus represents a perfect diffraction-limited focus, while that of zero is essentially a random speckle.

The speckle background has multiple origins: Fundamentally, a perfect background-free focus cannot be generated from a superposition of the optical modes allowed to propagate through the fibre. For the MMF and wavelength used in this study, the best realisation of the focus with the allowed modes would however not reduce the resulting PR from unity by more than a fraction of a percent. Much more serious drop in PR is related to the wavefront-shaping ability. The MMF may randomise the polarisation of light, which can be responsible for PR reduction to 50%. This can be solved by optical pathways in which both polarizations are controlled separately. Shorter segments of step-index MMF under moderate bending conditions have been shown to maintain circular polarisation states well [13], which can be exploited to significantly minimise the PR drop penalty to tolerable levels even in single-polarizaion control geometries. A further significant cause of speckled background is the use of phase-only modulation (applied in the off-axis regime), which can lead to PR reduction of $\approx$10% (see Fig. 1). As detailed in [14], DMDs give the best performance when mimicking the phase-only modulation in the off-axis regime. Providing full complex modulation, although undoubtedly achievable [15], may lead to significant power losses combined with additional complexity of the experimental geometries. Raster-scanning muddy modes across the sample may be a convenient and a straightforward way to facilitate imaging, yet it may not be the optimal route with respect to the information value of the obtained images, since all information carried by the speckle gets wasted. In light of the rapid progression across the field of structure illumination microscopy [16], it is worth opening the MMF based imaging methods to computational imaging algorithms. Indeed, in an early study by Mahalati and Kahn [7], imaging has been achieved by exposing the sample to an ensemble of completely random speckle patterns, which have been carefully recorded prior imaging has taken place. The records were employed in a computational algorithm essentially providing the pseudo-inversion of the linear optical system. Practical implementation of this concept is virtually analogous to that of computational ghost imaging [17,18]. Under the optimal conditions, the algorithms are capable of providing images free from any background and with resolution enhancement achievable by the use of deconvolution (maximising contrast of all available spatial frequencies of the image). Under the presence of noise, the speckle-pattern-illumination approach however results in severe numerical artefacts, which in practical applications could render resulting images unreliable. The authors later provided a theoretical study demonstrating that illumination with localized patterns and the corresponding localized reconstruction yields the best resilience to high levels of shot noise in the image [19]. In the practice of low signal-to-noise fluorescence imaging, this makes muddy modes of the highest achievable PR the optimum illumination fields. Compared to the speckle-patterns, imaging with the basis of muddy modes yields a well recognisable image immediately, without the need for any computational processing.

Our paper therefore sets out to explore the benefits of computational image enhancement in an experimental study utilising the best illumination patterns available in our settings. We implement two approaches for retrieving the sample - a regularised iterative algorithm and a regularised direct pseudo-inversion. We show how the algorithms affect the contrast and noise across the range of available spatial frequencies and demonstrate trade-offs between the resolution and the magnitude of generated artefacts.

## 2. Experimental setup and imaging procedure

A simplified experimental geometry together with all aspects of the experimental procedure is presented in Fig. 2. The output of a laser source is divided into two optical paths, the reference and signal beam, using a polarising beam splitter (PBS). The signal beam is modulated by a DMD, and only the first diffraction order is further utilised to propagate through the MMF. The MMF output is imaged on a camera (Basler Ace acA640-750um) where it is combined with the reference during the calibration procedure described below. During imaging, the fluorescence signal is collected through the same MMF backwards and detected using a PMT.

The implementation of computational image enhancement does not require any hardware modifications with respect to previously reported holographic endo-microscopy systems, e.g. [11]. The only significant difference involves the calibration procedure, which next to providing the holographic modulations for synthesis of each muddy mode further produces records of their resulting intensity distributions (muddy modes) across the image plane. Each experiment starts with this calibration procedure. A focused beam is raster scanned across the proximal facet of the fibre. For every position, the resulting electric field on the distal end of the fibre is measured by means of phase-shifting interferometry [11]. In this manner, the transmission matrix of the system is acquired [20,21] in the basis of input foci (focal points on the proximal fibre facet) to camera pixels (on the distal end). Based on the measured transmission matrix, binary amplitude modulations are synthesised mimicking an off-axis phase-only modulation approach [14]. The holograms are then displayed sequentially on the DMD, which leads to raster-scanning the muddy modes across the image plane near the distal end of the fibre.

At this point, the reference beam is blocked and the intensity distribution of all muddy modes across the focal plane is recorded on the camera for the purpose of computational image enhancement. The 8-bit dynamic range of the used CMOS (Basler acA640-750um) camera is insufficient to record muddy modes with suitable fidelity (the average speckle background is 3000 times weaker than the peak intensity), therefore we implemented a high-dynamic range (HDR) image capturing technique via multiple camera exposures. Each of the muddy modes was recorded with three different camera exposures (exposure times 1000, 100, and 10 $\mu$s). The resulting image is formed by taking the pixel value from the longest exposure time that did not saturate the detector and scaling it according to the relative exposure time for every pixel in the image. The camera response was verified to be linear with the exposure time prior to these measurements.

Finally, for the image acquisition a fluorescent sample is inserted into the focal plane. The prepared sequence of muddy modes is sequentially exposed, essentially scanning across the imaging plane and the fluorescence signals are registered. We have used 34894 muddy modes with the foci organised across a rectangular grid with a period of 225 nm. The field of view covered the fibre core with diameter of 50 $\mu$m and at the refresh rate of the DMD we have reached an imaging speed of $\sim 0.5$ frames (full scans) per second.

## 3. Reconstruction algorithms

A single intensity measurement $ {r_{\mathrm {i}}}$ recorded by the PMT can be expressed as the sum of the illumination intensity (one muddy mode) across all pixels weighted by the density of fluorophores in the sample $\mathbf {s}$. The illumination patterns are stored in a muddy-mode matrix $\mathbf {M}$ in such way that every row of $\mathbf {M}$ contains one muddy mode flattened into a vector. Given the sample ${\mathbf {s}}$ arranged into a column vector, the recorded intensities of all scanned points **r** can be written in the form of a forward problem as

The number of recorded intensity measurements which form a single 2D image $\mathbf {r}$ will be further referred to as ${n_{\mathrm {r}}}$ and the number of pixels forming the reconstructed sample $\mathbf {s}$ as ${n_{\mathrm {s}}}$. In the simplest case the number of recorded intensities ${n_{\mathrm {r}}}$ equals the number of sample pixels that are reconstructed ${n_{\mathrm {s}}}$, but this is not a strict requirement. When ${n_{\mathrm {r}}} > {n_{\mathrm {s}}}$, the problem is over-specified and a direct solution to Eq. (1) may not exist. On the other hand when ${n_{\mathrm {r}}} < {n_{\mathrm {s}}}$ there is an infinite number of solutions. Even in this case, the sample can often still be estimated, provided some priors of the sample are available. This field of research is called compressive imaging [22] and has recently been demonstrated also in optical fibres [23].

The reconstruction formalism represented as an inverse problem now amounts to finding a solution for $\mathbf {s}$ that yields the recorded intensities $\mathbf {r}$. Due to the presence of noise in both $\mathbf {M}$ and $\mathbf {r}$, a direct matrix inversion would result in images with poor quality and thus cannot be used. An appropriate selection of the image reconstruction algorithm should be made considering the statistical distribution of the noise. In a shot-noise-limited imaging system, the recorded signal $\mathbf {r}$ has a Poisson distribution of noise. This is also the case for imaging through the MMF as shown in Appendix A. and Fig. 7. This type of distribution is indeed well expected in a photon detection process [24].

In the following text we present two approaches to reconstructing the sample (i.e. distribution of fluorophores) using direct measurement of muddy modes. One approach respects the Poisson distribution of the measured signal while the other approach exploits a simple pseudo-inversion designed originally for an intensity invariant noise.

#### 3.1 Point scanning imaging

Point scanning imaging typically implies a process in which all detected photons are assigned to a pixel illuminated by the focus – i.e. the peak of the muddy mode in case of holographic endoscopy [3,4,8]. This approach corresponds to a presumption, that every focus is a perfect delta function with all the power accumulated in a single pixel of the output field. Therefore $\textbf{M}$ is presumed to be an identity matrix and does not need to be measured. Consequently ${\mathbf {s}} \equiv {\mathbf {r}}$.

#### 3.2 Reconstruction of images with a Poisson noise (PN)

Distribution of the real sample ${\mathbf {s}}$ can be found using the maximum likelihood estimation (MLE) based approach. In principle, we find the most likely (probable) values of ${\mathbf {s}}$ which solve Eq. (1) under the assumption that the recorded values ${\mathbf {r}}$ are randomly distributed with a Poisson probability density function. The inverse solution of Eq. (1) can be found using an iterative scheme, see Eq. (21) in [24] (assuming zero background)

Assuming that the initial guess ${\mathbf {s}}^{(0)}$ is non-negative, it is guaranteed that the solution of Eq. (2) will be non-negative [24] without any further constrains of the range of the acceptable regulariser strength $\lambda$. This *non-negativity* constraint guarantees that the solution ${\mathbf {s}}$ will meet physical reality.

#### 3.3 Reconstruction of images with an intensity independent noise (IIN)

In this section we explore the performance of a reconstruction algorithm which presumes a Gaussian noise, or more universally, an intensity independent noise distribution of ${\mathbf {r}}$. Although this assumption is generally not correct for an imaging process, it allows to use a direct solution based on a simple matrix multiplication.

In this case, the solution of Eq. (1) can be found as a minimization of a least-squares problem [26]

In the absence of any regularisation, Eq. (4) can be solved by the pseudo-inverse of $\textbf{M}$, which can be computed using a singular value decomposition (SVD). **M** can then be written as:

When the regularisation term in Eq. (4) is present, its minimum can also be found using the SVD. The solution can be found analytically by adjusting the diagonal values of ${\mathbf {\Sigma }}^{+}$ as

All other non-diagonal values remain zero. The regularised pseudo-inverse can then be expressed as and an efficient estimate of the sample can be obtained from Here, the pseudo-inversion of $\textbf{M}$ for a single regulariser strength $\lambda$ has to be performed only once and the solution of the inverse problem constitutes a simple and fast matrix-vector multiplication.On the other hand, the sample estimate ${\mathbf {s}}$ is no longer guaranteed to be non-negative. Depending on the strength of the regulariser $\lambda$, a significant part of the sample estimate may yield negative values, which does not correspond to an ideal photon detection process. In reality though, negative values may sill occur in the image as a result of e.g. the detector dark noise (Fig. 7). Hence, a solution containing a small fraction of negative values may still be acceptable.

## 4. Results

All of the above described methods contain a “free” parameter – the strength of the regulariser $\lambda$. While there exist some techniques which can be used to find its optimal value such as using an L-curve in the Tikhonov regularisation [27], we determined the optimal value of $\lambda$ directly from the images based on the criteria described below.

#### 4.1 Criteria for evaluation of image quality

A key feature for the analysis of image quality is the image resolution which is determined by the contrast of the resolved spatial frequencies. Using the USAF target the contrast can be directly calculated from the intensity profiles across the bar grids. The dependence of the contrast on spatial frequencies can be summarised in the form of a contrast transfer function (CTF). In the following, the contrast will be evaluated only in discrete spatial frequencies given by the spatial frequencies of the USAF target (Fig. 3(c)).

Another key feature for image quality assessment should account for the fidelity of the image to the “ground truth” and explore how the enhancement process changed this relation. A common measure used for this purpose is the peak signal to noise ratio (PSNR) [28] defined as

In many cases where the PNSR is evaluated (e.g. for the performance of video compressing codecs) the ground truth is a raw image without compression. The ground truth can also be obtained as an average of many consequent images of the same sample, e.g. in the case of classical fluorescence microscopy [29]. In our case, the ground truth is a high resolution image of an USAF target obtained by scanning electron microscopy which was binarised and scaled down to the desired pixel resolution and registered to the reconstructed image (Fig. 3(a)).

The PSNR provides an absolute measure of the image fidelity to the ground truth. Human perception, however, is strongly affected by the structure of the image, which is taken into account by the Structural Similarity Index (SSIM) [30]. The SSIM assesses the visual impact of three image characteristics: luminance, contrast, and structure. The SSIM in the simplest form can be written as

`imregister`using the similarity transform) prior to calculation of the PSNR and SSIM.

In the following sections we use the contrast, PSNR, as well as the SSIM as metrics for comparison of the image quality of the proposed reconstruction methods.

#### 4.2 Point scanning (original image)

The fluorescent USAF target was first imaged using the point scanning method described in Sec. 3.1. Figure 3 shows the ground truth image (a) and the imaged sample (b). In total we imaged the sample using $n_{\textrm{r}} = 34 894$ muddy modes with a mean PR of 0.60 (standard deviation 0.03). The corresponding CTF in panel (d, red) was calculated from the elements indicated in panel (c) as an average contrast across the marked regions (red). The error bars represent the 95% confidence intervals. A theoretical CTF (d, blue) corresponds to an aberration-free imaging system with a circular aperture and numerical aperture NA = 0.22, i.e. NA of the MMF fibre used for imaging. The bar grids of the sample for calculation of the theoretical CTF were modelled as periodic binarised step functions with values of 0 and 1.

A very good correspondence between the measured and predicted CTF was achieved. The contrast of the experimental CTF is slightly lower than in the theoretical CTF, particularly for the lower spatial frequencies, which is caused by the presence of the non-zero (positive) background signal (see Appendix). The PSNR and SSIM were calculated from a circular ROI which follows MMF circumference (Fig. 3(c) yellow). The PSNR of the experimentally obtained image is 11.36 dB while the SSIM is 0.31.

#### 4.3 Image enhancement with Poisson noise (PN)

First, we processed the measured data using the Poisson noise based inversion method. The enhanced images were calculated using Eq. (2) for various values of the regulariser strength $\lambda$. For simplicity, the initial guess of solution **s**^{(0)} was selected as uniform over the field with intensity corresponding to mean value of the original image. Other choices of **s**^{(0)}, however, may in principle guarantee faster convergence. The number of iterations was limited by a relative change of intensity between consequent iterations of maximum 1% or a total of 50 iterations. Figure 4(a) shows the original image and a set of reconstructed images for selected $\lambda$. The corresponding CTFs are plotted in (b). The blue curve is the theoretical CTF and the red curve has been calculated from the original image. It is displayed in all graphs for comparison to the CTFs calculated from the reconstructed images (yellow). A dashed black line delineates the maximum contrast of 1. The region in which the contrast of the original image can be enhanced is thus outlined by the red curve and the black line. Figure 4(c) summarises the contrast enhancement for the evaluated elements as a function of $\lambda$. Different colours indicate different spatial frequencies. The reference values of the original contrast are dash-dotted in the same colour code. For frequencies $\leq 323$ line pairs per millimetre (lp/mm) the contrast can be enhanced up to $\sim 0.99$ for $\lambda <5\times 10^{-5}$. The grey area highlights a common range of $\lambda <6\times 10^{-5}$ for which the contrast has been enhanced for all spatial frequencies. The contrast curves peak for $\lambda$ in the range of $1.5\times 10^{-6}\leq \lambda \leq 5\times 10^{-6}$, which corresponds to enhancement by $68\%$ (for the lowest frequency) and more.

The graphs in Fig. 4(d) show the PSNR (blue) and SSIM (red) as functions of $\lambda$. The corresponding original values are in dashed. According to the PSNR metric, the reconstructed images are enhanced for $\lambda <5\times 10^{-4}$ reaching maximum of 12.91 dB. This corresponds to an enhancement by $\sim 14\%$. According to the SSIM metric, the images are enhanced in the same interval of $\lambda$ with the maximum reaching 0.4, which represents an enhancement of 29%. Both PSNR as well as SSIM reach their maximal values close to $\lambda \sim 3\times 10^{-5}$. The grey area again indicates the range of $\lambda$ for which the contrast has been enhanced as highlighted in Fig. 4(c).

The graphs in Fig. 4 suggest that the optimal choice of the regulariser is a value in the range $5 \times 10^{-6} \leq \lambda \leq 3\times 10^{-5}$, which leads to enhancement of all three metrics. Selecting a single value of $\lambda$ from this range will fine-tune the enhancement towards either contrast or PSNR and SSIM, respectively.

Importantly, the same optimal value of $\lambda$ derived from a single image can be used for image enhancement of a series of images of a changing field of view from the same sample (PN in Visualization 1).

#### 4.4 Image enhancement with intensity independent noise (IIN)

The image reconstructions using the intensity-independent-noise-based inverse modelling were calculated using Eqs. (10) and (9). Figure 5(a) compares the original image obtained via the MMF and image reconstructions using different values of $\lambda$. Small values of $\lambda$ in the transformation result in an increased level of noise (e.g. $\lambda =0.01$). This phenomenon can be expected when considering the limiting case of $\lambda = 0$, for which the transformation is basically not regularised. In a non-regularised problem, the inversion mapping of $\mathbf {r}$ to $\mathbf {s}$ operates essentially as a high-pass filter amplifying the smaller singular vectors of $\mathbf {M}$, i.e. the noise. Moreover, values of $\lambda <0.22$ yield reconstructions with negative values which fraction strongly increases with decreasing of $\lambda$. As discussed in Sec. 3.3, a small fraction of negative values can be accepted. We set the threshold for negative pixels to 1% of the total pixel count, which is fulfilled for $\lambda \geq 0.16$.

Transformations with high values of $\lambda$ on the other hand operate as a (poorly designed) low-pass filter that slightly decreases the contrast of low frequencies and suppresses the contrast of high frequencies. This results in a smoothed image with lower resolution than the original (e.g. $\lambda =5$).

Figure 5(b) shows the corresponding CTFs in green. Negative values in reconstructions with $\lambda <0.16$ were replaced by zeros so that the calculated contrast is not artificially increased. The dependence of contrast on $\lambda$ for each resolved spatial frequency is shown in Fig. 5(c). The segments of curves for $\lambda <0.16$ are plotted in a dotted line to indicate unacceptable solutions. The reference values calculated from the original image are in dash-dotted in the same colour code.

For elements with spatial frequencies $\leq 323$ lp/mm the contrast can be enhanced to 1 for $\lambda = 0.16$. The grey area highlights a common range of $0.16\leq \lambda \leq 0.18$, for which the contrast has been enhanced for all spatial frequencies. The range of $\lambda$ is very narrow in this case. Selecting higher $\lambda$ (up to 0.47) will result in an increase of contrast of the lower spatial frequencies but decrease of the higher ones. The contrast curves peak for $\lambda \leq 0.016$, which is among the unacceptable solutions. The maximum acceptable contrast enhancement is therefore for $\lambda = 0.16$, which corresponds to enhancement by 69% (for the lowest frequency) and more (for higher frequencies).

Figure 5(d) shows the PSNR and SSIM of the enhanced images in blue and red, respectively and their reference original values in dashed (11.36 dB and 0.31, respectively). The dotted segments indicate unacceptable solutions as in (c). The PSNR metric is enhanced in a range $0.16\leq \lambda \leq 0.55$ with a maximum of 12.55 dB for $\lambda =0.16$, which corresponds to enhancement by $\sim 10\%$. The SSIM is enhanced over a range $0.16<\lambda _R<0.6$ with maximal value of 0.39 for $\lambda = 0.16$. This represents $\sim 26\%$ increase compared to the original value. The grey region indicates the interval of acceptable solutions from Fig. 5(c) for which the contrast of all frequencies has been enhanced. Importantly, also the PSNR as well as the SSIM are increased compared to the original values in this range.

As in the case of the PN-based method, the same optimal value of $\lambda$ derived from a single image can be used for image enhancement of a series of images of a changing field of view from the same sample (IIN in Visualization 1).

#### 4.5 Methods comparison

The PN-based inverse modelling represents a more "correct" model for reconstruction of fluorescent images. It relies on an iterative calculation of a set of equations for every single image.

An $L_2$ norm has been implemented in order to give preferences to solutions with smaller norms, which prevents extreme sparsity of the reconstructed image and partially suppresses the high frequency noise. The only free parameter in this transformation is the regulariser strength $\lambda$, which affects the quality of the reconstructed images.

Provided the initial guess of solution is non-negative, all other iterative solutions are non-negative as well for all $\lambda$. This agrees with the physical reality of a photon detection process. The contrast of all analysed spatial frequencies in our sample can be enhanced by up to >68%. Moreover, all spatial frequencies reach the maximum contrast within the same interval $1.5\times 10^{-6} \leq \lambda \leq 5\times 10^{-6}$. The PSNR and SSIM metrics can be enhanced up to 14% and 29% respectively. Importantly, the regularised PN-based inversion yields a wide interval of $\lambda < 6 \times 10^{-5}$, for which all three image quality criteria are enhanced. Selecting a $\lambda$ from this range allows fine-tuning of the image quality towards maximum contrast enhancement or PSNR and SSIM, respectively. The reconstruction of a single frame takes $\sim 21$ seconds using a computational server equipped with 4 Intel Xeon Gold 6126 CPUs.

The IIN-based inversion on the other hand does not respect the Poisson distribution of noise in the recorded images. It allows however to use a simple least-squares minimization assuming the Gaussian type of noise with constant variance. In this case the solution can be calculated as a pseudo-inverse using SVD which constitutes a simple vector-matrix multiplication for every image. Contrary to PN-inversion, the regularised IIN solution can yield negative values. The fraction of negative values increases steeply with small values of $\lambda <0.16$. Accepting a maximum 1% of image pixels being negative, the transformation yields enhanced image quality in a narrow interval $0.16 \leq \lambda \leq 0.18$. Specifically, the contrast of all frequencies can be enhanced up to at least 69%, which is comparable to the PN-based inversion. Both PSNR as well as SSIM are enhanced in the same interval of $\lambda$ by maximum of 10% and 26%, respectively as compared to 14% and 29%, respectively in the case of the PN-based inversion. The initial computation of matrix $\textbf{M}^{+\prime }$ in the regularized SVD procedure (9) takes 15 minutes using the same computational power. The reconstruction of every additional frame using this matrix (10) takes only $\sim 0.2$ seconds which can be advantageous in reconstructions of videos with a fast feedback.

In order to test the performance of the reconstruction methods in conditions relevant to fluorescent bio-imaging, we also imaged fluorescent micro-particles (diameter 2 $\mu$m) on a cover slip (Fig. 6). The mean PR of the measurement across the field of view was 0.37 (standard deviation 0.07). This is a significantly lower value than in the case of imaging the USAF target which leads to lower contrast and resolution of the original image. Even in the case of these sub-optimal imaging conditions, both algorithms succeed to enhance the contrast, reduce the background and increase the resolution (assessed from the width of the profiles). The values of $\lambda$ have been selected from the optimal ranges determined from the USAF target reconstructions as $5x10^{-6}$ and 0.16 for the PN- and IIN-based methods, respectively. Interestingly, they are optimal also for the sample of fluorescent micro-particles. The original image and its reconstructions in (a) have been normalized for visual clarity. The profiles in (b) compare real intensity values along two lines indicated in (a). The PN-based method outperforms the IIN-based method also in the case of imaging fluorescent micro-particles.

## 5. Conclusion

We have demonstrated that the measurements of the spatial intensity profile of all focal points including their speckled background, which we termed here the “muddy modes”, may be successfully used to computationally enhance the quality of fluorescent images obtained through MMFs. Implementation of these methods does not impose any constraints on the original sample image unlike methods using compressive sensing. Our data show that the contrast of all resolved spatial frequency features has been increased simultaneously with the image fidelity evaluated by the PSNR and SSIM metrics.

We presented two approaches to image reconstruction – an iterative approach that respects the Poisson distribution of noise and a direct pseudo-inversion calculated by SVD which presumes an incorrect, intensity-independent model for the noise. Both approaches were regularised using the Tikhonov regularisation in order to suppress noise and to obtain the best solution. The methods yield reconstructions with comparable increase of contrast, however, the PN-based method outperforms the IIN-based one in image fidelity. The extent of image quality enhancement is given by solely the regulariser strength $\lambda$ in the PN-based method, while the range of available $\lambda$ is additionally limited by a requirement for non-negative pixels in the IIN-based method.

We have shown that $\lambda$ derived from one image can be successfully used to reconstruct a series of images of a changing field of view from the same and even a different sample.

The methods and their comparison however, cannot be considered universal. Their performance will be dependent on the level of detected photons. The results presented here are obtained with a photon count typically in a range 2000 in a single muddy mode. With decreasing signal to noise, the iterative algorithm is likely to become more advantageous yielding better results while the direct inversion will tend to amplify the noise in the SVD process.

Although the presented methods are intended for image enhancement of biological specimens, the presented form herein is limited to planar (2D) samples. The concepts however can be extended to the third dimension paving the way to reconstructions of volume samples such as tissue *in vivo*.

Finally, both methods presented here can be used in order to computationally enhance the image quality. While the PN-based method yields better image enhancement in general, the IIN-based method may be used as well allowing for a fast reconstruction of a series of images and opening the door to e.g. image enhancement online.

## Appendix A. Noise distribution in measured data

In order to quantify the properties of the detected noise as well as to calibrate the photodetector (PMT – Thorlabs PMT2102/M) we imaged the USAF 1952 target consequently with decreasing of laser light illumination intensity. We took about 30 steps consisting of $\sim 5-10$ frames per step. Using this measurement we analysed the pixels forming the interior of the bright square in the central part of the image, see Fig. 3. The intensity values of these pixels created a statistical ensemble for calculation of the mean and the variance. These values are plotted in Fig. 7(a). The output current from the PMT has been converted to voltage with an internal trans-impedance amplifier. The relation between variance of pixels intensity and the mean follows a linear trend (linear interpolation in red), which confirms the Poisson statistics. The inverse of this dependence slope has been used to transform the measured signal to photon count.

Figure 7(b) shows a dark noise signal detected from a covered photodetector. It has a Gaussian distribution with a mean −0.8 mV. Therefore, prior to any data processing we subtracted this mean value from the measured signal and replaced the remaining negative values by zero.

## Appendix B. List of used symbols

Variable | Description | Dimensions |
---|---|---|

${n_{\mathrm {s}}}$ | number of pixels of the sample | number |

${n_{\mathrm {r}}}$ | number of recorded intensities per image | number |

$\mathbf {M}$ | illumination intensities (muddy modes) | ${n_{\mathrm {s}}}$ rows $\times$ ${n_{\mathrm {r}}}$ columns |

$\mathbf {s}$ | sample | 1 column $\times$ ${n_{\mathrm {s}}}$ rows |

$\mathbf {r}$ | recorded intensities of the sample $\mathbf {s}$ | ${n_{\mathrm {r}}}$ rows $\times$ 1 column |

$\mathbf {t}$ | ground truth image | ${n_{\mathrm {s}}}$ rows $\times$ 1 column |

$\mathbf {U}$ | orthogonal matrix | ${n_{\mathrm {s}}}$ rows $\times$ ${n_{\mathrm {s}}}$ columns |

$\mathbf {V}$ | orthogonal matrix | ${n_{\mathrm {r}}}$ rows $\times$ ${n_{\mathrm {r}}}$ columns |

$\mathbf {\Sigma }$ | diagonal rectangular matrix | ${n_{\mathrm {s}}}$ rows $\times$ ${n_{\mathrm {r}}}$ columns |

$^{+}$ | pseudo-inverse | |

$^{+\prime }$ | regularised pseudo-inverse | |

$\lambda$ | regulariser strength |

## Funding

European Commission (101016787, CZ.1.05/ 2.1.00/ 01.0017); Akademie Věd České Republiky (RVO:68081731); European Research Council (724530); Ministerstvo Školství, Mládeže a Tělovýchovy (CZ.02.1.01/ 0.0/ 0.0/ 15_003/0000476, LO1212).

## Disclosures

The authors declare no conflicts of interest.

## Data availability

Data underlying the results presented in this paper are not publicly available at this time but maybe obtained from the authors upon reasonable request.

## References

**1. **R. Di Leonardo and S. Bianchi, “Hologram transmission through multi-mode optical fibers,” Opt. Express **19**(1), 247–254 (2011). [CrossRef]

**2. **T. Čižmár and K. Dholakia, “Shaping the light transmission through a multimode optical fibre: Complex transformation analysis and applications in biophotonics,” Opt. Express **19**(20), 18871–18884 (2011). [CrossRef]

**3. **T. Čižmár and K. Dholakia, “Exploiting multimode waveguides for pure fibre-based imaging,” Nat. Commun. **3**(1), 1027 (2012). [CrossRef]

**4. **S. Bianchi and R. Di Leonardo, “A multi-mode fiber probe for holographic micromanipulation and microscopy,” Lab Chip **12**(3), 635–639 (2012). [CrossRef]

**5. **Y. Choi, C. Yoon, M. Kim, T. D. Yang, C. Fang-Yen, R. R. Dasari, K. J. Lee, and W. Choi, “Scanner-free and wide-field endoscopic imaging by using a single multimode optical fiber,” Phys. Rev. Lett. **109**(20), 203901 (2012). [CrossRef]

**6. **I. N. Papadopoulos, S. Farahi, C. Moser, and D. Psaltis, “Focusing and scanning light through a multimode optical fiber using digital phase conjugation,” Opt. Express **20**(10), 10583–10590 (2012). [CrossRef]

**7. **R. N. Mahalati, R. Y. Gu, and J. M. Kahn, “Resolution limits for imaging through multi-mode fiber,” Opt. Express **21**(2), 1656–1668 (2013). [CrossRef]

**8. **I. N. Papadopoulos, S. Farahi, C. Moser, and D. Psaltis, “High-resolution, lensless endoscope based on digital scanning through a multimode optical fiber,” Biomed. Opt. Express **4**(2), 260–270 (2013). [CrossRef]

**9. **E. E. Morales-Delgado, D. Psaltis, and C. Moser, “Two-photon imaging through a multimode fiber,” Opt. Express **23**(25), 32158–32170 (2015). [CrossRef]

**10. **S. Ohayon, A. M. C. Aguirre, R. Piestun, and J. J. DiCarlo, “Deep brain fluorescence imaging with minimally invasive ultra-thin optical fibers,” bioRxiv (2017).

**11. **S. Turtaev, I. T. Leite, T. Altwegg-Boussac, J. M. P. Pakan, N. L. Rochefort, and T. Cizmar, “High-fidelity multimode fibre-based endoscopy for deep brain in vivo imaging,” Light: Sci. Appl. **7**(1), 92 (2018). [CrossRef]

**12. **S. A. Vasquez-Lopez, R. Turcotte, V. Koren, M. Ploschner, Z. Padamsey, M. J. Booth, T. Cizmar, and N. J. Emptage, “Subcellular spatial resolution achieved for deep-brain imaging in vivo using a minimally invasive multimode fiber,” Light: Sci. Appl. **7**(1), 110 (2018). [CrossRef]

**13. **M. Plöschner, T. Tyc, and T. Čižmár, “Seeing through chaos in multimode fibres,” Nat. Photonics **9**(8), 529–535 (2015). [CrossRef]

**14. **S. Turtaev, I. T. Leite, K. J. Mitchell, M. J. Padgett, D. B. Phillips, and T. Čižmár, “Comparison of nematic liquid-crystal and DMD based spatial light modulation in complex photonics,” Opt. Express **25**(24), 29874–29884 (2017). [CrossRef]

**15. **A. Jesacher, C. Maurer, A. Schwaighofer, S. Bernet, and M. Ritsch-Marte, “Full phase and amplitude control of holographic optical tweezers with high efficiency,” Opt. Express **16**(7), 4479–4486 (2008). [CrossRef]

**16. **R. Heintzmann and T. Huser, “Super-resolution structured illumination microscopy,” Chem. Rev. **117**(23), 13890–13908 (2017). [CrossRef]

**17. **J. H. Shapiro, “Computational ghost imaging,” Phys. Rev. A **78**(6), 061802 (2008). [CrossRef]

**18. **Y. Bromberg, O. Katz, and Y. Silberberg, “Ghost imaging with a single detector,” Phys. Rev. A **79**(5), 053840 (2009). [CrossRef]

**19. **R. Y. Gu, R. N. Mahalati, and J. M. Kahn, “Noise-reduction algorithms for optimization-based imaging through multi-mode fiber,” Opt. Express **22**(12), 15118–15132 (2014). [CrossRef]

**20. **S. M. Popoff, G. Lerosey, R. Carminati, M. Fink, A. C. Boccara, and S. Gigan, “Measuring the transmission matrix in optics: An approach to the study and control of light propagation in disordered media,” Phys. Rev. Lett. **104**(10), 100601 (2010). [CrossRef]

**21. **S. Popoff, G. Lerosey, M. Fink, A. C. Boccara, and S. Gigan, “Image transmission through an opaque material,” Nat. Commun. **1**(1), 81 (2010). [CrossRef]

**22. **A. Liutkus, D. Martina, S. Popoff, G. Chardon, O. Katz, G. Lerosey, S. Gigan, L. Daudet, and I. Carron, “Imaging with nature: Compressive imaging using a multiply scattering medium,” Sci. Rep. **4**(1), 5552 (2015). [CrossRef]

**23. **L. V. Amitonova and J. F. de Boer, “Endo-microscopy beyond the Abbe and Nyquist limits,” Light: Sci. Appl. **9**(1), 81 (2020). [CrossRef]

**24. **M. Bertero, P. Boccacci, G. Desiderà, and G. Vicidomini, “Image deblurring with poisson data: from cells to galaxies,” Inverse Probl. **25**(12), 123006 (2009). [CrossRef]

**25. **W. Karl, “3.6 - regularization in image restoration and reconstruction,” in * Handbook of Image and Video Processing (Second Edition)*, A. Bovik, ed. (Academic, 2005), Communications, Networking and Multimedia, pp. 183 – V, 2nd ed.

**26. **P. J. Verveer and T. M. Jovin, “Image restoration based on Good’s roughness penalty with application to fluorescence microscopy,” J. Opt. Soc. Am. A **15**(5), 1077–1083 (1998). [CrossRef]

**27. **P. C. Hansen, “The l-curve and its use in the numerical treatment of inverse problems,” in * Computational Inverse Problems in Electrocardiology, ed. P. Johnston, Advances in Computational Bioengineering*, (Wessex Institute of Technology, 2000), pp. 119–142.

**28. **M. Makitalo and A. Foi, “Optimal inversion of the anscombe transformation in low-count poisson image denoising,” IEEE Trans. on Image Process. **20**(1), 99–109 (2011). [CrossRef]

**29. **F. Luisier, T. Blu, and M. Unser, “Image denoising in mixed poisson–gaussian noise,” IEEE Trans. on Image Process. **20**(3), 696–708 (2011). [CrossRef]

**30. **Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. on Image Process. **13**(4), 600–612 (2004). [CrossRef]