
Fast correlated-photon imaging enhanced by deep learning


Abstract

Quantum imaging using photon pairs with strong quantum correlations has been harnessed to bring quantum advantages to fields ranging from biological imaging to range finding. Such inherently non-classical properties support the extraction of more valid signal for building photon-limited images, even in low-light conditions where shot noise becomes dominant as the light level drops to the single-photon regime. Numerical optimization algorithms can solve this problem, but they require thousands of photon-sparse frames and thus cannot operate in real time. We demonstrate fast correlated-photon imaging enhanced by deep learning as an intelligent computational strategy for discovering deeper structure in big data. Our work verifies that a convolutional neural network can efficiently solve inverse imaging problems associated with strong shot noise and background noise (electronic noise, scattered light). Our results show that we can overcome the trade-off between imaging speed and image quality by pushing low-light imaging to the single-photon level in real time, enabling deep-learning-enhanced quantum imaging for real-life applications.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. INTRODUCTION

Correlated-photon imaging, relying on the inherent quantum correlations between entangled photon pairs, has emerged as a novel technique that brings quantum enhancement to many research fields [1–5]. Direct imaging of non-classical correlations can reveal entanglement between position and momentum [6,7] or among optical angular momentum modes [8]. This imaging technique, enabled by single-photon-sensitive cameras, can test fundamental quantum physics [9–16] and improve conventional imaging systems in spatial resolution and signal-to-noise ratio (SNR) [17–19].

Unfortunately, an imaging system’s performance under low-light conditions is affected by shot noise due to the quantum nature of light. Intensified scientific complementary metal-oxide-semiconductor (I-sCMOS) cameras are able to capture single photons by virtue of image intensifier technology [20–24]. To extract the single-photon signal from the noise, a threshold is set to binarize the data in each pixel; a signal above this threshold is registered as one photon [25,26].
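As a minimal sketch of this binarization step (assuming NumPy arrays of raw pixel counts; the threshold value is camera-dependent and purely illustrative here, not a value from this work):

```python
import numpy as np

def binarize_frame(frame: np.ndarray, threshold: float) -> np.ndarray:
    """Register one photon at each pixel whose raw count exceeds the threshold.

    The threshold must be calibrated against the intensifier's single-photon
    pulse-height distribution; the 100.0 used below is a placeholder.
    """
    return (frame > threshold).astype(np.uint8)

# Hypothetical usage: accumulate photon positions over many raw frames.
# photon_map = sum(binarize_frame(f, 100.0) for f in raw_frames)
```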

Reconstructing such photon-limited images can be cast as solving an inverse problem [27,28]. Numerical algorithms [29,30] can solve it by treating Poisson statistics as prior knowledge and performing complex iterative operations, such as least squares, maximum likelihood, and convex optimization. In practice, thousands of raw images must be collected to form proper statistics, which prevents the reconstruction from being performed in real time. Machine learning offers an “end-to-end” alternative that merges multiple stages into one neural network [31–36]: it finds a direct relationship between the original objects and the measured ultra-low-light images by learning from large datasets.

Here, we experimentally demonstrate deep-learning-enhanced correlated-photon imaging with a convolutional neural network (CNN), whose architecture is inspired by autoencoders and trained to extract effective signal from various noise sources. Deep learning shows superior performance over numerical reconstruction algorithms in image restoration and super-resolution at the single-photon level, in particular the ability to achieve high-contrast imaging in real time. Our results suggest emerging deep-learning-enhanced applications in quantum imaging and quantum information processing.

2. THEORY AND EXPERIMENT

In general, the imaging measurement, $y$, is noisy compared with the original image, $x$, owing to imperfections in the imaging system, such as the laser’s Poisson statistics, the optical elements’ finite size, and the camera’s low quantum detection efficiency. Moreover, the statistical model describing the forward imaging process becomes ill-posed as the number of photons decreases. Regularization is therefore introduced into a designed numerical algorithm ${R_{{\rm{reg}}}}$ to search for solutions consistent with prior knowledge about the objects [37],

$${R_{{\rm{reg}}}}(y) = \mathop {{\rm{argmin}}}\limits_x {\cal L}\{\Phi (x),y\} + h(x),$$
where $\Phi$ represents the Poisson forward model, ${\cal L}$ is an appropriate measure of error, and $h$ is a regularizer that controls model complexity and reduces overfitting. The choice of regularizer is often based on practical experience. Total variation (TV) regularization has been studied extensively and applied widely in image denoising. By introducing certain constraints, it converts denoising into a well-posed problem, ensuring the existence and uniqueness of the optimal image and making the method comparatively robust to noise.

As shown in Fig. 1(a), numerical algorithms usually proceed as follows: a pre-processing step rescales the measurements to fit the inputs of the modeling step, where the reconstruction is converted into a convex optimization problem derived from Eq. (1). After sequential quadratic approximations, we obtain an optimal image corresponding to the original object.
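The following is a minimal sketch of this pipeline under simplifying assumptions: the forward model $\Phi$ is taken as the identity, the TV penalty is smoothed so a generic SciPy optimizer can handle it, and gradients are approximated numerically. A real pipeline would insert the measured point spread function and a dedicated Poisson solver such as SPIRAL-TAP [37].

```python
import numpy as np
from scipy.optimize import minimize

def smoothed_tv(x: np.ndarray, eps: float = 1e-6) -> float:
    """Differentiable surrogate for the total-variation penalty of a 2D image."""
    dx, dy = np.diff(x, axis=0), np.diff(x, axis=1)
    return np.sum(np.sqrt(dx**2 + eps)) + np.sum(np.sqrt(dy**2 + eps))

def tv_reconstruct(y: np.ndarray, lam: float = 0.1) -> np.ndarray:
    """Minimize Eq. (1): Poisson negative log-likelihood plus lam * TV(x)."""
    shape = y.shape

    def objective(x_flat):
        x = x_flat.reshape(shape)
        mu = np.clip(x, 1e-8, None)          # Poisson rate must stay positive
        nll = np.sum(mu - y * np.log(mu))    # Poisson negative log-likelihood
        return nll + lam * smoothed_tv(x)

    x0 = np.clip(y.astype(float), 1e-3, None).ravel()
    res = minimize(objective, x0, method="L-BFGS-B",
                   bounds=[(0.0, None)] * x0.size)  # gradients via finite
    return res.x.reshape(shape)                     # differences: slow but simple
```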


Fig. 1. Reconstruction algorithms in an inverse imaging problem. (a) Schematic of the numerical algorithm. This process only optimizes a single-frame image to achieve a globally optimal solution by multiple steps such as least squares, maximum likelihood, and convex optimization. (b) Schematic of the learning algorithm. This process builds a network structure to directly connect the input and output images. Then a large set of training images is fed into the network to learn features of the imaging system by optimizing joint parameters. Once the training step is finished, we can rebuild images in real time.


As an effective method for feature extraction and image denoising, deep learning has revolutionized our ability to use computers for challenging tasks. It provides an alternative, the learning algorithm, as shown in Fig. 1(b). Original objects and their corresponding measurements, $\{{({{x_n},{y_n}})} \}_{n = 1}^N$, are fed into a neural network as inputs, and the reconstruction algorithm, ${R_{{\rm{learn}}}}$, is trained by optimizing the following process:

$${R_{{\rm{learn}}}} = \mathop {{\rm{argmin}}}\limits_{{R_\theta},\theta} \sum\limits_{n = 1}^N {\cal L}\{{x_n},{R_\theta}({y_n})\} + h(\theta),$$
where ${\cal L}$ is a measure of error, $h$ is a regularizer, and $\theta$ represents the parameters of the neural network. Learning algorithms usually consist of two parts: an encoder maps the features of the input images into a hidden-layer space, and a decoder maps these features back to reconstructed images. The internal parameters are adjusted to minimize the loss function of Eq. (2). Once learning is complete, the neural network can serve as a model of the optical imaging system and recover new images from their measurements in a straightforward fashion.
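A minimal PyTorch sketch of one optimization step of Eq. (2), assuming an L2 error measure ${\cal L}$ and weight decay playing the role of $h(\theta)$; the optimizer settings are illustrative assumptions, not values from this work.

```python
import torch
import torch.nn as nn

def train_step(model: nn.Module, optimizer: torch.optim.Optimizer,
               y_batch: torch.Tensor, x_batch: torch.Tensor) -> float:
    """One gradient step toward R_learn: minimize L{x_n, R_theta(y_n)} over theta."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(y_batch), x_batch)  # L2 error measure
    loss.backward()
    optimizer.step()
    return loss.item()

# Illustrative setup; weight decay implements h(theta) as an L2 penalty.
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
```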

Our experimental arrangement is shown in Fig. 2(a). We use a 780 nm mode-locked Ti:sapphire oscillator with a pulse duration of 140 fs and a repetition rate of 80 MHz. The laser is frequency-doubled to 390 nm in a ${\rm{Li}}{{\rm{B}}_3}{{\rm{O}}_5}$ crystal, and the ultraviolet beam then pumps a 2 mm thick $\beta - {\rm{Ba}}{{\rm{B}}_2}{{\rm{O}}_4}$ crystal to create correlated-photon pairs via a type-II spontaneous parametric down-conversion (SPDC) process. A dichroic mirror transmits the 390 nm light and reflects the 780 nm light. The single-channel count rate and two-channel coincidence rate reach about 500,000 and 100,000, respectively. The idler photons are detected by a single-photon avalanche diode (SPAD) to trigger the image intensifier coupled to the sCMOS camera, a single-photon-sensitive camera with a 5.5 megapixel sensor and a quantum efficiency of about 20% at 780 nm. The signal photons, encoded with image information by a spatial light modulator (SLM), pass through a linear polarization analyzer and are then captured by the I-sCMOS camera. A wave plate and a polarization beam splitter are employed to monitor and adjust the photon level in the signal path. To synchronize the correlated-photon pairs, we compensate for the electronic delay by adding a 10 m fiber delay line to the signal path.


Fig. 2. Experimental setup and CNN model. (a) Sketch of the experimental setup. Correlated-photon pairs with a wavelength of 780 nm are generated simultaneously from a $\beta$-barium borate crystal cut for type-II phase matching. The signal photons probe an object displayed on a SLM, and the reflected photons are detected by an I-sCMOS camera whose intensifier is triggered by a SPAD detector responding to the correlated idler photons. A 10 m fiber delay line compensates for the electronic delays between the two arms. LBO, ${\rm{Li}}{{\rm{B}}_3}{{\rm{O}}_5}$ crystal; DM, dichroic mirror; BBO, $\beta - {\rm{Ba}}{{\rm{B}}_2}{{\rm{O}}_4}$ crystal; WP, wave plate; PBS, polarization beam splitter; APD, avalanche photodiode; POL, polarizer; FL, filter lens (${{780}}\pm{{10}}\;{\rm{nm}}$). (b) The CNN model. The structure includes two parts: an encoder that compresses the inputs to a lower-dimensional representation and a decoder that decompresses the representation to reconstruct the inputs as faithfully as possible.



Fig. 3. Reconstruction results with both numerical and learning algorithms. (a) Original object intensity of six letters representing the word “photon” from the EMNIST dataset. (b) Direct measurements captured by the I-sCMOS camera with 5 s exposure time. (c) TV regularization reconstruction. (d) The CNN algorithm reconstruction. (e) Intensity plots from one line of the “n” letter corresponding to the original intensity.


The amplitude-only SLM has a pixel size of ${{8}}\;{{\unicode{x00B5}{\rm m}}} \times {{8}}\;{{\unicode{x00B5}{\rm m}}}$ and ${{1920}} \times {{1080}}$ pixels. On the one hand, the liquid crystal molecules can modulate photons, simulating reflection and other optical phenomena in real-world scenes. On the other hand, the SLM keeps the optical system stable and updates the samples in real time, making it convenient to obtain a large number of measured images. The entire optical system makes real-time correlated-photon imaging possible.

In low-light situations, such as long-range imaging, where the number of photons in each laser pulse decays exponentially with propagation distance, or fluorescence microscopy, where the samples are sensitive to bright light, imaging faces a dilemma: only a few photons can be obtained. These limited photons are vulnerable to noise, which can change the imaging results significantly. The uncertainty arises primarily from the shot noise of the light source itself, the shot noise of the dark signal, and readout noise. Mitigating these inherent noise sources therefore becomes crucial in correlated-photon imaging. See Supplement 1 for details.

We choose two different sample categories: handwritten images from the Extended MNIST (EMNIST) [38] and MNIST [39] databases. For each noise level, a corresponding CNN model is trained. The layout of our specific CNN structure is shown schematically in Fig. 2(b) and is inspired by the “encoder-decoder network” architecture [40,41]. The encoder consists of a stack of convolutional and max-pooling layers. Specifically, each convolutional layer has a filter size of ${{3}} \times {{3}}$, a stride of 1, and zero-padding of size 1, followed by a max-pooling layer with a kernel size of ${{2}} \times {{2}}$ and a stride of 2. The decoder comprises deconvolutional and unpooling layers that perform the opposite operations to the encoder. The activation function of all layers is the rectified linear unit (ReLU), which allows fast and effective training on large and complex databases. See Supplement 1 for details.
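The stated hyperparameters map directly onto a short PyTorch module. In this sketch, the filter size, stride, padding, pooling, and ReLU activations follow the description above, while the depth and channel widths are illustrative assumptions rather than the exact configuration used here.

```python
import torch
import torch.nn as nn

class EncoderDecoderCNN(nn.Module):
    """Encoder: 3x3 convolutions (stride 1, zero-padding 1) + 2x2 max pooling
    (stride 2). Decoder: max unpooling + deconvolutions mirroring the encoder."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(1, 32, 3, 1, 1), nn.ReLU())
        self.conv2 = nn.Sequential(nn.Conv2d(32, 64, 3, 1, 1), nn.ReLU())
        self.pool = nn.MaxPool2d(2, 2, return_indices=True)
        self.unpool = nn.MaxUnpool2d(2, 2)
        self.deconv2 = nn.Sequential(nn.ConvTranspose2d(64, 32, 3, 1, 1), nn.ReLU())
        self.deconv1 = nn.ConvTranspose2d(32, 1, 3, 1, 1)

    def forward(self, x):
        x = self.conv1(x)
        x, idx1 = self.pool(x)    # encoder stage 1
        x = self.conv2(x)
        x, idx2 = self.pool(x)    # hidden (compressed) representation
        x = self.unpool(x, idx2)  # decoder mirrors the encoder
        x = self.deconv2(x)
        x = self.unpool(x, idx1)
        return torch.relu(self.deconv1(x))
```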

The data processing consists of two phases: training and testing. The EMNIST images are randomly split into a training set of 10,521 images and a test set of 1169 images. All images are uploaded to the SLM frame by frame, and the I-sCMOS camera records the results. The measured images constitute the inputs to the network, and the true EMNIST images are the targets. The training procedure updates the weights of the network; once it is complete, the model’s performance is evaluated on the test set. The CNN optimization is implemented in Python 3.7 and runs on an NVIDIA GTX 1650 graphics card.
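A hedged sketch of this split and evaluation, assuming the measured frames and EMNIST targets are already loaded as paired tensors of shape (11690, 1, H, W); the batch size is an assumption.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

def make_loaders(measured: torch.Tensor, truth: torch.Tensor):
    """Random 10,521/1,169 train/test split of paired (measurement, target) images."""
    dataset = TensorDataset(measured, truth)
    train_set, test_set = random_split(dataset, [10521, 1169])
    return (DataLoader(train_set, batch_size=64, shuffle=True),
            DataLoader(test_set, batch_size=64))

@torch.no_grad()
def test_mse(model: torch.nn.Module, loader: DataLoader) -> float:
    """Average per-pixel reconstruction MSE over the held-out test set."""
    total, count = 0.0, 0
    for y, x in loader:
        total += torch.nn.functional.mse_loss(model(y), x, reduction="sum").item()
        count += x.numel()
    return total / count
```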

3. RESULTS

We choose the “photon” letter samples to compare the performance of the different reconstruction algorithms, as shown in Figs. 3(a)–3(d). Under low-light conditions, the direct measurements in the camera plane are very noisy compared with the original objects. For TV regularization, we display the optimal results after minimizing Eq. (1) (see Supplement 1 for details). With only one frame of data, this scheme has little effect on image reconstruction. In contrast, the CNN algorithm is very efficient at suppressing noise. We also plot an intensity profile along one line of the “n” image, as shown in Fig. 3(e). The reconstruction from the CNN model fits the original image well, outperforming TV regularization.

Although negatives of the original images are collected, sharp object edges can still be well reconstructed by the end-to-end method. After training, the CNN model can recreate the original images yet is not tied to the training data, because the inputs and outputs are no longer the same. Our CNN model does not simply memorize the training data; it learns to map the input images to a lower-dimensional representation. If this representation accurately characterizes the original data, the unwanted noise can be filtered out effectively. The CNN model thus learns a general encoding and decoding process that is independent of the samples. Moreover, negatives are common in medical imaging, especially x-ray imaging, where minimizing radiation benefits the patient. This mechanism is completely different from that of numerical algorithms and is also the key to accelerating correlated-photon imaging.

To measure the difference between the original and reconstructed images, we compare the root mean square error (RMSE) defined as follows:

$${\rm{RMSE}} = \sqrt {\frac{{\sum\limits_{i = 1}^m \sum\limits_{j = 1}^n {{[R(i,j) - O(i,j)]}^2}}}{{m \times n}}} ,$$
where $O$ is the original image, $R$ is the reconstructed image obtained with the different denoising algorithms, and $m$ and $n$ are the image dimensions in pixels. As shown in Table 1, a smaller RMSE corresponds to a smaller difference and better denoising performance. This result indicates that the CNN algorithm has an advantage in suppressing noise and reconstructing images.
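For reference, Eq. (3) reduces to a few lines of NumPy:

```python
import numpy as np

def rmse(reconstructed: np.ndarray, original: np.ndarray) -> float:
    """Root mean square error of Eq. (3) between two m-by-n images."""
    return float(np.sqrt(np.mean((reconstructed - original) ** 2)))
```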

Table 1. Comparison of the Root Mean Square Error for Different Algorithms


Fig. 4. Reconstruction results with learning algorithms at 0.8 photons/pixel on average. (a) Original object intensity of the four digits “1905” from the MNIST dataset. (b) Direct measurements in the plane of the camera. (c) The CNN reconstruction verifies strong robustness in a noisy environment. (d) The mean square error between the network outputs and the original handwritten digits decreases as the number of epochs increases.


Further, to verify the robustness of the learning algorithm under ultra-low-light conditions, the samples are illuminated with fewer correlated photons, ${\sim}0.8$ photons per pixel on average. We prepare another dataset of 6690 handwritten digits downloaded from the MNIST database, comprising a training set of 6021 images and a test set of 669 images. After optimizing the CNN model weights, we display the “1905” digits in Figs. 4(a)–4(c). The fewer photons make the raw measurements indistinguishable, yet the reconstructed images still show high contrast. The CNN algorithm thus protects the signal from noise damage and demonstrates strong robustness.

To optimize the CNN structure, we also construct networks with five, seven, and nine layers, as shown in Fig. 4(d). After 1000 epochs, the mean square error (MSE) between the network outputs and the original handwritten digits drops to 0.25 for five layers and becomes steady, which indicates that our network has not overfitted the training dataset. For seven and nine layers, however, the cost only stabilizes after 2000 epochs. Using the fewest layers that achieve optimal denoising is therefore preferable, since it significantly reduces the computational cost. See Supplement 1 for details.


Fig. 5. Summary of state-of-the-art imaging experiments at the single-photon level. To reconstruct a high-contrast image, numerical algorithms require sparse single photons per frame; thus, intensified CCD and CMOS cameras have to accumulate thousands of frames. New imaging devices such as SPAD cameras make it possible to achieve high contrast with fewer photons, but numerical algorithms remain a barrier to fast imaging. Deep learning algorithms effectively remove this barrier, achieving a win-win for imaging speed and quality.


We summarize the state-of-the-art single-photon imaging experiments, as shown in Fig. 5. To quantify this comparison, we define the image contrast as

$$C = \frac{{{I_{{\max}}} - {I_{{\min}}}}}{{{I_{{\max}}} + {I_{{\min}}}}}.$$
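Eq. (4) likewise reduces to a few lines of NumPy:

```python
import numpy as np

def image_contrast(image: np.ndarray) -> float:
    """Michelson-type contrast of Eq. (4): (Imax - Imin) / (Imax + Imin)."""
    i_max, i_min = float(image.max()), float(image.min())
    return (i_max - i_min) / (i_max + i_min)
```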

Imaging systems differ across applications, leading to a trade-off between visibility and the time spent collecting data. Compared with passive imaging schemes, active imaging achieves higher contrast by using a high-precision nanosecond time gate to filter noise from the signal. However, intensified CCD and CMOS architectures suffer from low frame rates, so traditional reconstruction algorithms can enhance the image only by collecting thousands of sparse-photon frames, which is very time-consuming [42–44]. Another active imaging scheme is based on SPAD cameras capable of counting and time-stamping single photons with picosecond time resolution. Single-pixel scanning imaging [45] offers excellent visibility but takes approximately 20 min to acquire a megapixel scan. SPAD array imaging [46] achieves a great improvement but still requires hundreds of seconds. The balance between imaging quality and imaging speed thus results from simultaneous improvements in hardware and algorithms. Our deep-learning-based reconstruction with the I-sCMOS camera realizes fast imaging on a timescale of seconds while maintaining high visibility.

4. DISCUSSION

In summary, we experimentally demonstrate a fast correlated-photon imaging scheme enhanced by deep learning to recover objects illuminated with an SPDC source. We show that the CNN model is superior to the classical algorithm in denoising and fast reconstruction under ultra-weak illumination. First, the reconstruction process is independent of prior knowledge such as the Poisson distribution or the point spread function. Second, the large number of network parameters trained on many images can overcome the uncertainty of the optimization process, allowing good solutions to be found for the non-convex problem. Third, once trained, the CNN model retains a nonlinear and highly complex mapping that responds well to the imaging system. Finally, it can complete high-contrast image restoration from a limited number of faint frames, which makes real-time imaging possible. In this work, the reconstruction is built on a single frame, breaking the rule that obtaining a high-contrast image requires more time to collect more photons. Although the CNN structure used here is a standard denoising model, in the long term we believe this pioneering work builds a closer connection between deep learning and quantum imaging, paving the way for broader and more practical exploitation of quantum-enhanced imaging.

Funding

National Key Research and Development Program of China (2019YFA0706302, 2019YFA0308700, 2017YFA0303700); National Natural Science Foundation of China (11690033, 11761141014, 11904229, 61734005); Science and Technology Commission of Shanghai Municipality (STCSM) (17JC1400403); Shanghai Municipal Education Commission (SMEC) (2017-01-07-00-02-E00049); Shanghai Municipal Science and Technology Major Project (2019SHZDZX01).

Acknowledgment

The authors thank Jian-Wei Pan for helpful discussions. X.-M. J. acknowledges additional support from a Shanghai talent program and from Zhiyuan Innovative Research Center of Shanghai Jiao Tong University.

Disclosures

The authors declare no conflicts of interest.

Supplemental document

See Supplement 1 for supporting content.

REFERENCES

1. M. Jachura and R. Chrapkiewicz, “Shot-by-shot imaging of Hong–Ou–Mandel interference with an intensified sCMOS camera,” Opt. Lett. 40, 1540–1543 (2015). [CrossRef]

2. G. Brida, L. Caspani, A. Gatti, M. Genovese, A. Meda, and I. R. Berchera, “Measurement of sub-shot-noise spatial correlations without background subtraction,” Phys. Rev. Lett. 102, 213602 (2009). [CrossRef]  

3. R. Chrapkiewicz, M. Jachura, K. Banaszek, and W. Wasilewski, “Hologram of a single photon,” Nat. Photonics 10, 576–579 (2016). [CrossRef]  

4. K. Sun, J. Gao, M.-M. Cao, Z.-Q. Jiao, Y. Liu, Z.-M. Li, E. Poem, A. Eckstein, R.-J. Ren, X.-L. Pang, H. Tang, I. A. Walmsley, and X.-M. Jin, “Mapping and measuring large-scale photonic correlation with single-photon imaging,” Optica 6, 244–249 (2019). [CrossRef]  

5. X. Qiu, D. Zhang, W. Zhang, and L. Chen, “Structured-pump-enabled quantum pattern recognition,” Phys. Rev. Lett. 122, 123901 (2019). [CrossRef]  

6. J. C. Howell, R. S. Bennink, S. J. Bentley, and R. W. Boyd, “Realization of the Einstein-Podolsky-Rosen paradox using momentum- and position-entangled photons from spontaneous parametric down conversion,” Phys. Rev. Lett. 92, 210403 (2004). [CrossRef]  

7. R. S. Aspden, D. S. Tasca, R. W. Boyd, and M. J. Padgett, “EPR-based ghost imaging using a single-photon-sensitive camera,” New J. Phys. 15, 073032 (2013). [CrossRef]  

8. P.-A. Moreau, E. Toninelli, T. Gregory, R. S. Aspden, P. A. Morris, and M. J. Padgett, “Imaging bell-type nonlocal behavior,” Sci. Adv. 5, eaaw2563 (2019). [CrossRef]  

9. A. G. Basden, C. A. Haniff, and C. D. Mackay, “Photon counting strategies with low-light-level CCDs,” Mon. Not. R. Astron. Soc. 345, 985–991 (2003). [CrossRef]  

10. B. M. Jost, A. V. Sergienko, A. F. Abouraddy, B. E. A. Saleh, and M. C. Teich, “Spatial correlations of spontaneously down-converted photon pairs detected with a single-photon-sensitive CCD camera,” Opt. Express 3, 81–88 (1998). [CrossRef]  

11. L. Zhang, L. Neves, J. S. Lundeen, and I. A. Walmsley, “A characterization of the single-photon sensitivity of an electron multiplying charge-coupled device,” J. Phys. B 42, 114011 (2009). [CrossRef]  

12. J.-L. Blanchet, F. Devaux, L. Furfaro, and E. Lantz, “Purely spatial coincidences of twin photons in parametric spontaneous down-conversion,” Phys. Rev. A 81, 043825 (2010). [CrossRef]  

13. E. Toninelli, M. P. Edgar, P.-A. Moreau, G. M. Gibson, G. D. Hammond, and M. J. Padgett, “Sub-shot-noise shadow sensing with quantum correlations,” Opt. Express 25, 21826–21840 (2017). [CrossRef]  

14. G. Brida, M. Genovese, and I. R. Berchera, “Experimental realization of sub-shot-noise quantum imaging,” Nat. Photonics 4, 227–230 (2010). [CrossRef]  

15. Y. Wang, X.-L. Pang, Y.-H. Lu, J. Gao, Y.-J. Chang, L.-F. Qiao, Z.-Q. Jiao, H. Tang, and X.-M. Jin, “Topological protection of two-photon quantum correlation on a photonic chip,” Optica 6, 955–960 (2019). [CrossRef]  

16. X.-Y. Xu, X.-L. Huang, Z.-M. Li, J. Gao, Z.-Q. Jiao, Y. Wang, R.-J. Ren, H. P. Zhang, and X.-M. Jin, “A scalable photonic computer solving the subset sum problem,” Sci. Adv. 6, eaay5853 (2020). [CrossRef]  

17. T. Aidukas, P. C. Konda, A. R. Harvey, M. J. Padgett, and P.-A. Moreau, “Phase and amplitude imaging with quantum correlations through Fourier ptychography,” Sci. Rep. 9, 1–9 (2019). [CrossRef]  

18. R. S. Aspden, N. R. Gemmell, P. A. Morris, D. S. Tasca, L. Mertens, M. G. Tanner, R. A. Kirkwood, A. Ruggeri, A. Tosi, R. W. Boyd, G. S. Buller, R. H. Hadfield, and M. J. Padgett, “Photon-sparse microscopy: visible light imaging using infrared illumination,” Optica 2, 1049–1052 (2015). [CrossRef]  

19. V. Parodi, E. Jacchetti, R. Osellame, G. Cerullo, D. Polli, and M. T. Raimondi, “Nonlinear optical microscopy: From fundamentals to applications in live bioimaging,” Front. Bioeng. Biotechnol. 8, 585363 (2020). [CrossRef]  

20. M. Aßmann, F. Veit, M. Bayer, M. van der Poel, and J. M. Hvam, “Higher-order photon bunching in a semiconductor microcavity,” Science 325, 297–300 (2009). [CrossRef]  

21. J. Wiersig, C. Gies, F. Jahnke, M. Aßmann, T. Berstermann, M. Bayer, C. Kistner, S. Reitzenstein, C. Schneider, S. Höfling, A. Forchel, C. Kruse, J. Kalden, and D. Hommel, “Direct observation of correlations between individual photon emission events of a microcavity laser,” Nature 460, 245–249 (2009). [CrossRef]  

22. M. Aßmann, F. Veit, M. Bayer, C. Gies, F. Jahnke, S. Reitzenstein, S. Höfling, L. Worschech, and A. Forchel, “Ultrafast tracking of second-order photon correlations in the emission of quantum-dot microresonator lasers,” Phys. Rev. B 81, 165314 (2010). [CrossRef]  

23. O. Schwartz, J. M. Levitt, R. Tenne, S. Itzhakov, Z. Deutsch, and D. Oron, “Superresolution microscopy with quantum emitters,” Nano Lett. 13, 5832–5836 (2013). [CrossRef]

24. M. P. Edgar, D. S. Tasca, F. Izdebski, R. E. Warburton, J. Leach, M. Agnew, G. S. Buller, R. W. Boyd, and M. J. Padgett, “Imaging high-dimensional spatial entanglement with a camera,” Nat. Commun. 3, 1–6 (2012). [CrossRef]  

25. Y. Wen, B. J. Rauscher, R. G. Baker, M. C. Clampin, P. Fochie, S. R. Heap, G. Hilton, P. Jorden, D. Linder, B. Mott, P. Pool, A. Waczynski, and B. Woodgate, “Individual photon counting using e2v L3 CCDs for low background astronomical spectroscopy,” Proc. SPIE 6276, 463–470 (2006). [CrossRef]  

26. E. Lantz, J.-L. Blanchet, L. Furfaro, and F. Devaux, “Multi-imaging and Bayesian estimation for photon counting with EMCCDs,” Mon. Not. R. Astron. Soc. 386, 2262–2270 (2008). [CrossRef]  

27. M. T. McCann, K. H. Jin, and M. Unser, “Convolutional neural networks for inverse problems in imaging: A review,” IEEE Signal Process. Mag. 34(6), 85–95 (2017). [CrossRef]  

28. A. Lucas, M. Iliadis, R. Molina, and A. K. Katsaggelos, “Using deep neural networks for inverse problems in imaging: Beyond analytical methods,” IEEE Signal Process. Mag. 35(1), 20–36 (2018). [CrossRef]  

29. A. Katsaggelos, R. Molina, and J. Mateos, Super Resolution of Images and Video, Synthesis Lectures on Image, Video, and Multimedia Processing (Morgan & Claypool, 2007), p. 1.

30. Z. Chen, S. D. Babacan, R. Molina, and A. K. Katsaggelos, “Variational Bayesian methods for multimedia problems,” IEEE Trans. Multimedia 16, 1000–1017 (2014). [CrossRef]  

31. V. Jain and S. Seung, “Natural image denoising with convolutional networks,” in Advances in Neural Information Processing Systems (2009), pp. 769–776.

32. H. C. Burger, C. J. Schuler, and S. Harmeling, “Image denoising: Can plain neural networks compete with BM3D?” in IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2012), pp. 2392–2399.

33. J. Xie, L. Xu, and E. Chen, “Image denoising and inpainting with deep neural networks,” in Advances in Neural Information Processing Systems (2012), pp. 341–349.

34. R. Wang and D. Tao, “Non-local auto-encoder with collaborative stabilization for image restoration,” IEEE Trans. Image Process. 25, 2117–2129 (2016). [CrossRef]  

35. Y. Chen and T. Pock, “Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration,” IEEE Trans. Pattern Anal. Mach. Intell. 39, 1256–1272 (2016). [CrossRef]  

36. K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, “Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising,” IEEE Transactions on Image Process. 26, 3142–3155 (2017). [CrossRef]  

37. Z. T. Harmany, R. F. Marcia, and R. M. Willett, “This is SPIRAL-TAP: Sparse Poisson intensity reconstruction algorithms–theory and practice,” IEEE Trans. Image Process. 21, 1084–1096 (2012). [CrossRef]  

38. G. Cohen, S. Afshar, J. Tapson, and A. van Schaik, “EMNIST: an extension of MNIST to handwritten letters,” arXiv:1702.05373 (2017).

39. Y. LeCun, C. Cortes, and C. Burges, “MNIST handwritten digit database,” AT&T Labs, 2010, http://yann.lecun.com/exdb/mnist.

40. X.-J. Mao, C. Shen, and Y.-B. Yang, “Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections,” in Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS (2016), pp. 2810–2818.

41. V. Badrinarayanan, A. Kendall, and R. Cipolla, “SegNet: A deep convolutional encoder-decoder architecture for image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell. 39, 2481–2495 (2017). [CrossRef]

42. H. Tang, C. Di Franco, Z.-Y. Shi, T.-S. He, Z. Feng, J. Gao, K. Sun, Z.-M. Li, Z.-Q. Jiao, T.-Y. Wang, M. S. Kim, and X.-M. Jin, “Experimental quantum fast hitting on hexagonal graphs,” Nat. Photonics 12, 754–758 (2018). [CrossRef]  

43. P. A. Morris, R. S. Aspden, J. E. Bell, R. W. Boyd, and M. J. Padgett, “Imaging with a small number of photons,” Nat. Commun. 6, 1–6 (2015). [CrossRef]  

44. Y. Wang, Y.-H. Lu, F. Mei, J. Gao, Z.-M. Li, H. Tang, S.-L. Zhu, S. Jia, and X.-M. Jin, “Direct observation of topology from single-photon dynamics,” Phys. Rev. Lett. 122, 193903 (2019). [CrossRef]  

45. A. Kirmani, D. Venkatraman, D. Shin, A. Colaço, F. N. Wong, J. H. Shapiro, and V. K. Goyal, “First-photon imaging,” Science 343, 58–61 (2014). [CrossRef]  

46. D. Shin, F. Xu, D. Venkatraman, R. Lussana, F. Villa, F. Zappa, V. K. Goyal, F. N. Wong, and J. H. Shapiro, “Photon-efficient imaging with a single-photon camera,” Nat. Commun. 7, 1–8 (2016). [CrossRef]  

Supplementary Material (1)

Supplement 1: Supplemental document
