## Abstract

Ghost imaging incorporating deep learning technology has recently attracted much attention in the optical imaging field. However, deterministic illumination and multiple exposures are still essential in most scenarios. Here we propose a ghost imaging scheme based on a novel dynamic decoding deep learning framework (Y-net), which works well under both deterministic and indeterministic illumination. Benefiting from the end-to-end characteristic of our network, the image of a sample can be obtained directly from the data collected by the detector. The sample is illuminated only once in the experiment, and the spatial distribution of the speckle encoding the sample in the experiment can be completely different from that of the simulation speckle used in training, as long as the statistical characteristics of the speckle remain unchanged. This approach is particularly important for high-resolution x-ray ghost imaging applications due to its potential for improving image quality and reducing radiation damage.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

Ghost imaging (GI) extracts the information of an object by measuring the intensity correlation of optical fields, and has been widely applied in remote sensing, super-resolution, x-ray imaging, atom and electron imaging, etc. [1–13]. In traditional GI, a large number of measurements are required to calculate the ensemble average as an unbiased estimate of the sample’s image, which places a heavy burden on the imaging system. Later on, the compressive sensing framework was combined with GI schemes, and the image quality was greatly improved by exploiting the sparse prior of objects [14–19]. To improve the sampling efficiency, some studies have drawn on coding theory and designed the sensing matrix by optimizing the illuminating optical fields [20,21]. In the meantime, computational ghost imaging has emerged [22–24]. It is a deterministic measuring process, in which the incident light is preset or prerecorded. Recently, deep learning techniques have been introduced into computational ghost imaging, and the measurement rate has dropped to a level comparable with, or even lower than, that of compressive sensing [25–27]. In these works, the illuminating speckles encoding the sample are static, i.e., the same during the training and imaging processes. However, in many GI scenarios, such as high-resolution x-ray ghost imaging, particle ghost imaging, and some remote sensing applications, the intensity distribution of the illumination fluctuates randomly and is difficult to manipulate precisely [7,9,10,12]. Besides, due to the quantum nature of photons, this static condition may not be satisfied when the light is weak [28,29].

As a widespread machine learning framework, deep learning has demonstrated its power in many fields. In the optical imaging literature, deep-learning-inspired approaches have been demonstrated in compressive sensing [30,31], scattering imaging [32–36], super-resolution [37,38], microscopy [39,40] and phase retrieval [28,41]. The imaging problem can be expressed as an optimization process

$$\hat{\mathbf {x}} = \mathop{\arg \min }_{\mathbf {x}} \left \| \mathbf {y} - \Theta (\mathbf {x}) \right \|_2^2 + \gamma \Phi (\mathbf {x}), \tag{1}$$

where $\mathbf {x}$ is the signal, $\mathbf {y}$ is the measurement, $\|\cdot \|_2$ is the $\ell _2$ norm, $\Theta$ is the forward operation, and $\Phi$ and $\gamma$ are the regularization operation and its weight factor, respectively. A typical deep learning strategy is to learn a representation $\Psi$ of the signal $\mathbf {x}$ from the data set $\{\mathbf {x}\}$ and then optimize the latent variable ${\mathbf {z}}$ [30]. Note that $\mathbf {x} = \Psi (\mathbf {z})$ and the latent variable $\mathbf {z}$ usually belongs to a lower-dimensional space, so this optimization is more convenient than finding $\mathbf {x}$ directly. Another strategy is to learn the inverse of the forward operation $\Theta$ and the regularization operation $\Phi$ through the data set $\{\mathbf {x},\mathbf {y}\}$ [32–34,41]. Physics-informed priors are important for generalization during the training phase [28,42]. In this strategy, the learned map $f \colon \mathbf {y} \mapsto \mathbf {x}$ is directly related to the measurement $\mathbf {y}$, so this end-to-end map relies on a specifically determined $\Theta$. The problem becomes more challenging for a dynamical system in which the forward operation $\Theta$ is not deterministic. Some studies have exploited the memory effect of scattering media to characterize the statistical similarity that is invariant in scattering imaging [43,44]. Li et al. obtained the network map using a set of fixed diffusers after a long period of training data acquisition [36], and multiple scattering images are inevitably necessary to capture sufficient statistical variations.

In this paper, we propose a dynamic decoding deep learning framework (Y-net) for GI systems in which the intensity distribution of the illuminating light is indeterministic. A ghost imaging scheme based on the Y-net has been demonstrated. The network is trained with simulation data, and the testing results show that it works well with experimental data owing to its dynamic capability.
In our scheme, the sample needs to be illuminated only once and the image can be obtained when the illuminating light is very weak. Thus, it provides potential application in x-ray imaging, in which the exposure should be reduced as much as possible considering the radiation damage of samples. In addition, our approach is based on an end-to-end network, so that the image of a sample can be directly obtained from the data collected by the detector without extra processing, such as initial input image estimation, subsequent phase recovery, etc.
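To make the optimization in Eq. (1) concrete, the following toy sketch (our own illustration, not the paper's code: a linear operator stands in for $\Theta$ and an $\ell_2$ penalty stands in for $\Phi$) solves a regularized least-squares instance by gradient descent and checks it against the closed-form ridge solution:

```python
import numpy as np

# Toy instance of Eq. (1): minimize ||y - A x||_2^2 + gamma * ||x||_2^2.
# The linear operator A and the l2 penalty are illustrative assumptions.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))   # forward operator: 30 measurements, 10 unknowns
x_true = rng.standard_normal(10)    # ground-truth signal
y = A @ x_true                      # noiseless measurement
gamma = 0.1                         # regularization weight

x = np.zeros(10)
step = 0.5 / (np.linalg.norm(A, 2) ** 2 + gamma)  # stable step size
for _ in range(5000):
    grad = 2 * A.T @ (A @ x - y) + 2 * gamma * x  # gradient of the objective
    x -= step * grad

# The minimizer has the closed form (A^T A + gamma I)^{-1} A^T y.
x_closed = np.linalg.solve(A.T @ A + gamma * np.eye(10), A.T @ y)
```

After enough iterations the two solutions agree to machine precision, illustrating why learning a surrogate for this inverse map is attractive when $\Theta$ is not linear or not even deterministic.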

## 2. Methods

#### 2.1 Imaging scheme

Figure 1 shows the scheme of our method, including the optical setup of the ghost imaging system and the data flow illustrations. As shown in Fig. 1(a), a laser beam illuminates a rotating ground glass to produce pseudo-thermal light, and a diaphragm behind the ground glass is used to control the source size $\sigma _s$. A beam splitter divides the incident light into two beams: a reference beam propagating directly to the detector and a test beam with the sample inserted in the optical path. In the test beam, the distances from source to sample and from sample to detector are $d_1$ and $d_2$, respectively. In the reference beam, the distance from source to detector is $d = d_1 + d_2$. The speckle distribution in the reference beam is recorded by ${\rm CCD_1}$, which serves as an ordinary panel detector. To improve the measurement efficiency, a strategy of parallel sampling is adopted in the test beam, where ${\rm CCD_2}$ serves as a multi-point detector that collects the intensity signals of ${\rm M}$ points simultaneously. These M intensity values, placed at the positions recorded by the detector, constitute the M nonzero pixels of a sparse image. The speckle image and the constructed sparse image are transferred to a well-trained Y-net as the input, and the output of the network is the image of the sample. Figure 1(b) describes the data flow in the training and testing process. During the training stage, one diffuser is randomly generated for each training sample, i.e., different diffusers for different samples. The corresponding reference speckle and M intensities are obtained by simulation and fed to the network. The temporary output of the network is then compared with the label, and the parameters of the network are updated according to the loss function.
During the testing stage, a real sample in the optical system is illuminated by the speckle generated with a diffuser unseen in training. Then the speckle and M intensities acquired by the detectors are input into the network to get the image of the sample.
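The construction of the sparse image from the M sampled intensities can be sketched as follows (a minimal illustration; the $64\times 64$ grid and the random sampling positions are assumptions for the example):

```python
import numpy as np

def build_sparse_image(positions, values, shape=(64, 64)):
    """Place the M sampled intensities at their detector positions;
    all other pixels stay zero (sketch of the sparse-image construction
    described in the text; the array size is an assumption)."""
    img = np.zeros(shape)
    rows, cols = positions[:, 0], positions[:, 1]
    img[rows, cols] = values
    return img

# Example: M = 1024 sampling points on a 64x64 detector grid.
rng = np.random.default_rng(1)
M = 1024
flat = rng.choice(64 * 64, size=M, replace=False)          # distinct pixels
positions = np.stack(np.unravel_index(flat, (64, 64)), axis=1)
values = rng.random(M) + 0.1                               # measured intensities
sparse = build_sparse_image(positions, values)
print(np.count_nonzero(sparse))  # 1024
```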

The speckle distribution recorded by ${\rm CCD_1}$ can be described as follows:

$$I_r(x_r) = \left | \int E_0(x_0)\, h_d(x_0, x_r)\, \mathrm{d}x_0 \right |^2, \tag{2}$$

where $E_0(x_0)$ represents the optical field on the source plane, and $h_d$ is the free-space transfer function from the source to ${\rm CCD_1}$. The intensity distribution on the detector plane of ${\rm CCD_2}$ is

$$I_t(x_t) = \left | \iint E_0(x_0)\, h_{d_1}(x_0, x)\, t(x)\, h_{d_2}(x, x_t)\, \mathrm{d}x_0\, \mathrm{d}x \right |^2, \tag{3}$$

where $t(x)$ is the transmission function of the sample. In traditional Fourier-transform ghost imaging (FGI), when $d = d_1 + d_2$ is satisfied, an ensemble average operation $\langle \cdot \rangle$ over the intensity fluctuations is used to obtain the Fourier-transform pattern of the sample, which is [7,45]

$$\left \langle \Delta I_r(x_r)\, \Delta I_t(x_t) \right \rangle \propto \left | \tilde{t}\!\left ( \frac{x_r + x_t}{\lambda d_2} \right ) \right |^2,$$

where $\Delta I = I - \langle I \rangle$ and $\tilde{t}$ denotes the Fourier transform of the sample's transmission function.
In our Y-net based GI scheme, the two steps of Fourier-transform pattern acquisition and phase retrieval are integrated. The imaging problem can be modeled as Eq. (1), where the forward operation $\Theta$ is a composite operation $\mathbf {A}|\mathbf {T}(\cdot )|^2$. Here $\mathbf {x}$ represents the image of the object, $\mathbf {T}$ denotes the Fourier-transform matrix, and $|\cdot |^2$ is the point-wise square of modulus. The speckle in the reference beam is randomly distributed and indeterministic, and so are the operation $\mathbf {A}$ and the composite operation $\mathbf {A}|\mathbf {T}(\cdot )|^2$. To describe this indeterministic system, we modify the model in Eq. (1) by adding an extra penalty concerning the dynamic measuring process; the imaging problem can then be expressed as

$$\hat{\mathbf {x}} = \mathop{\arg \min }_{\mathbf {x}} \left \| \mathbf {y} - \mathbf {A} \left | \mathbf {T} \mathbf {x} \right |^2 \right \|_2^2 + \gamma \Phi (\mathbf {x}) + \eta \, \Pi (\mathbf {A}),$$

where $\Pi$ is the penalty term accounting for the dynamic measuring process and $\eta$ is its weight factor.
We solve this problem under the framework of deep learning. Instead of optimizing the signal $\mathbf {x}$, we optimize the model parameters by exploiting the training data and try to establish a direct map $f \colon \mathbf {y} \mapsto \mathbf {x}$ that yields the image of a sample directly from the measurement. The optimization is over the network parameters $\Omega ^{-1} = \{\Theta ^{-1}, \Pi ^{-1}, \Phi ^{-1}\}$, and we have

$$\{\hat{\Theta }^{-1}, \hat{\Pi }^{-1}, \hat{\Phi }^{-1}\} = \mathop{\arg \min }_{\Omega ^{-1}} \sum _{\{\mathbf {x}, \mathbf {y}\}} \left \| \mathbf {x} - f_{\Omega ^{-1}}(\mathbf {y}) \right \|_2^2 .$$

#### 2.2 Network architecture and training

The overall structure of our Y-net consists of two encoders and one decoder, as shown in Fig. 2. The speckle and intensities recorded by the detectors are input into the two encoders separately in a symmetric way. The M intensities are transferred to the network as a sparse, non-overlapping intensity distribution. Each encoder is composed of five convolutional layers, with a batch normalization layer before the first convolutional layer and a max-pooling layer after each of the other four convolutional layers. The two encoders are then merged by subtraction, and a decoder is built to recover the signal. The decoder has four upsampling layers, a dropout layer, and ten convolutional layers. More specifically, the decoder path first goes through four upsampling layers, each followed by a convolutional layer, then passes a dropout layer followed by a convolutional layer, next a convolutional layer with a stride of two, and finally four successive convolutional layers in zero-padding mode. The max-pooling size and the upsampling size are both $2$. Unless otherwise stated, all the convolutional filters have a size of $4\times 4$ and the padding is $1$. All the layers are followed by a rectified linear unit serving as the activation function, except for the last layer, which is handled by a sigmoid function to restrict the range of the pixel values.
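A simplified sketch of this architecture (our own approximation, not the authors' code: channel counts are illustrative, the encoder and most decoder convolutions use $3\times 3$ kernels for shape bookkeeping, and only the last five decoder convolutions use $4\times 4$ kernels, which shrink the $32\times 32$ feature maps to a $28\times 28$ output) could look like this in PyTorch:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Five conv layers: batch norm before the first, max-pooling after
    each of the last four (channel counts are illustrative choices)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.BatchNorm2d(1),
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
    def forward(self, x):
        return self.net(x)  # (B, 128, 4, 4) for a 64x64 input

class YNet(nn.Module):
    """Dual encoders merged by subtraction, then a decoder whose last
    four 4x4 convolutions shrink 32x32 feature maps down to 28x28."""
    def __init__(self, p_drop=0.6):
        super().__init__()
        self.enc_speckle = Encoder()
        self.enc_sparse = Encoder()
        def up(ci, co):  # upsampling stage: 2x upsample + conv + ReLU
            return [nn.Upsample(scale_factor=2),
                    nn.Conv2d(ci, co, 3, padding=1), nn.ReLU()]
        self.dec = nn.Sequential(
            *up(128, 64), *up(64, 32), *up(32, 16), *up(16, 16),   # 4 -> 64
            nn.Dropout2d(p_drop),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 4, stride=2, padding=1), nn.ReLU(),  # 64 -> 32
            nn.Conv2d(16, 16, 4, padding=1), nn.ReLU(),            # 32 -> 31
            nn.Conv2d(16, 16, 4, padding=1), nn.ReLU(),            # 31 -> 30
            nn.Conv2d(16, 16, 4, padding=1), nn.ReLU(),            # 30 -> 29
            nn.Conv2d(16, 1, 4, padding=1), nn.Sigmoid(),          # 29 -> 28
        )
    def forward(self, speckle, sparse):
        return self.dec(self.enc_speckle(speckle) - self.enc_sparse(sparse))

model = YNet()
out = model(torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64))
```

Note how the $4\times 4$ kernels with padding $1$ reduce each spatial dimension by one per layer, which is how a $32\times 32$ feature map lands exactly on the $28\times 28$ output.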

Our model is inspired by the variational autoencoder (VAE) [46], which is widely applied in unsupervised learning. The batch-normalization layer is necessary to accelerate the convergence of the loss function. The filter size of the convolutional layers is proportional to the speckle size in the reference beam. The speckle size in our case is $16.7\ \mu \mathrm{m}$ (2.85 pixels), so we chose $4\times 4$ filters. Merging the embeddings contributes to the dynamic capability of our network; if the information coming from the reference speckle is blocked, the network loses the ability to deal with dynamic illumination. The dropout layer is used to avoid overfitting, and the dropout rate $p$ is carefully chosen to guarantee the generalization of the network. We examined the impact of different dropout rates and chose $p = 0.6$.

This dual-encoder network is designed to extract the image information of a sample from the correlated speckle and intensity data recorded by the two detectors. From the perspective of coding theory, the optical measurement process can be regarded as the encoding part of our imaging scheme. The information of a sample is encoded in the intensity signals detected in the test beam, while the original optical field is observed by the detector in the reference beam. In the training process, our Y-net learns the encoding protocol of the optical system from a large number of correlated data, and acquires the capability of decoding sample information directly from the raw speckle and intensity data. Thus, after training, the network serves as the decoding part of the imaging system. This model is particularly useful when the encoding process of the imaging system varies dynamically.

It is expensive and time-consuming to acquire experimental data for training. Fortunately, owing to the dynamic capability of our network, it can be trained with speckle and intensity data generated by simulation. The virtual measurement data of speckle and intensities can be derived from image data sets, which makes it possible to obtain a variety of training data more easily and quickly. For each image in a given image database, we place it in the simulated GI system as a sample and randomly generate the phase distribution of the optical field on the source plane. One frame of speckle in the reference beam and a set of intensity data in the test beam can then be synthesized according to Eqs. (2), (3) and (4). The distance parameters and the diameter of the source used in the simulation are the same as those in the experiment. The simulation process is very fast, costing a few milliseconds per sample (typically, for a sample on a $28\times 28$ grid and a detector with $64\times 64$ pixels). All the virtual measurement data derived from different image samples, together with the sample images, constitute the training data set. In this paper, 60000 images from the MNIST database are used to generate the speckle and intensity data for training, and the remaining images in the database are used for validation. The simulation data set was obtained in about 1 minute.
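The virtual measurement synthesis described above can be sketched as follows (heavily simplified: an angular-spectrum propagator stands in for the transfer functions $h_d$, $h_{d_1}$, $h_{d_2}$, and the grid spacing, source aperture, and toy sample are illustrative assumptions, not the paper's exact parameters):

```python
import numpy as np

def angular_spectrum(field, wavelength, dx, dist):
    """Free-space propagation over `dist` via the angular-spectrum method
    (a stand-in for the transfer function h_d in the text)."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    arg = 1 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    H = np.exp(2j * np.pi * dist / wavelength * np.sqrt(np.maximum(arg, 0)))
    return np.fft.ifft2(np.fft.fft2(field) * H)

def simulate_measurement(sample, rng, wavelength=532e-9, dx=5.86e-6 * 8,
                         d1=0.05, d2=0.201, n=64):
    """One virtual measurement: a random-phase source yields one reference
    speckle frame and the test-beam intensity distribution."""
    src = np.zeros((n, n), complex)
    c = n // 2
    # small illuminated aperture with a fresh random phase per sample
    src[c - 2:c + 3, c - 2:c + 3] = np.exp(
        1j * rng.uniform(0, 2 * np.pi, (5, 5)))
    ref = np.abs(angular_spectrum(src, wavelength, dx, d1 + d2)) ** 2
    at_sample = angular_spectrum(src, wavelength, dx, d1)
    test = np.abs(angular_spectrum(at_sample * sample, wavelength, dx, d2)) ** 2
    return ref, test

rng = np.random.default_rng(2)
sample = np.zeros((64, 64)); sample[24:40, 24:40] = 1.0  # toy transmission
ref, test = simulate_measurement(sample, rng)
```

Sampling M pixels of `test` then gives the virtual multi-point data, and a new random phase per image reproduces the "different diffusers for different samples" condition.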

Many types of loss functions can be chosen to train the network. For better performance, the prior image structure of the samples should be considered. For example, the average binary cross-entropy (BCE) promotes sparsity [47] and is suitable for samples with simple structures. It is defined by

$$\mathcal{L}_{\mathrm{BCE}} = -\frac{1}{N} \sum _{i=1}^{N} \left [ Q_i \log P_i + (1 - Q_i) \log (1 - P_i) \right ],$$

where $N$ is the pixel number, and $Q_i$ and $P_i$ are the $i$-th pixel values of the target $Q$ and the output $P$, respectively. For natural grayscale images, the mean-square error (MSE) is a better choice. The Adam optimizer is used to update the parameters with the initial learning rate $r = 0.002$ and $\beta _1 = 0.9$, $\beta _2 = 0.99$. The total number of training epochs is 250, and the training process took about 14 hours. After training, the result for an experimental sample can be given within several milliseconds. All computations, including training and evaluation, were performed on a workstation (Intel Xeon CPU and 4 Nvidia GeForce 1080 Ti GPUs).

## 3. Results and discussions

#### 3.1 Experiments

A 532 nm laser was adopted in the experiment, the distance parameters were $d_1 = 5.0$ cm and $d_2 = 20.1$ cm, and the source diameter was $\sigma _s = 1$ mm. The pixel size of the ${\rm CCD}$ was $5.86\times 5.86\ \mu \mathrm{m}^2$ and the number of pixels was $512\times 512$. The speckle patterns recorded by the ${\rm CCD}$ were binned to $64\times 64$ pixels and normalized before being transferred to the well-trained network. We compared the speckle patterns collected in the experiment with those generated by simulation. Figure 3(a) is a typical speckle image recorded by ${\rm CCD_1}$, and Fig. 3(c) is a speckle image in the reference beam generated by simulation as described in the training process. Obviously, the spatial distributions of the two images are different. We calculated the corresponding second-order auto-correlations of the two speckle images, and the results are shown in Figs. 3(b) and (d). The statistical characteristics of the two speckle images are almost the same. This is why our network can be trained with simulation data and then applied to experimental data.
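The second-order auto-correlation used to compare Figs. 3(b) and (d) can be computed efficiently via the Wiener–Khinchin theorem; the sketch below (with a synthetic speckle frame standing in for the recorded data) illustrates the idea:

```python
import numpy as np

def second_order_autocorr(intensity):
    """Normalized second-order autocorrelation g2 of one speckle frame,
    computed with the Wiener-Khinchin theorem (circular boundaries)."""
    n = intensity.size
    power = np.abs(np.fft.fft2(intensity)) ** 2
    corr = np.real(np.fft.ifft2(power)) / n        # <I(x) I(x + dx)>
    g2 = corr / intensity.mean() ** 2
    return np.fft.fftshift(g2)                     # zero shift at the center

# Toy speckle frame: squared modulus of low-pass-filtered complex noise.
rng = np.random.default_rng(3)
field = rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64))
ft = np.fft.fft2(field)
mask = np.zeros((64, 64)); mask[:8, :8] = 1        # limit spatial frequencies
speckle = np.abs(np.fft.ifft2(ft * mask)) ** 2
g2 = second_order_autocorr(speckle)
```

The peak of `g2` sits at zero displacement (index `[32, 32]` after the shift), and its width reflects the speckle grain size — the statistic that must match between simulation and experiment.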

We tested five samples ("1", "2", "4", "6", "9") fabricated from stainless steel in the experiment. The sample size is $1 \times 1\ \mathrm{mm}^2$. They were chosen from the testing part of the MNIST database and never appeared in the training process. Figure 4 presents the experimental results of the five samples. The original images of the samples are displayed in Fig. 4(a). Figures 4(b) and (c) are the corresponding patterns recorded by the detectors in the reference beam and the test beam, respectively. It can be observed that the patterns in Fig. 4(b) are different from each other, which indicates that the five samples were illuminated by completely different speckles. The probability of these speckles appearing in the training data set is almost zero. To make this clear, we estimated the number of possible speckles; it is about $6 \times 10^{151}$ (the speckle distribution is $64 \times 64$ pixels, the speckle size is $2.85 \times 2.85$ pixels, and the detector has at least 1 bit of depth), which is far larger than the 60000 speckles used in the training stage. Therefore, the illumination light in the experiment is randomly distributed and varies dynamically from sample to sample. This is why we describe the measurement as a dynamic coding process. Figures 4(d) and (e) give the outcome of our network with different sampling numbers ${\rm M} = 4096$ and ${\rm M} = 1024$, respectively. For each sample, only one frame of reference speckle and M intensities were utilized by the network to extract the image of the sample. Specifically, we used all the intensities in Fig. 4(c) when ${\rm M} = 4096$ and a quarter of them when ${\rm M} = 1024$. The reconstructed images in Fig. 4(d) are better than those in Fig. 4(e) owing to the higher sampling rate. As a comparison, we processed the speckle data using the traditional FGI method according to Eq. (5), in which all the pixels were used to calculate the ensemble average, and the widely used hybrid input-output algorithm [48] was adopted for phase retrieval. The results are shown in Fig. 4(f). Unfortunately, there is almost no sign of digits in these images. Note that the experimental speckle and intensity data in Figs. 4(b) and (c) do not appear in the training data set. Thus, our method works well with this indeterministic GI system, in which the illumination light varies dynamically.
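The estimate of about $6 \times 10^{151}$ possible speckles can be reproduced with a quick back-of-the-envelope calculation:

```python
import math

# A 64x64-pixel speckle field with 2.85x2.85-pixel grains has about
# (64/2.85)^2 independent speckle cells; resolving each cell at just
# 1 bit of intensity already allows 2^(cells) distinct patterns.
cells = (64 / 2.85) ** 2              # ~504 independent speckle cells
log10_count = cells * math.log10(2)   # ~151.8, i.e. ~6e151 patterns
print(f"~10^{log10_count:.1f} possible speckle patterns")
```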

To assess the quality of the imaging results, we used two evaluation metrics: the structural similarity index (SSIM) and the peak signal-to-noise ratio (PSNR). The SSIM is defined by [49]

$$\mathrm{SSIM} = \frac{(2\mu _P \mu _Q + c_1)(2\sigma _{PQ} + c_2)}{(\mu _P^2 + \mu _Q^2 + c_1)(\sigma _P^2 + \sigma _Q^2 + c_2)},$$

where $\mu _P$ and $\mu _Q$ are the mean values of the output $P$ and the target $Q$, $\sigma _P^2$ and $\sigma _Q^2$ are their variances, $\sigma _{PQ}$ is their covariance, and $c_1$, $c_2$ are small constants that stabilize the division.
To test the stability of our method, we repeated the experiments with two other stainless steel samples under dynamic illumination. Figure 5(a) shows the original samples. Figures 5(b)–(d) are the speckle patterns of the reference beam recorded by the detector at different times. These patterns are obviously different from each other, which means the samples were illuminated by different speckles at different times. From the results in Figs. 5(e)–(g), it can be seen that the output of our network is stable. The average SSIM of this repeated experiment is 0.67 for the digit "3" and 0.62 for the digit "7".

#### 3.2 Simulations

We investigated the performance of our network in static and dynamic situations by simulation experiments. In the static experiments, the imaging process remained unchanged, which means the illuminating light was deterministic. To simulate the static GI system, we fixed the random seeds when generating the speckle distribution, so that the illumination speckle for each sample was identical. Based on the same network architecture described in Section 2.2, we trained the network again with the data set derived for the static GI system. We then tested the network with a set of different digits shown in Fig. 6(a), and Fig. 6(b) presents the results. The images of the samples are successfully obtained with high image quality. In the dynamic experiments, we tested the stability of the network output when the illuminating speckle was generated randomly. We chose 10 digits from the testing part of the MNIST database and repeated the experiment 10 times for each digit sample. Figure 7 displays the results. Although the input speckles are different, as shown in Fig. 7(c), the network outputs presented in Fig. 7(b) are consistent with the original images in Fig. 7(a). This indicates that our Y-net is stable and reliable for GI systems exploiting dynamic illumination.
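The fixed-seed trick used to emulate a static GI system can be illustrated in a few lines (the grid size is an arbitrary choice):

```python
import numpy as np

def speckle_phase(seed=None):
    """Random source phase: a fixed seed reproduces the same speckle
    every time (static GI), while no seed gives fresh illumination on
    every call (dynamic GI)."""
    rng = np.random.default_rng(seed)
    return rng.uniform(0, 2 * np.pi, (64, 64))

static_a, static_b = speckle_phase(seed=42), speckle_phase(seed=42)
dynamic_a, dynamic_b = speckle_phase(), speckle_phase()
print(np.array_equal(static_a, static_b))    # True: identical illumination
print(np.array_equal(dynamic_a, dynamic_b))  # False (with probability ~1)
```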

We also investigated the relationship between the number of parallel sampling points and the quality of the reconstructed image through further simulation experiments. The Frey faces database [50], which includes 1965 faces, was adopted. 1900 faces in this database were used to generate the training data set, and the remaining faces were used as the testing samples. The original images are 8-bit grayscale images with dimensions of $28\times 20$. The dimension of the network output is $28\times 28$, so we cropped the output tensor to $28\times 20$ to ensure the training was carried out correctly. Considering that the face samples are not sparse, we used the MSE loss function in the training stage. The results show that the image quality is positively related to the sampling number, and nearly perfect images can be achieved when the sampling number reaches 650. Figure 8 presents the results with different sampling numbers. Figure 8(a) shows the original images from the testing data set, and Figs. 8(b) and (c) are the reconstructed images corresponding to M = 650 and M = 50, respectively. It can be observed that the images in Fig. 8(b) are very similar to the original images in Fig. 8(a). We repeated the experiment with different sampling numbers (M = 50, 200, 350, 500, 650, 800, 950, 1100) and analyzed the SSIM and PSNR between the reconstructed images and the original images. The corresponding means and standard deviations were also calculated. Figure 9 shows the results. As the sampling number increases, the SSIM and PSNR tend to increase, and the growth saturates around M = 650.
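The PSNR metric used here is straightforward to implement; the sketch below assumes images normalized to $[0, 1]$:

```python
import numpy as np

def psnr(reference, reconstruction, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((reference - reconstruction) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

# Sanity check: a uniform error of 0.1 gives MSE = 0.01, i.e. 20 dB.
ref = np.full((28, 20), 0.5)
rec = ref + 0.1
print(psnr(ref, rec))  # 20.0
```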

It should be mentioned that our method is particularly useful when the illuminating light is weak. In some applications, the photons arriving at the detector may not be adequate to form a well-developed speckle pattern. Here we use the photons per pixel ($PPP$) to describe the photon efficiency for imaging [51]. It is defined by

$$PPP = \frac{N_{ph}}{N},$$

where $N_{ph}$ is the number of photons actually measured, and $N$ is the total number of pixels. Figure 10 gives the relevant results. Figure 10(a) shows the original images. When the light intensity is weak, with $PPP = 0.5$, the partially formed speckle and the reconstructed images are shown in Figs. 10(b) and (c), respectively. Although much information is lost in the few-photon case, the reconstruction from the partially formed speckle data is still successful. As a comparison, the well-formed speckle with $PPP = 100$ and its reconstruction results are shown in Figs. 10(d) and (e), respectively. The average SSIM and PSNR for $PPP = 0.5$ are 0.62 and 12.13, while for $PPP = 100$ they are 0.83 and 18.12.

## 4. Conclusion

In summary, we have demonstrated a ghost imaging scheme based on Y-net, a dynamic decoding deep learning framework that can reconstruct sample images in both static and dynamic GI systems. As long as the statistical characteristics of the illuminating light remain unchanged, the image of a sample can be successfully obtained even if the spatial distribution of the illuminating light is indeterministic. Due to its dynamic capability, Y-net can be applied to experimental data after training with simulation data. Thus, it avoids the common difficulty of insufficient training data in learning-based imaging methods. This dynamic capability also makes imaging with weak illuminating light feasible: the results show that the sample image can still be obtained when fewer than 1 photon per pixel is collected by the detector. Moreover, the sample is illuminated only once in our scheme. Finally, compared with traditional Fourier-transform GI techniques, which involve an ill-posed phase retrieval problem, Y-net ghost imaging is more convenient and can greatly improve the sampling efficiency and image quality. Benefiting from the end-to-end characteristic of the Y-net, sample images can be extracted directly from the raw data collected by the detector. All these features are of great significance for GI applications, especially for imaging under dynamic illumination that requires a single exposure of the sample. For example, in high-resolution x-ray ghost imaging, it has the potential to achieve high-quality images and reduce radiation damage.

In the future, further investigation is needed to improve this proof-of-concept model, such as exploiting physics-informed priors to improve image quality, building virtual data sets to deal with complex samples, optimizing the network architecture to handle large-scale images, etc.

## Funding

National Natural Science Foundation of China (11627811); National Key Research and Development Program of China (2017YFB0503300, 2017YFB0503303).

## Disclosures

The authors declare no conflicts of interest.

## References

**1. **G. Scarcelli, V. Berardi, and Y. Shih, “Can two-photon correlation of chaotic light be considered as correlation of intensity fluctuations?” Phys. Rev. Lett. **96**(6), 063602 (2006). [CrossRef]

**2. **M. J. Padgett and R. W. Boyd, “An introduction to ghost imaging: quantum and classical,” Philos. Trans. R. Soc., A **375**(2099), 20160233 (2017). [CrossRef]

**3. **R. E. Meyers, K. S. Deacon, and Y. Shih, “Turbulence-free ghost imaging,” Appl. Phys. Lett. **98**(11), 111115 (2011). [CrossRef]

**4. **M. Bina, D. Magatti, M. Molteni, A. Gatti, L. A. Lugiato, and F. Ferri, “Backscattering differential ghost imaging in turbid media,” Phys. Rev. Lett. **110**(8), 083901 (2013). [CrossRef]

**5. **D.-J. Zhang, H.-G. Li, Q.-L. Zhao, S. Wang, H.-B. Wang, J. Xiong, and K. Wang, “Wavelength-multiplexing ghost imaging,” Phys. Rev. A **92**(1), 013823 (2015). [CrossRef]

**6. **B. I. Erkmen, “Computational ghost imaging for remote sensing,” J. Opt. Soc. Am. A **29**(5), 782–789 (2012). [CrossRef]

**7. **H. Yu, R. Lu, S. Han, H. Xie, G. Du, T. Xiao, and D. Zhu, “Fourier-transform ghost imaging with hard x rays,” Phys. Rev. Lett. **117**(11), 113901 (2016). [CrossRef]

**8. **D. Pelliccia, A. Rack, M. Scheel, V. Cantelli, and D. M. Paganin, “Experimental x-ray ghost imaging,” Phys. Rev. Lett. **117**(11), 113902 (2016). [CrossRef]

**9. **R. I. Khakimov, B. Henson, D. Shin, S. Hodgman, R. Dall, K. Baldwin, and A. Truscott, “Ghost imaging with atoms,” Nature **540**(7631), 100–103 (2016). [CrossRef]

**10. **R. Schneider, T. Mehringer, G. Mercurio, L. Wenthaus, A. Classen, G. Brenner, O. Gorobtsov, A. Benz, D. Bhatti, L. Bocklage, B. Fischer, S. Lazarev, Y. Obukhov, K. Schlage, P. Skopintsev, J. Wagner, F. Waldmann, S. Willing, I. Zaluzhnyy, W. Wurth, I. A. Vartanyants, R. Röhlsberger, and J. von Zanthier, “Quantum imaging with incoherently scattered light from a free-electron laser,” Nat. Phys. **14**(2), 126–129 (2018). [CrossRef]

**11. **A.-X. Zhang, Y.-H. He, L.-A. Wu, L.-M. Chen, and B.-B. Wang, “Tabletop x-ray ghost imaging with ultra-low radiation,” Optica **5**(4), 374–377 (2018). [CrossRef]

**12. **A. M. Kingston, D. Pelliccia, A. Rack, M. P. Olbinado, Y. Cheng, G. R. Myers, and D. M. Paganin, “Ghost tomography,” Optica **5**(12), 1516–1520 (2018). [CrossRef]

**13. **S. Li, F. Cropp, K. Kabra, T. Lane, G. Wetzstein, P. Musumeci, and D. Ratner, “Electron ghost imaging,” Phys. Rev. Lett. **121**(11), 114801 (2018). [CrossRef]

**14. **O. Katz, Y. Bromberg, and Y. Silberberg, “Compressive ghost imaging,” Appl. Phys. Lett. **95**(13), 131110 (2009). [CrossRef]

**15. **P. Zerom, K. W. C. Chan, J. C. Howell, and R. W. Boyd, “Entangled-photon compressive ghost imaging,” Phys. Rev. A **84**(6), 061804 (2011). [CrossRef]

**16. **W.-K. Yu, M.-F. Li, X.-R. Yao, X.-F. Liu, L.-A. Wu, and G.-J. Zhai, “Adaptive compressive ghost imaging based on wavelet trees and sparse representation,” Opt. Express **22**(6), 7133–7144 (2014). [CrossRef]

**17. **Z. Liu, S. Tan, J. Wu, E. Li, X. Shen, and S. Han, “Spectral camera based on ghost imaging via sparsity constraints,” Sci. Rep. **6**(1), 25718 (2016). [CrossRef]

**18. **H. Yu, E. Li, W. Gong, and S. Han, “Structured image reconstruction for three-dimensional ghost imaging lidar,” Opt. Express **23**(11), 14541–14551 (2015). [CrossRef]

**19. **R. Zhu, H. Yu, R. Lu, Z. Tan, and S. Han, “Spatial multiplexing reconstruction for fourier-transform ghost imaging via sparsity constraints,” Opt. Express **26**(3), 2181–2190 (2018). [CrossRef]

**20. **V. Katkovnik and J. Astola, “Compressive sensing computational ghost imaging,” J. Opt. Soc. Am. A **29**(8), 1556–1567 (2012). [CrossRef]

**21. **C. Hu, Z. Tong, Z. Liu, Z. Huang, J. Wang, and S. Han, “Optimization of light fields in ghost imaging using dictionary learning,” Opt. Express **27**(20), 28734–28749 (2019). [CrossRef]

**22. **J. H. Shapiro, “Computational ghost imaging,” Phys. Rev. A **78**(6), 061802 (2008). [CrossRef]

**23. **N. D. Hardy and J. H. Shapiro, “Computational ghost imaging versus imaging laser radar for three-dimensional imaging,” Phys. Rev. A **87**(2), 023820 (2013). [CrossRef]

**24. **B. Sun, M. P. Edgar, R. Bowman, L. E. Vittert, S. Welsh, A. Bowman, and M. Padgett, “3d computational imaging with single-pixel detectors,” Science **340**(6134), 844–847 (2013). [CrossRef]

**25. **M. Lyu, W. Wang, H. Wang, H. Wang, G. Li, N. Chen, and G. Situ, “Deep-learning-based ghost imaging,” Sci. Rep. **7**(1), 17865 (2017). [CrossRef]

**26. **F. Wang, H. Wang, H. Wang, G. Li, and G. Situ, “Learning from simulation: An end-to-end deep-learning approach for computational ghost imaging,” Opt. Express **27**(18), 25560–25572 (2019). [CrossRef]

**27. **T. Shimobaba, Y. Endo, T. Nishitsuji, T. Takahashi, Y. Nagahama, S. Hasegawa, M. Sano, R. Hirayama, T. Kakue, A. Shiraki, and T. Ito, “Computational ghost imaging using deep learning,” Opt. Commun. **413**, 147–151 (2018). [CrossRef]

**28. **A. Goy, K. Arthur, S. Li, and G. Barbastathis, “Low photon count phase retrieval using deep learning,” Phys. Rev. Lett. **121**(24), 243902 (2018). [CrossRef]

**29. **L. Sun, J. Shi, X. Wu, Y. Sun, and G. Zeng, “Photon-limited imaging through scattering medium based on deep learning,” Opt. Express **27**(23), 33120–33134 (2019). [CrossRef]

**30. **A. Bora, A. Jalal, E. Price, and A. G. Dimakis, “Compressed sensing using generative models,” in Proceedings of the 34th International Conference on Machine Learning - Volume 70 (JMLR.org, 2017), pp. 537–546.

**31. **T. M. Quan, T. Nguyen-Duc, and W.-K. Jeong, “Compressed sensing mri reconstruction using a generative adversarial network with a cyclic loss,” IEEE Trans. Med. Imaging **37**(6), 1488–1497 (2018). [CrossRef]

**32. **R. Horisaki, R. Takagi, and J. Tanida, “Learning-based imaging through scattering media,” Opt. Express **24**(13), 13738–13743 (2016). [CrossRef]

**33. **A. Sinha, J. Lee, S. Li, and G. Barbastathis, “Lensless computational imaging through deep learning,” Optica **4**(9), 1117–1125 (2017). [CrossRef]

**34. **Y. Sun, Z. Xia, and U. S. Kamilov, “Efficient and accurate inversion of multiple scattering with deep learning,” Opt. Express **26**(11), 14678–14688 (2018). [CrossRef]

**35. **S. Li, M. Deng, J. Lee, A. Sinha, and G. Barbastathis, “Imaging through glass diffusers using densely connected convolutional networks,” Optica **5**(7), 803–813 (2018). [CrossRef]

**36. **Y. Li, Y. Xue, and L. Tian, “Deep speckle correlation: a deep learning approach toward scalable imaging through scattering media,” Optica **5**(10), 1181–1190 (2018). [CrossRef]

**37. **C. Dong, C. C. Loy, K. He, and X. Tang, “Learning a deep convolutional network for image super-resolution,” in * Computer Vision – ECCV 2014*, D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars, eds. (Springer, 2014), pp. 184–199.

**38. **C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, and W. Shi, “Photo-realistic single image super-resolution using a generative adversarial network,” in Proceedings of the IEEE conference on computer vision and pattern recognition, (IEEE, 2017), pp. 4681–4690.

**39. **T. Nguyen, Y. Xue, Y. Li, L. Tian, and G. Nehmetallah, “Deep learning approach for fourier ptychography microscopy,” Opt. Express **26**(20), 26470–26484 (2018). [CrossRef]

**40. **E. Nehme, L. E. Weiss, T. Michaeli, and Y. Shechtman, “Deep-storm: super-resolution single-molecule microscopy by deep learning,” Optica **5**(4), 458–464 (2018). [CrossRef]

**41. **Y. Rivenson, Y. Zhang, H. Günaydın, D. Teng, and A. Ozcan, “Phase recovery and holographic image reconstruction using deep learning in neural networks,” Light: Sci. Appl. **7**(2), 17141 (2018). [CrossRef]

**42. **C. Işil, F. S. Oktem, and A. Koç, “Deep learning-based hybrid approach for phase retrieval,” in * Computational Optical Sensing and Imaging* (Optical Society of America, 2019), pp. CTh2C–5.

**43. **I. Freund, M. Rosenbluh, and S. Feng, “Memory effects in propagation of optical waves through disordered media,” Phys. Rev. Lett. **61**(20), 2328–2331 (1988). [CrossRef]

**44. **O. Katz, P. Heidmann, M. Fink, and S. Gigan, “Non-invasive single-shot imaging through scattering layers and around corners via speckle correlations,” Nat. Photonics **8**(10), 784–790 (2014). [CrossRef]

**45. **J. Cheng and S. Han, “Incoherent coincidence imaging and its applicability in x-ray diffraction,” Phys. Rev. Lett. **92**(9), 093903 (2004). [CrossRef]

**46. **D. P. Kingma and M. Welling, “Auto-encoding variational bayes,” Stat **1050**, 1 (2014).

**47. **S. Suresh, N. Sundararajan, and P. Saratchandran, “Risk-sensitive loss functions for sparse multi-category classification problems,” Inf. Sci. **178**(12), 2621–2638 (2008). [CrossRef]

**48. **J. R. Fienup, “Phase retrieval algorithms: a comparison,” Appl. Opt. **21**(15), 2758–2769 (1982). [CrossRef]

**49. **Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. Image Process **13**(4), 600–612 (2004). [CrossRef]

**50. **S. Roweis, https://cs.nyu.edu/~roweis/data.html.

**51. **X. Liu, J. Shi, X. Wu, and G. Zeng, “Fast first-photon ghost imaging,” Sci. Rep. **8**(1), 5012 (2018). [CrossRef]