## Abstract

Single-pixel imaging can capture images using a detector without spatial resolution, which enables imaging in various situations that are challenging or impossible with conventional pixelated detectors. Here we report a compressive single-pixel imaging approach that can simultaneously encode and recover spatial, spectral, and 3D information of the object. In this approach, we modulate and condense the object information in the Fourier space and detect the light signals using a single-pixel detector. The data-compressing operation is similar to conventional compression algorithms that selectively store the largest coefficients of a transform domain. In our implementation, we selectively sample the largest Fourier coefficients, and no iterative optimization process is needed in the recovery process. We demonstrate an 88% compression ratio for producing a high-quality full-color 3D image. The reported approach provides a solution for information multiplexing in single-pixel imaging settings. It may also generate new insights for developing multi-modality computational imaging systems.

© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. INTRODUCTION

Single-pixel imaging [1–16] is an emerging computational imaging scheme that differs from conventional imaging setups. It employs a single-pixel detector to acquire the spatial information of a scene and computationally recovers the image. Single-pixel imaging allows one to build an imaging system with a high signal-to-noise ratio, a wide spectral range, and low cost in terms of detection units. As such, it has attracted considerable attention in recent years and has been applied in photoacoustic imaging [5], 3D imaging [6–8], multispectral/hyperspectral imaging [9–14], flow cytometry [15], and ophthalmology [16], among other developments. The recently reported Fourier single-pixel imaging (FSI) scheme [4] has demonstrated high-quality and high-efficiency imaging capability. FSI uses Fourier basis patterns (also known as sinusoidal intensity patterns) for illumination and measures the intensity of the resulting light signal with a single-pixel detector. Each measurement is an inner product between the object image and a Fourier basis pattern. Therefore, the operation of FSI is equivalent to performing a raster scan in the Fourier space (termed a Fourier scan). As the energy of an object image naturally concentrates in the low-frequency region of the Fourier space, one can selectively sample the low-frequency Fourier coefficients with FSI so as to obtain a high data compression ratio. The energy concentration in the low-frequency region provides extra degrees of freedom for encoding other object information into the medium- and high-frequency regions, with which multi-modality imaging can be achieved. Multi-modality imaging allows one to acquire multiple types of image information with a single imaging setup at the same time, ensuring consistency in time and space. In comparison with single-modality imaging, multi-modality imaging provides richer information, which may improve accuracy and reliability in applications such as object recognition, material analysis, and medical diagnosis.

In this paper, we report an actively compressive imaging scheme that encodes, condenses, and recovers the spatial, spectral, and 3D information of an object simultaneously. In this approach, we modulate the illumination and the detection light signals to encode and condense the object information in the Fourier space and detect the light signal using a single-pixel detector. With these modulations, the information of the three modalities is encoded into three specified regions of the Fourier space. The spatial information is naturally compressed in the low-frequency region, the depth information is actively compressed into the medium-frequency region by modulating the detection light signals, and the spectral information is actively compressed into the high-frequency region by modulating the illumination light signals. The modulations can be implemented by inserting two designed masks in the illumination and the detection paths, respectively. To encode information with a high data-compression ratio, we selectively sample the largest coefficients in these three regions. The proposed scheme differs from conventional compressive sensing techniques in that no iterative optimization process is needed in the recovery process. We demonstrate an 88% compression ratio (the compression ratio here is defined as 1 − number of measurements / number of pixels) for producing a high-quality full-color 3D image. The reported approach provides a solution for information multiplexing in single-pixel imaging settings. It may also generate new insights for developing multi-modality computational imaging systems.

## 2. SINGLE-MODALITY IMAGING

FSI was originally developed as a single-modality imaging technique that only acquires the intensity information (reflectance, transmittance, or fluorescence) of an object. The prior knowledge that an image of a natural scene is sparse in the Fourier domain enables one to reconstruct a clear image with fewer measurements than the pixel count. We use the standard test image “Peppers” for demonstration. As shown in Fig. 1(a), most of the energy of the spatial information of the image is concentrated in the low-frequency region of the Fourier space. We can simply acquire the low-frequency component in Fig. 1(a) to reconstruct a high-quality image in Fig. 1(b), achieving a compression ratio of 92%. The higher the compression ratio, the more measurements are saved. We use the peak signal-to-noise ratio (PSNR), the structural similarity index (SSIM) [17], and the root mean square error (RMSE) to evaluate the reconstruction quality.
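The effect of this low-frequency sampling can be illustrated numerically. The following sketch is our own illustration, not the code used in the paper; the synthetic test image and the sampling radius are assumptions chosen to give roughly the 92% ratio quoted above. It retains only a circular low-frequency region of the spectrum and evaluates the result with PSNR.

```python
import numpy as np

def low_freq_reconstruction(img, radius):
    """Keep only Fourier coefficients within `radius` of the DC term,
    simulating compressive Fourier single-pixel imaging."""
    F = np.fft.fftshift(np.fft.fft2(img))          # centered spectrum
    h, w = img.shape
    y, x = np.ogrid[:h, :w]
    mask = (y - h // 2) ** 2 + (x - w // 2) ** 2 <= radius ** 2
    kept = mask.sum()                              # number of sampled coefficients
    rec = np.fft.ifft2(np.fft.ifftshift(F * mask)).real
    compression = 1.0 - kept / img.size            # ratio as defined in the paper
    return rec, compression

def psnr(ref, rec, peak=1.0):
    mse = np.mean((ref - rec) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

# Smooth synthetic test image in [0, 1] (a stand-in for "Peppers")
yy, xx = np.mgrid[:256, :256] / 256.0
img = 0.5 + 0.25 * np.sin(6 * np.pi * xx) * np.cos(4 * np.pi * yy)
rec, ratio = low_freq_reconstruction(img, radius=40)
print(f"compression ratio: {ratio:.0%}, PSNR: {psnr(img, rec):.1f} dB")
```

For this smooth image the low-frequency disk captures essentially all of the energy, so the reconstruction quality stays high despite discarding over 90% of the coefficients.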

## 3. MULTI-MODALITY IMAGING

As Fig. 1(a) shows, the spatial information occupies only the low-frequency region, and the rest of the Fourier bandwidth remains unused. The sparsity of natural images in the Fourier space provides FSI with extra degrees of freedom for encoding extra object information into a single Fourier spectrum measurement. To encode other object information into a specified region of the Fourier space, we can modulate the illumination or the detection light signals. To implement the modulation, we propose to use two designed masks in the illumination and the detection paths, respectively. The masks are designed based on Fourier basis patterns. The Fourier transform of each mask consists of one or a few impulses (Dirac delta functions) in the Fourier space. According to the sifting property of the delta function, the masks duplicate the object information to specific regions in the Fourier space. Therefore, the encoded information of the object is distributed around the impulse(s), giving a condensed representation. We refer to this strategy as active compressive FSI.
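The sifting argument can be checked numerically. In the sketch below (our own illustration, with an assumed band-limited test image standing in for the energy concentration of a natural scene), a nonnegative cosine mask, whose Fourier transform is a pair of impulses at ±fx, replicates the image spectrum around those impulses at quarter amplitude.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 128

# Band-limited test image: nonzero Fourier coefficients only near DC,
# mimicking the low-frequency energy concentration of natural scenes.
S = np.zeros((N, N), dtype=complex)
S[:8, :8] = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))
img = np.fft.ifft2(S).real

fx = 40                                               # mask carrier frequency
x = np.arange(N)
mask = 0.5 + 0.5 * np.cos(2 * np.pi * fx * x / N)     # nonnegative cosine mask

F_img = np.fft.fft2(img)
F_mod = np.fft.fft2(img * mask[None, :])              # modulated "measurement"

# Sifting property: around column fx, the modulated spectrum is a
# quarter-amplitude copy of the original low-frequency spectrum,
# F_mod[v, fx + d] = 0.25 * F_img[v, d] for small shifts d.
d = np.arange(-7, 8)
print(np.allclose(F_mod[:, (fx + d) % N], 0.25 * F_img[:, d % N]))  # True
```

Because the test image is strictly band-limited, the replicated copies do not overlap the baseband spectrum, which is exactly the condition the masks in this work are designed to satisfy.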

We perform two experiments to validate the reported scheme. In the first experiment, we design a multi-spectral mask to modulate illumination light signals. As such, spectral information is encoded into the high-frequency regions, achieving spatial-spectral dual-imaging modalities. In the second experiment, we design an additional mask to modulate detection light signals. As such, depth information is encoded into the medium-frequency regions, achieving spatial-spectral-depth triple-imaging modalities.

#### A. Dual-Modality Imaging

Figure 2 shows the multi-spectral mask. The mask is designed as a tiling of $2\times 2$ pixel sets. Within each set, each pixel is covered with a different spectral filter. For example, in Fig. 2, we use red, yellow, green, and blue to represent the four different spectral filters that make up the mask. The multi-spectral mask can be decomposed into four spectral masks (middle panel of Fig. 2). Each spectral mask is essentially a superposition of three Fourier basis patterns, and its Fourier transform contains three impulses (bottom panel of Fig. 2). We propose to use this multi-spectral mask to modulate the grayscale basis patterns for sample illumination, as shown in Fig. 3(a). Each mask replicates and shifts the corresponding spectral component of the object image to the locations of its three impulses in the Fourier space. Since each spectral mask has a different and linearly independent phase map (bottom panel of Fig. 2), the four Fourier-spectrum components can be encoded in one Fourier-spectrum measurement (see Supplement 1 for a mathematical demonstration). To decode the multi-spectral information, we can simply shift the high-frequency components back to the origin and extract each spectral component by solving a system of equations (see Supplement 1 for a mathematical demonstration).
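A simplified numerical sketch of this encode/decode idea follows. It is our own illustration, not the paper's four-channel scheme: here only two band-limited "spectral" channels are multiplexed, each on its own carrier frequency, so they can be decoded by simply shifting each encoded copy back to the origin, rather than by solving the system of equations described above.

```python
import numpy as np

rng = np.random.default_rng(1)
N, B = 128, 8                       # image size and channel bandwidth

def band_limited():
    """Random test channel with energy confined to low frequencies."""
    S = np.zeros((N, N), dtype=complex)
    S[:B, :B] = rng.standard_normal((B, B)) + 1j * rng.standard_normal((B, B))
    return np.fft.ifft2(S).real

ch1, ch2 = band_limited(), band_limited()
y, x = np.ogrid[:N, :N]
m1 = 0.5 + 0.5 * np.cos(2 * np.pi * 40 * x / N)   # carrier along columns
m2 = 0.5 + 0.5 * np.cos(2 * np.pi * 40 * y / N)   # carrier along rows

# Single combined "measurement" spectrum (what one detector would sample)
F = np.fft.fft2(ch1 * m1 + ch2 * m2)

# Decode: shift each quarter-amplitude encoded copy back to the origin.
d = np.arange(-(B - 1), B)
F1 = np.zeros((N, N), dtype=complex)
F2 = np.zeros((N, N), dtype=complex)
F1[:, d % N] = 4 * F[:, (40 + d) % N]             # copy around column carrier
F2[d % N, :] = 4 * F[(40 + d) % N, :]             # copy around row carrier
rec1 = np.fft.ifft2(F1).real
rec2 = np.fft.ifft2(F2).real
```

Because the two carriers occupy disjoint regions of the spectrum, both channels are recovered from one spectrum without any iterative optimization, which is the essence of the multiplexing strategy.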

However, limited by our experimental conditions, we had difficulty coating the micromirrors of a digital micromirror device (DMD) chip with the corresponding spectral materials. In our experiment, we instead use a DLP projector to generate color Fourier basis patterns, as shown in Fig. 3(b). The generated color basis patterns consist of three spectral components: red, green, and blue. The emission spectra of the projector when displaying pure red, green, and blue patterns are given in Supplement 1.

The experimental setup for spatial-spectral dual-modality imaging is shown in Fig. 4. In this experiment, the size of the color Fourier basis patterns is $512\,(\mathrm{W}) \times 320\,(\mathrm{H})$ pixels, and we recover the object image at the same size. A photodiode (Hamamatsu S1227-1010BR) is used as the single-pixel detector to measure the intensity of the light back-reflected from the object under illumination by the color basis patterns. The resultant electrical signals are collected by a data acquisition board (National Instruments USB-6343). We use four-step phase-shifting sinusoidal illumination [4] to perform FSI. Four-step phase-shifting FSI is essentially a differential measurement, which benefits noise elimination at the expense of doubling the number of measurements. Each component in the spectrum is sampled from its center along a circular path.
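The four-step phase-shifting measurement can be sketched as follows (our own illustration; the pattern offset a and contrast b are assumed values). The differential combination of the four single-pixel readings cancels the pattern offset, and together the four readings yield one complex Fourier coefficient of the scene.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 64, 64
scene = rng.random((M, N))                  # unknown object reflectivity
fy, fx = 5, 9                               # spatial frequency being probed
a, b = 0.5, 0.5                             # pattern offset and contrast (assumed)

y, x = np.ogrid[:M, :N]
theta = 2 * np.pi * (fy * y / M + fx * x / N)

def measure(phi):
    """Single-pixel reading: inner product of the scene with the
    phase-shifted sinusoidal pattern a + b*cos(theta + phi)."""
    return np.sum(scene * (a + b * np.cos(theta + phi)))

I0, I1, I2, I3 = (measure(p) for p in (0, np.pi / 2, np.pi, 3 * np.pi / 2))

# Differential combination: the offset a (and any constant background)
# cancels, leaving the complex Fourier coefficient of the scene at (fy, fx).
coeff = ((I0 - I2) + 1j * (I1 - I3)) / (2 * b)
print(np.allclose(coeff, np.fft.fft2(scene)[fy, fx]))  # True
```

Repeating this for each sampled frequency fills in the Fourier spectrum one coefficient at a time, at the cost of four (or, with the differential scheme, twice the baseline number of) measurements per coefficient.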

To demonstrate the compression performance, we acquire the central low-frequency component with 39,320 measurements and the three high-frequency components with 19,660 measurements. Thus, the overall compression ratio is 64%. Figure 5(a) shows the acquired Fourier spectrum. We note that there are eight high-frequency components in the Fourier space due to the periodicity of the Fourier spectrum. These eight components can be recombined into three circular components for the red, green, and blue object information. We then apply a 2D inverse Fourier transform and the color reconstruction algorithm [18] to reconstruct the full-color image in Fig. 5(b). For comparison, we also acquire the complete Fourier spectrum in Fig. 5(c) and obtain the reconstruction in Fig. 5(d). The difference between the undersampled reconstruction and the fully sampled reconstruction is hardly noticeable by visual inspection. The details and the colors of the scene are well reconstructed even at a high compression ratio. In this experiment, we successfully reconstruct a color image of a complex scene from undersampled data using one single-pixel detector. In comparison with existing color single-pixel techniques [9–14], the proposed technique has two distinct advantages. First, it can produce high-quality reconstructions at a high compression ratio without any iterative computational procedures. Second, it only requires one single-pixel detector.

#### B. Triple-Modality Imaging

As shown in Fig. 5(a), the spatial information occupies the lowest-frequency region, while the spectral (color) information occupies the highest-frequency region in the Fourier spectrum. There is still a medium-frequency region available in the Fourier space. In the second experiment, we design an additional mask to modulate the light signals in the detection path to further encode the 3D depth information into the available medium-frequency region. As such, we can achieve spatial-spectral-depth triple-modality imaging.

The experimental setup is shown in Fig. 6, where we insert a grating mask in the detection path. By inserting this grating mask, we can impose a fringe pattern (i.e., the image of the grating through Lens 1) on the reconstructed image. We also note that the optical axes of the illumination system and the detection system intersect at an angle of 15 deg. As such, our setup is subject to the triangulation principle [19], and the fringe pattern is deformed according to the object's 3D surface. In mathematical terms, the phase of the deformed fringe pattern is modulated by the depth of the object surface. We use a binary grating mask for modulation and introduce a defocus distance to the grating mask. Thus, the image of the grating mask approximates a sinusoidal pattern, which is equivalent to a pair of delta functions in the Fourier space. By carefully designing the frequency of the mask pattern, we can modulate the depth information into the medium-frequency region. The central frequency of the region is determined by the frequency of the image of the grating mask. As the frequency of the grating mask used in our experiment is 10 $\mathrm{lines \cdot mm^{-1}}$, the frequency of the image of the grating is $f_0 = 0.117\ \mathrm{pixel}^{-1}$ according to the magnification of the imaging system. In this experiment, our object is a mask of a human face with a color pattern painted on top of it. The focal lengths of Lens 1 and Lens 2 are both 50 mm. The resolution of the reconstruction is $400\,(\mathrm{W}) \times 512\,(\mathrm{H})$ pixels. Each component in the spectrum is sampled from its center along a circular path.

As shown in Fig. 7(a), we sample the Fourier spectrum with 24,576 measurements, among which 4,912 are for the spatial information, 18,840 are for the depth information, and the remaining 824 are for the color information. In other words, the compression ratio is 88%. Applying an inverse 2D Fourier transform to the acquired Fourier spectrum, we recover the image in Fig. 7(b). The reconstructed image clearly shows the intensity distribution of the target object with a fringe pattern overlaid on top of it. From the magnified view, we can clearly see the deformation of the fringe pattern, which carries the modulated depth information; we can also see the mosaic structures that carry the color information. In short, three modalities are successfully encoded and recovered from a single Fourier spectrum measurement.

In this experiment, we recover the spatial and color information using the same process as in the previous experiment. The additional depth information is demodulated by extracting the medium-frequency components [Fig. 8(a)] and applying an inverse Fourier transform to derive a deformed fringe pattern [Fig. 8(b)]. The deformation of the fringes (i.e., a modulated phase pattern) is retrieved by the Fourier transform fringe analysis method [7,19], and we derive the modulated phase map in Fig. 8(c). According to the geometry of our experimental setup, we derive the phase-to-height conversion factor, $k = 1.086\ \mathrm{mm/rad}$. With the conversion factor, we further derive the height distribution (3D reconstruction) of the target object in Fig. 9. According to the 3D reconstruction, we derive the height difference between the top of the nose and the top of the forehead, $\Delta h = 24.544\ \mathrm{mm}$. As the true value is 24.05 mm, the height error is 0.494 mm. Thus, the proposed technique can reconstruct high-quality, full-color 3D images in a highly efficient manner.
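The fringe-analysis step can be sketched numerically. The code below is our own illustration of the Fourier transform fringe analysis method [19]; the carrier frequency, synthetic surface profile, and conversion factor are assumed values, not the experimental ones. It builds a fringe pattern whose phase is modulated by a height map, isolates the carrier lobe in the spectrum, and converts the recovered phase back to height.

```python
import numpy as np

N = 256
f0 = 32                                   # fringe carrier, cycles per row (assumed)
k = 1.086                                 # phase-to-height factor, mm/rad (assumed)

x = np.arange(N)
yy, xx = np.mgrid[:N, :N]
# Synthetic surface: a smooth bump, kept small enough that the phase
# stays within (-pi, pi) and no unwrapping is needed.
height = 2.0 * np.exp(-((xx - 128) ** 2 + (yy - 128) ** 2) / (2 * 30.0 ** 2))
phase_true = height / k                   # depth modulates the fringe phase

# Deformed fringe pattern, as it would appear in the reconstructed image
fringe = 0.5 + 0.5 * np.cos(2 * np.pi * f0 * x / N + phase_true)

# Isolate the positive carrier lobe row by row, transform it back, and
# read the modulated phase from the complex angle.
F = np.fft.fft(fringe, axis=1)
band = np.zeros_like(F)
band[:, f0 - 16: f0 + 17] = F[:, f0 - 16: f0 + 17]
analytic = np.fft.ifft(band, axis=1)                  # ~ (b/2) exp(j(carrier + phase))
carrier = np.exp(1j * 2 * np.pi * f0 * x / N)
phase_est = np.angle(analytic * np.conj(carrier))     # recovered phase map
height_est = k * phase_est                            # phase-to-height conversion
print(f"max height error: {np.abs(height_est - height).max():.4f} mm")
```

With larger surface relief the recovered phase would wrap, and a phase-unwrapping step would be required before the phase-to-height conversion.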

## 4. DISCUSSION AND CONCLUSION

We note that we use fewer measurements to derive the full-color 3D reconstruction than the full-color 2D reconstruction because the target object in the 3D imaging experiment is simpler than that in the 2D imaging experiment. The achievable compression ratio thus varies with the complexity of the target object: complex target objects have less redundancy, and therefore the achievable compression ratio is correspondingly lower. The actual number of measurements depends on the pixel count and the complexity of the object image. In our experiment, we set a predefined compression ratio for the Fourier spectrum acquisition. It is possible to further develop an algorithm that determines the compression ratio adaptively [20]. Due to the difficulty of fabricating the multi-spectral mask on a DMD chip under our experimental conditions, we suffer from a low projection rate (6 patterns/s) when using a digital projector for spectrally coded illumination. If the multi-spectral mask were available, one could use a DMD to remarkably accelerate the data-acquisition process. State-of-the-art DMDs are able to generate ~22,000 patterns per second, with which reproducing the image shown in Fig. 9 would take only ~1 s. We also note that the proposed compressive imaging scheme is a lossy compression. Similar to the generic image compression algorithm JPEG [21], data compression is achieved by discarding high-frequency details. The proposed scheme also allows one to flexibly adjust the compression ratio for a different reconstruction quality. We acknowledge that there is color distortion in the color reconstructions: the reconstructions look greener than the ground truth. This problem may be due to the spectral response of the single-pixel detector. We believe it can be resolved by employing an additional color calibration process.

In conclusion, we report a scheme of physical compression and a multi-modality FSI technique. We demonstrate an 88% compression ratio for producing a high-quality full-color 3D image. The reported approach provides a solution for information multiplexing in single-pixel imaging settings. It may also generate new insights for developing multi-modality computational imaging systems.

## Funding

National Natural Science Foundation of China (NSFC) (61475064); National Science Foundation (NSF) (1510077, 1555986, 1700941); National Institutes of Health (NIH) (R03EB022144, R21EB022378).

## Acknowledgment

G. Zheng acknowledges the support of NSF and NIH.

See Supplement 1 for supporting content.

## REFERENCES

**1. **J. H. Shapiro, “Computational ghost imaging,” Phys. Rev. A **78**, 061802(R) (2008). [CrossRef]

**2. **M. F. Duarte, M. A. Davenport, D. Takhar, J. N. Laska, T. Sun, K. F. Kelly, and R. G. Baraniuk, “Single-pixel imaging via compressive sampling,” IEEE Signal Proc. Mag. **25**(2), 83–91 (2008). [CrossRef]

**3. **Y. Bromberg, O. Katz, and Y. Silberberg, “Ghost imaging with a single detector,” Phys. Rev. A **79**, 053840 (2009). [CrossRef]

**4. **Z. Zhang, X. Ma, and J. Zhong, “Single-pixel imaging by means of Fourier spectrum acquisition,” Nat. Commun. **6**, 6225 (2015). [CrossRef]

**5. **J. Yang, L. Gong, X. Xu, P. Hai, Y. Shen, Y. Suzuki, and L. V. Wang, “Motionless volumetric photoacoustic microscopy with spatially invariant resolution,” Nat. Commun. **8**, 780 (2017). [CrossRef]

**6. **B. Sun, M. P. Edgar, R. Bowman, L. E. Vittert, S. Welsh, A. Bowman, and M. J. Padgett, “3-D Computational imaging with single-pixel detectors,” Science **340**, 844–847 (2013). [CrossRef]

**7. **Z. B. Zhang and J. G. Zhong, “Three-dimensional single-pixel imaging with far fewer measurements than effective image pixels,” Opt. Lett. **41**, 2497–2500 (2016). [CrossRef]

**8. **M. Sun, M. Edgar, G. Gibson, B. Sun, N. Radwell, R. Lamb, and M. Padgett, “Single-pixel three-dimensional imaging with time-based depth resolution,” Nat. Commun. **7**, 12010 (2016). [CrossRef]

**9. **S. S. Welsh, M. P. Edgar, R. Bowman, P. Jonathan, B. Q. Sun, and M. J. Padgett, “Fast full-color computational imaging with single-pixel detectors,” Opt. Express **21**, 23068–23074 (2013). [CrossRef]

**10. **L. Bian, J. Suo, G. Situ, Z. Li, J. Fan, F. Chen, and Q. Dai, “Multispectral imaging using a single bucket detector,” Sci. Rep. **6**, 24752 (2016). [CrossRef]

**11. **Y. Yan, H. Dai, X. Liu, W. He, Q. Chen, and G. Gu, “Colored adaptive compressed imaging with a single photodiode,” Appl. Opt. **55**, 3711–3718 (2016). [CrossRef]

**12. **B. Liu, Z. Yang, X. Liu, and L. Wu, “Coloured computational imaging with single-pixel detectors based on a 2D discrete cosine transform,” J. Mod. Opt. **64**, 259–264 (2017). [CrossRef]

**13. **J. Huang and D. F. Shi, “Multispectral computational ghost imaging with multiplexed illumination,” J. Opt. **19**, 075701 (2017). [CrossRef]

**14. **S. Jin, W. Hui, Y. Wang, K. Huang, Q. Shi, C. Ying, D. Liu, Q. Ye, W. Zhou, and J. Tian, “Hyperspectral imaging using the single-pixel Fourier transform technique,” Sci. Rep. **7**, 45209 (2017). [CrossRef]

**15. **Q. Guo, H. Chen, Y. Wang, Y. Guo, P. Liu, X. Zhu, Z. Cheng, Z. Yu, S. Yang, M. Chen, and S. Xie, “High-speed compressive microscopy of flowing cells using sinusoidal illumination patterns,” IEEE Photon. J. **9**, 3900111 (2017). [CrossRef]

**16. **B. Lochocki, A. Gambín, S. Manzanera, E. Irles, E. Tajahuerce, J. Lancis, and P. Artal, “Single pixel camera ophthalmoscope,” Optica **3**, 1056 (2016). [CrossRef]

**17. **W. Zhou, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. Image Process. **13**, 600–612 (2004). [CrossRef]

**18. **H. S. Malvar, L. W. He, and R. Cutler, “High-quality linear interpolation for demosaicing of Bayer-patterned color images,” in *IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)* (2004), pp. 485–488.

**19. **M. Takeda and K. Mutoh, “Fourier transform profilometry for the automatic measurement of 3-D object shapes,” Appl. Opt. **22**, 3977–3982 (1983). [CrossRef]

**20. **L. Bian, J. Suo, X. Hu, F. Chen, and Q. Dai, “Efficient single pixel imaging in Fourier space,” J. Opt. **18**, 085704 (2016). [CrossRef]

**21. **G. K. Wallace, “The JPEG still picture compression standard,” IEEE Trans. Consum. Electron. **38**, xviii (1992). [CrossRef]