
High-efficiency terahertz single-pixel imaging based on a physics-enhanced network

Open Access

Abstract

As an alternative solution to the lack of cost-effective multipixel terahertz cameras, terahertz single-pixel imaging, which is free from pixel-by-pixel mechanical scanning, has been attracting increasing attention. Such a technique relies on illuminating the object with a series of spatial light patterns and recording the corresponding signal with a single-pixel detector for each of them. This leads to a trade-off between the acquisition time and the image quality, hindering practical applications. Here, we tackle this challenge and demonstrate high-efficiency terahertz single-pixel imaging based on physically enhanced deep learning networks for both pattern generation and image reconstruction. Simulation and experimental results show that this strategy is much more efficient than the classical terahertz single-pixel imaging methods based on Hadamard or Fourier patterns, and can reconstruct high-quality terahertz images with a significantly reduced number of measurements, corresponding to an ultra-low sampling ratio down to 1.56%. The efficiency, robustness and generalization of the developed approach are also experimentally validated using different types of objects and different image resolutions, and clear image reconstruction with a low sampling ratio of 3.12% is demonstrated. The developed method speeds up terahertz single-pixel imaging while preserving high image quality, and advances its real-time applications in security, industry, and scientific research.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Terahertz imaging has numerous applications in nondestructive inspection [1,2], medical imaging [3,4], the pharmaceutical industry [5], and food inspection [6]. In most of these applications, real-time imaging is preferable. However, to date, multipixel terahertz cameras have been too expensive to be accessible for most users. Alternatively, single-pixel imaging techniques can overcome the scarcity of low-cost focal plane array detectors, especially in the terahertz regime. The operation principle is to illuminate the object with a series of spatial light patterns, to record the total transmitted or reflected light for each pattern with a detector that has no spatial resolution (i.e., a single-pixel detector), and to reconstruct the two-dimensional image by making use of the correlation between the spatial distributions of the patterns and the corresponding measurements obtained by the detector [7–9]. In terahertz single-pixel imaging systems, a major challenge is to reduce the acquisition time necessary to reconstruct high-quality terahertz images [10–12].

In order to improve the image acquisition efficiency, many strategies have been proposed or developed over the years. The compressed sensing (CS) technique was first proposed to break away from the traditional Nyquist sampling theorem. It exploits the sparsity of natural images in an orthogonal transform domain to accurately reconstruct the object’s image from a small number of measurements. Making use of the CS technique, Chan et al. [13] adopted a series of binary metal masks to modulate the terahertz intensity transmitted through the object and reconstructed a full terahertz image of 1024 pixels with 300 measurements. Although great progress has been achieved, CS-based terahertz single-pixel imaging still suffers from relatively long acquisition and reconstruction times due to the large number of measurements, especially for large-sized images. To tackle this problem, deterministic model-based techniques, which adopt deterministic orthogonal basis patterns such as Hadamard patterns [14–17] or Fourier patterns [18–20] to encode the terahertz beams and utilize matrix operations to reconstruct terahertz images, were developed recently in the optical band. In the terahertz regime, Stantchev et al. [21] demonstrated Hadamard single-pixel imaging (HSI) and reconstructed terahertz images of $32\times 32$ pixels with a 40% sampling ratio, and She et al. [22] demonstrated Fourier single-pixel imaging (FSI) and further reduced the sampling ratio down to 10%. However, under undersampling conditions, deterministic model-based techniques truncate the spectrum by acquiring the low-frequency part and discarding the high-frequency part, resulting in undesirable artifacts such as ringing in the reconstructed images [23].

Deep learning networks have recently demonstrated the capability to recover images of acceptable quality in single-pixel imaging and ghost imaging systems while employing an extremely low sampling ratio. Lyu et al. [24] proposed deep-learning-based ghost imaging and demonstrated much better performance than the CS method when the sampling ratio is extremely small. He et al. [25] further modified the convolutional neural network that is commonly used in deep learning to fit the characteristics of ghost imaging, and obtained images faster and more accurately at a low sampling ratio compared with the conventional ghost imaging method. Rizvi et al. [26] proposed a deep learning framework to improve the imaging quality of 4-step FSI; they further proposed a fast image reconstruction framework based on a deep convolutional auto-encoder network for 3-step FSI [23], and showed that the proposed deep-learning-based FSI outperforms conventional FSI in terms of image quality even at very low sampling ratios. Thanks to the extremely low sampling ratio, deep learning has also been extended to realize real-time single-pixel video at 30 frames per second [27], high-quality video-rate snapshot compressive imaging [28], single-pixel imaging of high-speed moving targets [29], and single-pixel LiDAR with improved scene reconstruction quality [30].

In the terahertz regime, deep learning techniques have also been developed to improve the quality of terahertz images. Long et al. [31] demonstrated that a deep convolutional neural network can significantly improve the quality of terahertz images with increased resolution and decreased noise. Wang et al. [32] developed a complex convolutional neural network and obtained terahertz images with half-wavelength resolution. Zhu et al. [33] combined terahertz single-pixel imaging with deep learning networks and improved the imaging efficiency by greatly shortening the acquisition time while ensuring acceptable imaging quality. Stantchev et al. [34] demonstrated rapid terahertz single-pixel imaging with a spatial light modulator and a convolutional neural network. However, in these limited reports, the deep learning approach was used only to enhance the obtained terahertz images, and was not involved in the encoding-decoding processes of the terahertz single-pixel imaging. In other words, the set of spatial patterns projected onto the object remains conventional masking patterns, such as Hadamard or Fourier patterns.

Quite recently, Wang et al. [35] proposed a physics-enhanced deep learning approach for single-pixel imaging in the visible regime. They blended a physics-informed layer with a model-driven fine-tuning process, and demonstrated high-quality single-pixel imaging with a sampling ratio down to 6.25%. In this work, we extend this concept to the terahertz regime for the first time, and report a high-efficiency terahertz single-pixel imaging strategy based on a physics-enhanced network (PEN). In contrast to previously reported deep-learning-enhanced terahertz single-pixel imaging [31–34], here both the pattern generation and the image reconstruction are performed by deep learning networks. We will introduce the theoretical framework and the experimental setup of terahertz PEN-based single-pixel imaging (PENSI), and compare it with the classical terahertz HSI and FSI. By performing numerical simulations and experimental validations with various types of objects, we will demonstrate that the developed PENSI strategy can significantly enhance the single-pixel imaging efficiency by greatly reducing the number of measurements required for clear image reconstruction. Remarkably, our simulation results will show a reduction by an order of magnitude compared with the classical HSI or FSI, corresponding to a strikingly ultra-low sampling ratio down to 1.56%, and experimental results will demonstrate an ultra-low sampling ratio down to 3.12%, twice that of the simulations.

2. Theoretical framework

Before we elaborate on the developed PENSI, let us briefly review the classical HSI and FSI. These two approaches adopt Hadamard and Fourier patterns, respectively, to modulate the intensity of the terahertz beam passing through the object, collect the transmitted signals using a single-pixel detector, and then reconstruct the image using the inverse Hadamard or Fourier transform, respectively.

The Hadamard spectrum comprises a series of Hadamard coefficients, each corresponding to a Hadamard basis pattern in the spatial domain. For each measurement, the dot product of the object and the Hadamard basis pattern projected onto it is recorded by a single-pixel terahertz detector. When a Hadamard basis pattern $P_{\mathbb {H}}^{u_{0},v_{0}}$ is projected onto the object, the collected signal can be expressed as

$$C_{\mathbb{H}}(u_{0},v_{0}) =\sum_{x=0}^{N-1} \sum_{y=0}^{N-1} P_{\mathbb{H}}^{u_{0},v_{0}}(x,y)O(x,y),$$
where $C_{\mathbb {H}}(u_{0},v_{0})$ is the coefficient of the Hadamard frequency point $(u_{0},v_{0})$, $O(x,y)$ is the object of $N\times N$ pixels, and $(x,y)$ represents the coordinate of the spatial domain. The object’s terahertz image can be reconstructed by $R= \mathbb {H}^{-1}\left \{ C_{\mathbb {H}}(u,v) \right \}$, where $\mathbb {H}^{-1}\{\cdot \}$ is the inverse Hadamard transform. The Hadamard basis patterns can be generated by
$$P_{\mathbb{H}}^{u_{0},v_{0}}(x,y)=\frac{1}{2} \left [ 1+\mathbb{H}^{{-}1}\left \{ \delta (u-u_{0},v-v_{0}) \right \} \right ],$$
where $\delta (u,v)$ is the delta function. Since the Hadamard patterns consist of elements “$-1$” and “1”, whereas the digital micromirror device (DMD) can only project masks of elements “0” and “1”, we need to perform differential measurements to realize the Hadamard basis patterns. The differential HSI takes two measurements, $D_{\mathbb {H+}}$ and $D_{\mathbb {H-}}$ [34], which are acquired with Eq. (1) by projecting the Hadamard basis pattern $P_{\mathbb {H+}}^{u_{0},v_{0}}$ and its inverse $P_{\mathbb {H-}}^{u_{0},v_{0}}$, respectively. $C_{\mathbb {H}}(u_{0},v_{0})$ is then obtained with
$$C_{\mathbb{H}}(u_{0},v_{0}) = D_{\mathbb{H+}} - D_{\mathbb{H-}} \,.$$

Therefore, full sampling of an $N \times N$ pixel image takes $2N^2$ measurements.
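To make the encoding-decoding pipeline above concrete, the following is a minimal NumPy sketch of differential HSI following Eqs. (1)-(3). It assumes natural-ordered Hadamard matrices from SciPy, so the pattern ordering and normalization may differ from the actual implementation; nevertheless, full sampling recovers the object exactly.

```python
# Minimal sketch of differential Hadamard single-pixel imaging (Eqs. (1)-(3)),
# assuming natural-ordered Hadamard matrices; 2*N^2 measurements for full sampling.
import numpy as np
from scipy.linalg import hadamard

N = 32                                    # image is N x N pixels (N a power of 2)
H = hadamard(N)                           # N x N Hadamard matrix with entries +1/-1

def hadamard_basis(u0, v0):
    """2D Hadamard basis pattern (+1/-1) for the frequency point (u0, v0)."""
    return np.outer(H[u0], H[v0])

def measure(obj, pattern):
    """Single-pixel measurement: dot product of object and projected pattern, Eq. (1)."""
    return np.sum(pattern * obj)

def differential_hsi(obj):
    """Full-sampling differential HSI followed by the inverse Hadamard transform."""
    C = np.zeros((N, N))
    for u0 in range(N):
        for v0 in range(N):
            B = hadamard_basis(u0, v0)
            P_plus = (1 + B) / 2          # binary 0/1 pattern for the DMD
            P_minus = (1 - B) / 2         # its inverse pattern
            C[u0, v0] = measure(obj, P_plus) - measure(obj, P_minus)   # Eq. (3)
    return H @ C @ H / N**2               # inverse 2D Hadamard transform (H @ H = N * I)

obj = np.random.rand(N, N)                # stand-in object
print(np.allclose(differential_hsi(obj), obj))   # True: full sampling recovers the object
```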

Similar to the HSI, the FSI uses a series of Fourier patterns to obtain the Fourier spectrum of the object and reconstructs the terahertz image $R$ using the inverse Fourier transform. The Fourier spectrum can be expressed as

$$C_{\mathbb{F}}(u_{0},v_{0}) =\sum_{x=0}^{N-1} \sum_{y=0}^{N-1} O(x,y)\mathrm{exp} \left [{-}j2\pi \left ( \frac{u_{0}x}{N} +\frac{v_{0}y}{N} \right ) \right ] ,$$
where $C_{\mathbb {F}}(u_{0},v_{0})$ is the coefficient of the Fourier frequency point $(u_{0},v_{0})$. The object’s image can be reconstructed by $R= \mathbb {F}^{-1}\left \{ C_{\mathbb {F}}(u,v) \right \}$, where $\mathbb {F}^{-1}\{\cdot \}$ is the inverse Fourier transform. The patterns that we project onto the object are obtained by the inverse Fourier transform of delta functions at different positions:
$$P_{\mathbb{F}}^{u_{0},v_{0},\phi_{0}}(x,y)=\frac{1}{2} \mathrm{real} \left [ 1+\mathbb{F}^{{-}1}\left \{ \delta (u-u_{0},v-v_{0},\phi_{0}) \right \} \right ] \,.$$

Since the Fourier spectra are complex-valued, four-step FSI measurements $D_{0}$, $D_{\pi /2}$, $D_{\pi }$ and $D_{3\pi /2}$, are required, which project Fourier patterns $P_{\mathbb {F}}^{u_{0},v_{0},0}$, $P_{\mathbb {F}}^{u_{0},v_{0},\pi /2}$, $P_{\mathbb {F}}^{u_{0},v_{0},\pi }$ and $P_{\mathbb {F}}^{u_{0},v_{0},3\pi /2}$ onto the object, respectively:

$$D_{\phi_{0}}^{u_{0},v_{0}}= \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} O(x,y)P_{\mathbb{F}}^{u_{0},v_{0},\phi_{0}}(x,y)\,.$$

Therefore, $C_{\mathbb {F}}(u_{0},v_{0})$ can be obtained with

$$C_{\mathbb{F}}(u_{0},v_{0}) = (D_{\pi}-D_{0})+j(D_{3\pi/2}-D_{\pi/2})\,.$$

Since the Fourier spectrum is conjugate symmetric, full sampling of an image of $N \times N$ pixels using the four-step FSI also takes $2N^2$ measurements. Because the above four series of Fourier patterns consist of grayscale elements in the range $[0,1]$, they can be binarized by spatial dithering for the DMD system [19]. Undersampling refers to acquiring only the low-frequency coefficients and omitting the high-frequency ones, based on the prior knowledge that most of the energy of natural images is concentrated in the low-frequency bands. To reduce the number of measurements, we therefore use the square sampling strategy for the Hadamard spectrum and the diamond sampling strategy for the Fourier spectrum, respectively [20].
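For comparison, here is an analogous NumPy sketch of four-step FSI following Eqs. (4)-(7), using grayscale Fourier patterns before any DMD dithering. The phase-step combination is written so that the inverse FFT returns the object directly; the overall sign convention may differ between implementations, and conjugate symmetry (not exploited here for brevity) would halve the number of measurements, as noted above.

```python
# Minimal sketch of four-step Fourier single-pixel imaging (Eqs. (4)-(7)),
# using grayscale patterns; conjugate symmetry is not exploited for brevity.
import numpy as np

N = 32
x, y = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")

def fourier_pattern(u0, v0, phi0):
    """Grayscale Fourier basis pattern with values in [0, 1], Eq. (5)."""
    return 0.5 * (1 + np.cos(2 * np.pi * (u0 * x + v0 * y) / N + phi0))

def measure(obj, pattern):
    return np.sum(pattern * obj)          # single-pixel measurement, Eq. (6)

def four_step_fsi(obj):
    C = np.zeros((N, N), dtype=complex)
    for u0 in range(N):
        for v0 in range(N):
            D = [measure(obj, fourier_pattern(u0, v0, p))
                 for p in (0, np.pi / 2, np.pi, 3 * np.pi / 2)]
            # Four-step phase shifting recovers the complex Fourier coefficient
            C[u0, v0] = (D[0] - D[2]) + 1j * (D[1] - D[3])
    return np.real(np.fft.ifft2(C))       # reconstruction by inverse Fourier transform

obj = np.random.rand(N, N)
print(np.allclose(four_step_fsi(obj), obj))   # True: full sampling recovers the object
```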

The PENSI approach developed in this work is illustrated in Fig. 1. Similar to Ref. [35], the developed PEN also consists of three main parts. The first part is a physics encoding layer that generates a set of optimal encoding patterns $P_{\mathbb {DL}}$, which are projected onto the object in order to modulate the terahertz intensity. The corresponding signal collected by the single-pixel detector can be expressed as

$$D_{\mathbb{DL}} =\sum_{x=1}^{N} \sum_{y=1}^{N} P_{\mathbb{DL}}(x,y)O(x,y)\,.$$

The second part is a channel attention layer that reconstructs the object’s terahertz image,

$$O^{'}=\sum_{i=1}^{M} D_{\mathbb{DL}}^{i} P_{\mathbb{DL}}^{i} W^{i},$$
where $M$ represents the number of optimal encoding patterns, and $W$ is the weight of the convolution layer with channel attention. Here, channel attention is adopted to improve the representational power of the network by explicitly modeling the interdependencies between the channels of its convolutional features [36]. This is distinct from Ref. [35], where differential ghost imaging was adopted. The third part is a U-net network [37] that enhances the reconstructed image through $R=U_{\theta }(O^{'})$, where $\theta$ represents parameters used in the U-net structure.
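The three-part structure just described can be sketched as a small PyTorch module, shown below. The layer sizes, the squeeze-and-excitation style of the channel attention, and the lightweight enhancement stage standing in for the U-net are illustrative assumptions rather than the authors' exact architecture; only the overall data flow of Eqs. (8)-(9) and the binarization of the encoding patterns follow the text.

```python
# Illustrative PyTorch sketch of the PEN forward model: a physics encoding layer with
# M learnable binary patterns (Eq. (8)), a channel-attention decoding layer (Eq. (9)),
# and a small convolutional stand-in for the U-net enhancement stage.
import torch
import torch.nn as nn

class PEN(nn.Module):
    def __init__(self, n=32, m=16):
        super().__init__()
        self.n, self.m = n, m
        # Physics encoding layer: M learnable N x N patterns (binarized for the DMD).
        self.patterns = nn.Parameter(torch.rand(m, n, n))
        # Squeeze-and-excitation style channel attention over the M back-projected channels.
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(m, m // 4), nn.ReLU(),
            nn.Linear(m // 4, m), nn.Sigmoid())
        # 1x1 convolution fusing the weighted channels into a coarse image (the W^i in Eq. (9)).
        self.fuse = nn.Conv2d(m, 1, kernel_size=1)
        # Stand-in for the U-net enhancement stage R = U_theta(O').
        self.enhance = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1))

    def binary_patterns(self):
        # Hard threshold to 0/1 with a straight-through gradient so training still flows.
        p = self.patterns
        return (p > 0.5).float() + p - p.detach()

    def forward(self, obj):                                   # obj: (B, 1, N, N)
        P = self.binary_patterns()                            # (M, N, N)
        D = torch.einsum("mxy,bcxy->bm", P, obj)              # measurements, Eq. (8)
        stack = D[:, :, None, None] * P[None]                 # back-projection, (B, M, N, N)
        w = self.attention(stack)                             # per-channel weights, (B, M)
        coarse = self.fuse(stack * w[:, :, None, None])       # weighted fusion, Eq. (9)
        return self.enhance(coarse)                           # enhanced image R

model = PEN(n=32, m=16)
print(model(torch.rand(2, 1, 32, 32)).shape)                  # torch.Size([2, 1, 32, 32])
```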

Fig. 1. Diagram of the developed terahertz PENSI framework.

The weights of the decoding layer $W$, the parameters of the U-net $\theta$, and the optimal encoding patterns $P_{\mathbb {DL}}$ need to be trained. Starting from random initialization, the patterns $P_{\mathbb {DL}}$ and the parameters $W$ and $\theta$ of the model can be optimized by solving

$$\left \{ P_{\mathbb{DL}},W,\theta \right \} = \min_{\theta \in \Theta ,P\in \mathbb{P},W\in \mathbb{W} } \left \|R_{k}-O_{k} \right \| ^2,$$
where $R_{k}$ and $O_{k}$ represent the output image and the input image in the training data set, respectively. In order to generate binary patterns $P_{\mathbb {DL}}$ that facilitate the experimental realization using the DMD, we adopt a hard threshold to constrain the elements of $P_{\mathbb {DL}}$ to be 0 or 1.

3. Simulation results

The freely available MNIST database [38] of handwritten digits “0”–“9” was used in the simulations. A training set of 60,000 randomly selected samples was adopted to optimize all the parameters of the developed PEN, and a test set of another 10,000 randomly selected samples was used to evaluate the trained network. The digits were first size-normalized and centered in a fixed-size image of $28\times 28$ pixels. We then resized the $28\times 28$ pixel MNIST digit images to $32\times 32$ or $64\times 64$ pixels as the input of the PEN, and generated the optimal encoding patterns of the same resolution.

In the implementation of the network, the learning rate [39] is set to 0.0002, and the batch size for batch normalization [40] is 512. The training is conducted on a high-end computer with an Intel i9-12900K CPU, 64 GB RAM, and an NVIDIA RTX 3090 GPU. The model learning curves (not shown here) indicate that high accuracy and relatively fast convergence can be achieved within 100 epochs.
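A hedged training sketch under these settings is given below. It assumes the PEN module sketched in Section 2 and the torchvision MNIST loader; the learning rate (0.0002), batch size (512), and 100 epochs follow the values quoted above, while the Adam optimizer and other details are assumptions.

```python
# Training sketch for the PEN with the MSE objective of Eq. (10); the optimizer choice
# is an assumption, while the learning rate, batch size, and epoch count follow the text.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize(32),                      # 28 x 28 MNIST digits resized to 32 x 32
    transforms.ToTensor()])
train_set = datasets.MNIST("data", train=True, download=True, transform=transform)
loader = DataLoader(train_set, batch_size=512, shuffle=True)

model = PEN(n=32, m=16)                         # M = 16 measurements, i.e. S_R = 1.56%
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
criterion = nn.MSELoss()                        # || R_k - O_k ||^2, Eq. (10)

for epoch in range(100):                        # convergence within 100 epochs (see text)
    for images, _ in loader:                    # labels are not used
        recon = model(images)
        loss = criterion(recon, images)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```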

Figure 2 shows four typical encoding binary patterns that are optimally generated by the developed PEN for MNIST handwritten digit images of two different resolutions, $32\times 32$ pixels and $64\times 64$ pixels. We find that these patterns have a random distribution of binary values (black for “0” and white for “1”) with sparse streaks in the center, and that the patterns for the $64 \times 64$ pixel case have higher resolution with denser streaks. Note that the distribution of binary values, which is first initialized randomly and then optimized through the network training, depends on the nature of the images in the training set. As a comparison, the basis patterns for the HSI and the FSI are well defined, corresponding to the Hadamard and Fourier spectra in the spatial domain, respectively, as shown in Ref. [22].

Fig. 2. Four typical encoding binary patterns that are optimally generated by the developed PEN for MNIST handwritten digit images of two different resolutions, (a) $32\times 32$ pixels and (b) $64\times 64$ pixels.

We first focus on the scenario of $32\times 32$ pixels. Eight sampling processes with different numbers of measurements, $M=8$, 16, 32, 64, 128, 256, 512, and 1024, were performed. The sampling ratio $S_{\rm R}$ is defined as $S_{\rm R}\equiv M/N^2$. Thus, full sampling means $M=N^2$, i.e., $M=1024$ for $N=32$. In contrast, for the HSI or the FSI, since full sampling of an $N \times N$ pixel image requires $2N^2$ measurements, the sampling ratio should be defined as $S_{\rm R}\equiv M/(2N^2)$. Thus, full sampling for the HSI or the FSI means $M=2N^2$, i.e., $M=2048$ for $N=32$. We emphasize that here the definition of the sampling ratio for the HSI or the FSI is slightly different from that in our previous work [22], where $S_{\rm R}\equiv M/N^2$ was conventionally used.

Figure 3 shows the ground truth images and the simulation results of the developed PENSI, compared with those of the classical HSI and FSI. For a fair comparison, the same number of measurements was used for the different approaches. The results show that as the sampling ratio or the number of measurements increases, the reconstructed images become clearer, and that at the full sampling ratio, i.e., $M=1024$, all the images can be clearly reconstructed for all three approaches. For the HSI, all the reconstructed handwritten digit images can be clearly recognized provided $M\geq 256$; for the FSI, the minimal required number of measurements is reduced by half, to $M=128$. Strikingly, for the developed PENSI, all the reconstructed images can be clearly recognized even when $M=16$, which is an order of magnitude smaller than for the HSI or the FSI, and most reconstructed images (except the digit “2”) can be clearly recognized when the number of measurements is further reduced to $M=8$, which is $1/32$ of that of the HSI or $1/16$ of that of the FSI. Correspondingly, the minimum sampling ratios are $S_{\rm R}=12.5\%$ for the HSI, $S_{\rm R}=6.25\%$ for the FSI, and $S_{\rm R}=1.56\%$ for the PENSI. These sampling ratios for the HSI and the FSI are consistent with the literature: in the visible regime the reported values are $S_{\rm R}=25\%$ and 10% for the HSI and the FSI, respectively [20], and in the terahertz regime they are about $S_{\rm R}=15.5\%$ and 6% (adjusted here according to the new definition in this work) for the HSI and the FSI, respectively, as experimentally demonstrated in our previous work [22]. Therefore, the simulation results show that the developed PENSI approach is much more efficient than the classical HSI and FSI, with a greatly reduced (by an order of magnitude) number of measurements (down to $M=8$ for images of $32\times 32$ pixels) or sampling ratio (down to $S_{\rm R}=1.56\%$) for clear image reconstruction.
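As a small arithmetic check of the sampling ratios quoted above, under the two definitions of $S_{\rm R}$ given earlier:

```python
# Worked check of the quoted sampling ratios for N = 32:
# S_R = M / N^2 for the PENSI and S_R = M / (2 N^2) for the HSI and the FSI.
N = 32
print(f"PENSI, M=16:  S_R = {16 / N**2:.2%}")          # 1.56%
print(f"FSI,   M=128: S_R = {128 / (2 * N**2):.2%}")   # 6.25%
print(f"HSI,   M=256: S_R = {256 / (2 * N**2):.2%}")   # 12.50%
```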

Fig. 3. Comparison of the ground truth and the simulation results obtained with the classical HSI and FSI, and with the developed PENSI using different numbers of measurements. The red boxes indicate the minimum numbers of measurements required for clear recognition of all the reconstructed digit images for the three approaches.

To quantitatively evaluate the reconstructed image quality, we adopted two figures of merit, the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) [41], which are calculated as

$${\rm PSNR}=10\cdot \log_{10}{\frac{255^2}{{\rm MSE}}},$$
$${\rm SSIM}=\frac{(2u_{\rm O} u_{\rm R}+c_1)(2\sigma _{\rm OR}+c_2)}{(u_{\rm O}^{2}+u_{\rm R}^2+c_1)(\sigma_{\rm O}^{2}+\sigma_{\rm R}^2+c_2)},$$
respectively. Here, MSE is the mean square error defined as
$${\rm MSE}\equiv \frac{1}{N^2} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} \left [ O(x,y) -R(x,y)\right ] ^2\,.$$
$u_{\rm O/R}$ and $\sigma_{\rm O/R}^2$ represent the average value and the variance of the input image $O$ and the reconstructed image $R$, respectively, $\sigma_{\rm OR}$ denotes the covariance of the input image $O$ and the reconstructed image $R$, and $c_1$ and $c_2$ are constants used to maintain numerical stability.
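A minimal NumPy implementation of these two figures of merit is given below, assuming 8-bit images (peak value 255) and the global, single-window form of SSIM with the commonly used constants $c_1=(0.01\times 255)^2$ and $c_2=(0.03\times 255)^2$; library implementations typically use a local sliding-window variant instead.

```python
# PSNR and (global) SSIM per Eqs. (11)-(13), assuming 8-bit image data.
import numpy as np

def psnr(O, R):
    mse = np.mean((O.astype(float) - R.astype(float)) ** 2)     # Eq. (13)
    return 10 * np.log10(255.0 ** 2 / mse)                       # Eq. (11)

def ssim_global(O, R, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    O, R = O.astype(float), R.astype(float)
    mu_o, mu_r = O.mean(), R.mean()
    var_o, var_r = O.var(), R.var()
    cov = np.mean((O - mu_o) * (R - mu_r))
    return ((2 * mu_o * mu_r + c1) * (2 * cov + c2) /
            ((mu_o ** 2 + mu_r ** 2 + c1) * (var_o + var_r + c2)))   # Eq. (12)

O = np.random.randint(0, 256, (32, 32))
R = np.clip(O + np.random.randint(-10, 10, (32, 32)), 0, 255)
print(psnr(O, R), ssim_global(O, R))
```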

Figure 4 compares the PSNR and the SSIM of the reconstructed images using the three different approaches as functions of the number of measurements. Here, both the PSNRs and the SSIMs are calculated by averaging over the 10,000 reconstructed images of the test set, and the error bars indicate the standard deviations. Figure 4(a) shows that as $M$ increases, the PSNR increases linearly for all three approaches, that the PSNRs for the FSI are slightly larger than those for the HSI, and that the PSNRs for the PENSI are much larger than those for the HSI and the FSI, with differences of about 10 dB. Specifically, when $M=1024$, the PSNR of the PENSI reaches as high as 38.0 dB, whereas those for the FSI and the HSI are only 24.37 dB and 19.35 dB, respectively.

Fig. 4. (a) Simulated PSNRs and (b) SSIMs of the reconstructed images using three different approaches as functions of the number of measurements. The error bars indicate standard deviations of the test set.

Figure 4(b) shows that for $M\leq 64$, the SSIMs are smaller than 0.2 and increase slowly with $M$ for both the HSI and the FSI. For $M\geq 128$, the SSIMs increase linearly with $M$ and reach 0.38 and 0.45 for the HSI and the FSI, respectively. In general, the SSIMs for the FSI are slightly larger than those for the HSI. For the developed PENSI, the SSIM reaches 0.758 even when $M=8$, increases with $M$ at a decreasing rate, and saturates at a value above 0.95 when $M\geq 32$. All these quantitative results are consistent with the intuitive impression of the reconstructed images in Fig. 3.

4. Experimental demonstrations

Encouraged by the simulation results, we implemented an experimental terahertz PENSI system based on a home-built terahertz time-domain spectroscopy (THz-TDS) setup [42], as schematically shown in Fig. 5. Femtosecond laser pulses ($\lambda =800$ nm, $< 100$ fs, 80 MHz; Coherent Vitesse 800-5) illuminate a terahertz emitter to generate terahertz radiation. The collimated terahertz beam transmits through a laser-engraved metal target of 3.5 mm diameter and 1 mm thickness that is mounted on a high-resistance silicon (HRS) wafer (2000 $\Omega \cdot$cm, 500 $\mu$m thick), both sides of which have been passivated with a thin SiO$_2$ film [43]. The diameter of the collimated terahertz beam is about 3.82 mm and the average power of the terahertz radiation is about 10 $\mu$W. A continuous-wave (CW) laser beam with a wavelength of 808 nm and a maximum power of 300 mW is spatially patterned by a DMD, and is then projected onto the front side of the HRS with a spot size of 4 mm $\times$ 4 mm. Note that the patterned CW laser beam should be slightly larger than the terahertz beam in order to realize effective modulation, while the terahertz beam should be larger than the target area in order to achieve a full-size image.

Fig. 5. Schematic of the experimental setup for the terahertz PENSI system. The target is a laser-engraved metal mounted on the HRS. OAP indicates off-axis parabolic mirror.

The mechanism of spatially modulating the terahertz waves optically in each array pixel was described in Ref. [44] and thus will not be elaborated here. Due to the light-induced increase in free carriers and local conductivity, the terahertz transmission within pixels illuminated by the CW laser is “turned off”, whereas that in other pixels remains high (“on”) [45]. A single-pixel terahertz detector then collects the encoded transmitted wave. We recorded the measurements at the peak of the terahertz field, which was achieved by properly tuning the delay line for the femtosecond laser pulses [22]. We adopted a data acquisition card combined with a lock-in amplifier to obtain the modulated terahertz intensity under illumination by each projected pattern.
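The acquisition procedure can be summarized by the schematic control loop below, in which `dmd` and `lockin` are placeholder objects standing in for the real DMD and lock-in amplifier/DAQ drivers; the method names used on them are hypothetical and only illustrate the flow of projecting one pattern per measurement.

```python
# Schematic single-pixel acquisition loop; `dmd.display` and `lockin.read` are
# hypothetical placeholders for the actual instrument-control calls.
import time
import numpy as np

def acquire(patterns, dmd, lockin, settle_s=0.05):
    """Project each binary pattern and record one single-pixel reading per pattern."""
    readings = np.zeros(len(patterns))
    for i, pattern in enumerate(patterns):
        dmd.display(pattern)             # upload the 0/1 mask to the DMD (hypothetical call)
        time.sleep(settle_s)             # allow the modulator and lock-in output to settle
        readings[i] = lockin.read()      # terahertz peak-field reading (hypothetical call)
    return readings
```

The recorded vector then feeds the trained decoding and U-net stages of the PEN (or the inverse Hadamard/Fourier transform in the HSI/FSI configurations) to reconstruct the image.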

We emphasize that the experimental setup for the terahertz PENSI system is very similar to those of the classical terahertz HSI and FSI systems [22]. The differences are that the encoding patterns are generated by the PEN and that the reconstructed images are enhanced by the U-net network for the PENSI. We note that the PEN used in experiments was trained during simulations using the MNIST database of handwritten digits. By changing the optimal encoding patterns $P_{\mathbb {DL}}$ into the Hadamard basis patterns $P_{\mathbb {H+}}^{u_{0},v_{0}}$ and $P_{\mathbb {H-}}^{u_{0},v_{0}}$, or the Fourier basis patterns $P_{\mathbb {F}}^{u_{0},v_{0},0}$, $P_{\mathbb {F}}^{u_{0},v_{0},\pi /2}$, $P_{\mathbb {F}}^{u_{0},v_{0},\pi }$ and $P_{\mathbb {F}}^{u_{0},v_{0},3\pi /2}$, and by reconstructing the image with the inverse Hadamard transform or the inverse Fourier transform, the experimental setup for the terahertz PENSI system can be conveniently modified into that for the terahertz HSI or FSI system, which was illustrated in our previous work [22].

Figure 6(a) shows the optical images of two laser-engraved objects for digits “5” and “7” in metal. We first consider terahertz images of $32 \times 32$ pixels using the HSI, the FSI and the PENSI systems. Figure 6(b) summarizes the reconstructed images with different numbers of measurements. It is clear that for all three systems, the quality of the reconstructed images increases with $M$. The minimum numbers of measurements for clear recognition of the reconstructed digits “5” and “7” are 128, 64 and 32 for the HSI, the FSI and the PENSI, respectively, as outlined by the red boxes. These numbers are comparable to the simulation results in Fig. 3 when only the two digits “5” and “7” are considered: $M=128$ for the HSI and the FSI, and $M=8$ for the PENSI. We note that in simulation the PENSI approach reduces the minimum number of measurements to $1/8$ of that for the FSI, whereas in experiment the reduction is only $1/2$. This discrepancy arises because the PEN used in both experiments and simulations was trained using the MNIST database of handwritten digits, and was tested with handwritten digits in the simulations but with digits of Calibri font in the experiments. Therefore, the minimum sampling ratio for clear image reconstruction is 3.12% in experiments, which is twice the simulation result (1.56%). In other words, we have experimentally verified that the developed terahertz PENSI system is much more efficient than the classical HSI and FSI systems in terms of the minimum number of measurements for clear recognition of reconstructed terahertz images.

Fig. 6. (a) Optical images of two as-fabricated objects of digits “5” and “7” with scale bar of 1 mm. (b) Experimentally reconstructed terahertz images of $32 \times 32$ pixels using the three methods with different numbers of measurements. Red boxes indicate the minimum numbers of measurements required for clear recognition of the two reconstructed digits: 128 for the HSI, 64 for the FSI, and 32 for the PENSI.

In order to quantitatively evaluate the reconstructed images, we calculate the PSNR and the SSIM by taking the images reconstructed with the PENSI system at the full sampling ratio ($M=1024$) as the terahertz ground truth images. The results are summarized in Tables 1 and 2. For the digits “5” and “7” reconstructed with the three methods, the PSNR and the SSIM both increase with $M$. The PSNRs for the FSI are slightly larger than those for the HSI, and those for the PENSI are generally larger than those for the HSI and the FSI, with differences of about 3–7 dB. The SSIMs are smaller than 0.2 for the HSI with $M\leq 64$ and for the FSI with $M\leq 32$. These correspond to the unrecognizable images below the red boxes in Fig. 6. As a comparison, for the PENSI, the SSIMs are larger than 0.57 even when $M=32$, corresponding to the recognizable reconstructed images. In other words, in terms of the PSNR and the SSIM with the same number of measurements, the developed terahertz PENSI system outperforms the classical terahertz HSI and FSI systems.

Table 1. Calculated PSNRs and SSIMs of the experimentally reconstructed images using different methods for digit “5”.

Table 2. Similar to Table 1 but for digit “7”.

5. Discussion

While we have so far considered digits of $32\times 32$ pixels, the concept and the conclusions are general and can be applied to other types of objects and various resolutions. As a first example, we fabricated four other types of objects: letters, symbols, operators, and Chinese characters. We first consider reconstructing terahertz images of $32\times 32$ pixels using the three single-pixel terahertz imaging methods, the HSI, the FSI and the PENSI. For simplicity and fairness, we fix the number of measurements to $M=128$, the minimum value at which the reconstructed terahertz images of both digits “5” and “7” are recognizable for all three methods, as suggested in Fig. 6.

Figure 7(a) shows the optical images of the as-fabricated objects of the four different types. Figure 7(b) depicts the experimentally reconstructed terahertz images of $32 \times 32$ pixels using the three methods. The results clearly show that for all four types of objects, the PENSI outperforms the other two methods, and that the FSI is much better than the HSI. More specifically, the reconstructed terahertz images using the HSI are blurred, those using the FSI suffer from relatively low contrast, whereas those using the PENSI have clear edges and high contrast. These conclusions are consistent with our previous results obtained from Figs. 3 and 6.

Fig. 7. (a) Optical images of as-fabricated objects of four different types: letters, symbols, operators, and Chinese characters. The scale bar is 1 mm. (b) Experimentally reconstructed terahertz images of $32 \times 32$ pixels using the three methods with 128 measurements.

As another example, we further consider reconstructing terahertz images at two different resolutions, $32\times 32$ pixels and $64\times 64$ pixels. Figure 8 compares the experimentally reconstructed terahertz images of the two different resolutions using the three methods. For a fair comparison, the number of measurements is again fixed to $M=128$ and four specific objects of different types are used. The results show that regardless of the resolution, the terahertz images reconstructed by the PENSI system have the best quality in terms of clarity and contrast, and that the FSI outperforms the HSI. These conclusions are also consistent with those obtained from the simulation and experimental results using digits, as shown in Figs. 3 and 6, respectively. Therefore, we have experimentally demonstrated that the developed PENSI system is much better than the classical HSI and FSI systems for different image resolutions.

Fig. 8. Optical images of four typical objects of different types, and the corresponding experimentally reconstructed terahertz images of $32\times 32$ and of $64\times 64$ pixels obtained using the three methods with 128 measurements. The scale bar is 1 mm.

6. Conclusions

In conclusion, we have developed, for the first time, a high-efficiency terahertz single-pixel imaging approach based on a PEN. The theoretical framework and the experimental realization of the developed terahertz PENSI system have been elaborated by comparing with the classical HSI and FSI systems, highlighting their similarities as well as their differences. By using the optimal encoding patterns and by enhancing the reconstructed image with a U-net network, both of which are trained in advance, the developed PENSI has shown remarkably high efficiency in terms of the minimal number of measurements required for clear image reconstruction. Simulation results based on the MNIST database of handwritten digits have shown that this number can be reduced down to $M=16$ for images of $32\times 32$ pixels with the PENSI, an order of magnitude smaller than for the HSI and the FSI. The corresponding sampling ratio is as low as $S_{\rm R}=1.56\%$. The results have also shown that with the same number of measurements, the PENSI outperforms the HSI and the FSI in terms of the PSNR and the SSIM of the reconstructed images. Experimental results based on various types of objects, including digits, letters, symbols, operators, and Chinese characters, have verified the much higher efficiency, with an ultra-low sampling ratio of 3.12%, and the better performance of the developed terahertz PENSI system over the HSI and FSI systems, regardless of whether the image resolution is $32\times 32$ or $64 \times 64$ pixels. We therefore expect that the developed terahertz PENSI system will advance the development of real-time terahertz single-pixel imaging in a variety of applications, ranging from non-invasive inspection to security and medical diagnosis.

Funding

National Natural Science Foundation of China (62105354); Major Instrumentation Development Program of the Chinese Academy of Sciences (ZDKYYQ20220008); Guangdong International Science and Technology Cooperation Fund (2018A50506065); Program of the Department of Natural Resources of Guangdong Province, China (GDNRC[2020]031); Key Laboratory of Optoelectronic Devices and Systems of Ministry of Education and Guangdong Province.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. R. Fukasawa, “Terahertz imaging: Widespread industrial application in non-destructive inspection and chemical analysis,” IEEE Trans. Terahertz Sci. Technol. 5, 1121–1127 (2015).

2. N. Karpowicz, H. Zhong, C. Zhang, K.-I. Lin, J.-S. Hwang, J. Xu, and X.-C. Zhang, “Compact continuous-wave subterahertz system for inspection applications,” Appl. Phys. Lett. 86(5), 054105 (2005). [CrossRef]  

3. X. Chen, H. Lindley-Hatcher, R. I. Stantchev, J. Wang, K. Li, A. Hernandez Serrano, Z. D. Taylor, E. Castro-Camus, and E. Pickwell-MacPherson, “Terahertz (THz) biophotonics technology: Instrumentation, techniques, and biomedical applications,” Chem. Phys. Rev. 3(1), 011311 (2022). [CrossRef]  

4. Z. Yan, L.-G. Zhu, K. Meng, W. Huang, and Q. Shi, “THz medical imaging: from in vitro to in vivo,” Trends Biotechnol. 40(7), 816–830 (2022). [CrossRef]  

5. D. Alves-Lima, J. Song, X. Li, A. Portieri, Y. Shen, J. A. Zeitler, and H. Lin, “Review of terahertz pulsed imaging for pharmaceutical film coating analysis,” Sensors 20(5), 1441 (2020). [CrossRef]  

6. L. Afsah-Hejri, P. Hajeb, P. Ara, and R. J. Ehsani, “A comprehensive review on food applications of terahertz spectroscopy and imaging,” Compr. Rev. Food. Sci. Food Saf. 18(5), 1563–1621 (2019). [CrossRef]  

7. G. M. Gibson, S. D. Johnson, and M. J. Padgett, “Single-pixel imaging 12 years on: a review,” Opt. Express 28(19), 28190–28208 (2020). [CrossRef]  

8. M. P. Edgar, G. M. Gibson, and M. J. Padgett, “Principles and prospects for single-pixel imaging,” Nat. Photonics 13(1), 13–20 (2019). [CrossRef]  

9. W. Jiang, X. Li, S. Jiang, Y. Wang, Z. Zhang, G. He, and B. Sun, “Increase the frame rate of a camera via temporal ghost imaging,” Opt. Lasers Eng. 122, 164–169 (2019). [CrossRef]  

10. L. Zanotto, R. Piccoli, J. L. Dong, R. Morandotti, and L. Razzari, “Single-pixel terahertz imaging: a review,” Opto-Electron. Adv. 3(9), 20001201 (2020). [CrossRef]  

11. Q. Hu, X. Wei, Y. Pang, and L. Lang, “Advances on terahertz single-pixel imaging,” Front. Physics 10, 982640 (2022). [CrossRef]  

12. X. Yang, Z. Tian, X. Chen, M. Hu, Z. Yi, C. Ouyang, J. Gu, J. Han, and W. Zhang, “Terahertz single-pixel near-field imaging based on active tunable subwavelength metallic grating,” Appl. Phys. Lett. 116(24), 241106 (2020). [CrossRef]  

13. W. L. Chan, K. Charan, D. Takhar, K. F. Kelly, R. G. Baraniuk, and D. M. Mittleman, “A single-pixel terahertz imaging system based on compressed sensing,” Appl. Phys. Lett. 93(12), 121105 (2008). [CrossRef]  

14. M. J. Sun, M. P. Edgar, G. M. Gibson, B. Q. Sun, N. Radwell, R. Lamb, and M. J. Padgett, “Single-pixel three-dimensional imaging with time-based depth resolution,” Nat. Commun. 7(1), 12010 (2016). [CrossRef]  

15. S. Jiang, X. Li, Z. Zhang, W. Jiang, Y. Wang, G. He, Y. Wang, and B. Sun, “Scan efficiency of structured illumination in iterative single pixel imaging,” Opt. Express 27(16), 22499–22507 (2019). [CrossRef]  

16. Q. Yi, L. Z. Heng, L. Liang, Z. Guangcan, C. F. Siong, and Z. Guangya, “Hadamard transform-based hyperspectral imaging using a single-pixel detector,” Opt. Express 28(11), 16126–16139 (2020). [CrossRef]  

17. E. Hahamovich, S. Monin, Y. Hazan, and A. Rosenthal, “Single pixel imaging at megahertz switching rates via cyclic Hadamard masks,” Nat. Commun. 12(1), 4516 (2021). [CrossRef]  

18. Z. Zhang, X. Ma, and J. Zhong, “Single-pixel imaging by means of Fourier spectrum acquisition,” Nat. Commun. 6(1), 6225 (2015). [CrossRef]  

19. Z. Zhang, X. Wang, G. Zheng, and J. Zhong, “Fast Fourier single-pixel imaging via binary illumination,” Sci. Rep. 7(1), 12029 (2017). [CrossRef]  

20. Z. Zhang, X. Wang, G. Zheng, and J. Zhong, “Hadamard single-pixel imaging versus Fourier single-pixel imaging,” Opt. Express 25(16), 19619–19639 (2017). [CrossRef]  

21. R. I. Stantchev, X. Yu, T. Blu, and E. Pickwell-MacPherson, “Real-time terahertz imaging with a single-pixel detector,” Nat. Commun. 11(1), 2535 (2020). [CrossRef]  

22. R. She, W. Liu, Y. Lu, Z. Zhou, and G. Li, “Fourier single-pixel imaging in the terahertz regime,” Appl. Phys. Lett. 115(2), 021101 (2019). [CrossRef]  

23. S. Rizvi, J. Cao, K. Y. Zhang, and Q. Hao, “Deringing and denoising in extremely under-sampled Fourier single pixel imaging,” Opt. Express 28(5), 7360–7374 (2020). [CrossRef]  

24. M. Lyu, W. Wang, H. Wang, H. Wang, G. Li, N. Chen, and G. Situ, “Deep-learning-based ghost imaging,” Sci. Rep. 7(1), 17865 (2017). [CrossRef]  

25. Y. He, G. Wang, G. Dong, S. Zhu, H. Chen, A. Zhang, and Z. Xu, “Ghost imaging based on deep learning,” Sci. Rep. 8(1), 6469 (2018). [CrossRef]  

26. S. Rizvi, J. Cao, K. Zhang, and Q. Hao, “Improving imaging quality of real-time Fourier single-pixel imaging via deep learning,” Sensors 19(19), 4190 (2019). [CrossRef]  

27. C. F. Higham, R. Murray-Smith, M. J. Padgett, and M. P. Edgar, “Deep learning for real-time single-pixel video,” Sci. Rep. 8(1), 2369 (2018). [CrossRef]  

28. M. Qiao, Z. Meng, J. Ma, and Y. Xin, “Deep learning for video compressive sensing,” APL Photonics 5(3), 030801 (2020). [CrossRef]  

29. W. Jiang, X. Li, X. Peng, and B. Sun, “Imaging high-speed moving targets with a single-pixel detector,” Opt. Express 28(6), 7889–7897 (2020). [CrossRef]  

30. N. Radwell, S. D. Johnson, M. P. Edgar, C. F. Higham, R. Murray-Smith, and M. J. Padgett, “Deep learning optimized single-pixel LiDAR,” Appl. Phys. Lett. 115(23), 231101 (2019). [CrossRef]  

31. Z. Long, T. Wang, C. You, Z. Yang, K. Wang, and J. Liu, “Terahertz image super-resolution based on a deep convolutional neural network,” Appl. Opt. 58(10), 2731–2735 (2019). [CrossRef]  

32. Y. Wang, F. Qi, and J. Wang, “Terahertz image super-resolution based on a complex convolutional neural network,” Opt. Lett. 46(13), 3123–3126 (2021). [CrossRef]  

33. Y. Zhu, R. She, W. Liu, Y. Lu, and G. Li, “Deep learning optimized terahertz single-pixel imaging,” IEEE Trans. Terahertz Sci. Technol. 12(2), 165–172 (2022). [CrossRef]  

34. R. I. Stantchev, K. Li, and E. Pickwell-MacPherson, “Rapid imaging of pulsed terahertz radiation with spatial light modulators and neural networks,” ACS Photonics 8(11), 3150–3155 (2021). [CrossRef]  

35. F. Wang, C. Wang, C. Deng, S. Han, and G. Situ, “Single-pixel imaging using physics enhanced deep learning,” Photonics Res. 10(1), 104–110 (2022). [CrossRef]  

36. J. Hu, L. Shen, S. Albanie, G. Sun, and E. Wu, “Squeeze-and-excitation networks,” IEEE Trans. Pattern Anal. Mach. Intell. 42(8), 2011–2023 (2020). [CrossRef]  

37. O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in Medical Image Computing and Computer-Assisted Intervention (Springer International Publishing, 2015), pp. 234–241.

38. L. Deng, “The MNIST database of handwritten digit images for machine learning research [Best of the Web],” IEEE Signal Process. Mag. 29(6), 141–142 (2012). [CrossRef]  

39. A. Senior, G. Heigold, M. Ranzato, and K. Yang, “An empirical study of learning rates in deep neural networks for speech recognition,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (2013).

40. S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in International Conference on Machine Learning, Vol. 37 (2015), pp. 448–456.

41. Z. Ye, P. Qiu, H. Wang, J. Xiong, and K. Wang, “Image watermarking and fusion based on Fourier single-pixel imaging with weighed light source,” Opt. Express 27(25), 36505–36523 (2019). [CrossRef]  

42. R. She, Y. Lu, W. Liu, R. Zhang, and G. Li, “A low-cost single-pixel terahertz imaging method using near-field photomodulation and compressed sensing,” Proc. SPIE 11196, 111960O (2019). [CrossRef]  

43. R. She, W. Liu, G. Wei, Y. Lu, and G. Li, “Terahertz single-pixel imaging improved by using silicon wafer with SiO2 passivation,” Appl. Sci. 10(7), 2427 (2020). [CrossRef]  

44. L.-J. Cheng and L. Liu, “Optical modulation of continuous terahertz waves towards cost-effective reconfigurable quasi-optical terahertz components,” Opt. Express 21(23), 28657–28667 (2013). [CrossRef]  

45. A. Kannegulla, Z. Jiang, S. M. Rahman, M. I. B. Shams, P. Fay, H. G. Xing, L.-J. Cheng, and L. Liu, “Coded-aperture imaging using photo-induced reconfigurable aperture arrays for mapping terahertz beams,” IEEE Trans. Terahertz Sci. Technol. 4(3), 321–327 (2014). [CrossRef]  
