Convolutional neural network model based on terahertz imaging for integrated circuit defect detections

Qi Mao; Yunlong Zhu; Cixing Lv; Yao Lu; Xiaohui Yan; Shihan Yan; Jingbo Liu

doi:10.1364/OE.384146

1. Introduction

The techniques of passive imaging and inspection used for the detection of integrated circuit (IC) defects have received increased attention [1,2]. Catastrophic failure can damage numerous electronic components. There are many causes of IC failures [3–5]. One of the most frequent causes is electrical overstress, which arises from excessive applied voltage, reversed voltage polarity, or short circuits, and may be transient or intermittent. Without static protection, an IC surface may accumulate static charge through the friction developed when the electronic device slides on the surface of another material, or when the device is transported across the floor. In these cases, voltages as high as 12 kV may be generated, which is approximately 2400 times the operating voltage of the ICs [3].

Bond wire melting, metallization diffusion, and dielectric breakdown are root causes of IC failures. These are attributed to the thermal shocks induced by electrical overstress, whereby the dielectric breakdown of the ICs also generates more internal defects, such as cracks, delamination, and voids inside the packaging materials. Park et al. analyzed the failures caused by various packages and confirmed the cause of internal defects induced during the IC packaging process. Sometimes, the current and voltage characteristics of the defective ICs and the qualified ICs are indistinguishable [6]. To locate the voids of the packaged IC materials and delamination of the dielectric, sample cross-sectional views of the x-z or y-z planes were acquired with scanning electron microscopy. In this time-consuming detection process, the sampled ICs are completely destroyed [7].

Subsequently, X-ray and terahertz (THz) imaging inspection are used to detect IC defects. These imaging techniques have become the main technologies used for contactless IC defect detections. X-rays can penetrate the nonmetallic material and image the internal IC metallic structure based on the received emission energy. However, X-rays are unable to detect the cracks, delamination, and voids inside the ICs. Furthermore, the ionization characteristics of X-ray are highly likely to damage electronic devices, and are harmful to humans [8]. Consequently, it is not appropriate to use X-rays for online real-time detection of IC defects.

The detection of defective ICs is crucial to IC manufacturing, because defective ICs that are integrated into a system cause serious board-level defects. An alternative approach that has received increased attention in recent years is THz imaging [9–11]. The THz spectrum ranges between 300 GHz (0.3 THz) and 30 THz. This range is often referred to as the “terahertz gap,” and possesses an inherent ability to “see-through” numerous types of nonmetallic materials, that is, THz radiation is capable of penetrating most of these materials [12–15]. This advantage allows the use of THz spectroscopic imaging in the detection of the internal structures of various objects, including ICs. Given that many materials have unique spectral fingerprints in the THz spectral region, THz imaging technology is well suited to defect detection [16]. Nonetheless, a primary obstacle to the application of THz defect detection in industry is the low resolution of the THz system. Moreover, the imaging technique is not the only factor that determines the efficiency of defect detection.

In previous research, fault determination and fault feature identification are still completed manually with a high misclassification rate and low efficiency. To further improve the efficiency and accuracy of fault identification of a large quantity of packaged ICs, noncontact, nondestructive, and fully automated machine vision-based defect detection technologies aiming to assist or to replace human inspectors were proposed for IC inspection tasks in the last few years. Generally, these studies proposed various feature extraction techniques to characterize defects and then used classifiers to classify them. Among all the machine learning algorithms, deep learning (DL) is the most successfully applied technique to solve challenging computer vision tasks [17], such as target classification and defect detection [18]. New breakthroughs based on the use of this method continue to emerge [19]. For instance, Simonyan et al. used a 3×3 convolution filter to build a deep convolutional neural network (CNN) model known as VGG16 network, which was successfully implemented in computer vision tasks [20]. Using the residual learning framework ResNet, He et al. proposed a simplified CNN model that improved the efficiency of model training and achieved significant improvements in accuracy [21].

Typically, these deep CNN models contain a large number of parameters that require time-consuming training. When trained with a small dataset, models with an ill-matched architecture can overfit [22]. To solve the problems mentioned above, we propose a compact CNN model based on the VGG16 network to achieve faster yet accurate defect detection. To ensure that our model fits practical applications, defective ICs were artificially fabricated by emulating the defects of actual ICs. The THz imaging defect detection dataset was established with THz spectral imaging technology.

The main contributions of this study are two-fold. First, a THz imaging IC dataset was collected and labeled. Secondly, a CNN architecture was proposed based on the VGG16 network that improved the optimization performance of the CNN model based on a noncanonical optimization strategy. Finally, the experimental results demonstrated the efficacy of the proposed CNN model of THz imaging IC defect detection.

2. Dataset

In this section, we firstly describe how we have fabricated defective ICs, and introduce the THz transmission imaging system and some related imaging results that have shown the spectral signature of the defective region and its corresponding qualified region in the ICs. Given that the DL-based detection method requires a large training dataset of images, we have established a defective dataset and a qualified dataset for the application of THz IC defect detection.

2.1 IC sample preparation

There are several common types of IC packaging, including shrink double-in-line, leadless chip carrier, quad flat no-lead, and quad flat pack packaging schemes. We prepared 46 qualified ICs and 82 defective ICs for THz imaging. The defective ICs emulated IC field failures: qualified ICs were subjected to a voltage 20 V greater than the stipulated voltage.

2.2 THz IC imaging

The THz imaging system has emerged as a powerful tool for detecting nonmetallic defects. It is thus applicable to many types of IC packaging composed of nonmetallic materials, which THz waves penetrate while losing only a minor part of their energy. With the use of THz imaging in transmission mode, the detection of the previously mentioned damage in ICs becomes possible at a spatial scale of approximately 100 microns. The system used in our study is the T-Gauge 5000 manufactured by Advanced Photonix, as shown in Fig. 1(a). Figure 1(b) is a photograph of the actual THz time-domain spectroscopy (THz-TDS) system operating in transmission mode. The measured sample was fixed on a two-dimensional moving stage. The spectral bandwidth ranges from 0.01 to 5 THz, and the signal-to-noise ratio within this frequency band is > 60 dB. The f-number is the ratio of the focal length to the diameter of the focusing lens. The focal length of our system in the transmission mode is 150 mm and the diameter of the lens is 38 mm, which gives an f-number of approximately 3.95. When the THz beam passes through the integrated circuit shell, the intensity of its THz pulse is recorded by the imaging system’s detector.

Fig. 1. (a) Schematic of the T-Gauge 5000 system. (b) Photograph of the THz-TDS system operating in the transmission mode.

In this study, we performed a series of THz imaging tests on defective and qualified ICs. The entire set of tests was completed in an ultraclean laboratory that ensured temperature and humidity stability. We employed point-by-point raster scanning with the THz beam focused on the samples, and we set the scanning step length and scanning speed of the moving stage to 0.25 mm and 50 mm/s, respectively. With the THz-TDS system, each data pixel captures the spectral information in the time domain. After preprocessing the raw data, we obtained many two-dimensional images in the frequency range from 0.01 to 5 THz. We conducted a series of tests with the X-ray and THz imaging systems. Some IC defects and nonmetallic damage attributed to packaging voids or delamination of the dielectric substrate were not visible in the X-ray images. Taking the STC89C52C IC as an example, the scanning step length was 0.25 mm and the imaging area of the XY-stage was 2.7 cm×7 cm, so the corresponding THz image, and hence the acquisition matrix, has a size of 108×280. When the imaging area of the XY-stage changes, a different acquisition matrix is acquired; the acquisition matrix thus depends entirely on the scanning area of the XY-stage. Even if several ICs have the same size, their acquisition matrices may differ. All images were uniformly resized to 64×64 as the CNN input. Typical imaging results for a qualified IC are presented in Figs. 2(a) and 2(b), and typical results for a defective IC are presented in Figs. 2(c) and 2(d), whereby the images on the left column are the X-ray images and the images on the right column are the THz images at a frequency of 0.6 THz. Comparing the qualified and defective ICs with X-ray imaging, we cannot detect the delamination defects of the dielectric substrate in the left image in Fig. 2(c). By contrast, the THz image on the right distinctly shows the damaged part (Fig. 2(d)).
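
The relation between scan area, step length, and acquisition matrix described above can be checked with a short sketch. The function name `acquisition_matrix` is hypothetical, and the sketch assumes that the number of scan points per axis is simply the scan length divided by the step length, which reproduces the 108×280 size quoted for the STC89C52C.

```python
def acquisition_matrix(width_mm, height_mm, step_mm):
    """Return the number of raster-scan points along each axis.

    Assumption: points per axis = scan length / step length.
    round() guards against floating-point error in the division.
    """
    return round(width_mm / step_mm), round(height_mm / step_mm)


# STC89C52C example from the text: 2.7 cm x 7 cm area, 0.25 mm step.
size = acquisition_matrix(27.0, 70.0, 0.25)
print(size)  # (108, 280)
```

A different scan area with the same step length gives a different matrix, which is why the images are resized to a common 64×64 input afterwards.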

Fig. 2. (a) X-ray image of the qualified IC. (b) THz image of the qualified IC. (c) X-ray image of the defective IC. (d) THz image of the defective IC.

To explain why the imaging results of the qualified and defective ICs differ so markedly, we analyzed the THz-TDS waveforms to characterize the relevant features. The detected THz time-domain pulses from the reference and from two pairs of qualified and defective ICs are illustrated in Fig. 3(a). The THz transmission pulses from the tested defective ICs show significant time delays and attenuation. Figure 3(b) shows that the frequency-domain pulse amplitudes of the defective ICs are considerably smaller than those of the qualified ICs. This change in the defective region arises from the energy attenuation caused by the retrograde oscillation of the THz waves at the defects in the IC. Equivalently, the THz amplitude extinction coefficient of the defect area of the defective IC is significantly higher than that of the qualified area. The THz spectral characteristics are therefore sufficient to distinguish specific regions and to infer whether they correspond to a qualified or a defective IC. The value of the amplitude extinction coefficient can be calculated using the following equations [23].

(1)$$\alpha = 20(\log_{10} e)\,\mu_a \approx 8.7\,\mu_a$$

(2)$$\mu_a = -\frac{1}{d}\ln \frac{A_T}{A_0}$$

where d is the thickness of the IC, μ_a is the attenuation coefficient in units of cm⁻¹, α is the amplitude attenuation coefficient in units of dB/cm, A₀ is the THz pulse reference amplitude, and A_T is the amplitude of the THz pulse that has propagated through the IC. The amplitude extinction coefficient identifies the regions in which THz imaging reveals defects, based on the attenuation of the signal caused by the retrograde THz wave oscillations that originate from the stratifications or voids of the medium. Using these equations, the variation of the amplitude extinction coefficient of the qualified and defective ICs can be calculated and plotted, as shown in Fig. 3(c).
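
As a minimal sketch of Eqs. (1) and (2), the two coefficients can be computed from a pair of amplitudes and a thickness. The function names and the sample values (an amplitude that halves across a 0.3 cm package) are hypothetical and are not measurements from this study. Eq. (2) is the natural-logarithm form, and Eq. (1) converts it to the decibel form through the factor 20·log₁₀(e) ≈ 8.7.

```python
import math


def mu_a(a_t, a_0, d):
    """Eq. (2): natural-log attenuation per unit thickness."""
    return -math.log(a_t / a_0) / d


def alpha(a_t, a_0, d):
    """Eq. (1): the same attenuation expressed in decibel form."""
    return 20.0 * math.log10(math.e) * mu_a(a_t, a_0, d)


# Hypothetical example: the transmitted amplitude is half the reference
# amplitude after passing through a 0.3 cm thick sample.
mu_example = mu_a(0.5, 1.0, 0.3)
alpha_example = alpha(0.5, 1.0, 0.3)
print(round(mu_example, 3), round(alpha_example, 3))
```

Because 20·log₁₀(e)·ln(x) = 20·log₁₀(x), the value of α is identical to the amplitude ratio expressed in decibels and divided by the thickness.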

Fig. 3. (a) THz time-domain pulses obtained from the reference and two sets of qualified and defective ICs. (b) THz frequency-domain pulse amplitudes of the qualified and defective ICs. (c) THz amplitude extinction coefficients of the qualified and defective ICs.

2.3 THz imaging IC detection dataset

Defect detection differs from object classification in the field of computer vision. In object classification tasks, the goal is to determine the category of the object, and fixed-scale image pixels are used as input to the classification model. However, defect detection by THz imaging differs from general object classification in two respects: a) the resolution of THz imaging increases as the THz wave frequency increases, and b) the reduced penetration depth of the THz wave into the sample results in an image with a poor signal-to-noise ratio. When a THz imaging IC dataset is constructed to detect defects, we must collect as many target images as possible to train a good predictive model.

First, the original THz time-domain data should be preprocessed. Taking into account the aforementioned facts, we selected 26 images from each sample, covering frequencies between 0.5 THz and 0.8 THz. We then constructed the THz imaging IC defect detection dataset, which consisted of 1184 qualified IC THz images and 2144 defective IC THz images, whereby the acquisition matrices of the images differed, as shown in Fig. 4.

Fig. 4. Number of images as a function of the acquisition matrix (Type A: 60×100, Type B: 100×240, Type C: 80×200, Type D: 108×280, Type E: 120×160, Type F: 120×200, Type G: 120×260, and Type H: 120×148).

One of the disadvantages of IC THz images is their low resolution. This makes their application to industrial defect detection impractical most of the time. Part of the reason is that the defects present in the THz IC images are difficult to distinguish for both human inspectors and traditional algorithms. Therefore, it is very important to find a better identification model to replace the manual detection schemes used for industrial IC defect detection.

3. Proposed method

The most extensively used DL model is the CNN. The basic CNN is characterized by its convolutional layer that performs convolution operations on the input data. The convolutional layer can be viewed as a set of neurons, and the convolution operation is interpreted as an image filtering process. However, the convolutional layer is different from the classical neural layer. Firstly, each neuron in the convolutional layer only connects to a small subset of neurons in the adjacent network layer. In this way, a single neuron extracts and analyzes a small part of the image features. Secondly, neurons in the same set of filters share their weight values. These two characteristics of the convolutional layer are critical for effective image processing outcomes, and greatly reduce the number of neural network weighting parameters and the training time.

A CNN structure usually consists of the convolutional layers, the pooling layers, and the fully connected layers. To obtain the feature maps of the imported images, a number of filters are applied in the convolutional layers. As mentioned earlier, the convolutional neural networks can learn to extract the relevant features, and the weights of the parameters are constantly adjusted during training. The pooling layer is the realization of the downsampling of the features of the imported images. The features extracted are passed to a fully connected layer that classifies the input images based on the extracted information.

VGG16, a classical CNN architecture composed of thirteen convolutional layers and three fully connected layers, comes from the VGG team whose entries secured first place in the localization track and second place in the classification track of the ILSVRC-2014 competition. This complicated architecture has a deep structure and contains a large number of parameters. It is thus unable to achieve real-time efficiency.

3.1 Proposed CNN structure

Many researchers have tried to achieve higher recognition accuracy by either increasing the depth of the network or by adopting more complex modules. These deep CNN models containing massive network parameters cannot reach speeds over 30 frames per second (fps) in real-time processing even with graphics processing unit (GPU) acceleration. They are thus inapplicable to high-speed, online, THz IC imaging-based defect inspections [24].

To balance the recognition accuracy and computational efficiency, we propose visual geometry group (VGG)-based compact CNN models, which achieve better recognition with significantly fewer parameters. These models perform well in high-speed online defect detection in the THz IC images. Using such a lightweight CNN architecture has several advantages, including more efficient model training, a lower risk of overfitting on small datasets, and easy implementation in embedded systems. As shown in Fig. 5, this CNN variation uses a 64×64 image as its input. It has five convolutional layers, each of which is followed by a maximum pooling (max-pooling) layer. At the tail of the network before the output layer, one or two fully connected layers can exist. The strides of the convolutional and pooling layers are set to unity.

Fig. 5. Proposed CNN structure for 64×64 images.

3.2 Zero-padding method

Zero-padding is an effective technique used to control the size of the feature dimension while preventing dimensional loss. Figure 6 shows the implementation of one-dimensional zero padding method. The number of zeros filled on the left (denoted by PL) and on the right (denoted by PR) can be computed based on the following equations,

(3)$$N = \textrm{ceil}\left(\frac{M}{S}\right)$$

(4)$$PT = (N - 1) \times S + F - M$$

(5)$$PL = \textrm{floor}\left(\frac{PT}{2}\right)$$

(6)$$PR = PT - PL$$

where M is the input size, S is the stride, F is the filter width, N is the output size, and PT is the total number of padded zeros. For example, with M = 5, S = 1, and F = 3, the padding result yields PT = 2, PL = 1, PR = 1, and N = 5. In this study, the strides of the convolutional and pooling layers are set to unity. Therefore, with the zero-padding method, zeros are added automatically around the input of each convolution so that the output size equals the input size [17].
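
Eqs. (3)–(6) translate directly into a few lines of code. The sketch below uses a hypothetical function name, `same_padding`, and adds one assumption that is not part of the equations: the total padding is clamped at zero so that the function never returns a negative value when the filter is narrower than the stride.

```python
import math


def same_padding(m, s, f):
    """Return (N, PL, PR) for input size m, stride s, and filter width f."""
    n = math.ceil(m / s)               # Eq. (3): output size
    pt = max((n - 1) * s + f - m, 0)   # Eq. (4): total padding (clamped, an assumption)
    pl = pt // 2                       # Eq. (5): zeros on the left
    pr = pt - pl                       # Eq. (6): zeros on the right
    return n, pl, pr


# Worked example from the text: M = 5, S = 1, F = 3.
print(same_padding(5, 1, 3))   # (5, 1, 1)

# A 64-pixel input row with a 3-wide filter and unit stride keeps its size.
print(same_padding(64, 1, 3))  # (64, 1, 1)
```

When the total padding is odd, the extra zero goes to the right-hand side, because PL is rounded down in Eq. (5).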

Fig. 6. Zero-padding method used in the training of the CNN model.

3.3 Dropout

Dropout is a key technique for improving the generalization ability of neural networks, and also provides an efficient way of approximately combining exponentially many different neural networks [25]. Dropout refers to randomly dropping units, along with their incoming and outgoing connections, from the neural network during training, as illustrated in Fig. 7. The dropout rate p is a tunable hyperparameter; here it denotes the probability that a neuron is retained each time a batch of samples is provided to the network. Applying dropout to a neural network is equivalent to sampling a “sparse” network that consists of all the units that survived the dropout, as shown in Fig. 7(b).

Fig. 7. Dropout neural network model. (a) A standard neural network with two hidden layers. (b) An example of a thinned net produced by applying dropout to the network on the left.

As a result, training a network with dropout can be considered as training a collection of 2ⁿ sparse networks with extensive weight sharing, where n is the number of units. If a unit is retained with probability p during training, the outgoing weights of that unit are multiplied by p at testing time, as shown in Fig. 8. Srivastava et al. found that training a network with dropout and using this approximate averaging method at test time significantly improves generalization on a wide variety of classification problems compared with other regularization methods [25]. In the simplest case, the retention probability p was set to 0.5 in all fully connected layers. In summary, dropout yields excellent accuracy, generalizes well across various neural networks, and has a wide variety of applications in practice.
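
The train-time and test-time behaviour of dropout can be illustrated with a minimal sketch in plain Python. The function names are hypothetical, and p is treated as the retention probability, as in the text. Scaling a unit's output by p at test time is used here as a stand-in for scaling its outgoing weights by p; the two are equivalent when the next layer is a weighted sum of the unit outputs.

```python
import random


def dropout_train(activations, p, rng):
    """Training: keep each unit with probability p, otherwise output zero."""
    return [a if rng.random() < p else 0.0 for a in activations]


def dropout_test(activations, p):
    """Testing: every unit is present and its output is scaled by p."""
    return [a * p for a in activations]


rng = random.Random(0)
units = [1.0] * 10000  # hypothetical layer with constant activations

kept = dropout_train(units, 0.5, rng)
mean_train = sum(kept) / len(kept)
mean_test = sum(dropout_test(units, 0.5)) / len(units)
print(mean_train, mean_test)
```

The average output seen during training is close to the scaled output used at test time, which is the sense in which the test-time network approximates an average over the sampled sparse networks.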

Fig. 8. (a) During training, a unit is present with probability p and is connected to the next layer with weights w. (b) At testing time, the unit is always present and its weights are multiplied by p.

4. Construction of the optimized neural network

THz imaging captures rich spectral information but differs from optical imaging in its lower resolution. We trained the proposed CNN model for THz defect detection by feeding it the IC dataset. In addition, two factors were tuned to improve the accuracy of defect detection: (1) the CNN structure, and (2) the dropout rate.

In this section, the THz imaging IC dataset, which consisted of 2144 defective IC images and 1184 qualified IC images, was first randomly and equally split into the training and the testing sets, whereby the proposed CNN model based on the VGG network was respectively trained and tested. All experiments were carried out on an Nvidia GeForce GTX 1070 Ti GPU with 8 GB of memory under CUDA version 9.0. Additionally, we adjusted the learning rate to 0.0005, and the values of the batch size and epochs to 32 and 200, respectively. We chose the rectified linear unit (ReLU) as the activation function. A gradient descent method was used for the training of the CNN.
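
A random, equal split of the dataset can be sketched as follows. The function name `split_half` and the fixed seed are hypothetical, and the sketch assumes a plain shuffle without stratification by class, since the text does not state how the split was balanced. Only labels are shuffled here; in practice each label would travel with its image.

```python
import random


def split_half(items, seed=0):
    """Shuffle a copy of items and split it into two equal halves."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Image counts from the text: 1 = defective, 0 = qualified.
labels = [1] * 2144 + [0] * 1184
train_labels, test_labels = split_half(labels)
print(len(train_labels), len(test_labels))  # 1664 1664
```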

4.1 Testing results of CNN structures

The CNN structure in this study consists of five pairs of alternating convolutional and pooling layers followed by one or two fully connected (FC) layers. Table 1 lists the specifications of all the convolutional and pooling layers. FC1 is the first FC layer, and FC2 is the second FC layer. The item Con(3×3×32) means that each convolution filter has a 3×3×32 kernel, where 32 is the number of filters in the previous layer. The padding method is also applied to the CNN model to prevent dimensional loss.
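
To make the Con(3×3×32) notation concrete, the number of trainable parameters in a layer can be counted with a short sketch. The function names are hypothetical, and the filter count of 64 is an illustrative assumption rather than a value taken from Table 1. The FC example uses an FC1 layer of 512 neurons feeding an FC2 layer of 128 neurons, sizes that appear among the models tested in this section.

```python
def conv_params(kernel, in_channels, out_channels, bias=True):
    """Trainable parameters of a 2-D convolutional layer."""
    weights = kernel * kernel * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def fc_params(inputs, outputs, bias=True):
    """Trainable parameters of a fully connected layer."""
    return inputs * outputs + (outputs if bias else 0)


# Con(3x3x32): each filter spans 3x3 pixels across 32 input channels.
# 64 filters is a hypothetical choice for illustration.
print(conv_params(3, 32, 64))  # 18496

# FC1 (512 neurons) feeding FC2 (128 neurons).
print(fc_params(512, 128))     # 65664
```

Counts of this kind show why the fully connected layers, rather than the 3×3 convolutions, tend to dominate the parameter budget of a compact network.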

Table 1. Layer configurations of convolutional neural network (CNN) models

CNN-i-j indicates that there are i neurons in the FC1 layer and j neurons in the FC2 layer. For instance, CNN-256 indicates that 256 neurons are present in FC1 and none exist in the FC2 layer. All CNN models were executed ten times, and the testing results along with their maximum, minimum, mean, and standard deviation values were recorded for the evaluation of the model. The results of the proposed CNN model with one FC layer and those with two FC layers are respectively listed in Tables 2 and 3.
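
The per-model summary over ten runs can be reproduced with the standard library. The accuracies below are hypothetical placeholders, not results from this study, and the sketch assumes the sample standard deviation; the text does not state whether the sample or the population form was used.

```python
import statistics


def summarize(accuracies):
    """Return the maximum, minimum, mean, and sample standard deviation."""
    return {
        "max": max(accuracies),
        "min": min(accuracies),
        "mean": statistics.mean(accuracies),
        "std": statistics.stdev(accuracies),
    }


# Hypothetical testing accuracies (%) from ten runs of one model.
runs = [99.9, 100.0, 99.8, 100.0, 99.9, 100.0, 100.0, 99.7, 100.0, 99.9]
summary = summarize(runs)
print(summary)
```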

Table 2. CNN model outcomes with one fully connected (FC) layer (%)

Table 3. CNN model outcomes with two FC layers (%)

There are five types of CNN models with one FC layer. The best mean accuracy, achieved by CNN-512, is 99.53%, which is slightly better than the 99.50% yielded by CNN-256. CNN-512 also has the lowest standard deviation, 0.0052, compared with 0.0065 for CNN-256. The mean accuracies of these CNN models follow a roughly parabolic trend with the number of FC1 neurons, with CNN-512 at the peak among all the proposed CNN models with one FC layer.

Based on CNN-512, four CNN models with two FC layers were considered in this study. In Table 3, the predicted results with respect to the two FC layers yield obvious increases compared to the results in Table 2. The best mean accuracy (99.96%) and standard deviation (0.0008) are outcomes based on the training by CNN-512-128 that achieves the minimum accuracy of 98.95%.

4.2 Dropout rate

We also trained the dropout neural networks for defect detection on the THz imaging IC dataset based on CNN-512-128. Dropouts were applied by varying the dropout rate to each maximum pooling layer or fully connected layer. It was found that the dropout applied in the last maximum pooling layer yielded optimal performances.

The CNN-512-128 model was then trained without and with dropouts in the last maximum pooling layer when the dropout rate p was 0.8, 0.5 and 0.25, respectively. Figure 9 is the report of the training and testing accuracies for each dropout rate. As shown in Fig. 9(a), the last maximum pooling layer without dropout achieved a much faster convergence than other methods with positive dropout rates. However, the CNN without dropout in the last maximum pooling layer achieved poor generalization outcomes. As shown, the testing accuracy curve deviates from the training accuracy curve. Therefore, we conclude that the best results are those associated with the dropout in the last maximum pooling layer with p = 0.25. Its output at testing time yielded a higher degree of fitting compared to the expected output at the training time. This indicated that the dropout improved the generalization ability and yielded a better efficiency.

Fig. 9. Training and testing accuracies on the THz imaging IC dataset based on CNN-512-128 (a) without dropout and with dropout rates p of (b) 0.8, (c) 0.5, and (d) 0.25 in the last maximum pooling layer.

4.3 Other methods

To evaluate the performance of the proposed CNN model, the VGG8, VGG13, and VGG16 convolutional neural networks were selected for comparison with CNN-512-128, as shown in Table 4. The mean accuracies achieved by these networks were almost the same. However, CNN-512-128 had the lowest mean training loss and the shortest processing time. These results indicate that CNN-512-128 is the most suitable for defect detection on the THz imaging IC dataset.

Table 4. Comparison results of CNN models and other methods

5. Discussion

The modules and number of layers of a neural network have a tremendous impact on the detection performance. In the previous section, we demonstrated that the methods described in this study were able to enhance the performance of the proposed CNN model, and discussed in detail the experimental results. Network structure determination, dropout, batch normalization, and learning rate significantly affected the accuracy of CNN detection. The parameters of the network structure were optimized such that the model not only achieved higher accuracy of detection after training but also converged faster during the training.

As described in this study, the more compact the structure, the more suitable the model is for a small dataset. The reason is that training a complex neural network with a large number of weight parameters carries an increased risk of overfitting. Consequently, for a smaller dataset, it is wise to start with a thinner CNN architecture and change the parameters of the CNN models as needed. Avoiding the use of the batch mechanism when the dropout is applied to the last maximum pooling layer and the two FC layers, we successfully improved the generalization ability of the CNN model. This will lead to the acceleration of the convergence rate. Dropout was applied to the last maximum pooling layer and to the two FC layers with the dropout rate set to 0.25 and 0.5, respectively. As a result, the most effective CNN model was CNN-512-128 with a learning rate of 0.005. It should be noted that continuous-wave THz imaging systems can also be used for defect identification. To obtain optimal performance of a CNN model trained on a single-frequency IC dataset, the number of samples needs to be greatly increased for continuous-wave THz imaging systems.

6. Conclusions

This study proposed a compact CNN model for THz IC imaging defect detection that constitutes an important subject in system integration and quality control. It was shown that the proposed method yielded an excellent performance. The main contributions of this study included the construction of the THz imaging IC defect detection dataset, the proposition of a compact CNN model, and the optimization of the CNN model based on the training of the dataset. The numerous comparisons conducted in this study confirmed that the proposed CNN-based IC defect detection method was quite effective and significant for the application of the THz imaging IC defect detection system.

Nevertheless, the proposed CNN method was limited in view of the following. First, we needed to collect more types of defects for circuit failures. Second, the training process was relatively time-consuming. In light of these limitations, additional in-depth research studies need to be carried out in the future to a) expand the detection search on more extensive defect types, and b) to speed up the training by further studying CNN-based transfer learning.

Funding

Key Technologies Research and Development Program (2018YFB1004004, 2018YFB1702701).

Disclosures

The authors declare no conflicts of interest.

References

1. W. C. J. Tam and R. D. S. Blanton, “LASIC: Layout analysis for systematic IC-defect identification using clustering,” Ieee Transactions On Computer-Aided Design of Integrated Circuits and Systems 34(8), 1278–1290 (2015). [CrossRef]

2. S. Chen and D. Perng, “Automatic optical inspection system for IC molding surface,” J. Intell. Manuf. 27(5), 915–926 (2016). [CrossRef]

3. E. Keenan, R. G. Wright, R. Mulligan, and L. V. Kirkland, “Terahertz and laser imaging for printed circuit board failure detection,” in Proceedings AUTOTESTCON 2004, (IEEE, 2004), 563–569.

4. S. Vora, R. Jiang, S. Vasudevan, and E. Rosenbaum, “Application level investigation of system-level ESD-induced soft failures,” (IEEE, 2016), 1–10.

5. S. Vora, R. Jiang, S. Vasudevan, and E. Rosenbaum, “Application level investigation of system-level ESD-induced soft failures,” (IEEE, 2016), 1–10.

6. S. Park, J. Jang, and H. Kim, “Non-destructive evaluation of the hidden voids in integrated circuit packages using terahertz time-domain spectroscopy,” J. Micromech. Microeng. 25(9), 095007 (2015). [CrossRef]

7. E. Martin, C. Larato, A. Clément, and M. Saint-Paul, “Detection of delaminations in sub-wavelength thick multi-layered packages from the local temporal coherence of ultrasonic signals,” NDT&E Int. 41(4), 280–291 (2008). [CrossRef]

8. K. Ahi, N. Asadizanjani, S. Shahbazmohamadi, M. Tehranipoor, and M. Anwar, “Terahertz characterization of electronic components and comparison of terahertz imaging with x-ray imaging techniques,” in Terahertz Physics, Devices, and Systems IX: Advanced Applications in Industry and Defense, (International Society for Optics and Photonics, 2015), 94830K.

9. P. Dean, O. Mitrofanov, J. Keeley, I. Kundu, L. Li, E. H. Linfield, and A. Giles Davies, “Apertureless near-field terahertz imaging using the self-mixing effect in a quantum cascade laser,” Appl. Phys. Lett. 108(9), 091113 (2016).

10. D. Yee, K. H. Jin, J. S. Yahng, H. Yang, C. Y. Kim, and J. C. Ye, “High-speed terahertz reflection three-dimensional imaging using beam steering,” Opt. Express 23(4), 5027–5034 (2015).

11. Ü. Alkuş, E. S. Ermeydan, A. B. Sahin, I. Cankaya, and H. Altan, “Enhancing the image resolution in a single-pixel sub-THz imaging system based on compressed sensing,” Opt. Eng. 57(04), 1 (2018).

12. Z. Song, S. Yan, Z. Zang, Y. Fu, D. Wei, H. Cui, and P. Lai, “Temporal and spatial variability of water status in plant leaves by terahertz imaging,” IEEE Trans. Terahertz Sci. Technol. 8(5), 520–527 (2018).

13. J. Liu, P. Li, Y. Chen, X. Song, Q. Mao, Y. Wu, F. Qi, B. Zheng, J. He, H. Yang, Q. Wen, and W. Zhang, “Flexible terahertz modulator based on coplanar-gate graphene field-effect transistor structure,” Opt. Lett. 41(4), 816 (2016).

14. Q. Mao, Q. Y. Wen, W. Tian, T. L. Wen, Z. Chen, Q. H. Yang, and H. W. Zhang, “High-speed and broadband terahertz wave modulators based on large-area graphene field-effect transistors,” Opt. Lett. 39(19), 5649–5652 (2014).

15. T. Wen, J. Tong, D. Zhang, Y. Zhu, Q. Wen, Y. Li, H. Zhang, Y. Jing, and Z. Zhong, “Semiconductor terahertz spatial modulators with high modulation depth and resolution for imaging applications,” J. Phys. D: Appl. Phys. 52(25), 255303 (2019).

16. R. Kuroda, M. Yasumoto, N. Sei, H. Toyokawa, H. Ikeura-Sekiguchi, H. Ogawa, M. Koike, and K. Yamada, “Measurement of coherent terahertz radiation for time-domain spectroscopy and imaging,” Radiat. Phys. Chem. 78(12), 1102–1105 (2009).

17. L. Wen, X. Li, L. Gao, and Y. Zhang, “A new convolutional neural network-based data-driven fault diagnosis method,” IEEE Trans. Ind. Electron. 65(7), 5990–5998 (2018).

18. Z. Deng, H. Sun, S. Zhou, J. Zhao, L. Lei, and H. Zou, “Multi-scale object detection in remote sensing imagery with convolutional neural networks,” ISPRS J. Photogramm. Remote Sens. 145, 3–22 (2018).

19. J. Zhang, W. Xing, M. Xing, and G. Sun, “Terahertz image detection with the improved faster region-based convolutional neural network,” Sensors 18(7), 2327 (2018).

20. M. Simon, Y. Gao, T. Darrell, J. Denzler, and E. Rodner, “Generalized orderless pooling performs implicit salient matching,” in Proceedings of the IEEE international conference on computer vision, (2017), 4960–4969.

21. K. He, X. Zhang, S. Ren, and J. Sun, “Identity mappings in deep residual networks,” in European conference on computer vision, (Springer, 2016), 630–645.

22. G. Fu, P. Sun, W. Zhu, J. Yang, Y. Cao, M. Y. Yang, and Y. Cao, “A deep-learning-based approach for fast and robust steel surface defects classification,” Opt. Laser. Eng. 121, 397–405 (2019).

23. K. Ahi, S. Shahbazmohamadi, and N. Asadizanjani, “Quality control and authentication of packaged integrated circuits using enhanced-spatial-resolution terahertz time-domain spectroscopy and imaging,” Opt. Laser. Eng. 104, 274–284 (2018).

24. S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” in Advances in Neural Information Processing Systems, (NIPS, 2015), 91–99.

25. H. Wang, W. Yang, Z. Zhao, T. Luo, J. Wang, and Y. Tang, “Rademacher dropout: An adaptive dropout for deep neural network via optimizing generalization gap,” Neurocomputing 357, 177–187 (2019).

Layer name	CNN Models
L1	Conv(3×3×32)
L2	Conv(3×3×64)
L3	Conv(3×3×128)
L4	Conv(3×3×256)
L5	Conv(3×3×512)
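
As a rough illustration of the layer table above, the sketch below counts the trainable parameters of the five convolutional layers. It assumes that Conv(3×3×N) denotes a 3×3 kernel with N output filters, that each layer has one bias per filter, and that the THz image enters as a single-channel input; none of these details is stated in the table itself, and the helper name `conv_param_counts` is hypothetical.

```python
# Sketch only: parameter count of a stack of 3x3 convolutional layers.
# Assumptions (not given in the table): single-channel input, one bias per filter.

def conv_param_counts(filters, in_channels=1, kernel=3):
    """Return the number of weights plus biases for each conv layer in order."""
    counts = []
    for out_channels in filters:
        weights = kernel * kernel * in_channels * out_channels
        counts.append(weights + out_channels)  # one bias per output filter
        in_channels = out_channels  # the next layer consumes this layer's output
    return counts

# Filter counts for L1 to L5 as listed in the table.
FILTERS = [32, 64, 128, 256, 512]

counts = conv_param_counts(FILTERS)
for name, n in zip(["L1", "L2", "L3", "L4", "L5"], counts):
    print(f"{name}: {n} parameters")
print("Total convolutional parameters:", sum(counts))
```

Under these assumptions the last layer accounts for roughly three quarters of the convolutional parameters, which is typical when the filter count doubles from layer to layer.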

Statistic	CNN-128	CNN-256	CNN-512	CNN-1024	CNN-2560
Maximum accuracy (%)	100	100	100	100	100
Minimum accuracy (%)	97.11	98.31	98.43	97.83	97.66
Mean accuracy (%)	99.41	99.50	99.53	99.33	99.34
Standard deviation	0.0096	0.0065	0.0052	0.0081	0.0078

Statistic	CNN-512	CNN-512-64	CNN-512-128	CNN-512-256	CNN-512-512
Maximum accuracy (%)	100	100	100	100	100
Minimum accuracy (%)	98.43	98.91	98.95	99.69	99.16
Mean accuracy (%)	99.53	99.77	99.96	99.95	99.81
Standard deviation	0.0052	0.0038	0.0008	0.0011	0.0030

Networks	Mean Accuracy (%)	Mean Training Loss	Processing Time (minute)
CNN-512-128	99.96	0.0057	2.07
VGG8	99.96	0.0086	3.44
VGG13	99.93	0.0081	5.07
VGG16	99.96	0.0076	8.09
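
The processing times in the comparison table can also be read as relative cost. The short sketch below, which uses only the values listed in the table, expresses each network's processing time as a multiple of the CNN-512-128 time; the dictionary layout and variable names are illustrative choices.

```python
# Values copied from the comparison table:
# (mean accuracy in %, mean training loss, processing time in minutes).
results = {
    "CNN-512-128": (99.96, 0.0057, 2.07),
    "VGG8": (99.96, 0.0086, 3.44),
    "VGG13": (99.93, 0.0081, 5.07),
    "VGG16": (99.96, 0.0076, 8.09),
}

baseline_time = results["CNN-512-128"][2]

# Processing time of each network relative to CNN-512-128, rounded to 2 decimals.
ratios = {name: round(t / baseline_time, 2) for name, (_, _, t) in results.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x the CNN-512-128 processing time")
```

By this measure VGG16 takes close to four times as long as CNN-512-128 while reporting the same mean accuracy.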

Optics Express