## Abstract

The terahertz (THz) spectra in the range of 0.2–1.6 THz (6.6–52.8 cm^{−1}) of wheat grains with various degrees of deterioration (normal, worm-eaten, moldy, and sprouting wheat grains) were investigated by terahertz time domain spectroscopy. Principal component analysis (PCA) was employed to extract feature data according to the cumulative contribution rates; the top four principal components were selected, and then a support vector machine (SVM) method was applied. Several selection kernels (linear, polynomial, and radial basis functions) were applied to identify the four types of wheat grain. The results showed that the materials were identified with an accuracy of nearly 95%. Furthermore, this approach was compared with others (principal component regression, partial least squares regression, and back-propagation neural networks). The comparisons showed that PCA-SVM outperformed the others and also indicated that the proposed method of THz technology combined with PCA-SVM is efficient and feasible for identifying wheat of different qualities.

© 2014 Optical Society of America

## 1. Introduction

Terahertz (THz) radiation, which falls between the microwave and infrared bands, consists of electromagnetic waves at frequencies from 0.1 to 10 THz (or wave numbers from 3.3 to 333 cm^{−1}). The energy level of THz radiation is about 1–10 meV, which corresponds to the energy of molecular transitions in many biological and chemical materials. Numerous organic molecules exhibit strong absorption and dispersion in the THz range; thus, many materials can be identified and measured [1–4]. THz spectroscopy of materials provides rich information on their physics and chemistry based on the molecular structure, which is not available in other parts of the electromagnetic spectrum (such as X-rays and microwaves), so THz absorption fingerprints can be used for object identification. Furthermore, THz radiation can penetrate many nonpolar materials, such as clothing, cardboard, and suitcases, with modest attenuation, making it well suited for noncontact, nondestructive, non-ionizing measurement. The recent rapid development of THz time domain spectroscopy (TDS) and related THz technologies has demonstrated the advantages of THz radiation and its potential for widespread application in fields such as biology [5, 6], chemistry [7, 8], and agriculture [9].

Wheat grain contains high amounts of carbohydrates (constituting 70% or more of its dry weight), proteins (about 15%, which is a higher protein content than maize and rice), and fat (about 6%) and also provides important vitamins, fiber, and minerals; thus, it serves as a vital and reliable source of food and nutrients for humans [10]. Further, wheat grain is one of the main food crops that exhibit weak antimycotic and insect resistance [11]. Post-harvest losses in cereals have recently amounted to a large quantity per year in China [12]. According to the statistics of the Food and Agriculture Organization, post-harvest losses in China are higher than in other countries (as high as 8%) [13]. Under improper storage, mildewed, moist, and sprouted grain causes losses, and the volume of moldy and sprouted grain can increase quickly [14]. The main cause of post-harvest losses is the lack of timely inspection of the quality of stored grain [15]. Therefore, rapid nondestructive evaluation techniques are of great interest for wheat quality inspection, which is important for maintaining national food security and reducing stored grain loss.

Many studies have reported nondestructive measurement methods that are applicable to stored grain materials. Lee et al. [16] used the random forest algorithm and near-infrared (NIR) spectroscopy to analyze agricultural samples with three geographical origins. Kandala and Sundaram [17] used a parallel-plate capacitance sensor to determine the moisture content of grain and nuts. Eifler et al. [18] used an electronic nose to differentiate between infected and non-infected wheat grains. Wu et al. [19] used machine vision to identify three types of grain samples. Although those technologies have been applied to stored grain detection, their applications are still limited in certain circumstances. Compared with NIR radiation, THz radiation has longer wavelengths and its own characteristics [7], and it can “see” through common packaging and textile materials [1].

THz-TDS has recently been used for qualitative and quantitative analyses [20, 21]. Zhang et al. qualitatively studied the THz spectra of acephate using first principles calculations based on the density functional method and experimental data [22]. Hua and Zhang qualitatively and quantitatively studied the THz spectra of four types of pesticides, food powders, and a mixture, and used the partial least squares (PLS) method to obtain different weight ratios of imidacloprid in a mixture [20]. Ma et al. qualitatively studied thiabendazole and quantitatively analyzed a mixture of thiabendazole and polyethylene using various algorithms (including PLS, iPLS, biPLS, and mwPLS) [4]. Those reports demonstrated qualitative and quantitative analyses of objects using linear regression methods; however, few papers have analyzed the effects of nonlinear behavior.

The aim of this paper is to develop a method of identifying different wheat samples using THz spectra combined with chemometric tools. The THz spectra of wheat grains with various degrees of deterioration were investigated in the range of 0.2–1.6 THz and were found to have no obvious characteristic absorption peaks. Principal component analysis (PCA) was used to extract features from the absorption spectra, and these features were then used as the input to a support vector machine (SVM) model. Three kernel functions (linear, polynomial, and radial basis functions, RBFs) were tested for identifying the four types of wheat grains. In addition, this method was compared with other approaches: principal component regression (PCR), PLS, and back-propagation (BP) neural networks. The comparison results show that PCA-SVM outperforms the other models and is a potential identification tool for the further quality analysis of agricultural products.

## 2. Experimental

#### 2.1 Experimental setup

We employed a standard THz-TDS system to investigate the optical properties of wheat samples of different qualities. A schematic diagram of the experimental setup used in this study is shown in Fig. 1. A mode-locked femtosecond titanium-sapphire laser, which generates laser pulses of 100 fs duration at a central wavelength of around 800 nm with a repetition frequency of 80 MHz, was used for the emission and detection of pulsed THz radiation. The pulsed system was described in recent papers [23, 24].

An investigation of the phase shift and spectral damping of THz pulses revealed the absorption coefficient and refractive index of the samples [23, 25]. The experiment was conducted at room temperature (about 292 K), and the wave path was enclosed in a nitrogen-purged box to reduce the absorption of the THz waves by atmospheric water vapor and enhance the signal-to-noise ratio [24].

#### 2.2 Sample preparation

We attempted to identify four types of wheat grains: worm-eaten, moldy, sprouting, and normal. In a worm-eaten sample, the nutrient component has been partially eaten by worms; a moldy sample has visible *Aspergillus niger* dots; and a sprouting sample exhibits some stage of germination. Some properties of the samples are shown in Table 1; the quality evaluation of wheat was demonstrated in Ref. [10]. To identify the condition of the wheat, we selected samples from the same production area, obtained from the School of Food Science and Technology, Henan University of Technology, Zhengzhou, China. The moisture content of the samples was about 12.5%.

For each type of wheat, 30 samples were used without further processing before milling. The milled wheat grain was sieved through a 200-mesh sieve. The sieved product was then pressed into circular slices about 1.0–1.5 mm thick and 13 mm in diameter under a pressure of 10 MPa with a tablet press, and all the samples were labeled according to their properties as being worm-eaten, moldy, sprouting, or normal.

#### 2.3 Data acquisition

THz-TDS allows us to measure both the phase and amplitude of THz pulses passing through the reference and sample; the THz waveform is acquired and transformed by a fast Fourier transform into the frequency domain [20]. By comparing the sample and reference pulses, the complex refractive index of the sample $N(\omega )$, which indicates the macroscopic optical properties, is calculated as follows:

$$N(\omega )=n(\omega )-ik(\omega )$$

where $\omega $ represents the frequency of the radiation, $n(\omega )$ is the real refractive index, which describes the dispersion, $i$ is the imaginary unit, and $k(\omega )$ is the extinction coefficient of the measured sample. In the transmission setup, ${E}_{s}(\omega )$ and ${E}_{ref}(\omega )$ represent the THz spectra from the sample and reference, respectively. The complex transmission coefficient is defined as follows [20, 26, 27]:

$$T(\omega )=\frac{{E}_{s}(\omega )}{{E}_{ref}(\omega )}$$

## 3. Modeling methods

#### 3.1 Principal component analysis

PCA is a statistical analysis method that represents a data set, with little loss of information, in an orthogonal space of smaller dimensions. It is mainly an eigenvector-based multivariate analysis that extracts the maximum amount of variance in the original data set [28, 29]. The number of principal components (PCs) is no larger than the number of the original variables. PC1 carries the most information in the data and is orthogonal to (i.e., uncorrelated with) the next component, PC2; PC2 in turn carries more information than PC3 and is orthogonal to it, and so on. These PCs can be obtained by the following steps [30].

Step 1 Standardize the original data matrix ${X}_{m\times n}$ ($m$ represents the number of samples, $n$ represents the dimensions of the samples) as follows:

$${X}_{j}^{*}=\frac{{X}_{j}-{\overline{X}}_{j}}{{s}_{j}}\quad (j=1,2,\cdots ,n)$$

where ${X}_{j}$ is the $j$th column (variable) of $X$, and ${\overline{X}}_{j}$ and ${s}_{j}$ are its mean and standard deviation. Then calculate the covariance matrix ${S}_{n\times n}$ of the standardized data ${X}^{*}$.

Step 2 Obtain the eigenvalues (${\lambda}_{i}$) and related eigenvectors (${\mu}_{i}$) of ${S}_{n\times n}$.

Step 3 Compute the PCs ${Z}_{i}={\mu}_{i}\times {X}^{*}$ ($i=1,2,\cdots ,n$) and retain the first $k$ PCs, whose cumulative contribution rate should total at least 80% [31]. The retained PCs (${Z}_{1},{Z}_{2},\cdots ,{Z}_{k}$) then represent a lower-dimensional (*k*-dimensional) data set.
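The three steps can be sketched numerically as follows. This is a minimal illustration with NumPy rather than the authors' implementation; the function name `pca_features` and the `min_cumulative` argument are assumptions introduced here, and the sketch assumes that no column of the data matrix is constant (otherwise the standard deviation in Step 1 is zero).

```python
import numpy as np

def pca_features(X, min_cumulative=0.80):
    """Project X (m samples x n variables) onto its leading principal components.

    Returns the score matrix (m x k) and the contribution rate of each kept PC.
    """
    X = np.asarray(X, dtype=float)
    # Step 1: standardize every column, then form the n x n covariance matrix.
    X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    S = np.cov(X_std, rowvar=False)
    # Step 2: eigen-decomposition. eigh is meant for symmetric matrices and
    # returns eigenvalues in ascending order, so sort them in descending order.
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 3: keep the smallest k whose cumulative contribution rate
    # reaches the requested threshold.
    contribution = eigvals / eigvals.sum()
    cumulative = np.cumsum(contribution)
    k = int(np.searchsorted(cumulative, min_cumulative)) + 1
    k = min(k, X.shape[1])  # guard against round-off when the threshold is 1.0
    scores = X_std @ eigvecs[:, :k]
    return scores, contribution[:k]
```

Calling `pca_features(spectra)` on an `m x n` array of spectra would return the reduced features that a classifier can then consume; raising `min_cumulative` keeps more components.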

#### 3.2 Support vector machine

SVM is a supervised learning method that analyzes data and recognizes patterns. It is used for classification and various types of predictions. The theory of SVM has been described in detail elsewhere [32]. In this paper, a “one-against-one” approach was used for multi-class classification, and a regression model was constructed using a nonlinear mapping function that maps the input data onto a higher-dimensional space, so that the nonlinear problem can be solved as a linear one in that space. The relationship between the independent variable $x$ and dependent variable $y$ has to be estimated in the SVM model [33]; it is given by a deterministic function $f(x)$ as $y=f(x)+{N}^{\prime}$, where $f(x)={w}^{T}\cdot \phi (x)+b$; ${N}^{\prime}$ is the noise, which is defined by the error tolerance ($\epsilon $); $w$ and $b$ are the regression function parameters; and $\phi (\cdot )$ is the nonlinear mapping function associated with the kernel. A functional form for $f(x)$ is obtained by training the SVM model [33] on a set of training data such as

$$\{({x}_{1},{y}_{1}),({x}_{2},{y}_{2}),\cdots ,({x}_{m},{y}_{m})\}$$

In this formulation, ${y}_{k}$ is an experimental value for the given value of the input variable ${x}_{k}$. Then $w$ and $b$ are derived by minimizing the error function:

The Lagrange multiplier difference $({\alpha}_{i}-{\alpha}_{i}^{*})$ can be replaced by ${\overline{\alpha}}_{i}$. If ${\overline{\alpha}}_{i}=0$, the corresponding data lie within the $\epsilon $-insensitive tube; otherwise, if ${\overline{\alpha}}_{i}\ne 0$, the corresponding data, which are called the support vectors, are involved in the final regression function. Therefore, Eq. (11) can be rewritten as
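For orientation, the textbook $\epsilon $-insensitive support vector regression problem can be written in the notation of this section as shown below. This is a sketch of the standard formulation and is not necessarily the exact form or numbering used by the authors; the slack variables ${\xi}_{i}$ and ${\xi}_{i}^{*}$, the sample count $m$, and the index set $SV$ of support vectors are names assumed here. The primal problem is

$$\underset{w,b,\xi ,{\xi}^{*}}{\mathrm{min}}\ \frac{1}{2}{w}^{T}w+C\sum_{i=1}^{m}({\xi}_{i}+{\xi}_{i}^{*})$$

$$\text{subject to}\quad {y}_{i}-{w}^{T}\phi ({x}_{i})-b\le \epsilon +{\xi}_{i},\quad {w}^{T}\phi ({x}_{i})+b-{y}_{i}\le \epsilon +{\xi}_{i}^{*},\quad {\xi}_{i},{\xi}_{i}^{*}\ge 0$$

Solving the dual problem gives the regression function as a kernel expansion over the training data,

$$f(x)=\sum_{i=1}^{m}({\alpha}_{i}-{\alpha}_{i}^{*})\,k({x}_{i},x)+b$$

and, because only the support vectors have nonzero coefficients, this reduces to

$$f(x)=\sum_{i\in SV}{\overline{\alpha}}_{i}\,k({x}_{i},x)+b$$

The penalty parameter $C$ in the primal problem is the same $C$ that is tuned together with $\gamma $ later in this section.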

One method of building the SVM regression model is the adoption of the kernel function $k({x}_{i},{y}_{i})=\phi {({x}_{i})}^{T}\phi ({y}_{i})$ [34]. For comparison, three kernel functions were selected in this work: a linear kernel function, a polynomial kernel function, and an RBF. They are described as follows:

- 1 Linear kernel: $k({x}_{i},{y}_{i})={x}_{i}\cdot {y}_{i}$
- 2 Polynomial kernel: $k({x}_{i},{y}_{i})={({x}_{i}\cdot {y}_{i}+1)}^{d}$
- 3 RBF: $k({x}_{i},{y}_{i})=\mathrm{exp}(-\frac{{\Vert {x}_{i}-{y}_{i}\Vert}^{2}}{{\gamma}^{2}})$
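The three kernels listed above can be written directly in NumPy. The sketch below follows the formulas exactly as given in this section, including the ${\gamma}^{2}$ denominator of the RBF; the default values of `d` and `gamma` and the helper `gram_matrix` are illustrative assumptions, not values or code from the study.

```python
import numpy as np

def linear_kernel(x, y):
    # k(x, y) = x . y
    return float(np.dot(x, y))

def polynomial_kernel(x, y, d=3):
    # k(x, y) = (x . y + 1)^d
    return float((np.dot(x, y) + 1.0) ** d)

def rbf_kernel(x, y, gamma=1.0):
    # k(x, y) = exp(-||x - y||^2 / gamma^2)
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / gamma ** 2))

def gram_matrix(kernel, X, **params):
    """Evaluate a kernel on every pair of rows of X (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    m = X.shape[0]
    K = np.empty((m, m))
    for i in range(m):
        for j in range(m):
            K[i, j] = kernel(X[i], X[j], **params)
    return K
```

Note that some libraries parameterize the RBF as $\mathrm{exp}(-\gamma {\Vert x-y\Vert}^{2})$, so a $\gamma $ value quoted under one convention has to be converted before it is used under the other.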

The adjustable model parameters $C$ and $\gamma $ should be set to their best-fitting values, which can be obtained by trial and error. Parameter values that are too large can cause incorrect classification, whereas values that are too small also cause data prediction to fail. A grid search using fivefold cross-validation was conducted to find the optimal parameters by which the model can achieve the best forecast results [35].
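A grid search with fivefold cross-validation can be sketched with scikit-learn as follows. The data are a synthetic four-class stand-in and the grid values are arbitrary assumptions, so nothing here reproduces the study's settings or results. Two caveats apply: scikit-learn's `gamma` multiplies the squared distance, so it plays the role of $1/{\gamma}^{2}$ in the RBF formula of this section, and `SVC` handles multi-class problems internally with a one-against-one scheme.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for four wheat classes (NOT the paper's data):
# 30 two-dimensional points scattered around each of four class centers.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0], [3.0, 3.0]])
X = np.vstack([c + rng.normal(scale=0.7, size=(30, 2)) for c in centers])
y = np.repeat(np.arange(4), 30)

# Illustrative grid; these values are assumptions, not those of the study.
param_grid = {"C": [0.5, 1.0, 2.0, 4.0], "gamma": [0.05, 0.1, 0.5, 1.0]}

# cv=5 requests fivefold cross-validation for every (C, gamma) pair.
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)           # best (C, gamma) pair on this toy data
print(round(search.best_score_, 3))  # its mean cross-validated accuracy
```

The pair reported in `best_params_` is the one with the highest mean accuracy over the five folds; the fitted `search` object can then be used directly for prediction.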

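The root mean square error (RMSE) [20] was employed to evaluate the accuracy of the developed models. The RMSE is calculated as

$$RMSE=\sqrt{\frac{1}{m}\sum_{i=1}^{m}{({y}_{i}-{\widehat{y}}_{i})}^{2}}$$

where $m$ is the number of samples in the data set, and ${y}_{i}$ and ${\widehat{y}}_{i}$ are the actual value of the *i*th sample in the data set and the predicted value of the *i*th sample in the developed model, respectively.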
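As a small worked illustration, the RMSE between actual and predicted values can be computed as below. The function name `rmse` and the example numbers are chosen here for illustration only.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(sum(squared_errors) / n)

# Two samples with errors of 3 and 4: sqrt((9 + 16) / 2) = sqrt(12.5)
print(rmse([0.0, 0.0], [3.0, 4.0]))
```

A value of zero means that every prediction matches its target exactly, and larger values indicate larger average deviations.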

## 4. Results and discussion

#### 4.1 Spectroscopic analysis

The four types of wheat grains to be identified were normal, worm-eaten, moldy, and sprouting. The THz spectra of 30 samples selected from each type were measured by THz-TDS in the frequency range of 0.2–1.6 THz. Each wheat sample was scanned four times, and the spectral records were averaged to produce a spectrum with 256 points in the transmission mode. The data set of 120 samples was randomly split into a training set (60 samples) and a validation set (60 samples). Figure 2 shows the THz waveforms of the four types in the time domain. After a fast Fourier transform was applied to the time-domain pulses, the absorption spectra of the samples were calculated using Eq. (4); the results are shown in Fig. 3.
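The step from time-domain pulses to an absorption spectrum can be sketched as follows. The sketch uses the widely used thick-sample approximation, $n=1+c\phi /(\omega d)$ and $\alpha =(2/d)\mathrm{ln}\{4n/[\rho {(n+1)}^{2}]\}$, where $\rho $ and $\phi $ are the amplitude and phase of the complex transmission; this is an assumption and not necessarily the exact form of Eq. (4). The function and argument names are hypothetical, and multiple reflections inside the sample are ignored.

```python
import numpy as np

C_LIGHT = 2.998e8  # speed of light in vacuum, m/s

def optical_constants(t, e_ref, e_sam, thickness):
    """Estimate n and the absorption coefficient from two THz waveforms.

    t         : time axis in seconds (uniformly sampled)
    e_ref     : reference waveform (no sample in the beam)
    e_sam     : waveform transmitted through the sample
    thickness : sample thickness in meters
    Returns frequency (Hz), refractive index, absorption coefficient (1/m),
    all without the zero-frequency bin.
    """
    dt = t[1] - t[0]
    freq = np.fft.rfftfreq(len(t), d=dt)[1:]   # drop DC to avoid dividing by 0
    ref_spec = np.fft.rfft(e_ref)[1:]
    sam_spec = np.fft.rfft(e_sam)[1:]
    with np.errstate(divide="ignore", invalid="ignore"):
        transmission = sam_spec / ref_spec     # complex transmission T(omega)
        rho = np.abs(transmission)
        # NumPy's FFT uses exp(-i*omega*t), so a delayed sample pulse has a
        # negative phase; flip the sign to get a positive phase delay.
        phi = -np.unwrap(np.angle(transmission))
        omega = 2.0 * np.pi * freq
        n = 1.0 + C_LIGHT * phi / (omega * thickness)
        alpha = (2.0 / thickness) * np.log(4.0 * n / (rho * (n + 1.0) ** 2))
    return freq, n, alpha
```

Only the band in which the reference spectrum is well above the noise floor (0.2–1.6 THz in this study) should be read from the result; outside it, the ratio and the unwrapped phase are unreliable.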

As shown in Fig. 2, the similarity of the waveforms demonstrates the stable performance of the system. Moreover, the decrease in the pulse amplitude and the time delay indicate the effects of sample absorption and the refractive index difference between the sample and reference. Figure 3 demonstrates that the samples have no obvious absorption peaks in the frequency range examined. Because of the similarity of the samples’ THz spectra, more sophisticated computational analysis methods were employed. As a result, a data set with 256 spectral features and 120 samples was used to construct a classification model for the wheat grain types.

#### 4.2 PCA analysis

To reduce the dimensionality of the relevant features from the spectral data, PCA was performed. According to the PCA results, the contribution rates of the top four PCs extracted from the original THz spectral data are 91.16%, 5.35%, 1.83%, and 0.17%, respectively. The top four PCs thus represent 98.51% of the total contribution to the original data. They cover the maximum information across all the samples and reduce the dimensions of the original data from 256 spectral measurements to four components. Figure 4 shows the two-dimensional score plot of the top two PCs, in which the quality types of the 120 wheat samples can be distinguished. Therefore, PCA can extract the THz features of the wheat samples effectively. The top four PCs were selected as the input to the proposed SVM model for detecting and separating the wheat grain types.
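The cumulative contribution quoted above can be checked with a few lines of arithmetic. The helper `components_needed` is a name introduced here for illustration.

```python
# Contribution rates (%) of the top four PCs, as reported in the text above.
contribution = [91.16, 5.35, 1.83, 0.17]

cumulative = []
running_total = 0.0
for rate in contribution:
    running_total += rate
    cumulative.append(round(running_total, 2))

def components_needed(cumulative_rates, threshold):
    """Smallest number of PCs whose cumulative contribution reaches threshold."""
    for k, value in enumerate(cumulative_rates, start=1):
        if value >= threshold:
            return k
    return None  # threshold not reached by the listed components

print(cumulative)                           # [91.16, 96.51, 98.34, 98.51]
print(components_needed(cumulative, 80.0))  # 1
```

The 80% criterion of Section 3.1 is therefore already met by PC1 alone, and keeping four PCs retains 98.51% of the total contribution.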

#### 4.3 SVM analysis

In this work, classification approaches were introduced for identifying the four wheat grain types (normal, worm-eaten, moldy, and sprouting) using the THz absorption spectra. Building the SVM regression model from the training set using the three kernel functions requires the fitting parameters $C$ and $\gamma $. Fine tuning of those values can clearly improve the predictive ability of the model. The adjustable model parameter $\gamma $ was set to different values while the other parameters were allowed to vary, and several regression models were constructed. The RMSE was evaluated according to the differences among the values predicted by the regression models. Table 2 shows the percentage of wheat types correctly identified when the SVM was applied.

Table 2 shows that the predictions were accurate for some samples of each type; the overall performance for each kernel is also shown. The optimal value of the parameter $\gamma $ in the SVM model is 4 according to Fig. 5, which shows the $\gamma $ values versus the RMSE in 12 constructed SVM models, whereas the optimal $C$ value of 2 was obtained using a grid search algorithm. The performance of the SVM model differs with the kernel function; the prediction accuracies are ranked from high to low as follows: linear, polynomial, RBF.

#### 4.4 PCA-SVM analysis

PCA was used to reduce the dimensions of the measurement data, and an SVM model was constructed to identify the wheat types. The top four PCs obtained by PCA were selected as inputs to the model to predict the wheat types.
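A PCA-SVM chain of this kind can be sketched with scikit-learn as below. The spectra are synthetic (a class-dependent slope plus noise over 256 frequency points) and all kernel settings are library defaults, so this is only an assumed illustration of the workflow and does not reproduce the study's data, parameters, or accuracies. The split is stratified here for stability, which is a choice made for this sketch.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in: 4 classes x 30 "spectra" with 256 points each.
freq = np.linspace(0.2, 1.6, 256)            # THz axis, for illustration
labels = np.repeat(np.arange(4), 30)
slopes = np.array([1.0, 1.3, 1.6, 1.9])      # hypothetical class slopes
spectra = slopes[labels][:, None] * freq[None, :]
spectra += rng.normal(scale=0.05, size=spectra.shape)

# 60 training and 60 validation samples, as in the split described earlier.
X_train, X_test, y_train, y_test = train_test_split(
    spectra, labels, test_size=0.5, stratify=labels, random_state=0)

accuracy = {}
for kernel in ("linear", "poly", "rbf"):
    # Standardize, keep the top four PCs, then classify with an SVM.
    model = make_pipeline(StandardScaler(), PCA(n_components=4),
                          SVC(kernel=kernel))
    model.fit(X_train, y_train)
    accuracy[kernel] = model.score(X_test, y_test)

print(accuracy)  # validation accuracy per kernel on the synthetic data
```

Wrapping the scaler, PCA, and classifier in one pipeline ensures that the PCs are fitted on the training set only and then applied unchanged to the validation set.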

Similar to the SVM analysis in Section 4.3, Fig. 6 shows the parameter $\gamma $ of the 16 models constructed in this work plotted against the RMSE, which was used as the evaluation criterion. The model with a $\gamma $ value of 3.5 yielded the best prediction result. Furthermore, the optimal value of $C$ was identified as 1.6 by using the grid search optimization approach. The results of validation of the PCA-SVM model with the three selected kernel functions are listed in Table 3.

Table 3 shows that the accuracies of the predictions differed, but the method yielded satisfactory classification results for the wheat grain types. The performance of the models with linear and polynomial kernels was superior to that of the model with an RBF. The overall percentage of correct classifications was at least 90% in all cases, and the prediction accuracy was 100% for the normal and sprouting wheat grains, although a few misclassifications occurred for the other types. Overall, the results of PCA-SVM were much better than those of the SVM method.

#### 4.5 Analysis of the model performance

Finally, to verify the performance of the proposed method, PCR, PLS, and BP neural networks were also applied as reference methods. These methods are widely used in spectroscopic studies [36]. Their performance was assessed in terms of the prediction accuracy. Table 4 shows the accuracy of each modeling approach in predicting the validation set.

The results indicate that preprocessing the input data using PCA improved the SVM operation; moreover, the SVM model is especially effective for small-sample problems and can avoid falling into local minima. The prediction accuracy of the PCA-SVM model reached 95%, whereas the other models listed in Table 4 (PCR, PLS, BP) did not offer a higher prediction accuracy for the wheat types. The best-fitting values of parameters $\gamma $ and $C$ for PCA-SVM (with the RBF kernel function) are identified as 1.2 and 3, respectively. Overall, the comparison of the four methods in Table 4 clearly shows that the performance of the PCA-SVM model is much better than that of the other models.

#### 4.6 Discussion

Our results in this study suggest that THz spectra can be used in combination with chemometric methods to identify the quality of wheat grains (i.e., whether they are normal, worm-eaten, moldy, or sprouting). The THz spectra of the samples reflect the overall response of different wheat components; they have no obvious absorption peaks, because the wheat samples are mixtures with different contents of each wheat component and complex structures. However, the spectra of the abnormal wheat grains reflect mainly the information from the low-quality component, such as bud mutation, mildewing, and consumption by worms.

As we mentioned previously, nondestructive measurement methods include NIR spectroscopy, an electronic nose, and machine vision. Ref. [18] presents an electronic nose that differentiates between infected and uninfected grains; although the classification rates are higher than 80%, the electronic nose cannot obtain a better accuracy when dealing with classification of different species and infection levels. Ref. [16] reports a method of discriminating the geographical origin of agricultural products (only imported versus domestic) by using the random forest algorithm and NIR spectroscopy; it also analyzes the discrimination errors for forests built with 500 to 10,000 trees. The accuracy of the geographical origin assignment varies widely among the agricultural products. Ref. [19] uses a color machine vision system to identify grains, and an accuracy of 90% was obtained for the testing data set. However, because the morphological features are highly correlated, the classification accuracy cannot be improved, and image acquisition and processing are more complicated in real-world applications as well.

In our work, PCA was applied to reduce the dimensionality of the THz spectra of the wheat grain samples and extract features from the original data. The PCA yielded four PCs with a total contribution of more than 98%; then SVM was employed to identify the THz spectra of the wheat grain types. To improve the prediction accuracy (which reached 95%), three kernel functions were selected. The optimal parameters of the model were obtained by constructing 16 models. The best prediction accuracy was obtained using the model with parameters of $\gamma $ = 3.5 and $C$ = 1.6. Finally, because of the good wheat type predictions of the SVM model, the prediction accuracy of the PCA-SVM exceeded that of the PCR, PLS, and BP neural network methods.

In comparison with conventional methods of identifying wheat quality, THz spectroscopy is a noninvasive, nondestructive method of analyzing and identifying substances in different fields. The THz spectra, which enable rapid analysis of different samples without complicated preprocessing, have potential for use as a quality identification tool. When the THz absorption spectra of a mixed sample have no clear fingerprint characteristics, the application of chemometric methods to the THz absorption spectra offers effective identification of wheat quality. Furthermore, various wheat samples (such as those from different product years or geographical origins) might exhibit different responses to THz radiation; more accurate performance regarding wheat quality requires further investigation.

## 5. Conclusion

In this research, a classification model was developed to identify four types of wheat grains using their THz spectra in the range of 0.2–1.6 THz. To extract features from the original spectral data, PCA was explored as a preprocessing method for an SVM model. The effects of the PCA on the output results of the SVM model were investigated, and the experimental results showed that the PCA-SVM approach can yield a satisfactory prediction accuracy. In addition, various classification methods (PCR, PLS, and BP neural networks) were used to identify the four types of wheat grains. The comparison results demonstrated that the PCA-SVM method outperforms the others. Therefore, this study has demonstrated the effectiveness of PCA-SVM for identifying different types of wheat grains, and it can be concluded that THz spectroscopy combined with chemometric methods is a potential tool for qualitative analysis and study of agricultural products.

## Acknowledgments

This work was supported by the National High-tech R&D Program of China (863 Program) (Grant No. 2012AA101608) and the National Natural Science Foundation of China (Grant No. 61071197). Finally, the authors are grateful to the reviewers for their helpful comments and constructive suggestions.

## References and links

**1.** I. Amenabar, F. Lopez, and A. Mendikute, “In introductory review to THz non-destructive testing of composite mater,” J Infrared, Millimeter, Terahertz Waves **34**(2), 152–169 (2013).

**2.** E. Castro-Camus, M. Palomar, and A. A. Covarrubias, “Leaf water dynamics of Arabidopsis thaliana monitored in-vivo using terahertz time-domain spectroscopy,” Sci Rep **3**, 2910–2914 (2013).

**3.** B. Ferguson and X. C. Zhang, “Materials for terahertz science and technology,” Nat. Mater. **1**(1), 26–33 (2002).

**4.** Y. H. Ma, Q. Wang, and L. Y. Li, “PLS model investigation of thiabendazole based on THz spectrum,” J. Quant. Spectrosc. Radiat. Transf. **117**, 7–14 (2013).

**5.** P. H. Siegel, “Terahertz technology in biology and medicine,” IEEE Trans Microw Theory **52**(10), 2438–2447 (2004).

**6.** S. Hadjiloucas, L. S. Karatzas, and J. W. Bowen, “Measurements of leaf water content using terahertz radiation,” IEEE Trans Microw Theory **47**(2), 142–149 (1999).

**7.** P. C. Ashworth, E. Pickwell-MacPherson, E. Provenzano, S. E. Pinder, A. D. Purushotham, M. Pepper, and V. P. Wallace, “Terahertz pulsed spectroscopy of freshly excised human breast cancer,” Opt. Express **17**(15), 12444–12454 (2009).

**8.** L. V. Titova, A. K. Ayesheshim, A. Golubov, D. Fogen, R. Rodriguez-Juarez, F. A. Hegmann, and O. Kovalchuk, “Intense THz pulses cause H2AX phosphorylation and activate DNA damage response in human skin tissue,” Biomed. Opt. Express **4**(4), 559–568 (2013).

**9.** A. A. Gowen, C. O’Sullivan, and C. P. O’Donnell, “Terahertz time domain spectroscopy and imaging: Emerging techniques for food process monitoring and quality control,” Trends Food Sci. Technol. **25**(1), 40–46 (2012).

**10.** O. O. Oladunmoye, R. Akinoso, and A. A. Olapade, “Evaluation of some physical-chemical properties of wheat, cassava, maize and cowpea flours for bread making,” J. Food Qual. **33**(6), 693–708 (2010).

**11.** F. Crista, I. Radulov, L. Crista, A. Berbecea, and A. Lato, “Influence of mineral fertilization on the amino acid content and raw protein of wheat grain,” J. Food Agric. Environ. **10**, 47–50 (2012).

**12.** Y. E. Zhang, Q. Q. Chu, and H. G. Wang, “Trends and strategies of food security during process of urbanization in China,” Res Agric Modernization **30**, 270–274 (2009).

**13.** L. Y. Guo, “Reduce grain loss and combat food waste,” China Grain Econ. 17–18 (2013).

**14.** S. Neethirajan, C. Karunakaran, D. S. Jayas, and N. D. G. White, “Detection techniques for stored-product insects in grain,” Food Contr. **18**(2), 157–162 (2007).

**15.** H. L. Zhou, *Research on Intelligent Multifunction Monitoring and Control System Platform For Grain Storage* (Beijing University of Posts and Telecommunications, 2010).

**16.** S. Lee, H. Choi, K. Cha, M. K. Kim, J. S. Kim, C. H. Youn, S. H. Lee, and H. Chung, “Random forest as a non-parametric algorithm for near-infrared (NIR) spectroscopic discrimination for geographical origin of agricultural samples,” Bull. Korean Chem. Soc. **33**(12), 4267–4270 (2012).

**17.** C. V. Kandala and J. Sundaram, “Nondestructive measurement of moisture content using a parallel-plate capacitance sensor for grain and nuts,” IEEE Sens. J. **10**(7), 1282–1287 (2010).

**18.** J. Eifler, E. Martinelli, M. Santonico, R. Capuano, D. Schild, and C. Di Natale, “Differential detection of potentially hazardous Fusarium species in wheat grains by an electronic nose,” PLoS ONE **6**(6), e21026 (2011).

**19.** L. L. Wu, J. Wu, Y. X. Wen, L. R. Xiong, and Y. Zheng, “Classification of single cereal grain kernel using shape parameters based on machine vision,” in *Advanced Designs and Researches for Manufacturing*, Pts. 1–3, P. C. Wang, X. D. Liu, and Y. Q. Han, eds. (Trans Tech Publications Ltd., Stafa-Zurich, 2013), pp. 2179–2182.

**20.** Y. Zhang, X. H. Peng, Y. Chen, J. Chen, A. Curioni, W. Andreoni, S. K. Nayak, and X. C. Zhang, “A first principle study of terahertz (THz) spectra of acephate,” Chem. Phys. Lett. **452**(1-3), 59–66 (2008).

**21.** R. Gente, N. Born, N. Voß, W. Sannemann, M. K. J. Léon, and E. Castro-Camus, “Determination of leaf water content from terahertz time-domain spectroscopic data,” J Infrared, Millimeter, Terahertz Waves **34**(3-4), 316–323 (2013).

**22.** Y. F. Hua and H. J. Zhang, “Qualitative and quantitative detection of pesticides with terahertz time-domain spectroscopy,” IEEE Trans Microw Theory **58**(7), 2064–2070 (2010).

**23.** I. Pupeza, R. Wilk, and M. Koch, “Highly accurate optical material parameter determination with THz time-domain spectroscopy,” Opt. Express **15**(7), 4335–4350 (2007).

**24.** Z. Xiao-li and L. Jiu-sheng, “Diagnostic techniques of talc powder in flour based on the THz spectroscopy,” J. Phys. Conf. Ser. **276**, 012234 (2011).

**25.** M. Scheller, C. Jansen, and M. Koch, “Analyzing sub-100-μm samples with transmission terahertz time domain spectroscopy,” Opt. Commun. **282**(7), 1304–1306 (2009).

**26.** T. D. Dorney, R. G. Baraniuk, and D. M. Mittleman, “Material parameter estimation with terahertz time-domain spectroscopy,” J. Opt. Soc. Am. A **18**(7), 1562–1571 (2001).

**27.** L. Duvillaret, F. Garet, and J. L. Coutaz, “A reliable method for extraction of material parameters in terahertz time-domain spectroscopy,” IEEE J. Sel. Top. Quantum Electron. **2**(3), 739–746 (1996).

**28.** K. Schweizer, P. C. Cattin, R. Brunner, B. Müller, C. Huber, and J. Romkes, “Automatic selection of a representative trial from multiple measurements using Principle Component Analysis,” J. Biomech. **45**(13), 2306–2309 (2012).

**29.** R. Noori, M. S. Sabahi, A. R. Karbassi, A. Baghvand, and H. T. Zadeh, “Multivariate statistical analysis of surface water quality based on correlations and variations in the data set,” Desalination **260**(1-3), 129–136 (2010).

**30.** J. P. S. Parkkinen, J. Hallikainen, and T. Jaaskelainen, “Characteristic spectra of Munsell colors,” J. Opt. Soc. Am. A **6**(2), 318–322 (1989).

**31.** T. L. Liu, Q. Y. Su, Q. Sun, and L. M. Yang, “Recognition of corn seeds based on pattern recognition and near infrared spectroscopy technology,” Guang Pu Xue Yu Guang Pu Fen Xi **32**(5), 1209–1212 (2012).

**32.** M. He, G. L. Yang, and H. Y. Xie, “A hybrid method to recognize 3D object,” Opt. Express **21**(5), 6346–6352 (2013).

**33.** N. Cristianini and J. Shawe-Taylor, *An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods* (Cambridge University Press, Cambridge, 2000).

**34.** C. Cortes and V. Vapnik, “Support-vector networks,” Mach. Learn. **20**(3), 273–297 (1995).

**35.** Y. Maali and A. Al-Jumaily, “Self-advising support vector machine,” Knowl. Base. Syst. **52**, 214–222 (2013).

**36.** E. Marengo, M. Bobba, E. Robotti, and M. Lenti, “Hydroxyl and acid number prediction in polyester resins by near infrared spectroscopy and artificial neural networks,” Anal. Chim. Acta **511**(2), 313–322 (2004).