Abstract
Overcoming the collinearity problem in regression by data compression techniques [i.e., principal component regression (PCR) and partial least-squares (PLS)] requires estimation of the number of factors (principal component) to use for the model. The most common approach is to use cross-validation for this purpose. Unfortunately, cross-validation is time consuming to carry out. Accordingly, we have searched for time-saving methods to estimate the number of factors. Two approaches were considered. The first uses the estimated standard error of the model and the second is an approximation to a cross-validation leave-one-out method. Both alternatives have been tested on spectroscopic data. It has been found that, when the number of wavelengths is limited, both methods give results similar to those obtained by full cross-validation both for PCR and PLS. However, when the number of wavelengths is large, the tested methods are reliable only for PCR and not for PLS.
PDF Article
More Like This
Cited By
You do not have subscription access to this journal. Cited by links are available to subscribers only. You may subscribe either as an Optica member, or as an authorized user of your institution.
Contact your librarian or system administrator
or
Login to access Optica Member Subscription