Optica Publishing Group
Journal of Near Infrared Spectroscopy, Vol. 19, Issue 4, pp. 233-241 (2011)

The Importance of Balanced Data Sets for Partial Least Squares Discriminant Analysis: Classification Problems Using Hyperspectral Imaging Data


Abstract

This study investigates the effect of imbalanced spectral data in the training set when developing partial least squares discriminant analysis (PLS-DA) classification models for use in future predictions. The experimental study was performed using a real hyperspectral short-wavelength infrared image data set collected from bakery products (buns) containing contaminants (flies); similar applications for other insects, paper and plastic were also tested. The contaminants represent a very small proportion of the images relative to the bun. The PLS-DA model aims to detect and classify the contaminants accurately, and this requires a modification of the calibration data set. The paper deals with the problems caused by imbalanced calibration data sets and how to remedy them. In the example, balancing the calibration data from 58,476 bun pixels + 279 fly pixels to 279 bun pixels + 279 fly pixels improved the number of true predictions while using fewer PLS components in the model. True predictions for flies increased from 65% with ten PLS components to > 99% with five to six PLS components. True predictions for bun went from 100% to 99.5% with six PLS components, which is an acceptable reduction. Theoretical explanations are included.
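The balancing step described in the abstract can be illustrated with a short sketch. In the original calibration set the fly class is 279 of 58,755 pixels, or roughly 0.5%. The abstract does not say how the 279 bun pixels were selected, so the random undersampling below is an assumption, and the function name `balance_by_undersampling` and the string labels are hypothetical. The sketch only builds the balanced index set; fitting the PLS-DA model itself is not shown.

```python
import random
from collections import Counter


def balance_by_undersampling(labels, seed=0):
    """Return sorted indices that keep an equal number of samples per class.

    Every class is randomly undersampled (without replacement) to the size
    of the smallest class. Random selection is an assumption; the abstract
    does not state how the retained bun pixels were chosen.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # The smallest class sets the per-class sample count.
    n_min = min(len(idxs) for idxs in by_class.values())

    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, n_min))
    return sorted(kept)


# Pixel counts quoted in the abstract; the labels are placeholders
# standing in for the per-pixel class assignments.
labels = ["bun"] * 58476 + ["fly"] * 279
kept = balance_by_undersampling(labels)
balanced_counts = Counter(labels[i] for i in kept)
print(balanced_counts)
```

With these counts, each class ends up with 279 retained pixels. The indices in `kept` would then be used to select the matching rows of the spectral matrix before calibrating the model.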

© 2011 IM Publications LLP

