
Hyperspectral face recognition using improved inter-channel alignment based on qualitative prediction models

Open Access

Abstract

A fundamental limitation of hyperspectral imaging is the inter-band misalignment caused by subject motion during data acquisition. One way of resolving this problem is to assess the alignment quality of hyperspectral image cubes derived from state-of-the-art alignment methods. In this paper, we present an automatic selection framework for the optimal alignment method to improve the performance of face recognition. Specifically, we develop two qualitative prediction models based on: 1) a principal curvature map for evaluating the similarity index between sequential target bands and a reference band in the hyperspectral image cube, as a full-reference metric; and 2) the cumulative probability of target colors in the HSV color space for evaluating the alignment index of a single sRGB image rendered using all of the bands of the hyperspectral image cube, as a no-reference metric. We verify the efficacy of the proposed metrics on a new large-scale database, demonstrating higher prediction accuracy in determining improved alignment compared to two full-reference and five no-reference image quality metrics. We also validate the ability of the proposed framework to improve hyperspectral face recognition.

© 2016 Optical Society of America

1. Introduction

Current face recognition systems face serious challenges, such as camouflage accessories, illumination variation, facial expression, and aging [1–5]. Hyperspectral imaging can provide new opportunities for face recognition by offering greater discriminative power [4–8]. Two major contributions of hyperspectral imaging to improved face recognition are [9,10]: 1) invariance to illumination conditions, which results from the recovery of objects’ spectral properties, and 2) the ability to detect distinct patterns in human faces that cannot be captured by trichromatic (RGB) or monochromatic (gray-scale) imaging. However, hyperspectral imaging of non-rigid objects introduces new challenges, such as inter-band misalignment caused by subject motion during data acquisition. To the best of the authors’ knowledge, conventional research on hyperspectral face recognition has not been directed toward devising improved alignment to reduce inter-band misalignment artifacts in hyperspectral image cubes. As illustrated in Fig. 1, the proposed work focuses on addressing inter-band misalignment in hyperspectral image cubes to improve the face recognition rate. To illustrate the inter-band misalignment in the hyperspectral image cube, each of the warps estimated by a fixed bounding box-based alignment approach is depicted by a rectangle in each band of (a). As shown in the first row of (b), this input cannot be sufficiently aligned by the fixed bounding box-based approach using one manually selected set of eye coordinates, as is typically done in [5,8,11]. In this alignment scenario, there is no guarantee that inter-band misalignment artifacts are removed from hyperspectral image cubes in the presence of subject motion during data acquisition, even when the aligned hyperspectral image cubes are resized to lower pixel resolutions. In contrast, the robust alignment by sparse and low-rank decomposition approach [12] yields highly accurate alignment, as presented in the second row of (b), thereby resolving the inter-band misalignment problem and improving face recognition performance. We refer to a stack of bands in hyperspectral imaging as a hyperspectral cube, modeled with two spatial dimensions and one spectral dimension. In this context, the accuracy of identification is directly correlated with the quality of alignment [13].

Fig. 1 Overview of the proposed framework (best viewed in color): (a) input, (b) alignment approaches, and (c) assessing improved alignment via two proposed qualitative prediction models.

One challenge specific to hyperspectral imaging systems, inherent in the image acquisition process itself, is that the individual bands of hyperspectral images must be acquired quickly and sequentially to avoid inter-band misalignments. However, for the lower wavelength range, where transmittance is low as shown in Fig. 2(a), a higher camera exposure time is necessary to acquire high-quality bands in the hyperspectral image. This extended data acquisition time, in turn, makes it challenging for subjects to remain motionless.

Fig. 2 (a) Spectral transmittances of a VariSpec VIS liquid crystal tunable filter (LCTF) from 400 nm to 700 nm in 10 nm intervals and (b) the normalized spectral power distributions (SPDs) of synthetic and natural lights (best viewed in color): halogen (H), 40W LED (L), projector with a blue filter (P), mixtures of the studied lights (H+P, L+P, and H+L+P), and the measured D65 illuminant.

In cases where hyperspectral image cubes are acquired with a constant exposure time, as in [8,22], there is less concern about inter-band misalignments because subject motion is insignificant over the shorter data acquisition time. However, without adapting the camera exposure time at each wavelength, the IRIS-M [22] and PolyU-HSFD [8] databases suffer from an essential limitation: low-quality hyperspectral images, particularly in the blue bands of the visible spectrum. This is due primarily to the low transmittance characteristics of an LCTF, as shown in Fig. 2(a), and the relatively low radiant power of synthetic lights in specific wavelength regions, as shown in Fig. 2(b). Figure 2(a) shows that the spectral transmittance of the LCTF decreases with shorter wavelengths, implying that a longer exposure time is a prerequisite for the short-wavelength regions of the spectrum. In addition, a synthetic light source needs relatively more radiant power near the blue bands depicted in Fig. 2(b). As a result, the proposed database (IRIS-HFD-2014) was developed by appropriately setting the camera exposure time according to the low transmittances of the LCTF and the low radiant power of the light source in the corresponding wavelength regions. Nonetheless, longer data acquisition time potentially creates additional inter-band misalignments, which are one of the most important challenges for practical hyperspectral face recognition systems.

In this paper, we present a study of inter-band misalignments through an extensive analysis of hyperspectral image cubes collected by the LCTF in our experiments. We investigate 11 state-of-the-art alignment approaches and present a new framework to automatically select the optimal alignment method among four feasible approaches based on the minimum inter-band misalignment artifacts in hyperspectral image cubes. To evaluate the performance of the proposed selection method, we provide two different metrics for alignment quality assessment.
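To make the selection step concrete, the following minimal sketch scores each candidate alignment and keeps the best. The `aligners` callables and the scalar `quality_metric` are placeholders for the four approaches and the CMS/HUQA metrics developed in later sections, not an implementation from the paper.

```python
def select_best_alignment(cube, aligners, quality_metric):
    """Align one hyperspectral cube with every candidate method and keep
    the result whose predicted alignment quality is highest.

    cube           -- ndarray of shape (H, W, n_bands)
    aligners       -- dict: method name -> callable mapping cube -> aligned cube
                      (placeholders for FBB, EC, RASL, and ORIA)
    quality_metric -- callable mapping an aligned cube to a scalar score,
                      where higher means better alignment (e.g., CMS or HUQA)
    """
    results = {name: align(cube) for name, align in aligners.items()}
    scores = {name: quality_metric(aligned) for name, aligned in results.items()}
    best = max(scores, key=scores.get)
    return best, results[best], scores
```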

A full-reference alignment quality assessment is first proposed based on a principal curvature map [23,24], which is estimated by computing the maximum or minimum eigenvalues of a 2 × 2 Hessian matrix in each band, to evaluate the pixel-wise similarity index [16, 17] between the reference and sequential target bands in the hyperspectral image cubes.

A no-reference alignment quality assessment is also proposed based on the cumulative probability of target colors in the hue, saturation, and value (HSV) color space [25] to evaluate the alignment quality of a single sRGB image [26]. We observed that the color distribution of the misaligned sRGB image is spread more widely over the HSV color space than that of the aligned sRGB image, which is concentrated on the red color of the hue component. This is due to the spectral distortion associated with inter-band misalignments. In other words, since the colors of the sRGB image at a given pixel are fundamentally estimated from a linear combination of all of the measured reflectance spectra, the colors of the misaligned sRGB image are distorted when subject motion during data acquisition shifts some bands out of the stack. Therefore, the distorted colors in sRGB images can serve as a criterion to determine the alignment index for hyperspectral image cubes. For the proposed sRGB color mapping, the reflectance data for the visible range are first converted to the CIE-XYZ space with respect to the CIE 1931 2° standard observer and the CIE illuminant D65 reference white, and the CIE-XYZ space is then transformed to the sRGB color space [26,27].
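As an illustration of this rendering pipeline, the sketch below integrates a reflectance cube against the CIE 1931 2° color matching functions under D65 and applies the standard XYZ-to-sRGB transform. The `cmf` and `d65` arrays are assumed to be pre-sampled at the cube's band centers; the normalization is a simplification of a full colorimetric workflow.

```python
import numpy as np

# Linear XYZ -> sRGB matrix (IEC 61966-2-1, D65 white point).
M_XYZ2RGB = np.array([[ 3.2406, -1.5372, -0.4986],
                      [-0.9689,  1.8758,  0.0415],
                      [ 0.0557, -0.2040,  1.0570]])

def cube_to_srgb(reflectance, cmf, d65):
    """Render a reflectance cube (H, W, n_bands) to an sRGB image in [0, 1].

    cmf -- (n_bands, 3) CIE 1931 2-degree observer sampled at the band centers
    d65 -- (n_bands,) illuminant D65 SPD sampled at the same wavelengths
    """
    weights = cmf * d65[:, None]          # per-band tristimulus weights
    k = 1.0 / weights[:, 1].sum()         # normalize white to Y = 1
    xyz = k * np.tensordot(reflectance, weights, axes=([2], [0]))
    rgb = np.clip(xyz @ M_XYZ2RGB.T, 0.0, 1.0)   # linear sRGB
    # sRGB transfer function (gamma encoding).
    return np.where(rgb <= 0.0031308,
                    12.92 * rgb,
                    1.055 * np.power(rgb, 1.0 / 2.4) - 0.055)
```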

The performance of the proposed method is compared with seven state-of-the-art image quality assessments [14–18, 20, 21] on IRIS-HFD-2014. In the experiments, the proposed automatic selection of the optimal alignment method significantly outperformed conventional methods. In addition, we show that alignment accuracy strongly affects the face recognition rate of a probabilistic linear discriminant analysis (PLDA) approach [28]. Table 1 summarizes key acronyms used in this paper.

Table 1. Summary of important acronyms used in this paper.

The remainder of this paper is organized as follows: in Section II, we briefly introduce IRIS-HFD-2014. In Section III, we investigate face alignment techniques to address inter-band misalignments in hyperspectral image cubes. In Sections IV and V, the two proposed metrics for evaluating the alignment quality of hyperspectral image cubes are presented, respectively. In Section VI, we show experimental results of the proposed metrics, and Section VII concludes the paper.

2. Hyperspectral face database

Practical use of hyperspectral face recognition has been limited by the scarcity of databases in the public domain. Only five hyperspectral face databases are available: CMU [11], IRIS-M [22], PolyU-HSFD [8,29], Stanford [30], and UWA-HSFD [5,31,32]. We refer the interested reader to [33] for more detail. In this regard, the purpose of the proposed database is to meet the emerging demand for a new hyperspectral face database that can serve as a benchmark for comprehensively and statistically evaluating the performance of hyperspectral face recognition algorithms.

Here we provide a brief review of UWA-HSFD [5,31]. UWA-HSFD, developed by the University of Western Australia, consists of 79 subjects in the frontal view taken over 4 sessions. Each hyperspectral image was captured by a VariSpec LCTF integrated with a Photonfocus camera under halogen lamps. Each hyperspectral image cube contains 33 bands covering the visible spectral range from 400 nm to 720 nm in 10 nm steps. Since UWA-HSFD adapted the camera exposure time according to the lower transmittances of the filter and lower illumination intensities in each band, this database contains a relatively lower noise level than PolyU-HSFD. Similar to IRIS-HFD-2014, UWA-HSFD suffers from inter-band misalignments that result from slight head movements during data acquisition [34].

IRIS-HFD-2014 was recently developed over multiple sessions in the IRIS laboratory at the University of Tennessee. IRIS-HFD-2014 contains a total of 490 hyperspectral face cubes obtained from a participant base of 86 males (66%) and 44 females (34%) with diverse ethnic backgrounds and physical appearances. This database is designed to address several challenges of conventional face recognition systems, including variations in time, pose (both frontal and profile views), and structural features (e.g., glasses). Similar to IRIS-M, PolyU-HSFD, and UWA-HSFD, we employed an LCTF to acquire hyperspectral image cubes that cover the visible spectral range from 420 nm to 700 nm in 10 nm steps (29 narrow bands). However, IRIS-HFD-2014 provides more spectral properties of facial tissue in the visible spectrum than the PolyU-HSFD and IRIS-M databases, as shown in Fig. 3, because its hyperspectral image cubes were collected by adapting the camera exposure time to the transmittance of the filter and the illumination intensity in each band. Moreover, as shown in Fig. 4, UWA-HSFD contains blurred band images, especially in the blue bands illustrated in Fig. 4(a), due to low illumination intensities at lower wavelengths of the halogen lamp (see Fig. 2(b)) and overexposure during data acquisition. Compared to UWA-HSFD, IRIS-HFD-2014, built under the developed light source, provides higher-quality band images at low wavelengths, as shown in the third row of Fig. 3.

Fig. 3 Comparison of three databases collected by LCTFs where in each row from top to bottom, sample bands covering the visible range from 420 nm to 690 nm in 30 nm intervals are taken from PolyU-HSFD [8,29], IRIS-M [22], and IRIS-HFD-2014, respectively. IRIS-HFD-2014 will be made publicly available. Note that the two bands at 420 nm and 450 nm do not exist in the IRIS-M database.

Fig. 4 Comparison of UWA-HSFD [5,31] and IRIS-HFD-2014.

The configuration of the proposed hyperspectral imaging system is shown in Fig. 5. IRIS-HFD-2014 was acquired with an X-rite ColorChecker Classic placed to the side of the individuals, as shown in Fig. 5(a), to allow for calibration and analysis of facial color. We used a Lumia 5.1 Reef as the light source, as shown in Fig. 5(b). The VariSpec VIS LCTF was mounted in front of the detector, as shown in Fig. 5(c). Between the LCTF and the detector, we mounted a 25 mm fixed focal length lens, shown in Fig. 5(d), which has a very wide aperture of f/0.95. We used a XIMEA xiQ USB3.0 camera as the detector, as shown in Fig. 5(e).

Fig. 5 Overview of the experimental setup of our hyperspectral imaging system: (a) target, (b) LED light source, (c) LCTF, (d) lens, (e) detector, and (f) system controller.

As illustrated in Fig. 2(a), the spectral transmittance of the LCTF becomes smaller as the wavelength decreases. As a result, it is necessary to properly adapt the camera exposure time to the spectral transmittance at each wavelength. For example, at shorter wavelengths, we set the exposure time longer in order to accumulate more radiant energy in the detector. In addition, considering the low illumination intensities in the data acquisition process, we disabled the fourth channel of the light source, comprising deep red (660 nm to 665 nm) and turquoise (492.5 nm to 495 nm) emitters, to obtain more radiant power in the short-wavelength regions. Since the light source inherently has low radiant power relative to the spectral transmittances of the LCTF from 470 nm to 520 nm and from 670 nm to 700 nm, as shown in Fig. 6(a), which depicts the SPD of the Lumia 5.1 Reef with the fourth channel turned off, we increased the camera exposure time at these wavelengths, as shown in Fig. 6(b). Figure 6(b) illustrates the exposure times used in the data acquisition process. The SPD of the light source was measured with an Ocean Optics Model SD100 spectrometer. Once the exposure time for each wavelength of interest is automatically determined as in [35], we estimate reflectance images using the method proposed in [4].
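The paper determines the exposure time automatically as in [35]; the sketch below only illustrates the inverse relationship described in this section, under the assumption that exposure is scaled by the product of filter transmittance and source power, with a hypothetical base exposure.

```python
import numpy as np

def adapt_exposure(transmittance, source_spd, base_exposure_ms=10.0):
    """Scale the per-wavelength exposure time inversely to the radiant
    energy expected at the detector (filter transmittance x source power),
    so that dim bands (e.g., blue) integrate longer.

    transmittance, source_spd -- (n_bands,) arrays sampled at the band centers
    base_exposure_ms          -- hypothetical exposure for the brightest band
    """
    energy = transmittance * source_spd
    return base_exposure_ms * energy.max() / np.maximum(energy, 1e-6)
```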

Fig. 6 (a) The SPD of Lumia 5.1 Reef and (b) tuned camera exposure time utilized in our data acquisition. Note that since our light source inherently has low radiant power relative to the spectral transmittances of the LCTF from 470 nm to 520 nm and from 670 nm to 700 nm as shown in (a), we increased the camera exposure time at these wavelengths.

In the following sections, we deal only with the subject area in the target presented in Fig. 5(a) by specifying a region of interest (ROI) with a binary mask.

3. Analysis of face alignment methods for inter-band misalignment

In this section, we analyze four categories of alignment techniques in order to account for inter-band misalignments in hyperspectral image cubes. From these, we determine a set of feasible alignment approaches that minimize inter-band misalignment artifacts in hyperspectral image cubes. Figure 7 shows an example of the noticeable artifacts associated with inter-band misalignment when the misaligned bands are mapped to an sRGB image. As shown in Fig. 7, inter-band misalignments result in motion blurring combined with distorted colors caused by spectral distortion.

Fig. 7 Examples of the presence of inter-band misalignment artifacts in the sRGB color space (best viewed in color). As illustrated in this figure, inter-band misalignments result in the problem of motion blurring combined with distorted colors that are caused by spectral distortion. The sRGB image is generated from ID: F009_02 in IRIS-HFD-2014. Note that the hyperspectral image cube used in this figure is different from the sample hyperspectral image cube of IRIS-HFD-2014 in Fig. 3.

We first employ two conventional inter-channel alignment approaches: the fixed bounding box-based (FBB) [5, 8, 11, 41] and eye coordinate-based (EC) approaches. Hyperspectral image cubes are mostly aligned using FBB, which requires the selection of the initial position of an ROI from sequential images. However, the primary purpose of using FBB here is to establish an alignment baseline that characterizes the subject motion during data acquisition. Likewise, although EC is the most straightforward approach to address inter-band misalignments in hyperspectral image cubes, it is challenging to consistently select eye coordinates at the same positions by hand over all of the bands in the hyperspectral image cube. To obtain aligned hyperspectral image cubes with EC, we choose a canonical frame of size 140 × 160 pixels, manually select eye coordinates at the middle of the eyes to crop each band as in [12], and then normalize the distance between the selected eye points to 80 pixels (a sketch of this normalization is given below). For FBB, we only use the eye coordinates at 420 nm found in EC.
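A minimal sketch of the EC cropping step: a similarity transform maps the two clicked eye points onto canonical positions 80 pixels apart inside the 140 × 160 frame. The canonical eye coordinates below are assumed for illustration; the paper specifies only the frame size and the inter-ocular distance.

```python
import numpy as np
import cv2

CANON_SIZE = (140, 160)           # (width, height) of the canonical frame
CANON_LEFT_EYE = (30.0, 60.0)     # assumed positions, 80 px apart as in the
CANON_RIGHT_EYE = (110.0, 60.0)   # text; the exact placement is illustrative

def similarity_from_eyes(left_eye, right_eye):
    """2x3 similarity transform mapping the clicked eye points onto the
    canonical positions, via complex arithmetic (z -> a*z + b)."""
    p1, p2 = complex(*left_eye), complex(*right_eye)
    q1, q2 = complex(*CANON_LEFT_EYE), complex(*CANON_RIGHT_EYE)
    a = (q2 - q1) / (p2 - p1)     # rotation + uniform scale
    b = q1 - a * p1               # translation
    return np.array([[a.real, -a.imag, b.real],
                     [a.imag,  a.real, b.imag]])

def align_band(band, left_eye, right_eye):
    """Warp one band of the cube into the 140 x 160 canonical frame."""
    return cv2.warpAffine(band, similarity_from_eyes(left_eye, right_eye),
                          CANON_SIZE)
```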

Next, inspired by the warp update rule of Lucas-Kanade [42], robust alignment by sparse and low-rank decomposition (RASL) [12] and online robust image alignment (ORIA) [19], both based on iterative convex optimization, are considered as feasible candidates to overcome inter-band misalignments in hyperspectral image cubes. RASL seeks an optimal set of image domain transformations and achieves highly accurate and consistent face alignment on IRIS-HFD-2014. However, since RASL requires manual input of eye coordinates for all of the bands and has high computational cost and memory usage, it becomes impractical as the total number of hyperspectral image cubes increases. ORIA was proposed to address the limitations of RASL while maintaining comparable face alignment accuracy with reduced computational cost and memory usage. ORIA solves a sequence of convex optimization problems to minimize an l1-norm while updating previously well-aligned images. Similar to FBB, ORIA only requires a single set of eye coordinates at 420 nm found in EC.

The landmark-based alignment approaches then present a viable way to reduce the computational complexity of hyperspectral image cube alignment. We investigate five different approaches: tree structure part model (TSPM) [36]; supervised descent method (SDM) [37]; cascaded deformable model (CDM) [38]; discriminative response map fitting (DRMF) [39]; and incremental parallel cascade model (IPCM) [40]. Whereas landmark-based alignment approaches offer fully automatic processes for hyperspectral image cube alignment, their performance depends significantly on the initial position of a facial model determined by a face detection algorithm [38]. We heuristically observed that the performance of the face detection methods used in [37–40] decreases with high variations in intensity at different wavelengths. However, we found that TSPM can accurately detect faces over all of the hyperspectral image cubes in IRIS-HFD-2014. Thus, we utilized TSPM to initialize the facial models of the other landmark-based alignment approaches. Nevertheless, as shown in Figs. 8 and 9, the landmark-based alignment approaches often failed to converge to global or local minima when updating the parameters of the facial models, and failed to localize consistent landmarks over the hyperspectral image cubes. From our observations, we found two drawbacks of the landmark-based alignment approaches on hyperspectral image cubes: 1) they are heavily sensitive to variations in intensity resulting mainly from the lack of photon energy, particularly at short wavelengths near the blue bands of the visible spectrum; and 2) facial models trained on dissimilar facial databases cannot be directly applied to hyperspectral image cubes without retraining. Indeed, annotating the facial models generally requires manually selecting more than 30 landmarks for each band, a task that becomes impractical when applied to a large hyperspectral image cube set such as IRIS-HFD-2014.

Fig. 8 Sample results of five state-of-the-art landmark-based alignment approaches at 420 nm, 500 nm, 600 nm, and 700 nm tested on ID: F048_01 in IRIS-HFD-2014: (a) TSPM [36], (b) SDM [37], (c) CDM [38], (d) DRMF [39], and (e) IPCM [40]. Figures are best viewed in color.

Fig. 9 Examples of inaccurate localization of landmarks: (a) the results of TSPM [36] and (b) the results of IPCM [40] at 600 nm and 700 nm. As illustrated in (a) and (b), whereas both TSPM and IPCM detected all of the facial features in Figs. 8(a) and 8(e), they were afflicted with landmark correspondence problems due to inaccurate localization of the detected points in the hyperspectral face images.

We finally examine two popular image matching approaches, LK [42] and SIFTFlow [43], which are commonly used for image stitching and stereo matching [43]. To solve the challenges related to image alignment, LK uses Gauss-Newton optimization to minimize the sum of squared errors between two images. For SIFTFlow, Liu et al. [43] utilized the scale-invariant feature transform (SIFT) to characterize local gradient information in two input images, where the SIFT descriptors of the two images are matched along flow vectors. To test these image matching approaches, we assign the reference image to be the first band (i.e., 420 nm) of the hyperspectral image cube and iteratively update the target images from 430 nm to 700 nm in increments of 10 nm (see the sketch below). Figure 10 shows sample results of these image matching algorithms, which fail to remove inter-band misalignments in most of the hyperspectral images in IRIS-HFD-2014.
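A sketch of this sequential registration loop follows. OpenCV's ECC registration (`cv2.findTransformECC`) is used purely as a stand-in for the pairwise matchers discussed here (LK, SIFTFlow), which we do not reimplement.

```python
import numpy as np
import cv2

def align_cube_to_reference(cube):
    """Register each target band (430-700 nm) to the 420 nm reference band.

    cube -- (H, W, n_bands) float32 cube with band 0 at 420 nm.
    """
    ref = cube[:, :, 0]
    aligned = cube.copy()
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 100, 1e-6)
    for j in range(1, cube.shape[2]):
        warp = np.eye(2, 3, dtype=np.float32)
        try:
            _, warp = cv2.findTransformECC(ref, cube[:, :, j], warp,
                                           cv2.MOTION_EUCLIDEAN, criteria,
                                           None, 5)
            aligned[:, :, j] = cv2.warpAffine(
                cube[:, :, j], warp, (cube.shape[1], cube.shape[0]),
                flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
        except cv2.error:
            pass  # keep the band unchanged if registration fails to converge
    return aligned
```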

Fig. 10 Examples of failures of LK [42] from 600 nm to 690 nm in 30 nm steps and SIFTFlow [43] at 580 nm, 590 nm, and 600 nm on ID: F048_01 in IRIS-HFD-2014.

Therefore, throughout this paper we adopt two conventional and two iterative convex optimization-based alignment approaches, FBB, EC, RASL, and ORIA, to deal with inter-band misalignments in hyperspectral image cubes. Figure 11 shows the aligned hyperspectral image cubes produced by the four selected alignment approaches and the corresponding sRGB images, where rectangular ROIs are magnified in the last column. To visually compare the aligned hyperspectral image cubes, we connected the left eye at 420 nm to the right eye at 700 nm. As depicted by the red line in Fig. 11(a), the subject tended to move toward the bottom right-hand corner during data acquisition; consequently, this hyperspectral image cube cannot be sufficiently aligned by FBB. EC, RASL, and ORIA achieved clearly better-aligned results than FBB, as shown in Figs. 11(b)–(d). In the enlarged ROIs in the last column of Fig. 11, however, we can still observe color distortion near the right jaw, particularly in the result of EC. Although RASL and ORIA both produce highly accurate alignment results, it is difficult to visually determine which achieves the better alignment. It is thus necessary to evaluate the alignment quality of the hyperspectral image cubes produced by the selected alignment approaches using an objective score.

Fig. 11 Examples of the aligned bands at 420 nm, 500 nm, 600 nm, and 700 nm according to (a) FBB, (b) EC, (c) RASL, and (d) ORIA. The corresponding sRGB images are shown in the fifth column where ROIs marked by rectangles are magnified in the last column (best viewed in color). Note that in this figure, we use the same hyperspectral image cube as in Fig. 7.

4. Full-reference metric based on curvature model similarity

In this section, we present the proposed method for full-reference alignment quality assessment based on a principal curvature map [23,24], derived by computing the maximum or minimum eigenvalues of a 2 × 2 Hessian matrix that describes the local curvature of the image. A major difficulty for full-reference alignment quality assessment of hyperspectral image cubes is the existence of high intensity changes at different wavelengths. To deal with this wavelength-dependent intensity variation, we adopt the basic framework of gradient map-based image quality assessments [16,17]. A major contribution of the proposed approach compared to the two existing gradient map-based approaches is that we use the principal curvature map as the local quality map to evaluate the pixel-wise gradient similarity index between the reference and the sequential target bands in the hyperspectral image cube. The proposed method provides more consistent and sufficient information for describing facial features, as shown in Fig. 12.

Fig. 12 Comparison of the results of (a) gradient similarity model (GSM) [17], (b) gradient magnitude similarity (GMS) [16], and (c) the proposed method based on curvature model similarity (CMS) at 420 nm, 550 nm, and 700 nm on the results of RASL [12] applied to ID: F067_01 in IRIS-HFD-2014 as shown in (d).

To obtain the principal curvature map, we first convolve an input aligned band A at the point p with a variable-scale Gaussian g(p, σ) as a function d(p, σ) = g(p, σ) ⊗ A(p), where ⊗ represents the convolution operation. We then form a 2 × 2 Hessian matrix as in [23]

$$H(p)=\begin{bmatrix} D_{xx}(p) & D_{xy}(p) \\ D_{xy}(p) & D_{yy}(p) \end{bmatrix},$$
where Dxx, Dxy, and Dyy denote the second-order derivatives of the input band at point p. After building H(p), we compute the maximum and minimum eigenvalues, λ1 and λ2, respectively, from H(p). Next, we obtain the principal curvature map (PCMP) of the input band as PCMP = |λ1|. A curvature model (C) can be given by normalizing PCMP as
$$C(p)=\frac{\mathrm{PCM}_P(p)}{\max\left(\mathrm{PCM}_P(p)\right)}.$$
Note that to achieve a highly accurate alignment quality assessment, we only use the maximum eigenvalue λ1 of the Hessian matrix, since we heuristically found that the C(p) built from the minimum eigenvalue λ2 tends to be unpredictable: its inconsistent curvature lines or edges reduce the accuracy of the alignment quality assessment.
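A minimal sketch of the curvature model, assuming the Hessian entries are computed with Gaussian second-derivative filters (the text does not specify its derivative implementation); σ = 1.4 follows the choice made in Section VI.A:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def curvature_model(band, sigma=1.4):
    """Normalized curvature model C(p) of one gray-scale band.

    The Hessian entries are obtained with Gaussian second-derivative filters
    at scale sigma, and the map takes the absolute value of the larger
    eigenvalue at each pixel, normalized to [0, 1].
    """
    band = band.astype(np.float64)
    dxx = gaussian_filter(band, sigma, order=(0, 2))   # d^2/dx^2
    dyy = gaussian_filter(band, sigma, order=(2, 0))   # d^2/dy^2
    dxy = gaussian_filter(band, sigma, order=(1, 1))   # d^2/dxdy
    # Closed-form eigenvalues of the symmetric 2x2 Hessian at each pixel.
    mean = 0.5 * (dxx + dyy)
    root = np.sqrt(0.25 * (dxx - dyy) ** 2 + dxy ** 2)
    pcm = np.abs(mean + root)        # |lambda_1|, the larger eigenvalue
    return pcm / max(pcm.max(), np.finfo(float).eps)
```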

Given n aligned gray-scale bands Ai, i ∈ [1, n], in an input set, we choose the first band (i.e., 420 nm) as a reference A1 and the remaining bands as targets Aj, j ∈ [2, n]. We first compute Cr for A1 once and then iteratively compute Cj for Aj until j = n. The score of the curvature model similarity (CMS) is defined as

$$\mathrm{CMS}(j)=\frac{1}{m}\sum_{p=1}^{m}\frac{2\,C_r(p)\,C_j(p)+\Gamma}{C_r(p)^{2}+C_j(p)^{2}+\Gamma},$$
where Γ denotes a small positive constant that provides numerical stability by keeping the denominator away from zero; Γ = 10⁻⁵ was used in the experiments. m is the total number of pixels in the band. A single overall score for the alignment quality over all of the bands of the input hyperspectral image cube, using average pooling, is computed as
$$\mathrm{CMS}_M=\frac{1}{n-1}\sum_{j=2}^{n}\mathrm{CMS}(j).$$
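Putting the pieces together, the following sketch computes the per-band similarity and the average-pooled overall score, reusing `curvature_model` from the sketch above:

```python
import numpy as np

def cms_score(cube, sigma=1.4, gamma=1e-5):
    """Overall CMS of an aligned cube (H, W, n_bands) with band 0 = 420 nm:
    the per-band similarity against the reference band is average-pooled
    over the remaining n - 1 bands."""
    c_ref = curvature_model(cube[:, :, 0], sigma)
    scores = []
    for j in range(1, cube.shape[2]):
        c_j = curvature_model(cube[:, :, j], sigma)
        sim = (2.0 * c_ref * c_j + gamma) / (c_ref ** 2 + c_j ** 2 + gamma)
        scores.append(sim.mean())    # pixel-wise similarity -> CMS(j)
    return float(np.mean(scores))    # average pooling -> CMS_M
```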
In Table 2, we show the scores of the proposed CMS for the aligned hyperspectral image cubes produced by FBB, EC, RASL, and ORIA, compared with two recent approaches: 1) the gradient similarity model (GSM) [17] and 2) gradient magnitude similarity (GMS) [16]. Figure 13 shows that the alignment quality of RASL is visually better than that of FBB. The scores of CMS in Table 2 agree with this visual assessment, where higher scores indicate better alignment quality of a hyperspectral image cube. However, both GSM and GMS incorrectly indicate that the alignment quality of FBB is better than that of RASL, contrary to our observation. These examples validate that the proposed CMS more accurately predicts the better-quality alignment among the four aligned hyperspectral image cubes.

Table 2. Comparison of the reported scores for the sample sets in Fig. 12.

Fig. 13 Comparison of the results of four selected alignment approaches with the corresponding sRGB images. This hyperspectral image cube is taken from ID: F067_01 in IRIS-HFD-2014.

Despite the effectiveness of the proposed CMS in predicting the alignment quality of hyperspectral image cubes, CMS must iteratively evaluate the similarity scores over all of the bands in the input hyperspectral image cube. To overcome this limitation while maintaining comparable accuracy in assessing the alignment quality, we develop an alternative metric in the following section that evaluates the alignment quality from a single sRGB image.

5. No-reference metric based on cumulative probability of target colors in HSV

In this section, we propose a no-reference alignment quality assessment, based on the cumulative probability of target colors in the HSV color space, for evaluating the alignment quality of a single sRGB image [26]. The proposed no-reference alignment quality assessment was inspired by the presence of distorted colors in the sRGB color space caused by inter-band misalignments, as presented in Fig. 7. More specifically, to analyze the effects of inter-band misalignments in hyperspectral image cubes, both the misaligned and aligned sRGB images, produced by FBB and RASL in Fig. 11, are projected into the HSV color space [25], which is specified by three components: hue (H), saturation (S), and value (V). As shown in Fig. 14, the color distribution of the misaligned image is spread more widely over the HSV color space than that of the aligned image, which appears to be concentrated on the red color of the hue component. Notice that in Figs. 14(c) and 14(d), the dark colors with low V values are produced by the subject’s clothes; such colors do not appear for the misaligned image in Figs. 14(a) and 14(b). These observations indicate that we can predict the alignment quality of a single sRGB image by exploring the distribution of the distorted colors.

Fig. 14 Analyses of the effects of inter-band misalignments in the HSV color space. For the misaligned sRGB image in Fig. 11(a), we can observe that the distribution of the colors is more widely spread over the HSV color space as shown in (a) and (b) in different views where the vertical axis is the V value, the horizontal distance from the axis is the S value, and the angle is the H value. However, the color distribution of the aligned sRGB image in Fig. 11(c) is concentrated near the red color of the hue component as illustrated in (c) and (d). Note that the dark colors with low V values in Fig. 11(c) are caused by the subject’s clothes where the colors do not appear in Fig. 11(a).

To convert the color space from sRGB [26] to HSV [25], we first find the maximum and minimum component values among R, G, and B ∈ [0, 1] as

$$\Gamma_{max}\triangleq\max(R,G,B),\qquad \Gamma_{min}\triangleq\min(R,G,B),$$
where R, G, and B stand for the red, green, and blue components in the sRGB color space, respectively. Next, we compute the saturation S as
$$S=\begin{cases}\dfrac{\Gamma_\delta}{V} & \text{if } V>0,\\[4pt] 0 & \text{otherwise},\end{cases}$$
where the value V = Γmax and Γδ ≜ Γmax − Γmin. Note that if S = 0, then H is undefined. The preliminary hue H̃ ∈ [−1, 5] is then computed as
$$\tilde{H}=\begin{cases}(G-B)/\Gamma_\delta & \text{if } R=\Gamma_{max},\\ (B-R)/\Gamma_\delta+2 & \text{if } G=\Gamma_{max},\\ (R-G)/\Gamma_\delta+4 & \text{if } B=\Gamma_{max}.\end{cases}$$
Next, the hue H ∈ [0, 1] is given by normalizing H̃ as
$$H=\frac{1}{6}\begin{cases}\tilde{H}+6 & \text{if } \tilde{H}<0,\\ \tilde{H} & \text{otherwise}.\end{cases}$$
We then remove undesired hue values with excessively low and high saturation by assigning β values as
$$H'(p)=\begin{cases}H(p) & \text{if } \alpha\le S(p)<1-\alpha,\\ \beta & \text{otherwise},\end{cases}$$
where α = 0.05 in our experiments and p denotes pixel coordinates on an image. We set β = −1, which is a value out of range in H. In Fig. 15, we illustrate the hue component on a color wheel that is divided into six sectors related to three primary colors (red, green, and blue) and three mixed colors (cyan, magenta, and yellow). To assess the alignment quality of an input image, we compute the probability of the distorted colors, ϕ ∈ [2, 6], of the hue component as
$$P_\phi(x)=\Pr\!\left(x \;\middle|\; \frac{2(\phi-1)-1}{12}<x\le\frac{2\phi-1}{12}\right),$$
where we define a random variable x as 0 ≤ x ≤ 1 over the hue values H′ in the interval [−1, 1] (so that pixels assigned β = −1 are excluded). Note that the probability of the red color of the hue component can be given as
$$P_{\phi=1}(x)=\Pr\!\left(x \;\middle|\; \frac{11}{12}<x \ \text{or}\ x\le\frac{1}{12}\right).$$
The single overall score of the proposed no-reference metric using a single sRGB image is computed as
$$\mathrm{AQ}_{score}=1-\sum_{\phi}P_\phi \quad \text{for } \phi\in[2,6],$$
where a higher score indicates better alignment quality. We refer to the proposed no-reference metric as hue-based alignment quality assessment (HUQA).
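The complete HUQA computation can be sketched as follows, implementing the HSV conversion above, the saturation gate with α = 0.05 and β = −1, and the six-sector hue probabilities; numerical guards are an implementation detail added here.

```python
import numpy as np

def huqa_score(srgb, alpha=0.05):
    """HUQA score of a single sRGB image (H, W, 3) with values in [0, 1].

    Converts sRGB to HSV, discards hues whose saturation falls outside
    [alpha, 1 - alpha), and returns 1 minus the total probability of the
    five non-red hue sectors."""
    r, g, b = srgb[..., 0], srgb[..., 1], srgb[..., 2]
    v = srgb.max(axis=-1)
    delta = v - srgb.min(axis=-1)
    s = np.where(v > 0, delta / np.maximum(v, 1e-12), 0.0)

    # Preliminary hue in [-1, 5], then normalized to [0, 1].
    safe = np.maximum(delta, 1e-12)
    h = np.where(r >= v - 1e-12, (g - b) / safe,
        np.where(g >= v - 1e-12, (b - r) / safe + 2.0,
                 (r - g) / safe + 4.0))
    h = np.where(h < 0.0, h + 6.0, h) / 6.0

    # Saturation gate: unreliable hues are excluded (beta = -1 in the text).
    valid = (s >= alpha) & (s < 1.0 - alpha) & (delta > 0)
    hues = h[valid]
    if hues.size == 0:
        return 0.0

    # Sum the probabilities of the non-red sectors phi = 2..6;
    # red occupies (11/12, 1] and [0, 1/12] and is the complement.
    p_distorted = sum(
        np.mean((hues > (2 * (phi - 1) - 1) / 12.0) &
                (hues <= (2 * phi - 1) / 12.0))
        for phi in range(2, 7))
    return float(1.0 - p_distorted)
```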

Fig. 15 Illustration of the hue component on a color wheel divided into six sectors associated with three primary and three pairwise mixed colors ϕ, including red, yellow, green, cyan, blue, and magenta.

In Fig. 16, we show the difference in probability of six representative colors (red, yellow, green, cyan, blue, and magenta) in the hue component between the misaligned and aligned sRGB images, where the probabilities of the distorted colors in the misaligned image of Fig. 11(a) are observably higher than those of the aligned image of Fig. 11(c). These results imply that the proposed method can be used to assess a single overall alignment quality and determine the better-quality alignment among the four aligned image sets produced by FBB, EC, RASL, and ORIA.

Fig. 16 Comparison of the probability of six representative colors (red, yellow, green, cyan, blue, and magenta) between (a) the misaligned and (b) the aligned sRGB images corresponding to Fig. 11(a) and Fig. 11(c), respectively.

6. Experimental results

In this section, we demonstrate the efficacy of the proposed metrics in two cases: 1) rigid objects such as mannequins (Fig. 17) and 2) non-rigid subjects in the frontal hyperspectral image cubes taken from the first session of IRIS-HFD-2014, which consists of 130 hyperspectral image cubes collected from 86 males and 44 females of diverse ethnic backgrounds and physical appearances.

Fig. 17 Examples of rigid objects: (a) mannequin 1 and (b) mannequin 2, where hyperspectral image cubes are aligned by FBB and displayed using sRGB values.

In the following subsection, we explain the choice of a parameter for CMS. We then verify the correctness of the CMS metric on the rigid object sets. Next, we further test CMS on the non-rigid subject sets, which include more challenging inter-band misalignment problems in IRIS-HFD-2014 with large displacements of 2 to 50 pixels. Prediction accuracy is used to evaluate the performance of the proposed metrics. We compare the performance of the proposed CMS against two gradient map-based image quality assessments [16,17]. For HUQA, compared with five state-of-the-art no-reference image quality assessments [14,15,18,20,21] typically used to evaluate the sharpness or blurriness of an input image, we follow the same procedure as used in the verification of the CMS metric. We use the original software provided by the authors of the competing approaches to obtain all of our experimental results. It is worth noting that in our experimental results, higher predicted scores represent better alignment quality. Finally, we verify that improved alignment leads directly to better face recognition accuracy.

6.1. Choice of σ for CMS

We only have one parameter, σ, which defines the Gaussian scale in the CMS metric. To experimentally determine the optimal σ value on IRIS-HFD-2014, we examine how σ values from 1.0 to 2.0 affect the task of predicting improved alignment among the four selected alignment approaches. Note that the errors are computed by counting the number of subject IDs for which the score of FBB is higher than the scores of EC, RASL, or ORIA. As shown in Fig. 18, the highest accuracy of CMS is obtained when σ ≥ 1.4; there is no further impact on the accuracy even for substantially larger scales. Therefore, for efficiency, we set the Gaussian scale for the proposed CMS metric to σ = 1.4 for all remaining experiments throughout this paper.

Fig. 18 The number of errors on IRIS-HFD-2014 as σ increases from 1.0 to 2.0. σ denotes the Gaussian scale and is a parameter for the proposed CMS metric. We observe that CMS appears not to enhance the accuracy of the prediction to determine improved alignment when the σ value is greater than 1.4. The proposed CMS obtains four errors when the scores of FBB are compared with other scores of EC, RASL, and ORIA. However, there is no error if we consider the errors compared only with RASL and ORIA as shown in Section VI.C. Note that the number of errors of GSM and GMS in both cases is unchanged.

6.2. Results of CMS on rigid object sets

On the rigid object sets, we experimentally evaluate the correctness of the proposed CMS compared with two recent full-reference image quality assessments based on the gradient map: GSM [17] and GMS [16]. GSM uses four 5 × 5 kernels predefined with weighting coefficients to compute gradient values in two versions: 1) a block-wise version for image blocks x and y, and 2) a pixel-wise version for the central pixels of image blocks x and y. GMS utilizes Prewitt filters along the horizontal and vertical directions to obtain the gradient map. Note that to consistently compare the performance of GSM, GMS, and the proposed CMS, we set C = 10⁻⁵ in the gradient similarity functions of both GSM and GMS. We refer the interested reader to the respective papers [16,17] for more detail.

We test the proposed metric on the rigid sets to illustrate its correctness on hyperspectral image cubes taken under controlled conditions. Since the position of the mannequins is fixed, FBB serves as the ground truth alignment and should achieve the best results. In addition, there is no difference between the aligned hyperspectral image cubes produced by FBB and EC, because a single eye coordinate set selected on the first band (i.e., 420 nm) applies to all of the bands of the rigid sets. Accordingly, we examine only three alignment approaches: FBB, RASL, and ORIA. Note that although RASL and ORIA yield well-aligned results for rigid objects, the alignment quality of FBB is intuitively better than that of RASL and ORIA, since there is no subject motion in the rigid sets.

In Fig. 19 and Table 3, we show the correctness results of the two full-reference image quality assessments and CMS on the rigid object sets. Compared to GSM and GMS, the proposed CMS metric consistently predicts the scores for evaluating the alignment quality of the reference and target bands in the rigid hyperspectral image cubes created by the three alignment approaches, as shown in Fig. 19. In addition, although all CMS scores are relatively close in range, CMS correctly identifies the improved alignment, namely FBB, in both rigid object sets, as shown in Table 3.

Fig. 19 Results of the correctness of two full-reference image quality assessments and the proposed CMS on rigid object sets. Figures are best viewed in color.

Table 3. Results of the correctness of two full-reference image quality assessments and the proposed CMS on two rigid object sets.

6.3. Results of CMS on IRIS-HFD-2014

We further demonstrate the effectiveness of the proposed CMS metric on the non-rigid subject sets. In this task we examine the prediction consistency and accuracy involved in determining the better-quality alignment among four selected alignment approaches: FBB, EC, RASL, and ORIA.

FBB is employed as the baseline to evaluate the performance of the considered alignment quality assessments. For example, if subject motion occurs during data acquisition, the alignment quality of FBB is typically worse than that of the other alignment approaches. Although FBB cannot compete with RASL and ORIA in most cases, FBB yields better alignment results than EC when subject motion during data acquisition is insignificant, because EC requires continual and repetitive manual input of the eye coordinates. Therefore, under these experimental observations, we also compare the scores of FBB with those of RASL and ORIA to compute the considered errors.

To validate the performance of the proposed CMS metric with respect to prediction accuracy, we compare the number of errors on IRIS-HFD-2014 by analyzing the scores of FBB against those of RASL and ORIA, as shown in Fig. 20. For instance, GSM incorrectly predicts the alignment quality of 9 hyperspectral image cubes in Fig. 20(a), and 15 errors were found for the GMS metric in Fig. 20(b). Compared to the GSM and GMS metrics, the proposed CMS predicts the alignment quality without any error over the 130 hyperspectral image cubes in IRIS-HFD-2014. In addition, we compute the prediction accuracy associated with the errors found in GSM, GMS, and CMS as

$$\mathrm{Accuracy}=1-\frac{\varepsilon}{N},$$

where ε denotes the number of errors and N represents the number of hyperspectral image cubes (N = 130 for IRIS-HFD-2014). We report the prediction accuracy of the two full-reference image quality assessments and the proposed CMS in Table 4, where the prediction accuracy is computed by the equation above in two cases: 1) comparing the predicted scores of FBB with the scores of EC, RASL, and ORIA; and 2) comparing the predicted scores of FBB with the scores of RASL and ORIA only. As shown in Table 4, the proposed CMS achieves highly accurate prediction in determining improved alignment in both cases.
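A sketch of this error-counting protocol, assuming per-cube score dictionaries produced by any of the metrics:

```python
def prediction_errors(all_scores, baseline="FBB", rivals=("RASL", "ORIA")):
    """Count cubes where the FBB baseline outscores RASL or ORIA, and
    convert the count to the accuracy defined above.

    all_scores -- list with one dict per cube, mapping method name -> score
    """
    errors = sum(any(cube[baseline] > cube[r] for r in rivals)
                 for cube in all_scores)
    return errors, 1.0 - errors / len(all_scores)
```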

Fig. 20 Errors of two full-reference image quality assessments and the proposed CMS on IRIS-HFD-2014. Note that the errors are computed by counting the number of subject IDs where the scores of FBB are higher than either the scores of RASL or ORIA. As shown in (c), there is no error in the CMS metric compared to GSM and GMS where the number of errors of (a) GSM and (b) GMS is 9 and 15, respectively.

Table 4. Results of the prediction accuracy of two full-reference image quality assessments and the proposed CMS on IRIS-HFD-2014.

Next, we examine the performance of the selected alignment approaches using the proposed CMS metric. In Fig. 21, we plot all of the predicted scores of the proposed CMS as scatter plots with regression lines for the four selected alignment approaches, where the predicted scores are sorted in ascending order. To investigate trends in the predicted scores computed by the proposed CMS metric, we use the least-squares method [44] to estimate a linear regression. As shown in Table 5, the intercept of the regression line for RASL is higher than those of the other alignment approaches, implying that the alignment quality of RASL is the best. Furthermore, the estimated slope coefficients of RASL and ORIA are lower than those of FBB and EC, indicating that RASL and ORIA consistently yield well-aligned results, since their trend lines show a weaker dependence of the predicted scores on the subject index. Therefore, these experimental results demonstrate that RASL achieves highly accurate alignment with high consistency compared to FBB, EC, and ORIA.
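A minimal sketch of this trend-line fit on the sorted scores:

```python
import numpy as np

def trend_line(scores):
    """Least-squares line through the predicted scores sorted in ascending
    order (subject index vs. score), as reported in Tables 5 and 8."""
    y = np.sort(np.asarray(scores, dtype=float))
    x = np.arange(1, y.size + 1)
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept
```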

Fig. 21 Scatter plots with the regression lines corresponding to the results of the proposed CMS in IRIS-HFD-2014 (best viewed in color).

Table 5. Parameters of the estimated regression lines for FBB, EC, RASL, and ORIA via the proposed CMS.

6.4. Results of HUQA on rigid object sets

We examine the proposed HUQA on the rigid object sets to experimentally verify its correctness. We compare HUQA against five no-reference image quality assessments: the spatial-spectral entropy-based quality index (SSEQ) [21]; the local phase coherence-based sharpness index (LPC-SI) [18]; the blind/referenceless image spatial quality evaluator (BRISQUE) [14]; spectral and spatial sharpness (S3) [20]; and the cumulative probability of blur detection (CPBD) [15]. We refer the interested reader to the respective papers [14,15,18,20,21] for more detail.

In Fig. 22 and Table 6, we present the correctness results of the five no-reference image quality assessments and the proposed HUQA on the rigid object sets. As mentioned in Section VI.B, we investigate the correctness of predicting the better alignment among FBB, RASL, and ORIA. In this task, HUQA and CPBD produce the highest prediction accuracy in selecting the better-quality alignment on both input sets. Although BRISQUE correctly identifies that the alignment quality of FBB is better than that of the other alignment approaches for the second input set, it fails on the first rigid object set. SSEQ, LPC-SI, and S3 are unsuccessful in these experiments, as shown in Table 6. Note that to fix the range of the y-axis for the alignment quality scores in the figures, the scores of SSEQ and CPBD are rescaled by adding 0.3 and 0.5, respectively, in all of the experiments throughout this paper.

Fig. 22 Results of the correctness of five state-of-the-art no-reference image quality assessments and the proposed HUQA on two rigid object sets. Figures are best viewed in color.

Table 6. Results of the correctness of five no-reference image quality assessments and the proposed HUQA on two rigid object sets.

6.5. Results of HUQA on IRIS-HFD-2014

For a comprehensive study of the performance of the proposed HUQA, we examine the prediction consistency and accuracy on the non-rigid subject sets in IRIS-HFD-2014. We note that HUQA only requires a single sRGB image as input, instead of all of the bands in the hyperspectral image cube. Furthermore, whereas full-reference metrics assess the alignment quality of the input hyperspectral image cube by iteratively measuring the similarity between the reference and target bands and average-pooling the results, the proposed HUQA involves neither an iterative process nor a pooling scheme.

In Fig. 23, we validate the prediction accuracy of the five competing image quality assessments and the proposed HUQA by estimating the errors where the scores of FBB are compared with those of RASL and ORIA. As illustrated in Fig. 23, HUQA achieves the highest prediction accuracy with only one error, caused by ID: F009_01, for which the predicted scores corresponding to FBB, EC, RASL, and ORIA are 0.9802, 0.9788, 0.9792, and 0.9793, respectively; even in this error case, the scores are quite close together. Table 7 shows the prediction accuracy of the five no-reference image quality assessments and the proposed HUQA in the same cases as considered in Table 4. In case 1, the numbers of errors of SSEQ, LPC-SI, BRISQUE, S3, and CPBD are 30, 14, 34, 55, and 26, respectively. In case 2, they are 9, 13, 15, 41, and 7, respectively. Compared to the other metrics, HUQA achieves only 4 errors in case 1. Therefore, HUQA yields the highest accuracy in predicting improved alignment in both cases.

Fig. 23 Errors of five no-reference image quality assessments and the proposed HUQA where the scores of FBB are compared with RASL and ORIA.

Table 7. Results of the prediction accuracy of five no-reference image quality assessments and the proposed HUQA on IRIS-HFD-2014: (a) SSEQ, (b) LPC-SI, (c) BRISQUE, (d) S3, (e) CPBD, and (f) HUQA.

In Fig. 24, we show the predicted scores of the proposed HUQA for the conducted alignment approaches. The intercepts and slope coefficients of the regression lines for FBB, EC, RASL, and ORIA are shown in Table 8. The proposed HUQA also indicates that RASL yields improved alignment with higher consistency than the competing alignment approaches, as illustrated in Table 8. This determination can also be verified visually by interpreting the slopes and intercepts of the regression lines in Fig. 24.

Fig. 24 Scatter plots with the regression lines corresponding to the results of the proposed HUQA on non-rigid subject sets in IRIS-HFD-2014 (best viewed in color).

Table 8. Parameters of the estimated regression lines for FBB, EC, RASL, and ORIA via the proposed HUQA.

6.6. Improvements in face recognition on IRIS-HFD-2014

To verify the improvements in face recognition performance, we employ a PLDA approach [28], which models intra-class and inter-class variance as multidimensional Gaussians to seek maximum facial discriminability. As noted by Shafey et al. [45], PLDA has achieved state-of-the-art performance in face and speaker recognition. The interested reader is referred to [28,45] for more detail.

According to the experiments in Sections VI.B to VI.E, RASL consistently and accurately yields improved alignment compared to the other selected alignment approaches. Therefore, we adopt FBB and RASL to validate how alignment accuracy affects the face identification performance of PLDA. In IRIS-HFD-2014, each subject has 1 to 3 hyperspectral image cubes in the frontal view, acquired in different sessions; the database thus contains a total of 179 frontal hyperspectral image cubes of 130 subjects. For the training set, we select 117 cubes of 99 individuals from the first and third sessions. We test with 31 individuals, using their two cubes from the first and second sessions as the gallery and probe sets, respectively. Note that since there is no overlap between the training and testing sets (gallery and probe sets), the probabilistic model generalizes from the training set to new subjects, as in [28]. For computational efficiency, the face cubes are resized to 30 × 30 × 29. Figure 25(a) shows the identification results after 20 iterations of training on hyperspectral image cubes as the identity and noise subspace size increases from 8 to 56 dimensions in steps of 4. The identity subspace captures the directions of variation between different faces, and the noise subspace captures the directions of variability of any individual face resulting from viewing conditions such as illumination and pose. The optimal subspace dimension of PLDA on the aligned hyperspectral image cubes of RASL is 20. PLDA with 20 factors on the aligned hyperspectral image cubes achieves a face recognition accuracy 26 percentage points higher than on the misaligned hyperspectral image cubes, as shown in Fig. 25(a) and Table 9.

Fig. 25 Comparison of the identification rates of PLDA [28] versus the identity and noise subspace size on the misaligned and aligned (a) hyperspectral image cubes and (b) sRGB images derived from FBB and RASL, respectively.

Table 9. Comparison of the identification rates of PLDA based on 20 factors using the misaligned and aligned sets.

We further investigate the performance of PLDA under the same experimental protocol as in Fig. 25(a) on the results of FBB and RASL, using the corresponding sRGB images instead of selecting maximally discriminative bands of the visible spectrum. The sRGB images are resized to 30 × 30 × 3. When the subspace dimension of PLDA is 20, the highest accuracy of PLDA on the sRGB images of RASL is achieved, as presented in Fig. 25(b). Using the aligned sRGB images, PLDA improves the identification accuracy by 20 percentage points compared to the misaligned images. The recognition rates on the hyperspectral image cubes of RASL are in turn 6 percentage points higher than on the aligned sRGB images, as shown in Table 9, although it is unwise to draw a strong conclusion from this difference because of the different numbers of bands used. We believe that a band selection technique seeking maximally discriminative bands of the visible spectrum for face recognition could achieve similar recognition performance while using fewer bands, as in [5].

6.7. Improvements in face recognition on UWA-HSFD

In [34], we found that UWA-HSFD contains fewer inter-band misalignments than IRIS-HFD-2014. We further show the effectiveness of the proposed framework on UWA-HSFD, comparing FBB, a Laplacian of Gaussian-based feature point matching approach (LoG) [34], and ORIA, each of which requires only one manual input set of two eye coordinates on the last band (720 nm) of UWA-HSFD; this single input set is used to crop each band of the hyperspectral image cubes. As shown in Table 10, the proposed metrics correctly predict the better-quality alignments in our experiments, where the predicted scores are sorted in ascending order.

Table 10. Result of averaging alignment errors of the conducted alignment approaches on UWA-HSFD.

To evaluate the face recognition improvements on UWA-HSFD, which contains a total of 142 hyperspectral face cubes of 79 subjects in the frontal view, we randomly select cubes from UWA-HSFD. For the training set, we use 105 cubes of 72 subjects over the 4 sessions. The gallery set in the testing set is constructed by choosing 30 cubes from the training set, and the 30 probe cubes are selected from the remaining cubes over the 4 sessions that were not included in the training stage. For computational efficiency, the face cubes are resized to 40 × 35 × 33. Table 11 shows the identification results after 20 iterations of training on hyperspectral cubes using the optimal subspace dimension of 48 factors for PLDA on UWA-HSFD.

Tables Icon

Table 11. Comparison of the first rank identification rate based on 48 factors of PLDA on UWA-HSFD.

As summarized in Table 11, PLDA based on 48 factors using the aligned hyperspectral cubes produced by LoG and ORIA achieves face recognition accuracies 10% higher than those obtained with the misaligned hyperspectral cubes of the fixed bounding box-based approach (FBB).

7. Conclusion

In this paper, we presented a new framework to determine the improved alignment among four selected alignment approaches, addressing inter-band misalignments in hyperspectral image cubes. Specifically, we developed two different metrics, curvature-based and hue-based alignment quality assessments, to efficiently evaluate the alignment quality of hyperspectral image cubes. Comparisons with seven state-of-the-art image quality assessment metrics on our new database showed that both proposed metrics achieve promising prediction accuracy in determining improved alignment. We also demonstrated the ability of the proposed framework to improve hyperspectral face recognition. The proposed metrics can therefore be used to assess the alignment quality of both current and future alignment algorithms on hyperspectral image sets, and the proposed framework can be utilized to advance practical face recognition systems based on hyperspectral imaging.

Funding

Qatar National Research Fund (NPRP) (4-1165-2-453); Institute for Information & Communications Technology Promotion (IITP) by MSIP (B0101-15-0525); Information Technology Research Center (ITRC) by IITP (IITP-2016-H8501-16-1018).

Acknowledgments

We would like to thank all the participants in IRIS-HFD-2014.

References and links

1. T. Ahonen, A. Hadid, and M. Pietikäinen, “Face description with local binary patterns: Application to face recognition,” IEEE Trans. Pattern Anal. Mach. Intell. 28(12), 2037–2041 (2006). [CrossRef]   [PubMed]  

2. H. Chang, Y. Yao, A. Koschan, B. Abidi, and M. Abidi, “Spectral range selection for face recognition under various illuminations,” in Proceedings of IEEE International Conference on Image Processing (IEEE, 2008), pp. 2756–2759.

3. G. Hua, M. Yang, E. Learned-Miller, Y. Ma, M. Turk, D. Kriegman, and T. Huang, “Introduction to the special section on real-world face recognition,” IEEE Trans. Pattern Anal. Mach. Intell. 33(10), 1921–1924 (2011). [CrossRef]   [PubMed]  

4. Z. Pan, G. Healey, M. Prasad, and B. Tromberg, “Face recognition in hyperspectral images,” IEEE Trans. Pattern Anal. Mach. Intell. 25(12), 1552–1560 (2003). [CrossRef]  

5. M. Uzair, A. Mahmood, and A. Mian, “Hyperspectral face recognition with spatiospectral information fusion and PLS regression,” IEEE Trans. Image Processing 24(3), 1127–1137 (2015). [CrossRef]  

6. M. Uzair, A. Mahmood, F. Shafait, C. Nansen, and A. Mian, “Is spectral reflectance of the face a reliable biometric?” Opt. Express 23(12), 15160–15173 (2015). [CrossRef]   [PubMed]  

7. H. Chang, Y. Yao, A. Koschan, B. Abidi, and M. Abidi, “Improving face recognition via narrowband spectral range selection using Jeffrey divergence,” IEEE Trans. Information Forensics and Security 4(1), 111–122 (2009). [CrossRef]  

8. W. Di, L. Zhang, D. Zhang, and Q. Pan, “Studies on hyperspectral face recognition in visible spectrum with feature band selection,” IEEE Trans. Syst., Man, Cybern. A, Syst. Humans 40(6), 1354–1361 (2010). [CrossRef]  

9. W. Cho, S. Sahyoun, S. Djouadi, A. Koschan, and M. Abidi, “Reduced-order spectral data modeling based on local proper orthogonal decomposition,” J. Opt. Soc. Am. A 32(5), 733–740 (2015). [CrossRef]  

10. A. Robles-Kelly and C. Huynh, Imaging Spectroscopy for Scene Analysis (Springer, 2013). [CrossRef]  

11. L. Denes, P. Metes, and Y. Liu, “Hyperspectral face database,” Tech. Rep. CMU-RI-TR-02-25, Robot. Inst., Carnegie Mellon Univ., Pittsburgh, PA (2002).

12. Y. Peng, A. Ganesh, J. Wright, W. Xu, and Y. Ma, “RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images,” IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2233–2246 (2012). [CrossRef]   [PubMed]  

13. G. Huang, V. Jain, and E. Learned-Miller, “Unsupervised joint alignment of complex images,” in Proceedings of IEEE International Conference on Computer Vision (IEEE, 2007), pp. 1–8.

14. A. Mittal, A. Moorthy, and A. Bovik, “No-reference image quality assessment in the spatial domain,” IEEE Trans. Image Processing 21(12), 4695–4708 (2012). [CrossRef]  

15. N. Narvekar and L. Karam, “A no-reference image blur metric based on the cumulative probability of blur detection (CPBD),” IEEE Trans. Image Processing 20(9), 2678–2683 (2011). [CrossRef]  

16. W. Xue, L. Zhang, X. Mou, and A. Bovik, “Gradient magnitude similarity deviation: A highly efficient perceptual image quality index,” IEEE Trans. Image Processing 23(2), 684–695 (2014). [CrossRef]  

17. A. Liu, W. Lin, and M. Narwaria, “Image quality assessment based on gradient similarity,” IEEE Trans. Image Processing 21(4), 1500–1512 (2012). [CrossRef]  

18. R. Hassen, Z. Wang, and M. Salama, “Image sharpness assessment based on local phase coherence,” IEEE Trans. Image Processing 22(7), 2798–2810 (2013). [CrossRef]  

19. Y. Wu, B. Shen, and H. Ling, “Online robust image alignment via iterative convex optimization,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2012), pp. 1808–1814.

20. C. Vu, T. Phan, and D. Chandler, “S3: A spectral and spatial measure of local perceived sharpness in natural images,” IEEE Trans. Image Processing 21(3), 934–945 (2012). [CrossRef]  

21. L. Liu, B. Liu, H. Huang, and A. Bovik, “No-reference image quality assessment based on spatial and spectral entropies,” Signal Processing: Image Communication 29(8), 856–863 (2014).

22. H. Chang, A. Koschan, M. Abidi, S. Kong, and C. Won, “Multispectral visible and infrared imaging for face recognition,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2008), pp. 1–6.

23. H. Deng, W. Zhang, E. Mortensen, T. Dietterich, and L. Shapiro, “Principal curvature-based region detector for object recognition,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2007), pp. 1–8.

24. C. Steger, “An unbiased detector of curvilinear structures,” IEEE Trans. Pattern Anal. Mach. Intell. 20(2), 113–125 (1998). [CrossRef]  

25. L. Sigal, S. Sclaroff, and V. Athitsos, “Skin color-based video segmentation under time-varying illumination,” IEEE Trans. Pattern Anal. Mach. Intell. 26(7), 862–877 (2004). [CrossRef]  

26. M. Stokes, M. Anderson, S. Chandrasekar, and R. Motta, “Multimedia systems and equipment–colour measurement and management–Part 2–1: Colour management–default RGB colour space–sRGB,” Tech. rep., International Electrotech. Commission, IEC 61966-2-1 (1998).

27. S. Moan and P. Urban, “Image-difference prediction: from color to spectral,” IEEE Trans. Image Processing 23(5), 2058–2068 (2014). [CrossRef]  

28. S. Prince, P. Li, Y. Fu, U. Mohammed, and J. Elder, “Probabilistic models for inference about identity,” IEEE Trans. Pattern Anal. Mach. Intell. 34(1), 144–157 (2012). [CrossRef]  

29. Biometric Research Centre, “The Hong Kong Polytechnic University Hyperspectral Face Database (POLYU-HSFD),” (The Hong Kong Polytechnic University, 2013). http://www4.comp.polyu.edu.hk/~biometrics/hsi/hyper_face.htm.

30. T. Skauli and J. Farrell, “A collection of hyperspectral images for imaging systems research,” Proc. SPIE 8660, 86600C (2013). [CrossRef]  

31. M. Uzair, A. Mahmood, and A. Mian, “Hyperspectral face recognition using 3D-DCT and partial least squares,” in Proceedings of the British Machine Vision Conference (BMVA Press, 2013), pp. 1–57.

32. A. Mian, “UWA Hyperspectral Face Database,” http://staffhome.ecm.uwa.edu.au/~00053650/databases.html.

33. W. Cho, A. Koschan, and M. A. Abidi, Face Recognition Across the Imaging Spectrum (Springer, 2016), Chap. Hyperspectral Face Databases for Facial Recognition Research, pp. 47–68. [CrossRef]  

34. W. Cho, “Hyperspectral data acquisition and its application for face recognition,” Ph.D. thesis, University of Tennessee (2015).

35. D. Foster, K. Amano, S. Nascimento, and M. Foster, “Frequency of metamerism in natural scenes,” J. Opt. Soc. Am. A 23(10), 2359–2372 (2006). [CrossRef]  

36. X. Zhu and D. Ramanan, “Face detection, pose estimation, and landmark localization in the wild,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2012), pp. 2879–2886.

37. X. Xiong and F. De la Torre, “Supervised descent method and its applications to face alignment,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2013), pp. 532–539.

38. X. Yu, J. Huang, S. Zhang, W. Yan, and D. Metaxas, “Pose-free facial landmark fitting via optimized part mixtures and cascaded deformable shape model,” in Proceedings of IEEE International Conference on Computer Vision (IEEE, 2013), pp. 1944–1951.

39. A. Asthana, S. Zafeiriou, S. Cheng, and M. Pantic, “Robust discriminative response map fitting with constrained local models,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2013), pp. 3444–3451.

40. A. Asthana, S. Zafeiriou, S. Cheng, and M. Pantic, “Incremental face alignment in the wild,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2014), pp. 1859–1866.

41. R. Szeliski, Computer Vision: Algorithms and Applications (Springer, 2010).

42. S. Baker and I. Matthews, “Lucas-Kanade 20 years on: A unifying framework,” Int. J. Comput. Vision 56(3), 221–255 (2004). [CrossRef]  

43. C. Liu, J. Yuen, and A. Torralba, “SIFT Flow: Dense correspondence across scenes and its applications,” IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 978–994 (2011). [CrossRef]  

44. W. Press, B. Flannery, S. Teukolsky, and W. Vetterling, Numerical Recipes in C: The Art of Scientific Computing, 2nd Edition (Cambridge Univ. Press, 1992).

45. L. Shafey, C. McCool, R. Wallace, and S. Marcel, “A scalable formulation of probabilistic linear discriminant analysis: Applied to face recognition,” IEEE Trans. Pattern Anal. Mach. Intell. 35(7), 1788–1794 (2013). [CrossRef]   [PubMed]  

Figures (25)

Fig. 1 Overview of the proposed framework (best viewed in color): (a) input, (b) alignment approaches, and (c) assessing improved alignment via two proposed qualitative prediction models.
Fig. 2 (a) Spectral transmittances of a VariSpec VIS liquid crystal tunable filter (LCTF) from 400 nm to 700 nm in 10 nm intervals and (b) the normalized spectral power distributions (SPDs) of synthetic and natural lights (best viewed in color): halogen (H), 40 W LED (L), projector with a blue filter (P), mixtures of the studied lights (H+P, L+P, and H+L+P), and the measured D65 illuminant.
Fig. 3 Comparison of three databases collected by LCTFs where in each row from top to bottom, sample bands covering the visible range from 420 nm to 690 nm in 30 nm intervals are taken from PolyU-HSFD [8,29], IRIS-M [22], and IRIS-HFD-2014, respectively. IRIS-HFD-2014 will be made publicly available. Note that the two bands at 420 nm and 450 nm do not exist in the IRIS-M database.
Fig. 4 Comparison of UWA-HSFD [43,45] and IRIS-HFD-2014.
Fig. 5 Overview of the experimental setup of our hyperspectral imaging system: (a) target, (b) LED light source, (c) LCTF, (d) lens, (e) detector, and (f) system controller.
Fig. 6 (a) The SPD of Lumia 5.1 Reef and (b) the tuned camera exposure times utilized in our data acquisition. Note that since our light source inherently has low radiant power relative to the spectral transmittances of the LCTF from 470 nm to 520 nm and from 670 nm to 700 nm as shown in (a), we increased the camera exposure time at these wavelengths.
Fig. 7 Examples of the presence of inter-band misalignment artifacts in the sRGB color space (best viewed in color). As illustrated in this figure, inter-band misalignments result in the problem of motion blurring combined with distorted colors that are caused by spectral distortion. The sRGB image is generated from ID: F009_02 in IRIS-HFD-2014. Note that the hyperspectral image cube used in this figure is different from the sample hyperspectral image cube of IRIS-HFD-2014 in Fig. 3.
Fig. 8 Sample results of five state-of-the-art landmark-based alignment approaches at 420 nm, 500 nm, 600 nm, and 700 nm tested on ID: F048_01 in IRIS-HFD-2014: (a) TSPM [36], (b) SDM [37], (c) CDM [38], (d) DRMF [39], and (e) IPCM [40]. Figures are best viewed in color.
Fig. 9 Examples of inaccurate localization of landmarks: (a) the results of TSPM [36] and (b) the results of IPCM [40] at 600 nm and 700 nm. As illustrated in (a) and (b), whereas both TSPM and IPCM detected all of the facial features in Figs. 8(a) and 8(e), they were afflicted with the correspondence problems of landmarks due to inaccurate localizations of the detected points in HFIs.
Fig. 10 Examples of failures of LK [42] from 600 nm to 690 nm in 30 nm steps and SIFTFlow [43] at 580 nm, 590 nm, and 600 nm on ID: F048_01 in IRIS-HFD-2014.
Fig. 11 Examples of the aligned bands at 420 nm, 500 nm, 600 nm, and 700 nm according to (a) FBB, (b) EC, (c) RASL, and (d) ORIA. The corresponding sRGB images are shown in the fifth column where ROIs marked by rectangles are magnified in the last column (best viewed in color). Note that in this figure, we use the same hyperspectral image cube as in Fig. 7.
Fig. 12 Comparison of the results of (a) the gradient similarity model (GSM) [17], (b) gradient magnitude similarity (GMS) [16], and (c) the proposed method based on curvature model similarity (CMS) at 420 nm, 550 nm, and 700 nm on the results of RASL [12] applied to ID: F067_01 in IRIS-HFD-2014 as shown in (d).
Fig. 13 Comparison of the results of four selected alignment approaches with the corresponding sRGB images. This hyperspectral image cube is taken from ID: F067_01 in IRIS-HFD-2014.
Fig. 14 Analyses of the effects of inter-band misalignments in the HSV color space. For the misaligned sRGB image in Fig. 11(a), we can observe that the distribution of the colors is more widely spread over the HSV color space as shown in (a) and (b) in different views where the vertical axis is the V value, the horizontal distance from the axis is the S value, and the angle is the H value. However, the color distribution of the aligned sRGB image in Fig. 11(c) is concentrated near the red color of the hue component as illustrated in (c) and (d). Note that the dark colors with low V values in Fig. 11(c) are caused by the subject’s clothes where the colors do not appear in Fig. 11(a).
Fig. 15 Illustration of the hue component on a color wheel divided into six sectors associated with three primary and three pairwise mixed colors ϕ, including red, yellow, green, cyan, blue, and magenta.
Fig. 16 Comparison of the probability of six representative colors (red, yellow, green, cyan, blue, and magenta) between (a) the misaligned and (b) the aligned sRGB images corresponding to Fig. 11(a) and Fig. 11(c), respectively.
Fig. 17 Examples of rigid objects: (a) mannequin 1 and (b) mannequin 2, where hyperspectral image cubes are aligned by FBB and displayed using sRGB values.
Fig. 18 The number of errors on IRIS-HFD-2014 as σ increases from 1.0 to 2.0. σ denotes the Gaussian scale and is a parameter of the proposed CMS metric. We observe that CMS appears not to enhance the prediction accuracy in determining improved alignment when the σ value is greater than 1.4. The proposed CMS yields four errors when the scores of FBB are compared with the scores of EC, RASL, and ORIA. However, there is no error if we consider only the errors compared with RASL and ORIA as shown in Section 6.3. Note that the number of errors of GSM and GMS in both cases is unchanged.
Fig. 19 Results of the correctness of two full-reference image quality assessments and the proposed CMS on rigid object sets. Figures are best viewed in color.
Fig. 20 Errors of two full-reference image quality assessments and the proposed CMS on IRIS-HFD-2014. Note that the errors are computed by counting the number of subject IDs where the scores of FBB are higher than either the scores of RASL or ORIA. As shown in (c), there is no error for the CMS metric, whereas the number of errors of (a) GSM and (b) GMS is 9 and 15, respectively.
Fig. 21 Scatter plots with the regression lines corresponding to the results of the proposed CMS on IRIS-HFD-2014 (best viewed in color).
Fig. 22 Results of the correctness of five state-of-the-art no-reference image quality assessments and the proposed HUQA on two rigid object sets. Figures are best viewed in color.
Fig. 23 Errors of five no-reference image quality assessments and the proposed HUQA where the scores of FBB are compared with RASL and ORIA.
Fig. 24 Scatter plots with the regression lines corresponding to the results of the proposed HUQA on non-rigid subject sets in IRIS-HFD-2014 (best viewed in color).
Fig. 25 Comparison of the identification rates of PLDA [28] versus the identity and noise subspace size on the misaligned and aligned (a) hyperspectral image cubes and (b) sRGB images derived from FBB and RASL, respectively.

Tables (11)

Table 1. Summary of important acronyms used in this paper.
Table 2. Comparison of the reported scores for the sample sets in Fig. 12.
Table 3. Results of the correctness of two full-reference image quality assessments and the proposed CMS on two rigid object sets.
Table 4. Results of the prediction accuracy of two full-reference image quality assessments and the proposed CMS on IRIS-HFD-2014.
Table 5. Parameters of the estimated regression lines for FBB, EC, RASL, and ORIA via the proposed CMS.
Table 6. Results of the correctness of five no-reference image quality assessments and the proposed HUQA on two rigid object sets.
Table 7. Results of the prediction accuracy of five no-reference image quality assessments and the proposed HUQA on IRIS-HFD-2014: (a) SSEQ, (b) LPC-SI, (c) BRISQUE, (d) S3, (e) CPBD, and (f) HUQA.
Table 8. Parameters of the estimated regression lines for FBB, EC, RASL, and ORIA via the proposed HUQA.
Table 9. Comparison of the identification rates of PLDA based on 20 factors using the misaligned and aligned sets.
Table 10. Result of averaging alignment errors of the conducted alignment approaches on UWA-HSFD.
Table 11. Comparison of the first-rank identification rate based on 48 factors of PLDA on UWA-HSFD.

Equations (13)

H(\mathbf{p}) = \begin{bmatrix} D_{xx}(\mathbf{p}) & D_{xy}(\mathbf{p}) \\ D_{xy}(\mathbf{p}) & D_{yy}(\mathbf{p}) \end{bmatrix},
C(\mathbf{p}) = \frac{\mathrm{PCMP}(\mathbf{p})}{\max(\mathrm{PCMP}(\mathbf{p}))}.
\mathrm{CMS}(j) = \frac{1}{m} \sum_{p=1}^{m} \frac{2\,C_r(p)\,C_j(p) + \Gamma}{C_r(p)^2 + C_j(p)^2 + \Gamma},
\mathrm{CMSM} = \frac{1}{n-1} \sum_{j} \mathrm{CMS}(j).
\Gamma_{\max} \triangleq \max(R, G, B), \qquad \Gamma_{\min} \triangleq \min(R, G, B),
S = \begin{cases} \Gamma_{\delta}/V & \text{if } V > 0 \\ 0 & \text{otherwise}, \end{cases}
\tilde{H} = \begin{cases} (G - B)/\Gamma_{\delta} & \text{if } R = \Gamma_{\max} \\ (B - R)/\Gamma_{\delta} + 2 & \text{if } G = \Gamma_{\max} \\ (R - G)/\Gamma_{\delta} + 4 & \text{if } B = \Gamma_{\max}. \end{cases}
H = \frac{1}{6} \begin{cases} \tilde{H} + 6 & \text{if } \tilde{H} < 0 \\ \tilde{H} & \text{otherwise}. \end{cases}
H(\mathbf{p}) = \begin{cases} H(\mathbf{p}) & \text{if } \alpha \le S(\mathbf{p}) < 1 - \alpha \\ \beta & \text{otherwise}, \end{cases}
P_{\phi}(x) = \Pr\!\left( x \,\middle|\, \frac{2(\phi - 1) - 1}{12} < x \le \frac{2\phi - 1}{12} \right),
P_{\phi=1}(x) = \Pr\!\left( x \,\middle|\, \frac{11}{12} < x \ \text{or}\ x \le \frac{1}{12} \right).
\mathrm{AQ}_{\mathrm{score}} = 1 - \sum_{\phi} P_{\phi} \quad \text{for } \phi \in [2, 6],
\mathrm{Accuracy} = 1 - \frac{\varepsilon}{N}.
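
To make the reconstructed definitions concrete, here is a minimal Python sketch of both metrics under stated assumptions: the PCMP is taken as the larger eigenvalue of the Gaussian-smoothed Hessian H(p), σ = 1.4 follows the observation in Fig. 18, the stabilizing constant Γ and the thresholds α and β are placeholder values, and matplotlib's rgb_to_hsv stands in for the HSV conversion. It sketches the technique rather than reproducing our exact implementation.

```python
# A hedged sketch of the two proposed metrics. Assumptions: the PCMP is the
# larger eigenvalue of the Gaussian-smoothed Hessian H(p); sigma = 1.4
# follows Fig. 18; gamma, alpha, and beta are placeholder values; and
# matplotlib's rgb_to_hsv stands in for the HSV conversion above.
import numpy as np
import matplotlib.colors as mcolors
from scipy.ndimage import gaussian_filter

def curvature_map(band, sigma=1.4):
    """Normalized principal curvature map C(p) of a grayscale band."""
    dxx = gaussian_filter(band, sigma, order=(0, 2))
    dyy = gaussian_filter(band, sigma, order=(2, 0))
    dxy = gaussian_filter(band, sigma, order=(1, 1))
    # Larger eigenvalue of the 2x2 symmetric Hessian at each pixel.
    pcmp = (dxx + dyy) / 2.0 + np.sqrt(((dxx - dyy) / 2.0) ** 2 + dxy ** 2)
    return pcmp / np.max(np.abs(pcmp))

def cms(ref_band, target_band, gamma=1e-4):
    """Curvature model similarity CMS(j) between reference and target bands."""
    cr, cj = curvature_map(ref_band), curvature_map(target_band)
    return np.mean((2 * cr * cj + gamma) / (cr ** 2 + cj ** 2 + gamma))

def cmsm(cube, ref_index=0):
    """CMSM: mean CMS over the n-1 target bands of an (H, W, n) cube."""
    n = cube.shape[2]
    return np.mean([cms(cube[..., ref_index], cube[..., j])
                    for j in range(n) if j != ref_index])

def huqa(srgb, alpha=0.1, beta=0.5):
    """AQ score: 1 minus the cumulative probability of the five non-red
    hue sectors (phi = 2..6) on the six-sector color wheel."""
    hsv = mcolors.rgb_to_hsv(srgb)                   # H, S, V in [0, 1]
    h, s = hsv[..., 0], hsv[..., 1]
    # Replace the hue of unreliably saturated pixels by beta, as in H(p).
    h = np.where((s >= alpha) & (s < 1 - alpha), h, beta)
    non_red = (h > 1 / 12) & (h <= 11 / 12)          # sectors phi = 2..6
    return 1.0 - np.count_nonzero(non_red) / h.size

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print("CMSM:", cmsm(rng.random((40, 35, 33))))   # toy hyperspectral cube
    print("AQ  :", huqa(rng.random((40, 35, 3))))    # toy sRGB image
```

Under this formulation, well-aligned cubes yield neighboring bands whose curvature maps agree, driving CMS toward 1, while a well-aligned sRGB face concentrates its hues in the red sector, driving the AQ score toward 1, consistent with the behavior illustrated in Figs. 14 and 16.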