With the advancement in sensor technology, the use of multispectral imaging is gaining wide popularity for computer vision applications. Multispectral imaging is used to achieve better discrimination between the radiance spectra, as compared to the color images. However, it is still sensitive to illumination changes. This study evaluates the potential evolution of illuminant estimation models from color to multispectral imaging. We first present a state of the art on computational color constancy and then extend a set of algorithms to use them in multispectral imaging. We investigate the influence of camera spectral sensitivities and the number of channels. Experiments are performed on simulations over hyperspectral data. The outcomes indicate that extension of computational color constancy algorithms from color to spectral gives promising results and may have the potential to lead towards efficient and stable representation across illuminants. However, this is highly dependent on spectral sensitivities and noise. We believe that the development of illuminant invariant multispectral imaging systems will be a key enabler for further use of this technology.
© 2017 Optical Society of America
Objects are perceived by their radiance in the visible region of the electromagnetic spectrum, and for a given object, the radiance depends on its material properties, its shape, and its location in the scene. The intensity, position, and spectral characteristics of the illuminant also play a major role in image generation. The spectral sensitivity of the filters is another important parameter in image creation. In a simple imaging model with three channels, the image values $f$ depend on the light source $e(\lambda)$, the surface reflectance $r(\lambda)$, and the camera sensitivity functions $c(\lambda)$, as

$$f = \int_{\omega} e(\lambda)\, r(\lambda)\, c(\lambda)\, d\lambda, \tag{1}$$

where $\omega$ denotes the wavelength range over which the sensor is sensitive.
In the human visual system, the three cone types are sensitive to certain wavelengths in photopic vision. In a camera with three channels, the color filters play a similar role. Multispectral imaging captures more spectral detail in a scene than conventional color imaging. Recently emerging technologies, such as spectral filter arrays [2–4], enable a broader range of usage domains for multispectral imaging. Object recognition based on multispectral images can perform better than recognition based on conventional RGB color images. An example of a multispectral imaging system, designed to determine the quality attributes and ripeness stage of strawberries, was proposed by Liu et al. In that work, the imaging system is first radiometrically calibrated using both a diffuse white and a dark target. Similarly, most existing multispectral imaging systems are specifically designed and need to be recalibrated when the imaging conditions change. Extending the use of multispectral imaging systems from heavily constrained environments to real-world applications is still an open challenge. One of the major obstacles is the calibration of the multispectral camera according to the scene illuminant [7–11]. In this work, we investigate the use of illuminant estimation algorithms for multispectral imaging systems.
We propose to extend the illuminant estimation algorithms from three channels to N channels. Recently, Thomas investigated the physical validity of these illuminant estimation algorithms by applying them to uncalibrated multispectral images (MSIs) with 3, 5, 12, and 20 bands. That work showed that there is a huge variability due to scene content and suggested that the number and configuration of bands have an important influence on the results. In this work, we extend those preliminary results to a more general and exhaustive investigation through an experimental framework in which we simulate a multispectral imaging system using different numbers of sensors and configurations. In that earlier study, only equi-Gaussian filters were used in the simulations, and the evaluation was given in the form of angular error and the goodness-of-fit coefficient (GFC). In this work, we use equi-Gaussian filters, Dirac delta filters, and overlapping equi-energy filters to evaluate the effect of the filter configuration on illuminant estimation. We extend specific illuminant estimation algorithms that rely on simple assumptions, perform efficiently on natural scenes, and, since they do not require any training, are robust to illumination changes. We evaluate the results in the form of angular error. We also map the illuminant from the sensor domain into the chromaticity space and then evaluate the chromaticity error. In this way, we are able to compare the performance of illuminant estimation algorithms and configurations across varying numbers of filters by reducing the data to a common dimensionality. The experimental framework presented here can also be extended to more sophisticated illuminant estimation algorithms, in order to develop an optimal illuminant estimation system for multispectral imaging.
This paper is organized as follows. In Section 2, we briefly discuss computational color constancy and previous research on illuminant estimation in color images. In Section 3, we discuss previous work done on illuminant estimation in MSI and define the methodology for extension of illuminant estimation algorithms to higher dimensions. In Section 4, we present the experimental setup, Section 5 contains our results and a discussion, and Section 6 concludes the paper.
2. COMPUTATIONAL COLOR CONSTANCY REVIEW
The captured color of objects generally changes when observed under different light sources, since the creation of an image is dependent not only on the spectral reflectance property of the object’s surface and the camera’s sensor sensitivity, but also on the incident illuminant on the object, as in Eq. (1). The human visual system has the natural ability to perceive constant color of surfaces despite the change in spectral composition of the illuminant , and this ability to discard illumination effects is called “color constancy” . Color constancy is usually defined in the context of natural scenes along with flat matte and diffuse materials by a so-called “equivalent illumination model” [15,16]. Creating such a model for color constancy in computer vision is called computational color constancy (CCC). Developing an illuminant invariant computer vision system is an open area of research, and there are algorithms that are able to perform well for particular conditions and assumptions, but still a universally accepted CCC system does not exist.
CCC plays an important role in color-based computer vision applications including object recognition, tracking, and image classification . Object representation and recognition from the standpoint of computer vision is discussed in detail in Ref. . For example, in the case of object recognition, the color of the object can be used as a feature, and it should appear constant across changes in illumination . So the first step in achieving a constant representation of colors is to adjust the color changes due to the illuminant. CCC therefore deals with the representation of a scene with the effect of the illuminant being as small as possible. There are basically two approaches for this. One is to compute illuminant invariant features [20,21], and the second is to estimate the illuminant  and later apply a correction. Our work focuses on illuminant estimation in a scene.
The problem of developing an efficient and generic CCC algorithm obviously depends strongly on the illuminant estimation in a given scene, which indeed is not a straightforward task. The core challenge for CCC is that the data acquired are a combination of three unknown factors: surface reflectance properties, color of illuminant, and sensor sensitivities. Maloney and Wandell  showed that color constancy is indeed impossible without applying restrictions on spectral reflectance and illuminations.
From the imaging model given in Eq. (1), the goal of a color constancy system is to estimate the illuminant, and this estimation is performed in the camera domain:

$$e = \int_{\omega} e(\lambda)\, c(\lambda)\, d\lambda, \tag{2}$$

where $e(\lambda)$ is the spectral power distribution of the illuminant and $c(\lambda)$ denotes the camera sensitivity functions.

In Eq. (2), $e$ corresponds to the illuminant’s projection over filters (IPF), which is a set of discrete values with a dimension equal to the total number of filters (N). It should be noted that the IPF is the response of each filter to the illumination (ground truth or estimated), and it is not equivalent to the spectral power distribution of the illumination itself.
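As an illustration of the IPF, the integral can be approximated by a sum over sampled wavelengths. The following is a minimal sketch, not the implementation used in this study: the function name `illuminant_projection`, the box-shaped filters, and the 10 nm sampling are assumptions made only for the example.

```python
import numpy as np

def illuminant_projection(spd, sensitivities, step_nm):
    """Project an illuminant SPD onto N filters (discrete form of the integral).

    spd:           shape (B,), illuminant power at B sampled wavelengths
    sensitivities: shape (B, N), one column per filter
    step_nm:       wavelength sampling step, used as the integration weight
    """
    spd = np.asarray(spd, dtype=float)
    sensitivities = np.asarray(sensitivities, dtype=float)
    return sensitivities.T @ spd * step_nm

# Toy example: a flat illuminant seen through three box-shaped filters.
wavelengths = np.arange(400, 730, 10)            # 400..720 nm, 33 samples
flat_spd = np.ones(wavelengths.size)
boxes = np.stack(
    [(wavelengths >= lo) & (wavelengths < hi)
     for lo, hi in [(400, 500), (500, 600), (600, 730)]],
    axis=1,
).astype(float)                                  # shape (33, 3)
ipf = illuminant_projection(flat_spd, boxes, step_nm=10.0)
```

The result has one value per filter, which makes explicit that the IPF lives in the N-dimensional sensor domain and does not recover the spectral power distribution itself.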
Since the sensor’s response is a combination of three unknown factors, the estimation of the scene illuminant is an ill-posed problem, and certain assumptions have to be made in order to estimate it. Once the illuminant is estimated within the sensor domain, a correction is applied to the acquired image in order to represent it as it would have been taken under a known light source. This process is also expressed as “discounting the chromaticity of the illuminant” by D’Zmura and Lennie. This transformation is performed as

$$F_c = D_{u,c}\, F_u, \tag{3}$$

where $F_u$ is the image acquired under the unknown illuminant, $F_c$ is the same image as it would appear under the canonical illuminant, and $D_{u,c}$ is the spectral adaptation transform that maps one onto the other. The transform is commonly assumed to be diagonal, so that each channel is scaled independently of the others [26–28]. This assumption is closely related to the Von Kries coefficient rule [29,30]. Land’s white-patch algorithm proposes that there is at least one pixel in each color channel that causes maximum reflection of the illuminant, and when such maximum responses are combined, they form the color of the illuminant. This assumption is relaxed by considering the color channels separately, resulting in the max-RGB algorithm.
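The estimate-then-correct procedure can be sketched for the max-RGB case as follows. This is an illustrative example under a diagonal (Von Kries-style) correction; the function names, the synthetic scene, and the equal-energy canonical illuminant are assumptions introduced for the example only.

```python
import numpy as np

def max_rgb_estimate(image):
    """Per-channel maximum response, used as the illuminant estimate (max-RGB)."""
    return image.reshape(-1, image.shape[-1]).max(axis=0)

def diagonal_correction(image, estimated, canonical=None):
    """Scale every channel independently so the estimate maps to the canonical light."""
    estimated = np.asarray(estimated, dtype=float)
    if canonical is None:
        canonical = np.ones_like(estimated)      # assumption: equal-energy target
    gains = np.asarray(canonical, dtype=float) / estimated
    return image * gains                         # broadcast over the channel axis

# Toy scene: random reflectances plus one white pixel, lit by a reddish light.
rng = np.random.default_rng(0)
reflectance = rng.uniform(0.05, 0.9, size=(8, 8, 3))
reflectance[0, 0] = 1.0                          # the "white patch"
light = np.array([1.0, 0.7, 0.4])
observed = reflectance * light
estimate = max_rgb_estimate(observed)
corrected = diagonal_correction(observed, estimate)
```

In this idealized scene the white pixel makes the estimate exact; in real images the assumption holds only approximately, which is why the choice of algorithm matters.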
The gray-world algorithm was proposed by Buchsbaum and is based on the assumption that the average color of a scene is achromatic. The result of the gray-world algorithm was improved by Gershon et al. by taking the average reflectance of a database and assuming the average of the scene to be equal to that average reflectance.
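The shades of gray algorithm was introduced by Finlayson and Trezzi. It is a general form of the max-RGB and gray-world algorithms: the gray-world algorithm is shown to be the same as using the Minkowski norm with $p = 1$ (the $L_1$ norm), while max-RGB is equivalent to using the $L_\infty$ norm. In their case, the general equation for estimation of the light source becomes

$$\left(\frac{\int \left(f(x)\right)^{p} dx}{\int dx}\right)^{1/p} = k\, e, \tag{4}$$

where $f(x)$ is the image value at spatial position $x$, $p$ is the order of the Minkowski norm, and $k$ is a constant.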
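The gray-edge algorithm proposed by van de Weijer and Gevers assumes that the average of the reflectance derivative in a scene is achromatic. This algorithm is expressed as

$$\left(\int \left|f^{\sigma}_{x}(x)\right|^{p} dx\right)^{1/p} = k\, e, \tag{5}$$

where $f^{\sigma}_{x}$ is the spatial derivative of the image after smoothing with a Gaussian filter of standard deviation $\sigma$, $p$ is the order of the Minkowski norm, and $k$ is a constant.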
Edge-based CCC is explored further for higher-order derivatives in Ref. . Celik and Tjahjadi  used wavelet transform to down-sample the image before applying the gray-edge algorithm for estimation of illuminant color, and for each down-sampled image, separate estimation is performed on the high-pass filter’s result. The decision for illuminant color is based on minimum error between the estimation in consecutive scales. CCC based on spatio-temporal statistics in a scene was proposed by Chakrabarti et al. , where the spatial features of object surfaces are also accounted for in the determination of the illuminant. That work is improved in Ref.  by using only the edge information for achieving computational efficiency. There are some approaches that try to select the most appropriate estimation using intrinsic properties from other color constancy algorithms .
Gamut mapping is also used in CCC. It was introduced by Forsyth . He proposed that the color of an object is its representation under a fixed canonical light, rather than as a surface reflectance function. It is based on the assumption that for a given illuminant, one observes only a limited number of colors. Based on this assumption, any change in colors of the image is caused by the variation in color of the light source. The limited set of colors that can occur under a given illuminant is called the canonical gamut and is determined through observations of many surfaces under the known light source. Gijsenij et al.  proposed gamut estimation for illuminant by using higher-order statistics. Their results show that for a lower number of surfaces, pixel-based gamut mapping performs well, but with a large number of surfaces, the efficiency of edge-based gamut mapping increases. Color-by-correlation  is a discrete version of gamut mapping where the correlation matrix is used instead of the canonical gamut for the considered illuminants, and is used with the image data to calculate the probability that the illumination in the test image is caused by which of the known illuminants.
Huo et al. proposed an automatic white balancing algorithm that uses gray points in an image to estimate the illuminant temperature. In their method, an RGB image is converted into the YUV color space, and those pixels where the chrominance component U or V is zero are marked as gray points. A feedback system is used to estimate those points, and the remaining pixels are then corrected by adjusting the gain of the R or B channel according to the detected illuminant color. Yoon et al. proposed a dichromatic line space in which a dichromatic slope is formed, and the illuminant chromaticity is estimated through the intersection of those lines. Ratnasingam and Collins proposed two features that represent chromaticity and are independent of the intensity and correlated color temperature of the illuminant in a scene. Sapiro presented the probabilistic Hough transform approach, where a surface is selected according to a defined distribution and is used, together with the sensor response, to recover the illuminant. A Bayesian formulation for solving CCC is used by Brainard and Freeman, where each surface and light is represented by basis functions for which the probability distribution is defined. Xiong and Funt used stereo images to extract 3D information as an additional source for illuminant estimation. The use of six channels is proposed by Finlayson et al. in the chromagenic algorithm. The additional three channels are acquired by placing a chromagenic filter in front of the sensor. The information from these channels is used to estimate the scene’s illuminant from a set of known illuminants. A modification of the chromagenic algorithm is proposed by Fredembach and Finlayson in the bright-chromagenic algorithm, which uses only the brightest pixels in the two images.
Assuming that the subspace of reflectances of all surfaces is linear and in a smaller dimension than the number of sensors, the Maloney–Wandell algorithm  proposes that the sensor responses for the surfaces under one illuminant fall within a linear subspace of the same dimensionality. Estimation of surface colors under two illuminants using retinex theory is proposed by Barnard et al.  and Finlayson et al. . Nieves et al.  proposed a linear pseudo-inverse method for recovery of the spectral power distribution of the illuminants using a learning-based procedure. Their algorithm is based on the detection of naturally occurring bright areas in natural images, acquired through the color camera.
Machine learning is also applied for illuminant estimation. In Ref. , a multilayer neural network is trained using histograms of chromaticity of input images along with the corresponding chromaticity of the illuminant. A number of similar approaches can be found in Refs. [56–58]. The support vector machine is used in Ref. , which is based on the higher-order structure of images. Recently, deep learning has also been utilized in color constancy as in Refs. [60,61]. Bianco et al.  used a convolutional neural network for illuminant estimation in raw images. For generation of ground-truth illumination, shades of gray, gray edge, and gamut mapping are applied on the training data in their proposed method. Oh and Kim  treat this as an illuminant classification problem by using deep learning.
We consider multispectral images taken in an outdoor environment, where the illumination can be any mixture of illuminants. We are also interested in investigating the effect of the number of filters and their configurations on illuminant estimation. We therefore select a set of illuminant estimation algorithms that can handle any type of illuminant without requiring prior training and that extend straightforwardly to N dimensions. We also require the estimated illuminant to be in the sensor dimension and not in the chromaticity space, so that it can be used for the spectral adaptation transform [the transform in Eq. (3)]. Following our review, we chose to investigate the extension of the gray-world, max-RGB, shades of gray, and gray-edge algorithms. Another reason for the selection of these algorithms is the diversity of spectral imaging systems, in terms of spectral sensitivities and the number of channels, in our experiments. Initially, we do not select learning-based algorithms, as we are interested in a generic illuminant estimation framework without the need for prior training. Although classification methods improve the performance of illuminant estimation, their major problems are the availability of training data and the limited set of illuminations being considered. This is not a major problem in the case of color images but may be troublesome for spectral images, where the diversity of imaging systems is an additional constraint. Therefore, we limit our investigations to “equivalent illumination models.”
3. ILLUMINANT ESTIMATION FROM MULTISPECTRAL IMAGES
In this section, we will first discuss the previous work done for illuminant estimation in multispectral images and then define our proposed idea for the extension of existing illuminant estimation algorithms from color to multispectral images.
A. Related Work
In this section, we define the formation of a multispectral image and then review the literature on illuminant estimation in such images. A spectral image can be defined as an array of channels representing several spectral components at each spatial location. Spectral imaging gained worldwide attention after the launch of Landsat in 1972, and since then it has been widely used in remote sensing applications. With developments in sensor technology, the use of spectral imaging in short-range imaging has also expanded. A survey of hyperspectral and multispectral imaging technologies is provided by Vagni. In this work, we consider only multispectral images acquired through short-range imaging techniques, where N, the number of spectral filters, is typically in the range of 5–20.
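Since a typical silicon sensor behind an optical system is sensitive from 400 to 1100 nm, a multispectral system usually provides a combination of visible and/or near-infrared bands, and the imaging model defined in Eq. (1) still holds, now with N channel values:

$$f_i = \int_{\omega} e(\lambda)\, r(\lambda)\, c_i(\lambda)\, d\lambda, \qquad i = 1, \ldots, N, \tag{6}$$

where $e(\lambda)$ is the illuminant, $r(\lambda)$ is the surface reflectance, and $c_i(\lambda)$ is the spectral sensitivity of the $i$th filter.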
Mosny and Funt  investigated the role of additional information acquired through multispectral imaging in order to improve the performance of already existing color constancy algorithms for illuminant chromaticity estimation. They used the chromagenic algorithm , the Maloney–Wandell algorithm , the gray-world algorithm , and max-RGB . Multispectral images were synthesized for their experiments by using the spectral sensitivity of a Sony DXC-930 camera. For additional band acquisition simulation, the sensitivity curves were shifted by . They used three, six, and nine bands for image acquisition along with 1995 surfaces and 287 illuminants. For representation of results, the median angular error in the sensor domain and the median angular error for illuminants estimates converted to RGB space were used. According to their evaluation, there is a slight improvement with six channels, but overall there is no significant improvement in illuminant chromaticity estimation by increasing the number of bands. Such experiments are performed on real-world data in Ref. , where the authors have used 28 scenes being photographed with 10 different illuminations. For image acquisition, cool and warm filters were used with the camera. Their evaluation methods show the same results that additional spectral bands do not contribute significantly towards illuminant chromaticity estimation.
Shrestha and Hardeberg  proposed a spectrogenic imaging system where two images are acquired from a scene: one normal RGB image and one filtered-RGB image. Illuminant estimation of the scene using these two images is performed using the chromagenic algorithm , and its modification was proposed by Fredembach and Finlayson . Eighty-seven illuminants were used for training the system, and an illuminant with minimum fitting error was selected as the potential illuminant for the scene.
It is worth noting that the purpose of Mosny and Funt [66,67] was to investigate whether there is any improvement in illuminant estimation achieved by increasing the number of filters, while in our work we want to investigate the extension of illuminant estimation into the multispectral domain. The system proposed by Shrestha and Hardeberg  is limited in terms of bands and illuminants. We are interested in the development of an illuminant estimation framework for multispectral imaging with any number of bands and with any mixture of illuminants so that it can be used for outdoor image acquisition without requiring calibration.
B. Proposed Multispectral Illuminant Estimation Algorithms
In this work, we propose four algorithms for investigation, which are instantiations of a class of models referred to as “equivalent illumination models,” and they assume a “flat-matte-diffuse” condition. These algorithms are computational attempts to implement the model of the human visual system for color constancy using natural image statistics. We evaluate the performance of those algorithms with multispectral data by extending those techniques to N dimensions and get the estimate of the illuminant in the sensor domain. We rename those algorithms so that the confusion between color information and spectral information is eliminated.
In the gray-world algorithm, it is assumed that the average reflectance of a scene is gray or achromatic. We extend this definition to multispectral images by assuming that the average reflectance in an $N$-dimensional image is constant:

$$\frac{\int r(x)\, dx}{\int dx} = k, \tag{7}$$

where $r(x)$ is the reflectance seen by the $N$ channels at spatial position $x$ and $k$ is a constant.

Using Eq. (4) with $p = 1$, the illuminant can be estimated by computing the average pixel values for each channel:

$$\frac{\int f(x)\, dx}{\int dx} = k\, \hat{e}. \tag{8}$$

The term $\hat{e}$ is the estimate of the illuminant in the sensor domain. The same technique is used for the spectral gray-edge algorithm, where each channel is treated according to Eq. (5) after smoothing with a Gaussian filter of standard deviation $\sigma$ and extraction of edges through the derivative along both spatial axes. In the case of the spectral shades of gray algorithm, Eq. (4) is used with a value of $p$ higher than 1, while for the max-spectral algorithm, we treat each spectral band separately to get the pixels with the maximum response and use them to estimate the illuminant, following the hypothesis originally formulated for color images.
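The three pixel-based extensions described above can be viewed as one per-channel Minkowski-norm computation. The sketch below illustrates that idea and is not the code used for the experiments; the function name `minkowski_estimate` and the unit-length normalization of the result are assumptions made for the example.

```python
import numpy as np

def minkowski_estimate(image, p=1.0):
    """Per-channel Minkowski-norm illuminant estimate for an (H, W, N) image.

    p = 1      -> spectral gray world (channel means)
    p = np.inf -> max-spectral (channel maxima)
    other p    -> spectral shades of gray
    The result is scaled to unit Euclidean length; only its direction matters.
    """
    pixels = np.abs(image.reshape(-1, image.shape[-1]).astype(float))
    if np.isinf(p):
        estimate = pixels.max(axis=0)
    else:
        estimate = np.mean(pixels ** p, axis=0) ** (1.0 / p)
    return estimate / np.linalg.norm(estimate)

# Toy example: random reflectances in 8 channels under a sloped illuminant.
rng = np.random.default_rng(1)
light = np.linspace(0.5, 1.5, 8)
scene = rng.uniform(0.2, 0.8, size=(32, 32, 8)) * light
gray_world = minkowski_estimate(scene, p=1)
shades_of_gray = minkowski_estimate(scene, p=6)  # p = 6 is an arbitrary example value
max_spectral = minkowski_estimate(scene, p=np.inf)
```

Because every channel is processed independently, the same function applies unchanged to 3, 5, 8, 12, or 20 bands.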
Our implementation strategy for the extension of these algorithms differs slightly from the earlier extension to multispectral images, as we consider each channel of a multispectral image separately. It is worth mentioning that both the shades of gray and gray-edge algorithms use the Minkowski norm $p$, and their authors report the value of $p$ with which the best results are obtained. In our experiments, we keep the value of $p$ proposed by the authors; however, we also perform an experiment to obtain the optimized value for this parameter and discuss it in the results section.
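A per-channel spectral gray-edge estimate could be sketched as follows. This is a NumPy-only illustration under stated assumptions: the separable Gaussian smoothing with edge padding, the use of the gradient magnitude, and the default values of `sigma` and `p` are placeholders chosen for the example, not the settings used in this study.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel truncated at three standard deviations."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()

def smooth(channel, sigma):
    """Separable Gaussian smoothing of one (H, W) channel, same output size."""
    kernel = gaussian_kernel(sigma)
    padded = np.pad(channel, kernel.size // 2, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def spectral_gray_edge(image, sigma=2.0, p=1.0):
    """Minkowski norm of the smoothed gradient magnitude, channel by channel."""
    estimate = np.empty(image.shape[-1])
    for k in range(image.shape[-1]):
        smoothed = smooth(image[..., k].astype(float), sigma)
        dy, dx = np.gradient(smoothed)           # derivatives along both axes
        magnitude = np.hypot(dx, dy)
        estimate[k] = np.mean(magnitude ** p) ** (1.0 / p)
    return estimate / np.linalg.norm(estimate)

# Toy example: one spatial pattern scaled by the illuminant in every channel.
rng = np.random.default_rng(2)
light = np.array([0.4, 0.7, 1.0, 0.8, 0.5])
image = rng.uniform(0.0, 1.0, size=(16, 16))[..., None] * light
edge_estimate = spectral_gray_edge(image, sigma=1.0, p=1.0)
```

Sweeping `p` over a range of values with such a function is one simple way to look for the value that minimizes the error on a given dataset.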
4. EXPERIMENTAL SETUP
A. Data Preparation
We use hyperspectral images from the Foster Dataset 2004 , which are acquired in the wavelength range of 400–720 nm. This dataset contains reflectance data from natural scenes and is adequate for our purposes because of its natural image statistics, which are fundamental to the proposed methods (Fig. 1). In order to prepare radiance data, we use D65 and F11 illuminants. We also test the framework using a combination of D65 and F11 illuminations to simulate a scene having mix D65-F11 illuminants (Fig. 2). D65 is used as a standard daylight illuminant, while F11 resembles the spectral response of a sodium-vapor lamp , which would typically represent an example of outdoor lighting, e.g., road or ski tracks. Illuminant F5 is also used in the experiments, and we found similar results to those obtained with the F11 illuminant. In this paper, we present the results obtained from the multispectral data generated through the F11 illuminant.
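The preparation of radiance data from reflectance data can be illustrated with a short sketch. It assumes that the reflectance cube and the illuminant spectra are sampled at the same wavelengths, and it models the mixed illumination as a spatially uniform weighted sum of the two spectra; that mixing rule, like the function names, is an assumption made for the example.

```python
import numpy as np

def to_radiance(reflectance_cube, illuminant):
    """Multiply an (H, W, B) reflectance cube by a (B,) illuminant spectrum."""
    return np.asarray(reflectance_cube, dtype=float) * np.asarray(illuminant, dtype=float)

def mix_illuminants(illuminant_a, illuminant_b, weight_a=0.5):
    """Weighted sum of two spectra (hypothetical, spatially uniform mixing rule)."""
    a = np.asarray(illuminant_a, dtype=float)
    b = np.asarray(illuminant_b, dtype=float)
    return weight_a * a + (1.0 - weight_a) * b

# Toy example with three spectral samples instead of real D65/F11 tables.
cube = np.ones((2, 2, 3))
spectrum_a = np.array([1.0, 1.0, 1.0])
spectrum_b = np.array([3.0, 3.0, 3.0])
mixed = mix_illuminants(spectrum_a, spectrum_b)
radiance = to_radiance(cube, mixed)
```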
We also consider noise in the multispectral imaging system. Typically, the main sources of noise are photon shot noise, dark current noise, read noise, and quantization noise. We do not consider photon shot noise and dark current noise, since the Foster Dataset 2004 is already corrected for these types of noise. We do not consider quantization noise either, since the data are already quantized at 12 bits. We simulate the additive read noise in our experiments as zero-mean Gaussian noise with 2% variance.
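One possible reading of this noise model is sketched below. It assumes that the image is scaled to the range [0, 1] and that “2% variance” means a noise variance of 0.02 on that scale; both points are interpretations made for the example rather than details confirmed by the text.

```python
import numpy as np

def add_read_noise(image, variance=0.02, rng=None):
    """Add zero-mean Gaussian noise; the image is assumed to be scaled to [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(loc=0.0, scale=np.sqrt(variance), size=np.shape(image))
    return np.asarray(image, dtype=float) + noise   # values are not clipped

# Toy example: a constant 8-band image with reproducible noise.
clean = np.full((100, 100, 8), 0.5)
noisy = add_read_noise(clean, variance=0.02, rng=np.random.default_rng(0))
```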
B. Sensor Configuration
The performance of the proposed algorithms is affected by the spectral sensitivities of the sensors that capture the radiance. In our experiments, we use a Gaussian model of sensor sensitivities. Such a model has been extensively used in the literature to simulate sensors or to approximate Fabry–Perot filter transmittance. We investigate three sensor configurations: equi-Gaussian, Dirac delta, and equi-energy. In the equi-Gaussian configuration, defined within the visible range, the full width at half maximum (FWHM) of the sensor sensitivities decreases with an increase in the number of bands, and the overlap between adjacent bands remains approximately the same. By increasing the number of bands in this configuration, we gradually shift from typical multispectral sensors towards hyperspectral sensors. The Dirac delta configuration simulates the Dirac delta function: only the band corresponding to the mean of each Gaussian filter is selected, while the rest of the bands are discarded. It is of interest to test whether such a configuration helps in estimating illuminants with spiky behavior (e.g., F illuminants). The equi-energy configuration consists of filters with a fixed FWHM, unlike the equi-Gaussian configuration, where the FWHM changes with the number of bands. Using this configuration, we evaluate the effect of filter overlap on illuminant estimation.
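The three configurations could be simulated along the following lines. This is a sketch under several assumptions that the text does not fix: the centers are evenly spaced inside the sampled range, the equi-Gaussian FWHM equals the center spacing, the equi-energy FWHM is 40 nm, and “equi-energy” is taken to mean that every filter is normalized to unit area.

```python
import numpy as np

FWHM_TO_SIGMA = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))

def band_centers(wavelengths, n_filters):
    """Evenly spaced centers strictly inside the sampled range (an assumption)."""
    return np.linspace(wavelengths[0], wavelengths[-1], n_filters + 2)[1:-1]

def gaussian_bank(wavelengths, centers, fwhm):
    """Gaussian sensitivities, shape (B, N), one column per filter."""
    sigma = fwhm * FWHM_TO_SIGMA
    return np.exp(-0.5 * ((wavelengths[:, None] - centers[None, :]) / sigma) ** 2)

def equi_gaussian(wavelengths, n_filters):
    """FWHM tied to the center spacing, so it shrinks as bands are added."""
    centers = band_centers(wavelengths, n_filters)
    spacing = (wavelengths[-1] - wavelengths[0]) / (n_filters + 1)
    return gaussian_bank(wavelengths, centers, fwhm=spacing)

def dirac_delta(wavelengths, n_filters):
    """Keep only the sampled band nearest to each Gaussian center."""
    centers = band_centers(wavelengths, n_filters)
    nearest = np.abs(wavelengths[:, None] - centers[None, :]).argmin(axis=0)
    bank = np.zeros((wavelengths.size, n_filters))
    bank[nearest, np.arange(n_filters)] = 1.0
    return bank

def equi_energy(wavelengths, n_filters, fwhm=40.0):
    """Fixed-FWHM Gaussians, each normalized to unit area (assumed meaning)."""
    bank = gaussian_bank(wavelengths, band_centers(wavelengths, n_filters), fwhm)
    return bank / bank.sum(axis=0, keepdims=True)

wavelengths = np.arange(400.0, 730.0, 10.0)      # 400..720 nm
banks = {
    "equi-Gaussian": equi_gaussian(wavelengths, 8),
    "Dirac delta": dirac_delta(wavelengths, 8),
    "equi-energy": equi_energy(wavelengths, 8),
}
```

A simulated N-band image is then obtained by multiplying a radiance cube of shape (H, W, B) by one of these (B, N) matrices.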
In addition to the above explained filter configurations, we also consider different numbers of bands. Three bands are used for simulating an instantiation of RGB cameras. Five and eight bands are used for simulating a typical multispectral camera . Twelve bands are used to get the best spectral reconstruction , while 20 bands are deployed to approach the properties of a hyperspectral sensor. Figure 3 shows the three different configurations with eight spectral filters.
We consider images with different numbers of bands; therefore, quantitative evaluation is not straightforward, especially when comparing results obtained with different numbers of bands. We consider several evaluation metrics: angular error, GFC, and normalized root mean square error (NRMSE). These three metrics can be used only when the number of filters is the same, and therefore results obtained from different numbers of filters cannot be compared with them. The estimated illuminants and the ground-truth illuminant are normalized by dividing each value by the maximum, so that the range is within [0, 1] and relative errors are evaluated. The three indicators evaluate the similarity between data in very similar ways. We determined the correlation among the computed metrics and found that angular error is closely correlated with both GFC and NRMSE in our data (the correlation with NRMSE is 0.975). Therefore, we decide to discuss and analyze the results in terms of angular error in this paper.
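For reference, the normalization and the two secondary metrics might be computed as in the sketch below. The GFC is written in its usual form as the absolute cosine between the two vectors; the NRMSE is normalized here by the range of the ground truth, which is only one of several conventions and is an assumption of this example.

```python
import numpy as np

def normalize_by_max(values):
    """Divide by the maximum so that the largest entry equals 1."""
    values = np.asarray(values, dtype=float)
    return values / values.max()

def gfc(truth, estimate):
    """Goodness-of-fit coefficient: absolute cosine between the two vectors."""
    truth = np.asarray(truth, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return abs(truth @ estimate) / (np.linalg.norm(truth) * np.linalg.norm(estimate))

def nrmse(truth, estimate):
    """RMSE divided by the range of the ground truth (assumed normalization)."""
    truth = np.asarray(truth, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    rmse = np.sqrt(np.mean((truth - estimate) ** 2))
    return rmse / (truth.max() - truth.min())

truth = normalize_by_max([0.2, 0.5, 1.0, 0.8])
estimate = normalize_by_max([0.4, 1.0, 2.0, 1.6])   # same shape, different scale
```

For non-negative vectors the angular error equals the arccosine of the GFC, which is consistent with the close agreement between these two indicators.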
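The angular error ($\Delta A$) between the ground-truth illuminant $e$ and the estimated illuminant $\hat{e}$ is computed in radians as in Eq. (9). This measure is commonly used in the CCC literature:

$$\Delta A = \arccos\left(\frac{e^{T}\hat{e}}{\sqrt{\left(e^{T}e\right)\left(\hat{e}^{T}\hat{e}\right)}}\right). \tag{9}$$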
The comparison of performance is done among five different numbers of spectral filters (three, five, eight, 12, and 20), three different filter configurations (equi-Gaussian, Dirac delta, and equi-energy filters), and four different algorithms (spectral gray world, max-spectral, spectral shades of gray, and spectral gray edge). The estimated illuminant for all these configurations is compared with the ground-truth illuminant in the sensor domain.
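The angular error between two illuminants in the sensor domain can be computed with a few lines of code. The sketch below is illustrative; the clipping of the cosine is a numerical safeguard added for the example.

```python
import numpy as np

def angular_error(truth, estimate):
    """Angle in radians between two N-dimensional illuminant vectors."""
    truth = np.asarray(truth, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    cosine = (truth @ estimate) / (np.linalg.norm(truth) * np.linalg.norm(estimate))
    return float(np.arccos(np.clip(cosine, -1.0, 1.0)))   # clip guards rounding

# The measure ignores overall intensity: scaling the estimate changes nothing.
reference = np.array([0.3, 0.6, 0.9, 0.5, 0.2])
error_scaled = angular_error(reference, 4.0 * reference)
error_orthogonal = angular_error([1.0, 0.0], [0.0, 1.0])
```

Since the measure depends on the dimension of the vectors, values obtained with different numbers of filters are not directly comparable.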
To be able to compare results obtained from different numbers of filters, we project the data into the chromaticity space, where they can be compared at the expense of an error introduced by the projection. We call this evaluation metric the “chromaticity difference.” We perform a linear colorimetric calibration of the camera based on mean square error fitting on the reflectances of the X-Rite ColorChecker, a technique that has previously been used for color reproduction of MSI. We obtain the CIEXYZ values of both the estimated and the ground-truth illuminants using this method. Chromaticity values are computed from these CIEXYZ values, and the chromatic distance between them is measured as the Euclidean distance. This method enables us to compare the results obtained from different numbers of filters with each other. To verify the validity of this technique, we compared the ground-truth illuminants, mapped from the sensor domain, with the chromaticity value of D65 and found that the Euclidean distance between them varies between 0.000934 and 0.00523 in the chromaticity space. This is very small, and we can therefore neglect the chromaticity error introduced during the mapping of the illuminant from the sensor domain to the chromaticity space.
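The mapping from the sensor domain to the chromaticity space can be sketched as a least-squares fit followed by a chromaticity computation. The example assumes CIE xy chromaticities and uses synthetic training data in place of the ColorChecker patches; the function names are hypothetical.

```python
import numpy as np

def fit_sensor_to_xyz(sensor_responses, xyz):
    """Least-squares linear map from N sensor channels to CIEXYZ.

    sensor_responses: shape (P, N), responses to P training patches
    xyz:              shape (P, 3), CIEXYZ values of the same patches
    """
    matrix, *_ = np.linalg.lstsq(sensor_responses, xyz, rcond=None)
    return matrix                                 # shape (N, 3)

def xy_chromaticity(xyz):
    """CIE xy chromaticity coordinates of one XYZ triplet."""
    xyz = np.asarray(xyz, dtype=float)
    return xyz[:2] / xyz.sum()

def chromaticity_difference(truth, estimate, matrix):
    """Euclidean distance in xy between two sensor-domain illuminants."""
    xy_truth = xy_chromaticity(np.asarray(truth, dtype=float) @ matrix)
    xy_estimate = xy_chromaticity(np.asarray(estimate, dtype=float) @ matrix)
    return float(np.linalg.norm(xy_truth - xy_estimate))

# Synthetic stand-in for 24 patches seen by an 8-band camera.
rng = np.random.default_rng(3)
true_matrix = rng.uniform(0.1, 1.0, size=(8, 3))
patches = rng.uniform(0.1, 1.0, size=(24, 8))
fitted = fit_sensor_to_xyz(patches, patches @ true_matrix)
illuminant = rng.uniform(0.2, 1.0, size=8)
distance = chromaticity_difference(illuminant, 1.1 * illuminant, fitted)
```

Because every camera, whatever its number of bands, is mapped to the same two-dimensional space, the resulting distances can be compared across configurations.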
We present the results in the form of mean angular error, and in order to assess the statistical significance of the results, the Wilcoxon signed rank test (WST) is applied. The use of the WST is recommended by Hordley and Finlayson and is common in the evaluation of illuminant estimation performance [50,78,79]. We investigate the statistical significance of the results at the 95% confidence level and provide the WST scores in terms of the sum of positive scores, in the same way as Bianco et al. A higher score means that a particular algorithm, combined with a sensor configuration, performs well compared to the others. A lower WST score means that the performance is significantly lower in comparison with the rest. To illustrate the visual difference between the ground-truth illuminant and the estimated illuminant, we have included examples in the form of plots. In each figure, the IPF of the ground-truth and estimated illuminants can be compared when the number of filters is the same.
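A scoring procedure of this kind could be sketched as follows. The example implements the signed rank statistic with a normal approximation and counts, for each method, how many other methods it significantly outperforms; the one-sided threshold, the scoring rule, and the method names are assumptions of this sketch, and the normal approximation is rough when only a few images are available.

```python
import numpy as np

def signed_rank(errors_a, errors_b):
    """Wilcoxon signed rank sums and a normal-approximation z score."""
    diff = np.asarray(errors_a, dtype=float) - np.asarray(errors_b, dtype=float)
    diff = diff[diff != 0.0]                      # zero differences are dropped
    n = diff.size
    if n == 0:
        return 0.0, 0.0, 0.0
    magnitude = np.abs(diff)
    ranks = np.empty(n)
    ranks[np.argsort(magnitude, kind="stable")] = np.arange(1, n + 1)
    for value in np.unique(magnitude):            # average the ranks of ties
        tied = magnitude == value
        ranks[tied] = ranks[tied].mean()
    w_plus = ranks[diff > 0].sum()
    w_minus = ranks[diff < 0].sum()
    sd = np.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - n * (n + 1) / 4.0) / sd
    return float(w_plus), float(w_minus), float(z)

def wst_scores(errors_by_method, z_critical=1.96):
    """Count, per method, the methods it beats with significantly lower errors."""
    scores = {name: 0 for name in errors_by_method}
    for a, errors_a in errors_by_method.items():
        for b, errors_b in errors_by_method.items():
            if a != b and signed_rank(errors_a, errors_b)[2] < -z_critical:
                scores[a] += 1
    return scores

# Toy example: one method is better than the other on all ten images.
base = np.linspace(0.01, 0.10, 10)
errors = {"gray_edge": base, "gray_world": base + 0.1}
w_plus, w_minus, z = signed_rank(errors["gray_edge"], errors["gray_world"])
scores = wst_scores(errors)
```

For small samples an exact test, such as the one provided by `scipy.stats.wilcoxon`, would be a more careful choice than the normal approximation used here.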
5. RESULTS AND DISCUSSION
We have provided the results in Tables 1–6. Table 1 shows that in the noiseless case with three filters, spectral gray edge performs the best, followed by and then max-spectral . The configuration performs the worst for all four algorithms. Illuminant estimation from noisy data also shows the same results. There is a slight improvement in mean error in some cases when noise data is used, but this slight change is not statistically significant and the overall results are robust with noise. With five bands (Table 2), spectral gray edge is the best, followed by max-spectral for and mix D65-F11 illuminants. shows different behavior, as max-spectral performs best and spectral edge follows. With noisy data, spectral gray edge gives consistent performance in terms of WST ranking, while the performance of max-spectral is significantly reduced in the case of the F11 illuminant. Table 3 shows that with eight filters the trend for best performance shifts from spectral edge to max-spectral as performs best for both illuminants. However, in the case of , it is interesting to note that spectral shades of gray performs the second best. This behavior is explained by the spikes in the illuminant, and the configuration is able to detect those spikes more efficiently. However, with noisy data, shades of gray is unable to perform anymore and spectral gray edge gets the second best ranking while the rest of the trend remains almost the same. For 12 bands, max-spectral achieves the best estimate, followed by spectral gray edge , as seen in Table 4. The performance of those algorithms remains similar in the presence of noise. In Table 5, results from using 20 filters show that max-spectral and spectral gray edge perform almost the same in both conditions.
We also compare performance on individual multispectral images to determine the effect of scene content on illuminant estimation. Results of illuminant estimation for each individual test image, acquired through three, five, eight, 12, and 20 spectral filters and with the three different sensor configurations, are provided in the supplementary data (Data File 1, Data File 2, Data File 3, Data File 4, Data File 5, and Data File 6). In the following, analysis is provided on the data generated with the D65 illuminant. With three channels, images I1, I2, I4, I6, and I8 show good performance with spectral gray edge , while with images I3, I5, and I7, max-spectral performs the best. To illustrate the difference in projection of the ground-truth illuminant and the estimated illuminant, some examples are shown in Figs. 4–8. In each figure, the horizontal axis represents each filter among the N filters and configuration, while the vertical axis represents values of and , corresponding to the IFP. The points in the figures are joined through straight lines so that the overall behavior can be observed easily. It is worth noting that the results for different numbers of filters are not comparable across Figs. 4–8, since the dimension of filters is changed in each of them. Figure 4 shows the estimated illuminants in the sensor domain for I3 and I4 when spectral gray edge is used. For five filters, I3 and I7 perform best with max-spectral , while the other images show good results with spectral gray edge . I6 performs worst with max-spectral , which is the reason that this algorithm and configuration gets the second best rank while spectral gray edge gets the highest score for five channels. Figure 5 shows the estimated illuminant in the sensor domain for I3 and I5, two images for which illuminant estimation performs poorly.
At this stage, the trend of improvement in max-spectral can already be observed, which becomes clear with eight channels as max-spectral performs best for all images except I6, which works well with spectral gray edge . The performance of max-spectral for images I3 and I5 is shown in Fig. 6. The same behavior is shown by individual images with 12 and 20 channels as well. Figures 7 and 8 show the performance of max-spectral for I6 and I7 when the number of channels is 12 and 20, respectively. In other images, there is a close tie between max-spectral and spectral gray edge , but images I3 and I6 do not perform well with spectral gray edge , thus causing it to get the overall second rank. Angular errors for all the algorithms, number of filters, filter configurations, and illuminants being used are provided in the supplementary material. We have also provided the error in terms of chromaticity for each of the individual images along with the other parameters being tested, in the supplementary material.
Overall, the configuration performs the best among tested filter configurations. Max-spectral and spectral gray edge attain good results, while spectral gray world shows the worst results for all cases. shows slightly better performance with the illuminant, but otherwise it also performs worst. It is interesting to note that spectral gray edge performs better with three bands, but as the number of bands increases, the max-spectral algorithm starts performing the best among the tested algorithms. We investigate that trend by altering the value of the Minkowski norm as in Eqs. (4) and (5). When the value of the parameter is increased, more weight is given to bright pixels in an image, and this ultimately leads towards the max-spectral algorithm. We performed tests with values of varying from 1 to 1000. The results show that as more weight is given to bright pixels in a scene, the illuminant estimation gets better. This explains why the max-spectral algorithm performs well, especially with an increase in the number of bands. Figure 9 shows the change in angular error with variation in the value of .
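The role of the Minkowski norm can be illustrated with a short sketch. It assumes the standard shades-of-gray formulation applied independently to each of the N channels, since Eqs. (4) and (5) are not reproduced in this section; the function name `spectral_shades_of_gray` and the per-channel rescaling used to avoid numerical overflow are illustrative choices, not the authors' implementation.

```python
import numpy as np


def spectral_shades_of_gray(image, p):
    """Estimate the illuminant of an H x W x N image with a Minkowski norm of order p.

    p = 1 gives the spectral gray-world estimate, and p = np.inf gives the
    max-spectral estimate. The result is normalized to unit length.
    """
    pixels = image.reshape(-1, image.shape[-1]).astype(float)
    channel_max = pixels.max(axis=0)
    if np.isinf(p):
        estimate = channel_max
    else:
        # Divide by the channel maximum first so that large p does not overflow.
        scale = np.where(channel_max > 0, channel_max, 1.0)
        estimate = scale * np.mean((pixels / scale) ** p, axis=0) ** (1.0 / p)
    norm = np.linalg.norm(estimate)
    return estimate / norm if norm > 0 else estimate
```

Under these assumptions, the estimate for a large order such as 1000 is already very close to the per-channel maximum, which is consistent with the observation that increasing the parameter leads towards the max-spectral algorithm.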
Tables 1–5 provide analysis of the performance of the proposed algorithms along with a given sensor configuration, in terms of . However, these results cannot be compared across the different numbers of filters because can be compared between two vectors only if they have the same dimension (in our case, the ground-truth and estimated illuminants are in the sensor dimension).
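The dimensional constraint can be made explicit in code. The following sketch computes the angle between a ground-truth and an estimated illuminant expressed in the same N-channel sensor space and rejects vectors of different lengths; the function name `angular_error` and the choice of degrees as the unit are assumptions made for illustration.

```python
import numpy as np


def angular_error(e_true, e_est):
    """Angle in degrees between two illuminant vectors with the same number of channels."""
    e_true = np.asarray(e_true, dtype=float).ravel()
    e_est = np.asarray(e_est, dtype=float).ravel()
    if e_true.shape != e_est.shape:
        raise ValueError("illuminant vectors must have the same number of channels")
    cosine = np.dot(e_true, e_est) / (np.linalg.norm(e_true) * np.linalg.norm(e_est))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0))))
```

Because the measure depends only on the direction of the vectors, it is insensitive to the overall intensity of the illuminant, but it has no meaning when the two vectors come from sensors with different numbers of filters.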
Mosny and Funt [66,67] performed their evaluation in chromaticity space. In their method, the RGB of the estimated illuminant is obtained by identifying which illuminant from a database of known illuminants it is most similar to, and using that illuminant’s RGB as the conversion value. Based on this evaluation, they concluded that there is a minor improvement when increasing the number of bands from three to six for illuminant estimation, but that a further increase to nine bands does not provide any improvement. For evaluating the effect of the number of bands, we perform the evaluation based on chromaticity error in Table 6 but with a different approach as defined in Section 4.C. The comparison is performed among five different numbers of spectral filters (three, five, eight, 12, 20), three sensor configurations (equi-Gaussian, Dirac delta, and equi-energy filters), and four algorithms (spectral gray world, max-spectral, spectral shades of gray, and spectral gray-edge algorithm). We have used Euclidean distance for evaluation of the chromaticity error since we are assuming that our evaluation is in terms of physical measurement. Using the chromaticity space allows us to retain our assumption and enables the comparison between the ground-truth illuminant and the estimated illuminant.
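A Euclidean chromaticity error of this kind can be sketched as follows. The conversion from N-channel sensor values to tristimulus values is defined in Section 4.C and is not reproduced here, so the sketch assumes both illuminants are already available as CIE XYZ triplets and uses CIE xy chromaticity coordinates; both the function names and that choice of chromaticity space are assumptions.

```python
import numpy as np


def xy_chromaticity(xyz):
    """Return the CIE (x, y) chromaticity of a tristimulus triplet (X, Y, Z)."""
    xyz = np.asarray(xyz, dtype=float)
    total = xyz.sum()
    if total <= 0:
        raise ValueError("tristimulus values must have a positive sum")
    return xyz[:2] / total


def chromaticity_error(xyz_true, xyz_est):
    """Euclidean distance between the chromaticities of two illuminants."""
    return float(np.linalg.norm(xy_chromaticity(xyz_true) - xy_chromaticity(xyz_est)))
```

Since chromaticity has a fixed dimension regardless of the number of filters, an error computed this way can be compared across sensor configurations with different numbers of channels.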
Evaluation based on chromaticity error for D65 shows that the best result is obtained from spectral gray edge , and the second best results are from the spectral gray edge and . However, there is a significant statistical difference between and for this illuminant, which becomes more prominent in the case of noisy data.
With the F11 illuminant, max-spectral performs the best and is followed by max-spectral . This behavior of is explained from the spectral power distribution of F11, as shown in Fig. 2(b). The spiky character of this illuminant can be best acquired with the ideal Dirac delta type of filters. However, in the presence of noise, the performance of max-spectral is significantly reduced. Max-spectral performs best in the case of noisy data and is followed by max-spectral .
In the case of the mix D65-F11 illuminant, max-spectral performs the best while spectral gray edge and max-spectral perform second and third best, respectively. Since the behavior of the mix D65-F11 illuminant is influenced by peaks of the illuminant, performs best in this case. The same trend continues in the case of noisy data where the statistically significant difference among results is more prominent in light of WST rankings.
It is interesting to note that increasing the number of channels beyond eight reduces the performance of the illuminant estimation algorithms. This suggests that spectral resolution should also be maintained in a multispectral imaging system. As noticed from Table 6, the configuration performs the worst because of the large overlap among filter sensitivities. This leads to the conclusion that by increasing the number of bands, more noise is introduced during image acquisition, and, therefore, the performance of the illuminant estimation algorithm is degraded. To validate this, we performed an additional illuminant estimation experiment using the native spectral resolution of the data, which is equivalent to a Dirac delta configuration with 33 filters (). There is no improvement in results when compared with the results already obtained from 20 channels, and it performs the worst when noise is added to the system. This fact is also observed by Wang et al. [74], where the spectral reconstruction results start degrading after increasing the number of filters beyond 12.
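The equivalence between the native spectral resolution and a 33-filter Dirac delta configuration can be illustrated with a small simulation sketch. The helper names, the evenly spaced filter placement, the equal-width Gaussian sensitivities, and the additive Gaussian noise model are all assumptions made for illustration and are not taken from the experimental setup.

```python
import numpy as np


def dirac_filters(n_filters, n_bands):
    """Each filter samples exactly one spectral band, spread evenly over the range."""
    centers = np.round(np.linspace(0, n_bands - 1, n_filters)).astype(int)
    filters = np.zeros((n_filters, n_bands))
    filters[np.arange(n_filters), centers] = 1.0
    return filters


def gaussian_filters(n_filters, n_bands, sigma):
    """Equally spaced Gaussian sensitivities with a common width (in band units)."""
    centers = np.linspace(0, n_bands - 1, n_filters)
    bands = np.arange(n_bands)
    return np.exp(-0.5 * ((bands[None, :] - centers[:, None]) / sigma) ** 2)


def simulate_sensor(cube, filters, noise_std=0.0, rng=None):
    """Project an H x W x n_bands radiance cube onto the filters, optionally adding noise."""
    response = cube @ filters.T
    if noise_std > 0:
        rng = np.random.default_rng() if rng is None else rng
        response = response + rng.normal(0.0, noise_std, size=response.shape)
    return response
```

With as many Dirac delta filters as spectral bands, the sensitivity matrix is the identity and the simulated image equals the hyperspectral data, so any change in performance in that setting comes from the added noise rather than from a loss of spectral resolution.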
Although the results and ranks are based only on eight images of similar contents and may not lead to a strong conclusion, our investigation suggests several general behaviors. First, overlapping equi-energy filters may be most suitable for natural or smooth illuminations. Although there may be loss of spectral resolution when using sensors with large overlap, natural illuminations behave smoothly throughout the visible spectrum, so overlapping equi-energy filters are able to perform well. We observe the same trend after noise is added to the images before illuminant estimation. Second, the max-spectral and spectral gray-edge algorithms provide better results than the other tested algorithms in general. The result also depends on image content: in some of the images, a better estimate of the illuminant is achieved (data seems to follow the illumination), while in others the results are quite noisy. Third, our results contradict those of Mosny and Funt [66,67] and suggest that illuminant chromaticity can be better retrieved when we increase the number of bands. However, the impact on color rendering is yet to be investigated. The optimum number of bands seems to be around eight. Finally, we still cannot provide clear indications on how good illuminant estimation is in terms of usability. In practice, the indicator used only provides relative ranking and objective indications on quality. Further analysis is required to understand what accuracy should be achieved for acquiring an illuminant invariant representation of multispectral images.
6. CONCLUSION AND FUTURE WORK
In this work, we proposed to extend illuminant estimation from color to multispectral imaging. Based on an extensive review of state-of-the-art algorithms for CCC, we selected four algorithms that belong to the class of equivalent illumination models, and extended them from three channels to N channels. We named those extended algorithms the spectral gray-world, max-spectral, spectral shades of gray, and spectral gray-edge algorithms. Results show that both spectral gray-edge and max-spectral algorithms perform well in illuminant estimation. Comparison among three different sensor sensitivities is also performed, and the overlapping equi-energy filters are able to estimate the illuminant more accurately as compared to equi-Gaussian or Dirac delta functions for a limited number of channels. The same results are obtained when noise is added to the image data, which shows that the proposed extended algorithms for illuminant estimation are robust to noise.
The illuminant estimation results obtained from simulated multispectral sensors show promising aspects of application of the proposed framework. Based on these results, future work could proceed in three directions. First, new algorithms may be developed, or more sophisticated illuminant estimation algorithms may be extended from color to spectral. Second, the validity of the proposed framework may be evaluated for real data acquired with a multispectral camera. The evaluation can also be performed in terms of color difference and the spectral reconstruction error. Finally, further development in evaluation and usability of this framework may be performed, for instance, by evaluating surface classification under different illuminations.
Funding. Université de Bourgogne; Norges Teknisk-Naturvitenskapelige Universitet (NTNU); Conseil Régional de Bourgogne, France; Norges Forskningsråd.
REFERENCES
1. A. Stockman, D. I. A. MacLeod, and N. E. Johnson, “Spectral sensitivities of the human cones,” J. Opt. Soc. Am. A 10, 2491–2521 (1993).
2. P.-J. Lapray, X. Wang, J.-B. Thomas, and P. Gouton, “Multispectral filter arrays: Recent advances and practical implementation,” Sensors 14, 21626–21659 (2014).
3. J.-B. Thomas, P.-J. Lapray, P. Gouton, and C. Clerc, “Spectral characterization of a prototype SFA camera for joint visible and NIR acquisition,” Sensors 16, 993 (2016).
4. R. Shrestha, J. Y. Hardeberg, and R. Khan, “Spatial arrangement of color filter array for multispectral image acquisition,” Proc. SPIE 7875, 787503 (2011).
5. Y. Manabe, K. Sato, and S. Inokuchi, “An object recognition through continuous spectral images,” in 12th International Conference on Pattern Recognition (1994), Vol. 1, pp. 858–860.
6. C. Liu, W. Liu, X. Lu, F. Ma, W. Chen, J. Yang, and L. Zheng, “Application of multispectral imaging to determine quality attributes and ripeness stage in strawberry fruit,” PLoS ONE 9, e87818 (2014).
7. F. Imai and R. Berns, “Spectral estimation using trichromatic digital cameras,” in International Symposium on Multispectral Imaging and Color Reproduction for Digital Archives (1999), pp. 42–49.
8. D. Connah, S. Westland, and M. G. A. Thomson, “Recovering spectral information using digital camera systems,” Coloration Technol. 117, 309–312 (2001).
9. E. M. Valero, J. L. Nieves, S. M. C. Nascimento, K. Amano, and D. H. Foster, “Recovering spectral data from natural scenes with an RGB digital camera and colored filters,” Color Res. Appl. 32, 352–360 (2007).
10. R. Shrestha and J. Y. Hardeberg, “Spectrogenic imaging: a novel approach to multispectral imaging in an uncontrolled environment,” Opt. Express 22, 9123–9133 (2014).
11. J. Y. Hardeberg and R. Shrestha, “Multispectral colour imaging: time to move out of the lab?” in Mid-Term Meeting of the International Colour Association (AIC), Tokyo, Japan, 2015, pp. 28–32.
12. J.-B. Thomas, “Illuminant estimation from uncalibrated multispectral images,” in Colour and Visual Computing Symposium (CVCS), Gjøvik, Norway, 2015, pp. 1–6.
13. O. Bertrand and C. Tallon-Baudry, “Oscillatory gamma activity in humans: a possible role for object representation,” Trends Cogn. Sci. 3, 151–162 (1999).
14. M. Ebner, Color Constancy, 1st ed. (Wiley, 2007).
15. D. H. Brainard and L. T. Maloney, “Surface color perception and equivalent illumination models,” J. Vis. 11(5), 1 (2011).
16. D. H. Brainard, W. A. Brunt, and J. M. Speigle, “Color constancy in the nearly natural image. 1. Asymmetric matches,” J. Opt. Soc. Am. A 14, 2091–2110 (1997).
17. K. Barnard, V. Cardei, and B. Funt, “A comparison of computational color constancy algorithms. I: Methodology and experiments with synthesized data,” IEEE Trans. Image Process. 11, 972–984 (2002).
18. S. J. Dickinson, “Object representation and recognition,” in What is Cognitive Science, E. Lepore and Z. Pylyshyn, eds. (Basil Blackwell, 1999), Chap. 5, pp. 172–207.
19. M. J. Swain and D. H. Ballard, “Color indexing,” Int. J. Comput. Vis. 7, 11–32 (1991).
20. B. V. Funt and G. D. Finlayson, “Color constant color indexing,” IEEE Trans. Pattern Anal. Mach. Intell. 17, 522–529 (1995).
21. T. Gevers and A. W. Smeulders, “Color-based object recognition,” Pattern Recogn. 32, 453–464 (1999).
22. S. D. Hordley, “Scene illuminant estimation: past, present, and future,” Color Res. Appl. 31, 303–314 (2006).
23. L. T. Maloney and B. A. Wandell, “Color constancy: a method for recovering surface spectral reflectance,” J. Opt. Soc. Am. A 3, 29–33 (1986).
24. D. Cheng, B. Price, S. Cohen, and M. S. Brown, “Effective learning-based illuminant estimation using simple features,” in IEEE Conference on Computer Vision and Pattern Recognition (2015), pp. 1000–1008.
25. M. D’Zmura and P. Lennie, “Mechanisms of color constancy,” J. Opt. Soc. Am. A 3, 1662–1672 (1986).
26. E. Land, “Recent advances in retinex theory and some implications for cortical computations: color vision and the natural image,” Proc. Natl. Acad. Sci. USA 80, 5163–5169 (1983).
27. E. H. Land and J. J. McCann, “Lightness and retinex theory,” J. Opt. Soc. Am. 61, 1–11 (1971).
28. E. H. Land, “The retinex theory of color vision,” Sci. Am. 237, 108–128 (1977).
29. J. von Kries, “Influence of adaptation on the effects produced by luminous stimuli,” in Sources of Color Science, D. L. MacAdam, ed. (1970), pp. 109–119.
30. J. von Kries, “Theoretische Studien über die Umstimmung des Sehorgans,” in Handbuch der Physiologie des Menschen (1905), pp. 211–212.
31. G. Buchsbaum, “A spatial processor model for object colour perception,” J. Franklin Inst. 310, 1–26 (1980).
32. R. Gershon, A. D. Jepson, and J. K. Tsotsos, “From [R, G, B] to surface reflectance: computing color constant descriptors in images,” in Proceedings of the 10th International Joint Conference on Artificial Intelligence, San Francisco, CA, 1987 (Morgan Kaufmann, 1987), pp. 755–758.
33. G. D. Finlayson and E. Trezzi, “Shades of gray and colour constancy,” in Color and Imaging Conference, Scottsdale, Arizona, 2004, Vol. 1, pp. 37–41.
34. J. van de Weijer and T. Gevers, “Color constancy based on the grey-edge hypothesis,” in IEEE International Conference on Image Processing (2005), Vol. 2, pp. II–722-5.
35. J. van de Weijer, T. Gevers, and A. Gijsenij, “Edge-based color constancy,” IEEE Trans. Image Process. 16, 2207–2214 (2007).
36. T. Celik and T. Tjahjadi, “Adaptive colour constancy algorithm using discrete wavelet transform,” Comput. Vis. Image Underst. 116, 561–571 (2012).
37. A. Chakrabarti, K. Hirakawa, and T. Zickler, “Color constancy with spatio-spectral statistics,” IEEE Trans. Pattern Anal. Mach. Intell. 34, 1509–1519 (2012).
38. M. Rezagholizadeh and J. Clark, “Edge-based and efficient chromaticity spatio-spectral models for color constancy,” in International Conference on Computer and Robot Vision (2013), pp. 188–195.
39. A. Gijsenij and T. Gevers, “Color constancy using natural image statistics and scene semantics,” IEEE Trans. Pattern Anal. Mach. Intell. 33, 687–698 (2011).
40. D. Forsyth, “A novel algorithm for color constancy,” Int. J. Comput. Vis. 5, 5–35 (1990).
41. A. Gijsenij, T. Gevers, and J. van de Weijer, “Generalized gamut mapping using image derivative structures for color constancy,” Int. J. Comput. Vis. 86, 127–139 (2010).
42. G. D. Finlayson, S. D. Hordley, and P. M. Hubel, “Color by correlation: a simple, unifying framework for color constancy,” IEEE Trans. Pattern Anal. Mach. Intell. 23, 1209–1221 (2001).
43. J. Huo, Y. Chang, J. Wang, and X. Wei, “Robust automatic white balance algorithm using gray color points in images,” IEEE Trans. Consum. Electron. 52, 541–546 (2006).
44. K.-J. Yoon, Y. J. Chofi, and I.-S. Kweon, “Dichromatic-based color constancy using dichromatic slope and dichromatic line space,” in IEEE International Conference on Image Processing (2005), Vol. 3, pp. III–960-3.
45. S. Ratnasingam and S. Collins, “Study of the photodetector characteristics of a camera for color constancy in natural scenes,” J. Opt. Soc. Am. A 27, 286–294 (2010).
46. G. Sapiro, “Color and illuminant voting,” IEEE Trans. Pattern Anal. Mach. Intell. 21, 1210–1215 (1999).
47. D. H. Brainard and W. T. Freeman, “Bayesian color constancy,” J. Opt. Soc. Am. A 14, 1393–1411 (1997).
48. W. Xiong and B. Funt, “Stereo retinex,” in The 3rd Canadian Conference on Computer and Robot Vision (2006), pp. 15.
49. G. D. Finlayson, S. D. Hordley, and P. Morovic, “Chromagenic colour constancy,” in 10th Congress of the International Colour Association (2005).
50. C. Fredembach and G. Finlayson, “Bright-chromagenic algorithm for illuminant estimation,” J. Imaging Sci. Technol. 52, 040906 (2008).
51. B. A. Wandell, “The synthesis and analysis of color images,” IEEE Trans. Pattern Anal. Mach. Intell. PAMI-9, 2–13 (1987).
52. K. Barnard, G. Finlayson, and B. Funt, “Color constancy for scenes with varying illumination,” Comput. Vis. Image Underst. 65, 311–321 (1997).
53. G. D. Finlayson, B. V. Funt, and K. Barnard, “Color constancy under varying illumination,” in 5th International Conference on Computer Vision (1995), pp. 720–725.
54. J. L. Nieves, C. Plata, E. M. Valero, and J. Romero, “Unsupervised illuminant estimation from natural scenes: an RGB digital camera suffices,” Appl. Opt. 47, 3574–3584 (2008).
55. V. C. Cardei, B. Funt, and K. Barnard, “Estimating the scene illumination chromaticity by using a neural network,” J. Opt. Soc. Am. A 19, 2374–2386 (2002).
56. H. R. V. Joze and M. S. Drew, “Improved machine learning for image category recognition by local color constancy,” in IEEE International Conference on Image Processing (2010), pp. 3881–3884.
57. J. T. Barron, “Convolutional color constancy,” in IEEE International Conference on Computer Vision (2015), pp. 379–387.
58. V. Agarwal, A. V. Gribok, and M. A. Abidi, “Machine learning approach to color constancy,” Neural Netw. 20, 559–563 (2007).
59. N. Wang, D. Xu, and B. Li, “Edge-based color constancy via support vector regression,” IEICE Trans. Inf. Syst. E92-D, 2279–2282 (2009).
60. Z. Lou, T. Gevers, N. Hu, and M. Lucassen, “Color constancy by deep learning,” in British Machine Vision Conference (2015).
61. W. Shi, C. C. Loy, and X. Tang, “Deep specialized network for illuminant estimation,” in European Conference on Computer Vision (Springer, 2016), pp. 371–387.
62. S. Bianco, C. Cusano, and R. Schettini, “Color constancy using CNNs,” in IEEE Conference on Computer Vision and Pattern Recognition Workshops (2015), pp. 81–89.
63. S. W. Oh and S. J. Kim, “Approaching the computational color constancy as a classification problem through deep learning,” Pattern Recogn. 61, 405–416 (2017).
64. F. Vagni, “Survey of hyperspectral and multispectral imaging technologies,” Technical Report (North Atlantic Treaty Organization, Sensors and Electronics Technology Panel, 2007).
65. K.-S. Lee, W. B. Cohen, R. E. Kennedy, T. K. Maiersperger, and S. T. Gower, “Hyperspectral versus multispectral data for estimating leaf area index in four different biomes,” Remote Sens. Environ. 91, 508–520 (2004).
66. M. Mosny and B. Funt, “Multispectral colour constancy,” in Color and Imaging Conference (Society for Imaging Science and Technology, 2006), Vol. 2006, pp. 309–313.
67. M. Mosny and B. Funt, “Multispectral color constancy: real image tests,” Proc. SPIE 6492, 64920S (2007).
68. D. H. Foster, K. Amano, S. M. C. Nascimento, and M. J. Foster, “Frequency of metamerism in natural scenes,” J. Opt. Soc. Am. A 23, 2359–2372 (2006).
69. M. DiCosola, “Understanding illuminants,” Technical Report (X-Rite, 1995).
70. M. Rezagholizadeh and J. J. Clark, “Image sensor modeling: color measurement at low light levels,” J. Imaging Sci. Technol. 58, 304011 (2014).
71. K. Barnard and B. Funt, “Camera characterization for color research,” Color Res. Appl. 27, 152–163 (2002).
72. G. D. Finlayson, M. S. Drew, and B. V. Funt, “Spectral sharpening: Sensor transformations for improved color constancy,” J. Opt. Soc. Am. A 11, 1553–1563 (1994).
73. P.-J. Lapray, J.-B. Thomas, P. Gouton, and Y. Ruichek, “Energy balance in spectral filter array camera design,” J. Eur. Opt. Soc. 13, 1 (2017).
74. X. Wang, J.-B. Thomas, J. Y. Hardeberg, and P. Gouton, “Multispectral imaging: narrow or wide band filters?” J. Int. Colour Assoc. 12, 44–51 (2014).
75. J. Hernández-Andrés, J. Romero, J. L. Nieves, and R. L. Lee, “Color and spectral analysis of daylight in southern Europe,” J. Opt. Soc. Am. A 18, 1325–1335 (2001).
76. J. Conde, H. Haneishi, M. Yamaguchi, N. Ohyama, and J. Baez, “CIE-XYZ fitting by multispectral images and mean square error minimization with a linear interpolation function,” Rev. Mex. Fis. 6, 601–607 (2004).
77. S. D. Hordley and G. D. Finlayson, “Reevaluation of color constancy algorithm performance,” J. Opt. Soc. Am. A 23, 1008–1020 (2006).
78. S. Bianco, F. Gasparini, and R. Schettini, “Consensus-based framework for illuminant chromaticity estimation,” J. Electron. Imaging 17, 023013 (2008).
79. G. D. Finlayson, R. Zakizadeh, and A. Gijsenij, “The reproduction angular error for evaluating the performance of illuminant estimation algorithms,” IEEE Trans. Pattern Anal. Mach. Intell. PP, 1 (2016).