The perception of brightness of unrelated self-luminous colored stimuli of the same luminance has been investigated. The Helmholtz–Kohlrausch (H-K) effect, i.e., an increase in brightness perception due to an increase in saturation, is clearly observed. This brightness perception is compared with the calculated brightness according to six existing vision models, color appearance models, and models based on the concept of equivalent luminance. Although these models included the H-K effect and half of them were developed to work with unrelated colors, none of the models seemed to be able to fully predict the perceived brightness. A tentative solution to increase the prediction accuracy of the color appearance model CAM97u, developed by Hunt, is presented.
© 2013 Optical Society of America
1. INTRODUCTION
Brightness is an attribute of visual perception according to which an area appears to emit, or reflect, more or less light. Luminance, defined as the luminous flux per unit solid angle and per unit projected area, transmitted by an elementary beam passing through a given point and in a given direction, can be considered as the photometric quantity most closely related to brightness. However, colored stimuli of equal luminance do not necessarily appear equally bright. The complex relationship between luminance and brightness, conceptualized in a brightness-to-luminance ratio (B/L), has been extensively studied [2–5]. Deviations of the B/L ratio from unity have been observed in heterochromatic brightness matches of colored stimuli. They can be caused either by the use of an inappropriate $V(\lambda)$ function in the present standard system of photometry or by a failure of Abney’s proportionality and additivity laws. Experimental evidence has shown the latter to be the case in direct heterochromatic brightness matching [7–11]. Although the relationship between luminance and brightness can be described to a first-order approximation by a power law, it has become clear that several other parameters, such as the luminance of the background and the colorfulness of the stimulus, are involved as well. The effect of colorfulness or saturation on perceived brightness is referred to as the Helmholtz–Kohlrausch (H-K) effect. It states that highly saturated colors appear brighter than those of low saturation, even when they are equal in luminance [3,13].
In the past, a number of models have been developed to describe the brightness of a stimulus. Most models deal with related colors, which are perceived to belong to areas seen in relation to other colors. Typical examples are colors produced by matte objects reflecting light (object or surface colors). However, only two of the models for related colors, those based on the concept of equivalent luminance described in the next section, include the H-K effect. Another category of stimuli is unrelated colors, which are colors perceived to belong to areas seen in isolation from any other colors [14,15]. Typical examples are self-luminous stimuli, light sources, railway, aviation, and marine signal lights, traffic lights, and street lights viewed with a dark surround, e.g., on a dark night. With the implementation of light-emitting diodes (LEDs) in signalization and in architectural lighting applications, saturation levels are reaching much higher values. The predictive power of the few existing models (CAM97u, ATD01, and CAMFu), detailed in the next section, to describe the brightness for this category of stimuli including the H-K effect has not yet been investigated systematically.
In this study, nine observers have evaluated the brightness of 58 unrelated self-luminous colored stimuli with a luminance of and surrounded by a dark background. The spectral radiance of the stimuli has been measured, and observer data of the brightness have been collected. The correlation between the brightness perception of these observers and the brightness calculated according to the models based on the equivalent luminance, the CAM97u model, the ATD01 model, and the CAMFu model has been investigated. Finally, a tentative improvement of the brightness prediction of CAM97u is proposed.
2. VISION MODELS FOR PREDICTING BRIGHTNESS PERCEPTION
The vision models mentioned in the introduction are described in more detail below.
A. Hunt (CAM97u)
CAM97u is a color appearance model (abbreviated as CAM hereafter) for unrelated colors developed by Hunt. CAMs try to link the experimentally measurable optical properties of stimuli and their corresponding perceptual attributes, such as brightness, hue, colorfulness, lightness, chroma, and saturation, under varying conditions by taking into account some of the physiological processes that occur in the human visual system. In general, CAMs, including the most recent CIECAM02, are designed for object colors (related colors) and do not implement the H-K effect. Most of the models use the tristimulus values and luminance of the stimulus, the reference white, the background, and the surround as input parameters.
Only CAM97u, which can be considered as an extension to Hunt’s model for related colors, was developed particularly for unrelated colors and implements the H-K effect. The input parameters of CAM97u are the tristimulus values of the stimulus and the conditioning field, which is the stimulus that is seen just before the test stimulus. In addition, the photopic and scotopic luminance of the stimulus, the adapting field, and the conditioning field are required. While the photopic luminance is used throughout the whole model (luminance adaptation, chromatic adaptation, photopic part of the achromatic signal, etc.), the scotopic luminance is only used to calculate the scotopic part of the achromatic signal. In the model, the brightness is predicted by using Eq. (1) [19]. The inclusion of the colorfulness in the expression for brightness represents the H-K effect.
B. Guth (ATD01)
The ATD01 model is a color vision model, developed by Guth, built on the theoretical ideas of Helmholtz, Hering, von Kries, and Mueller. It was developed to predict the brightness, saturation, and hue of unrelated colors and to predict a wide range of vision science data on phenomena such as chromatic discrimination, absolute thresholds, the Bezold–Brücke hue shift, the Abney effect, heterochromatic brightness matching, light adaptation, and chromatic adaptation [14,22]. In the ATD model, the tristimulus values are transformed into LMS cone responses. These LMS signals are gain-controlled and undergo a second transformation to yield an achromatic signal ($A$) and two chromatic or opponent signals (red–green or $T$, blue–green or $D$). These $A$, $T$, and $D$ signals go through a compressive nonlinearity and are finally used to calculate the perceptual attributes brightness, hue, and saturation. The brightness is calculated as the quadrature sum of the $A$, $T$, and $D$ signals [8]:

$$\text{brightness} = \sqrt{A^2 + T^2 + D^2}. \qquad (2)$$
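The quadrature sum used for the ATD01 brightness can be illustrated with a minimal Python sketch. The function name and the signal values below are hypothetical and only serve to show that, for an equal achromatic signal, larger opponent signals yield a larger brightness value; the preceding stages of the model (cone transformation, gain control, compression) are not reproduced here.

```python
import math


def quadrature_sum(a: float, t: float, d: float) -> float:
    """Return sqrt(a^2 + t^2 + d^2), the quadrature sum of three signals."""
    return math.sqrt(a * a + t * t + d * d)


# Hypothetical compressed signals: same achromatic signal A,
# but different opponent signals T and D (illustrative values only).
low_saturation = quadrature_sum(0.30, 0.02, 0.01)
high_saturation = quadrature_sum(0.30, 0.15, 0.10)

# The stimulus with the larger opponent signals gets the larger value.
print(low_saturation < high_saturation)
```

Because the opponent signals enter the sum directly, a brightness correlate of this form increases with the chromatic content of the stimulus even when the achromatic signal is unchanged.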
C. Fu (CAMFu)
Fu investigated the color appearance of unrelated colors under photopic and mesopic levels of adaptation using stimuli displayed on a CRT. Four stimulus sizes (0.5°, 1°, 2°, and 10°) and stimuli with luminance values between 0.1 and were used. For brightness evaluation, a magnitude scaling method was adopted. Colors appeared more colorful (Hunt effect) and brighter at higher luminance levels than at lower luminance levels, and increasing the stimulus size generally increased both brightness and colorfulness. The data were compared with the predictions of CAM97u and CIECAM02, even though the latter model is in principle not applicable to unrelated colors. An alternative CAM for unrelated colors under photopic and mesopic conditions, in this paper indicated as CAMFu and highly inspired by CIECAM02, was developed. In the CAMFu model, the brightness correlate is given by Eq. (3).
D. Equivalent Luminance Nayatani
Brightness has also been modeled based on the concept of equivalent luminance, which is defined as the photopic luminance of a previously determined common reference stimulus that matches the test stimulus (object or self-luminous) in terms of brightness. Two such models have been developed, one by Nayatani and one by the CIE. The effects of the surround, background, and field of view are ignored; the H-K effect is, however, taken into account. Both models apply in principle only to related colors.
Nayatani proposed two methods that take into account the H-K effect for calculating the equivalent luminance: the variable-achromatic-color (VAC) and the variable-chromatic-color (VCC) method, given by Eqs. (4) and (5), respectively.
In the VAC method, the luminance of the reference achromatic color is changed in order to match the colored stimuli. In the VCC method, the luminance of the colored stimuli is changed in order to match the achromatic reference.
E. Equivalent Luminance CIE
This model calculates a brightness-related equivalent luminance by using four parameters: the photopic luminance, the scotopic luminance, an achromatic adaptation coefficient, and the chromatic contribution. The achromatic adaptation contribution takes into account the so-called Purkinje effect, which causes a shift in the sensitivity of the human eye toward the blue end of the visible spectrum at low luminance levels, while the chromatic contribution allows for the H-K effect. This chromatic contribution changes with the luminance level and the chromaticity coordinates of the stimulus. A formula for the general equivalent luminance for related colors has been proposed [Eq. (6)].
This equation has been based on visual data gathered from several studies [8,13,24–26], and Eq. (6) has been tested by matching experiments. The supplementary system of photometry as described by Eq. (6) was originally developed based on the 2° quantities, except for the scotopic luminance, but can also be used for a centrally fixed 10° field.
3. EXPERIMENTAL SETUP
In this study, the brightness of unrelated colors was investigated using a specially designed viewing room [see Fig. 1(a)]. The dimensions of the room are 3 m wide by 5 m long by 3.5 m high. The walls, ceiling, and floor were covered with black curtains, gray panels, and a grayish black carpet, respectively. At one wall, a circular tube (diameter 37 cm), containing an RGB LED module and covered by a circular diffuser, was mounted inside the wall [see Fig. 1(b)]. This circular diffuser provided a stimulus of approximately 10° FOV to the observers, who were seated on a fixed chair at a distance of 211 cm. The color of the stimulus was changed by controlling the intensity of the RGB LEDs using a DMX digital communication network. For the experiments, 58 colored stimuli with a constant luminance of (standard deviation ) and a wide chromaticity gamut were carefully selected. This luminance level was chosen in order to ensure photopic viewing conditions without any glare effect. Note that the luminance was calculated using the standard spectral luminous efficiency function for the CIE 10° observer. The CIE 1976 $u'$, $v'$ chromaticity coordinates of the stimuli, as determined from spectral measurement using a spectroradiometer (MS260i Oriel instruments spectrograph) and suitable calibration, are illustrated in Fig. 2.
The luminance uniformity of the stimulus area was checked by measurements with a two-dimensional luminance camera (MURATest by Eldim). The luminance of the stimulus was found to gradually decrease (to approximately 20% of the average) from the center to the edge. As the human eye is insensitive to low spatial frequencies, observers were not aware of this variation. The background, consisting of a black curtain, provided an adaptive field extending to 35° around the 10° stimuli.
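The stated field of view follows from the geometry of the setup. As a minimal check, assuming the diffuser is viewed head-on from the stated distance, the subtended angle can be computed as follows (the function name is hypothetical):

```python
import math


def visual_angle_deg(size: float, distance: float) -> float:
    """Full angle in degrees subtended by an object of the given size,
    viewed head-on from the given distance (same length unit for both)."""
    return math.degrees(2.0 * math.atan((size / 2.0) / distance))


# Diffuser diameter (37 cm) and viewing distance (211 cm) from the text.
fov = visual_angle_deg(37.0, 211.0)
print(round(fov, 1))  # 10.0
```

The result of about 10.0° is consistent with the use of the CIE 10° observer for the photometric and colorimetric calculations.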
4. EVALUATION OF BRIGHTNESS
In the psychophysical experiment, observers were asked to evaluate the perceived brightness of the stimuli using a magnitude estimation method. With this method, a numerical and scalable result for the perceptual attribute under test, in this case brightness, can be obtained directly [21,29–31]. The experiment started by showing a reference achromatic stimulus with chromaticity close to that of illuminant D65 (, , 0.4695) and a luminance equal to that of the colored test stimuli, i.e., . The color difference between the reference stimulus and the illuminant D65 was 0.00234. To keep the colored stimuli unrelated, the presentation of a reference stimulus shown in temporal juxtaposition with the test stimulus, possibly inducing small memory errors, was preferred to a reference stimulus presented at the same time in adjacent spatial locations. To this reference a fixed brightness value of 50 was attributed. After 5 s, the colored test stimulus was presented for 15 s. Just after switching to the reference again, the observers were asked to rate the brightness of the stimulus relative to the reference achromatic stimulus. Pilot experiments had shown that it is easier to rate the brightness immediately after the colored stimulus has disappeared. Furthermore, by showing the reference stimulus after each stimulus presentation, any errors induced due to memory effects were minimized. Total darkness never occurred in order to reduce the potential for temporary blindness and afterimages. Before the experiment, the observers adapted to the dark viewing conditions for at least 5 min. The following instructions were given to each observer:
You will see 58 test stimuli. First a reference stimulus will be shown for 5 s. Each test stimulus is then presented for 15 s. Between each of these 58 test stimuli, a reference stimulus will be shown for 5 s. Give a value to the brightness of the test stimulus immediately after the disappearing of this test stimulus and in comparison with the reference. The reference has a brightness value of 50. A value of zero represents a dark stimulus without any brightness. There is no upper limit to the value of brightness.
To become familiar with the magnitude estimation method, a straightforward exercise was completed in which observers were asked to rate the length of a line in comparison with a line of length 100. A similar method is described in the ASTM International standard test method for unipolar magnitude estimation of sensory attributes. In addition, a set of training stimuli, with the same hue and luminance as in the experiments, was presented to the observers, allowing them to become aware of the brightness range and familiar with the brightness rating technique. The duration of this training session was about 45 min, while the experiment took about 25 min. A small break was taken between the training and the experiment in order to reduce the influence of fatigue.
Nine observers, five female and four male, with ages ranging between 23 and 30 years (average 27) participated in the experiments. All had normal color vision according to the Ishihara 24 plate Test for Color Blindness and were naïve with respect to the purpose of the experiment. To avoid irreproducibility in the luminous flux and chromaticity of the LEDs induced by thermal conditions, the 58 stimuli were presented in an identical sequence to all observers. The same sequence was also used for the optical measurements. The training session contained the same set of stimuli but in a different order.
A. Inter-observer Agreement
The agreement between any two sets of data can be analyzed using the coefficient of variation (CV) [Eq. (7)]. For a perfect agreement between two sets of data, the CV should be equal to zero. Inter-observer agreement was assessed by calculating the CV values between each individual observer’s results and the geometric mean of all the observers.
The values for the inter-observer agreement in this study ranged from 7% to 21% with an average of 11% and a median of 8% (see Table 1). This result is similar to the value of 13% reported by Luo et al. for the lightness of related colors and better than the value of 29% reported by Fu et al. when scaling the brightness of unrelated colors in conditions similar to those used in this study.
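The inter-observer analysis can be sketched in Python as follows. This is a minimal illustration that assumes one common definition of the CV, namely the root-mean-square difference between the two data sets divided by the mean of the reference set, expressed in percent; the exact form of Eq. (7) may differ, for instance by including a scaling factor. The function names and the ratings are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def coefficient_of_variation(x, y):
    """Assumed form of the CV: 100 * RMS(x - y) / mean(y), in percent."""
    n = len(x)
    rms = math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)) / n)
    return 100.0 * rms / (sum(y) / n)


# Hypothetical brightness ratings: rows are observers, columns are stimuli.
ratings = [
    [50.0, 60.0, 80.0],
    [45.0, 65.0, 90.0],
    [55.0, 55.0, 70.0],
]

# Geometric mean across observers for each stimulus.
mean_per_stimulus = [geometric_mean(column) for column in zip(*ratings)]

# CV between each observer and the geometric mean of all observers.
cv_per_observer = [
    coefficient_of_variation(row, mean_per_stimulus) for row in ratings
]
print([round(cv, 1) for cv in cv_per_observer])
```

Note that the geometric mean is only defined for strictly positive ratings, so a rating of zero would need separate treatment in such an analysis.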
B. Brightness Perception
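The values of the geometric mean of the observers’ brightness for each of the 58 test stimuli of equal luminance are given in Table 2, together with the CIE 1976 $u'$, $v'$ hue angle ($h_{uv}$) and saturation ($s_{uv}$) of each stimulus, calculated using Eqs. (8) and (9):

$$h_{uv} = \arctan\!\left(\frac{v' - v'_n}{u' - u'_n}\right), \qquad (8)$$

$$s_{uv} = 13\sqrt{(u' - u'_n)^2 + (v' - v'_n)^2}, \qquad (9)$$

where $u'_n$ and $v'_n$ denote the chromaticity coordinates of the reference white, following the standard CIE 1976 definitions.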
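As a minimal sketch of how the CIE 1976 hue angle and saturation are obtained from $u'$, $v'$ chromaticity coordinates, the standard definitions (hue angle as the arctangent of the chromaticity differences with respect to the white point, saturation as 13 times the distance from the white point) can be written as follows. The function names and the chromaticity values are hypothetical and are not taken from Table 2.

```python
import math


def hue_angle_deg(u, v, u_n, v_n):
    """CIE 1976 u'v' hue angle in degrees, mapped to the range [0, 360)."""
    return math.degrees(math.atan2(v - v_n, u - u_n)) % 360.0


def saturation_uv(u, v, u_n, v_n):
    """CIE 1976 u'v' saturation: 13 times the distance to the white point."""
    return 13.0 * math.hypot(u - u_n, v - v_n)


# Hypothetical white point and stimulus chromaticities (illustrative only).
u_n, v_n = 0.20, 0.47
reddish = (0.30, 0.47)     # displaced along +u'
upper = (0.20, 0.57)       # displaced along +v'

print(round(hue_angle_deg(*reddish, u_n, v_n), 1),
      round(saturation_uv(*reddish, u_n, v_n), 2))
print(round(hue_angle_deg(*upper, u_n, v_n), 1),
      round(saturation_uv(*upper, u_n, v_n), 2))
```

Using `atan2` rather than a plain arctangent resolves the quadrant, so that hues on opposite sides of the white point are not confused.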
It is clear that for each hue series, there is generally an increase in perceived brightness with saturation. A plot of the mean observed brightness for each stimulus versus the saturation is given in Fig. 3. As each stimulus has the same luminance, Fig. 3 clearly illustrates the H-K effect. The figure suggests a possible difference in the size of the H-K effect for different colors (larger for red and blue, lower for yellow). Therefore, the dependency of the perceived brightness on the 11 hue series has been investigated using a one-way ANOVA. With the mean observed brightness as dependent variable and the 11 hue series as factor, the analysis showed that the observed brightness is not significantly different for the 11 hues. Even though prior studies showed a tendency of the H-K effect to be larger for reds and blues than for yellow, the current experimental data were insufficient to support this with statistical significance.
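The one-way ANOVA used here compares the variance between hue series with the variance within each series. A minimal Python sketch of the F statistic is given below; the function name and the grouped values are hypothetical, and the p-value (which requires the F distribution) is left to a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the individual values around their own group mean.
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical brightness values grouped by hue series (illustrative only).
hue_series = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
]
f_stat, df_between, df_within = one_way_anova_f(hue_series)
print(f_stat, df_between, df_within)  # 3.0 2 6
```

A small F value relative to the critical value for the given degrees of freedom means that the differences between the group means are not larger than expected from the spread within the groups.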
C. Model Performance
The ability to predict the observer data for brightness has been investigated for each of the previously described models. Before comparing their performances, some clarifications about the input values required by the models are discussed. Because of the 10° field-of-view (FOV) of the test stimuli, the 10° photometric quantities have been calculated, although not all models were developed for this FOV.
For CAM97u, the equi-energy stimulus was used for the conditioning field and the adapting field. The luminance of the conditioning field and the adapting field was taken as . The scotopic luminance of the stimuli was calculated using the spectral luminous efficiency function for scotopic vision and the scotopic luminance of the conditioning field and the adapting field was taken as .
For CAMFu, the experimental reference stimulus was taken as the reference white. The tristimulus values of the stimuli , , and of the reference white , , were normalized such that . The luminance of the reference white and the adapting field were, respectively, set to and . The scotopic luminance of the stimuli was calculated using the spectral luminous efficiency function for scotopic vision .
For each of the six models described in the second section, the averaged visual brightness as assessed by the observers has been plotted against the calculated brightness in Fig. 4. The blue, green, red, and yellow stimuli have been highlighted. To assess the amount of variation in brightness perception that is explained by each model, the coefficient of determination ($R^2$) of the regression between the observed and calculated brightness has been calculated. An $R^2$ close to 1 suggests a good prediction by the model. Although a linear relation between the observed and the calculated brightness is expected, the Spearman correlation coefficient has also been calculated. The Spearman correlation coefficient is a rank order metric (not sensitive to the potential nonlinearity of the relation between observed and calculated values), having a value between $-1$ (perfect negative correlation) and $+1$ (perfect positive correlation). Table 3 summarizes the statistical results for each model.
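The difference between the two metrics can be illustrated with a minimal Python sketch. It assumes a simple linear regression, for which the coefficient of determination equals the squared Pearson correlation, and computes the Spearman coefficient as the Pearson correlation of the ranks (ties receive averaged ranks). The function names and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def r_squared(x, y):
    """Coefficient of determination of a simple linear regression."""
    return pearson_r(x, y) ** 2


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: monotonic but nonlinear relation (illustrative only).
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
observed = [1.0, 4.0, 9.0, 16.0, 25.0]

print(round(r_squared(predicted, observed), 4))  # 0.9626
print(spearman_rho(predicted, observed))         # 1.0
```

For this monotonic but nonlinear relation the rank correlation is perfect while $R^2$ stays below 1, which is why reporting both metrics helps separate a failure of ordering from a mere departure from linearity.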
From Table 3, it is clear that none of the described models performs well. Although the two best models, both based on the concept of equivalent luminance, included the H-K effect explicitly, they were not developed for unrelated colors, which could explain their lack of predictive strength for the constant luminance stimuli. Remarkably, the VAC model of Nayatani’s equivalent luminance, where the achromatic color is changed to match the colored stimuli, performs better than the VCC model, although it should be less applicable to the method used in this experiment, with a constant achromatic stimulus.
While the one-way ANOVA showed no significant difference in observed brightness between the 11 hue series, the same analysis with the brightness predicted by each of the three equivalent-luminance models as dependent variable (Table 3) indicated that there is a difference between the brightness predicted by these models for the 11 hues. In other words, the models predicted a difference in brightness between these hues that was not reported by the observers.
The one-way ANOVA with the brightness prediction of the other models (CAM97u, ATD01, and CAMFu) also showed a difference between the predicted brightness for the 11 hue series. Although these models have been developed for unrelated colors and include the H-K effect, the results from the ANOVA, the low values of the Spearman correlation coefficient, and the low coefficient of determination (Table 3) indicate that they are unable to predict the experimental brightness data for stimuli of constant luminance. The brightness predictions of CAM97u [Eq. (1)] and CAMFu [Eq. (3)] both consist of a summation of an achromatic signal, which is nearly constant (all stimuli have equal luminance), and a contribution of the colorfulness factor, which takes into account the H-K effect. The failure of these two models to predict the perceived brightness for the self-luminous stimuli in this study might be attributed to a poor implementation of this colorfulness factor. In fact, the correlation between the observed brightness and the colorfulness factor of each model can be used to investigate the implementation of the H-K effect.
For CAMFu, the correlation between the observed brightness and the predicted colorfulness indicates that the H-K effect is poorly predicted. This failure of CAMFu might be attributed to the use of a colorfulness factor based on the CIECAM02 model, which was not developed to handle unrelated colors.
For CAM97u, the correlation between the observed brightness and the predicted colorfulness shows a better prediction of the H-K effect (Fig. 5). A more detailed analysis of the hue dependency using a one-way ANOVA indicates that there is no significant difference between the predicted brightness for the 11 hues, which is in agreement with the observers not rating the hue series differently. In addition, a customized ANCOVA was performed to test the assumption of homogeneity of regression slopes between the observed brightness and the colorfulness of CAM97u for the 11 hue series by including, next to the main effects of hue and colorfulness, their interaction term. This customized analysis, with the observed brightness as dependent variable, the 11 hues as fixed factors, and the colorfulness as covariate, showed that the effect of the interaction term between hue and colorfulness is not significant. The assumption of homogeneity of the slopes between observed brightness and colorfulness for each hue is thus confirmed. The relationship between the observed brightness and the colorfulness of CAM97u is therefore similar for the 11 hues.
The H-K effect can thus be predicted by the colorfulness of CAM97u, while the achromatic signal is nearly constant for all 58 stimuli and does not correlate with the observed brightness. This suggests that the performance of CAM97u could be considerably improved just by increasing the contribution of this colorfulness. In fact, increasing the contribution of the colorfulness in Eq. (1) will also increase the correlation between the observed and the predicted brightness. However, because the two contributions are not totally independent and the magnitude of the increased colorfulness contribution still has to be determined, additional visual data at different luminance levels are required in order to propose a better model.
5. CONCLUSIONS
In a psychophysical experiment the brightness perception of unrelated self-luminous colored stimuli with a constant luminance was investigated. The ability of six vision models to predict the observed brightness was evaluated using the coefficient of determination, the Spearman correlation coefficient, and an ANOVA analysis. Although the models included the H-K effect and half of them were developed to work with unrelated colors, none of the models seemed to be fully able to predict the perceived brightness. The expected linear relationship between the observed and predicted brightness was not achieved.
However, the brightness predictions of CAM97u and CAMFu both consist of a summation of an achromatic signal, which is nearly constant (all stimuli have equal luminance), and a contribution of the colorfulness factor, to take the H-K effect into account. The failure of these two models to predict the perceived brightness for the self-luminous stimuli in this study might thus be attributed to a poor implementation or an underestimation of this colorfulness factor. The former is most likely the case for the Fu model, as the correlation between the observed brightness and its predicted colorfulness is rather low and the model is based on CIECAM02, which is unable to handle unrelated colors. However, CAM97u, specifically designed for unrelated colors, showed a good correlation between its colorfulness factor and the observed brightness, suggesting that an increase of the contribution of the colorfulness factor in the calculation of the brightness might lead to a better model. In order to determine the exact magnitude and possible luminance dependence of this increased colorfulness contribution, further research at different luminance levels will be performed.
The authors would like to thank the Research Council of the KU Leuven for supporting this research project (STIM-OT/11/056).
1. CIE, International Lighting Vocabulary (CIE Central Bureau, 2011).
2. C. L. Sanders and G. Wyszecki, “Correlate for brightness in terms of CIE color matching data,” in CIE Proceedings 15th Session (CIE Central Bureau, 1963).
3. Y. Nayatani, “Simple estimation methods for the Helmholtz–Kohlrausch effect,” Color Res. Appl. 22, 385–401 (1997).
4. CIE, Supplementary System of Photometry (CIE Central Bureau, 2011).
5. CIE, Testing of Supplementary Systems of Photometry (CIE Central Bureau, 2001).
6. G. Wyszecki and W. S. Stiles, Color Science, 2nd ed. (Wiley, 1982), p. 410.
7. S. L. Guth, N. J. Donley, and R. T. Marrocco, “On luminance additivity and related topics,” Vis. Res. 9, 537–575 (1969).
8. S. L. Guth and H. R. Lodge, “Heterochromatic additivity, foveal spectral sensitivity, and a new color model,” J. Opt. Soc. Am. 63, 450–462 (1973).
9. R. M. Boynton and P. K. Kaiser, “Vision: the additivity law made to work for heterochromatic photometry with bipartite fields,” Science 161, 366–368 (1968).
10. G. Wagner and R. M. Boynton, “Comparison of four methods of heterochromatic photometry,” J. Opt. Soc. Am. 62, 1508–1515 (1972).
11. P. K. Kaiser and G. Wyszecki, “Additivity failures in heterochromatic brightness matching,” Color Res. Appl. 3, 177–182 (1978).
12. S. S. Stevens, “On the psychophysical law,” Psychol. Rev. 64, 153–181 (1957).
13. H. Yaguchi and M. Ikeda, “Subadditivity and superadditivity in heterochromatic brightness matching,” Vis. Res. 23, 1711–1718 (1983).
14. M. D. Fairchild, Color Appearance Models, 2nd ed., Wiley-IS&T Series in Imaging Science and Technology (Wiley, 2005).
15. R. W. G. Hunt and M. R. Pointer, Measuring Colour, 4th ed., Wiley-IS&T Series in Imaging Science and Technology (Wiley, 2011).
16. R. W. G. Hunt, “Revised colour-appearance model for related and unrelated colours,” Color Res. Appl. 16, 146–165 (1991).
17. S. L. Guth, “ATD01 model for color appearances, color differences and chromatic adaptation,” Proc. SPIE 4421, 303–306 (2002).
18. C. Fu, C. Li, G. Cui, M. R. Luo, R. W. G. Hunt, and M. R. Pointer, “An investigation of colour appearance for unrelated colours under photopic and mesopic vision,” Color Res. Appl. 37, 238–254 (2012).
19. R. W. G. Hunt, Measuring Colour, 3rd ed. (Fountain, 1998), pp. 239–246.
20. M. R. Luo and R. W. G. Hunt, “The structure of the CIE 1997 Colour Appearance Model (CIECAM97s),” Color Res. Appl. 23, 138–146 (1998).
21. CIE, A Colour Appearance Model for Colour Management Systems: CIECAM02 (CIE Central Bureau, 2004).
22. P. Capilla, M. J. Luque, J. Gómez, and A. Palomares, “On saturation and related parameters following Guth’s ATD colour-vision model,” Color Res. Appl. 26, 305–321 (2001).
23. M. R. Luo, A. A. Clarke, P. A. Rhodes, A. Schappo, S. A. R. Scrivener, and C. J. Tait, “Quantifying colour appearance. Part I. Lutchi colour appearance data,” Color Res. Appl. 16, 166–180 (1991).
24. K. Sagawa and K. Takeichi, “System of mesopic photometry for evaluating lights in terms of comparative brightness relationships,” J. Opt. Soc. Am. A 9, 1240–1246 (1992).
25. Y. Nakano, “A model for brightness perception and its application to individual data,” Jpn. J. Opt. 21, 705–716 (1992).
26. D. A. Palmer, “Standard observer for large-field photometry at any level,” J. Opt. Soc. Am. 58, 1296–1298 (1968).
27. CIE, Colorimetry (CIE Central Bureau, 2004).
28. T. N. Cornsweet, Visual Perception (Academic, 1970).
29. F. B. Leloup, M. R. Pointer, P. Dutré, and P. Hanselaer, “Luminance-based specular gloss characterization,” J. Opt. Soc. Am. A 28, 1322–1330 (2011).
30. G. A. Gescheider, “Psychophysical scaling,” Annual Rev. Psych. 39, 169–200 (1988).
31. W. S. Torgerson, Theory and Methods of Scaling (Wiley, 1958).
32. ASTM International, ASTM Standard E1697-05(2012)e1, “Standard Test Method for Unipolar Magnitude Estimation of Sensory Attributes” (ASTM International, 2012).
33. P. A. García, R. Huertas, M. Melgosa, and G. Cui, “Measurement of the relationship between perceived and computed color differences,” J. Opt. Soc. Am. A 24, 1823–1829 (2007).
34. A. Field, Discovering Statistics Using SPSS, 3rd ed. (SAGE, 2009).