
Towards an image-based brightness model for self-luminous stimuli


Abstract

Brightness is one of the most important perceptual correlates of color appearance models (CAMs) when self-luminous stimuli are targeted. However, the vast majority of existing CAMs assume a uniform background surrounding the stimulus, which severely limits their practical application in lighting. In this paper, a study on the brightness perception of a neutral circular stimulus surrounded by a non-uniform background consisting of a neutral ring-shaped luminous area and a dark surround is presented. The ring-shaped luminous area is presented with 3 thicknesses (0.33 cm, 0.67 cm and 1.00 cm), at 4 angular distances to the edge of the central stimulus (1.2°, 6.4°, 11.3° and 16.1°) and at 3 luminance levels (90 cd/m2, 335 cd/m2, 1200 cd/m2). In line with the literature, the results of the visual matching experiments show that the perceived brightness decreases in the presence of a ring and that the effect is maximal at the highest luminance of the ring, for the largest thickness and at the closest distance. Based on the observed results, an image-based model inspired by the physiology of the retina is proposed. The model includes the calculation of cone-fundamental weighted spectral radiance, scattering in the eye, cone compression and receptive field post-receptor organization. The wide receptive field ensures an adaptive shift determined by the adaptation to both the stimulus and the background. It is shown that the model performs well in predicting the matching experiments, including the impact of the thickness, the distance and the intensity of the ring, showing its potential to become the basic framework of a Lighting Appearance Model.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Modelling human color perception is the main mission of color science. Multiple color appearance models (CAMs) which generate perceptual attributes of colored stimuli have been developed, such as the Nayatani et al. model [1], the Hunt model [2], CIECAM97s [3], CIECAM02 [4] and CAM16 [5], some of which are recommended by the Commission Internationale de l’Éclairage (CIE). In these models, from the optical characterization of the stimulus and background in terms of spectral radiance or tristimulus values, a number of processing steps mimicking the human visual system are defined to output a set of absolute (brightness, colorfulness, saturation and hue) and relative (lightness, chroma) visual correlates of the stimulus. Though widely applied in the fields of printing and media reproduction, the majority of existing CAMs still have certain limitations in their applications as they assume a uniform stimulus (typically with a defined angular extent between 2 and 10°) seen on a uniform background (typically extending for 10° from the edge of the stimulus) and categorized surround [6], while real life stimuli, such as street lighting or outdoor billboards, are often observed in much more complex environments.

To extend the applications of traditional CAMs to more complex scenarios, some image color appearance models have also been created. These models are capable of predicting different spatial color perception effects such as simultaneous contrast, crispening or spreading, and are applied for image quality assessment [7], high dynamic range (HDR) image rendering [8–11] and image enhancement [12,13]. While the iCAM models [7,8] and the Reinhard et al. model [9] are mainly based on fundamental CAMs such as CIECAM02, Meylan et al. [10] and Benoit et al. [11] proposed models which are more physiologically based, with tone-mapping operators inspired by retinal processing mechanisms. Though the filter kernel size influences the performance of such models [14], the physiological background behind the choice of the kernel size has not been clearly justified [13]. Moreover, Kolås et al. [12] developed a framework for spatial color algorithms based on the Retinex theory by Land [15], and Provenzi et al. [13] suggested a color correction algorithm with a series of mathematical assumptions based on the human visual system. However, these models were not developed explicitly for predicting color appearance and the visual attributes, such as brightness, are not explicitly extracted from the models.

Applying the previously mentioned non-imaging and imaging color appearance models to self-luminous stimuli, such as light sources present in general lighting scenes, is, however, not straightforward. The absolute spectral radiance of the stimulus can be independent of that of the background, leading to an ambiguity in the definition of and normalization to the white point. It has also been shown that these models underestimate the Helmholtz-Kohlrausch (H-K) effect [16]. To overcome those limitations, some CAMs have been developed for self-luminous stimuli [16,17]. One of them, CAM18sl [16], has been implemented in a wide range of applications such as visual gloss, the CIE UGR for discomfort glare, the CIE gray-scale calculation for self-luminous devices, the requirements for traffic signalization [18], as well as to predict the age effect in brightness perception for saturated (red and blue) stimuli [19]. Nevertheless, CAM18sl has the same limitations as the classical CAMs mentioned before, being only applicable to situations characterized by a uniform stimulus and background. As these limitations can be resolved by using image-based color appearance models, the application of the iCAM model to self-luminous scenes has been investigated [14]. Unfortunately, it has been shown that the model has some drawbacks, such as an underestimation of the H-K effect and the stimulus size effect and an overestimation of the brightness perception of self-luminous stimuli surrounded by a dark background [14]. All these observations call for a more comprehensive color appearance model applicable to complex scenes including light sources, which might be called a lighting appearance model (LAM).

A first step towards a LAM is the development of an image-based model for brightness perception of self-luminous stimuli in complex situations. As brightness plays an important role in lighting and display applications, more particularly in defining glare levels or contrast thresholds [18,20–22], multiple studies have been performed to investigate the impact of the complex viewing environment on the brightness perception of a stimulus. It has been shown that different properties of the background, such as its size and spatial composition, have a significant impact [21,23–26]. Stevens [23] showed that the area of the inhibiting field determines the degree of brightness decrease of the object. Later, Sun et al. [24] confirmed that increasing the size of the luminous background decreases the brightness of the stimulus. The separation between the target stimulus and the inducing field also appears to influence how the brightness of the target stimulus is perceived: the smaller the separation, the higher the impact on the stimulus brightness [25]. Whittle [21] found that the introduction of a thin outline or a hue change between the stimulus and background could reduce the crispening effect. Carter et al. [26] pointed out that the brightness of a stimulus can also be affected by luminance changes of an extended area in the background.

In addition, Reid and Shapley proposed a model which considers the contribution of the surrounding environment through modelling brightness contrast and assimilation effects [27]. In these effects, the stimulus brightness changes in the opposite and in the same direction as the background brightness, respectively. Shevell et al. introduced a two-stage model which simulates the neural mechanisms of brightness induction using the retinal stimulation from the target stimulus area and its adjacent area, in conjunction with the neural response from the remote area in the field of view [28]. Kingdom and Moulden developed a multi-channel model to simulate different brightness phenomena using filtering at multiple spatial scales [29], which was later extended to a two-dimensional brightness model by McArthur and Moulden [30]. These models are capable of predicting various well-known brightness phenomena, such as simultaneous contrast [27], in complex situations. However, they either investigate the phenomena only for a small field of view of a few degrees [27,28] or are not verified with human perception experiments [29,30]. McCann [31] and Rudd [32] also proposed strongly physiologically based lightness computation models which work well for various spatial conditions in neutral scenes; yet, applying these models to self-luminous stimuli might face the same issues as traditional CAMs due to the ambiguity of defining the reference white point for self-luminous scenes.

In order to extend a CAM for object colors to a LAM including light sources, mastering the perception of the brightness of light sources in a complex scene is a first crucial step. This study addresses the question of how different parts of the background influence the brightness perception of a neutral self-luminous stimulus. An image-based brightness model is presented which combines the merits of classical appearance models applicable to self-luminous sources with image-based approaches in which complex backgrounds are considered. As a first step, a symmetrical ring-shaped luminous background element characterized by a well-defined gap towards the stimulus is considered. Psychophysical brightness matching experiments are conducted to investigate how the distance (up to 16° from the edge of the stimulus), the area and the luminance of a neutral luminous ring-shaped background influence the brightness perception of a neutral stimulus. The experimental data are modelled using a pixel-by-pixel image-based approach including cone-fundamental weighted spectral radiance, stray light, cone compression, a receptive field post-receptor organization and an adaptive shift. A connection to the CAM18sl model is made by extending the width of the luminous background until a uniform background is reached. Although the experimental setting is still relatively simple (neutral circular stimuli, neutral and ring-shaped induction areas), the experiments allow the development and testing of a next-generation image-based brightness model for self-luminous stimuli.

2. Experiment

2.1 Experimental setup

The stimuli used in the experiments were created with PsychToolbox [33–35] in MATLAB [36] and displayed on an EIZO ColorEdge PROMINENCE CG3145 monitor. The reference stimulus has a field of view (FOV) of approximately 10°, while the display subtends a FOV of 82° ${\times}$ 49° with the observer seated at a distance of 40 cm from the screen. A black shield is placed in the middle of the screen to ensure that the reference and the test stimuli are visually separated and do not influence the perception of one another. The set-up of the experiment is illustrated in Fig. 1.


Fig. 1. Pictures of the experiment set-up. (a) A view of the experiment set-up showing the shield and the screen; the reference stimulus is displayed at the left half and the test stimulus at the right half. (b) The top view of the setup. The blue circles indicate where the observer should position their head when viewing the experimental scenes.


The reference stimulus consists of a neutral gray circle with a 10° diameter and a reference luminance L10,ref of 65 cd/m2. Its CIE 2006 10° chromaticity coordinates are (x10 = 0.31, y10 = 0.32), similar to the chromaticity of CIE illuminant D65.

The test scene consisted of a central test stimulus having the same size as the reference stimulus, surrounded by an additional luminous area. To ensure an equal impact of the distance and the luminance of the additional element on the central stimulus from all directions, the luminous area was chosen as a uniform ring. The test stimulus and ring had the same CIE 2006 10° chromaticity as the reference stimulus, but different luminance levels. The test stimulus was initially shown at either a low starting luminance of 35 cd/m2 or a high starting luminance of 335 cd/m2, so that at the beginning the test stimulus would appear clearly different in perceived brightness from the reference stimulus. Three ring luminance levels were used in this study: L10,ring = 90 cd/m2, L10,ring = 335 cd/m2 and L10,ring = 1200 cd/m2, which covered the low, medium and high luminance ranges of the screen. Further, the ring was presented with three possible thicknesses of 0.33 cm, 0.67 cm and 1.00 cm (average angular widths of 0.25°, 0.49° and 0.73°, respectively) and at four different angular gaps with respect to the outer edge of the stimulus: 1.2°, 6.4°, 11.3° and 16.1°. Each ring thickness was used in combination with all three ring luminance levels at all four angular gaps.

To avoid positional bias and luminance starting point bias, the test stimuli were shown both to the left and to the right of the reference stimulus and were initially displayed at two different starting luminance levels. To account for the matching errors made by each observer, each experimental sequence included two test stimuli without any ring around them, shown at the two starting luminance levels. Two repeated scenes were also included to check the intra-observer variability. The test sequence for each observer at each experimental session was randomized to avoid ordinal bias.

In each experimental phase, each observer first performed a trial session including 5 random stimuli to get used to the procedure, followed by one official session for each ring thickness at each ring luminance level. In each official experimental session, 22 test scenes were shown (4 ring distances ${\times}$ 2 starting points ${\times}$ 2 reference positions + 2 scenes to test the matching error + 4 repeated scenes for intra-observer variability).

2.2 Experimental procedure

Visual data were collected using a brightness matching method. The observer was seated in a dark room and, during an initial 5-minute adaptation period, looked at a scene randomly chosen from the set of test scenes while receiving instructions.

The observer was asked to change their gaze position such that they would always maintain a fixed distance of 40 cm to the screen and look perpendicularly at the center of the reference and the test stimulus with binocular view, one at a time. The task given to the observer was to adjust the brightness of the test stimulus such that it was perceived as equally bright as the reference stimulus. To perform the brightness matching task, the observer informed the experimenter whether they preferred to adjust the stimulus intensity with a keyboard themselves or would rather instruct the experimenter orally. The screen was driven with a 10-bit signal and the observer had the option to change the stimulus brightness with a coarse adjustment (5 RGB levels) or by fine-tuning (1 RGB level). These increment steps allow the observers to make consistent perceptual brightness steps using the coarse adjustments while still having the option to fine-tune the match. The average time for each observer to finish one matching session was between 20 and 45 minutes.

The observer panel included 20 subjects (13 males and 7 females) aged between 21 and 61 years old with an average of 30.5 years. All observers had normal or corrected to normal color vision as tested by the Ishihara 24-plate test.

Once the matching result was obtained for each observer, the spectral radiance of the matched stimulus was measured with a JETI Specbos 1211 spectroradiometer. The measured area covered the central 25% of the stimulus area. Good uniformity of less than 2% difference between the minimum and the maximum luminance values was found; the screen did not show pixel cross-talk. To obtain the spectral radiance of the thin ring, the measurement was performed on an arbitrary larger area on the display having the same RGB values as the ring, as the thickness of the ring was smaller than the measurement spot of the spectroradiometer.

The short, medium and long cone-weighted and scaled spectral radiance values ${L_\rho },{L_\gamma },{L_\beta }$ of the matched stimulus are computed from the measured spectral radiance ${L_{e,\lambda }}(\lambda )$ and the set of cone fundamentals $\overline {{l_{10}}} ,\overline {{m_{10}}} ,\overline {{s_{10}}}$ for 10° stimuli as provided by the CIE in 2006 [37–39]. The normalization coefficients were chosen such that for a D65 stimulus, the cone-weighted and scaled spectral radiance values are identical and equal to the CIE 2006 10° luminance value:

$$\begin{aligned} {L_\rho } &= 686.7\int\limits_{390}^{830} {{L_{e,\lambda }}(\lambda )\overline {{l_{10}}} (\lambda )d\lambda } \\ {L_\gamma } &= 768.3\int\limits_{390}^{830} {{L_{e,\lambda }}(\lambda )\overline {{m_{10}}} (\lambda )d\lambda } \\ {L_\beta } &= 1366.1\int\limits_{390}^{830} {{L_{e,\lambda }}(\lambda )\overline {{s_{10}}} (\lambda )d\lambda } \end{aligned}$$

The cone-weighted spectral radiance values of the reference stimulus and the ring are calculated similarly.

As the stimuli and the rings have a chromaticity close to that of the D65 illuminant, their three cone-weighted values are almost identical, with a difference of less than 2% between the minimum and the maximum values. Therefore, their arithmetic mean, denoted as ${L_\alpha }$, is used to describe the test stimulus, the reference stimulus and the ring, written as ${L_{\alpha ,test}}$, ${L_{\alpha ,ref}}$ and ${L_{\alpha ,ring}}$, respectively. As long as neutral stimuli are targeted, this approach allows a reduction of the number of cone-weighted input values for the model [40]. As the cone-weighted spectral radiance value is a measure for the absorption rate of photons in the cones of the retina, these values can be considered as cone excitations.
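For readers who want to reproduce this step numerically, a minimal Python sketch of Eq. (1) and of the arithmetic mean ${L_\alpha }$ is given below. It assumes the measured spectral radiance and the CIE 2006 10° cone fundamentals are available as arrays sampled on a common wavelength grid; the function and variable names are our own.

```python
import numpy as np

# Scaling constants of Eq. (1), chosen such that for a D65 stimulus the three
# cone-weighted values are identical and equal to the CIE 2006 10-deg luminance.
K_RHO, K_GAMMA, K_BETA = 686.7, 768.3, 1366.1

def cone_weighted_radiance(wl_nm, L_e, l10, m10, s10):
    """Cone-weighted and scaled spectral radiance values of Eq. (1).
    wl_nm : wavelength grid in nm (e.g. 390..830), L_e : spectral radiance,
    l10/m10/s10 : CIE 2006 10-deg cone fundamentals on the same grid."""
    dl = np.gradient(wl_nm)                     # local wavelength step, nm
    L_rho   = K_RHO   * np.sum(L_e * l10 * dl)  # numerical approximation of the integral
    L_gamma = K_GAMMA * np.sum(L_e * m10 * dl)
    L_beta  = K_BETA  * np.sum(L_e * s10 * dl)
    return L_rho, L_gamma, L_beta

def mean_cone_excitation(L_rho, L_gamma, L_beta):
    # For near-neutral (D65-like) stimuli the three values nearly coincide,
    # so their arithmetic mean L_alpha is used as the single model input.
    return (L_rho + L_gamma + L_beta) / 3.0
```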

3. Results

3.1 Observer variability

The average intra- and inter-observer variability were calculated by taking the arithmetic mean over the observers of the standardized residual sum of squares (STRESS) obtained for each observer. This value can be used to analyze the agreement between two sets of data, where two sets in perfect agreement would result in a STRESS value of zero [41]. The inter-observer variability was calculated as the average of STRESS between the data collected by each individual observer and the average observer, and the intra-observer variability was calculated as the average of STRESS between the matches of the repeated scenes from each individual observer. The STRESS values were also calculated to check the agreement between the data collected from different reference positions, starting luminance levels and the matching error variability.

The average intra- and inter-observer variability are calculated to be 18% and 21%, respectively, which is similar to [24,40,42,43] or better than [44,45] the values reported in previous literature on brightness experiments using matching or magnitude estimation methods. The average matching error of roughly 12% indicates the baseline of the matching reliability.
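As an illustration, a small Python sketch of the STRESS index as commonly defined [41] is shown below; it takes two paired data sets (e.g. the matches of one observer and those of the average observer) and returns the percentage STRESS value, zero meaning perfect agreement up to a scale factor. The function name is ours.

```python
import numpy as np

def stress(a, b):
    """STRESS index (%) between two paired data sets a and b [41]."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    f = np.sum(a * b) / np.sum(b ** 2)   # optimal scaling of b onto a
    return 100.0 * np.sqrt(np.sum((a - f * b) ** 2) / np.sum((f * b) ** 2))
```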

To check whether the choice of the reference position and the starting luminance level have an influence on the brightness matching results, a Kruskal-Wallis test was performed, which shows that there are statistical differences between the data collected from the different reference positions (H(1) = 13.14, p-value = .0007) and from the different starting luminance levels (H(1) = 34.41, p-value = 4.47e-09).

Finally, an average matching result for each test scene and each individual observer is obtained by taking the arithmetic mean of ${L_\alpha }$ of the matched stimuli from the different test scene positions and starting luminance levels. Then, the arithmetic mean of the average results from all individual observers is computed for each test scene to obtain the final average matching result.

3.2 Brightness matching results

The ratio of the average cone excitation of the test stimulus ${L_{\alpha ,test}}$ over the cone excitation of the reference stimulus ${L_{\alpha ,ref}}$ (surrounded by a dark background) as a function of the gap or distance from the ring to the stimulus is illustrated in Fig. 2.


Fig. 2. The ratio of the average cone weighted spectral radiance of the matched stimuli as a function of angular distance from the stimulus edge to the ring with different ring thicknesses and ring luminance levels: (a) Without error bars. (b) With error bars. The error bars represent the standard error.


As ${L_{\alpha ,test}}$ is always higher than ${L_{\alpha ,ref}}$, the results show that the observers always select a higher radiance of the test stimulus to obtain a brightness match with the reference stimulus. This implies that introducing a luminous ring in the visual field, whatever its luminance, gap or thickness, makes the perceived brightness of the central stimulus decrease. The influence of the ring luminance is evident, as higher ring luminance levels result in a higher matching value, indicating a stronger inhibiting effect. In addition, the general tendency is that when the ring is at a closer distance, its impact on the stimulus brightness is higher than when the ring is at a further distance. Even at the furthest distance of 16.1°, the ratio ${L_{\alpha ,test}}/{L_{\alpha ,ref}}$ is still greater than 1. However, the impact of distance seems to disappear when the ring luminance level is low, as the matching results remain almost the same for all distances.

Finally, the thickness of the ring is shown to have a certain impact on the perceived brightness of the central stimulus. The effect is highest at the closest ring distance. The rings with a thickness of 0.67 cm and 1.00 cm have quite a similar effect, while the impact of the ring with the smallest thickness (0.33 cm) is much smaller. This was confirmed by a Kruskal-Wallis test checking the statistical difference between the dataset obtained with the ring thickness of 0.33 cm and the ring thickness of 1.00 cm (H(1) = 3.85, p-value = .049), and between the dataset obtained with the ring thickness of 0.67 cm and the ring thickness of 1.00 cm (H(1) = 0.05, p-value = .82).

4. Modelling

Based on the experimental results, an image-based brightness model inspired by the physiology of the retina is proposed to predict the impact of the distance, the area and the luminance of the ring on the brightness perception of the stimulus. Physiologically based retinal models [46,47] adopt the following workflow: the cone excitation is compressed and the output is transmitted directly to a bipolar cell, representing the center signal. The horizontal cells connect several adjacent photoreceptors within the receptive field of the bipolar cells, creating a surround signal. The center signal and surround signal are subtracted and transmitted. Similar processes occur in the retina’s inner plexiform layer, leading to a response of the ganglion cells which is sent to the brain via the optic nerve [6]. This general physiologically based workflow is the main inspiration for the image-based brightness model described in the following section.

4.1 Proposed model framework

In this section, an image-based and retinal inspired model is proposed according to the framework illustrated in Fig. 3. The model starts from the pixel by pixel ${L_\alpha }$ map. In the first processing step, scattering in the eye is modelled, resulting in a slightly blurred image. This step is followed by a compression, the calculation of a receptive field based feedback signal and an adaptive shift. Finally, a brightness related correlate is calculated.


Fig. 3. The image-based brightness prediction framework illustrated by grayscale images (with the values in each image scaled between the minimum and the maximum pixel values of the image). The stimulus size and the ring size are adjusted for illustration purpose. The * symbol represents the convolution of the image with a filter kernel.


For the development of the model, a set of images has been created corresponding to each experimental scene. A resolution of 1280 ${\times}$ 675 pixels, corresponding to an angular resolution of around 4’, has been chosen. The local ${L_\alpha }$ value (${L_{\alpha ,test}}$, ${L_{\alpha ,ref}}$, ${L_{\alpha ,ring}}$ and ${L_{\alpha ,dark}} = 0$) is attributed to each corresponding pixel. The central stimulus used in the experiments corresponds to a diameter of 135 pixels in the image. This image is called the average cone excitations map ${I_\alpha }(x,y)$ (Fig. 4).
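A minimal sketch (in Python, with our own naming) of how such a cone excitation map could be rasterized is given below: a central disc at ${L_{\alpha ,test}}$ or ${L_{\alpha ,ref}}$, an optional concentric ring at ${L_{\alpha ,ring}}$ and a dark background.

```python
import numpy as np

def make_scene_map(L_stim, L_ring=0.0, ring_inner_px=None, ring_outer_px=None,
                   width=1280, height=675, stim_diam_px=135):
    """Average cone excitation map I_alpha(x,y): central disc, optional ring,
    dark (zero) background. Radii and diameters are given in pixels."""
    yy, xx = np.mgrid[0:height, 0:width]
    r = np.hypot(xx - width / 2.0, yy - height / 2.0)   # distance to image centre, px
    img = np.zeros((height, width), dtype=float)
    if ring_inner_px is not None and ring_outer_px is not None:
        img[(r >= ring_inner_px) & (r <= ring_outer_px)] = L_ring
    img[r <= stim_diam_px / 2.0] = L_stim
    return img

# e.g. the reference scene: I_alpha_ref = make_scene_map(L_stim=65.0)
```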


Fig. 4. An example of the input image: (a) The full image. (b) The value of each pixel from a line cut through the image.


4.2 Impact of scattering in the eye

The presence of a luminous ring in the vicinity of a circular stimulus can also generate a veiling luminance at the stimulus, which might influence the matching results and which needs to be corrected for. In fact, in a human eye, a point source is not imaged on the retina as a single point but is spread out. This is caused by several optical effects of the eye [48]. The most common way to describe the distribution of these scattering effects is by introducing the point spread function (PSF) as defined by CIE [49]:

$$\begin{aligned} PSF(\vartheta ) = \frac{L_{eq}(\vartheta )}{E} &= [1 - 0.08 \cdot (A/70)^4] \cdot \left[ \frac{9.2 \cdot 10^6}{[1 + (\vartheta /0.0046)^2]^{1.5}} + \frac{1.5 \cdot 10^5}{[1 + (\vartheta /0.045)^2]^{1.5}} \right]\\ &+ [1 + 1.6 \cdot (A/70)^4] \cdot \left\{ \left[ \frac{400}{1 + (\vartheta /0.1)^2} + 3 \cdot 10^{-8} \right] + p \cdot \left[ \frac{1300}{[1 + (\vartheta /0.1)^2]^{1.5}} + \frac{0.8}{[1 + (\vartheta /0.1)^2]^{0.5}} \right] \right\} + 2.5 \cdot 10^{-3} \cdot p \end{aligned}$$
in which E is the corneal illuminance generated by a central point source, ${L_{eq}}$ the equivalent luminance, $\vartheta$ the polar angle between the direction of the point source and the location of ${L_{eq}}$, A the observer’s age and p a pigmentation factor. The equivalent luminance is the luminance in the object scene that has the same visual effect on the retina in a perfect eye as the effect caused by scattering in a non-perfect eye [50]. The PSF varies over 8 decades from 0 to 10°; light scattering beyond 1° is called stray light [48].
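A direct Python transcription of Eq. (2) is sketched below; the default age of 35 and pigmentation factor of 0.5 are the values used later in our implementation, and the function name is our own.

```python
import numpy as np

def cie_psf(theta_deg, age=35.0, p=0.5):
    """Point spread function of Eq. (2), in sr^-1.
    theta_deg : angular distance from the point source (degrees),
    age : observer age A, p : pigmentation factor."""
    t = np.asarray(theta_deg, dtype=float)
    a4 = (age / 70.0) ** 4
    part1 = (1 - 0.08 * a4) * (9.2e6 / (1 + (t / 0.0046) ** 2) ** 1.5
                               + 1.5e5 / (1 + (t / 0.045) ** 2) ** 1.5)
    part2 = (1 + 1.6 * a4) * ((400.0 / (1 + (t / 0.1) ** 2) + 3e-8)
                              + p * (1300.0 / (1 + (t / 0.1) ** 2) ** 1.5
                                     + 0.8 / (1 + (t / 0.1) ** 2) ** 0.5))
    return part1 + part2 + 2.5e-3 * p
```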

Note that PSF is expressed in units $s{r^{ - 1}}$ and that the values are normalized. Indeed, the illuminance at the cornea created by the original point source should be equal to the illuminance generated by all the equivalent sources characterized by their ${L_{eq}}$. This illuminance can be calculated from basic photometry as:

$$E = \int\!\!\!\int {{L_{eq}}} (\vartheta )\cos \vartheta .d{\Omega _{src}}$$
with ${\Omega _{src}}$ the solid angle subtended by the source area.

By applying the definition of PSF in Eq. (2), Eq. (3) can be written as:

$$1 = \int\!\!\!\int {PSF} (\vartheta )\cos \vartheta .d{\Omega _{src}}$$

Equation (4) expresses the normalization condition for the PSF as applied by the CIE.

The model presented in this paper starts from the cone-weighted spectral radiance value ${L_\alpha }$ of each pixel while the PSF is defined in photometric quantities (Eq. (2)). This inconsistency can be easily solved because the PSF is in most cases only slightly wavelength dependent [51]. Under this assumption, the PSF can equivalently be defined as the ratio of the cone-weighted spectral radiance ${L_{\alpha ,eq}}$ and irradiance values ${E_\alpha }$.

In an image-based approach, scattering can be implemented by a convolution of the original ${L_\alpha }$ map with the PSF kernel defined per pixel. A similar concept has been adopted in the field of computer graphics to render highly realistic scenes [52–55], as well as to model how images are formed in the retina [56–58]. When applying the PSF proposed by the CIE in Eq. (2) as a filter kernel, discretization and truncation are required. For our implementation, the PSF is set to zero when $\vartheta > 10^\circ $ as the change in the values outside of that range is not significant. Consequently, to keep the corneal illuminance unchanged inside the kernel, a renormalization is also required and Eqs. (3)–(4) can be written as:

$$\begin{aligned} &{E_\alpha } = \sum\limits_x {\sum\limits_y {\frac{{{L_{\alpha ,eq}}(x,y){{\cos }^2}\vartheta (x,y){A_{pix}}}}{{{D^2}(x,y)}}} } \\ &\textrm{which is equivalent to}\\ &1 = \sum\limits_x {\sum\limits_y {\frac{{PSF^{\prime}(x,y){{\cos }^2}\vartheta (x,y){A_{pix}}}}{{{D^2}(x,y)}}} } \end{aligned}$$
with ${A_{pix}}$ the area of each pixel, $D(x,y)$ the distance from the pixel at position $(x,y)$ to the observer’s eye (40 cm in this experiment) and $PSF^{\prime}$ the re-normalized and re-scaled $PSF$. When transforming Eqs. (3)–(4) to Eq. (5), the classical expression of the solid angle of one pixel has been used.

The complete procedure to correct for stray light is as follows: for each pixel under consideration, the ${E_\alpha }$ generated by that pixel is calculated and multiplied by $PSF^{\prime}$ to obtain an intermediate image representing the equivalent cone-weighted spectral radiance generated by stray light from the pixel under consideration onto the neighboring pixels. This is repeated for all pixels and all intermediate images are added to obtain the final image ${I_{\alpha ,eq}}(x,y)$.

The effect of the convolution with $PSF^{\prime}$ is illustrated in Fig. 5 when applied on a uniform stimulus subtending 10° with ${L_{\alpha ,stim}} = 50$ and surrounded by a high luminance ring of thickness 1° and located at 6.2° off-center (a gap distance of 1.2°) and with ${L_{\alpha ,ring}} = 100$.


Fig. 5. The effect of convolving the input image with a PSF kernel: The top images show the full images, and the bottom images show the changes throughout a line in the full image. (a): The original input; (b) After convolution with PSF.


The net effect of the convolution is to slightly decrease the original pixel values of the ring and the central circle (by 4% and 0.05%, respectively) and to generate some values within the gap (up to 0.8% of the original pixel values of the ring). Note that both the circle and the ring generate stray light, which partly compensates each other. A fixed kernel has been used over the whole image for simplicity, while strictly speaking the kernel and the normalization should be recomputed for each pixel in the image. The impact of this simplification has been checked and is found to be only minor (around 2% difference).

In implementing our proposed brightness model, the correction for retinal stray light is applied by convolving ${I_\alpha }(x,y)$ with a $PSF^{\prime}$ kernel, defined with an observer age of 35 and a pigmentation factor of 0.5. No attempt was made to use an observer-specific kernel as the effect of the convolution is only minor for the experimental scenes under consideration. With the current image resolution, the kernel width of 10° to each side of the center of the FOV (or 20° in total) corresponds to 261 ${\times}$ 261 pixels. The normalization factor for the $PSF^{\prime}$ was found to be equal to 0.0564. The cone excitations after stray-light correction are denoted as ${L_{\alpha ,eq}}$ and the corresponding map is ${I_{\alpha ,eq}}(x,y)$.
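A possible fixed-kernel implementation of this stray-light correction, under the simplifications described above (single on-axis kernel, truncation at 10°, renormalization so that Eq. (5) holds), is sketched below. It relies on the cie_psf sketch given earlier; the function name, pixel-pitch argument and use of an FFT-based convolution are our own assumptions, not the authors' exact implementation.

```python
import numpy as np
from scipy.signal import fftconvolve

def straylight_correct(I_alpha, px_cm, dist_cm=40.0, age=35.0, p=0.5):
    """Fixed-kernel stray-light correction (simplified Eqs. (2)-(5)).
    I_alpha : cone excitation map, px_cm : pixel pitch in cm,
    dist_cm : viewing distance in cm. Uses cie_psf() from the earlier sketch."""
    # Kernel geometry: offsets up to 10 deg to each side of the kernel centre.
    half = int(np.ceil(np.tan(np.radians(10.0)) * dist_cm / px_cm))
    dy, dx = np.mgrid[-half:half + 1, -half:half + 1]
    r_cm = np.hypot(dx, dy) * px_cm
    theta = np.degrees(np.arctan(r_cm / dist_cm))        # off-axis angle per kernel cell
    d2 = dist_cm ** 2 + r_cm ** 2                        # squared eye-to-pixel distance
    omega = (px_cm ** 2) * (dist_cm ** 2 / d2) / d2      # cos^2(theta) * A_pix / D^2

    kernel = cie_psf(theta, age, p)
    kernel[theta > 10.0] = 0.0                           # truncation at 10 deg
    kernel /= np.sum(kernel * omega)                     # renormalization of Eq. (5)

    # Corneal irradiance contribution of each source pixel (on-axis approximation),
    # spread over its neighbourhood by PSF'; summing all contributions = convolution.
    E_map = I_alpha * (px_cm ** 2) / dist_cm ** 2
    return fftconvolve(E_map, kernel, mode="same")
```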

4.3 Compression of cone responses

A non-linear compression of the cone responses is widely believed to be one of the earliest steps in visual processing [10,46,59,60]. Cone compression is commonly modelled with a sigmoidal curve [9,10] or a cubic root [17]. In this step of the model, the equivalent cone excitation values are compressed using a cube root function. This compression has been shown to perform well in CAM15u [17] and avoids the issues of a log compression for a completely dark pixel (${L_\alpha }$ = 0):

$${L_{\alpha ,c}} = {L_{\alpha ,eq}}^{1/3}$$

The image containing ${L_{\alpha ,c}}$ values of each pixel can be considered as a compressed cone excitation map designated as ${I_{\alpha ,c}}(x,y)$.

4.4 Receptive field response

The signal generated by the central pixel under consideration ${L_{\alpha ,c}}(x,y)$ is considered as the center signal of the receptive field, reflecting the fact that near the fovea a one-to-one connection from the cone to the bipolar cell is assumed [61].

The surround feedback signal strength from the receptive field (representing the horizontal cell connection) is modelled as a weighted Gaussian response generated by the neighboring pixels, using a fixed standard deviation. To this end, a Gaussian filtered image is computed by convolving ${I_{\alpha ,c}}(x,y)$ with a Gaussian kernel, where the Gaussian kernel $G(x,y)$ at pixel $(x,y)$ is expressed as:

$$G(x,y) = WF \cdot \frac{e^{-\frac{x^2 + y^2}{2\delta^2}}}{k}$$
with k a normalization factor such that the sum of all elements inside the discrete Gaussian kernel is 1. The discrete Gaussian kernel is truncated at 4$\times$δ, which corresponds to the width of the receptive field; WF models the overall strength of the feedback from the receptive field relative to the central contribution. Both δ and WF are parameters to be optimized. The resulting image is designated as ${I_{\alpha ,G}}(x,y)$.
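Below is a short Python sketch of the compression step of Eq. (6) and of the surround feedback of Eq. (7); the kernel is a unit-sum Gaussian, truncated to a total width of 4×δ and scaled by WF. The function names, the default values (the optimized values reported in Section 4.6) and the use of an FFT-based convolution are our own choices.

```python
import numpy as np
from scipy.signal import fftconvolve

def compress(I_alpha_eq):
    """Cube-root cone compression, Eq. (6)."""
    return np.cbrt(I_alpha_eq)

def surround_feedback(I_alpha_c, delta_px=151, wf=1.9):
    """Receptive-field surround signal, Eq. (7): WF-weighted, unit-sum Gaussian
    truncated at a total width of 4*delta (i.e. +/- 2*delta around the centre)."""
    half = 2 * delta_px
    dy, dx = np.mgrid[-half:half + 1, -half:half + 1]
    g = np.exp(-(dx ** 2 + dy ** 2) / (2.0 * delta_px ** 2))
    g = wf * g / g.sum()                  # division by k makes the plain Gaussian sum to 1
    return fftconvolve(I_alpha_c, g, mode="same")
```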

4.5 Adaptive shift and brightness output

To calculate a brightness correlate, a sigmoid function is applied as in the classical CAMs. The semi-saturation constant consists of a dark-adapted value ${\sigma _0}$ to which the feedback signal strength is added as an adaptive shift, lowering the original central pixel output and modelling the inhibitory effect. This results in a brightness image ${I_Q}$ with brightness values between 0 and 1:

$${I_Q}(x,y) = \frac{{I_{\alpha ,c}^n(x,y)}}{{I_{\alpha ,c}^n(x,y) + {{({\sigma _0} + {I_{\alpha ,G}}(x,y))}^n}}}$$
with n a parameter modelling the steepness of the sigmoid.
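A sketch of the brightness computation of Eq. (8), together with the order in which the previous sketches would be chained for one scene map, is given below (again with our own, hypothetical helper names and an assumed pixel pitch).

```python
def brightness_map(I_alpha_c, I_alpha_G, sigma0=3.0, n=0.58):
    """Brightness correlate of Eq. (8): sigmoid whose semi-saturation is shifted
    by the dark-adapted constant sigma0 plus the surround feedback."""
    return I_alpha_c ** n / (I_alpha_c ** n + (sigma0 + I_alpha_G) ** n)

# Chaining the steps for one scene map I_alpha (pixel pitch of ~4' at 40 cm assumed):
# I_eq = straylight_correct(I_alpha, px_cm=0.047)
# I_c  = compress(I_eq)
# I_G  = surround_feedback(I_c)
# I_Q  = brightness_map(I_c, I_G)
```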

4.6 Determining the model parameters

The parameters included in the model are the width of the receptive field δ, the strength of the feedback from the receptive field WF, the semi-saturation constant in dark conditions ${\sigma _0}$ and the steepness of the sigmoid n. All the steps of the model have been applied to both the reference scene and the 36 test scenes. As the experimental data were collected using the method of adjustment (brightness matching), an optimization is performed to find the optimal δ, WF and ${\sigma _0}$ values such that the mean brightness values of the pixels belonging to the stimulus in the output image ${I_Q}$ of each test scene are as close as possible to those of the stimulus in the reference scene. Due to this approach, the value of n cannot be determined from the experiments. Given that the choice of n does not influence the final optimization result, the value of n is chosen as n = 0.58 as suggested by CAM18sl [16].

From Fig. 2, it is observed that even for the furthest ring with the thickness of 0.33 cm, there is still an impact of the ring on the brightness of the central stimulus. For this reason, the optimization range of δ is chosen such that the width of the Gaussian kernel can cover the furthest ring; the maximum width of the kernel was taken as wide as the smaller dimension of the image. The optimization was performed using a MATLAB built-in optimization routine and the result indicates that for a δ of 151 pixels, a WF of 1.9 and a ${\sigma _0}$ of 3.0, the model gives the best approximation to all the experimental data.
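The optimization itself can be sketched as follows: the cost function is the RMSE between the mean stimulus brightness of each test scene and that of the reference scene, minimized over (δ, WF, σ0). The paper only states that a MATLAB built-in routine was used; the Nelder-Mead call below is our own illustrative choice, building on the hypothetical helpers from the earlier sketches.

```python
import numpy as np
from scipy.optimize import minimize

def matching_cost(params, test_maps, ref_map, stim_mask, px_cm=0.047):
    """RMSE between the mean stimulus brightness of the test scenes and of the
    reference scene for candidate parameters (delta, WF, sigma0)."""
    delta, wf, sigma0 = params

    def mean_stim_Q(scene):
        I_c = compress(straylight_correct(scene, px_cm))
        I_G = surround_feedback(I_c, delta_px=int(round(delta)), wf=wf)
        return brightness_map(I_c, I_G, sigma0=sigma0)[stim_mask].mean()

    q_ref = mean_stim_Q(ref_map)
    q_test = np.array([mean_stim_Q(m) for m in test_maps])
    return np.sqrt(np.mean((q_test - q_ref) ** 2))

# result = minimize(matching_cost, x0=[150.0, 2.0, 3.0],
#                   args=(test_maps, ref_map, stim_mask), method="Nelder-Mead")
```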

The mean ${I_Q}$ values of the pixels inside the stimulus for the reference and for the test scenes using the optimal parameters are plotted in Fig. 6.


Fig. 6. Output of ${I_Q}$ as a function of distance for different ring luminance levels and ring widths.


Ideally, for a matching experiment, the ${I_Q}$ values of the test scenes in which the gap distance, luminance and width of the ring are changed should be equal to the value of the reference scene. Overall, the model works very well, with a root mean squared error (RMSE) between the brightness output of the test scenes and that of the reference scene of 0.0018, which corresponds to 0.4% of the reference stimulus brightness. However, for the highest ring luminance level and the closest ring distance, the model generally underestimates the ring’s influence, resulting in a slightly higher ${I_Q}$ value than the reference. Note that by adding the stray-light correction to the model, a significant improvement in the model’s performance is found when predicting the impact of the closest ring at higher luminance levels (RMSE = 0.0029 without stray-light correction). The impact of stray light will become even more important when studying scenes containing high luminance areas, which also justifies the inclusion of this step in the model. Nevertheless, this straightforward model with a limited number of free parameters is able to predict the matching experiments.

An example of the brightness output map is given in Fig. 7.


Fig. 7. The brightness output map: (a) The full image. (b) The value of each pixel from a line cut through the image.


5. Discussion

5.1 Receptive field size

The optimized kernel standard deviation δ of 151 pixels implies that the width of the Gaussian kernel truncated at 4${\times}$δ is 604 pixels or approximately 45° in FOV. Typically, the receptive fields in the retina have a relatively small size, ranging from a few arc minutes (near the fovea) to a bit more than 10° in the periphery [62]. This does not fit with the large filter kernel size from the optimization, but the large filter size might correspond to the receptive field in the visual cortex, where the receptive field in the medial superior temporal area can reach a size of 30°–50° [63,64]. The large receptive field width is also coherent with the conclusions of previous studies [7,8,13], where a filter with a large size ranging from half to the full size of the input image is implemented to account for the luminance adaptation. This suggests that the process of perceiving brightness is determined by a large-scale adaptation, which considers the context of the whole environment in which the stimulus is viewed. With such a wide receptive field, the effect of involuntary eye movements will be masked as this contribution involves blurring on a much smaller scale of typically 1° in FOV [56]. Note that the wide receptive field is also the reason that the ${I_{\alpha ,G}}$ and ${I_Q}$ values do not change very much over the stimulus area itself and no distinctive brightness jumps at the edges are observed, in line with the visual observations (Fig. 7).

From Fig. 6, a slight under-performance of the model when the rings are presented at the closest distance can be observed. This suggests that at such small distances, the impact might not simply be considered as an adaptation state, but other effects such as simultaneous contrast should also be taken into consideration. Moreover, one should also consider that when the ring comes closer to the fovea, the receptive field size is smaller and the inhibition becomes stronger [62].

Additionally, it is noticeable from Fig. 7 that the ${I_Q}$ values for the pixels belonging to the dark gap between the ring and the stimulus are no longer completely dark and they receive a rather significant brightness value. However, the dark gap between the ring and the stimulus was not reported as luminous by the observers. Again, it appears that at the closer distance, a stronger contrast perception effect might be active, suggesting that an additional receptive field mechanism characterized by a smaller size should probably be considered.

5.2 Self-adaptation

The ${I_Q}$ value of the stimulus in the reference situation (complete dark background) is very close to 0.5, which points to the fact that the ${I_{\alpha ,c}}(x,y)$ values of the pixels of the reference are close to the sum of the dark semi-saturation value ${\sigma _0}$ and the feedback signal strength ${I_{\alpha ,G}}(x,y)$ which is only due to the stimulus itself. If one would ignore this feedback (by putting WF equal to zero), ${I_Q}$ values equal to around 0.55 would have been obtained. This illustrates the effect of adaptation to the stimulus itself.
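As a rough back-of-the-envelope check of our own (ignoring stray light and using the values quoted above, ${L_{\alpha ,ref}} = 65$, ${\sigma _0} = 3.0$ and n = 0.58):

$$L_{\alpha,c} = 65^{1/3} \approx 4.02, \qquad {I_Q}\big|_{WF = 0} \approx \frac{4.02^{0.58}}{4.02^{0.58} + 3.0^{0.58}} = \frac{2.24}{2.24 + 1.89} \approx 0.54,$$

which is indeed close to the value of about 0.55 obtained when the self-feedback is switched off, while adding the stimulus-driven feedback ${I_{\alpha ,G}}$ to the semi-saturation constant brings the value down to about 0.5.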

5.3 Validation of the model

As the model is established based on rather basic and simple non-uniform backgrounds, i.e. a luminous ring in a dark area, it is worthwhile to verify the performance of the model by applying it to a few existing datasets from other experiments which studied the brightness perception of self-luminous stimuli. Sun et al. [24] presented a study about the influence of background luminance and background size on brightness perception. In their study, a brightness evaluation experiment using the magnitude estimation method was performed. Within the scope of the study, the background was defined as the area immediately adjacent to the stimulus and the surround as the remaining area of the screen used in the experiment, adjacent to the background. The experiment was set up with three different stimulus luminance levels (19, 88 and 227 cd/m2), 3 background luminance levels (0.09, 88 and 478 cd/m2), 2 surround luminance levels (0.09 and 478 cd/m2), 4 background sizes (0%, 12.5%, 50% and 100% of the screen size) and 3 background orientations (horizontal rectangle, vertical rectangle and 16:9 square).

Based on their experimental details, a set of virtual images was created with a resolution of 1280 ${\times}$ 675 pixels and a central circular stimulus with a diameter of 52 pixels, corresponding to a FOV of 4° when viewed from a distance of 40 cm. As the stimuli and the backgrounds described in that study are neutral grey, the ${L_\alpha }$ values for the input images were chosen equal to the luminance levels of the stimuli and the background described in the corresponding reference. Based on the nature of the experiment in our study, only the situations with a luminous background and a dark surround are used in this evaluation. ${L_\alpha }$ of the stimulus was chosen as 19, 88 and 227, and ${L_\alpha }$ of the background was chosen as 478. The result of the simulation is shown in Fig. 8.


Fig. 8. The brightness output under different stimulus luminance and background sizes with a background luminance of 478 cd/m2: (Above) The psychophysics experimental result. (Below) The output of our proposed model.


With a Spearman’s rank correlation coefficient of 0.8281 when comparing the model’s output with the visual data of Sun et al., there is a reasonably good agreement between the two datasets. It is observed that the model is generally capable of predicting the decrease in perceived brightness when the luminous background increases in size. The stabilization of the perceived brightness when the background size enlarges from 50% to 100% of the screen size is also predicted; however, in the original study the most significant brightness decrease happened when the background size went from 0% to 12.5% of the screen, an effect which is less pronounced in the model prediction. This again suggests that an additional smaller receptive field-based interaction, represented as a filter with a smaller standard deviation, might be needed in the model to emphasize short distance effects. According to Sun et al. [24], there is almost no effect of background orientation for all background sizes. This is confirmed by the model as long as the background is large (50% or 100% of the image). When the background size is at 12.5% of the screen size, the model predicts a brightness difference of 0.1 when changing from a vertical to a horizontal, and from a horizontal to a square background. This is possibly due to the changes in the number of luminous pixels contributing to the kernel with changing background orientation when convolving the image with a large filter kernel.

The concept of a neutral luminous ring in a dark background can be easily extended to a neutral and uniform luminous background by increasing the width of the ring. Under these conditions, the model CAM18sl should be applied. In establishing this model, Hermans et al. [40] performed experiments in which a uniform stimulus of 10° was seen on a uniform self-luminous background. Six stimulus luminance levels were chosen as 50, 125, 250, 500, 750 and 900 cd/m2 and 15 background luminance levels were chosen between 0 and 960 cd/m2. The brightness of the stimulus was evaluated by magnitude estimation with respect to a reference stimulus; 20 observers participated in the experiment.

Based on the experimental details described in the paper, a set of virtual images was created with a resolution of 1280 ${\times}$ 675 pixels and a central circular stimulus with a diameter of 135 pixels, corresponding to a FOV of 10° when viewed from a distance of 40 cm. The ${L_\alpha }$ values chosen for the stimulus and the background were the same as the luminance values of the stimulus and the background in the experiment by Hermans et al. [40]. In total, 90 virtual scenes were used for this evaluation. The output of the model in relation to the background luminance was compared to the experimental brightness data Qobs (Fig. 9). In Fig. 10, the model prediction is compared to the outcome of the visual experiment results and to the prediction of CAM18sl.


Fig. 9. Brightness output for the tested scenes from: (a) Experimental results of Hermans et al. [40]; (b) Brightness prediction of our model



Fig. 10. The brightness output of the model as a function of the brightness from (a) Experimental data; (b) CAM18sl prediction. The icons inside the red circles correspond to the unrelated stimuli.


From Fig. 9, it can be seen that the image-based model is able to reproduce the dependency of the perceived brightness on the stimulus and background luminance: when the background luminance increases, the brightness of the stimulus decreases, and the stimulus brightness increases with increasing stimulus luminance. The model also succeeds in predicting the sharp fall in brightness when the background goes from completely dark to 50 cd/m2. From Fig. 10, the proposed model is also in good agreement with the output from the observers and from CAM18sl, with Spearman’s rank correlation coefficients of 0.9870 and 0.9987, respectively. However, there is a slight overestimation of the brightness of unrelated stimuli or, alternatively, a systematic underestimation for the scenes with a luminous background. It is worth noting that the model is developed based on rather small luminous background areas; applying the model to scenes with a very large background size, as is the case for the Hermans et al. data, is quite challenging.

It is also observed that there is no linear relationship between the output of our model and that of CAM18sl, which is believed to be the result of the two-step compression used in our model, while CAM18sl uses only a single compression step.

6. Conclusion

Various CAMs have been developed to predict how humans perceive colors from the optical input of the stimuli. However, the applications of traditional CAMs are still limited to a uniform stimulus and background [16]. To extend the applications of such CAMs to non-uniform backgrounds, image-based CAMs have been created; yet, there are still some shortcomings when they are applied to self-luminous scenes [14]. This leads to the need for a comprehensive image-based color appearance model which can overcome the limitations of current CAMs and image CAMs when working with self-luminous scenes; such a model could be called a Lighting Appearance Model (LAM). To move towards developing a LAM, we believe that the first step is to create a comprehensive brightness model for non-uniform backgrounds including self-adaptation.

In this paper, a series of visual brightness experiments have been conducted to study the impact of introducing a luminous ring-shaped area to a dark background on the brightness perception of a central stimulus. Three parameters of the luminous ring have been studied, including the distance from the central stimulus to the ring, the thickness of the ring and the luminance of the ring. In line with various studies, it is clearly shown that by adding a luminous ring to the scene, the brightness of the central stimulus decreases substantially, even for the smallest thickness (0.33 cm) and the largest stimulus-to-ring distance (16.1°). This phenomenon appears to be stronger when the ring is closer to the stimulus and the impact also increases with the increasing luminance of the ring. The study also confirmed the area effect [24], which was already reported in the literature: as the area of the luminous ring increases, the target stimulus appears to be darker.

An image-based model to simulate the observed phenomenon is proposed, which is strongly inspired by the basic physiology of the retina. The model includes cube root compression, scattering in the eye, a receptive field concept, inhibition by neighboring pixels and a sigmoid compression. The model has been applied to the experimental brightness data and three parameters are optimized: the width of a Gaussian kernel mimicking the surround signal of the receptive field, the overall weighting factor representing the inhibition strength and the semi-saturation constant for a dark-adapted environment. A standard deviation of 151 pixels, representing a receptive field with a coverage of up to 45°, and a weighting factor of 1.9 give the best performance of the model. The large receptive field width suggests that the brightness perception and the adaptation to the ring might be the result of processing at a later stage of the visual pathway such as the visual cortex, where the receptive field can reach sizes up to 50° [63,64]. This large filter size is also in line with previous findings [7,8,13], which supports the idea that the adaptation is mainly linked to the global context in which the stimulus is viewed. The model performs well in predicting the effect of the area, the distance and the intensity of the ring, though there is some underestimation of the effects at the closest ring distance, which suggests the need for an additional but smaller receptive field mechanism in the model to simulate more local effects.

Additionally, the performance of the model is evaluated using self-luminous scenes created based on previous studies about brightness for self-luminous stimuli [24,40]. The results show that the model can predict the perceived brightness behavior of those studies, illustrating its robustness. As such, the proposed image-based model appears promising to deal with non-uniform stimuli and complex scenes and can be considered as an important step in search of a generic LAM.

Future work will concentrate on adapting the size and the weighting factor of the Gaussian kernel applied to both the center and the surround signal according to the retinal position, in order to consider both short- and long-range receptive field sizes, on including a larger range of luminance levels and on including colored stimuli and backgrounds.

Funding

Onderzoeksraad, KU Leuven (C24/17/051); KU Leuven Internal Funds

Disclosures

The authors declare no conflicts of interest.

Data Availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. Y. Nayatani, K. Takahama, and H. Sobagaki, “Prediction of color appearance under various adapting conditions,” Color Res. Appl. 11(1), 62–71 (1986). [CrossRef]  

2. R. W. G. Hunt, The Reproduction of Colour (John Wiley & Sons, Ltd, 2004).

3. M. R. Luo and R. W. G. Hunt, “The structure of the CIE 1997 Colour Appearance Model (CIECAM97s),” Color Res. Appl. 23(3), 138–146 (1998). [CrossRef]  

4. N. Moroney, M. D. Fairchild, R. W. Hunt, C. Li, M. Ronnier Luo, and T. Newman, “The CIECAM02 color appearance model,” in 10th Color and Imaging Conference Final Program and Proceedings (2002), pp. 23–27.

5. C. Li, Z. Li, Z. Wang, Y. Xu, M. R. Luo, G. Cui, M. Melgosa, M. H. Brill, and M. Pointer, “Comprehensive color solutions: CAM16, CAT16, and CAM16-UCS,” Color Res. Appl. 42(6), 703–718 (2017). [CrossRef]  

6. M. D. Fairchild, Color Appearance Models, 2nd ed. (John Wiley & Sons, Ltd, 2013).

7. M. D. Fairchild and G. M. Johnson, “Meet iCAM: A next-generation color appearance model,” in 10th Color and Imaging Conference Final Program and Proceedings (2002), pp. 33–38.

8. J. Kuang, G. M. Johnson, and M. D. Fairchild, “iCAM06: A refined image appearance model for HDR image rendering,” Journal of Visual Communication and Image Representation 18(5), 406–414 (2007). [CrossRef]  

9. E. Reinhard, T. Pouli, T. Kunkel, B. Long, A. Ballestad, and G. Damberg, “Calibrated image appearance reproduction,” ACM Trans. Graph. 31(6), 1–11 (2012). [CrossRef]  

10. L. Meylan, D. Alleysson, and S. Süsstrunk, “Model of retinal local adaptation for the tone mapping of color filter array images,” J. Opt. Soc. Am. A 24(9), 2807 (2007). [CrossRef]  

11. A. Benoit, D. Alleysson, J. Herault, and P. Le Callet, “Spatio-temporal tone mapping operator based on a retina model,” in Computational Color Imaging (Springer Berlin Heidelberg, 2009), pp. 12–22. [CrossRef]  

12. Ø. Kolås, I. Farup, and A. Rizzi, “Spatio-temporal retinex-inspired envelope with stochastic sampling: A framework for spatial color algorithms,” Journal of Imaging Science and Technology Vol. 55, 4050310–4050311 (2011). [CrossRef]  

13. E. Provenzi, “Perceptual color correction: A variational perspective,” in Computational Color Imaging (Springer Berlin Heidelberg, 2009), 109–119.

14. T. H. Phung, F. B. Leloup, K. A. G. Smet, and P. Hanselaer, “Assessing the application of an image color appearance model to basic self-luminous scenes,” Color Res. Appl. 44(6), 848–858 (2019). [CrossRef]  

15. E. H. Land, “The Retinex Theory of Color Vision,” Sci. Am. 237(6), 108–128 (1977). [CrossRef]  

16. S. Hermans, K. A. G. Smet, and P. Hanselaer, “Color appearance model for self-luminous stimuli,” J. Opt. Soc. Am. A 35(12), 2000–2009 (2018). [CrossRef]  

17. M. Withouck, K. A. G. Smet, W. R. Ryckaert, and P. Hanselaer, “Experimental driven modelling of the color appearance of unrelated self-luminous stimuli: CAM15u,” Opt. Express 23(9), 12045 (2015). [CrossRef]  

18. S. Hermans, K. A. G. Smet, and P. Hanselaer, “Exploring the applicability of the CAM18sl brightness prediction,” Opt. Express 27(10), 14423–14436 (2019). [CrossRef]  

19. O. U. Preciado, A. Martin, E. Manzano, K. A. G. Smet, and P. Hanselaer, “CAM18sl brightness prediction for unrelated saturated stimuli including age effects,” Opt. Express 29(18), 29257–29274 (2021). [CrossRef]  

20. Y. Lin, Y. Liu, Y. Sun, X. Zhu, J. Lai, and I. Heynderickx, “Model predicting discomfort glare caused by LED road lights,” Opt. Express 22(15), 18056–71 (2014). [CrossRef]  

21. P. Whittle, “Brightness, discriminability and the “Crispening Effect”,” Vision Res. 32(8), 1493–1507 (1992). [CrossRef]  

22. P. Whittle and P. D. C. Challands, “The effect of background luminance on the brightness of flashes,” Vision Res. 9(9), 1095–1110 (1969). [CrossRef]  

23. J. C. Stevens, “Brightness inhibition re size of surround,” Percept. Psychophys. 2(5), 189–192 (1967). [CrossRef]  

24. P. L. Sun, H. C. Li, and M. Ronnier Luo, “Background luminance and subtense affects color appearance,” Color Res. Appl. 42(4), 440–449 (2017). [CrossRef]  

25. H. Leibowitz, F. A. Mote, and W. R. Thurlow, “Simultaneous contrast as a function of separation between test and inducing fields,” J. Exp. Psychol. 46(6), 453–456 (1953). [CrossRef]  

26. R. Carter, L. Sibert, J. Templeman, and J. Ballas, “Luminous backgrounds and frames affect gray scale lightness, threshold, and suprathreshold discriminations,” J. Exp. Psychol. Appl. 5(2), 190–204 (1999). [CrossRef]  

27. R. Clay Reid and R. Shapley, “Brightness induction by local contrast and the spatial dependence of assimilation,” Vision Res. 28(1), 115–132 (1988). [CrossRef]  

28. S. K. Shevell, I. Holliday, and P. Whittle, “Two separate neural mechanisms of brightness induction,” Vision Res. 32(12), 2331–2340 (1992). [CrossRef]  

29. F. Kingdom and B. Moulden, “A multi-channel approach to brightness coding,” Vision Res. 32(8), 1565–1582 (1992). [CrossRef]  

30. J. A. McArthur and B. Moulden, “A two-dimensional model of brightness perception based on spatial filtering consistent with retinal processing,” Vision Res. 39(6), 1199–1219 (1999). [CrossRef]  

31. J. J. McCann and R. Savoy, “Measurements of lightness: dependence on the position of a white in the field of view,” in Human Vision, Visual Processing, and Digital Display II (SPIE, 1991), Vol. 1453, pp. 402–411.

32. M. E. Rudd, “Lightness computation by the human visual system,” J. Electron. Imaging 26(3), 031209 (2017). [CrossRef]  

33. D. H. Brainard, “The Psychophysics Toolbox,” Spat. Vis. 10(4), 433–436 (1997). [CrossRef]  

34. D. G. Pelli, “The VideoToolbox software for visual psychophysics: Transforming numbers into movies,” Spat. Vis. 10(4), 437–442 (1997). [CrossRef]  

35. M. Kleiner, D. H. Brainard, and D. G. Pelli, “What’s new in Psychtoolbox-3?,” Perception 36, ECVP Abstract Supplement (2007).

36. MATLAB version 9.6.0 (R2019a) (The MathWorks Inc., Natick, Massachusetts, 2019).

37. International Commission on Illumination, Fundamental Chromaticity Diagram with Physiological Axes – Part 1 (Commission Internationale de l’Éclairage, 2006).

38. A. Stockman, L. T. Sharpe, and C. Fach, “The spectral sensitivity of the human short-wavelength sensitive cones derived from thresholds and color matches,” Vision Res. 39(17), 2901–2927 (1999). [CrossRef]  

39. A. Stockman and L. T. Sharpe, “The spectral sensitivities of the middle- and long-wavelength-sensitive cones derived from measurements in observers of known genotype,” Vision Res. 40(13), 1711–1737 (2000). [CrossRef]  

40. S. Hermans, K. A. G. Smet, and P. Hanselaer, “Brightness Model for Neutral Self-Luminous Stimuli and Backgrounds,” LEUKOS 14(4), 231–244 (2018). [CrossRef]  

41. P. A. García, R. Huertas, M. Melgosa, and G. Cui, “Measurement of the relationship between perceived and computed color differences,” J. Opt. Soc. Am. A 24(7), 1823 (2007). [CrossRef]  

42. M. Withouck, K. A. G. Smet, and P. Hanselaer, “Brightness prediction of different sized unrelated self-luminous stimuli,” Opt. Express 23(10), 13455–13466 (2015). [CrossRef]  

43. W. J. Huang, Y. Yang, and M. R. Luo, “Verification of the CAM15u colour appearance model and the QUGR glare model,” Light. Res. Technol. 51(1), 24–36 (2019). [CrossRef]  

44. C. Fu, C. Li, G. Cui, M. R. Luo, R. W. G. Hunt, and M. R. Pointer, “An investigation of colour appearance for unrelated colours under photopic and mesopic vision,” Color Res. Appl. 37(4), 238–254 (2012). [CrossRef]  

45. B. Koo and Y. Kwak, “Color appearance and color connotation models for unrelated colors,” Color Res. Appl. 40(1), 40–49 (2015). [CrossRef]  

46. A. Wohrer and P. Kornprobst, “Virtual Retina: A biological retina model and simulator, with contrast gain control,” J. Comput. Neurosci. 26(2), 219–249 (2009). [CrossRef]  

47. P. Martínez-Cañada, C. Morillas, J. L. Nieves, B. Pino, and F. Pelayo, “First stage of a human visual system simulator: The retina,” in Computational Color Imaging (Springer Verlag, 2015), pp. 118–127.

48. T. J. T. P. van den Berg, L. Franssen, and J. E. Coppens, “Ocular Media Clarity and Straylight,” in Encyclopedia of the Eye (Elsevier, 2010), pp. 173–183.

49. J. J. Vos and T. J. T. P. van den Berg, “Report on disability glare,” CIE Collection 135, 1–9 (1999).

50. J. J. Vos, “Disability Glare – A state of the art report,” CIE J. 3, 39–53 (1984).

51. J. E. Coppens, L. Franssen, and T. J. T. P. van den Berg, “Wavelength dependence of intraocular straylight,” Exp. Eye Res. 82(4), 688–692 (2006). [CrossRef]  

52. G. Spencer, P. Shirley, K. Zimmerman, and D. P. Greenberg, “Physically-based glare effects for digital images,” in Proceedings of the ACM SIGGRAPH Conference on Computer Graphics (1995), pp. 325–334.

53. R. Raskar, A. Agrawal, C. A. Wilson, and A. Veeraraghavan, “Glare aware photography,” in ACM SIGGRAPH 2008 Papers on - SIGGRAPH ‘08 (ACM Press, 2008), pp. 1–10.

54. H. Ando, N. Torigoe, K. Toriyama, and K. Ichimiya, “Real-time rendering of high quality glare images using vertex texture fetch on GPU,” in GRAPP 2006 - Proceedings of the 1st International Conference on Computer Graphics Theory and Applications (2006), pp. 19–25.

55. A. Yoshida, M. Ihrke, R. Mantiuk, and H. P. Seidel, “Brightness of the glare illusion,” in APGV 2008 - Proceedings of the Symposium on Applied Perception in Graphics and Visualization (ACM Press, 2008), pp. 83–89.

56. R. A. Normann, B. S. Baxter, H. Ravindra, and P. J. Anderton, “Photoreceptor Contributions to Contrast Sensitivity: Applications in Radiological Diagnosis,” IEEE Trans. Syst., Man, Cybern. SMC-13(5), 944–953 (1983). [CrossRef]  

57. D. C. Hood, T. Ilves, E. Maurer, B. Wandell, and E. Buckingham, “Human cone saturation as a function of ambient intensity: A test of models of shifts in the dynamic range,” Vision Res. 18(8), 983–993 (1978). [CrossRef]  

58. J. J. McCann and V. Vonikakis, “Calculating retinal contrast from scene content: A program,” Front. Psychol. 8, 2079 (2018). [CrossRef]  

59. T. Kunkel and E. Reinhard, “A neurophysiology-inspired steady-state color appearance model,” J. Opt. Soc. Am. A 26(4), 776 (2009). [CrossRef]  

60. M. Kamermans, D. A. Kraaij, and H. Spekreijse, “The cone/horizontal cell network: a possible site for color constancy,” Vis. Neurosci. 15(5), 787–797 (1998). [CrossRef]  

61. J. V. Forrester, A. D. Dick, P. G. McMenamin, F. Roberts, and E. Pearlman, “Anatomy of the eye and orbit,” in The Eye (Elsevier, 2016), pp. 1–102.e2.

62. L. J. Croner and E. Kaplan, “Receptive fields of P and M ganglion cells across the primate retina,” Vision Res. 35(1), 7–24 (1995). [CrossRef]  

63. K. Amano, B. A. Wandell, and S. O. Dumoulin, “Visual field maps, population receptive field sizes, and visual field coverage in the human MT+ complex,” J. Neurophysiol. 102(5), 2704–2718 (2009). [CrossRef]  

64. S. Raiguel, M. M. Van Hulle, D. K. Xiao, V. L. Marcar, L. Lagae, and G. A. Orban, “Size and shape of receptive fields in the medial superior temporal area (MST) of the macaque,” Neuroreport 8(12), 2803–2808 (1997). [CrossRef]  

Data Availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Figures (10)

Fig. 1. Pictures of the experiment set-up. (a) A view of the experiment set-up showing the shield and the screen; the reference stimulus is displayed at the left half and the test stimulus at the right half. (b) The top view of the set-up. The blue circles indicate where the observer should position their head when viewing the experimental scenes.
Fig. 2. The ratio of the average cone-weighted spectral radiance of the matched stimuli as a function of the angular distance from the stimulus edge to the ring, for different ring thicknesses and ring luminance levels: (a) without error bars; (b) with error bars. The error bars represent the standard error.
Fig. 3. The image-based brightness prediction framework illustrated by grayscale images (with the values in each image scaled between the minimum and maximum pixel values of that image). The stimulus size and the ring size are adjusted for illustration purposes. The * symbol represents the convolution of the image with a filter kernel.
Fig. 4. An example of the input image: (a) the full image; (b) the value of each pixel along a line cut through the image.
Fig. 5. The effect of convolving the input image with a PSF kernel. The top images show the full images, and the bottom images show the values along a line through the full image: (a) the original input; (b) after convolution with the PSF.
Fig. 6. Output of I_Q as a function of distance for different ring luminance levels and ring widths.
Fig. 7. The brightness output map: (a) the full image; (b) the value of each pixel along a line cut through the image.
Fig. 8. The brightness output for different stimulus luminances and background sizes with a background luminance of 478 cd/m2: (above) the psychophysical experimental result; (below) the output of the proposed model.
Fig. 9. Brightness output for the tested scenes from: (a) the experimental results of Hermans et al. [40]; (b) the brightness prediction of the proposed model.
Fig. 10. The brightness output of the model as a function of the brightness from: (a) the experimental data; (b) the CAM18sl prediction. The icons inside the red circles correspond to the unrelated stimuli.

Equations (8)

Equations on this page are rendered with MathJax.

$$L_\rho = 686.7 \int_{390}^{830} L_{e,\lambda}(\lambda)\,\bar{l}_{10}(\lambda)\,d\lambda, \qquad L_\gamma = 768.3 \int_{390}^{830} L_{e,\lambda}(\lambda)\,\bar{m}_{10}(\lambda)\,d\lambda, \qquad L_\beta = 1366.1 \int_{390}^{830} L_{e,\lambda}(\lambda)\,\bar{s}_{10}(\lambda)\,d\lambda$$
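As an illustration, the cone-fundamental weighting of Eq. (1) can be approximated numerically. The sketch below assumes the spectral radiance and the 10° cone fundamentals are available as arrays sampled on a common wavelength grid; the function and variable names are illustrative, not taken from the paper.

import numpy as np

def cone_weighted_radiance(wavelengths_nm, spectral_radiance, l10, m10, s10):
    # Eq. (1): cone-fundamental weighted radiances, by numerical integration.
    # wavelengths_nm    : sampled wavelengths, e.g. 390..830 nm
    # spectral_radiance : L_e,lambda on that grid (W m^-2 sr^-1 nm^-1)
    # l10, m10, s10     : 10-degree cone fundamentals on the same grid
    dl = np.gradient(wavelengths_nm)            # local wavelength step (nm)
    L_rho   = 686.7  * np.sum(spectral_radiance * l10 * dl)
    L_gamma = 768.3  * np.sum(spectral_radiance * m10 * dl)
    L_beta  = 1366.1 * np.sum(spectral_radiance * s10 * dl)
    return L_rho, L_gamma, L_beta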
$$\mathrm{PSF}(\vartheta) = \frac{L_{eq}(\vartheta)}{E} = \left[1 - 0.08\left(\tfrac{A}{70}\right)^{4}\right]\left[\frac{9.2\times10^{6}}{\left[1+(\vartheta/0.0046)^{2}\right]^{1.5}} + \frac{1.5\times10^{5}}{\left[1+(\vartheta/0.045)^{2}\right]^{1.5}}\right] + \left[1 + 1.6\left(\tfrac{A}{70}\right)^{4}\right]\left\{\left[\frac{400}{1+(\vartheta/0.1)^{2}} + 3\times10^{-8}\,\vartheta^{2}\right] + p\left[\frac{1300}{\left[1+(\vartheta/0.1)^{2}\right]^{1.5}} + \frac{0.8}{\left[1+(\vartheta/0.1)^{2}\right]^{0.5}}\right]\right\} + 2.5\times10^{-3}\,p$$
$$E = \int L_{eq}(\vartheta)\cos\vartheta \; d\Omega_{src}$$
$$1 = \int \mathrm{PSF}(\vartheta)\cos\vartheta \; d\Omega_{src}$$
$$E_\alpha = \sum_{x}\sum_{y} L_{\alpha,eq}(x,y)\,\cos^{2}\vartheta(x,y)\,\frac{A_{pix}}{D^{2}(x,y)}, \quad \text{which is equivalent to} \quad 1 = \sum_{x}\sum_{y} \mathrm{PSF}(x,y)\,\cos^{2}\vartheta(x,y)\,\frac{A_{pix}}{D^{2}(x,y)}$$
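For illustration, the glare PSF of Eq. (2) and the discrete normalization of Eq. (5) could be evaluated as sketched below. The default age A and pigmentation factor p, and the per-pixel geometry inputs (the ϑ map, pixel area A_pix and distance map D), are assumptions to be supplied by the caller rather than values prescribed by the paper.

import numpy as np

def glare_psf(theta_deg, age=32.0, p=0.5):
    # Eq. (2): CIE disability-glare PSF; theta_deg in degrees, result in sr^-1.
    t = np.asarray(theta_deg, dtype=float)
    a = (age / 70.0) ** 4
    small_angle = (1 - 0.08 * a) * (9.2e6 / (1 + (t / 0.0046) ** 2) ** 1.5
                                    + 1.5e5 / (1 + (t / 0.045) ** 2) ** 1.5)
    large_angle = (1 + 1.6 * a) * ((400.0 / (1 + (t / 0.1) ** 2) + 3e-8 * t ** 2)
                                   + p * (1300.0 / (1 + (t / 0.1) ** 2) ** 1.5
                                          + 0.8 / (1 + (t / 0.1) ** 2) ** 0.5))
    return small_angle + large_angle + 2.5e-3 * p

def normalized_psf_kernel(theta_map_deg, a_pix, d_map):
    # Discretize the PSF on the pixel grid and rescale so that the sum in
    # Eq. (5) equals 1, i.e. the scattered light is energy-conserving.
    kernel = (glare_psf(theta_map_deg)
              * np.cos(np.radians(theta_map_deg)) ** 2 * a_pix / d_map ** 2)
    return kernel / kernel.sum()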
$$L_{\alpha,c} = L_{\alpha,eq}^{1/3}$$
$$G(x,y) = W_F \, e^{-\frac{x^{2}+y^{2}}{2\delta_k^{2}}}$$
$$I_Q(x,y) = \frac{I_{\alpha,c}^{\,n}(x,y)}{I_{\alpha,c}^{\,n}(x,y) + \left(\sigma_0 + I_{\alpha,G}(x,y)\right)^{n}}$$
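A minimal sketch of the post-receptor stage described by Eqs. (6)–(8): cube-root cone compression, pooling by the wide Gaussian receptive-field kernel of Eq. (7), and the Naka-Rushton-type normalization that yields I_Q. The kernel size, δ_k, σ_0, n, the normalization of the Gaussian (which absorbs W_F) and the use of SciPy's FFT convolution are illustrative assumptions, not the paper's exact settings; I_{α,G} is taken here to be the compressed image pooled by the Gaussian kernel.

import numpy as np
from scipy.signal import fftconvolve

def gaussian_kernel(size, delta_k):
    # Eq. (7): Gaussian receptive-field surround; normalized to sum to 1 so the
    # surround acts as a weighted mean (assumption: W_F absorbed here).
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    g = np.exp(-(x ** 2 + y ** 2) / (2.0 * delta_k ** 2))
    return g / g.sum()

def brightness_signal(L_alpha_eq, delta_k=50.0, kernel_size=301, sigma0=0.1, n=2.0):
    # Eq. (6): cone compression of the scatter-corrected image.
    I_c = np.cbrt(L_alpha_eq)
    # Surround signal I_alpha,G: compressed image pooled by the Gaussian kernel.
    I_G = fftconvolve(I_c, gaussian_kernel(kernel_size, delta_k), mode="same")
    # Eq. (8): Naka-Rushton-type divisive normalization yielding I_Q.
    return I_c ** n / (I_c ** n + (sigma0 + I_G) ** n)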