We present a fully transparent, scalable, and flexible color image sensor that consists of stacked thin-film luminescent concentrators (LCs). At each layer, it measures a Radon transform of the corresponding LC’s spectral responses. Color images are then reconstructed through inverse Radon transforms that are obtained using machine learning. A high sampling rate in Radon space allows encoding multiple exposures to cope with under- and overexposed cases in one recording. Thus, our sensor simultaneously measures multiple spectral responses in different LC layers and multiple exposures in different Radon coefficients per layer. We also show that machine learning enables adequate three-channel image reconstruction from the response of only two LC layers.
© 2015 Optical Society of America
1. Introduction
Single-chip color sensors usually employ a color filter array (CFA) that is placed on top of a monochrome photosensor matrix. The CFA encodes color channels spatially, such as red, green and blue (RGB) primaries, as in the case of a Bayer filter. This normally leads to reduced quantum efficiency and spatial resolution. The interpolation required to recover a full color image, known as demosaicing, often results in artifacts such as color bleeding, moiré and aliasing.
While CFAs are still the most widely applied approach for color imaging, alternative technologies seek to overcome their limitations. More sophisticated lateral color filters increase quantum efficiency and improve color separation. Micro color splitters, for instance, use light guides as deflectors that enable color separation without absorption. Plasmonic color filters [5–10] use nano-scale holes in a thin-film metal layer for color separation; the transmitted wavelength is determined by the pattern of the holes. Filters based on nanowires achieve color separation that depends on the wires' diameters. Vertical filter layers additionally remove the need for demosaicing and the resulting artifacts. The Foveon X3 is currently the most prominent commercial image sensor that uses vertical filters. It is based on the wavelength-dependent absorption depth of its silicon layers: the first layer absorbs blue light, while green and red light pass through and are absorbed by the second and third layers, respectively. A different technology that utilizes stacked diodes achieves color separation by applying different bias voltages. Sequential switching between three voltage levels allows the red/green, green and blue color channels to be captured; a simpler version requires only two voltages to capture the green and blue/red color channels. Stacked imaging sensors can also be implemented with organic photoconductive films, where three layers capture the green, red and blue color channels simultaneously. This enables effective color separation and offers a large degree of freedom in designing spectral responses. A hybrid approach that uses one organic layer capturing green on top of a conventional CFA-based image sensor measuring blue and red was presented as a feasible method for doubling the amount of light captured.
Our image sensor applies a stack of thin-film luminescent concentrators (LCs) that are sensitive to different portions of the spectrum. It requires no integrated circuits and is therefore fully transparent, scalable, and flexible. In contrast to existing approaches, it does not take spatial measurements from the sensing surface, but records Radon transforms of the various spectral responses at the edges of each corresponding LC layer. The color channels are then reconstructed through inverse Radon transforms that are obtained by machine learning. A high sampling rate in Radon space allows encoding multiple exposures to cope with under- and overexposed cases in a single recording, and without affecting the spatial image resolution. Thus, our sensor measures multiple spectral responses in different LC layers and multiple exposures in different Radon coefficients per layer.
2. Multi-layer Radon sensor
In our previous work, we showed how single-layer thin-film LCs can be used for the reconstruction of gray-scale images [17,18], for multi-focal imaging and depth reconstruction, and for classification. In this article, we extend our sensor to multi-exposure color imaging.
For color imaging, we stack several layers of LC films. Our proof-of-concept prototype, shown in Fig. 1(c), uses two 216mm × 216mm × 300µm layers of Bayer's Makrofol® LISA green and LISA red films (Fig. 1(a)). The absorption and emission spectra of both LCs are shown in Fig. 4(a). While the top layer (LISA green) absorbs blue light and emits green, the bottom layer (LISA red) absorbs green and emits red. The light emitted by the fluorescent dyes inside the LCs is guided towards their edges by total internal reflection, as depicted in Fig. 1(b). As explained in our previous work, we use a mask of 32 triangular apertures cut into each edge of both layers, and ribbons of polymethylmethacrylate (PMMA) step-index multi-mode optical fibers (Jiangxi Daishing POF Co. C250, diameter: 250µm), to multiplex light integrals through the LCs and to transport them to eight contact image sensor (CIS) line-scan cameras (CMOS, each with 1728 photosensors measuring 125µm × 125µm behind an array of rod lenses), where they are measured.
Each measured light integral corresponds to one coefficient (with coordinates (x,ϕ) describing the integral's projection position x and angle ϕ) in the Radon transform of the image focused on the LC surface. A Radon coefficient is measured by one photodiode of our line-scan cameras. A total of 54 integrals per aperture triangle can be measured in our prototype, yielding a constant directional sampling resolution of 4 × 54 = 216 samples over ϕ = 1°..360° and a projection resolution of 32 samples in x per layer. More details on the optical design of single layers, and on the differences between our sensor's measurements and the classical Radon transform (e.g., no redundancy for 180° ≤ ϕ < 360°), can be found in our previous work.
Note that the red portion of the spectrum cannot be measured by our prototype, since none of its LCs absorb wavelengths beyond 600nm. While a third, red-sensitive LC layer would overcome this problem, we will show below that the red color channel can be adequately reconstructed from our two-layer measurements using machine learning.
Figure 1(a) shows a large overlap between the emission spectrum of LISA green and the absorption spectrum of LISA red. However, the amount of green light that is emitted from the surface of the top layer (LISA green) due to cone loss is only a fraction (in our case ~23%) of the total emitted light. It also depends on the intensity of the absorbed blue light. Although the overlap between the emission and absorption spectra is large, the level of crosstalk is relatively small and is largely compensated by the machine learning approach. For images with a bright blue channel and a dark green channel, however, this will probably lead to reconstruction errors. A solution to this is discussed in section 5.
For one-time calibration, we project a training set of 60,000 color images (randomly selected from online image databases, such as Flickr) onto the sensor, and measure the corresponding Radon transforms of each layer. Further, we measure the Radon transforms of an additional test set of 4,000 color images that are not part of the training set (also randomly selected). While the training set is used to estimate the inverse Radon transforms for color image reconstruction via machine learning, the test set is used for evaluation. We assume that the training set and the test set predominantly share the same image statistics (e.g., natural 1/f image statistics). The reconstruction results of all test images are available online in Dataset 1.
3. Color image reconstruction
Let us first consider the reconstruction of gray-scale images with one layer. If P is the n × m matrix of n (n = 60,000 in our case) vectorized training images with m (m = 64 × 64 in our case) pixels each, and L the n × l matrix of n measured Radon transforms of P with l (l = 54 × 32 × 4 in our case) vectorized Radon coefficients, then we need to determine the m × l matrix T^{-1} for image reconstruction:

T^{-1} L^T = P^T.   (1)
Transposing all components in Eq. 1 leads to

L T^{-T} = P.   (2)

Solving Eq. 2 for T^{-T} in the least-squares sense over the training set gives

T^{-T} = (L^T L)^{-1} L^T P.   (3)
Note that T^{-1} represents the filtered backprojection (T^T T)^{-1} T^T, a combination of a high-pass filter and an unfiltered backprojection, that is here learned by means of linear regression. With T^{-1} determined from the training images, and assuming that the set of training images is statistically representative, the reconstruction of arbitrary images can then be implemented as a simple matrix-vector multiplication:

p = T^{-1} l.   (4)
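The single-layer training and reconstruction steps above can be sketched in a few lines of NumPy. This is a toy illustration only: a random matrix stands in for the sensor's actual Radon transform T, and all dimensions are scaled down from the prototype's.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, l = 3000, 16 * 16, 400   # training images, pixels per image, Radon coefficients

# A random matrix standing in for the sensor's forward Radon operator (l x m).
T = rng.random((l, m))

P = rng.random((n, m))         # vectorized training images, one per row
L = P @ T.T                    # their Radon transforms (Eq. 2 arrangement)

# Estimate the inverse transform by linear least squares over the training set (Eq. 3).
T_inv_T, *_ = np.linalg.lstsq(L, P, rcond=None)

# Reconstruct an unseen image from its measurement alone (Eq. 4).
p_true = rng.random(m)
l_meas = T @ p_true
p_rec = T_inv_T.T @ l_meas

print(np.max(np.abs(p_rec - p_true)))   # near zero for this noise-free toy setup
```

In this noise-free setting the regression recovers the pseudo-inverse exactly; with measured sensor data, the learned inverse additionally absorbs the sensor's noise and manufacturing characteristics.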
For the reconstruction of color images with multiple LC layers, the above equations apply in principle. However, one individual image matrix P exists for each color channel, while one individual Radon transform matrix exists for each LC layer. Thus, for the reconstruction of RGB color images (channels: r, g, b) with a three-layered sensor (layers: 1, 2, 3), for instance, Eq. 2 extends to

[L_1 L_2 L_3] T^{-T} = [P_r P_g P_b],   (5)

while for our two-layer prototype it extends to

[L_1 L_2] T^{-T} = [P_r P_g P_b].   (6)

Eqs. 3 and 4 extend accordingly.
Figure 2 illustrates a selection of 15 images from our test set that have been captured with our two-layer prototype. They were reconstructed by matrix-vector multiplication as in Eq. 4, where l is a column vector containing the measured Radon transforms of an image. The matrix T^{-1} is estimated by linear least squares according to Eqs. 3 and 6 from the measured Radon transforms of the 60,000 training images. The column vector p contains the reconstruction of the three color channels. Note that the red channel (although not physically measured) is adequately reconstructed in most cases. An explanation is provided in section 5.
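Under the same toy assumptions as before, the two-layer color case differs only in that the Radon measurements of both layers are concatenated and regressed against all three color channels at once. The sketch below uses synthetic channels in which red is a mixture of green and blue plus noise, mimicking the natural-image correlations that make the red reconstruction possible; the forward operators T1, T2 and the simplified layer responses are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, l = 3000, 16 * 16, 400                   # scaled-down sizes

# Hypothetical forward Radon operators for the two LC layers.
T1, T2 = rng.random((l, m)), rng.random((l, m))

# Synthetic color channels: red correlates with green and blue.
Pg, Pb = rng.random((n, m)), rng.random((n, m))
Pr = 0.5 * Pg + 0.3 * Pb + 0.2 * rng.random((n, m))

# Simplified layer responses: layer 1 sees blue, layer 2 sees green.
L_meas = np.hstack([Pb @ T1.T, Pg @ T2.T])     # [L1 L2], n x 2l
P_rgb = np.hstack([Pr, Pg, Pb])                # [Pr Pg Pb], n x 3m

# Learn the combined inverse by least squares (two-layer analog of Eq. 3).
T_inv_T, *_ = np.linalg.lstsq(L_meas, P_rgb, rcond=None)

# Reconstruct an unseen image: green/blue are measured, red is inferred.
pg, pb = rng.random(m), rng.random(m)
pr = 0.5 * pg + 0.3 * pb + 0.2 * rng.random(m)
l_vec = np.concatenate([T1 @ pb, T2 @ pg])
r_rec, g_rec, b_rec = np.split(T_inv_T.T @ l_vec, 3)

print(np.corrcoef(r_rec, pr)[0, 1])            # red recovered from correlations only
```

The measured green and blue channels are recovered exactly in this noise-free toy, while the unmeasured red channel is recovered only up to its statistically predictable part, mirroring the behavior discussed in section 5.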
4. Coded exposure imaging
Recording all Radon coefficients with a single exposure might lead to reconstruction artifacts in over- and underexposed cases (i.e., images or image regions that are too bright or too dark for the selected exposure). However, since the Radon transforms can be sampled at a rate that is higher than the sampling rate of the reconstructed images, multiple exposures can be encoded simultaneously in the measurements of the Radon coefficients without causing a loss in spatial image resolution. Our prototype, for example, records 54 × 32 × 4 = 6,912 Radon coefficients per layer to reconstruct images of 64 × 64 = 4,096 pixels, which is, in theory, oversampled. The question arises: what is the optimal way to multiplex the measured Radon transforms with different exposures (i.e., which Radon coefficient should be measured at what exposure)?
During one-time calibration, we record the Radon transforms of our 60,000 training images at each LC layer for multiple (six in our case) exposures. The lowest and highest exposures are chosen in such a way that the Radon coefficients of the brightest and darkest images of the training set fully cover the dynamic range of the line-scan cameras while still achieving an acceptable imaging speed (i.e., by restricting the maximal exposure). Intermediate exposures are chosen such that the Radon coefficients increase linearly with each exposure step. We chose a non-linearly increasing series of six exposures (10ms, 20ms, 30ms, 50ms, 80ms, 120ms) that compensates for the non-linear response of our line-scan cameras and supports imaging at 8 frames per second.
From these Radon transforms, we determine the optimal exposure code for each LC layer individually as follows: assuming that the signal-to-noise ratio drops with decreasing exposure, we start at the highest exposure and search for those coefficients in all Radon transforms recorded with this exposure that are never overexposed throughout all training images. These Radon coefficients are assigned to the highest exposure and are excluded from further assignments. We then continue with the Radon transforms recorded with the second-highest exposure, search through the remaining coefficients for those that are never overexposed, and assign them to the second-highest exposure. This is repeated for all exposures, which leads to an exposure code that maximizes the overall S/N of a single recording. An individual exposure code must be determined for each LC layer. Note that we assume again that our 60,000 training images are statistically representative, and that the exposure codes are therefore applicable to arbitrary images.

The exposure codes can be implemented in our sensor either by selecting individual exposure times or by applying individual ND filters to the photodiodes that measure the single Radon coefficients. Note also that, once the exposure codes are implemented, the training and the image reconstruction processes, as explained in section 3, both use the exposure-coded measurements. In fact, all results shown in Fig. 2 were achieved with the pre-determined exposure code. Since for our prototype we could implement neither individual exposure times nor individual ND filters for each photodiode, we achieved these results by measuring all 60,000 training images and all 4,000 test images at different exposure times and by selecting only the Radon coefficients from the assigned exposures for training and image reconstruction. All Radon coefficients are normalized by dividing them by their exposure times.
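The greedy exposure assignment described above can be sketched as follows. The data are simulated (gamma-distributed stand-ins for the per-millisecond Radon coefficients of the training images), and `SATURATION` is a hypothetical stand-in for the cameras' clipping level.

```python
import numpy as np

rng = np.random.default_rng(0)

EXPOSURES = [120, 80, 50, 30, 20, 10]   # ms, from highest to lowest
SATURATION = 255.0                       # hypothetical clipping level of the cameras
n_images, n_coeffs = 1000, 6912          # training images, Radon coefficients per layer

# Simulated per-millisecond Radon coefficients of the training images.
base = rng.gamma(shape=2.0, scale=1.0, size=(n_images, n_coeffs))

code = np.full(n_coeffs, -1)             # exposure index assigned to each coefficient
for i, exp_ms in enumerate(EXPOSURES):
    measured = np.minimum(base * exp_ms, SATURATION)   # clipped sensor readings
    never_over = (measured < SATURATION).all(axis=0)   # safe for every training image
    code[never_over & (code == -1)] = i  # assign the longest safe exposure, then lock it
code[code == -1] = len(EXPOSURES) - 1    # remaining coefficients get the shortest exposure

# Normalize a recording by its per-coefficient exposure, as in the calibration step.
exposure_of = np.array(EXPOSURES)[code]
normalized = np.minimum(base[0] * exposure_of, SATURATION) / exposure_of
```

Each coefficient thus receives the longest exposure at which it never clipped across the training set, which maximizes its S/N under the stated assumption that S/N drops with decreasing exposure.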
Figure 3 illustrates the advantage of coded exposure imaging over single-exposure imaging. Figure 3(a) shows that, while for darker images higher exposures lead to better reconstruction results, brighter images require lower exposures. Thus, there is no single optimal exposure for all images. Applying the derived exposure code, shown in Fig. 3(b), allows proper reconstruction of both bright and dark images. The statistics in Fig. 3(c) illustrate for how many of the 4,000 test images our exposure coding performed better than recordings with a single exposure. The sinogram shown in Fig. 3(b) visualizes the exposure code (calibrated and simulated) that is applied to all Radon coefficients of the LISA green layer (the LISA red layer is similar). We discuss its structure in section 5.
5. Discussion and future work
The absorption and emission spectra of the two LC films in Fig. 4(a) show that our sensor prototype does not physically measure the red part of the spectrum. However, the red color channel is included in the training and reconstruction process, as explained in Eq. 6. Figure 4(b) illustrates the reconstruction histogram over all 4,000 test images, and plots the probability of an RGB image or a separate color channel being reconstructed with a particular structural similarity (SSIM) [25,26] when compared to the ground truth. It indicates that, although the red color channel reconstruction is slightly worse than that of the green and blue channels, it is still within a similar quality range. Note that, because of spatial reconstruction artifacts due to noise and sampling limitations, an SSIM near 1 cannot be reached in practice. The visual quality of reconstruction examples with an SSIM in the range of 0.5 to 0.8 can be compared in Fig. 2. We believe that the adequate reconstruction of the red color channel is a result of our regression being able to learn, from a statistically representative training set, the statistical likelihood of correlations between the measured green/blue channels (including their spectral overlap) and the unknown red channel. It does fail in extreme cases, for example in the reconstruction of purely red pixels, but these cases seem to be statistically unlikely in natural images. A third red-absorbing LC layer (such as PMMA doped with quantum dots that absorb at 600nm and emit at 850nm, or Alexa Fluor 633 with absorption/emission at 632nm/647nm in a polymer carrier) would allow physical measurement of the red color channel, and Eq. 5 could be applied for training and reconstruction.
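For reference, the SSIM comparison used for the histogram in Fig. 4(b) can be approximated by a simplified, single-window variant, sketched below. This uses global means and variances instead of the sliding Gaussian window of the standard SSIM implementation [26], so its scores are only indicative.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Simplified single-window SSIM: global means/variances instead of the
    usual sliding window, so values only roughly track windowed SSIM."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

rng = np.random.default_rng(0)
gt = rng.random((64, 64))                            # stand-in ground-truth image
noisy = np.clip(gt + 0.2 * rng.normal(size=gt.shape), 0, 1)  # degraded reconstruction

print(global_ssim(gt, gt))      # identical images score 1
print(global_ssim(gt, noisy))   # noisy reconstruction scores lower
```

A perfect reconstruction scores 1, while noise and sampling artifacts, as noted above, keep measured reconstructions below that in practice.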
The fluorescent dyes used in LISA green and LISA red overlap in their emission and absorption spectra. This causes ~23% crosstalk between the layers due to cone-loss emission from LISA green. Although the color channels can still be efficiently separated by our machine learning approach, the sensor could be further improved by integrating a green-blocking filter between the two sensing layers. In this case, the layer order of our prototype must be reversed to prevent the filter from blocking all of the green light before it reaches the LISA red layer. Reversing the layer order, however, causes a low S/N in the LISA green layer due to the high absorption of blue light by LISA red. The application of fluorophores with better spectral separation would be an alternative.
Figure 3(b) plots the sinogram of the exposure code determined for the LISA green layer of our sensor (LISA red is similar). Since our prototype lacks manufacturing precision, and to emphasize the code structure, we compare the calibrated code with a simulation that considers an ideally manufactured sensor. It can be seen that the closer an integral's angle ϕ is to a multiple of 90° (i.e., 90°, 180°, 270°, or 360°), the lower the selected exposures, while the selected exposures are higher the further ϕ diverges from these multiples. A similar, though noisier, structure is visible in the calibrated case. The reason for this is shown in Fig. 4(c), which highlights one single aperture triangle. Our sensor does not integrate over infinitely thin lines, but over areas with fields of view that correlate with the aperture dimensions. These areas expand to a maximum for ϕ-angles that are perpendicular to the LC edges, and shrink for steeper angles. Since more light is integrated over larger areas than over smaller ones, less exposure is required for perpendicular than for steeper angles. The general difference between a theoretical Radon transform/sinogram and the one measured in practice by our sensor is discussed in our previous work.
Depending on the chosen exposure time, the imaging speed (including reconstruction) of our prototype is 5–10 fps at a maximum imaging resolution of 64 × 64 pixels. Both are constrained by construction limitations, such as the smallest cuttable aperture size and the dynamic range and S/N of the line-scan cameras used. Improved manufacturing will lead to more advanced sensors.
We thank Robert Koeppe from isiQiri interface technologies GmbH for fruitful discussions and for providing LC samples. This work was supported by Microsoft Research under contract number 2012-030(DP874903) – LumiConSense.
References and links
1. R. Lukac and K. N. Plataniotis, “Color filter arrays: Design and performance analysis,” IEEE T. Consum. Electr. 51(4), 1260–1267 (2005). [CrossRef]
2. B. E. Bayer, “Color imaging array,” US Patent 3,971,065.
3. R. W. Schafer and R. M. Mersereau, “Demosaicking: color filter array interpolation,” IEEE Signal Proc. Mag. 22(1), 44–54 (2005). [CrossRef]
4. S. Nishiwaki, T. Nakamura, M. Hiramoto, T. Fujii, and M. Suzuki, “Efficient colour splitters for high-pixel-density image sensors,” Nat. Photonics 7(3), 240–246 (2013). [CrossRef]
5. G. Si, Y. Zhao, A. B. Chew, and Y. J. Liu, “Plasmonic color filters,” J. Mol. Eng. Mater. 2(2), 1440009 (2014). [CrossRef]
6. S. P. Burgos, S. Yokogawa, and H. A. Atwater, “Color imaging via nearest neighbor hole coupling in plasmonic color filters integrated onto a complementary metal-oxide semiconductor image sensor,” ACS Nano 7(11), 10038–10047 (2013). [CrossRef] [PubMed]
7. Y. S. Do, J. H. Park, B. Y. Hwang, S-M. Lee, B-K. Ju, and K. C. Choi, “Plasmonic color filter and its fabrication for large-area applications,” Adv. Opt. Mater. 1(2), 133–138 (2013). [CrossRef]
9. Q. Chen, Y. Zhao, and D. R. S. Cumming, “High transmission and low color cross-talk plasmonic color filters using triangular-lattice hole arrays in aluminum films,” Opt. Express 18(13), 14056–14062 (2010). [CrossRef] [PubMed]
11. H. Park, Y. Dan, K. Seo, Y. J. Yu, P. K. Duane, M. Wober, and K. B. Crozier, “Filter-free image sensor pixels comprising silicon nanowires with selective color absorption,” Nano Lett. 14(4), 1804–1809 (2014). [CrossRef] [PubMed]
12. R. B. Merrill, “Color separation in an active pixel cell imaging array using a triple-well structure,” US Patent 5,965,875.
13. D. Knipp, R. A. Street, H. Stiebig, M. Krause, J-P. Lu, S. Ready, and J. Ho, “Vertically integrated thin film color sensor arrays for imaging applications,” Opt. Express 14(8), 3106–3113 (2006). [CrossRef] [PubMed]
14. A. Bablich, C. Merfort, H. Schäfer-Eberwein, P. Haring-Bolivar, and M. Boehm, “2-in-1 red-/green-/blue sensitive a-SiC:H/a-Si:H/a-SiGeC:H thin film photo detector with an integrated optical filter,” Thin Solid Films 552, 212–217 (2014). [CrossRef]
15. H. Seo, S. Aihara, T. Watabe, H. Ohtake, T. Sakai, M. Kubota, N. Egami, T. Hiramatsu, T. Matsuda, M. Furuta, and T. Hirao, “A 128 × 96 pixel stack-type color image sensor: stack of individual blue-, green-, and red-sensitive organic photoconductive films integrated with a ZnO thin film transistor readout circuit,” Jpn. J. Appl. Phys. 50(2), 4103 (2011). [CrossRef]
16. S-J. Lim, D-S. Leem, K-B. Park, K-S. Kim, S. Sul, K. Na, G. H. Lee, C-J. Heo, K-H. Lee, X. Bulliard, R-I. Satoh, T. Yagi, T. Ro, D. Im, J. Jung, M. Lee, T-Y. Lee, M. G. Han, Y. W. Jin, and S. Lee, “Organic-on-silicon complementary metal-oxide-semiconductor colour image sensors,” Sci. Rep. 5, 7708 (2015). [CrossRef] [PubMed]
17. A. Koppelhuber and O. Bimber, “Towards a transparent, flexible, scalable and disposable image sensor using thin-film luminescent concentrators,” Opt. Express 21(4), 4796–4810 (2013). [CrossRef] [PubMed]
18. A. Koppelhuber, S. Fanello, C. Birklbauer, D. Schedl, S. Izadi, and O. Bimber, “Enhanced learning-based imaging with thin-film luminescent concentrators,” Opt. Express 22(24), 29531–29543 (2014). [CrossRef]
19. A. Koppelhuber, C. Birklbauer, S. Izadi, and O. Bimber, “A transparent thin-film sensor for multi-focal image reconstruction and depth estimation,” Opt. Express 22(8), 8928–8942 (2014). [CrossRef] [PubMed]
21. J. S. Batchelder, A. H. Zewail, and T. Cole, “Luminescent solar concentrators. 1: Theory of operation and techniques for performance evaluation,” Appl. Opt. 18(18), 3090–3110 (1979). [CrossRef]
23. A. Koppelhuber and O. Bimber, “Reconstructions of the test set images,” figshare (2015) [retrieved 16 July 2015], http://dx.doi.org/10.6084/m9.figshare.1485612
24. T. M. Buzug, Computed Tomography from Photon Statistics to Modern Cone-Beam CT (Springer Verlag, 2008). 
25. A. Kolaman and O. Yadid-Pecht, “Quaternion structural similarity: A new quality index for color images,” IEEE T. Image Process. 21(4), 1526–1536 (2012). [CrossRef]
26. Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE T. Image Process. 13(4), 600–612 (2004). [CrossRef]
27. R. H. Inman, G. V. Shcherbatyuk, D. Medvedko, A. Gopinathan, and S. Ghosh, “Cylindrical luminescent solar concentrators with near-infrared quantum dots,” Opt. Express 19(24), 24308–24313 (2011). [CrossRef] [PubMed]