Conventional cameras obscure the scene being recorded. Here, we place an image sensor (with no lens) on the edge of a transparent window and form images of objects seen through that window. This is enabled, first, by the collection of scattered light at the image sensor and, second, by the solution of an inverse problem that models the light-scattering process. We thereby form simple images and demonstrate a spatial resolution of about 0.1 line-pairs/mm at an object distance of 150mm, with a depth of focus of at least 10mm. We further show imaging of two types of objects: an LED array and a conventional LCD screen. Finally, we also demonstrate color and video imaging.
© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement
Conventional cameras obstruct the view of the object or scene that they are recording. It is interesting to explore the potential of overcoming this constraint, which could enable applications in driver-assistance technologies for automobiles, augmented-reality headsets, etc. There have been some attempts at achieving this goal recently [1–4]. One notable work from Microsoft [1] uses a wedge-shaped optical slab (wedge optic) to simultaneously guide and focus light hitting the window to the bottom surface of the wedge. By maintaining the optical conjugate between the window and the window edge, the wedge optic can be used for both camera and projector applications. However, the effective aperture of the wedge optic is limited to the top half of the window, requiring the bottom half of the window to be covered to eliminate unwanted stray light. A second approach, referred to as LumiConSense [2–4], is based on a luminescent concentrator (LC). The LC is a flexible polymer sheet doped with green fluorescent dye. When a photon of sufficient energy hits the LC, the fluorescent dye is excited and isotropically re-emits a Stokes-shifted photon. A portion of this re-emission is trapped within the LC film and guided to the edge, where a fiber-optic bundle captures the signal. Finally, images are reconstructed using computational techniques. This approach generally suffers from relatively low transparency of the LC panel due to the absorption of the fluorescent dyes.
Here we propose an elegant method to overcome these limitations by placing a conventional image sensor on the edge of a transparent window, as illustrated in Fig. 1(a). The edges of the window are covered with a reflective tape except for the portion that is facing the image sensor. This portion of the edge is roughened slightly, as indicated in Fig. 1(b). A small portion of the rays from each point of the object scatters off this rough surface and is collected by the sensor. This scattered image, therefore, forms the point-spread function (PSF) of the system for that point. First, we experimentally measure the PSF of the system for all the points on a test object (an array of LEDs as described below). Next, we capture images of arbitrary objects. Finally, we solve the linear inverse problem to reconstruct the image of an arbitrary object using the measured PSF data. Previously, we exploited this idea to enable lensless photography as well as microscopy through a surgical needle [6–9].
Our approach could enable applications including eye tracking in augmented-reality glasses or on the windshield of a car without obstructing the view of the user. Furthermore, by removing all lenses, we can enable ultra-compact and “see-through” cameras for applications where form factor is important.
2. Imaging an LED array
Photographs of our experimental setup in side-view and front-view are shown in Figs. 1(c) and 1(d), respectively. We used an LED array (SparkFun, COM-12584) as the calibration and test object. The window was made of plexiglass (acrylic) cut to a size of 200mm × 225mm. A blowtorch was used to thermally anneal the edge faces to ensure smooth surfaces. While the edge was still hot, grit-80 sandpaper was pressed against the face intended to face the image sensor, creating the scattering roughness. After cooling, reflective (aluminum) tape was placed on all edge surfaces except the roughened portion facing the image sensor (see Fig. 1(c)). We used a conventional CMOS image sensor (1/3” Aptina MT9V024, pixel size = 6μm, 640 × 480 pixels, 8 bits).
Calibration was performed by recording the signal on the image sensor when only a single LED in the array was on (representing a point source). For each LED, 20 frames were averaged to avoid the LED-flickering effect and to reduce noise. This was performed for the entire array of 32 × 32 LEDs, which represented the region within the area of the transparent window (diagonal = 190.5mm). All ambient light was turned off. Example frames for 3 different LED locations are shown in Figs. 1(e)-1(g).
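Concretely, the calibration step amounts to averaging the repeated captures for each LED and stacking the resulting vectorized sensor frames into the columns of a matrix. A minimal sketch in Python (the array shapes and function names here are our own illustration based on the 32 × 32 array and 640 × 480 sensor described above, not the authors' code):

```python
import numpy as np

def average_frames(frames):
    """Average repeated captures of one LED (20 per LED in the
    experiment) to suppress LED flicker and sensor noise.
    frames: array of shape (n_repeats, rows, cols)."""
    return np.mean(frames, axis=0)

def build_calibration_matrix(psf_frames):
    """Stack per-LED averaged sensor frames into the calibration
    matrix A.

    psf_frames: array of shape (n_y, n_x, rows, cols) -- one
    averaged sensor frame per LED position, e.g. (32, 32, 480, 640).
    Returns A of shape (rows*cols, n_y*n_x): each column is the
    vectorized PSF of one LED (one object point).
    """
    n_leds = psf_frames.shape[0] * psf_frames.shape[1]
    n_pixels = psf_frames.shape[2] * psf_frames.shape[3]
    return psf_frames.reshape(n_leds, n_pixels).T.astype(np.float64)
```

For the dimensions quoted in the text, this yields a 307,200 × 1024 matrix, one column per LED.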
Incoherent imaging implies that the image of any object is the linear sum of the images formed by its constituent points. We also apply a high-dynamic-range (HDR) algorithm [10] to correct for the nonlinear pixel response of the image sensor. Subsequently, the image on the sensor, b, can be modeled as

$b = Ax$, (1)

where the object, x, is transformed into the image, b, via the calibration matrix, A. To recover x, we use a regularized linear inverse solver [11],

$\hat{x} = \operatorname{argmin}_{x} \|Ax - b\|_2^2 + \alpha \|x\|_2^2$. (2)
Note that α is a regularization parameter, which balances noise level and resolution of the solution.
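As a sketch, the regularized minimization of Eq. (2) can be solved via the normal equations $(A^T A + \alpha I)x = A^T b$; this illustrates the standard Tikhonov solver, not necessarily the authors' exact implementation:

```python
import numpy as np

def tikhonov_reconstruct(A, b, alpha):
    """Solve min_x ||A x - b||^2 + alpha ||x||^2 via the
    normal equations (A^T A + alpha I) x = A^T b."""
    n = A.shape[1]
    lhs = A.T @ A + alpha * np.eye(n)  # regularized Gram matrix
    return np.linalg.solve(lhs, A.T @ b)
```

Larger α suppresses noise amplification at the cost of resolution, which is the trade-off noted above.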
Exemplary test images were displayed on the LED array as illustrated by the reference images in the leftmost column of Fig. 2. The corresponding raw data captured on the image sensor are shown in the center column, while the rightmost column shows the corresponding reconstructed images obtained by solving Eq. (2). Clearly, the caustic patterns captured by the image sensor contain sufficient information about the object for effective image reconstruction. For these experiments, only the green LEDs and the green channel of the image sensor were used for calibration and testing.
We also captured videos at frame rates of 1 to 2 frames per second (to minimize the LED-flickering effect). By reconstructing each frame of the raw video data, we can create reconstructed videos (see Visualization 1, Visualization 2, Visualization 3, Visualization 4, Visualization 5, Visualization 6, Visualization 7, Visualization 8, Visualization 9, Visualization 10, Visualization 11, and Visualization 12).
The calibration and testing in Fig. 2 were performed using the green LEDs in the array. In order to understand the impact of color, we attempted to reconstruct images with red, blue and white colors on the LED array. The resulting images are summarized in Fig. 3. Note that the sizes of the image vector, b, the object, x, and the matrix, A, are 640 × 480 pixels, 32 × 32 pixels and 307,200 × 1024, respectively. The minimization problem laid out in Eq. (2) has a well-established closed-form solution via singular value decomposition of the calibration matrix, A, following the steps prescribed in ref [11].
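The closed-form SVD solution mentioned above can be sketched as follows: with $A = U\,\mathrm{diag}(s)\,V^T$, each singular component of b is damped by the filter factor $s_i^2/(s_i^2+\alpha)$. This is the standard Tikhonov filtering recipe (the exact numerical steps used by the authors are not given in the text):

```python
import numpy as np

def tikhonov_svd(A, b, alpha):
    """Closed-form Tikhonov solution via the SVD of A:
        x = V diag(s / (s^2 + alpha)) U^T b,
    equivalent to solving (A^T A + alpha I) x = A^T b."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    filt = s / (s**2 + alpha)  # damped inverse singular values
    return Vt.T @ (filt * (U.T @ b))
```

A practical advantage of this form is that the SVD of A is computed once; reconstructions for many measurement vectors b (e.g. video frames) or many values of α are then cheap.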
It is expected that the distance between the object (LED array) and the transparent window plays a pivotal role in the quality of the image reconstruction. We empirically determined the optimal distance by calibrating at various distances and reconstructing test images. The results are summarized in Fig. 4 and we surmise that the optimal distance is about 150mm.
To further elucidate the imaging performance, we can consider the edge rays collected by the image sensor from each point on the object (see Fig. 8). Using this geometry, we can analytically solve a set of transcendental equations to compute the acceptance angle of rays from each point that are collected by the image sensor (see Fig. 9). This analysis firstly reveals that the acceptance angle decreases as the distance between the object plane and the window increases. Secondly, it also reveals that the acceptance angle is more uniform across the object plane for larger distances. Therefore, there is a trade-off between the size of the object and the distance between the object and the window. A third factor is clearly that at larger distances, the signal decreases. These three factors result in an optimum choice of distance between the object plane and the window.
In order to quantify the imaging performance, we measured the contrast of a line-space pattern that was reconstructed at various distances between the LED array and the transparent window (herein referred to as D). The images from the experiment are shown in Fig. 10. The contrast of the reconstructed horizontal and vertical line-space pairs is plotted as a function of D in Fig. 5(a). An optimum value of D (~150mm), where the contrast is maximum, can be observed.
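The paper does not state the exact contrast metric used; the conventional choice for line-space patterns is the Michelson contrast, sketched below:

```python
import numpy as np

def michelson_contrast(profile):
    """Michelson contrast (Imax - Imin) / (Imax + Imin) of a 1D
    intensity profile taken across a reconstructed line-space
    pattern. Returns 1.0 for perfect modulation, 0.0 for none."""
    i_max, i_min = profile.max(), profile.min()
    return (i_max - i_min) / (i_max + i_min)
```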
We can analyze this further by performing the singular value decomposition (SVD) of A at each value of D, which results in a singular-value vector, s. The difficulty of solving the inverse problem grows with the condition number of the PSF matrix, A, defined as the ratio of the largest to the smallest singular value (the first and last elements of s) [12,13]. Figure 5(b) shows that the condition number decreases with increasing D until about 150mm, beyond which it is stable. The singular-value vectors at various values of D are plotted in Fig. 5(c) and reveal that the rate of decay of the singular values is slowest when D~150mm, which is another measure of the ease of inversion of the PSF matrix, A.
Finally, we also measured the modulation-transfer function (MTF) of our system by imaging line-space patterns of different periods using a liquid-crystal display (LCD) as described in the next section. These data, plotted in Fig. 5(d), reveal that the resolution of our system is approximately 0.1 line-pairs/mm for D = 150mm.
3. Imaging a liquid-crystal display (LCD)
In order to demonstrate the generality of our approach, next we replaced the LED array with a conventional liquid-crystal display (LCD) from a smartphone. A photograph of our experimental setup is shown in Fig. 6(a). Various images were displayed on the LCD and the corresponding data was captured on the image sensor. The distance between the LCD and the transparent window was 150mm and the calibration matrix at that distance obtained via the LED array (as described earlier) was used for image reconstructions.
Figure 6(b) shows various green images that were displayed on the smartphone screen. Note that since the calibration was performed with green LEDs, we first tried green images on the LCD. The images are reasonably well reconstructed, although the quality is worse than in the case of the LED array. We further show reconstructions of red and blue displays, as illustrated in Fig. 6(c). The quality of reconstruction is quite good for red, but somewhat worse for blue.
The experiments were conducted by placing the smartphone in front of the LED array. This means that the images are still reconstructed even though the object plane is shifted from the plane of calibration (the plane containing the LED array) by the thickness of the smartphone (~10mm). From this experiment, we surmise that the depth of focus of our system is at least 10mm.
Finally, it is useful to consider the impact of the transparent window on the imaging process. In order to elucidate this point, we performed experiments without this window, everything else remaining the same. The results are summarized in Fig. 11 and our conclusion is that although image reconstructions seem feasible, the quality is severely degraded. Therefore, we surmise that it is important to have the rough edge of the window to redirect some of the rays from the object to the sensor.
4. Conclusions and future work
We demonstrated a “see-through” camera composed of a transparent window with an image sensor facing the edge of the window. Light rays from a scene are redirected by the rough surface of the window into the acceptance angle of the image sensor, thereby allowing computational methods to reconstruct the scene information. We showed that with a judicious choice of the geometric parameters, it is possible to optimize for image contrast and angular field of view. Since there is no lens in our system, there is no possibility of optical focusing. In the future, it would be interesting to explore the potential for computational refocusing. This might be possible if calibration could capture the full light-field information.
Furthermore, by applying deep neural networks it is possible to avoid the image-reconstruction process for certain applications. We recently demonstrated this idea for digit classification using a lensless camera [15]. Similar techniques may be applied here to enable highly power-efficient and effective techniques for what we call Application-Specific Imaging.
5. Appendix A1: impact of object size and distance from window
The signal collected by the image sensor from each point of the object can be quantitatively analyzed by looking at the edge rays as illustrated in Fig. 7. The acceptance angle of each point can, thereby, be analytically calculated using the set of equations described in Fig. 7.
The computed acceptance angle (θ) as a function of the object location (x) is summarized for various distances (z) in Fig. 8. The sharp peak of θ that occurs at z=25mm implies that only a small portion of the object will contribute a significant signal, which, in turn, implies that only a small number of object points will be reconstructed. As z increases, θ as a function of location in the object plane flattens out, allowing for a larger object to be reconstructed. However, this model does not account for the fact that the power carried by each ray decreases quadratically with path length, which implies that at larger z the signal contribution from each point will also decrease. Therefore, there is a compromise between the quality of image reconstruction (determined by the signal-to-noise ratio) and the size of the object that is reconstructed (the effective field of view of the camera). This is confirmed by our experiments summarized in Fig. 4, from which we select z=150mm as the optimum.
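The qualitative trends above can be illustrated with a deliberately simplified model: take the acceptance angle at an object point (x, z) to be the angle subtended by the strip of the window that can feed rays to the edge sensor. This is only our own illustration and ignores refraction at the window face, so it is not the transcendental system solved in the paper; the strip extent is an assumed parameter:

```python
import numpy as np

def acceptance_angle(x, z, edge_lo=-100.0, edge_hi=100.0):
    """Angle (radians) subtended at object point (x, z) by the
    window strip [edge_lo, edge_hi] (mm) feeding the edge sensor:
        theta(x, z) = atan((edge_hi - x)/z) - atan((edge_lo - x)/z)
    Simplified model: refraction inside the slab is neglected."""
    return np.arctan2(edge_hi - x, z) - np.arctan2(edge_lo - x, z)
```

Even this crude model reproduces both trends: θ decreases with z, and θ varies less across the object plane at larger z.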
6. Appendix A2: measurement of contrast for line-space pairs on LED array
7. Appendix A3: results with no transparent window
Figure 11 shows a simple illustration of the scattering process. In the left-most case, where there is no window, most of the light rays from an object point miss the acceptance cone (shown in green) of the sensor pixels. In the case of a smooth window, the majority of the rays are totally internally reflected within the window, again not reaching the sensor pixels. By making the surface facing the sensor rough, frustrated total internal reflection occurs, which modifies the transmitted angle of light (ideally into an isotropic cone) such that a larger portion of the light falls within the acceptance angle of the sensor pixels. These conclusions are borne out by our experiments.
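The trapping of rays in the smooth-window case is governed by the critical angle of the window material. For acrylic (n ≈ 1.49; the exact refractive index of the plexiglass used is not stated in the text), this can be computed as:

```python
import math

def critical_angle_deg(n_window=1.49, n_air=1.0):
    """Critical angle for total internal reflection at the
    window-air interface: theta_c = asin(n_air / n_window).
    n_window = 1.49 is a typical value for acrylic."""
    return math.degrees(math.asin(n_air / n_window))
```

This gives roughly 42°: rays inside a smooth slab striking a face beyond this angle are trapped and guided, which is why roughening the sensor-facing edge is needed to scatter some light into the sensor's acceptance cone.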
National Science Foundation (NSF) (10037833).
We thank Allison Kachel for assistance with an earlier preliminary version of the experiments.
The authors declare that there are no conflicts of interest related to this article.
1. A. R. Travis, T. A. Large, N. Emerton, and S. N. Bathiche, “Wedge optics in flat panel displays,” Proc. IEEE 101(1), 45–60 (2013). [CrossRef]
2. A. Koppelhuber and O. Bimber, “LumiConSense a transparent, flexible, scalable and disposable image sensor using thin-film luminescent concentrators,” Opt. Express 21(4), 4796–4810 (2013). [CrossRef] [PubMed]
4. D. C. Sims, Y. Yue, and S. K. Nayar, “Towards flexible sheet cameras: deformable lens arrays with intrinsic optical adaptation,” in 2016 IEEE International Conference on Computational Photography (ICCP) (IEEE, 2016). [CrossRef]
6. G. Kim, N. Nagarajan, E. Pastuzyn, K. Jenks, M. Capecchi, J. Shepherd, and R. Menon, “Deep-brain imaging via epi-fluorescence computational cannula microscopy,” Sci. Rep. 7(1), 44791 (2017). [CrossRef] [PubMed]
8. G. Kim, N. Nagarajan, M. Capecchi, and R. Menon, “Cannula-based computational fluorescence microscopy,” Appl. Phys. Lett. 106(26), 261111 (2015). [CrossRef]
9. G. Kim and R. Menon, “An ultra-small 3D computational microscope,” Appl. Phys. Lett. 105, 061114 (2014). [CrossRef]
10. P. E. Debevec and J. Malik, “Recovering high dynamic range radiance maps from photographs,” in Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques (ACM Press/Addison-Wesley, 1997). [CrossRef]
11. P. C. Hansen, Discrete Inverse Problems: Insight and Algorithms (SIAM, 2010).
12. N. Antipa, “Single-shot diffuser-encoded light field imaging,” in 2016 IEEE International Conference on Computational Photography (ICCP) (IEEE, 2016). [CrossRef]
13. N. Antipa, G. Kuo, R. Heckel, B. Mildenhall, E. Bostan, R. Ng, and L. Waller, “DiffuserCam: lensless single-exposure 3D imaging,” Optica 5(1), 1–9 (2018). [CrossRef]
15. G. Kim, S. Kapetanovic, R. Palmer, and R. Menon, “Lensless-camera based machine learning for image classification,” arXiv:1709.00408 [cs.CV] (2017).