This paper presents a coded pinhole lens imaging system consisting of a pinhole array aperture followed by a thin convex lens, mounted onto a standard DSLR camera body. The combination of pinhole and lens raises two questions: (1) Is the new camera based on the pinhole or the lens imaging principle? (2) Can the lens improve the imaging quality? The study reveals that the camera is based on pinhole imaging, but that the lens improves its optical resolution from 0.44 mm (the pinhole size) to 0.042 mm, a significant improvement in imaging quality. The numerous pinholes on the aperture also improve the camera’s light throughput over that of a single-pinhole camera. The camera could be used in applications where a large depth of field is required and the illumination is poor.
© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement
Although lens-based cameras dominate both professional and amateur photography, they have a few weaknesses: (1) a limited depth of field; (2) a delay for automatic or manual focusing. By contrast, pinhole imaging offers (1) a virtually infinite depth of field and (2) no need for focusing, but suffers from low imaging quality and poor light-admitting capacity [1,2]. Numerous studies have been conducted to address these issues.
To extend the depth of field of lens cameras, several coded aperture (CA) approaches have been proposed. The basic idea of CA is to use specifically designed aperture patterns to preserve additional spatial information (e.g., orientations, phases, or frequencies) in a coded manner. The extra spatial information retrieved at decoding helps to enhance image fidelity or to provide additional features. Levin et al. [3] presented an aperture with an irregular but symmetric shape to obtain clear images and additional depth information. Veeraraghavan et al. [4] found that asymmetric apertures could preserve more high-frequency information and thus improve the reconstructed image quality. A comprehensive investigation of different coded aperture patterns can be found in [5].
Besides coded apertures, some researchers proposed inserting a pinhole [6] or micro-lens array [7] between the main camera lens and the sensor array inside the camera body, usually close to the sensor array, to record the light field and obtain multiple focal planes, although post-processing is needed. Such a prototype camera can use an f/4 aperture to achieve a depth of field equivalent to that of an f/22 aperture [7].
Pinhole cameras need not be concerned with depth of field, but they face challenges in light-admitting ability and optical resolution. Intuitively, an array of multiple pinholes can improve the light throughput. For example, the popular uniformly redundant array (URA) [8], proposed by Fenimore and Cannon in 1978, achieves a high light throughput of nearly 50% , so that the URA is capable of imaging low-intensity, low-contrast sources. Furthermore, there is always a corresponding post-processing array that can be convolved with the URA to obtain a delta function, easing the recovery of target images. By contrast, the autocorrelation of a nonredundant array (NRA) produces a spike with sidelobes, resulting in severe artifacts in the reconstruction . Other proposed pinhole arrays can be found in [9–15]. These coded pinhole arrays were mostly proposed for short-wavelength imaging (i.e., x-rays or gamma rays), where diffraction and interference are negligible and a smaller pinhole gives the imaging system a better optical resolution. This benefit, however, does not carry over to visible or infrared imaging: as the hole becomes smaller, diffraction (including interference) effects play an increasing role. The pinholes in the above apertures are too small or too dense to resist diffraction noise. For instance, Chi and George [16] attempted to use the URA coded aperture to image a simple letter “N” illuminated by visible light, achieving a recognizable but noisy result. The main drawback of the URA is its susceptibility to diffraction and interference. Later, DeWeert and Farm [17] proposed a separable doubly-Toeplitz mask to replace the URA. With this mask, the recorded image can be regarded as the latent image convolved with two Toeplitz matrices sequentially. Then, using the Toeplitz property, a novel decoding algorithm was developed to mitigate the diffraction noise, so that a clear image of a natural scene was achieved. Following a similar idea of double masks, Asif et al. [18] built a lensless camera named FlatCam, in which the mask was placed extremely close to the image sensor, and reported a significant improvement in imaging quality. Slinger et al. [19] also proposed a coded aperture with many square pinholes randomly spread over the mask to capture mid-wave infrared images of distant objects (around 1 km away). Although diffraction hampered the attainment of high-quality imaging, they valued the following merits of pinhole imaging: (1) reduction in mass and moment of inertia by elimination of the optical lens; (2) infinite depth of field; (3) negligible image aberrations and distortions; (4) low cost.
Motivated by the above studies, we previously proposed a random sparse coded aperture for lensless imaging [20]. A few square pinholes are randomly distributed on a thin black paper cardboard (fabricated by a high-precision laser cutter), which is then mounted onto a DSLR camera body (of which only the sensor array is used) to form a camera. The sparse distribution of pinholes reduces the interference between pinholes. The random arrangement of pinholes favors the use of compressive sensing theory [21] for decoding and offers an additional feature of encryption at imaging. That imaging system greatly improves the light-admitting capacity but still yields an imaging quality as poor as that of a single-pinhole camera. In the current study, we propose to add a thin convex lens to the previous camera structure to reduce the diffraction effect and thus improve the imaging quality. Although the modification is minor, the diffraction changes from the Fresnel model to the Fraunhofer model, which gives the proposed camera the potential to achieve an imaging quality comparable to that of common lens cameras.
2. Materials and methods
2.1 Camera design and construction
The proposed coded pinhole lens imaging system is illustrated in Fig. 1. The coded pinhole array aperture is lithographed on a thin, flat transparent film using advanced micro-manufacturing techniques with a resolution of 128,000 dpi. The mask is then closely attached (using glue) to a low-cost Holga lens set (Holga Co., Hong Kong) with a fixed focal length of 120 mm, and mounted onto a Canon EOS 600D camera body (Canon, Japan) to construct the proposed imaging system. The main parameters of the proposed imaging system are summarized in Table 1, with justifications given in the following paragraphs.
The lens-to-sensor distance is estimated according to the pinhole imaging model. A single pinhole camera adapted from the Holga lens set was used to image an object (a book); the object distance (from the book to the front surface of the pinhole camera) and the object size (the book length) were then measured manually using a caliper with 0.1 mm resolution. The image size was calculated as the product of the number of pixels spanned by the object and the pixel pitch. The estimated image distance is 55 mm, which is smaller than the lens focal length of 120 mm, so the imaging system cannot form a clear image if the lens imaging theorem is applied. The Canon camera was set to its lowest resolution, with a binned pixel pitch of 30 μm, because the proposed imaging system has an optical resolution of 42 μm, as explained in the next section.
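The estimate described above is a similar-triangles calculation, which can be sketched as follows. The function name and the sample measurement values are hypothetical and chosen only to illustrate the arithmetic; the measured object distance, object size and pixel count are not given in the text.

```python
def estimate_image_distance(object_distance_mm, object_size_mm,
                            object_pixels, pixel_pitch_mm):
    """Pinhole model: image_size / image_distance = object_size / object_distance."""
    image_size_mm = object_pixels * pixel_pitch_mm  # size of the object on the sensor
    return object_distance_mm * image_size_mm / object_size_mm


# Hypothetical measurement (illustrative values, not taken from the paper).
v_est = estimate_image_distance(
    object_distance_mm=1000.0,  # book to pinhole camera front surface
    object_size_mm=200.0,       # book length
    object_pixels=367,          # pixels spanned by the book in the photo
    pixel_pitch_mm=0.03,        # assumed pixel pitch
)
print(f"estimated lens-to-sensor distance: {v_est:.2f} mm")
```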
For simplicity, we assume that there are only two identical pinholes with a separation of $d$ on the mask and that a point light source lies on the optical axis at an object distance $u$. Let $v$ denote the lens-to-sensor distance and $f$ the lens focal length. First, if the lens is not considered, the displacement of the two pinhole images P1 and P2 is
$$\overline{P_1P_2} = d\,\frac{u+v}{u} = d + \frac{dv}{u}. \tag{1}$$
Then, the thin lens theory is applied, and a lens image is supposed to be formed at position I, at a distance $v_I$ behind the lens, so we have
$$\frac{1}{u} + \frac{1}{v_I} = \frac{1}{f}. \tag{2}$$
It should be noted that the lens-to-sensor distance is different from the lens focal length (actually $v < f$). Instead of forming an image I, the lens refracts the light rays toward the optical axis. As a result, P1 and P2 translate closer to each other. The new displacement of these two images Q1 and Q2 is triangulated as
$$\overline{Q_1Q_2} = d\,\frac{v_I - v}{v_I} = d\left(1-\frac{v}{f}\right) + \frac{dv}{u}. \tag{3}$$
In a similar way, it can be proven that objects off the optical axis follow the same conclusion as Eq. (3).
The displacement in Eq. (3) depends on the object distance $u$, and can be solved only if $u$ is known. In order to make the second term negligible, $u$ is set large enough so that $dv/u < p$, where $p$ is the one-pixel width (the smallest unit of the sensor array). Therefore, in the current study the test object was placed approximately 6 meters away from the camera. As a result, the image displacement of a point source passing through any two pinholes is constant and equal to $d(1 - v/f)$. In other words, the resulting imaging system is linear shift invariant (LSI).
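The geometry above can be checked numerically with a minimal sketch. The symbol names are assumptions introduced here for illustration: `d` is the separation of two points on the mask, `u` the object distance, `v` the lens-to-sensor distance and `f` the focal length. Applying the same relation to the two edges of one 0.44 mm pinhole gives a geometric spot of about 0.24 mm, in line with the value quoted in the text.

```python
def pinhole_image_separation(d_mm, u_mm, v_mm):
    """Separation on the sensor without a lens (similar triangles)."""
    return d_mm * (u_mm + v_mm) / u_mm


def pinhole_lens_image_separation(d_mm, u_mm, v_mm, f_mm):
    """Separation on the sensor when a thin lens bends the rays toward its image point."""
    v_image = 1.0 / (1.0 / f_mm - 1.0 / u_mm)  # thin-lens image distance (needs u > f)
    return d_mm * (v_image - v_mm) / v_image


d, v, f = 0.44, 55.0, 120.0   # mm: pinhole side, lens-to-sensor distance, focal length
u = 6000.0                    # mm: object at about 6 m
pixel = 0.03                  # mm: binned pixel pitch

constant_term = d * (1.0 - v / f)   # part that does not depend on the object distance
distance_term = d * v / u           # part that shrinks as the object moves away

print(f"without lens: {pinhole_image_separation(d, u, v):.4f} mm")
print(f"with lens:    {pinhole_lens_image_separation(d, u, v, f):.4f} mm")
print(f"constant term {constant_term:.4f} mm, distance term {distance_term:.4f} mm")
```

For these values the distance-dependent term is well below one pixel, which is why the spot size can be treated as independent of the object distance.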
A single pinhole camera usually adopts a compromise pinhole size so that the diffraction pattern mostly falls within the geometric image of the pinhole, and the hole size then sets the best optical resolution of the pinhole camera. The optimum diameter for a circular pinhole camera is given by Eq. (4) [1]. Therefore, for the proposed imaging system, the optimum pinhole size is 0.27 mm according to Eq. (4). In addition, to ensure that each pinhole image falls on pixel boundaries, the pinhole is square and the size of its geometric image is an integer multiple of the pixel size. Thus, the side length of the square pinhole is set to 0.44 mm after considering both Eq. (4) and the alignment of pinholes and sensor pixels, which produces a square geometric spot of 0.24 mm on the sensor array.
There are different geometric arrangements of mask and sensors . In each arrangement, there are usually two types of field of view (FOV): the fully coded field of view (FCFV) and the partially coded field of view (PCFV). In the current study, we adopt a coded aperture with a cyclic pattern, which can provide a wide FOV with a small sensor array [8,22]. For a 2 × 2 cyclic coded aperture, the sensor can be as small as one fourth of the aperture size and still record complete information, whereas for a non-cyclic pinhole array aperture the sensor must be four times the aperture size . However, in a cyclic coded aperture, light rays from the PCFV are difficult to analyze because of the incomplete coding, and they should be prevented from reaching the sensor. A feasible solution is to add an entrance pupil that precisely blocks the undesired light rays from the PCFV. Alternatively, as done here, the target images are displayed on an LCD monitor and captured in a dark environment to avoid light rays from the PCFV.
Similar to [20], a sparse arrangement of pinholes is adopted in this study to reduce the interference effect. A caveat is that the distance between any two pinholes must be no less than . Once the above requirements are satisfied, the pinholes can be arranged freely. A prototype CA is shown in Fig. 2(a). A cyclic version is employed to ensure that each point source contributes a complete cycle to the recorded picture. In other words, each pixel records a linear measurement of the entire image signal (for details refer to Section 2.3). The middle part outlined by the red square is the basic 16 × 16 pattern, and the whole CA is a cyclic 32 × 32 pattern. The implementation is shown in Fig. 2(b). In the prototype camera, the lens set was adapted from a commercial Holga lens set, and its field of view was limited by the physical structure [see the rear view of the lens set in Fig. 2(b)]. A sensor array with the same physical size as the basic pattern (3.84 × 3.84 mm) of the coded aperture is sufficient to record the information for recovery. The field of view can be enlarged by employing a larger aperture with more pinholes.
2.2 Point spread function analysis
Using the proposed imaging system, a superposition of multiple pinhole images is formed on the sensor array, so a decoding process is needed. Owing to the LSI property, the imaging system can be modelled as
If diffraction is disregarded, the PSF of a single pinhole can be taken as a delta-like function. Each white square in Fig. 2(a) represents a pinhole and corresponds to a ‘1’ in the PSF. The black part is light-proof, and each black square corresponds to a ‘0’ in the PSF. The interval between two ‘1’s equals the separation of the corresponding pinholes in units of the pinhole side length. The PSF for the basic pattern is shown in Fig. 3(a). The whole PSF is too big to show, but it is easy to derive because it is a cyclic version of the basic pattern.
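As an illustration of this binary PSF, the sketch below draws a random sparse 16 × 16 pattern with 20 pinholes and tiles it 2 × 2 into a cyclic 32 × 32 mask. The placement rule is an assumption made for this example (no two pinholes in adjacent cells, with wrap-around); the actual separation constraint and pinhole positions of the prototype are not reproduced here.

```python
import numpy as np


def cyclic_gap(a, b, size):
    """Distance between two indices on a ring of the given size."""
    gap = abs(a - b)
    return min(gap, size - gap)


def random_sparse_pattern(size=16, n_pinholes=20, min_sep=2, seed=0):
    """Greedy random placement; min_sep is a hypothetical spacing rule."""
    rng = np.random.default_rng(seed)
    chosen = []
    for idx in rng.permutation(size * size):
        r, c = divmod(int(idx), size)
        far_enough = all(
            max(cyclic_gap(r, pr, size), cyclic_gap(c, pc, size)) >= min_sep
            for pr, pc in chosen
        )
        if far_enough:
            chosen.append((r, c))
        if len(chosen) == n_pinholes:
            break
    if len(chosen) < n_pinholes:
        raise ValueError("could not place all pinholes; relax min_sep")
    pattern = np.zeros((size, size), dtype=np.uint8)
    for r, c in chosen:
        pattern[r, c] = 1   # '1' marks a pinhole, '0' the opaque film
    return pattern


basic = random_sparse_pattern()     # basic 16 x 16 pattern
mask = np.tile(basic, (2, 2))       # cyclic 32 x 32 version
print(basic.sum(), mask.shape, mask.sum())
```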
To obtain a better optical resolution, the diffraction of the pinhole must be considered. If the convex lens is taken away, the proposed system has a Fresnel number ($N_F = a^2/(\lambda v)$, with $a$ the pinhole half-width, $\lambda$ the wavelength and $v$ the lens-to-sensor distance) of 1.6, for which the diffraction follows the Fresnel model. After adding a lens with focal length $f$, the diffraction follows the Fraunhofer model provided that the image plane is on or beyond the focal plane ($v \ge f$). However, in the proposed system the image distance was smaller than the focal length ($v < f$), hence the diffraction pattern was investigated using both numerical analysis and physical tests.
We used Eq. (7) to simulate a Fresnel diffraction model of a square aperture, Eq. (8) to simulate a Fraunhofer diffraction model, Eq. (7) and (8) to simplify the computation.
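For orientation, the sketch below evaluates two textbook quantities for a square aperture: the Fresnel number and the normalized Fraunhofer intensity (a product of two squared sinc functions). It is not a reproduction of the paper's Eqs. (7) and (8). The wavelength of 550 nm and the use of the 55 mm lens-to-sensor distance as the propagation distance are assumptions made for this example.

```python
import numpy as np


def fresnel_number(half_width_mm, wavelength_mm, distance_mm):
    """N_F = a^2 / (lambda * z), with a the aperture half-width."""
    return half_width_mm ** 2 / (wavelength_mm * distance_mm)


def fraunhofer_square(x_mm, y_mm, side_mm, wavelength_mm, distance_mm):
    """Normalized far-field intensity of a square aperture.

    np.sinc(t) is sin(pi t) / (pi t), so the first zero lies at
    x = wavelength * distance / side.
    """
    sx = np.sinc(side_mm * x_mm / (wavelength_mm * distance_mm))
    sy = np.sinc(side_mm * y_mm / (wavelength_mm * distance_mm))
    return (sx * sy) ** 2


side = 0.44            # mm, pinhole side length
wavelength = 550e-6    # mm, assumed green light
distance = 55.0        # mm, assumed propagation distance

n_f = fresnel_number(side / 2, wavelength, distance)
first_zero = wavelength * distance / side
x = np.linspace(-0.2, 0.2, 401)
profile = fraunhofer_square(x, 0.0, side, wavelength, distance)

print(f"Fresnel number: {n_f:.2f}")
print(f"first zero of the sinc^2 pattern at {first_zero:.4f} mm")
```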
On the other hand, the PSF can be obtained through a physical test. A green Logitech laser pointer (Logitech, Switzerland), diffused by a small pinhole with a diameter of 2 mm, was used as a point light source and imaged by a square pinhole lens camera with only a single pinhole on the mask.
The results in Fig. 4 reveal that the diffraction pattern formed by the square pinhole lens aperture is closer to the Fraunhofer model than to the Fresnel model. A pure pinhole camera with a pinhole size of 0.44 mm has an optical resolution of 0.44 mm. If only the geometric property of the lens is considered, the single-pinhole lens camera has an optical resolution of 0.24 mm. However, if diffraction is taken into account, the single-pinhole lens camera can achieve an optical resolution of approximately 0.042 mm according to the simulation analysis. The proposed imaging system therefore improves the imaging quality over a pure pinhole camera.
The geometric spot from a single pinhole lens (one square pinhole with a thin lens) is 0.24 mm. Each pixel on the sensor array has a width of 30 μm, so the geometric spot is divided into an 8 × 8 grid. Considering both Figs. 4(b) and 4(c), a synthesized diffraction pattern on an 8 × 8 array is shown in Fig. 3(b). The PSF is then extended 8 times in each dimension by replacing each ‘1’ with the diffraction pattern and filling the rest with ‘0’s. Hence, the prototype coded pinhole lens camera can capture an image of 128 × 128 pixels.
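The 8× extension described above is equivalent to a Kronecker product of the binary pattern with the 8 × 8 spot. A minimal sketch follows; the random pinhole positions and the smooth placeholder spot are assumptions for illustration, not the pattern of Fig. 3(b).

```python
import numpy as np


def expand_psf(binary_psf, spot_pattern):
    """Replace every '1' by the spot pattern and every '0' by a block of zeros."""
    return np.kron(binary_psf, spot_pattern)


rng = np.random.default_rng(0)

# Placeholder 16 x 16 binary pattern with 20 pinholes (positions are illustrative).
binary_psf = np.zeros(16 * 16)
binary_psf[rng.choice(16 * 16, size=20, replace=False)] = 1.0
binary_psf = binary_psf.reshape(16, 16)

# Placeholder 8 x 8 spot, peaked at the center and normalized to unit sum.
axis = np.arange(8) - 3.5
profile = np.sinc(axis / 2.0) ** 2
spot_pattern = np.outer(profile, profile)
spot_pattern /= spot_pattern.sum()

psf = expand_psf(binary_psf, spot_pattern)
print(psf.shape)   # 16 * 8 = 128 samples per side
```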
The imaging system can be formulated as a convolution, Eq. (9). Equation (10) also shows that each sensor pixel records a linear measurement of the entire image.
Autocorrelation is often used to recover the latent image from Eq. (9). For instance, for the specific URA coded aperture, a conjugate array can be constructed so that its correlation with the aperture array is an ideal delta function, with sidelobes close to zero. However, it is difficult to find an ideal conjugate function for the proposed coded aperture. The autocorrelation of the PSF contains considerable sidelobes, so the reconstruction quality from autocorrelation is unacceptable. Besides, both the PSF and the measurement matrix are sparse matrices containing degenerate zero eigenvalues, so direct inversion is impossible. Since Eq. (9) is a convolution, a deconvolution can intuitively be used to solve it. Wiener deconvolution is chosen because it performs deconvolution and Wiener denoising simultaneously; it usually works in the frequency domain to facilitate computation. The Wiener filter is constructed as in Eq. (11). Both the Wiener deconvolution and the TVQC [23] methods were adopted to decode the images for comparison. The PSF is a 128 × 128 matrix derived from Figs. 2(a) and 2(b), and the measurement matrix is a 16384 × 16384 matrix formed by stacking the columns of the PSF into the first row, with each following row obtained by shifting the previous row by one pinhole size (for details refer to ). The PSF and the measurement matrix for the proposed camera are shown in Fig. 5.
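The relation between the convolution form and the matrix form can be illustrated on a small example. The sketch below assumes a cyclic (circular) 2D convolution with column-wise stacking and builds the matrix column by column from unit impulses, which is one equivalent way to obtain such a matrix rather than the row-shifting construction described in the text. An 8 × 8 PSF is used so that the matrix stays small (64 × 64 instead of 16384 × 16384).

```python
import numpy as np


def circular_conv2(h, x):
    """Cyclic 2D convolution computed in the frequency domain."""
    return np.real(np.fft.ifft2(np.fft.fft2(h) * np.fft.fft2(x)))


def measurement_matrix(h):
    """Matrix A such that A @ vec(x) == vec(circular_conv2(h, x)), columns stacked."""
    n_rows, n_cols = h.shape
    size = n_rows * n_cols
    A = np.zeros((size, size))
    for j in range(size):
        basis = np.zeros(size)
        basis[j] = 1.0
        impulse = basis.reshape(n_rows, n_cols, order="F")   # un-stack the columns
        A[:, j] = circular_conv2(h, impulse).ravel(order="F")
    return A


rng = np.random.default_rng(1)
h = (rng.random((8, 8)) < 0.1).astype(float)   # illustrative sparse binary PSF
x = rng.random((8, 8))                         # illustrative latent image

A = measurement_matrix(h)
y_matrix = (A @ x.ravel(order="F")).reshape(8, 8, order="F")
y_conv = circular_conv2(h, x)
print(A.shape, np.allclose(y_matrix, y_conv))
```

Each row of the matrix contains the same PSF values in shifted positions, so every sensor pixel is a weighted sum over the whole image.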
Because the Canon sensor array is larger than the required size, cyclic copies of the sample are recorded on it. The central one, corresponding to the basic pattern, is necessary and sufficient for reconstruction (refer to  for a detailed explanation). An exhaustive search within an estimated range was used to locate this central cycle (a 128 × 128 × 3 matrix), taking the candidate that achieved the best reconstruction quality. The three color channels were decoded independently.
Owing to the imperfect opacity of the film and the dark current of the image sensor, severe noise remained in the recorded images. A larger penalty coefficient reduces the noise but also loses fine image details. Therefore, a dark frame was captured by imaging a point light source through an aperture made of fully black film, and was then subtracted from subsequent images to reduce this noise. This is an empirical way to reduce the effect of the residual transparency of an aperture printed on film. This step is unnecessary if the aperture is made from an opaque material such as metal.
The recovery algorithms were implemented in MATLAB 2016a. The penalty parameter for TVQC and the noise-to-signal power ratio of the additive noise for Wiener deconvolution were tuned empirically, with values of 0.01 and 0.01, respectively. The decoding process is summarized below.
|Step 1:|Corrected photo = raw 720 × 480-pixel photo − dark frame.|
|Step 2:|Choose one cycle image (128 × 128 × 3) from the center of the photo.|
|Step 3(a):|Invoke the MATLAB function deconvwnr with the one-cycle image, the PSF and a noise-to-signal ratio of 0.01, where the PSF is exactly that of Fig. 5(a); the reconstruction is then obtained.|
|Step 3(b):|Call the TVQC algorithm, where the observation is the one-cycle image, the measurement matrix is that of Fig. 5(b), the TV prior is used as the penalty and the penalty parameter is set to 0.01; the reconstruction is then obtained. The R, G and B channels are decoded independently.|
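Steps 1, 2 and 3(a) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the MATLAB implementation used in the study: the imaging is modelled as a cyclic convolution with the PSF origin at index [0, 0], the Wiener filter uses a constant noise-to-signal ratio, and the position of the central cycle is passed in as `top_left` instead of being found by exhaustive search. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def wiener_deconvolve(observed, psf, nsr=0.01):
    """Frequency-domain Wiener deconvolution with a constant noise-to-signal ratio."""
    H = np.fft.fft2(psf, s=observed.shape)
    G = np.conj(H) / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(G * np.fft.fft2(observed)))


def decode(raw, dark, psf, top_left, nsr=0.01):
    """Dark-frame subtraction, one-cycle crop, then per-channel Wiener decoding."""
    corrected = raw.astype(float) - dark.astype(float)            # Step 1
    n = psf.shape[0]
    r0, c0 = top_left
    cycle = corrected[r0:r0 + n, c0:c0 + n, :]                    # Step 2
    channels = [wiener_deconvolve(cycle[..., k], psf, nsr)        # Step 3(a)
                for k in range(cycle.shape[2])]
    return np.stack(channels, axis=-1)


# Synthetic, noise-free demonstration on a small 16 x 16 problem.
rng = np.random.default_rng(0)
n = 16
psf = np.zeros(n * n)
psf[rng.choice(n * n, size=20, replace=False)] = 1.0   # 20 illustrative pinholes
psf = psf.reshape(n, n)
scene = rng.random((n, n, 3))

one_cycle = np.stack(
    [np.real(np.fft.ifft2(np.fft.fft2(psf) * np.fft.fft2(scene[..., k])))
     for k in range(3)],
    axis=-1,
)
dark = np.full((2 * n, 2 * n, 3), 5.0)                 # constant offset as a stand-in
raw = np.tile(one_cycle, (2, 2, 1)) + dark             # 2 x 2 cyclic copies on the sensor

recovered = decode(raw, dark, psf, top_left=(n, n), nsr=1e-6)
rel_err = np.linalg.norm(recovered - scene) / np.linalg.norm(scene)
print(recovered.shape, f"relative error: {rel_err:.3g}")
```

A very small `nsr` is used in the demonstration because the synthetic data contain no noise; with real photos a larger value such as the 0.01 quoted above trades fine detail for noise suppression.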
3. Results and discussion
3.1 Imaging point light source
The imaging results for a single point light source are shown in Fig. 6. The results show that (1) no interference noise between pinholes is observed; (2) the diffraction effect becomes more serious if the image is over-exposed; (3) green noise is observed spreading over the whole image, which is caused by the imperfect opacity of the black part of the film.
There are 80 pinholes in total on the mask, and the same number of point images is observed in the image. This demonstrates that the proposed imaging system is based on pinhole optics rather than on the lens imaging theorem. For each pinhole image, the diffraction pattern is closer to the Fraunhofer model than to the Fresnel model. All observations are consistent with the analysis in Section 2.
3.2 Imaging practical scene
Images with complex texture were tested as well. To investigate the significance of the lens in the proposed imaging system, the pinhole array aperture was examined both with and without the lens. Without the thin lens, the camera system has an optical resolution of 0.44 mm, which is too coarse to form a sharp image. The optimum pinhole size is around 0.27 mm for an image distance of 55 mm. Therefore, a commercial Holga pure pinhole camera (pinhole diameter of 0.25 mm) was used to take photos for comparison. On the basis of empirical tests, the proposed camera was set to ISO 3200 and Tv 1/8 s, while the Holga pinhole camera was set to ISO 6400 and Tv 1 s. The Holga pinhole images are slightly smaller because that camera has a shorter focal length (pinhole-to-sensor distance of 50 mm). Two test cases are shown in Fig. 7 and Fig. 8. More results for different scenes can be found in the supplementary materials.
Because the sensor array in the Canon camera is much bigger than required, reconstruction from a full recorded raw image yields a cyclic 2 × 2 version of the target image (see Fig. 9), so only the central cycle of the recorded pixels is used for the above reconstructions. In the prototype camera each pixel recorded a superposition of images from a basic pattern with 20 square pinholes, so the proposed system was expected to have a light-admitting efficiency 20 times that of a single pinhole. In practice, the proposed camera was set to ISO 3200 and Tv 1/8 s while the Holga pinhole camera was set to ISO 6400 and Tv 1 s, so the former used roughly 1/16 of the exposure of the latter (the higher ISO speed of the Holga pinhole camera is counted as doubling the time). Even so, the Holga pinhole images still looked dimmer than the images taken by the proposed camera. Without the convergence provided by the lens, the pure pinhole array imaging achieved an insignificant improvement in light gathering.
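The 1/16 figure follows from treating a doubling of ISO as equivalent to a doubling of the exposure time, which can be written out as a short calculation. The helper name is hypothetical and the ISO-time equivalence is the simplification stated in the text.

```python
def relative_exposure(iso, shutter_s):
    """Exposure proxy: ISO speed times shutter time (ISO doubling counted as time doubling)."""
    return iso * shutter_s


proposed = relative_exposure(3200, 1 / 8)   # proposed pinhole array lens camera
holga = relative_exposure(6400, 1.0)        # commercial Holga pinhole camera
ratio = proposed / holga

print(f"proposed / Holga exposure: {ratio} (about 1/{round(1 / ratio)})")
```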
In terms of imaging quality, both the single pinhole and the pinhole array suffer from inferior imaging quality due to Fresnel diffraction. The pinhole size used in the pinhole array imaging was not optimal, so its images were even worse than those from the commercial pinhole camera. Both the pinhole array and the pinhole array lens imaging systems used the same PSF (Fig. 5) and decoding algorithms for reconstruction. A PSF derived from Fresnel diffraction analysis might improve the imaging quality of the pinhole array system. With the help of a thin lens, the reconstructed images from the pinhole array lens camera presented much finer details, as seen in the sharp edges of the letter R and the house. The optical resolutions of the three imaging systems were 0.44 mm (pure pinhole array camera), 0.25 mm (commercial single pinhole camera) and 0.042 mm (proposed pinhole array lens camera), respectively, which is reflected in their visual imaging quality. Meanwhile, 80 independent images were observed in the raw images, which confirms that the proposed imaging system follows pinhole optics. TVQC and Wiener deconvolution achieved comparable reconstruction quality, although the TVQC reconstructions contained slightly less noise owing to the total variation (TV) prior used in the regularization. However, Wiener deconvolution is much faster than TVQC, with a decoding time of 0.004 seconds versus 7.2 seconds, measured with the MATLAB implementation and averaged over 10 trials.
The study demonstrated a new coded pinhole lens imaging system. The proposed system is based on pinhole optics, so neither automatic nor manual focusing is needed. As opposed to lens imaging, the image distance in the proposed system is smaller than the lens focal length, so a compact design is possible. The aperture with many pinholes makes the imaging system efficient in light throughput without sacrificing depth of field. The system also has the benefits of simplicity and low cost. Being based on pinhole optics, the system is expected to have an infinite depth of field. To enable the prototype camera to be used outdoors, a precise lens-set pupil is needed to block light from the PCFV. Square pinholes are used because they align the geometric image with the square sensor cells; however, other pinhole shapes are worth exploring. Once the coded aperture pattern is defined, the recovery algorithms can be prepared prior to imaging so that the decoding can run in real time.
A coded pinhole lens camera was introduced, aiming to inherit the merits of both pinhole and lens optics. The combination of pinhole and lens has rarely been studied previously. This study demonstrated that even when the lens-to-sensor distance is smaller than the lens focal length, the proposed system still obtains clear images through pinhole imaging, and that the lens improves its optical resolution (for a pinhole size of 0.44 mm) from 0.44 mm (pinhole resolution) to 0.042 mm (diffraction-limited resolution). The study also showed that the proposed imaging system is subject to Fraunhofer diffraction rather than Fresnel diffraction, which means that the proposed camera may be able to achieve an optical resolution comparable to that of a lens camera. Meanwhile, the numerous pinholes may enable the proposed camera to achieve a better light-admitting efficiency than common lens cameras. Besides, as the proposed camera is based on pinhole imaging, it is presumed to have an infinite depth of field. Therefore, the proposed system could be suitable for applications where a large depth of field is required and the illumination is weak.
Support of CQUniversity ECR fellowship to ZW is acknowledged. The authors also thank Dr. Ivan Lee of UniSA for his feedback and proofreading of the manuscript.
2. K. Sayanagi, “Pinhole Imagery,” J. Opt. Soc. Am. 57(9), 1091–1098 (1967).
3. A. Levin, R. Fergus, F. Durand, and W. T. Freeman, “Image and depth from a conventional camera with a coded aperture,” ACM Trans. Graph. 26(3), 701–709 (2007).
4. A. Veeraraghavan, R. Raskar, A. Agrawal, A. Mohan, and J. Tumblin, “Dappled photography: mask enhanced cameras for heterodyned light fields and coded aperture refocusing,” ACM Trans. Graph. 26(3), 6901–6912 (2007).
5. C. Zhou and S. Nayar, “What are good apertures for defocus deblurring?” IEEE International Conference on Computational Photography (ICCP), 2009, pp. 1–8.
6. E. H. Adelson and J. Y. Wang, “Single lens stereo with a plenoptic camera,” IEEE Trans. Pattern Anal. Mach. Intell. 14(2), 99–106 (1992).
7. R. Ng, M. Levoy, M. Brédif, G. Duval, M. Horowitz, and P. Hanrahan, “Light field photography with a hand-held plenoptic camera,” Computer Science Technical Report CSTR2, 11 (2005).
10. R. H. Dicke, “Scatter-hole cameras for X-rays and gamma rays,” Astrophys. J. 153, L101–L106 (1968).
11. S. R. Gottesman and E. J. Schneid, “PNP - a new class of coded aperture arrays,” IEEE Trans. Nucl. Sci. 33(1), 745–749 (1986).
14. C. Thomas, G. Rehm, F. Ewald, and J. Flanagan, “Large aperture X-ray monitors for beam profile diagnostics,” in G. R. Ian Martin (Ed.), Proceedings of International Beam Instrumentation Conference, Oxford, UK, 2013.
15. W. Wen, S. Wang, Y. Zou, H. Li, S. Liu, G. Tang, Y. Lu, and Z. Guo, “Design of moderator of a compact accelerator-driven neutron source for coded source imaging,” Phys. Procedia 60, 144–150 (2014).
17. M. J. DeWeert and B. P. Farm, “Lensless coded aperture imaging with separable doubly-Toeplitz masks,” Proc. SPIE 9109, 91090Q (2014).
18. M. S. Asif, A. Ayremlou, A. Veeraraghavan, R. Baraniuk, and A. Sankaranarayanan, “FlatCam: Replacing Lenses with Masks and Computation,” Proceedings of IEEE International Conference on Computer Vision Workshop (ICCVW), 2015, pp. 663–666.
19. C. Slinger, M. Eismann, N. Gordon, K. Lewis, G. McDonald, M. McNie, D. Payne, K. Ridley, M. Strens, G. De Villiers, and R. Wilson, “An investigation of the potential for the use of a high resolution adaptive coded aperture system in the mid-wave infrared,” Proc. SPIE 6714(8), 671408 (2007).
20. Z. Wang and I. Lee, “Random sparse coded aperture for lensless imaging,” in J. Keyser and P. Wonka (Eds.), the 22nd Pacific Conference on Computer Graphics and Applications, Seoul, 2014, pp. 49–54.
21. E. J. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Inf. Theory 52(2), 489–509 (2006).
22. E. Caroli, J. B. Stephen, G. Dicocco, L. Natalucci, and A. Spizzichino, “Coded aperture imaging in X-ray and gamma-ray astronomy,” Space Sci. Rev. 45(3–4), 349–403 (1987).
23. E. Candès and J. Romberg, “l1-magic: Recovery of sparse signals via convex programming,” URL: http://statweb.stanford.edu/~candes/l1magic/downloads/l1magic.pdf, available on March 4, 2018.
24. M. Figueiredo, R. Nowak, and S. Wright, “Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems,” IEEE J. Sel. Top. Signal Process. 1(4), 586–597 (2007).
25. S. R. Gottesman, “Coded apertures: past, present, and future application and design,” Proc. SPIE 6714(5), 671405 (2007).