
Extended depth of field for Fresnel zone aperture camera via fast passive depth estimation

Open Access

Abstract

The lensless camera with incoherent illumination has gained significant research interest for its thin and flexible structure. However, it faces challenges in resolving scenes with a wide depth of field (DoF) due to its depth-dependent point spread function (PSF). In this paper, we present a single-shot method for extending the DoF of Fresnel zone aperture (FZA) cameras at visible wavelengths through passive depth estimation. An improved ternary search method is utilized to determine the depth of targets rapidly by evaluating the sharpness of the back-propagation reconstruction. Based on the depth estimation results, a set of reconstructed images focused on targets at varying depths is derived from the encoded image. The DoF is then extended through focus stacking. The experimental results demonstrate an 8-fold increase compared with the calibrated DoF at a depth of 130 mm. Moreover, our depth estimation method is five times faster than the traversal method while maintaining the same level of accuracy. The proposed method facilitates the development of lensless imaging in practical applications such as photography, microscopy, and surveillance.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Recently, lensless cameras have garnered significant attention because they break free from traditional camera architectures [1]. In addition to the inherent advantages of compact size, light weight, and low cost, lensless cameras offer the unique ability to computationally recover light field information when combined with specialized mask structures. Meanwhile, this design flexibility paves the way for miniaturized and multifunctional cameras, making them highly versatile. Therefore, lensless cameras show great potential for a range of applications, including photography [2], wearables [3], microscopy [4], the Internet of Things (IoT) [5], and virtual/augmented reality [6].

Lensless imaging breaks the one-to-one mapping of traditional imaging and instead captures highly multiplexed measurements [7]. Since point light sources at different depths yield distinct modulation patterns, the PSF of lensless imaging is depth-dependent [8]. To reconstruct the scene accurately, sophisticated algorithms and a calibrated PSF are indispensable. Recent studies have obtained superior results on single planar targets at a predetermined depth [9–14] and highlight the importance of using a matching PSF during reconstruction. The accuracy of the PSF directly impacts the quality of image reconstruction, and an inaccurate PSF leads to blur and artifacts in the reconstructed images [15]. The depth-dependent PSF poses challenges for resolving scenes across a large DoF, thereby limiting practical application in real-world scenarios [16].

DoF is a pivotal parameter in imaging that profoundly influences the quality of acquired spatial information. In conventional imaging, the DoF can be controlled by adjusting the aperture size and shape, or by changing the focal length [17,18]. Alternatively, a cubic phase mask can be utilized to produce a nearly defocus-invariant PSF [19]. A similar concept has been introduced to lensless imaging, where mainstream research has focused on designing masks with defocus-invariant PSFs. End-to-end optimization frameworks have been utilized to optimize the design details and improve the reconstruction process [20–23]. However, these methods require specific structural designs and complex manufacturing processes, limiting their versatility and widespread applicability.

In fact, if the depth of the target plane relative to the mask is determined in advance, the scene can be accurately reconstructed using the appropriate PSF. By utilizing focus stacking methods, the challenge of DoF extension can then be addressed effectively [24,25]. Inspired by autofocus algorithms, target depth estimation becomes feasible. For example, Liu et al. proposed an autofocusing method to address the QR code recognition problem in FZA lensless imaging [26]. This method assesses the sharpness of images reconstructed with PSFs within a defined search range to determine the target depth. Furthermore, the standardized structure of the FZA enables rapid PSF computation at various depths [27], significantly reducing the demand for PSF calibration. While increasing the number of reconstructed images can enhance depth estimation accuracy, it also compromises computational efficiency.

In this paper, we propose a single-shot DoF extension method for lensless imaging using an FZA. Our approach incorporates an improved ternary search method to efficiently compress the search range and rapidly identify the optimal depth by evaluating the reconstructed images with the nuclear norm of the gradient (NoG) [28]. To enhance reconstruction quality, a physical constraint of positive reconstruction intensity is integrated into the imaging model [29]. Combined with the alternating direction method of multipliers (ADMM) algorithm [30], scene re-focusing and noise suppression are achieved effectively. Finally, the DoF is extended through focus stacking. Experimental calibration shows that the DoF at a depth of 130 mm is 3 cm, and our experimental results demonstrate an 8-fold DoF extension at this depth, highlighting the practical utility of our approach. Furthermore, our depth estimation method offers a significant speedup, being five times faster than the traversal method while maintaining the same level of accuracy. Therefore, the proposed method facilitates better performance in various applications of lensless imaging systems, including photography, microscopy, surveillance, and facial recognition.

2. Methodology

2.1 FZA imaging

The pipeline of the proposed single-shot DoF extension method is shown in Fig. 1. A simple FZA mask is placed a few millimeters in front of the sensor. The FZA is a binary mask with a regular structure determined by the innermost zone radius $r_0$. Unlike random masks, its structure and transmittance function can be calculated accurately, thereby avoiding a complex PSF calibration process. The structure of the FZA in the $x$-$y$ plane can be represented as:

$$FZA(x,y) = \left\{{\frac{1}{2} + \frac{1}{2} sgn \left[{\cos \left( {\boldsymbol{\pi}\frac{{x^2} + {y^2}}{{r_0^2}}} \right)} \right]} \right\} circ(x,y,R),$$
where $R$ is the pupil radius, $circ$ is the circle function, and $sgn$ is the signum function. As shown in Fig. 2(a)-(b), the PSF of the FZA camera exhibits a clear dependence on depth: point sources at different depths generate PSFs with similar structures but different sizes. This enables depth information of the target scene to be embedded into the encoded image [31].
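As a concrete illustration of Eq. (1), the following minimal sketch (not the authors' code) samples the FZA transmittance on a pixel grid. The grid size and the choice of sampling at the sensor pixel pitch are assumptions made here for illustration; $r_0$ and $R$ follow the values reported in Section 3.

```python
import numpy as np

def fza_mask(n_pix=2560, pitch=3.75e-3, r0=0.325, R=4.55):
    """Binary FZA transmittance of Eq. (1) on an n_pix x n_pix grid (units: mm).
    With the defaults, the grid spans about 9.6 mm, covering the 4.55 mm pupil radius."""
    coords = (np.arange(n_pix) - n_pix / 2) * pitch
    x, y = np.meshgrid(coords, coords)
    rho2 = x**2 + y**2
    zones = 0.5 + 0.5 * np.sign(np.cos(np.pi * rho2 / r0**2))  # binarized cosine zones
    pupil = (rho2 <= R**2).astype(float)                        # circ(x, y, R)
    return zones * pupil

mask = fza_mask()
```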

Fig. 1. Overview of our proposed single-shot DoF extension method. The encoded image is captured by the FZA camera. According to the estimated depth, the corresponding PSFs are fed into the ADMM algorithm to reconstruct targets at different depths. The DoF is extended through focus stacking.

Fig. 2. (a),(b) The PSF varies with the depth of the point source. (c) The structural similarity index measure (SSIM) between PSFs at varying depths and the reference PSFs at 100 mm, 200 mm, 300 mm, and 400 mm, respectively. As depth increases, the difference between PSFs decreases.

A point source propagated to the FZA surface over a distance $z$ is represented as ${U_1} = \frac {1}{{i\lambda z}}{e^{i\frac {{2\boldsymbol {\pi } }}{\lambda }\sqrt {{x^2} + {y^2} + {z^2}} }}$. After being encoded by the FZA, $U_1$ propagates a very short distance $d$ to the sensor surface. The PSF can be calculated in the frequency domain $(u,v)$ by the angular spectrum method [32].

$${PSF}(x,y) = \int\limits_\lambda {\left| {{\mathcal{F}^{ - 1}}\left[{\mathcal{F}({U_1} \cdot FZA) \cdot {e^{i2\boldsymbol{\pi} d\sqrt {\frac{1}{{{\lambda ^2}}} - {u^2} - {v^2}} }}} \right]} \right|^2} \boldsymbol{d}\lambda,$$
where $\lambda$ is the wavelength, and $\mathcal {F}$ and $\mathcal {F}^{ - 1}$ are the Fourier transform and inverse Fourier transform operators, respectively. According to the lensless imaging process, the encoded image $I$ can be represented as the convolution of the target image $O$ with the PSF,
$$I(x,y) = O(x,y) * PSF(x,y) + n(x,y),$$
where $n(x,y)$ is the system noise. Using the PSF calculated by Eq. (2), the target image can be reconstructed with high quality through iterative algorithms such as ADMM. In our subsequent algorithms, target depth estimation requires multiple reconstructions with PSFs at varying depths, for which the back propagation (BP) method is suitable for rapid reconstruction [27]. Since the FZA is a binary approximation of the continuously varying transmittance of the Gabor zone plate (GZP), we employ the transmission function of the GZP to approximate the PSF of the FZA, facilitating the derivation of the BP method. The transmission function can be represented as Eq. (4).
$$T(x,y) = \frac{1}{2} + \frac{1}{2}\cos \left( {\boldsymbol{\pi}\frac{{{x^2} + {y^2}}}{{r_1^2}}} \right),$$
where $r_1$ can be expressed as ${r_1} = (1 + d/z){r_0}$. Then by expanding Eq. (4) using Euler’s formula and subsequently substituting the result into Eq. (3), we can derive $I(x,y)$ as
$$I(x,y) = C + \frac{1}{2}{\mathop{\rm Re}\nolimits} \left[ { O(x,y) * {e^{i(\boldsymbol{\pi} /r_1^2)({x^2} + {y^2})}}} \right] + n(x,y),$$
where $C$ is a constant component. The convolution kernel in frequency domain is expressed as Eq. (6),
$$H(u,v) = i\exp \left[ { - i\boldsymbol{\pi} r_0^2({u^2} + {v^2}){(1 + d/z)^2}} \right].$$

Then, the reconstruction model based on back propagation is expressed as Eq. (7),

$$O(x,y) = {\mathop{\rm Re}\nolimits} \{ {\mathcal{F}^{ - 1}}\{\mathcal{F}[I(x,y)]/H(u,v)\}\}.$$

Equation (6) shows that the target depth $z$ has a significant impact on the PSF, and the clarity of the reconstructed image depends on the accuracy of the PSF. Therefore, the target depth can be estimated by assessing the clarity of reconstructed images. Furthermore, the accuracy of the PSF can be evaluated using the SSIM, as depicted in Fig. 2(c). As the depth of the PSF used for reconstruction approaches the actual target depth, the SSIM of the PSF increases, resulting in clearer reconstructed images. This depth-dependent PSF suggests that the DoF in lensless imaging is related to the SSIM of the PSF.
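To make the BP reconstruction concrete, the sketch below (an assumed implementation, not the authors' code) builds the analytic kernel $H(u,v)$ of Eq. (6) for a candidate depth $z$ and applies Eq. (7). The default geometry ($r_0$, mask-sensor distance $d$, and pixel pitch) follows the experimental values in Section 3.

```python
import numpy as np

def bp_reconstruct(I, z, r0=0.325, d=3.41, pitch=3.75e-3):
    """Back-propagation reconstruction, Eqs. (6)-(7). I: encoded image; z, d, r0, pitch in mm."""
    ny, nx = I.shape
    u = np.fft.fftfreq(nx, d=pitch)               # spatial frequencies (cycles/mm)
    v = np.fft.fftfreq(ny, d=pitch)
    U, V = np.meshgrid(u, v)
    # Eq. (6): H(u, v) = i * exp[-i*pi*r0^2*(u^2 + v^2)*(1 + d/z)^2]
    H = 1j * np.exp(-1j * np.pi * r0**2 * (U**2 + V**2) * (1 + d / z)**2)
    # Eq. (7): O(x, y) = Re{ F^{-1}[ F(I) / H ] }; |H| = 1, so the division is stable
    return np.real(np.fft.ifft2(np.fft.fft2(I) / H))
```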

2.2 Depth estimation

Due to the absence of reference images for target depth estimation, it is necessary to rely on no-reference image quality metrics to evaluate the reconstructed images. These metrics typically extract image sharpness features from common characteristics among pixels in the spatial and frequency domains. The metric is selected from popular options such as NoG [28], the Tamura of the gradient (ToG) [33], the Laplacian (LAP), the gradient (GRA) [34], and the sum of modulus of gray difference (SMD) [26]. To ensure accurate and rapid target depth estimation, the chosen metric needs to be unimodal, sensitive to image sharpness, and preferably robust to noise. To determine the most suitable metric, we simulate encoded images with a target at 150 mm and 300 mm, and reconstruct these images with PSFs corresponding to different depths. The normalized sharpness evaluation curves for the different metrics are presented in Fig. 3.

Fig. 3. The sharpness evaluation curves of different metrics. (a),(b) are the encoded images at object distances of 150 mm and 300 mm. (c),(d) are the sharpness evaluation curves of the encoded images without noise. (e),(f) are the sharpness evaluation curves of the encoded images with Gaussian noise of variance 0.01 added. (g)-(j) are zoom-in curves.

Intriguing results emerge from the analysis of Fig. 3. The limiting width of the curve narrows as the target approaches, which aligns well with the observation that the difference between PSFs increases as the depth decreases, as shown in Fig. 2(c). Among the options considered, LAP and SMD are susceptible to noise, and the depth corresponding to the peak of the GRA curve deviates from the actual depth. The limiting width of the ToG curve is broader than that of the others, suggesting a lower sensitivity of ToG compared to NoG. Thus, NoG is chosen as the sharpness evaluation metric for the subsequent algorithms due to its balanced attributes of unimodality, sensitivity, and robustness. The NoG can be formulated as

$$\text{NoG} ={\| {{{(\nabla {f_x})}^2} + {{(\nabla {f_y})}^2}} \|_F},$$
where $f$ is the image to be evaluated, $\nabla {f_x}$ and $\nabla {f_y}$ are the image gradients along the $x$ and $y$ directions, respectively. Here ${\| \cdot \|_F}$ denotes the Frobenius norm. Combined with Eq. (6), the passive depth estimation model can be expressed as
$$z = \arg \max {\rm{NoG}} \{ {\mathcal{F}^{ - 1}}\left[{ \mathcal{F}(I)/H} \right] \}.$$
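A minimal sketch of this evaluation is given below, implementing Eq. (8) as written (Frobenius norm of the squared-gradient map) and scoring the BP reconstruction of Eq. (9) at a candidate depth. Here `bp_reconstruct` is the helper sketched in Section 2.1, and the snippet is illustrative rather than the authors' implementation.

```python
import numpy as np

def nog(f):
    """Sharpness metric of Eq. (8): || (grad_x f)^2 + (grad_y f)^2 ||_F."""
    gy, gx = np.gradient(f.astype(float))
    return np.linalg.norm(gx**2 + gy**2, ord='fro')

def depth_objective(I, z):
    """Eq. (9): sharpness of the BP reconstruction at candidate depth z (to be maximized)."""
    return nog(bp_reconstruct(I, z))
```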

The NoG as a function of $z$ exhibits a curve similar to that of a concave function. The ternary search method is therefore employed to expedite the search for the global maximum and enhance autofocus efficiency. Four potential scenarios, shown in Fig. 4(a)-(d), may arise during the depth estimation process. By comparing the clarity of the reconstructed images at $z_1$ and $z_2$, we discard the less sharp side to quickly compress the search range.

Fig. 4. (a)-(d) Four possible cases of the ternary search method. $z_0$-$z_3$ is the whole depth range, $z_1$ and $z_2$ are the trisection points, and the red slash denotes the discarded range. The global maxima in (a) and (d) are to the right of $z_2$ and to the left of $z_1$, respectively. The global maxima in (b),(c) lie between $z_1$ and $z_2$. We drop the depth range $z_0$-$z_1$ when NoG($z_1$)<NoG($z_2$), and drop the depth range $z_2$-$z_3$ when NoG($z_1$)>NoG($z_2$). (e) Flow chart of the improved ternary search method. The unit in this figure is mm.

The NoG curve may fluctuate due to noise and diffraction, as shown in Fig. 3, which may impact the accuracy of depth estimation. To address this matter, an improved ternary search method is proposed to enhance robustness. As shown in Fig. 4(e), the results from previous iterations are stored in $Z$, and the depths giving the maximum and minimum NoG within the latest depth range are selected as the new ternary search positions. By incorporating information from previous iterations, the method is no longer limited to the trisection points of the current cycle, and the improved fault tolerance enhances the accuracy of target depth estimation. Moreover, in scenarios involving multi-object scenes, the assumption that a single globally best PSF exists becomes untenable, because objects may be located at different depths, exhibit varying degrees of blur, and require distinct PSFs for accurate reconstruction. To address this challenge, preprocessing becomes a pivotal step before estimating the target distance. This preprocessing is an adaptive object region extraction technique comprising edge detection and cluster analysis. By isolating specific target areas, the improved ternary search method determines the intended target depth more accurately, enhancing its overall effectiveness in complex visual environments.
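A minimal sketch of the basic range-compression loop is shown below (simplified from Fig. 4, and not the authors' implementation); the reuse of previously evaluated depths that defines the improved variant, as well as the object-region preprocessing, are omitted for brevity. `depth_objective` is the assumed sharpness evaluator sketched above, and the default search range follows the 70 mm to 300 mm span used in the experiments.

```python
def ternary_search_depth(I, z_lo=70.0, z_hi=300.0, tol=1.0):
    """Find the depth (mm) that maximizes the sharpness of the BP reconstruction."""
    while z_hi - z_lo > tol:
        z1 = z_lo + (z_hi - z_lo) / 3.0      # trisection points of the current range
        z2 = z_hi - (z_hi - z_lo) / 3.0
        if depth_objective(I, z1) < depth_objective(I, z2):
            z_lo = z1                        # maximum lies to the right of z1; drop [z_lo, z1]
        else:
            z_hi = z2                        # maximum lies to the left of z2; drop [z2, z_hi]
    return 0.5 * (z_lo + z_hi)
```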

2.3 DoF extension

Using the target depths obtained from the improved ternary search method, we calculate the PSFs based on the diffraction process. Image reconstruction is then achieved by minimizing Eq. (10). A total variation (TV) regularization term is added after the least-squares data term to enhance the clarity of the image. A second regularization term constrains the intensity of the reconstructed scene to be nonnegative [29], aiming to remove background noise from the reconstructed image.

$$x = \mathop {\arg \min }_x \frac{1}{2}\left\| {Hx - b} \right\|_2^2 + \tau {\left\| \varphi x \right\|_1} + {1_ + }(x),$$
where $\varphi$ and ${\left \| \cdot \right \|_1}$ represent the 2D gradient operator and the ${l_1}$ norm, respectively, $H$ denotes convolution with the PSF, $b$ is the encoded image, and $x$ is the target image. ${{1_ + }( \cdot )}$ is the nonnegativity barrier function that returns 0 when the argument is nonnegative and infinity otherwise.

Because Eq. (10) is difficult to solve directly, we apply the ADMM approach. First, we introduce two auxiliary variables $u=\varphi x$ and $w=x$, and formulate the augmented Lagrangian function as

$$\mathcal{L}(x,u,w,y_1,y_2)= \frac{1}{2}\left\| {Hx - b} \right\|_2^2 + \tau {\left\| u \right\|_1} + \frac{{{\mu _1}}}{2}\left\| {\varphi x - u + \frac{{{y_1}}}{{{\mu _1}}}} \right\|_2^2+ {1_ + }(w) + \frac{{{\mu _2}}}{2}\left\| {x - w + \frac{{{y_2}}}{{{\mu _2}}}} \right\|_2^2,$$
where $y_1, y_2$ are dual variables and $\mu _1,\mu _2$ are penalty parameters. Based on the ADMM method, the solution of Eq. (10) can then be acquired by solving the following subproblems for $x,u,w$ and updating the dual variables iteratively until convergence.
$$\left\{ \begin{array}{l} \begin{aligned} x^{k+1} & =\arg\min \mathcal{L}(x^{k},u^{k},w^{k},y^{k}_1,y^{k}_2)\\ u^{k+1} & =\arg\min \mathcal{L}(x^{k+1},u^{k},w^{k},y^{k}_1,y^{k}_2)\\ w^{k+1} & =\arg\min \mathcal{L}(x^{k+1},u^{k+1},w^{k},y^{k}_1,y^{k}_2)\\ {y_1}^{k + 1} & = {y_1}^k + \beta {\mu _1}(\varphi {x^{k + 1}} - {u^{k + 1}})\\ {y_2}^{k + 1} & = {y_2}^k + \beta {\mu _2}({x^{k + 1}} - {w^{k + 1}}) \end{aligned} \end{array} \right.\ ,$$
where the superscript $k$ denotes the iteration number and ${\beta }$ is an appropriately chosen step length. The updating rules for the above subproblems can be derived as
$$\left\{ \begin{array}{l} {x^{k + 1}} = \frac{{{H^T}b + {\mu _1}{\varphi ^T}{u^k} - {\varphi ^T}{y_1}^k + {\mu _2}{w^k} - {y_2}^k}}{{{H^T}H + {\mu _1}{\varphi ^T}\varphi + {\mu _2}}}\\ {u^{k + 1}} = {S_{\frac{\tau }{{{\mu _1}}}}}(\varphi {x^{k+1}} + \frac{{{y_1}^k}}{{{\mu _1}}})\\ {w^{k + 1}} = \max \{ {x^{k+1}} + \frac{{{y_2}^k}}{{{\mu _2}}},0\}\\ \end{array} \right..$$

In Eq. (13), ${S_{\frac {\tau }{{{\mu _1}}}}}$ is the soft-thresholding function. The iteration steps are summarized in Algorithm 1.

Algorithm 1. ADMM for image reconstruction
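As an illustration of the updates in Eq. (13), the following routine performs the $x$-update via Fourier-domain division and the $u$- and $w$-updates by soft thresholding and nonnegative projection. This is a minimal sketch under assumed circular boundary conditions, not the authors' released implementation; the finite-difference filter construction and the default parameter values are assumptions of this sketch.

```python
import numpy as np

def admm_reconstruct(b, psf, tau=1e-3, mu1=1.0, mu2=1.0, beta=1.0, iters=50):
    """TV-regularized, nonnegativity-constrained deconvolution (sketch of Eq. (13))."""
    ny, nx = b.shape
    Hf = np.fft.fft2(np.fft.ifftshift(psf))   # PSF spectrum (psf centered, same size as b)
    dx = np.zeros((ny, nx)); dx[0, 0], dx[0, 1] = -1.0, 1.0   # forward difference along x
    dy = np.zeros((ny, nx)); dy[0, 0], dy[1, 0] = -1.0, 1.0   # forward difference along y
    Dxf, Dyf = np.fft.fft2(dx), np.fft.fft2(dy)

    # Denominator of the x-update: H^T H + mu1 * phi^T phi + mu2 (diagonal in Fourier space)
    denom = np.abs(Hf)**2 + mu1 * (np.abs(Dxf)**2 + np.abs(Dyf)**2) + mu2

    conv = lambda F, img: np.real(np.fft.ifft2(F * np.fft.fft2(img)))
    soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)  # S_t(v)

    x = np.zeros_like(b, dtype=float)
    ux, uy, w = np.zeros_like(x), np.zeros_like(x), np.zeros_like(x)     # u = phi*x, w = x
    y1x, y1y, y2 = np.zeros_like(x), np.zeros_like(x), np.zeros_like(x)  # dual variables

    for _ in range(iters):
        # x-update: (H^T b + phi^T(mu1*u - y1) + mu2*w - y2) / denom in the Fourier domain
        rhs = (conv(np.conj(Hf), b)
               + conv(np.conj(Dxf), mu1 * ux - y1x)
               + conv(np.conj(Dyf), mu1 * uy - y1y)
               + mu2 * w - y2)
        x = np.real(np.fft.ifft2(np.fft.fft2(rhs) / denom))
        # u-update: soft-threshold the gradients, S_{tau/mu1}(phi*x + y1/mu1)
        gx, gy = conv(Dxf, x), conv(Dyf, x)
        ux = soft(gx + y1x / mu1, tau / mu1)
        uy = soft(gy + y1y / mu1, tau / mu1)
        # w-update: projection onto nonnegative values, max{x + y2/mu2, 0}
        w = np.maximum(x + y2 / mu2, 0.0)
        # dual updates with step length beta
        y1x = y1x + beta * mu1 * (gx - ux)
        y1y = y1y + beta * mu1 * (gy - uy)
        y2 = y2 + beta * mu2 * (x - w)
    return x
```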

Combining the segmented target areas obtained during the preprocessing stage of depth estimation, a preliminary alpha blending mask can be constructed. Subsequently, by analyzing the image gradients of the reconstructed images focused at specific depths, the edges of the targets are further refined through filtering to form a more precise alpha blending mask [35,36]. DoF extension is then achieved by employing a focus stacking algorithm,

$$I = {I_{foreground}} \cdot {\alpha _{mask}} + {I_{background}} \cdot (1 - {\alpha _{mask}}).$$
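A minimal sketch of Eq. (14) is given below. For simplicity, the alpha mask here is derived only from a smoothed comparison of local gradient energy between the two refocused images, whereas the full method also uses the segmented target regions and edge refinement described above; the smoothing width `sigma` is an illustrative parameter.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def focus_stack(I_fore, I_back, sigma=5.0):
    """Blend two refocused images into one extended-DoF image (Eq. (14))."""
    def local_sharpness(img):
        gy, gx = np.gradient(img.astype(float))
        return gaussian_filter(gx**2 + gy**2, sigma)   # smoothed gradient energy
    alpha = (local_sharpness(I_fore) > local_sharpness(I_back)).astype(float)
    alpha = gaussian_filter(alpha, sigma)              # feather the mask edges
    return I_fore * alpha + I_back * (1.0 - alpha)
```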

3. Experiment

The experimental setup is depicted in Fig. 5. The target images are displayed on a screen. The pupil radius of the FZA is 4.55 mm and its innermost zone radius is 0.325 mm. We use a QHY163M camera, whose image sensor has a pixel size of 3.75 $\mathrm{\mu}$m. To ensure precise alignment and spacing, the FZA and the image sensor are integrated using a homemade structure, which maintains a consistent distance of approximately 3.41 mm between the two components. A three-axis adjustable displacement stage is used to control the relative motion between the target and the imaging system, enabling fine adjustments in all directions for accurate image acquisition.

Fig. 5. Experimental setup. (a) The experimental setup including the FZA, sensor, display screen, and guide rail. (b) The front of the homemade structure for the FZA and sensor. (c) The detail of the FZA. The innermost zone radius of the FZA is 0.325 mm.

3.1 Depth estimation

We use the experimental setup to capture the encoded images, with the distance between the targets and the FZA camera varied from 70 mm to 300 mm. We then estimate the target depth with the ternary search method, the improved ternary search method, the traversal method, and W-T-N [26]. W-T-N is the autofocusing method proposed by Liu et al., which is based on a weighted mean of the ToG and NoG metrics. The depth estimation results are presented in Fig. 6 and Table 1.

Fig. 6. Depth estimation results. (a), (b), (c) are comparisons of depth estimation results and errors among different algorithms. (d), (e), (f) are runtime comparisons of different algorithms.

Table 1. Summary of depth estimation results for single targets

The experimental results align well with expectations. A comprehensive analysis of the above results shows that NoG effectively distinguishes the clarity of reconstructed images. The proposed method demonstrates accuracy similar to the traversal method and W-T-N, while the improved ternary search method improves computational efficiency by a factor of five. As the number of pixels and the image complexity increase, the processing time grows much more slowly than that of the traversal method. This substantial improvement in efficiency enables more efficient passive depth estimation for targets.

Although the ternary search method takes the shortest time, it may encounter serious errors in certain cases. Conversely, the improved ternary search method exhibits better depth estimation accuracy and effectively mitigates disturbances arising from noise, diffraction, and the frequency distribution within the image. In particular, the improved ternary search method can address significant errors caused by the Talbot effect and ensure the accuracy of passive depth estimation, especially when dealing with regular binary images such as QR codes. Additionally, the depth estimation results are used to calculate the PSFs, enabling image reconstruction at different depths, as shown in Fig. 7. The fringes around the Wiener filter reconstructions are related to the absence of the outermost-zone information of the FZA in the Wiener filter kernel, which is computed using the idealized convolution kernel in Eq. (6). To obtain high-quality reconstructions, we use the ADMM algorithm and calculate more precise PSFs using Eq. (2). The ${1_ + }(x)$ regularization term effectively suppresses fringes in the reconstructed image.

Fig. 7. The reconstruction of targets at different depths. (a) are the target images. (b) are reconstructions by Wiener filter. (c) are reconstructions by ADMM with only TV. (d) are reconstructions by ADMM with ${1_ + }(x)$ based on (c).

3.2 DoF extension

In lensless imaging, it is possible to refocus on targets at any position by using a suitable PSF. We calibrate the DoF of the FZA camera in advance using the experimental setup in Fig. 5. The DoF is defined as the range of depths within which the reconstructed image can resolve the stripes corresponding to 1/8 line pairs per pixel [15,37]; specifically, this pertains to group 0, element 1 of our target. As shown in Fig. 8, as the inaccuracy of the PSF increases, the fringe contrast of the reconstructed target gradually diminishes, and the resolution of the reconstructed images decreases correspondingly. Notably, the image reconstructed with the exactly matched PSF is the sharpest. The DoF of the FZA camera at 130 mm is approximately 3 cm.

Fig. 8. The reconstructions of resolution target at 115 mm-145 mm using PSF at 130 mm. For group 0, element 1, the DoF is 3 cm (115 mm-145 mm). (a1)-(a8) are reconstructed images. (b1)-(b8) are zoom-in images and intensity cross sections of group 0, element 1. More experimental results are shown in Fig. S1 of Supplement 1.

We replace the display screen in Fig. 5 with two targets positioned at different depths and estimate their depths with the improved ternary search method. The depth of each target is taken as the depth at which its focused region attains the highest NoG. Five relative positions of the two targets and the corresponding DoF extension results are shown in Fig. 9. Due to the mutual interference of light field information between the two targets, the target depth estimation error shown in Table 2 is around 4 mm. From the refocusing results in Fig. 10, a clear change between the front and back focus points can be observed. As the distance between the two targets increases, the diffraction rings become more pronounced and the noise is aggravated. Combined with the focus stacking algorithm, we obtain the DoF extension results, which effectively extend the DoF of the FZA camera. Specifically, the maximum distance between the two targets is about 25 cm, and the DoF is extended by 8 times at the depth of 130 mm. This enhancement demonstrates the capacity of our approach to improve the imaging capabilities of lensless cameras.

Fig. 9. DoF extensions. (a1)-(a5) are scenes with targets at different depths. (b1)-(b5) are the encoded images. (c1)-(c5) are the images with extended DoF. The heights of grimace target and ghost target measure approximately 2 cm and 10 cm, respectively.

Fig. 10. Targets re-focusing. (a1)-(a5) are reconstructed images focused on the front target. (b1)-(b5) are reconstructed images focused on the back target. (c1)-(c5) are reconstructed images focused at 130 mm.

Table 2. Summary of depth estimation results for multi-targets

4. Conclusion

In this paper, we present a single-shot DoF extension method for lensless imaging using an FZA. Image sharpness metrics are investigated to assist depth estimation, and an improved ternary search method with NoG is used to determine the target depth passively and rapidly. ADMM and a focus stacking algorithm are then used to achieve DoF extension. The proposed method offers significant DoF extension without resorting to complex mask designs, and also obtains spatial 3D information of the scene. Furthermore, owing to their simplicity and convenience, lensless cameras can serve as a viable complement for accurate target depth estimation at close range, akin to the rangefinder functionality on smartphones. Future research will be dedicated to addressing complex scenes with potential occlusions; the accuracy of depth information extraction and the effectiveness of DoF extension in such challenging environments are expected to be enhanced through advanced techniques such as deep learning. The proposed method thus paves the way for intelligent ultra-thin cameras.

Funding

National Natural Science Foundation of China (62375129).

Acknowledgments

The work is supported by the National Natural Science Foundation of China (62375129).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

References

1. V. Boominathan, J. T. Robinson, L. Waller, et al., “Recent advances in lensless imaging,” Optica 9(1), 1–16 (2022). [CrossRef]  

2. M. S. Asif, A. Ayremlou, A. Sankaranarayanan, et al., “Flatcam: Thin, lensless cameras using coded aperture and computation,” IEEE Trans. Comput. Imaging 3(3), 384–397 (2017). [CrossRef]  

3. M. Cornacchia, K. Ozcan, Y. Zheng, et al., “A survey on activity detection and classification using wearable sensors,” IEEE Sens. J. 17(2), 386–403 (2017). [CrossRef]  

4. A. Ozcan and E. McLeod, “Lensless imaging and sensing,” Annu. Rev. Biomed. Eng. 18(1), 77–102 (2016). PMID: 27420569. [CrossRef]  

5. J. Tan, L. Niu, J. K. Adams, et al., “Face detection and verification using lensless cameras,” IEEE Trans. Comput. Imaging 5(2), 180–194 (2019). [CrossRef]  

6. F. Zhou, F. Zhou, Y. Chen, et al., “Vector light field display based on an intertwined flat lens with large depth of focus,” Optica 9(3), 288–294 (2022). [CrossRef]  

7. V. Boominathan, J. K. Adams, J. T. Robinson, et al., “Phlatcam: Designed phase-mask based thin lensless camera,” IEEE Trans. Pattern Anal. Mach. Intell. 42(7), 1618–1629 (2020). [CrossRef]  

8. M. S. Asif, “Toward depth estimation using mask-based lensless cameras,” in 51st Asilomar Conference on Signals, Systems, and Computers (2017), pp. 1467–1470.

9. S. S. Khan, V. Sundar, V. Boominathan, et al., “Flatnet: Towards photorealistic scene reconstruction from lensless measurements,” IEEE Trans. Pattern Anal. Mach. Intell. 44(4), 1934–1948 (2020). [CrossRef]  

10. J. Wu, L. Cao, and G. Barbastathis, “Dnn-fza camera: a deep learning approach toward broadband fza lensless imaging,” Opt. Lett. 46(1), 130–133 (2021). [CrossRef]  

11. Y. Ma, J. Wu, S. Chen, et al., “Explicit-restriction convolutional framework for lensless imaging,” Opt. Express 30(9), 15266–15278 (2022). [CrossRef]  

12. O. Kingshott, N. Antipa, E. Bostan, et al., “Unrolled primal-dual networks for lensless cameras,” Opt. Express 30(26), 46324–46335 (2022). [CrossRef]  

13. Y. Zhang, Z. Wu, Y. Xu, et al., “Dual-branch fusion model for lensless imaging,” Opt. Express 31(12), 19463–19477 (2023). [CrossRef]  

14. J. Soltau, P. Meyer, R. Hartmann, et al., “Full-field x-ray fluorescence imaging using a fresnel zone plate coded aperture,” Optica 10(1), 127–133 (2023). [CrossRef]  

15. J. Tan, V. Boominathan, A. Veeraraghavan, et al., “Flat focus: depth of field analysis for the flatcam lensless imaging system,” in International Conference on Acoustics, Speech and Signal Processing (IEEE, 2017), pp. 6473–6477.

16. Q. Fan, W. Xu, X. mei Hu, et al., “Trilobite-inspired neural nanophotonic light-field camera with extreme depth-of-field,” Nat. Commun. 13(1), 2130 (2022). [CrossRef]  

17. D. Hong and H. Cho, “Depth-of-field extension method using variable annular pupil division,” IEEE/ASME Trans. Mechatron. 17(2), 390–396 (2012). [CrossRef]  

18. C. Zhou, S. Lin, and S. Nayar, “Coded aperture pairs for depth from defocus,” in 12th International Conference on Computer Vision (IEEE, 2009), pp. 325–332.

19. E. R. Dowski and W. T. Cathey, “Extended depth of field through wave-front coding,” Appl. Opt. 34(11), 1859–1866 (1995). [CrossRef]  

20. U. Akpinar, E. Sahin, M. Meem, et al., “Learning wavefront coding for extended depth of field imaging,” IEEE Trans. on Image Process. 30, 3307–3320 (2021). [CrossRef]  

21. S.-H. Baek, H. Ikoma, D. S. Jeon, et al., “Single-shot hyperspectral-depth imaging with learned diffractive optics,” in IEEE/CVF International Conference on Computer Vision (2021), pp. 2631–2640.

22. Y. Liu, C. Zhang, T. Kou, et al., “End-to-end computational optics with a singlet lens for large depth-of-field imaging,” Opt. Express 29(18), 28530–28548 (2021). [CrossRef]  

23. S. Pinilla, J. E. Fröch, S. R. M. Rostami, et al., “Miniature color camera via flat hybrid meta-optics,” Sci. Adv. 9(21), eadg7297 (2023). [CrossRef]  

24. S. Nazir, L. Vaquero, M. Mucientes, et al., “Depth estimation and image restoration by deep learning from defocused images,” IEEE Trans. Comput. Imaging 9, 607–619 (2023). [CrossRef]  

25. C. Wang, Q. Huang, M. Cheng, et al., “Deep learning for camera autofocus,” IEEE Trans. Comput. Imaging 7, 258–271 (2021). [CrossRef]  

26. F. Liu, J. Wu, and L. Cao, “Autofocusing of fresnel zone aperture lensless imaging for qr code recognition,” Opt. Express 31(10), 15889–15903 (2023). [CrossRef]  

27. J. Wu, H. Zhang, W. Zhang, et al., “Single-shot lensless imaging with fresnel zone aperture and incoherent illumination,” Light: Sci. Appl. 9(1), 53 (2020). [CrossRef]  

28. C. Guo, F. Zhang, X. Liu, et al., “Lensfree auto-focusing imaging using nuclear norm of gradient,” Opt. Lasers Eng. 156, 107076 (2022). [CrossRef]  

29. N. Antipa, G. Kuo, R. Heckel, et al., “Diffusercam: lensless single-exposure 3d imaging,” Optica 5(1), 1–9 (2018). [CrossRef]  

30. S. H. Chan, R. Khoshabeh, K. B. Gibson, et al., “An augmented lagrangian method for total variation video restoration,” IEEE Trans. on Image Process. 20(11), 3097–3111 (2011). [CrossRef]  

31. J. Chen, F. Wang, Y. Li, et al., “Lensless computationally defined confocal incoherent imaging with a fresnel zone plane as coded aperture,” Opt. Lett. 48(17), 4520–4523 (2023). [CrossRef]  

32. J. W. Goodman and M. E. Cox, “Introduction to Fourier Optics,” Phys. Today 22(4), 97–101 (1969). [CrossRef]  

33. P. Memmolo, C. Distante, M. Paturzo, et al., “Automatic focusing in digital holography and its application to stretched holograms,” Opt. Lett. 36(10), 1945–1947 (2011). [CrossRef]  

34. P. Langehanenberg, B. Kemper, D. Dirksen, et al., “Autofocusing in digital holographic phase contrast microscopy on pure phase objects for live cell imaging,” Appl. Opt. 47(19), D176–D182 (2008). [CrossRef]  

35. X. Qiu, M. Li, L. Zhang, et al., “Guided filter-based multi-focus image fusion through focus region detection,” Signal Process. Image Commun. 72, 35–46 (2019). [CrossRef]  

36. S. Li, X. Kang, J. Hu, et al., “Image matting for fusion of multi-focus images in dynamic scenes,” Inf. Fusion 14(2), 147–162 (2013). [CrossRef]  

37. M. Broxton, L. Grosenick, S. Yang, et al., “Wave optics theory and 3-d deconvolution for the light field microscope,” Opt. Express 21(21), 25418–25439 (2013). [CrossRef]  

Supplementary Material (1)

Supplement 1: The reconstructions of the resolution target at 150 mm-250 mm using the PSF at 130 mm.

