## Abstract

Three-dimensional (3D) shape measurement based on the fringe projection technique has been extensively used for scientific discoveries and industrial practices. Yet, one of its most challenging issues is the limited depth of field (DOF). This paper presents a method to drastically increase the DOF of 3D shape measurement by employing the focal sweep method. The proposed method employs an electrically tunable lens (ETL) to rapidly sweep the focal plane during image integration and a post-capture deconvolution algorithm to reconstruct focused images for 3D reconstruction. Experimental results demonstrate that our proposed method can achieve high-resolution and high-accuracy 3D shape measurement with greatly improved DOF in real time.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

Over the years, many 3D imaging technologies have been developed for different applications. Among these techniques, fringe projection profilometry (FPP) has set the benchmark for performance with regard to accuracy, accessibility, and operating range [1]. FPP is performed by illuminating the scene with spatially or temporally encoded patterns and recovering depth information from the observed deformed patterns [2].

Most FPP systems use a fixed focal length lens with a large aperture to maximize light throughput efficiency, resulting in degraded accuracy and lost details when measuring scenes with large depth variations. The state-of-the-art methods for extending the depth of field (DOF) of FPP systems can be classified into two major categories: single-focusing methods and multi-focusing methods. Single-focusing methods do not alter the focal plane of the system, but change pattern designs [3–6], optimize projection strategies [7], or introduce phase compensation [8]. These methods aim to increase the tolerance to lens blur, and they are fast compared with multi-focusing methods. However, they cannot handle a very large depth variation if the camera lens is significantly defocused and the captured image details are lost. In contrast, multi-focusing methods set multiple focal planes to cover a very large depth range at the cost of time. For example, Achar and Narasimhan [9] proposed to mechanically change the focal plane of the camera to multiple positions, characterize the lens blur at each focal position by analyzing a large number of captured images, and then compute all-in-focus images. Though it works well, this method is difficult to realize and may not produce consistent results.

Different from those methods that change focal planes mechanically, there are some recent developments in extending the DOF of FPP systems by utilizing an electrically tunable lens (ETL). One technique used an ETL to rapidly acquire multiple discrete-focus images, unwrapped the phase using focal length geometric constraints, and obtained the 3D information of the entire scene by merging results from different focus settings [10,11]. Hu et al. [12] employed a continuously variable focus reconstruction model based on an ETL, and adopted a focal plane detection algorithm to realize autofocusing. Alternatively, Zhong et al. [13] proposed to use two projectors for triangulation and a camera with an ETL to realize autofocusing. Overall, the state-of-the-art ETL-based methods extend the DOF through multi-focusing; they require complicated pre-calibration procedures and cumbersome post-capture image registration, segmentation, and fusion, and they could suffer significant speed loss due to the start-stop mode of the ETL for multi-focusing.

In parallel, extending the DOF of 2D imaging has been extensively studied, especially in the field of computational photography. According to the blur kernel used, these computational methods can be classified as depth-dependent or depth-invariant. Depth-dependent methods include blind deconvolution (or deblurring) [14], coded aperture [15], and focal stack [16]; depth-invariant methods include wavefront coding [17], diffusion coding [18], and focal sweep [19]. Despite the success of extending the DOF in 2D computational photography, few studies have employed such methods to enhance 3D shape measurement with fringe projection techniques. One important reason is that 2D photography does not need to know the focal length accurately, while FPP does. Another possible reason is that such studies were carried out in a different community, computer science, and thus drew little attention from researchers in optical metrology.

This paper adopts one of these techniques, focal sweep, to drastically extend the DOF of 3D shape measurement systems. The proposed method continuously sweeps the focal plane over a wide depth range using an ETL during the integration period of each fringe pattern capture, applies the deconvolution kernel to the resulting fringe image, and analyzes the computed fringe images for 3D reconstruction. As proved in 2D photography, the fringe image obtained by focal sweep is uniformly blurred and the blurring effect is depth independent; thus using the same deconvolution kernel for the entire scene avoids the computational complexity of traditional multi-focusing methods. Our experimental results demonstrate that these simplifications do not compromise measurement quality, and that our proposed method can achieve high measurement accuracy. Furthermore, because the number of images acquired and the exposure time used for each fringe pattern acquisition are the same as in the traditional single-focusing method, the proposed method does not sacrifice measurement speed for extended DOF.

Section 2 explains the principles of the proposed method. Section 3 shows experimental verifications. Section 4 summarizes this work.

## 2. Principle

This section explains the phase-shifting algorithms we employed, the basic principle of realizing focal sweep with ETL, the deconvolution (or deblurring) algorithm, and the computational framework we developed.

#### 2.1 Multi-step phase-shifting algorithm

Phase-shifting algorithms are among the most extensively employed techniques in three-dimensional imaging because of their achievable accuracy, resolution, and speed. The intensity function of the *n*-th pattern for an *N*-step phase-shifting algorithm can be described as

$$I_n(x,y) = I'(x,y) + I''(x,y)\cos\left[\phi(x,y) - 2\pi n/N\right],$$

where $I'(x,y)$ is the average intensity, $I''(x,y)$ is the intensity modulation, $\phi (x,y)$ is the phase to be solved for, and $2\pi n/N$ is the phase shift of the *n*-th image. If $N\geq 3$, the unknown phase $\phi (x,y)$ can be calculated by

$$\phi(x,y) = \tan^{-1}\left[\frac{\sum_{n=1}^{N} I_n(x,y)\sin(2\pi n/N)}{\sum_{n=1}^{N} I_n(x,y)\cos(2\pi n/N)}\right].$$

This equation yields the wrapped phase ranging in $(-\pi, \pi]$ with $2\pi$ discontinuities; a phase unwrapping algorithm is then required to recover the continuous phase.
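As a concrete sketch, the wrapped-phase computation is only a few lines of code. The function below is an illustrative Python version (ours, not from the paper), assuming $N$ equally shifted patterns of the form $I_n = I' + I''\cos(\phi - 2\pi n/N)$:

```python
import numpy as np

def wrapped_phase(images):
    """Compute the wrapped phase from N >= 3 phase-shifted fringe
    images of the form I_n = I' + I'' * cos(phi - 2*pi*n/N)."""
    N = len(images)
    shifts = 2 * np.pi * np.arange(N) / N
    # Numerator and denominator of the arctangent, summed over patterns
    s = np.tensordot(np.sin(shifts), np.asarray(images), axes=1)
    c = np.tensordot(np.cos(shifts), np.asarray(images), axes=1)
    return np.arctan2(s, c)  # wrapped phase in (-pi, pi]
```

The `arctan2` form keeps the correct quadrant, so the result is directly the wrapped phase without extra sign bookkeeping.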

#### 2.2 Principle of focal sweep with electrically tunable lens

As discussed in the introduction section, one of the approaches that can extend DOF without sacrificing 2D imaging speed is the focal sweep [19,20]. The key of focal sweep is that the focal plane is swept over a large depth range to generate a single image during the image exposure (or integration). This method was used to extend the DOF of a microscope by translating the specimen along the optical axis [19]. It was also successfully implemented for consumer photography by translating the imaging sensor [20]. Kuthirummal et al. [20] proved that the point spread function (PSF) of the focal sweep technique is nearly invariant to depth, thus a single PSF can be used to deblur the focal sweep image to generate an all-in-focus image without knowing the scene depth distributions. Since the mechanical translation stage is slow and often unstable, Miau et al. [21] proposed to use an ETL to extend the DOF of videography.

There are two main approaches to implementing an ETL [22]. The first is to electrically control local variations of the refractive index to change the focal plane of the lens [23]. It has the advantages of low driving voltage, low power consumption, and easy miniaturization; however, it is sensitive to polarization and has a slow response. The second is to control the shape of the lens, including electrowetting [24] and shape-changing polymers [25]. One issue associated with shape-changing polymer lenses is coma, although this effect is negligible when the lens is in the horizontal position. The advantages of using a liquid lens are: no mechanical motion, a more compact and robust design, less weight, lower power consumption, and a fast response time (e.g., in ms). Therefore, realizing the focal sweep technique with an ETL is significantly superior to other methods in terms of speed, stability, and accuracy.

Assume the focal length of the ETL $f(t)$ is controlled by the time-varying input current $I(t)$, where $t$ is time; then the PSF can be parameterized as $P(r,z,f(t))$, where $z$ denotes points at depth $z$ and $r$ is the radial position on the sensor plane. When the ETL is operated in focal sweep mode during exposure, the integrated point spread function (IPSF) can be modeled as

$$IPSF(r,z) = \frac{1}{T}\int_{0}^{T} P(r, z, f(t))\, dt,$$

where $T$ is the exposure time (or integration time). Obviously, the shape of $IPSF(\cdot )$ is determined by the focal length function $f(t)$. Thus, a depth-independent $IPSF(\cdot )$ can be realized by optimizing $f(t)$. The key to achieving the optimal IPSF is to ensure that the blur diameter changes linearly with time $t$ [19,20]. When the depth variation is not extremely large, the optimal IPSF can be obtained by driving the ETL with triangular waves [21]. Figure 1 illustrates the timing of the image acquisition and the ETL driving signal. Note that the trigger signal and the camera do not have to be precisely synchronized as long as the period of the triangular wave is the same as the period (also very close to the exposure time) of the camera image acquisition and pattern projection.

#### 2.3 Deconvolution for focal sweep image

After a focal sweep image is captured, a deconvolution algorithm must be applied to recover an in-focus image. Usually, PSF estimation is an ill-posed problem because the observed blurred image provides only a partial constraint on the solution. Numerous methods have been proposed, such as sharp edge prediction [26], gradient domain correlation [27], and optimization techniques [28]; Levin et al. [29] summarized these methods. Because the IPSF (or PSF) of the focal sweep technique is depth-independent and fixed once the lens parameters are set, the problem is greatly simplified. The IPSF we used can be mathematically described as

$$IPSF(\sigma_l, \sigma_h) = \frac{1}{\sigma_h - \sigma_l}\int_{\sigma_l}^{\sigma_h} \frac{1}{2\pi\sigma^2}\exp\left(-\frac{r^2}{2\sigma^2}\right)d\sigma,$$

where $\sigma_l$ and $\sigma_h$ bound the Gaussian blur level across the sweep.

As for the deconvolution algorithm used for images captured by the focal sweep technique, we found that, for our system setup, different deconvolution algorithms produce almost identical results if the same IPSF is applied. As such, we chose Wiener deconvolution to deblur our images, using the built-in deconvwnr() function in Matlab 2017a for post-processing.
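For readers without Matlab, an equivalent frequency-domain Wiener filter is straightforward to write. The sketch below is our own illustration (helper names are ours) that mirrors the spirit of deconvwnr(blurred, psf, nsr) under a periodic-boundary assumption, where nsr is the noise-to-signal power ratio:

```python
import numpy as np

def pad_psf(psf, shape):
    """Zero-pad psf to `shape` with its center at the array center."""
    out = np.zeros(shape)
    r, c = psf.shape
    out[:r, :c] = psf
    return np.roll(out, (shape[0] // 2 - r // 2, shape[1] // 2 - c // 2),
                   axis=(0, 1))

def wiener_deconvolve(blurred, psf, nsr=0.01):
    """Frequency-domain Wiener filter: W = conj(H) / (|H|^2 + nsr).
    Assumes periodic image boundaries (plain FFT, no edge tapering)."""
    # ifftshift moves the PSF center to the (0, 0) sample so the
    # transfer function H has zero phase for a symmetric PSF.
    H = np.fft.fft2(np.fft.ifftshift(pad_psf(psf, blurred.shape)))
    W = np.conj(H) / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(W * np.fft.fft2(blurred)))
```

In practice the nsr value trades noise amplification against sharpness, which is why different deconvolution algorithms behave similarly once the (correct) IPSF is fixed.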

We first tried the focal sweep technique for 2D imaging. We captured an image of a scene with resolution charts at three different distances (approximately 450 mm, 650 mm, and 1200 mm). The camera was fitted with an 8 mm lens and the ETL (model: Optotune EL-16-40-TC); the aperture was set to F/1.4 and the exposure time to 10 ms. We first set the driving current of the ETL to 30 mA, 20 mA, and 11 mA, respectively. At each value, only one of the resolution charts appears in focus, as shown in Figs. 2(a)–2(c). We then drove the ETL with a triangular signal with a period of 10 ms; Fig. 2(d) shows the captured image. We then applied an $IPSF(\sigma _l,\sigma _h)$ with $\sigma _l=0.1$ and $\sigma _h=3.0$ for deconvolution. Figure 2(e) shows the resultant image, which is overall well focused for the entire scene.
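As a numerical illustration, a depth-invariant kernel of this kind, parameterized by $\sigma_l$ and $\sigma_h$ as above, can be approximated by averaging isotropic Gaussian PSFs whose blur level sweeps between the two bounds. This Python sketch is a toy model of ours (uniform averaging in $\sigma$ is our assumption, not necessarily the exact IPSF used in the system):

```python
import numpy as np

def ipsf_kernel(sigma_l, sigma_h, size=25, n_steps=50):
    """Average isotropic Gaussian PSFs whose blur level sweeps from
    sigma_l to sigma_h, giving an approximately depth-invariant
    integrated PSF on a size x size pixel grid."""
    ax = np.arange(size) - size // 2
    r2 = ax[:, None] ** 2 + ax[None, :] ** 2
    kernel = np.zeros((size, size))
    for sigma in np.linspace(sigma_l, sigma_h, n_steps):
        g = np.exp(-r2 / (2.0 * sigma ** 2))
        kernel += g / g.sum()  # normalize each Gaussian on the grid
    return kernel / n_steps    # overall kernel sums to 1
```

The resulting kernel has a sharp central peak with long, shallow tails, which matches the qualitative shape reported for focal sweep IPSFs [19,20].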

#### 2.4 Proposed large DOF 3D shape measurement technique

We propose to employ the focal sweep method to enlarge the DOF of high-resolution 3D shape measurement systems. Our system consists of a digital projector that projects time-varying structured patterns and a camera that captures the structured patterns reflected by the object surface, as shown in Fig. 3. Since the projector typically has a large DOF and the amount of defocusing has less effect on measurement quality [8], the projector uses a fixed focal length lens. The camera is fitted with a standard lens and an ETL that can vary the focal plane. The projector and the camera are precisely synchronized with an external timing controller. Since the focal sweep technique is adopted, the IPSFs at different depths have a similar shape (i.e., IPSF1 = IPSF2 = IPSF3). The binary defocusing technique [30] was employed to maximize the DOF of pattern projection.

3D shape can be reconstructed from the captured fringe patterns, and Fig. 4 shows our algorithm flow. The same deconvolution kernel is applied to each fringe pattern to obtain the deblurred fringe images. The deblurred fringe images are then used to reconstruct the 3D shape following the standard process used in a typical fringe projection technique (i.e., phase wrapping, unwrapping, and 3D reconstruction).
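The per-frame flow can be summarized in code. The sketch below is our illustration (function names ours): one precomputed Wiener filter deblurs every phase-shifted image, and the standard N-step phase-shifting analysis then yields the wrapped phase, leaving unwrapping and calibration-based triangulation to the usual routines:

```python
import numpy as np

def reconstruct_wrapped_phase(fringe_images, ipsf_otf, nsr=0.01):
    """Per-frame flow of the proposed method: deblur every
    phase-shifted fringe image with the SAME Wiener filter (the IPSF
    is depth-invariant), then run N-step phase-shifting analysis.
    ipsf_otf is the FFT of the centered IPSF kernel at image size."""
    W = np.conj(ipsf_otf) / (np.abs(ipsf_otf) ** 2 + nsr)
    deblurred = [np.real(np.fft.ifft2(W * np.fft.fft2(I)))
                 for I in fringe_images]
    N = len(deblurred)
    shifts = 2 * np.pi * np.arange(N) / N
    s = sum(I * np.sin(d) for I, d in zip(deblurred, shifts))
    c = sum(I * np.cos(d) for I, d in zip(deblurred, shifts))
    # Wrapped phase; phase unwrapping and calibration-based
    # triangulation then convert this to 3D coordinates.
    return np.arctan2(s, c)
```

Because the deblurring is a single per-image FFT filtering step, it adds negligible cost compared with the registration, segmentation, and fusion stages of multi-focusing methods.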

## 3. Experimental results

We developed a prototype system to verify the performance of our proposed method. Figure 5 shows a photograph of the system. The hardware includes a CMOS camera (model: PointGrey Grasshopper GS3-U3-32S4M) equipped with a lens system consisting of an 8 mm lens (model: Computar M0814-MP2) and an ETL (model: Optotune EL-16-40-TC), and a DLP projector (model: LightCrafter 4500). The optical power of the ETL can be tuned from −2 to +3 dpt by a high-precision electrical lens driver (model: Optotune Lens Driver 4i); the ETL was controlled by the computer. We used a microprocessor (model: Arduino Uno) to generate the trigger signals for the projector and the camera.

For all experiments, the camera resolution was set to 512$\times$512 pixels and the projector resolution was 912$\times$1140 pixels. The camera aperture was set to the maximum ($f/1.4$) and the exposure time was always 10 ms. For fixed-focus measurements, we set three focal planes at distances of approximately 450 mm, 650 mm, and 1200 mm, corresponding to ETL driving currents of 30.03 mA, 20.02 mA, and 10.01 mA, respectively. For the focal sweep method, we set the period of the triangular wave to 10 ms, with the current varying from 10.01 to 40.05 mA to cover a sufficient depth range. For calibration, the camera in focal sweep mode was treated as if it were in the normal discrete-focus mode, and the system was calibrated using the procedure described in Ref. [12]: each time the calibration board pose was changed, images for the different discrete focus settings were collected first; the ETL was then set to focal sweep mode and the corresponding images were collected; the same procedure was repeated for the next board pose. Since the projector's lens was not changed during the whole process, its intrinsic parameters were fixed for all focus settings. The hardware setup remained unchanged for all experiments. For all deblurring, we applied the $IPSF(\sigma _l,\sigma _h)$ with $\sigma _l=0.1$ and $\sigma _h=3.0$ to all captured fringe patterns.

First, we measured an ideal sphere with a diameter of 60 mm to evaluate the performance of our method. Figure 6 shows the measurement results. Figure 6(a) shows one of the captured fringe patterns when the sphere was placed at 450 mm and the focal sweep method was employed. We deblurred all fringe patterns using the deconvolution kernel described earlier; Fig. 6(b) shows the corresponding resultant fringe image. These deblurred fringe patterns were then used to compute the wrapped phase, shown in Fig. 6(c). The wrapped phase was then unwrapped, and the unwrapped phase was further converted to the 3D shape of the object after applying the calibration parameters. Figure 6(d) shows the reconstructed 3D shape. We further evaluated measurement accuracy by fitting the reconstructed point cloud with an ideal sphere having a diameter of 60 mm and taking the difference between the measured data and the ideal sphere. We compared the proposed method with the camera focus fixed at 450 mm. Figure 6(e) shows the error map of the focal sweep method and Fig. 6(f) shows the error map when the camera was fixed-focused at 450 mm. The mean error and root-mean-square (rms) error of the focal sweep method are 0.062 mm and 0.079 mm, respectively; both are quite small and very close to those of the fixed-focus condition (mean error: 0.052 mm; rms error: 0.070 mm).
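The sphere-based accuracy evaluation can be reproduced with a simple least-squares fit. The sketch below is our own implementation (not the paper's code): it estimates the sphere center algebraically and returns signed radial errors against the ideal sphere of the stated diameter:

```python
import numpy as np

def sphere_fit_error(points, diameter=60.0):
    """Fit a sphere center to an (M, 3) point cloud by linear least
    squares using |p|^2 = 2 c.p + const, then return the center and
    the signed radial errors against an ideal sphere of `diameter`."""
    A = np.hstack([2.0 * points, np.ones((len(points), 1))])
    b = (points ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = sol[:3]
    errors = np.linalg.norm(points - center, axis=1) - diameter / 2.0
    return center, errors
```

The mean and rms errors reported above then follow as `np.mean(errors)` and `np.sqrt(np.mean(errors ** 2))`.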

We then placed the same sphere at approximately 650 mm and 1200 mm and performed similar measurements. Figure 7 shows the corresponding results. When the sphere was placed at 650 mm, the mean and rms errors of the focal sweep method are 0.060 mm and 0.075 mm, respectively, versus 0.047 mm and 0.060 mm with the camera focus fixed at 450 mm; when the sphere was placed at 1200 mm, the mean and rms errors of the focal sweep method are 0.050 mm and 0.064 mm, versus 0.094 mm and 0.125 mm with the camera focus fixed at 450 mm. The error of the focal sweep method is thus quite small, and better than the fixed-focus result at the long distance. These results clearly demonstrate that the deconvolution algorithm does not introduce obvious measurement artifacts and does not significantly affect measurement accuracy. Therefore, the proposed method can be used for high-accuracy 3D shape measurement applications.

Next, we measured a complex statue to further evaluate the performance of the proposed method. We first placed the statue at a distance of approximately 450 mm from the system, with the camera fixed-focused at 450 mm. Figure 8 shows the results. Figure 8(a) shows the photograph captured with the camera fixed-focused at 450 mm; the image is clear and sharp. We then captured the object again using our focal sweep method, and Fig. 8(b) shows the computed image. The two images show no obvious difference, as expected. We then performed 3D shape measurements for both cases, and Figs. 8(c) and 8(d) show the results. Both 3D results are of high quality without obvious differences. These experiments demonstrated that our proposed method works equally well as the traditional method when the object is within the DOF of the camera.

We then moved the object away from the focal plane (450 mm) to 650 mm. Figure 9 shows the corresponding measurement results. Since the camera is not well focused when the fixed focal plane is used, the captured image is blurred, as shown in Fig. 9(a), and the reconstructed 3D features are not sharp, as shown in Fig. 9(c). In contrast, the focal sweep method still produces a clearly focused computed image with sharp 3D features, as shown in Figs. 9(b) and 9(d), respectively.

We moved the object further away from the focal plane (450 mm) to 1200 mm and measured it again; Fig. 10 shows the measurement results. This experiment demonstrated that if the focal plane is fixed at 450 mm, a faraway object is significantly blurred and the corresponding 3D measurement quality degrades substantially. In contrast, our proposed method still produces high-quality 2D and 3D results.

These experiments demonstrated that our proposed method can generate always-in-focus 2D images and detailed 3D reconstructions regardless of the scene depth range. We then evaluated this capability by simultaneously measuring three objects at different depths (approximately 450 mm, 650 mm, and 1200 mm); the experimental setup is shown in Fig. 5. If the camera was focused at 450 mm, only the object at 450 mm was captured clearly with details, as shown in the first row of Fig. 11. The second and third rows of Fig. 11 show that if the camera is focused at 650 mm or 1200 mm, one or two objects are always blurred and the reconstructed 3D results lack clear details. The last row of Fig. 11 shows results from our proposed method. Clearly, our method can simultaneously generate always-in-focus 2D images and detailed 3D reconstructions for all three objects. It is important to note that all these measurements used exactly the same time as any of the previous results, except that the ETL focal plane was dynamically changing while the camera exposed each fringe image. As such, the achievable measurement speed is the same as that of the traditional method, assuming that the ETL response is fast enough.

Lastly, we evaluated our proposed method by measuring a dynamically moving hand. The exposure time for both the camera and the projector was set to 5 ms. The ETL was driven by a triangular wave at 200 Hz with the same current range as in the previous experiments. Two sets of three phase-shifted patterns with different frequencies were used, and the high-frequency phase map was unwrapped using the geometric-constraint-based two-frequency phase unwrapping method proposed by Hyun and Zhang [31]. Since only six patterns are needed to reconstruct one 3D frame, the equivalent 3D shape measurement speed is 33.3 Hz. Figure 12 shows five representative 3D frames, and the associated Visualization 4 shows the entire video sequence. This experiment clearly demonstrated that our proposed method can achieve real-time performance and can be used for measuring dynamically changing scenes with a large depth range.
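For reference, the fringe-order logic shared by two-frequency temporal unwrapping methods can be sketched in a few lines. This is our generic illustration (the geometric-constraint variant of Ref. [31] derives the low-frequency reference phase from system geometry rather than from extra patterns):

```python
import numpy as np

def two_freq_unwrap(phi_high, phi_low, freq_ratio):
    """Remove the 2*pi ambiguity of a wrapped high-frequency phase
    using a lower-frequency phase that is already absolute (e.g., it
    spans a single period). freq_ratio = f_high / f_low."""
    # Scale the low-frequency phase up, then round to the fringe order
    order = np.round((phi_low * freq_ratio - phi_high) / (2 * np.pi))
    return phi_high + 2 * np.pi * order  # unwrapped (absolute) phase
```

The rounding step is what makes the method pixel-wise: no spatial neighborhood is needed, which is why only two pattern sets suffice for real-time measurement.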

## 4. Summary

We have presented a technique to drastically extend the depth of field (DOF) of 3D shape measurement systems based on fringe projection techniques. The method we employed is called focal sweep: the focal plane of the camera is dynamically swept during image exposure (or integration). By applying a deconvolution (or deblurring) kernel to the captured images, all-in-focus images can be generated; the computed images are then used for 3D reconstruction. Experimental results successfully demonstrated that our proposed method can achieve high-accuracy, high-resolution 3D shape measurement in real time.

## Funding

Beijing Innovation Center for Future Chip (ICFC); Directorate for Computer and Information Science and Engineering (IIS-1637961); National Science Foundation (IIS-1763689).

## Disclosures

XWH: The author declares no conflicts of interest; SZ: Vision Express Optics LLC (I,E,P), Orbbec (C); YZ: The author declares no conflicts of interest; YL: The author declares no conflicts of interest; GW: The author declares no conflicts of interest.

## References

**1. **S. Zhang, *High-Speed 3D Imaging with Digital Fringe Projection Techniques* (CRC Press, 2016).

**2. **M. Gupta and S. K. Nayar, “Micro phase shifting,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2012), pp. 813–820.

**3. **G. Rao, L. Song, S. Zhang, X. Yang, K. Chen, and J. Xu, “Depth-driven variable-frequency sinusoidal fringe pattern for accuracy improvement in fringe projection profilometry,” Opt. Express **26**(16), 19986–20008 (2018). [CrossRef]

**4. **Y. Wang and S. Zhang, “Three-dimensional shape measurement with binary dithered patterns,” Appl. Opt. **51**(27), 6631–6636 (2012). [CrossRef]

**5. **J. Sun, C. Zuo, S. Feng, S. Yu, Y. Zhang, and Q. Chen, “Improved intensity-optimized dithering technique for 3D shape measurement,” Opt. Lasers Eng. **66**, 158–164 (2015). [CrossRef]

**6. **H. Kawasaki, S. Ono, Y. Horita, Y. Shiba, R. Furukawa, and S. Hiura, “Active one-shot scan for wide depth range using a light field projector based on coded aperture,” in Proceedings of the IEEE International Conference on Computer Vision, (IEEE, 2015), pp. 3568–3576.

**7. **L. Zhang and S. Nayar, “Projection defocus analysis for scene capture and image display,” ACM Trans. Graph. **25**(3), 907–915 (2006). [CrossRef]

**9. **S. Achar and S. G. Narasimhan, “Multi focus structured light for recovering scene shape and global illumination,” in Proceedings of European Conference on Computer Vision, (Springer, 2014), pp. 205–219.

**10. **X. Hu, G. Wang, Y. Zhang, H. Yang, and S. Zhang, “Large depth-of-field 3D shape measurement using an electrically tunable lens,” Opt. Express **27**(21), 29697–29709 (2019). [CrossRef]

**11. **Y. Xiao, G. Wang, X. Hu, C. Shi, L. Meng, and H. Yang, “Guided, fusion-based, large depth-of-field 3D imaging using a focal stack,” Sensors **19**(22), 4845 (2019). [CrossRef]

**12. **X. Hu, G. Wang, J.-S. Hyun, Y. Zhang, H. Yang, and S. Zhang, “Autofocusing method for high-resolution three-dimensional profilometry,” Opt. Lett. **45**(2), 375–378 (2020). [CrossRef]

**13. **M. Zhong, X. Hu, F. Chen, C. Xiao, D. Peng, and S. Zhang, “Autofocusing method for a digital fringe projection system with dual projectors,” Opt. Express **28**(9), 12609–12620 (2020). [CrossRef]

**14. **A. Levin, Y. Weiss, F. Durand, and W. T. Freeman, “Understanding blind deconvolution algorithms,” IEEE Trans. Pattern Anal. Mach. Intell. **33**(12), 2354–2367 (2011). [CrossRef]

**15. **A. Levin, R. Fergus, F. Durand, and W. T. Freeman, “Image and depth from a conventional camera with a coded aperture,” ACM Trans. Graph. **26**(3), 70 (2007). [CrossRef]

**16. **G. Wang, W. Li, X. Yin, and H. Yang, “All-in-focus with directional-max-gradient flow and labeled iterative depth propagation,” Pattern Recogn. **77**, 173–187 (2018). [CrossRef]

**17. **E. R. Dowski and W. T. Cathey, “Extended depth of field through wave-front coding,” Appl. Opt. **34**(11), 1859–1866 (1995). [CrossRef]

**18. **O. Cossairt, C. Zhou, and S. Nayar, “Diffusion coded photography for extended depth of field,” in ACM SIGGRAPH, (ACM, 2010), pp. 1–10.

**19. **G. Häusler, “A method to increase the depth of focus by two step image processing,” Opt. Commun. **6**(1), 38–42 (1972). [CrossRef]

**20. **S. Kuthirummal, H. Nagahara, C. Zhou, and S. K. Nayar, “Flexible depth of field photography,” IEEE Trans. Pattern Anal. Mach. Intell. **33**(1), 58–71 (2010).

**21. **D. Miau, O. Cossairt, and S. K. Nayar, “Focal sweep videography with deformable optics,” in IEEE International Conference on Computational Photography, (IEEE, 2013), pp. 1–8.

**22. **M. Blum, M. Büeler, C. Grätzel, and M. Aschwanden, “Compact optical design solutions using focus tunable lenses,” in Optical Design and Engineering IV, vol. 8167 (International Society for Optics and Photonics, 2011), p. 81670W.

**23. **H.-C. Lin, M.-S. Chen, and Y.-H. Lin, “A review of electrically tunable focusing liquid crystal lenses,” Trans. Electr. Electron. Mater. **12**(6), 234–240 (2011). [CrossRef]

**24. **R. Shamai, D. Andelman, B. Berge, and R. Hayes, “Water, electricity, and between… On electrowetting and its applications,” Soft Matter **4**(1), 38–45 (2008). [CrossRef]

**25. **H. Ren and S.-T. Wu, *Introduction to Adaptive Lenses*, vol. 75 (John Wiley & Sons, 2012).

**26. **N. Joshi, R. Szeliski, and D. J. Kriegman, “PSF estimation using sharp edge prediction,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2008), pp. 1–8.

**27. **W. Hu, J. Xue, and N. Zheng, “PSF estimation via gradient domain correlation,” IEEE Trans. on Image Process. **21**(1), 386–392 (2012). [CrossRef]

**28. **Q. Shan, J. Jia, and A. Agarwala, “High-quality motion deblurring from a single image,” ACM Trans. Graph. **27**(5), 1–7 (2008). [CrossRef]

**29. **A. Levin, Y. Weiss, F. Durand, and W. T. Freeman, “Understanding and evaluating blind deconvolution algorithms,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2009), pp. 1964–1971.

**30. **S. Lei and S. Zhang, “Flexible 3-D shape measurement using projector defocusing,” Opt. Lett. **34**(20), 3080–3082 (2009). [CrossRef]

**31. **Y. An, J.-S. Hyun, and S. Zhang, “Pixel-wise absolute phase unwrapping using geometric constraints of structured light system,” Opt. Express **24**(16), 18445–18459 (2016). [CrossRef]