## Abstract

Phase retrieval from fringe images is essential to many optical metrology applications. In the field of fringe projection profilometry, the phase is often obtained with systematic errors if the fringe pattern is not a perfect sinusoid. Several factors can account for non-sinusoidal fringe patterns, such as the non-linear input–output response (e.g., the gamma effect) of digital projectors, the residual harmonics in binary defocusing projection, and the image saturation due to intense reflection. Traditionally, these problems are handled separately with different well-designed methods, which can be seen as “one-to-one” strategies. Inspired by recent successful artificial intelligence-based optical imaging applications, we propose a “one-to-many” deep learning technique that can analyze non-sinusoidal fringe images resulting from different non-sinusoidal factors and even the coupling of these factors. We show for the first time, to the best of our knowledge, that a trained deep neural network can effectively suppress the phase errors due to various kinds of non-sinusoidal patterns. Our work paves the way to robust and powerful learning-based fringe analysis approaches.

© 2021 Chinese Laser Press

## 1. INTRODUCTION

Three-dimensional (3D) measurement plays an essential role in many fields, e.g., industrial manufacturing [1], medical treatment [2], entertainment [3], and identity recognition [4]. Conventionally, coordinate measuring machines provide users with accurate 3D data by way of point-by-point measurements [5]. However, their measuring speed is limited by the point-wise, contact-based inspection. By contrast, optical 3D measurement techniques can obtain full-field geometric measurements within a single or several shots [6,7]. Among current optical 3D measurement techniques, structured light illumination profilometry has received extensive attention and is becoming one of the most promising 3D shape measurement techniques [8,9].

In structured light illumination profilometry, one illuminates test objects with patterns of various structures, such as sinusoidal fringes [10], de Bruijn patterns [11], speckle patterns [12], and aperiodic fringes [13]. For high-accuracy 3D measurements, sinusoidal fringe patterns are often preferred. Many fringe analysis methods have been proposed for extracting the object’s phase from sinusoidal fringes. They can be broadly classified into two categories: spatial-demodulation methods [14–18] and temporal-demodulation methods [19–24]. Spatial-demodulation approaches compute the phase from a single fringe image, demonstrating the advantage of high efficiency. Nevertheless, they tend to struggle with complex surfaces, since high-frequency details are difficult to retrieve from only a single image. Temporal-demodulation methods achieve pixel-wise measurements with higher resolution and accuracy. With representative phase-shifting (PS) algorithms [10], one captures several sinusoidal fringe images with a given phase shift and calculates the phase using a least-squares method. As multiple images provide more information about the same measured point, the phase of complex structures can be recovered with high accuracy. However, the main limitation of temporal-demodulation approaches is the reduced efficiency, as several images have to be recorded. For either spatial-demodulation or temporal-demodulation techniques, the sinusoidal fringe patterns must be captured with high quality.

Several inherent factors in structured light illumination can account for the collection of non-sinusoidal patterns. The first is the gamma distortion of digital projectors. For visual quality, digital projectors or displays are often manufactured with a specific gamma distortion, leading to a non-linear relationship between the output and input intensities, ${I}_{\mathrm{out}}={I}_{\mathrm{in}}^{\gamma}$. Researchers have proposed many approaches to mitigate the gamma distortion, which can be roughly classified into system-based methods [8,25–27] and algorithm-based methods [28–36]. The system-based approaches suggest replacing commercial projectors with illumination units free from the gamma effect, e.g., coherent light illumination setups [25] and programmable digital light processing (DLP) modules [8]. Although effective, they may increase the cost or the complexity of the whole system. To eliminate the gamma distortion without changing the system hardware, one can record the input and output light intensities and estimate the gamma value using the non-linear model [28–30]. Then, to counteract the gamma effect, one can pre-distort the input intensity using ${({I}_{\mathrm{in}})}^{\frac{1}{\gamma}}$, which recovers the true output intensity ${I}_{\mathrm{out}}={[{({I}_{\mathrm{in}})}^{\frac{1}{\gamma}}]}^{\gamma}={I}_{\mathrm{in}}$. Also, gamma-induced phase errors can be compensated by lookup tables that depict the relationship between the phase difference and the actual phase [31,32]. In addition, the weights of harmonic errors due to the gamma effect can be predicted through iterative algorithms and then used for error compensation [33].
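
The pre-distortion step can be sketched in a few lines of NumPy. This is a minimal illustration of the idea, assuming an idealized power-law projector response and a known gamma value; in practice, $\gamma$ must first be estimated from recorded input and output intensities, as described above.

```python
import numpy as np

GAMMA = 2.2  # assumed projector gamma; in practice this must be calibrated

def projector_response(i_in):
    """Idealized gamma model of the projector: I_out = I_in ** gamma."""
    return i_in ** GAMMA

def pre_distort(i_in):
    """Pre-distort the normalized input intensity so that the projector's
    gamma response cancels out: (i ** (1/gamma)) ** gamma == i."""
    return i_in ** (1.0 / GAMMA)

# One row of a normalized sinusoidal fringe pattern.
x = np.linspace(0.0, 1.0, 512)
ideal = 0.5 + 0.5 * np.cos(2.0 * np.pi * 3.0 * x)

distorted = projector_response(ideal)               # non-sinusoidal output
corrected = projector_response(pre_distort(ideal))  # sinusoid is restored

print(np.max(np.abs(distorted - ideal)), np.max(np.abs(corrected - ideal)))
```

The corrected wave matches the ideal sinusoid up to floating-point error, which is exactly why pre-distortion works when the gamma value is known accurately.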

The second cause of captured non-sinusoidal fringes is the residual high-order harmonics in binary defocusing projection. In high-speed fringe projection, binary defocusing techniques have the advantage of fast image projection [37]. For projectors using digital micromirror devices, 8-bit fringe images are usually projected at a speed limit of 120 Hz, as a relatively long integration time is required. For 1-bit binary fringes, however, the integration time can be reduced to the minimum, allowing the projector to operate at rates from kilohertz to tens of kilohertz. By defocusing the projector, the binary stripe patterns are transformed into gray-scale sinusoidal patterns. In practice, users should carefully adjust the defocusing degree of the projector. When the projector is defocused excessively, the fringe images are captured with low contrast. Conversely, if the defocusing degree is insufficient, systematic errors occur, since harmonics in the binary fringes have not been filtered out completely by the defocusing process. In practice, people prefer to defocus the projector slightly and then remove the systematic errors with well-designed algorithms, such as pulse width modulation [38,39], sinusoidal pulse width modulation (SPWM) [40], tripolar SPWM [41], optimal pulse width modulation [42], and dithering methods [43,44]. The main idea of these methods is to shift harmonics in the binary fringe from low-frequency areas to high-frequency sections of its spectrum, facilitating the low-pass filtering effect induced by the defocused projection.
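
The defocusing trade-off can be illustrated by modeling the defocused projection as a Gaussian low-pass filter. This is only a sketch: the Gaussian optical transfer function and the kernel widths are illustrative assumptions, not a model of any particular lens.

```python
import numpy as np

n_samples = 1200
f0 = 3  # fringe frequency in cycles per line
x = np.arange(n_samples) / n_samples

# 1-bit binary (square-wave) fringe, as used for fast projection.
binary = (np.cos(2.0 * np.pi * f0 * x) >= 0).astype(float)

def defocus(signal, sigma):
    """Approximate projector defocus as a Gaussian low-pass filter
    (standard deviation sigma, in samples) applied in the Fourier domain."""
    freqs = np.fft.fftfreq(signal.size, d=1.0 / signal.size)  # cycles per line
    otf = np.exp(-2.0 * (np.pi * sigma * freqs / signal.size) ** 2)
    return np.fft.ifft(np.fft.fft(signal) * otf).real

def harmonic(signal, k):
    """Amplitude of the k-th harmonic of the fringe frequency f0."""
    return np.abs(np.fft.rfft(signal))[k * f0] / signal.size

slight = defocus(binary, sigma=20.0)  # insufficient defocus: 3rd harmonic survives
heavy = defocus(binary, sigma=80.0)   # heavy defocus: harmonics gone, contrast lost

print(harmonic(binary, 3), harmonic(slight, 3), harmonic(heavy, 3))
```

With slight defocus the third harmonic is only partially attenuated (hence the residual systematic errors), while heavy defocus removes it almost entirely at the cost of fundamental contrast.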

The third cause of non-sinusoidal fringes is image saturation in high dynamic range (HDR) 3D shape measurements. For fringe projection profilometry, it is challenging to measure objects with a considerable variation in surface reflectivity, e.g., a scene containing both dark and bright objects. The fringe patterns reflected from the dark regions are often captured with a low signal-to-noise ratio, whereas the pixels are usually saturated for the reflective surfaces. When dark objects are captured with properly exposed fringe patterns, bright areas in the same scene are often measured with saturated (pure white) fringes. As object details are covered up by the saturated fringes, it is hard to retrieve the phase. Various approaches to HDR fringe projection have been proposed [45]. In general, these techniques can be classified into two groups: equipment-based techniques [46–56] and algorithm-based techniques [57–62]. In the group of equipment-based methods, researchers try to acquire ideal fringe images by adjusting the imaging system, such as the exposure time [47], the intensity of projected light [50], the polarization states of illumination [46], and the number of camera views [46]. As to the algorithm-based methods, researchers concentrate on the design of phase retrieval algorithms instead of changing the imaging system’s hardware, allowing the phase to be measured directly from saturated fringe images.

Further, the situation becomes more complicated if several non-sinusoidal factors are coupled together, which is seldom discussed in the current literature. For example, fringe images may be captured with both the gamma effect and image saturation, or with both insufficient defocusing projection and image saturation. This paper shows that the causes of these individual and coupled non-sinusoidal problems are similar: they all boil down to a superposition of the original sine wave at the fundamental frequency with several unknown sine waves at higher frequencies (high-order harmonics). In practice, stochastic factors, e.g., random noise, may also affect the captured fringe pattern, but they are not discussed here as they do not change the main profile of a sinusoid.

Deep learning is a powerful machine learning technique that uses artificial neural networks with deep layers to fit complex mathematical functions. Compared with traditional algorithms that rely entirely on physical models, deep learning approaches handle problems by searching for and establishing a sophisticated mapping between the input and the target data, owing to their powerful computational capability. In many applications, learning-based methods have shown superiority over classic physical-model-based methods. In the field of image denoising, denoising autoencoders have been trained to obtain high-level features for robust reconstruction of clean images [63,64]. In the field of nanophotonics, artificial intelligence has been applied to knowledge discovery, showing great potential in understanding the physics of electromagnetic nanostructures [65]. In the field of optical imaging, recent years have witnessed great successes of deep-learning-assisted applications. First, deep neural networks can significantly improve optical microscopy, increasing its spatial resolution over a large field of view and depth of field [66]. Then, deep learning techniques can be used for phase recovery and holographic image reconstruction in digital holography [67]. With only one hologram image, the twin-image and self-interference-related artifacts can be removed. Also, deep-learning-based ghost imaging techniques have shown much better performance than conventional ghost imaging under different noise and measurement-ratio conditions [68]. Furthermore, researchers have utilized deep learning strategies to build powerful models that can fit all scattering media within the same class, which improves the scalability of imaging through scattering [69].
Lastly, in optical coherence tomography (OCT), deep neural networks can be used to identify clinical features similar to how clinicians interpret an OCT image, allowing successful automated segmentations of clinically relevant image features [70].

In recent years, researchers have demonstrated that deep neural networks can effectively improve the performance of fringe projection profilometry. In fringe analysis, deep convolutional neural networks can be trained to retrieve the phase information from a single fringe image with favorable accuracy [71–74]. In phase unwrapping, learning-based temporal phase unwrapping [75] and stereo phase unwrapping methods [76] were developed to suppress noise effects and unwrap dense fringe patterns robustly. To handle complex surfaces, our previous work has shown that the deep learning technique can recover the phase from saturated fringe images [57]. Here, we show that more non-sinusoidal issues can benefit from deep learning. We demonstrate for the first time, to our knowledge, that a generalized neural network can cope with various kinds of non-sinusoidal fringes caused by either single or multiple non-sinusoidal factors. Experimental results show that compared with traditional three-step phase-shifting algorithms, the proposed method can substantially improve the reconstruction accuracy by more than 50% without reducing the measurement efficiency.

## 2. PRINCIPLE

#### A. Phase-Shifting Algorithm

In fringe projection profilometry, a projector illuminates test objects with pre-designed fringe images, and a camera captures the images simultaneously from a different angle. The fringe patterns are distorted by the varying height of the measured areas. The phase retrieved from the captured patterns serves as a temporary texture of the test objects and can be converted into the object’s height. The $N$-step PS algorithm is widely applied to phase retrieval as it offers high accuracy, insensitivity to ambient light, and pixel-wise phase measurement. The captured $N$-step PS fringe image can be expressed as

$${I}_{n}(x,y)=A(x,y)+B(x,y)\cos [\varphi (x,y)-{\delta}_{n}],$$

where $\varphi (x,y)$ is the phase, $A(x,y)$ is the background intensity, $B(x,y)$ is the modulation, and ${\delta}_{n}$ is the phase shift that is equal to $2\pi n/N$, where $n=0,1,2,\ldots,N-1$. When there are at least three images ($N\ge 3$), the phase can be solved by

$$\varphi (x,y)=\arctan \frac{\sum_{n=0}^{N-1}{I}_{n}(x,y)\sin {\delta}_{n}}{\sum_{n=0}^{N-1}{I}_{n}(x,y)\cos {\delta}_{n}}.$$

#### B. Phase-Shifting Algorithm with Non-Sinusoidal Fringe Images

Non-sinusoidal PS images are often captured as a result of the projectors’ gamma effect, the binary defocusing illumination, or the image saturation. A generalized model can be used to represent the captured images as

$${I}_{n}(x,y)=A(x,y)+\sum_{k=1}^{K}{B}_{k}(x,y)\cos \{k[\varphi (x,y)-{\delta}_{n}]\},$$

where ${B}_{k}(x,y)$ is the amplitude of the $k$th-order harmonic and $K$ is the highest harmonic order considered.

We illustrate different kinds of non-sinusoidal fringe images in Fig. 1. They are fringe patterns simulated under the gamma distortion, a binary square wave with slight defocusing, image saturation, and two coupling cases, respectively. In the case of $\gamma =2.2$, the sinusoidal wave’s peaks become narrow while the valleys become wide, giving rise to narrowed bright stripes compared with the ideal sinusoidal wave. For the defocused pattern shown in Fig. 1(c), the intensity distribution looks like a triangular wave due to the presence of residual harmonics. For the case of image saturation, the intensity that exceeds the maximum dynamic range (i.e., 255 in this simulation) is truncated, while the rest remains unchanged. Last, in the coupling cases shown in Figs. 1(e) and 1(f), the image saturation further modifies the shape of the original non-sinusoidal waves by cutting off the intensity that exceeds the dynamic range, which further increases the non-sinusoidal character of the waves.
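
The saturated and coupled waveforms of Fig. 1 can be simulated by simple clipping. A hedged sketch (the 1.2× overexposure factor and the one-dimensional fringe are illustrative choices):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 1000)
ideal = 127.5 + 127.5 * np.cos(2.0 * np.pi * 3.0 * x)  # ideal 8-bit sinusoid

# Image saturation: overexpose, then truncate at the 8-bit maximum of 255.
overexposed = 1.2 * ideal
saturated = np.clip(overexposed, 0.0, 255.0)

# Coupling case: gamma distortion first, then the same truncation.
gamma_distorted = 255.0 * (ideal / 255.0) ** 2.2
coupled = np.clip(1.2 * gamma_distorted, 0.0, 255.0)

# The clipped peaks are flat at 255, while the rest of the wave is unchanged.
print(saturated.max(), coupled.max())
```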

Fourier analysis is then applied to investigate the harmonics of these patterns. The results are shown in Figs. 2 and 3. The fundamental frequency ${f}_{0}$ is three in our simulation. For the ideal sine wave, only the fundamental frequency ${f}_{0}$ exists. For the case of $\gamma =2.2$, the frequency components $2{f}_{0}$ and $3{f}_{0}$ begin to appear. For the case of the defocused binary pattern, as shown in Fig. 2(f), we can observe harmonics of $3{f}_{0}$ and $5{f}_{0}$ that survive the defocusing. Although their amplitudes are small, they can still degrade the phase retrieval. In Figs. 3(a) and 3(d), we can see that, in addition to the fundamental frequency, there are four additional frequency components, from $2{f}_{0}$ to $5{f}_{0}$, in the saturated sine wave. Last, two coupling cases are discussed: the gamma effect coupled with image saturation, and the defocused pattern coupled with image saturation. In Figs. 3(e) and 3(f), we can see that more harmonics have been introduced into the coupled gamma-distorted pattern and the coupled defocused pattern due to the influence of image saturation, further distorting the shape of the sinusoidal wave.
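
This harmonic analysis is easy to reproduce with a discrete Fourier transform. The sketch below uses a one-dimensional fringe line with the same fundamental frequency ${f}_{0}=3$; the distortion settings are illustrative.

```python
import numpy as np

n_samples = 1024
f0 = 3  # fundamental fringe frequency, in cycles per line
x = np.arange(n_samples) / n_samples
ideal = 0.5 + 0.5 * np.cos(2.0 * np.pi * f0 * x)

def spectrum(signal):
    """One-sided amplitude spectrum, indexed in cycles per line."""
    return np.abs(np.fft.rfft(signal)) * 2.0 / signal.size

# Gamma distortion bends the sinusoid and creates 2*f0, 3*f0, ... components.
gamma_wave = ideal ** 2.2
# Saturation truncates the peaks and also introduces higher harmonics.
saturated = np.minimum(1.2 * ideal, 1.0)

for name, signal in (("ideal", ideal), ("gamma", gamma_wave), ("saturated", saturated)):
    amplitudes = spectrum(signal)
    print(name, [round(amplitudes[k * f0], 4) for k in range(1, 5)])
```

The ideal wave has energy only at ${f}_{0}$, while both distortions leave clearly non-zero amplitudes at the higher harmonics.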

Next, we analyze the phase errors owing to the non-sinusoidal issues. Assume that the intensity difference for each PS image is

The phase error caused by additional harmonics can be written as

By substituting Eqs. (2) and (4) into Eq. (5), we have

Figure 4 illustrates the performance of PS algorithms in analyzing different non-sinusoidal fringes. Here, the ground-truth phase is calculated with ideal sinusoidal fringes. The phase error is obtained by computing the standard deviation of the phase difference. In the simulation of gamma distortion, we set $\gamma =2.2$. As can be seen, the phase error induced by the gamma effect decreases rapidly as the number of phase-shifting steps increases. For the defocused binary square pattern, the phase error decreases but with small fluctuations. The reason is that an $N$-step PS algorithm is sensitive to the $[(s+1)N\pm 1]$th harmonics (where $s$ is an integer) [41]. For example, the four-step PS algorithm is sensitive to all of the odd harmonics present in the defocused pattern, showing the largest phase error among all of the PS algorithms. However, the overall trend is that the phase error still decreases with a large $N$. For the case of saturation, we truncated 20% of the maximum light intensity. Like the defocusing case, its phase error also decreases with an increasing $N$. For the coupling of image saturation and $\gamma =2.2$, the phase error is larger than for pure gamma distortion. For the coupling of defocusing projection and image saturation, a more serious error is likewise observed than in the pure defocusing case. These results show that although different non-sinusoidal factors are superimposed in the coupling cases, the phase can still be computed robustly with a PS algorithm of many steps. In practice, however, a PS algorithm with a large number of steps requires many fringe images for a single phase measurement, which limits the efficiency significantly.
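
The decreasing trend can be reproduced numerically for the gamma case. The sketch below implements the $N$-step arctan formula directly and measures the standard deviation of the wrapped phase error; it is an illustrative simulation, not the exact configuration behind Fig. 4.

```python
import numpy as np

def wrapped_phase(images, deltas):
    """N-step PS phase retrieval via the arctan of weighted sums."""
    num = sum(im * np.sin(d) for im, d in zip(images, deltas))
    den = sum(im * np.cos(d) for im, d in zip(images, deltas))
    return np.arctan2(num, den)

phi = np.linspace(-np.pi, np.pi, 2048, endpoint=False)  # ground-truth phase

def phase_error_std(n_steps, gamma=2.2):
    """STD of the phase error when gamma-distorted fringes are analyzed
    by an N-step PS algorithm."""
    deltas = 2.0 * np.pi * np.arange(n_steps) / n_steps
    images = [(0.5 + 0.5 * np.cos(phi - d)) ** gamma for d in deltas]
    # Wrap the difference to (-pi, pi] before taking the statistics.
    error = np.angle(np.exp(1j * (wrapped_phase(images, deltas) - phi)))
    return np.std(error)

for n in (3, 4, 6, 12):
    print(n, phase_error_std(n))  # the error shrinks as N grows
```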

## 3. ARCHITECTURE OF THE DEEP NEURAL NETWORK

As shown in the previous section, the phase error due to non-sinusoidal patterns can be reduced by increasing the number of phase-shifting steps. However, the efficiency of 3D imaging then decreases noticeably. To handle this issue, we resort to deep learning techniques to retrieve the phase accurately from non-sinusoidal patterns without increasing the number of PS images. In this work, our deep neural network is constructed following the architecture of U-net [77].

U-net is a fully convolutional network with an encoder–decoder architecture, which is widely used in image segmentation. As shown in Fig. 5, the inputs are captured non-sinusoidal PS fringe images. We take the three-step PS algorithm as an example, as it requires the minimum number of images. From the non-sinusoidal PS fringe images, the network learns to predict the ideal numerator $M(x,y)$ and denominator $D(x,y)$, which can be represented as

$$M(x,y)=\sum_{n=0}^{N-1}{I}_{n}(x,y)\sin {\delta}_{n},\qquad D(x,y)=\sum_{n=0}^{N-1}{I}_{n}(x,y)\cos {\delta}_{n}.$$

According to Eq. (2), $M(x,y)$ and $D(x,y)$ can be fed into the arctan function to calculate the final wrapped phase. At the beginning, the input fringe images are processed by the encoder to obtain 50-channel feature tensors at half the resolution along both the $x$ and $y$ directions. Then, these feature tensors successively pass through three convolutional blocks to capture multi-level feature information.

Contrary to the encoder subnetwork, the decoder subnetwork performs up-sampling operations to restore the results to the input image’s original resolution. The up-sampling is implemented by bilinear interpolation and is followed by two convolution layers. At every step of the decoder, a skip connection concatenates the convolution layers’ output with the feature maps from the encoder at the same level. This structure provides low-level and high-level information at the same time and mitigates the gradient vanishing typical of deep convolutional networks, which benefits the accuracy of the results. The last layer of the network is a convolutional layer activated by a linear activation function; it outputs two-channel data consisting of the numerator and the denominator. The objective of the neural network is to minimize the following loss function:
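
A minimal Keras sketch of such an encoder–decoder is given below. The three-level depth, the 50-channel base width, the MSE loss, and the 256×256 input size are illustrative assumptions; only the overall structure (pooled encoder, bilinear up-sampling decoder with skip connections, and a linear two-channel output) follows the description above.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    """Two 3x3 convolutions, the basic U-net building block."""
    for _ in range(2):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet(height=256, width=256, n_inputs=3, base_filters=50):
    """U-net mapping three PS fringe images to the two-channel (M, D) output."""
    inputs = layers.Input((height, width, n_inputs))
    skips, x = [], inputs
    for level in range(3):                        # encoder: halve resolution per level
        x = conv_block(x, base_filters * 2 ** level)
        skips.append(x)
        x = layers.MaxPool2D(2)(x)
    x = conv_block(x, base_filters * 8)           # bottleneck
    for level in reversed(range(3)):              # decoder: bilinear up-sampling
        x = layers.UpSampling2D(2, interpolation="bilinear")(x)
        x = layers.Concatenate()([x, skips[level]])  # skip connection
        x = conv_block(x, base_filters * 2 ** level)
    # Linear activation: M and D are signed real values, not probabilities.
    outputs = layers.Conv2D(2, 1, activation="linear")(x)
    return tf.keras.Model(inputs, outputs)

model = build_unet()
model.compile(optimizer="adam", loss="mse")  # L2 loss on (M, D) is an assumption
print(model.output_shape)
```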

## 4. EXPERIMENTS

To validate the proposed method, we built a structured light illumination system consisting of a projector (DLP 4500, Texas Instruments) and an industrial camera (acA640-750 μm, Basler). The camera was equipped with a lens of 8 mm focal length. The distance between the test object and the imaging system was about 1 m.

Non-sinusoidal fringe images due to five different causes were captured: (1) pure gamma distortion (with $\gamma $ set to 2.2 during pattern projection), (2) pure binary defocusing projection, (3) pure image saturation, (4) the coupling of the gamma effect ($\gamma =2.2$) with image saturation, and (5) the coupling of binary defocusing projection with image saturation. To collect the training data, we captured 750 sets of non-sinusoidal three-step PS fringe images of different objects. To obtain the ground-truth data, Eqs. (7) and (8) were applied, where $N$ was set to 12 to remove the influence of the harmonics as much as possible. The pixel depth of the captured three-step fringe images is 8 bits in our experiments. Before being fed into the neural network, they were divided by 255 for normalization, which makes learning easier for the network. The neural network was implemented with the TensorFlow framework (Google) and trained on a GTX Titan graphics card (NVIDIA). For each non-sinusoidal scenario, we trained and tested the neural network using only the data belonging to that scenario. None of the objects used in testing were present in the training stage.
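
The data preparation can be sketched as follows. The phase map and the pure-gamma distortion are synthetic stand-ins for real captures; the normalization by 255 and the use of a 12-step result as ground truth follow the procedure described above.

```python
import numpy as np

def numerator_denominator(images):
    """M(x, y) and D(x, y) of the PS arctan formula. Computed from 12 clean
    images, they serve as near-harmonic-free ground-truth targets."""
    n_steps = len(images)
    shifts = 2.0 * np.pi * np.arange(n_steps) / n_steps
    num = sum(im * np.sin(s) for im, s in zip(images, shifts))
    den = sum(im * np.cos(s) for im, s in zip(images, shifts))
    return num, den

rng = np.random.default_rng(0)
phi = rng.uniform(-np.pi, np.pi, (64, 64))  # synthetic phase map

# Three gamma-distorted 8-bit inputs and twelve clean 8-bit references.
raw3 = [np.clip(255.0 * (0.5 + 0.5 * np.cos(phi - 2.0 * np.pi * n / 3.0)) ** 2.2,
                0.0, 255.0) for n in range(3)]
raw12 = [255.0 * (0.5 + 0.5 * np.cos(phi - 2.0 * np.pi * n / 12.0)) for n in range(12)]

inputs = np.stack([im / 255.0 for im in raw3], axis=-1)  # normalized network input
m_gt, d_gt = numerator_denominator([im / 255.0 for im in raw12])
phase_gt = np.arctan2(m_gt, d_gt)  # label phase, recoverable from the targets

print(inputs.shape, np.max(np.abs(phase_gt - phi)))
```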

First, we investigated the neural network’s efficacy in correcting gamma distortion. Figure 6(a) shows one of the captured three-step PS images. Figure 6(b) is the 3D reconstruction (depth map) obtained by the traditional three-step PS algorithm, in which obvious periodic ripple errors can be observed on the face of the retrieved model. Figures 6(c) and 6(d) show the 3D reconstructions by the proposed method and the 12-step PS algorithm, respectively. By comparison, the ripple errors have been suppressed effectively by the neural network. For quantitative evaluation, we first measured a pair of ceramic spheres. One of the captured gamma-distorted fringe images is shown in Fig. 7(a). Figures 7(b)–7(d) show the 3D models obtained by the three-step PS method, the proposed method, and the 12-step method, respectively. The measurement error maps of the three-step PS method and the proposed method are shown in Figs. 7(e) and 7(f). The errors were calculated with respect to the high-accuracy profile obtained by the 12-step PS algorithm. With the trained neural network, the mean absolute error (MAE) and the standard deviation of the error (STD) were significantly reduced, to 0.056 mm and 0.054 mm, respectively. Then, we measured a ceramic plate. Figures 8(a) and 8(b) show the 3D reconstructions of the traditional three-step PS algorithm and our method, respectively. The cross-section error of the plate is shown in Fig. 8(d). For the three-step PS algorithm, the MAE is 0.11 mm and the STD is 0.075 mm. For our method, the MAE and the STD were reduced to 0.045 mm and 0.034 mm, respectively, a reduction of 59% for the MAE and 55% for the STD.
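
The two error metrics reported throughout (MAE and STD of the error map against the 12-step reference) can be computed with a few lines; the ripple signal below is a hypothetical stand-in for a real error map.

```python
import numpy as np

def error_metrics(measured, reference):
    """Mean absolute error and standard deviation of a depth error map
    (units follow the inputs, e.g., millimeters)."""
    error = measured - reference
    return np.mean(np.abs(error)), np.std(error)

# Hypothetical example: a flat plate measured with periodic ripple errors.
x = np.linspace(0.0, 1.0, 1000, endpoint=False)
reference = np.zeros_like(x)
measured = 0.1 * np.sin(2.0 * np.pi * 30.0 * x)  # 0.1 mm ripple amplitude

mae, std = error_metrics(measured, reference)
print(mae, std)  # for a pure sine ripple: MAE ~ 2A/pi, STD ~ A/sqrt(2)
```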

Then, the neural network was tested on obtaining the phase from binary defocused fringe images. Here, we used the dithering technique to generate the binary fringes, projected with a slightly defocused projector [78]. Figure 9(a) shows one of the three-step PS patterns. The 3D reconstruction of the three-step PS method is shown in Fig. 9(b), where the surfaces have been measured with obvious stripe noise. Figures 9(c) and 9(d) show the 3D results of our method and the 12-step PS algorithm, respectively. We can see that these errors have been removed and smooth 3D reconstructions have been acquired. For quantitative analysis, Fig. 10 shows the measurement results of a pair of ceramic spheres. The 3D result shown in Fig. 10(c) and the reconstruction error maps shown in Fig. 10(f) demonstrate that the proposed method has effectively removed the periodic errors induced by the non-sinusoidal components. Then, a ceramic plate was tested. Figures 11(a)–11(c) show the 3D images of the tested object. High-frequency ripple errors can be seen on the surface recovered by the three-step PS algorithm, indicating that some harmonics of the projected pattern survived the defocused projection. The measurement errors are shown in Fig. 11(d). For the traditional three-step PS algorithm, the MAE and the STD are 0.12 mm and 0.096 mm, respectively. The MAE and the STD decreased to 0.046 mm and 0.034 mm, respectively, when our method was applied, demonstrating that the proposed method reduced the MAE and the STD by 62% and 65%, respectively.

In the third experiment, the proposed neural network was used to analyze saturated fringe images. One of the captured PS images is shown in Fig. 12(a), where some fringes were captured as pure white on the two models’ faces. Figure 12(b) shows the 3D reconstruction by the traditional three-step PS method. Many ripple artifacts can be observed on the recovered faces of the two objects. In Fig. 12(c), with the assistance of deep learning, these errors were eliminated effectively by the proposed method. This reconstruction is very close to the one obtained by the 12-step PS method, as shown in Fig. 12(d). Figure 13 shows the measurement results of a pair of ceramic spheres. From the error maps in Figs. 13(e) and 13(f), we can see that the MAE and STD have been reduced to 0.043 mm and 0.039 mm, respectively. In addition, Figs. 14(a)–14(c) show the 3D reconstructions of a ceramic plate by the three-step PS method, the proposed method, and the 12-step PS algorithm, respectively. The measurement errors are shown in Fig. 14(d). Due to the image saturation, the 3D reconstruction was distorted severely for the traditional method; its MAE and STD are 0.34 mm and 0.19 mm, respectively. For our method, by contrast, these errors have been reduced to 0.052 mm and 0.039 mm, respectively, an error reduction of 84.7% for the MAE and 79.5% for the STD.

Next, we tested the performance of our method in a more complicated situation where the gamma distortion ($\gamma =2.2$) was coupled with image saturation. Figure 15(a) shows one of the captured three-step PS patterns, in which the head of the left model was captured under the effects of both the gamma distortion and the pixel saturation. As the two non-sinusoidal factors act together, many wave artifacts can be seen in the 3D model reconstructed with the traditional three-step PS method [Fig. 15(b)]. In contrast, Figs. 15(c) and 15(d) display the 3D results of our method and the 12-step PS method, respectively. We can see that the deep learning framework has successfully removed the influence of the gamma effect and the image saturation at the same time. For quantitative evaluation, Fig. 16 shows the measurement results of a pair of ceramic spheres. Benefiting from deep learning, the coupled non-sinusoidal errors were removed effectively. With the proposed strategy, the MAE and STD of the measured sphere were decreased to 0.047 mm. Then, a ceramic plate was also measured. The results are shown in Figs. 17(a)–17(c). From the error distribution shown in Fig. 17(d), we can see that the proposed approach eliminates the periodic artifacts and recovers the shape of the plate correctly. Numerically, the MAE and the STD of the three-step PS method are 0.28 mm and 0.16 mm, respectively. When the proposed method was applied, the MAE and the STD were reduced by 84% and 79%, to 0.044 mm and 0.033 mm, respectively.

Last, we tested the second coupled non-sinusoidal case, in which the fringe images were captured under slightly defocused projection and image saturation. Figure 18(a) shows one of the captured three-step PS patterns, in which the face was captured with defocused and saturated fringes. Figure 18(b) shows the 3D result obtained by the traditional three-step PS method; wrinkle errors due to both non-sinusoidal factors can be observed clearly. Figures 18(c) and 18(d) show the 3D reconstructions of the proposed deep neural network and the 12-step PS algorithm, respectively. As shown in Fig. 18(c), these non-sinusoidal errors are compensated effectively by the proposed method. A pair of ceramic spheres was then tested, and the results are shown in Fig. 19. We can see that the deep neural network is able to eliminate the ripple errors successfully and reduce the MAE and STD to 0.041 mm and 0.038 mm, respectively. Further, Figs. 20(a)–20(c) show the 3D reconstructions of a ceramic plate by the traditional three-step PS method, our method, and the 12-step PS method, respectively. From the error distribution shown in Fig. 20(d), the MAE and the STD of the three-step PS method are 0.33 mm and 0.21 mm, respectively. When our method was applied, the MAE and the STD were reduced to 0.047 mm and 0.039 mm, an accuracy improvement of more than 80%.

## 5. CONCLUSION

Fringe analysis is important to fringe projection profilometry, which places high demands on the quality of the captured sinusoidal fringes. When the fringe is not a perfect sinusoid, the phase accuracy and the 3D reconstruction suffer. This paper focuses on several frequently encountered non-sinusoidal issues, including the gamma effect of digital projectors, residual high-order harmonics in binary defocusing projection, image saturation, and more complex cases where image saturation is coupled with the gamma effect or with the binary defocusing projection. Conventionally, these non-sinusoidal issues are seldom considered in a unified framework, and approaches that can handle the coupled cases are rarely discussed. Here, we have demonstrated that these kinds of non-sinusoidal patterns can be represented by a generalized model and that the corresponding phase errors can be relieved by increasing the number of phase-shifting steps in the PS algorithms. We proposed a unified deep learning technique that can analyze fringe images affected by all of the mentioned non-sinusoidal causes and their coupling scenarios. More importantly, to remove these phase errors without increasing the number of phase-shifting steps, we trained a deep neural network that mimics the phase computation of PS algorithms with many steps (e.g., the 12-step PS method) while using only the fringe images of a few-step PS method (e.g., the three-step PS method). Experimental results have shown that compared with the traditional PS algorithm, the proposed method can effectively suppress the phase errors due to the gamma effect of projectors, insufficient defocusing of binary fringe projection, and image saturation, as well as two complex coupled non-sinusoidal cases, without increasing the number of fringe images. We believe this method shows great potential for robust and accurate phase retrieval and 3D measurement.

## Funding

National Natural Science Foundation of China (61722506, 62075096); Leading Technology of Jiangsu Basic Research Plan (BK20192003); Jiangsu Provincial “One Belt and One Road” Innovation Cooperation Project (BZ2020007); Final Assembly “13th Five-Year Plan” Advanced Research Project of China (30102070102); Equipment Advanced Research Fund of China (61404150202); Jiangsu Provincial Key Research and Development Program (BE2017162); Outstanding Youth Foundation of Jiangsu Province of China (BK20170034); National Defense Science and Technology Foundation of China (2019-JCJQ-JJ-381); “333 Engineering” Research Project of Jiangsu Province (BRA2016407); Fundamental Research Funds for the Central Universities (30920032101).

## Disclosures

The authors declare no conflicts of interest.

## REFERENCES

**1. **K. Harding, “Industrial metrology: engineering precision,” Nat. Photonics **2**, 667–669 (2008). [CrossRef]

**2. **J. M. Schmitt, “Optical coherence tomography (OCT): a review,” IEEE J. Sel. Top. Quantum Electron. **5**, 1205–1215 (1999). [CrossRef]

**3. **J. Han, L. Shao, D. Xu, and J. Shotton, “Enhanced computer vision with Microsoft Kinect sensor: a review,” IEEE Trans. Cybern. **43**, 1290–1334 (2013). [CrossRef]

**4. **P. Paysan, R. Knothe, B. Amberg, S. Romdhani, and T. Vetter, “A 3D face model for pose and illumination invariant face recognition,” in *6th IEEE International Conference on Advanced Video and Signal Based Surveillance* (IEEE, 2009), pp. 296–301.

**5. **M. M. P. A. Vermeulen, P. Rosielle, and P. Schellekens, “Design of a high-precision 3D-coordinate measuring machine,” CIRP Ann. **47**, 447–450 (1998). [CrossRef]

**6. **F. Chen, G. M. Brown, and M. Song, “Overview of 3-D shape measurement using optical methods,” Opt. Eng. **39**, 10–22 (2000). [CrossRef]

**7. **R. Leach, *Optical Measurement of Surface Topography* (Springer, 2011), Vol. 14.

**8. **J. Geng, “Structured-light 3D surface imaging: a tutorial,” Adv. Opt. Photonics **3**, 128–160 (2011). [CrossRef]

**9. **J. Salvi, J. Pagès, and J. Batlle, “Pattern codification strategies in structured light systems,” Pattern Recogn. **37**, 827–849 (2004). [CrossRef]

**10. **C. Zuo, S. Feng, L. Huang, T. Tao, W. Yin, and Q. Chen, “Phase shifting algorithms for fringe projection profilometry: a review,” Opt. Lasers Eng. **109**, 23–59 (2018). [CrossRef]

**11. **L. Zhang, B. Curless, and S. Seitz, “Rapid shape acquisition using color structured light and multi-pass dynamic programming,” in *1st International Symposium on 3D Data Processing Visualization and Transmission* (IEEE Computer Society, 2002), pp. 24–36.

**12. **M. Schaffer, M. Grosse, B. Harendt, and R. Kowarschik, “High-speed three-dimensional shape measurements of objects with laser speckles and acousto-optical deflection,” Opt. Lett. **36**, 3097–3099 (2011). [CrossRef]

**13. **S. Heist, P. Lutzke, I. Schmidt, P. Dietrich, P. Kühmstedt, A. Tünnermann, and G. Notni, “High-speed three-dimensional shape measurement using GOBO projection,” Opt. Lasers Eng. **87**, 90–96 (2016). [CrossRef]

**14. **M. Takeda and K. Mutoh, “Fourier transform profilometry for the automatic measurement of 3-D object shapes,” Appl. Opt. **22**, 3977–3982 (1983). [CrossRef]

**15. **X. Su and Q. Zhang, “Dynamic 3-D shape measurement method: a review,” Opt. Lasers Eng. **48**, 191–204 (2010). [CrossRef]

**16. **Q. Kemao, “Two-dimensional windowed Fourier transform for fringe pattern analysis: principles, applications and implementations,” Opt. Lasers Eng. **45**, 304–317 (2007). [CrossRef]

**17. **J. Zhong and J. Weng, “Spatial carrier-fringe pattern analysis by means of wavelet transform: wavelet transform profilometry,” Appl. Opt. **43**, 4993–4998 (2004). [CrossRef]

**18. **L. Huang, Q. Kemao, B. Pan, and A. K. Asundi, “Comparison of Fourier transform, windowed Fourier transform, and wavelet transform methods for phase extraction from a single fringe pattern in fringe projection profilometry,” Opt. Lasers Eng. **48**, 141–148 (2010). [CrossRef]

**19. **V. Srinivasan, H.-C. Liu, and M. Halioua, “Automated phase-measuring profilometry of 3-D diffuse objects,” Appl. Opt. **23**, 3105–3108 (1984). [CrossRef]

**20. **P. S. Huang, Q. J. Hu, and F.-P. Chiang, “Double three-step phase-shifting algorithm,” Appl. Opt. **41**, 4503–4509 (2002). [CrossRef]

**21. **P. Hariharan, B. Oreb, and T. Eiju, “Digital phase-shifting interferometry: a simple error-compensating phase calculation algorithm,” Appl. Opt. **26**, 2504–2506 (1987). [CrossRef]

**22. **S. Zhang and S.-T. Yau, “High-speed three-dimensional shape measurement system using a modified two-plus-one phase-shifting algorithm,” Opt. Eng. **46**, 113603 (2007). [CrossRef]

**23. **P. Jia, J. Kofman, and C. E. English, “Two-step triangular-pattern phase-shifting method for three-dimensional object-shape measurement,” Opt. Eng. **46**, 083201 (2007). [CrossRef]

**24. **P. S. Huang, S. Zhang, and F.-P. Chiang, “Trapezoidal phase-shifting method for three-dimensional shape measurement,” Opt. Eng. **44**, 123601 (2005). [CrossRef]

**25. **T. Anna, S. K. Dubey, C. Shakher, A. Roy, and D. S. Mehta, “Sinusoidal fringe projection system based on compact and non-mechanical scanning low-coherence Michelson interferometer for three-dimensional shape measurement,” Opt. Commun. **282**, 1237–1242 (2009). [CrossRef]

**26. **Y. Guan, Y. Yin, A. Li, X. Liu, and X. Peng, “Dynamic 3D imaging based on acousto-optic heterodyne fringe interferometry,” Opt. Lett. **39**, 3678–3681 (2014). [CrossRef]

**27. **S. Yoneyama, Y. Morimoto, M. Fujigaki, and M. Yabe, “Phase-measuring profilometry of moving object without phase-shifting device,” Opt. Lasers Eng. **40**, 153–161 (2003). [CrossRef]

**28. **C. Zuo, Q. Chen, G. Gu, S. Feng, and F. Feng, “High-speed three-dimensional profilometry for multiple objects with complex shapes,” Opt. Express **20**, 19493–19510 (2012). [CrossRef]

**29. **S. Ma, C. Quan, R. Zhu, L. Chen, B. Li, and C. Tay, “A fast and accurate gamma correction based on Fourier spectrum analysis for digital fringe projection profilometry,” Opt. Commun. **285**, 533–538 (2012). [CrossRef]

**30. **K. Liu, Y. Wang, D. L. Lau, Q. Hao, and L. G. Hassebrook, “Gamma model and its analysis for phase measuring profilometry,” J. Opt. Soc. Am. A **27**, 553–562 (2010). [CrossRef]

**31. **S. Zhang and S.-T. Yau, “Generic nonsinusoidal phase error correction for three-dimensional shape measurement using a digital video projector,” Appl. Opt. **46**, 36–43 (2007). [CrossRef]

**32. **Z. Li, Y. Shi, C. Wang, and Y. Wang, “Accurate calibration method for a structured light system,” Opt. Eng. **47**, 053604 (2008). [CrossRef]

**33. **B. Pan, Q. Kemao, L. Huang, and A. Asundi, “Phase error analysis and compensation for nonsinusoidal waveforms in phase-shifting digital fringe projection profilometry,” Opt. Lett. **34**, 416–418 (2009). [CrossRef]

**34. **H. Guo, H. He, and M. Chen, “Gamma correction for digital fringe projection profilometry,” Appl. Opt. **43**, 2906–2914 (2004). [CrossRef]

**35. **T. Hoang, B. Pan, D. Nguyen, and Z. Wang, “Generic gamma correction for accuracy enhancement in fringe-projection profilometry,” Opt. Lett. **35**, 1992–1994 (2010). [CrossRef]

**36. **C. Jiang, S. Xing, and H. Guo, “Fringe harmonics elimination in multi-frequency phase-shifting fringe projection profilometry,” Opt. Express **28**, 2838–2856 (2020). [CrossRef]

**37. **B. Li, Y. Wang, J. Dai, W. Lohry, and S. Zhang, “Some recent advances on superfast 3D shape measurement with digital binary defocusing techniques,” Opt. Lasers Eng. **54**, 236–246 (2014). [CrossRef]

**38. **H. Fujita, K. Yamatan, M. Yamamoto, Y. Otani, A. Suguro, S. Morokawa, and T. Yoshizawa, “Three-dimensional profilometry using liquid crystal grating,” Proc. SPIE **5058**, 51–60 (2003). [CrossRef]

**39. **T. Yoshizawa and H. Fujita, “Liquid crystal grating for profilometry using structured light,” Proc. SPIE **6000**, 60000H (2005). [CrossRef]

**40. **G. A. Ayubi, J. A. Ayubi, J. M. Di Martino, and J. A. Ferrari, “Pulse-width modulation in defocused three-dimensional fringe projection,” Opt. Lett. **35**, 3682–3684 (2010). [CrossRef]

**41. **C. Zuo, Q. Chen, S. Feng, F. Feng, G. Gu, and X. Sui, “Optimized pulse width modulation pattern strategy for three-dimensional profilometry with projector defocusing,” Appl. Opt. **51**, 4477–4490 (2012). [CrossRef]

**42. **Y. Wang and S. Zhang, “Optimal pulse width modulation for sinusoidal fringe generation with projector defocusing,” Opt. Lett. **35**, 4121–4123 (2010). [CrossRef]

**43. **J. Sun, C. Zuo, S. Feng, S. Yu, Y. Zhang, and Q. Chen, “Improved intensity-optimized dithering technique for 3D shape measurement,” Opt. Lasers Eng. **66**, 158–164 (2015). [CrossRef]

**44. **W. Lohry and S. Zhang, “Genetic method to optimize binary dithering technique for high-quality fringe generation,” Opt. Lett. **38**, 540–542 (2013). [CrossRef]

**45. **S. Feng, L. Zhang, C. Zuo, T. Tao, Q. Chen, and G. Gu, “High dynamic range 3D measurements with fringe projection profilometry: a review,” Meas. Sci. Technol. **29**, 122001 (2018). [CrossRef]

**46. **S. Feng, Y. Zhang, Q. Chen, C. Zuo, R. Li, and G. Shen, “General solution for high dynamic range three-dimensional shape measurement using the fringe projection technique,” Opt. Lasers Eng. **59**, 56–71 (2014). [CrossRef]

**47. **S. Zhang and S.-T. Yau, “High dynamic range scanning technique,” Opt. Eng. **48**, 033604 (2009). [CrossRef]

**48. **Z. Song, H. Jiang, H. Lin, and S. Tang, “A high dynamic range structured light means for the 3D measurement of specular surface,” Opt. Lasers Eng. **95**, 8–16 (2017). [CrossRef]

**49. **S. Feng, Q. Chen, C. Zuo, and A. Asundi, “Fast three-dimensional measurements for dynamic scenes with shiny surfaces,” Opt. Commun. **382**, 18–27 (2017). [CrossRef]

**50. **C. Waddington and J. Kofman, “Saturation avoidance by adaptive fringe projection in phase-shifting 3D surface-shape measurement,” in *International Symposium on Optomechatronic Technologies* (IEEE, 2010), pp. 1–4.

**51. **L. Zhang, Q. Chen, C. Zuo, and S. Feng, “High dynamic range 3D shape measurement based on the intensity response function of a camera,” Appl. Opt. **57**, 1378–1386 (2018). [CrossRef]

**52. **H. Lin, J. Gao, Q. Mei, Y. He, J. Liu, and X. Wang, “Adaptive digital fringe projection technique for high dynamic range three-dimensional shape measurement,” Opt. Express **24**, 7703–7718 (2016). [CrossRef]

**53. **Z. Cai, X. Liu, X. Peng, Y. Yin, A. Li, J. Wu, and B. Z. Gao, “Structured light field 3D imaging,” Opt. Express **24**, 20324–20334 (2016). [CrossRef]

**54. **V. Suresh, Y. Wang, and B. Li, “High-dynamic-range 3D shape measurement utilizing the transitioning state of digital micromirror device,” Opt. Lasers Eng. **107**, 176–181 (2018). [CrossRef]

**55. **H. Jiang, H. Zhao, and X. Li, “High dynamic range fringe acquisition: a novel 3-D scanning technique for high-reflective surfaces,” Opt. Lasers Eng. **50**, 1484–1493 (2012). [CrossRef]

**56. **L. Zhang, Q. Chen, C. Zuo, and S. Feng, “Real-time high dynamic range 3D measurement using fringe projection,” Opt. Express **28**, 24363–24378 (2020). [CrossRef]

**57. **L. Zhang, Q. Chen, C. Zuo, and S. Feng, “High-speed high dynamic range 3D shape measurement based on deep learning,” Opt. Lasers Eng. **134**, 106245 (2020). [CrossRef]

**58. **J. H. Bruning, D. R. Herriott, J. Gallagher, D. Rosenfeld, A. White, and D. Brangaccio, “Digital wavefront measuring interferometer for testing optical surfaces and lenses,” Appl. Opt. **13**, 2693–2703 (1974). [CrossRef]

**59. **Y. Yin, Z. Cai, H. Jiang, X. Meng, J. Xi, and X. Peng, “High dynamic range imaging for fringe projection profilometry with single-shot raw data of the color camera,” Opt. Lasers Eng. **89**, 138–144 (2017). [CrossRef]

**60. **M. Wang, G. Du, C. Zhou, C. Zhang, S. Si, H. Li, Z. Lei, and Y. Li, “Enhanced high dynamic range 3D shape measurement based on generalized phase-shifting algorithm,” Opt. Commun. **385**, 43–53 (2017). [CrossRef]

**61. **Y. Chen, Y. He, and E. Hu, “Phase deviation analysis and phase retrieval for partial intensity saturation in phase-shifting projected fringe profilometry,” Opt. Commun. **281**, 3087–3090 (2008). [CrossRef]

**62. **Z. Qi, Z. Wang, J. Huang, C. Xing, and J. Gao, “Error of image saturation in the structured-light method,” Appl. Opt. **57**, A181–A188 (2018). [CrossRef]

**63. **P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, P.-A. Manzagol, and L. Bottou, “Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion,” J. Mach. Learn. Res. **11**, 3371–3408 (2010).

**64. **D. J. Im, S. Ahn, R. Memisevic, and Y. Bengio, “Denoising criterion for variational auto-encoding framework,” in *AAAI Conference on Artificial Intelligence* (2017), pp. 2059–2062.

**65. **Y. Kiarashinejad, M. Zandehshahvar, S. Abdollahramezani, O. Hemmatyar, R. Pourabolghasem, and A. Adibi, “Knowledge discovery in nanophotonics using geometric deep learning,” Adv. Intell. Syst. **2**, 1900132 (2020). [CrossRef]

**66. **Y. Rivenson, Z. Göröcs, H. Günaydin, Y. Zhang, H. Wang, and A. Ozcan, “Deep learning microscopy,” Optica **4**, 1437–1443 (2017). [CrossRef]

**67. **Y. Rivenson, Y. Zhang, H. Günaydin, D. Teng, and A. Ozcan, “Phase recovery and holographic image reconstruction using deep learning in neural networks,” Light Sci. Appl. **7**, 17141 (2018). [CrossRef]

**68. **M. Lyu, W. Wang, H. Wang, H. Wang, G. Li, N. Chen, and G. Situ, “Deep-learning-based ghost imaging,” Sci. Rep. **7**, 17865 (2017). [CrossRef]

**69. **Y. Li, Y. Xue, and L. Tian, “Deep speckle correlation: a deep learning approach toward scalable imaging through scattering media,” Optica **5**, 1181–1190 (2018). [CrossRef]

**70. **C. S. Lee, A. J. Tyring, N. P. Deruyter, Y. Wu, A. Rokem, and A. Y. Lee, “Deep-learning based, automated segmentation of macular edema in optical coherence tomography,” Biomed. Opt. Express **8**, 3440–3448 (2017). [CrossRef]

**71. **S. Feng, Q. Chen, G. Gu, T. Tao, L. Zhang, Y. Hu, W. Yin, and C. Zuo, “Fringe pattern analysis using deep learning,” Adv. Photonics **1**, 025001 (2019). [CrossRef]

**72. **J. Qian, S. Feng, Y. Li, T. Tao, J. Han, Q. Chen, and C. Zuo, “Single-shot absolute 3D shape measurement with deep-learning-based color fringe projection profilometry,” Opt. Lett. **45**, 1842–1845 (2020). [CrossRef]

**73. **J. Shi, X. Zhu, H. Wang, L. Song, and Q. Guo, “Label enhanced and patch based deep learning for phase retrieval from single frame fringe pattern in fringe projection 3D measurement,” Opt. Express **27**, 28929–28943 (2019). [CrossRef]

**74. **T. Yang, Z. Zhang, H. Li, X. Li, and X. Zhou, “Single-shot phase extraction for fringe projection profilometry using deep convolutional generative adversarial network,” Meas. Sci. Technol. **32**, 015007 (2020). [CrossRef]

**75. **W. Yin, Q. Chen, S. Feng, T. Tao, L. Huang, M. Trusiak, A. Asundi, and C. Zuo, “Temporal phase unwrapping using deep learning,” Sci. Rep. **9**, 20175 (2019). [CrossRef]

**76. **J. Qian, S. Feng, T. Tao, Y. Hu, Y. Li, Q. Chen, and C. Zuo, “Deep-learning-enabled geometric constraints and phase unwrapping for single-shot absolute 3D shape measurement,” APL Photonics **5**, 046105 (2020). [CrossRef]

**77. **O. Ronneberger, P. Fischer, and T. Brox, “U-Net: convolutional networks for biomedical image segmentation,” in *International Conference on Medical Image Computing and Computer-Assisted Intervention* (Springer, 2015), pp. 234–241.

**78. **C. Zuo, T. Tao, S. Feng, L. Huang, A. Asundi, and Q. Chen, “Micro Fourier transform profilometry (μFTP): 3D shape measurement at 10,000 frames per second,” Opt. Lasers Eng. **102**, 70–91 (2018). [CrossRef]