Abstract
We propose an easy-to-implement yet accurate calibration method for large-scale 3D measurements that makes use of a regular-sized phase target and two planar mirrors. Being insensitive to severe defocus, the phase target is placed so as to span a large depth within the field of view (FOV) of each camera for accurate intrinsic calibration. Extrinsic calibration is achieved by placing the phase target in the FOV of a short-range virtual stereo-system generated by the mirrors. Results from 3D shape and deformation measurements demonstrate that the proposed method is capable of operating within a working volume of 3 m × 2 m × 1.8 m with an error < 0.1% of the FOV, thus opening up new possibilities for large-scale measurements in mechanical and civil engineering applications.
© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement
Corrections
8 January 2020: A typographical correction was made to Fig. 3.
1. Introduction
In the last few decades, stereo-Digital Image Correlation (stereo-DIC) has been widely accepted as a powerful and versatile tool for non-contact full-field 3D shape and surface deformation measurement in experimental solid mechanics [1,2]. The method relies on the analysis of pairs of images obtained from a calibrated stereo-vision system, which employs two rigidly mounted cameras to capture a common region of interest (ROI) from two angled viewing directions. The surface of the test object needs to be provided with a natural or synthetic stochastic pattern to enable image registration, which consists of matching homologous point pairs in the two stereo-views of the ROI [3]. The versatility and flexibility of DIC allow its application to a large variety of materials and samples, experimental conditions and imaging techniques [2]. As regards the length scale, stereo-DIC applications in experimental mechanics range from micro to macro scale [2], with a recent ever-increasing interest towards large-scale measurements [4–6]. For this latter application, however, stereo-DIC is still far from reaching its full potential, mainly due to three major challenges that Sutton and associates [4] identified as: i) surface patterning, ii) imaging of the structure (i.e. appropriately selecting lens and stereo-angle) and iii) calibrating the stereo-system (see [4] for an extensive discussion on the above issues). This work aims to address the latter challenge in particular by developing a generalized and easy-to-implement stereo-camera calibration method for large-scale applications.
Stereo-rig calibration aims at finding the intrinsic parameters (principal point, distortion parameters and focal length) of each camera and the extrinsic parameters (rotation and translation) defining the relative pose between cameras. Both sets of parameters are needed to attain 3D information from the 2D coordinates of corresponding image points through stereo-triangulation [1]. Among the various calibration techniques used in the computer vision community, the two methods presented in the seminal papers by Zhang [7] and Tsai [8] are commonly taken as key references for stereo-DIC system calibration with 2D and 3D targets, respectively. For both approaches, the three basic requirements for stereo-camera calibration are:
- 1) the calibration target needs to be placed in the common field of view (FOV) of the two cameras;
- 2) the calibration target should cover at least 1/3 of the common FOV of cameras [5,7];
- 3) the features of the calibration pattern (either checkerboard or dots array) need to be in focus and sufficiently spatially resolved to allow accurate centroids detection.
As can be easily understood, it is impractical to adopt a regular-sized calibration target when attempting to perform wide-FOV (> 1 m width) measurements, as it would be difficult and expensive to accurately manufacture a large calibration target.
A practical solution for satisfying both requirements 2 and 3 listed above is to place a regular-sized target at a close distance from the camera so as to fill most of its FOV. A consequential drawback, however, is the blurring of the image due to the severe defocus in the camera close-range, which disallows the use of a typical passive target such as a printed checkerboard. To address this issue, the use of active calibration targets has recently been proposed in the literature [9–12]. An active target is a synthetic array of marks coded into the phase map generated by a set of phase-shifted fringe patterns [13] displayed by a standard video monitor. Bell et al. [12] demonstrated that camera intrinsic parameters can be accurately retrieved with active phase targets regardless of the amount of defocusing. However, when an active target is placed in the close-range area of a camera, the calibration pattern cannot be simultaneously seen by both cameras (i.e. condition 1 listed above is not satisfied) and hence it is not possible to retrieve their relative pose. An et al. [11] addressed this problem by using an active target to calibrate the intrinsic parameters of the camera and the projector of a large-range structured light (SL) system, and later retrieving the extrinsic parameters of the system with the assistance of a low-accuracy 3D sensor (e.g., Microsoft Kinect). Most recently, Wang et al. [14] proposed an out-of-focus calibration method for large-scale SL systems that makes use of a regular-sized passive target and a planar mirror. In particular, the planar mirror is used to deviate the line of sight of the camera so as to generate a virtual camera that shares a close-range FOV with the projector.
Although this expedient allows the use of a regular-sized circle-dot array, an additional in-focus camera and two sets of projected fringe patterns are needed to extract the circle center locations from the severely blurred images of the passive target [15]. Some other examples of mirror-based extrinsic calibration are available in the computer vision literature for cases in which the stereo-cameras cannot observe the calibration target directly or have non-overlapping FOVs [16–18]. However, all these methods adopt a regular-sized checkerboard pattern even when relatively large FOVs are involved, with a possible negative effect on the accuracy of the calibration results.
In this work, we present a novel long-range stereo-rig calibration method that combines the advantages of using active targets (i.e. insensitivity to defocus in the close-range) and planar mirrors (i.e. the possibility to create a virtual camera with a tailored working distance). The paper is organized as follows. First, the rationale of the proposed method along with a detailed description of the optical layout and the stereo-camera calibration procedure is reported. Then, the methods and the results of the experimental campaign carried out to assess the metrological performance of a long-range stereo-DIC system are detailed and discussed. In particular, accuracy and uncertainty in 3D-DIC measurements of shape, displacement and deformation were estimated over a working volume (WV) of 3 m × 2 m × 1.8 m (width × height × depth). Finally, the potential of such a calibrated long-range stereo-system is illustrated by reconstructing a 3D scene with objects of different size and geometry placed at different poses over the considered WV.
2. Rationale
In this work, we propose to calibrate a long-range stereo-camera system by generating a virtual stereo-system (hereafter indicated as VS) with a working distance (WD) much shorter than that of the real stereo-system (RS). To this aim, two mirrors are placed in such a way as to symmetrically deviate backwards the viewing directions of the long-range RS, as illustrated in Fig. 1. This allows a regular-sized calibration target to fill most of the reduced field of view (FOV) of the VS, thus enabling accurate modeling and correction of the lens distortions [5]. The main problem associated with the shortening of the working distance, however, is the severe defocus that disallows the use of a passive calibration target (e.g. a planar checkerboard). A video monitor is hence used as an active calibration target, i.e. it sequentially displays a sequence of phase-shifted sinusoidal circular fringe patterns. Being insensitive to image blurring [11], the fringe patterns generate a sharp phase map encoding a grid of fiducial points, as in a typical calibration target. This is the main idea of the proposed stereo-calibration method for large-scale DIC measurements.
Figure 2 shows a picture of the experimental setup adopted in this work to validate the proposed procedure. Following the layout depicted in Fig. 1, two scientific-grade cameras (Dalsa Falcon 4M30, CMOS sensor, 8 bits) equipped with 28–105 mm Nikkor zoom lenses (set at ~30 mm focal length) were placed at a distance of ~1650 mm and best-focused on their axes' crossing point at a working distance of ~5400 mm. Two 185 mm × 115 mm planar mirrors mounted on kinematic mounts were then symmetrically positioned and adjusted so as to have a 27” video monitor (Philips model 273V5LHAB/00, 1920 × 1080 pixels resolution, 0.311 mm pixel pitch) filling about half of the close-range working volume of the VS (out-of-focus area in Fig. 1).
Figure 3 shows the mirrored images of the four synthetically generated phase-shifted circular fringe patterns and the corresponding phase map. In particular, the four displayed images were each obtained from eighteen 300 × 300 pixels sub-images generated with the equation

I_n(x, y) = a + b·cos(2πf r + δ_n)

where I_n(x, y) is the intensity greyscale value at pixel (x, y), a and b are the background intensity and the intensity modulation amplitude, respectively [13], r is the pixel radial position within the sub-image, f is the inverse of the fringe pitch and δ_n = nπ/2 (n = 0, 1, 2, 3) is the fringe phase step. Note that, although the acquired images are severely blurred due to the strong defocus in the close-range FOV (Figs. 3(a)-3(d)), a sharp phase map with a regular array of circular dots can still be identified (Fig. 3(e)). From this map, after a simple greyscale thresholding segmentation, it is possible to extract the grid of features with well-known positions that serves for camera calibration (dots pitch 300 pixels = 93.3 mm, Fig. 3(f)).
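As an illustrative sketch (not the authors' original code), the fringe generation and the standard four-step phase retrieval described above can be reproduced in a few lines of Python. The sub-image size, fringe pitch and intensity values below are hypothetical placeholders:

```python
import numpy as np

def fringe_subimage(size=300, pitch=50.0, delta=0.0):
    """One circular fringe sub-image I(r) = a + b*cos(2*pi*r/pitch + delta)."""
    y, x = np.mgrid[0:size, 0:size].astype(float)
    r = np.hypot(x - (size - 1) / 2, y - (size - 1) / 2)  # radial pixel position
    a = b = 127.5                                          # 8-bit background/modulation
    return a + b * np.cos(2 * np.pi * r / pitch + delta)

def four_step_phase(i1, i2, i3, i4):
    """Wrapped phase from four patterns shifted by pi/2: phi = atan2(I4-I2, I1-I3)."""
    return np.arctan2(i4 - i2, i1 - i3)

# Four phase-shifted patterns (delta_n = n*pi/2) and the recovered phase map.
patterns = [fringe_subimage(delta=n * np.pi / 2) for n in range(4)]
phi = four_step_phase(*patterns)
```

Since the phase-shifting algorithm depends only on intensity differences, the recovered phase map stays sharp even when the displayed patterns are heavily blurred by defocus, which is the property the calibration exploits.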
With this simple and inexpensive layout, by keeping both cameras and monitor fixed, from a minimum set of three (per camera) tilted poses of the mirror obtained by acting on the kinematic mount, it is possible to retrieve the pose of the real right (left) camera {R} ({L}) with respect to the local coordinate system {T} associated with the active target. From these two poses it is then possible to calculate the relative pose between the two real cameras as needed for subsequent stereo-triangulation. The model and notation used for the extrinsic parameter calibration are illustrated for the left camera in Fig. 4 (an analogous scheme can be drawn for the right camera) and the developed procedure is described as follows.
At each mirror position j (j = 1, 2, 3), the virtual image P′_j of a fiducial point P on the calibration target is projected into the point p_j of the left camera sensor. The relationship existing between the coordinates of the fiducial point in the camera reference system {L} and in the target reference system {T} is:

P_L = R·P_T + t    (1)
The extrinsic parameter calibration aims at estimating the rotation matrix R and the translation vector t in Eq. (1). The calibration procedure relies on the fundamental relationship existing between a physical point and its reflection by a mirror, also known as the Householder transformation (see e.g. [16] for the equation derivation):

P′_j = P_L − 2(n_j·P_L + d_j) n_j    (2)
where n_j is the unit normal vector of the j-th mirror and d_j is its distance from {L} (Fig. 2).
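The Householder reflection of Eq. (2) can be sketched as a small helper function (a hypothetical Python illustration, not part of the original work):

```python
import numpy as np

def reflect(p, n, d):
    """Householder reflection of point p about the mirror plane n.x + d = 0,
    where n is the (unit) plane normal and d its signed distance from the origin."""
    n = np.asarray(n, float)
    n = n / np.linalg.norm(n)
    return p - 2.0 * (n @ p + d) * n
```

Reflecting twice about the same plane returns the original point, and any point lying on the mirror plane is left unchanged, which is a quick sanity check of the transformation.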
If two mirrored 3D point positions P′_j and P′_k of the same reference point of the calibration target are considered, it is possible to demonstrate that the following orthogonality constraint holds true [16]:

m_jk·(P′_j − P′_k) = 0    (3)
where m_jk is the axis vector which lies along the intersection of the two mirror planes. By stacking the orthogonality constraints for three non-aligned reference points at the two mirrored positions, the following relation is obtained:

A_jk m_jk = 0    (4)
By left-multiplying Eq. (4) by A_jk^T, it becomes:

M_jk m_jk = 0    (5)
where M_jk = A_jk^T A_jk is a positive semidefinite matrix containing the coordinates of the mirrored points P′_j and P′_k, which can be computed from the image points through common calibration procedures [7]. From Eq. (5), it is hence possible to calculate m_jk as the eigenvector corresponding to the smallest eigenvalue of M_jk (in the noiseless ideal case being equal to zero). Although only three points would suffice to calculate m_jk, to minimize the effect of experimental errors, in this work we considered a large number of different combinations of three non-aligned fiducial points of the calibration target (Fig. 3(f)) and sorted out the minimum computed eigenvalue (O(10−13)) from the entire set.
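The axis estimation thus reduces to a small eigenvalue problem. The following Python sketch (a hypothetical illustration with synthetic mirrored points, not the authors' code) shows the computation for one pair of mirror poses:

```python
import numpy as np

def mirror_axis(pts_j, pts_k):
    """Axis of two mirror planes from >= 3 non-aligned reference points seen
    in mirror poses j and k. The constraint m.(P'_j - P'_k) = 0 stacks into
    A m = 0; m is the eigenvector of M = A^T A with the smallest eigenvalue."""
    A = np.asarray(pts_j, float) - np.asarray(pts_k, float)
    M = A.T @ A                      # 3x3 positive semidefinite matrix
    w, v = np.linalg.eigh(M)         # eigenvalues returned in ascending order
    return v[:, 0], w[0]             # axis estimate and residual eigenvalue
```

With noiseless synthetic data the residual eigenvalue is numerically zero, mirroring the O(10−13) values reported above for the experimental data.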
Once m_12, m_13 and m_23 have been computed with the procedure above for the three considered mirror positions, it is possible to estimate the three normal vectors as:

n_1 ∝ m_12 × m_13,  n_2 ∝ m_12 × m_23,  n_3 ∝ m_13 × m_23    (6)
As a final step, for each mirror position j, a line l_j parallel to n_j and passing through the mirrored point P′_j is drawn. Since the lines l_1, l_2 and l_3 cross (in a least-squares sense) at the physical point P (Fig. 4), its position in the camera coordinate system {L} can be easily computed and used in Eq. (1) for estimating the rotation matrix R and the translation vector t, i.e. the transformation from {T} to {L}. With an analogous procedure, from the mirrored images at three independent poses of the right mirror, it is possible to estimate the pose of the right camera with respect to the reference system {T} associated with the active target and, finally, the relative pose of the two real cameras as needed for stereo-triangulation.
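The least-squares intersection of the three lines admits a closed-form solution obtained by summing, for each line, the projector orthogonal to its direction. The following Python sketch (an illustration under the stated assumptions, not the authors' implementation) shows one way to compute it:

```python
import numpy as np

def intersect_lines(points, directions):
    """Least-squares intersection of lines x = p_j + t*n_j:
    solve sum_j (I - n_j n_j^T)(x - p_j) = 0 for x."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for p, n in zip(points, directions):
        n = np.asarray(n, float) / np.linalg.norm(n)
        P = np.eye(3) - np.outer(n, n)   # projector orthogonal to the line
        A += P
        b += P @ np.asarray(p, float)
    return np.linalg.solve(A, b)
```

Because each mirrored point P′_j lies on the line through the physical point P along the mirror normal n_j, feeding the three mirrored points and the three normals to this routine recovers P exactly in the noiseless case.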
To encourage its adoption and allow the reproducibility of related results, the entire calibration procedure is described step by step as follows.
Step 1 - System Setup
The long-range stereo-camera system is set up and focused on the scene of interest so as to perform optimally under the current measurement conditions. The video monitor is then placed at a close distance just behind the cameras, facing the measurement scene as pictured in Fig. 2. A set of phase-shifted circular fringe patterns [13] is synthetically generated by selecting the fringe pitch so as to obtain a wrapped phase map with a single phase jump and with dark areas of sufficiently large size to enable accurate centroid detection (see Fig. 3).
Step 2 - Images capturing for extrinsic calibration
Two identical planar mirrors are fixed on two kinematic mounts and placed conveniently close to the cameras. Consider, in fact, that small tilt angles give rise to large displacements of the monitor in the image and that this effect increases with the camera-to-mirror distance. At three independent positions of the mirrors for each camera, the sequence of phase-shifted fringe patterns is acquired and stored for subsequent data processing. Note that, since the normal vectors to the mirrors are calculated from Eq. (6), the mirror positions should be selected such that n_1, n_2 and n_3 are not parallel to each other (see e.g. the first three frames in Fig. 5 as an example of a valid set of mirrored images).
Step 3 - Images capturing for intrinsic calibration
Although only three positions of the mirrored monitor suffice for extrinsic calibration [16], it is more appropriate to consider a larger number (>10) of images to enhance the accuracy of the intrinsic calibration [7] (Fig. 5). To this end, the mirrors are removed and the monitor is placed in multiple poses at different distances within each separate camera FOV. Indeed, owing to the insensitivity to defocus of the fringe patterns, high-quality phase maps can be obtained independently of the monitor-to-camera distance (see Fig. 6 and compare the last two frames in Fig. 5, corresponding to the closest and farthest positions of the set considered for the left camera calibration). The last frame in Fig. 5 corresponds to the closest position in which the monitor is entirely included in the common field of view of the long-range stereo-camera system.
Step 4 - Intrinsic calibration
The set of three mirrored images plus the additional images collected in step 3 (thirteen in this work) are used to separately calibrate the right and the left camera through common calibration procedures [7]. As a result, the intrinsic parameters of each camera and the coordinates of the mirrored points P′_j (j = 1, 2, 3) to be used in Eq. (1) are computed. Images with mirrored views and direct views of the active target can be used in a bundle, since it has been demonstrated elsewhere that viewing through a mirror has no effect on the estimation of the camera intrinsic parameters [12]. In this work, a reprojection error of 0.117 ± 0.09 pixels and 0.115 ± 0.07 pixels was calculated for the right and left camera, respectively. The largest reprojection error was found for the mirrored images and is most likely ascribable to the low quality of the mirrors (taken from overhead projectors). Moreover, even though the effect of the perspective view on the centroid calculation was taken into account in the image data processing [19], its accuracy could still have been affected by the down-sampling of the circle dots along the horizontal direction (see the first three images in Fig. 5).
Step 5 - Extrinsic calibration
As a final step of the calibration, the three mirrored images are used to retrieve the relative pose between the two real long-range stereo-cameras according to the procedure detailed above. The right and left images of the last position in the series of Fig. 5 can be used for a quick check of the 3D reconstruction error (and thus of the accuracy of the stereo-camera calibration), since this is the sole captured image pair in which the calibration target is seen simultaneously by the two real cameras.
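Once the target-to-camera transforms of the two cameras are known, the relative pose follows from a simple composition of homogeneous matrices. The Python sketch below is a minimal illustration assuming the convention X_cam = R·X_T + t (the authors' implementation may use a different convention):

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack a rotation matrix and translation vector into a 4x4 transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.asarray(t, float)
    return T

def relative_pose(T_L, T_R):
    """Transform mapping left-camera coordinates to right-camera coordinates,
    given the target-to-camera transforms: X_R = T_R X_T, X_L = T_L X_T,
    hence T_RL = T_R @ inv(T_L)."""
    return T_R @ np.linalg.inv(T_L)
```

The resulting 4×4 matrix contains the rotation and translation between the two real cameras needed for stereo-triangulation.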
3. Experiments
To evaluate the metrological performance of a long-range stereo-system calibrated with the proposed method, the same 27” video monitor used for calibration also served as a test object for the shape and deformation error analysis. The size of the FOV, in fact, precludes finding a sufficiently accurate test object filling most of the measurement working volume (WV). For this reason, by assuming the screen to be sufficiently flat and the pixel-to-mm conversion accurate enough (1 pixel = 0.311 mm from the manufacturer's data sheet), a composite 3D test object was created by sequentially placing the monitor in 27 positions defining a fairly regular 3 × 3 × 3 matrix spanning a WV of about 3 m (width) × 2 m (height) × 1.8 m (depth) centered at the crossing point of the camera axes (Fig. 7).
A first set of experiments aimed to map the 3D reconstruction error within the measurement volume. To this end, two different targets were synthetically generated: i) a 1800 × 900 pixels image with a regular 4 × 2 pattern of circles with a 300 pixels diameter at a 450 pixels distance and ii) a 1800 × 900 pixels image with a synthetic speckle pattern generated with the freely available Matlab code developed by Sur and coauthors [20].
In each monitor position (Fig. 7), the images of both the dot and the speckle patterns were sequentially acquired. Figure 8 shows images composed as mosaics of the 27 sub-images of the monitor with the dots (first row) and the speckle pattern (second row) at three different through-depth planes within the WV. The magnification factor along the depth, roughly evaluated from the pitch of the dot pattern images at the (2, 2, k) positions (with k = 1, 2, 3), is 1.32, 1.59 and 1.84 mm/pixel at the front, middle and rear planes, respectively. A significant variation of the magnification factor due to the oblique view can also be observed at the same depth along the parallax direction (see e.g. the first image in Fig. 8).
A first analysis of the 3D reconstruction error was performed by using only the information on the centroid positions of the dot pattern, thus excluding the possible effect of the DIC registration algorithm on the measurement accuracy. The possible deleterious effects of the spatial down-sampling and of the defocus blur of the images are expected to be negligible given the size of the dots and the large depth of focus of the video system. Figure 9(a) reports the 3D plot of the reconstructed dot patterns (8 dots for each of the 27 considered positions within the working volume) with an indication of the distance between the dots averaged at each monitor position. The theoretical distance between dots is 139.95 mm (corresponding to a 450 pixels pitch), while the measured dot distance averaged over the whole set of positions is 138.7 ± 0.46 mm, which corresponds to a relative error of 0.18%. Although the measurement error is very low, the volumetric map of the dot distances nonetheless reveals a clear pattern over the working volume, thus indicating the existence of a position-dependent measurement bias.
In a subsequent analysis, the 27 images with the speckle pattern (Fig. 8, second row) were processed to verify the accuracy in reconstructing the monitor plane throughout the working volume with DIC. For each pair of images, a regular point grid with a 2 pixels spacing was correlated by using a 21 × 21 pixels subset size and a second-order shape function. Figure 9(b) reports, for each of the 3D surfaces reconstructed with 3D-DIC, the deviation from its best-fitting plane. An average planarity error of 0.17 ± 0.038 mm was calculated for the whole set of monitor positions.
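The best-fitting plane and the point-to-plane deviations can be obtained from a singular value decomposition of the centred point cloud; the following Python sketch (a minimal illustration, not the authors' processing code) shows the idea:

```python
import numpy as np

def planarity_error(points):
    """Signed deviations of 3D points from their least-squares plane.
    The plane normal is the right singular vector of the centred point
    cloud associated with the smallest singular value."""
    P = np.asarray(points, float)
    c = P.mean(axis=0)                      # plane passes through the centroid
    _, _, vt = np.linalg.svd(P - c, full_matrices=False)
    normal = vt[-1]                         # direction of least variance
    dist = (P - c) @ normal                 # signed point-to-plane distances
    return dist, normal
```

The standard deviation (or maximum absolute value) of the returned distances gives the planarity error quoted above for each reconstructed monitor surface.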
Two additional tests were performed to quantify the error in the 3D displacement and deformation measurements. In particular, with the monitor in the (2, 2, 2) position, the displayed speckle image was first shifted along the horizontal direction to 8 known positions (listed in Table 1) and then synthetically deformed [20] with an inhomogeneous displacement function u(x) applied along the horizontal direction.
Table 1 reports the results obtained for the displacement measurement test. It should be remarked that, although the displacement applied to the synthetic speckle pattern displayed in the monitor is an in-plane displacement distribution, the experimental results in Table 1 refer to the magnitude of the total displacement of the video monitor plane and hence include both the DIC registration error and the 3D reconstruction error.
Figures 10(a) and 10(b) report the pair of undeformed/deformed synthetic images displayed for evaluating the accuracy of a deformation measurement. To compare the experimental 3D deformation map to its theoretical 2D counterpart, the coordinates of the 3D-DIC reconstructed surfaces of the video monitor in the undeformed and deformed configurations were transformed into a new reference system having their best-fitting plane as the xy plane. The 3D displacement component maps were hence computed and the u displacement map (Fig. 10(d)) was compared to the theoretical (imposed) image deformation. A spurious displacement along the vertical direction of v = 0.016 ± 0.01 pixels and along the out-of-plane direction of w = 0.01 ± 0.19 pixels was obtained from the stereo-DIC measurement. The plots of the two superimposed u displacement profiles are reported in Fig. 10(e), showing a good overlap between the two distributions with an expected peak smoothing (of about 3%) due to the low spatial resolution of the ROI (389 × 239 pixels, processed with a 21 × 21 pixels subset size, Fig. 10(c)) and to the large deformation gradient applied to the synthetic image (see also the plot of the residuals between experimental and theoretical full-field data in Fig. 10(f)).
As a final test, with the aim of further illustrating the validity and potential of the proposed method, a more complex scene including objects of different shapes and sizes was laid out over the WV, as pictured in Fig. 11(a). A synthetic speckle pattern was then projected onto the scene to provide the texture needed for stereo correlation (Fig. 11(b)). As an illustrative example, the 3D reconstructed geometries of the objects visible in the scene are reported together with the corresponding depth map in Figs. 11(c) and 11(d).
4. Conclusion
This paper reports a proof of concept of a generalized and easy-to-implement calibration method for large-scale stereo-DIC systems that overcomes the limitations of traditional calibration schemes. In particular, the method exploits the insensitivity to defocus of phase targets and the flexibility offered by a mirror-based extrinsic calibration procedure. The results in terms of accuracy and uncertainty of the 3D shape and deformation measurements should be read in light of the very challenging experimental conditions of this study, which include a very low image spatial resolution and a considerable perspective deformation of the stereo-images due to the large values of the WD and camera baseline.
An accurate stereo-rig calibration is a prerequisite of foremost importance for accurate long-range stereo-DIC measurements. However, two key problems [4] remain to be addressed: i) how to create a suitable speckle pattern and ii) how to optimize the imaging system (camera resolution, lens, WD, camera baseline). These issues will be addressed in future work with the aim of extending the possibility of performing high-accuracy DIC measurements for large-scale applications in materials science, biomechanics, civil, geotechnical, automotive and aerospace engineering.
Funding
National Natural Science Foundation of China (NSFC) (11872009, 11632010).
Acknowledgments
The authors wish to thank MSc Michelangelo Nigro for his assistance in conducting the experimental tests.
References
1. M. A. Sutton, J. J. Orteu, and H. Schreier, Image correlation for shape, motion and deformation measurements: basic concepts, theory and applications (Springer Science & Business Media, 2009).
2. B. Pan, “Digital image correlation for surface deformation measurement: historical developments, recent advances and future goals,” Meas. Sci. Technol. 29(8), 082001 (2018). [CrossRef]
3. B. Pan, K. Qian, H. Xie, and A. Asundi, “Two-dimensional digital image correlation for in-plane displacement and strain measurement: a review,” Meas. Sci. Technol. 20(6), 062001 (2009). [CrossRef]
4. R. Ghorbani, F. Matta, and M. A. Sutton, “Full-field deformation measurement and crack mapping on confined masonry walls using digital image correlation,” Exp. Mech. 55(1), 227–243 (2015). [CrossRef]
5. M. Sutton, F. Matta, D. Rizos, R. Ghorbani, S. Rajan, D. Mollenhauer, H. Schreier, and A. Lasprilla, “Recent progress in digital image correlation: background and developments since the 2013 Murray lecture,” Exp. Mech. 57(1), 1–30 (2017). [CrossRef]
6. X. Shao, X. Dai, Z. Chen, Y. Dai, S. Dong, and X. He, “Calibration of stereo-digital image correlation for deformation measurement of large engineering components,” Meas. Sci. Technol. 27(12), 125010 (2016). [CrossRef]
7. Z. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. Pattern Anal. Mach. Intell. 22(11), 1330–1334 (2000). [CrossRef]
8. R. Tsai, “A versatile camera calibration technique for high-accuracy 3d machine vision metrology using off-the-shelf tv cameras and lenses,” IEEE J. Robot. Autom. 3(4), 323–344 (1987). [CrossRef]
9. C. Schmalz, F. Forster, and E. Angelopoulou, “Camera calibration: active versus passive targets,” Opt. Eng. 50(11), 113601 (2011). [CrossRef]
10. L. Huang, Q. Zhang, and A. Asundi, “Camera calibration with active phase target: improvement on feature detection and optimization,” Opt. Lett. 38(9), 1446–1448 (2013). [CrossRef] [PubMed]
11. Y. An, T. Bell, B. Li, J. Xu, and S. Zhang, “Method for large-range structured light system calibration,” Appl. Opt. 55(33), 9563–9572 (2016). [CrossRef] [PubMed]
12. T. Bell, J. Xu, and S. Zhang, “Method for out-of-focus camera calibration,” Appl. Opt. 55(9), 2346–2352 (2016). [CrossRef] [PubMed]
13. S. S. Gorthi and P. Rastogi, “Fringe projection techniques: whither we are?” Opt. Lasers Eng. 48(2), 133–140 (2010). [CrossRef]
14. P. Wang, J. Wang, J. Xu, Y. Guan, G. Zhang, and K. Chen, “Calibration method for a large-scale structured light measurement system,” Appl. Opt. 56(14), 3995–4002 (2017). [CrossRef] [PubMed]
15. B. Li, N. Karpinsky, and S. Zhang, “Novel calibration method for structured-light system with an out-of-focus projector,” Appl. Opt. 53(16), 3415–3426 (2014). [CrossRef] [PubMed]
16. K. Takahashi, S. Nobuhara, and T. Matsuyama, “A new mirror-based extrinsic camera calibration using an orthogonality constraint,” in 2012 IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2012), pp. 1051–1058. [CrossRef]
17. J. A. Hesch, A. I. Mourikis, and S. I. Roumeliotis, “Mirror-based extrinsic camera calibration,” in Algorithmic Foundation of Robotics VIII, (Springer, 2009), pp. 285–299.
18. R. K. Kumar, A. Ilie, J.-M. Frahm, and M. Pollefeys, “Simple calibration of non-overlapping cameras with a mirror,” in 2008 IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2008), pp. 1–7. [CrossRef]
19. J. Heikkila and O. Silven, “A four-step camera calibration procedure with implicit image correction,” in 1997 IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 1997), pp. 1106–1112. [CrossRef]
20. F. Sur, B. Blaysat, and M. Grediac, “Rendering deformed speckle images with a boolean model,” J. Math. Imaging Vis. 60, 1–17 (2017).