## Abstract

Camera calibration is necessary for accurate image measurements, particularly in multicamera systems. The calibration process involves matching the coordinates of 3D calibration points with their 2D image coordinates and requires the establishment of a reliable 3D world coordinate system. This paper presents a convenient multicamera calibration method that uses a rotating calibration plate and multi-view stereo vision to calculate the 3D points and their relationship with the image coordinates. Although it is simple to implement, rotating the calibration plate yields numerous calibration points distributed over many planes, which increases the stability of the solution and reduces the influence of noise. The relocation accuracy and reprojection error are experimentally verified.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

Camera calibration plays an important role in both image measurements and machine vision tasks. Under most circumstances, camera parameters, including intrinsic, extrinsic, and distortion parameters, are obtained through calculations. Camera calibration calculates the relationship between 3D calibration points in space and 2D coordinates on the image plane. Therefore, it is essential to define a 3D world coordinate system and obtain the 3D coordinates of calibration points. The camera calibration is a prerequisite for any work that follows. It is obvious that calibration results directly influence the accuracy of image measurements.

To achieve accurate image measurement results, it is necessary to utilize a calibration object to define the 3D coordinates of the calibration points. First, the 3D world coordinates are usually defined on the calibration object. Cameras then capture images of the calibration object to obtain the 2D image coordinates corresponding to these 3D calibration points. Thus, the associations between the 3D coordinates of the calibration points in the world coordinate system and their 2D coordinates in the image coordinate system are determined. Finally, the intrinsic and extrinsic parameters of the camera are calculated using one of several calibration methods, for example, direct nonlinear minimization [1–3], closed-form solutions [4,5], or two-step methods [6–8].

Calibration objects are indispensable in the camera calibration process. There are approximately four types of camera calibration methods, depending on the nature of the calibration object. The first is the self-calibration method [9–11]. A camera can be calibrated directly from an image sequence, despite unknown motion and changes in some of the intrinsic parameters. When a series of images of a fixed scene is taken, the absolute conic is fixed, and once this is determined, the metric geometry can be computed [12]. Since this method does not require pre-determined calibration objects, it can quickly produce camera parameters. However, the results obtained are unreliable [13]. The second method is to employ a plane calibration plate, such as in Zhang’s method [13]. By randomly placing the plane calibration plate in different orientations, the intrinsic parameters of the camera are calculated. Although this method is easily implemented, the calibration results vary with different orientations of the calibration plate. To some extent, the calculated intrinsic and extrinsic camera parameters obtained with this method only converge to local optimal solutions that correspond to specific orientation sequences of the calibration plate. The third method is to adopt a linear translation stage to drive the plane calibration plate for translational movement, such that the world coordinate system is determined by the known movement of the plane calibration plate [14,15]. This method requires extreme precision in the installation orientation of the calibration plate and the positioning accuracy of the translation stage. The fourth method is based on a 3D calibration object [16,17], on which calibration points are printed. In this method, the 3D coordinates of the calibration points can be obtained directly, which is important for accurate camera calibration. However, this method requires high precision in the manufacturing stage of the calibration object [13]. 
Moreover, in multicamera calibration, the limited field of view means that each camera may capture only a limited number of the calibration points on the calibration object. Fewer calibration points are then available for the solution, which reduces the stability of the results.

Multicamera systems have been widely employed in many fields, for example, in the measurement of fluids [14,15,18], 3D reconstruction [19,20], and multicamera tracking [21,22]. A multicamera system can achieve accurate measurements, but its calibration is more complicated. The inward-looking [23] multicamera setup is commonly used in background-oriented schlieren (BOS) [18,24] and flame chemiluminescence tomography (FCT) [16,25] experiments. The cameras are usually distributed over a wide range in order to acquire sufficient data, and there exists a volume that lies in the field of view of all cameras. Under these circumstances, it is impossible for all cameras to see a calibration plate simultaneously, so a unified world coordinate system is difficult to determine. Shen and Hornsey [23] therefore presented a calibration method based on a novel 3D target; however, owing to manufacturing tolerances, a 3D laser scanning system has to be employed to determine the actual target configuration. Reference [16] utilizes a special 3D calibration object whose calibration points are printed on several known planes, such that the world coordinate system is directly determined by the calibration object. However, making an accurate calibration object is difficult and expensive, and the calibration points are limited to a few specific planes. Feng et al. [26] proposed a method using a transparent glass calibration board, so that all cameras can capture the calibration points simultaneously from different positions and orientations. However, refraction in the transparent glass needs to be considered in the calibration process, and the method may fail for cameras that view the calibration plate at a large angle. In [24], a standard 2D checkerboard is employed for the calibration of a 23-camera setup, and the coordinate system of each camera is transformed to that of the first pair of cameras, one after another. This calibration process is complicated, and the error may be amplified by the repeated coordinate transformations. Thus, it is necessary to develop a new multicamera calibration method.

In this paper, we propose a flexible method for multicamera calibration that is easily implemented. Our method does not require a special calibration object, but only a calibration plate. Furthermore, it does not require precision in the movement of the calibration plate. The only requirement is rotation around a fixed axis. Our method also does not require all cameras to be able to see the calibration plate simultaneously, as long as there are overlapping fields of view between adjacent groups. In Section 2, we describe how we calculated the world coordinates of rotating calibration points using multi-view stereo vision. Section 3 presents the world coordinate system and camera coordinate system determined by rotating calibration points. Section 4 shows the experiments and results.

## 2. World coordinates of rotating calibration points

#### 2.1 Camera imaging model

Visual applications usually employ the pinhole model of a camera, in which all light passes through the optical center of the camera. The homogeneous coordinate of a 3D point in the world coordinate system is denoted by ${M_w} = {({x_w},{y_w},{z_w},1)^T}$, and a point in the camera and pixel coordinate systems is denoted as ${M_c} = {({x_c},{y_c},{z_c},1)^T}$ and $m = {(u,v,1)^T}$, respectively, as shown in Fig. 1. Based on the pinhole model, the transformation from a 3D point to its 2D image point is given by:

$$s\,m = {\mathbf K}\,[{\mathbf R}\;\;{\mathbf t}]\,{M_w} = {\mathbf P}\,{M_w},\qquad {\mathbf K} = \begin{bmatrix} {f_u} & 0 & {u_0}\\ 0 & {f_v} & {v_0}\\ 0 & 0 & 1 \end{bmatrix} \tag{1}$$

where ${\mathbf K}$ is the intrinsic matrix, ${\mathbf R}$ and ${\mathbf t}$ are the rotation matrix and translation vector, the focal lengths along *u* and *v* are denoted by ${f_u}$ and ${f_v}$, and *s* is a nonzero scale factor. The principal point is denoted by $({u_0},{v_0})$, and ${\mathbf P}$ denotes the camera projection matrix.
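The projection of Eq. (1) can be illustrated with a few lines of NumPy. This is a minimal sketch: the function name `project_point` and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def project_point(K, R, t, M_w):
    """Project a 3D world point to pixel coordinates via s*m = K [R | t] M_w."""
    P = K @ np.hstack([R, t.reshape(3, 1)])  # 3x4 projection matrix P
    s_m = P @ np.append(M_w, 1.0)            # equals s * (u, v, 1)^T
    return s_m[:2] / s_m[2]                  # divide out the scale factor s

# Hypothetical parameters, chosen only for illustration.
K = np.array([[3200.0, 0.0, 646.0],
              [0.0, 3200.0, 482.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                      # camera axes aligned with the world axes
t = np.array([0.0, 0.0, 800.0])    # world origin 800 mm in front of the camera

uv = project_point(K, R, t, np.array([10.0, -5.0, 0.0]))
print(uv)
```

The scale factor *s* is simply the third component of ${\mathbf P}{M_w}$, which is why it disappears after the division.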

Due to the imperfection of the camera lens, there are two different types of distortion in the camera imaging process, namely the radial and tangential distortions. Under the influence of these distortions, a world point ${M_w}$ is no longer imaged at $(u,v)$, but at $(u^{\prime},v^{\prime})$. They are defined as follows:
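As a minimal sketch, and assuming the widely used Brown-Conrady form (also adopted, for example, by OpenCV) with two radial coefficients `k1`, `k2` and two tangential coefficients `p1`, `p2`, the distorted image point can be computed as follows. The number and names of the coefficients are assumptions made here for illustration, not values taken from this paper.

```python
import numpy as np

def distort(u, v, K, k1, k2, p1, p2):
    """Map an ideal pixel (u, v) to its distorted position (u', v').

    Assumed model: two radial (k1, k2) and two tangential (p1, p2) terms.
    """
    f_u, f_v, u0, v0 = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    x, y = (u - u0) / f_u, (v - v0) / f_v              # normalized coordinates
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 * r2              # radial distortion factor
    dx = 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)    # tangential shift in x
    dy = p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y    # tangential shift in y
    return f_u * (x * radial + dx) + u0, f_v * (y * radial + dy) + v0

# Hypothetical intrinsic matrix and coefficients for illustration.
K = np.array([[1000.0, 0.0, 500.0],
              [0.0, 1000.0, 500.0],
              [0.0, 0.0, 1.0]])
u_d, v_d = distort(600.0, 500.0, K, k1=0.1, k2=0.0, p1=0.0, p2=0.0)
```

With all coefficients set to zero the function returns the input pixel unchanged, and the principal point is never displaced.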

#### 2.2 Multi-camera system

In this paper, the cameras of a multicamera system are divided into groups. The following description will be focused on the inward-looking multicamera system. The camera distribution of an eight-camera system with two groups is shown in Fig. 2. A benchmark camera has been selected within each group.

#### 2.3 World coordinates of rotating calibration points

In this section, the world coordinates of the rotating calibration points are calculated, which establishes a world coordinate system for each group. First, the intrinsic parameters of all cameras are calculated using Zhang’s method [13], which has been widely applied in the field of computer vision. The extrinsic parameters of each camera can then be calculated using epipolar geometry, which describes the geometric relationship between two images of the same scene and depends only on the intrinsic parameters and relative pose of the cameras.

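If the world point *M* is imaged as ${m_1}$ and ${m_2}$ in two adjacent cameras, the two image points satisfy the epipolar constraint:

$$m_2^T\,{\mathbf F}\,{m_1} = 0,\qquad {\mathbf F} = {\mathbf K}_2^{-T}\,[{\mathbf t}]_\times\,{\mathbf R}\,{\mathbf K}_1^{-1}$$

where ${\mathbf F}$ is the fundamental matrix, ${\mathbf K}_1$ and ${\mathbf K}_2$ are the intrinsic matrices of the two cameras, ${\mathbf R}$ and ${\mathbf t}$ describe the pose of the second camera relative to the first, and $[{\mathbf t}]_\times$ is the skew-symmetric cross-product matrix of ${\mathbf t}$.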
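The epipolar relation between two adjacent cameras can be checked numerically. The sketch below assumes the convention $X_2 = {\mathbf R}X_1 + {\mathbf t}$ between the two camera coordinate systems; the helper names `skew` and `fundamental_matrix` and all numbers are hypothetical.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ a equals np.cross(t, a)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def fundamental_matrix(K1, K2, R, t):
    """Fundamental matrix for two cameras related by X2 = R @ X1 + t."""
    return np.linalg.inv(K2).T @ skew(t) @ R @ np.linalg.inv(K1)

# Hypothetical two-camera setup for illustration.
K = np.array([[3200.0, 0.0, 646.0],
              [0.0, 3200.0, 482.0],
              [0.0, 0.0, 1.0]])
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([-400.0, 0.0, 100.0])

X1 = np.array([20.0, -10.0, 800.0])   # point in the first camera's coordinates
X2 = R @ X1 + t                       # same point in the second camera's coordinates
m1 = K @ X1 / X1[2]                   # homogeneous pixel (u, v, 1) in camera 1
m2 = K @ X2 / X2[2]                   # homogeneous pixel (u, v, 1) in camera 2

F = fundamental_matrix(K, K, R, t)
residual = m2 @ F @ m1                # vanishes for a true correspondence
```

In practice the relation is used the other way round: with known intrinsic parameters and enough point correspondences, the relative pose is estimated so that this residual vanishes for all matched points.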

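The calibration plate is then placed in the common field of view and rotated while images of the plate are captured, as shown in Fig. 3. Thus, sequences of images of the rotating calibration points are obtained. Each calibration point is imaged by the four cameras in the group. According to Eq. (1), after eliminating the scale factor, they satisfy the relation:

$$\begin{aligned}
({u^i}p_{31}^i - p_{11}^i)\,{x_w} + ({u^i}p_{32}^i - p_{12}^i)\,{y_w} + ({u^i}p_{33}^i - p_{13}^i)\,{z_w} &= p_{14}^i - {u^i}p_{34}^i\\
({v^i}p_{31}^i - p_{21}^i)\,{x_w} + ({v^i}p_{32}^i - p_{22}^i)\,{y_w} + ({v^i}p_{33}^i - p_{23}^i)\,{z_w} &= p_{24}^i - {v^i}p_{34}^i
\end{aligned}$$

where $({u^i},{v^i})$ is the image point in the *i-th* camera of the group that corresponds to the world point ${({x_w},{y_w},{z_w},1)^T}$, and $p_{kl}^i$ is the element in row *k* and column *l* of the projection matrix of the *i-th* camera, where $i \in \{{1,2,\ldots ,n} \}$. Here, *n* is the number of cameras. Therefore, there are $2n$ equations for each world point, and the world point can be computed by solving this overdetermined linear system in the least-squares sense.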
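The linear system that follows from Eq. (1) can be solved directly with a least-squares routine. The following is a minimal sketch with four hypothetical cameras; the function names `triangulate` and `rot_y` and the camera layout are assumptions made for illustration.

```python
import numpy as np

def triangulate(Ps, uvs):
    """Least-squares world point from n projection matrices and n image points."""
    A, b = [], []
    for P, (u, v) in zip(Ps, uvs):
        # Two equations per camera, obtained by eliminating s from Eq. (1).
        A.append(u * P[2, :3] - P[0, :3])
        b.append(P[0, 3] - u * P[2, 3])
        A.append(v * P[2, :3] - P[1, :3])
        b.append(P[1, 3] - v * P[2, 3])
    X, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return X

def rot_y(deg):
    """Rotation matrix about the y-axis (used only to build the example)."""
    a = np.deg2rad(deg)
    return np.array([[np.cos(a), 0.0, np.sin(a)],
                     [0.0, 1.0, 0.0],
                     [-np.sin(a), 0.0, np.cos(a)]])

# Four hypothetical cameras looking at the origin from different directions.
K = np.array([[3200.0, 0.0, 646.0],
              [0.0, 3200.0, 482.0],
              [0.0, 0.0, 1.0]])
t = np.array([[0.0], [0.0], [800.0]])
Ps = [K @ np.hstack([rot_y(a), t]) for a in (-30.0, -10.0, 10.0, 30.0)]

X_true = np.array([12.0, -7.0, 30.0])
uvs = []
for P in Ps:
    s_m = P @ np.append(X_true, 1.0)
    uvs.append(s_m[:2] / s_m[2])

X_est = triangulate(Ps, uvs)
```

With noise-free image points the estimate reproduces the true point; with real detections the same call returns the least-squares compromise between all $2n$ equations.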

## 3. World coordinate system and camera coordinate system

#### 3.1 World coordinate system

It should be noted that the world coordinate points calculated in Section 2 are expressed in the camera coordinate system of the benchmark camera. A new world coordinate system needs to be created for the group.

After the calibration plate is rotated, the trajectory of each corresponding point can be fitted to a circle, each of which has a center. The coordinates of these circle centers can be used to determine the equation of the axis of rotation. The origin of the new world coordinate system is determined from the average of the coordinates of all circle centers, and the *z-axis* is the axis of rotation, as shown in Fig. 3. A 3D straight line can be denoted by:

$$x = mz + {x_0},\qquad y = nz + {y_0}$$

where ${({x_0},{y_0},0)^T}$ is the point at which the line crosses the plane $z = 0$.
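One possible way to carry out the circle and axis fitting is sketched below. The paper does not state which fitting algorithm is used, so the plane fit by SVD followed by an algebraic (Kasa) circle fit is an assumption, as are the function names `fit_circle_3d` and `fit_axis` and the synthetic data.

```python
import numpy as np

def fit_circle_3d(pts):
    """Fit a circle to 3D points: returns (center, radius, unit plane normal)."""
    c0 = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - c0)         # best-fit plane through c0
    e1, e2, normal = vt[0], vt[1], vt[2]
    xy = np.column_stack([(pts - c0) @ e1, (pts - c0) @ e2])
    # Algebraic fit of x^2 + y^2 = 2*a*x + 2*b*y + c in the plane
    A = np.column_stack([2.0 * xy, np.ones(len(xy))])
    (a, b, c), *_ = np.linalg.lstsq(A, (xy ** 2).sum(axis=1), rcond=None)
    radius = np.sqrt(c + a * a + b * b)
    return c0 + a * e1 + b * e2, radius, normal

def fit_axis(centers):
    """Line through the circle centers: returns (origin, direction (m, n, 1))."""
    origin = centers.mean(axis=0)
    _, _, vt = np.linalg.svd(centers - origin)
    d = vt[0]
    return origin, d / d[2]                    # assumes the axis is not parallel to z = 0

# Synthetic quarter-circle trajectories around a known axis (illustration only).
d_true = np.array([0.02, -0.01, 1.0])
d_unit = d_true / np.linalg.norm(d_true)
u_vec = np.cross(d_unit, [1.0, 0.0, 0.0])
u_vec /= np.linalg.norm(u_vec)
w_vec = np.cross(d_unit, u_vec)
p0 = np.array([5.0, 3.0, 100.0])
angles = np.linspace(0.0, np.pi / 2, 14)

centers, radii = [], []
for h, r in [(-30.0, 20.0), (0.0, 35.0), (30.0, 50.0)]:
    pts = p0 + h * d_unit + r * (np.outer(np.cos(angles), u_vec)
                                 + np.outer(np.sin(angles), w_vec))
    center, radius, _ = fit_circle_3d(pts)
    centers.append(center)
    radii.append(radius)

origin, direction = fit_axis(np.array(centers))
```

The returned `origin` is the average of the circle centers, and `direction` is scaled so that its last component is 1, matching the direction vector ${(m,n,1)^T}$ of the rotation axis.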

The direction vector of the axis of rotation, as computed from the circle centers, is ${(m,n,1)^T}$, which gives the direction vector of the new *z-axis*. The new *x-axis* (or *y-axis*) is chosen as a unit vector perpendicular to the *z-axis*, and the remaining axis is obtained from the cross product of the first two axes. For example:

$${e_z} = \frac{{(m,n,1)^T}}{\sqrt{m^2 + n^2 + 1}},\qquad {e_x} = \frac{{(1,0,-m)^T}}{\sqrt{1 + m^2}},\qquad {e_y} = {e_z} \times {e_x}$$
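A sketch of how the new axes, and the homogeneous change of coordinates that follows from them, could be assembled is given below. The particular choice of the *x-axis*, $(1,0,-m)^T$ normalized, is only one valid option, and the function name `new_world_transform` and the numbers are hypothetical.

```python
import numpy as np

def new_world_transform(m, n, origin):
    """4x4 matrix mapping old world coordinates to the new world system.

    The z-axis follows the rotation axis (m, n, 1); the x-axis is one
    possible unit vector perpendicular to it (an arbitrary choice).
    """
    e_z = np.array([m, n, 1.0])
    e_z /= np.linalg.norm(e_z)
    e_x = np.array([1.0, 0.0, -m])            # (1, 0, -m) . (m, n, 1) = 0
    e_x /= np.linalg.norm(e_x)
    e_y = np.cross(e_z, e_x)                  # completes a right-handed frame
    R_w = np.vstack([e_x, e_y, e_z])          # rows: new axes in old coordinates
    T = np.eye(4)
    T[:3, :3] = R_w
    T[:3, 3] = -R_w @ origin                  # new origin maps to (0, 0, 0)
    return T

# Hypothetical axis parameters and origin for illustration.
m, n = 0.02, -0.01
origin = np.array([5.0, 3.0, 100.0])
T = new_world_transform(m, n, origin)

axis_dir = np.array([m, n, 1.0]) / np.linalg.norm([m, n, 1.0])
M_old = np.append(origin + 5.0 * axis_dir, 1.0)   # point 5 mm along the axis
M_new = T @ M_old
```

A point lying on the rotation axis 5 mm from the new origin is mapped to $(0, 0, 5)$ in the new system, which is a quick sanity check of the construction.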

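Therefore, the transformation from the old world coordinate system to the new one is given by:

$${M_{new}} = \begin{bmatrix} {\mathbf R}_w & -{\mathbf R}_w O\\ {\mathbf 0}^T & 1 \end{bmatrix} {M_{old}}$$

where ${M_{old}}$ is a point in the old world coordinate system, ${M_{new}}$ is the same point in the new world coordinate system, the rows of ${\mathbf R}_w$ are the unit vectors of the new *x-axis*, *y-axis*, and *z-axis* expressed in the old system, and $O$ is the new origin, i.e., the average of the circle centers.

#### 3.2 Camera coordinate system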

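For each camera there are world coordinate points and their corresponding image points. Using Eq. (1), the projection matrix of each camera can be calculated. If there are *n* corresponding points, Eq. (1) can be written as:

$$\begin{bmatrix} M_{w,1}^T & {\mathbf 0}^T & -{u_1}M_{w,1}^T\\ {\mathbf 0}^T & M_{w,1}^T & -{v_1}M_{w,1}^T\\ \vdots & \vdots & \vdots \\ M_{w,n}^T & {\mathbf 0}^T & -{u_n}M_{w,n}^T\\ {\mathbf 0}^T & M_{w,n}^T & -{v_n}M_{w,n}^T \end{bmatrix} p = {\mathbf 0}$$

where $p$ is the 12-element vector formed by stacking the rows of ${\mathbf P}$, and $({u_k},{v_k})$ is the image point of the world point $M_{w,k}$.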
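The projection matrix can be estimated from such correspondences with the standard direct linear transformation. The sketch below is an assumption about how this step might be implemented, since the paper does not spell out the solver; the function name `estimate_projection_matrix` and the synthetic data are hypothetical.

```python
import numpy as np

def estimate_projection_matrix(world_pts, image_pts):
    """Estimate the 3x4 projection matrix (up to scale) from n >= 6 points."""
    rows = []
    for X, (u, v) in zip(world_pts, image_pts):
        M = np.append(X, 1.0)
        rows.append(np.concatenate([M, np.zeros(4), -u * M]))
        rows.append(np.concatenate([np.zeros(4), M, -v * M]))
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.array(rows))
    return vt[-1].reshape(3, 4)

# Synthetic, noise-free data for illustration only.
K = np.array([[3200.0, 0.0, 646.0],
              [0.0, 3200.0, 482.0],
              [0.0, 0.0, 1.0]])
a = np.deg2rad(20.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([10.0, -20.0, 800.0])
P_true = K @ np.hstack([R, t.reshape(3, 1)])

rng = np.random.default_rng(0)
world_pts = rng.uniform(-50.0, 50.0, size=(20, 3))    # non-coplanar points
homog = np.column_stack([world_pts, np.ones(len(world_pts))])
proj = homog @ P_true.T
image_pts = proj[:, :2] / proj[:, 2:3]

P_est = estimate_projection_matrix(world_pts, image_pts)
reproj = homog @ P_est.T
reproj = reproj[:, :2] / reproj[:, 2:3]
```

The world points must not all lie in one plane, otherwise the linear system is rank deficient; the points collected from a rotating plate satisfy this naturally.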

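In the pinhole camera model, when a point on the image plane is selected, the camera projection matrix can be used to calculate the projection line of that point in the world coordinate system. The projection lines of different image points intersect at the optical center, as shown in Fig. 4. After eliminating the scale factor, Equation (1) can be written as:

$$(u\,p_3^T - p_1^T)\,{M_w} = 0,\qquad (v\,p_3^T - p_2^T)\,{M_w} = 0$$

where $p_k^T$ is the *k-th* row of ${\mathbf P}$. Each equation defines a plane, and the intersection of the two planes is the projection line of the image point $(u,v)$. The *z-axis* of the camera coordinate system can be calculated from the projection line of the principal point $({u_0},{v_0})$.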

The *x-axis* is perpendicular to the *z-axis* and lies in the plane determined by the projection lines that pass through $({u_0},{v_0})$ and $({u_0} + \Delta ,{v_0})$, since the *x-axis* is parallel to the *u-axis*. The *y-axis* is obtained from the cross product of the *x-axis* and *z-axis*. It should be emphasized that the directions of the *x-axis* and *y-axis* are consistent with the *u-axis* and *v-axis* of the image plane, respectively.
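The construction of the camera coordinate system from projection lines can be sketched as follows. The function names are hypothetical, the principal point is assumed to be known, and the cross product is taken as $z \times x$ so that the frame is right-handed and the *y-axis* follows the *v-axis*.

```python
import numpy as np

def camera_center(P):
    """Optical center: the point C with P @ (C, 1) = 0 (null space of P)."""
    _, _, vt = np.linalg.svd(P)
    return vt[-1, :3] / vt[-1, 3]

def projection_line_direction(P, u, v):
    """Unit direction of the projection line of pixel (u, v), pointing into the scene."""
    Q = P[:, :3]
    d = np.linalg.solve(Q, np.array([u, v, 1.0]))
    d *= np.sign(np.linalg.det(Q))     # P is defined only up to a (possibly negative) scale
    return d / np.linalg.norm(d)

def camera_axes(P, u0, v0, delta=1.0):
    """Camera x-, y-, z-axes expressed in world coordinates."""
    z = projection_line_direction(P, u0, v0)
    r = projection_line_direction(P, u0 + delta, v0)
    x = r - (r @ z) * z                # component of r perpendicular to z
    x /= np.linalg.norm(x)
    y = np.cross(z, x)                 # right-handed frame, y along the v-axis
    return x, y, z

# Hypothetical camera for illustration.
K = np.array([[3200.0, 0.0, 646.0],
              [0.0, 3200.0, 482.0],
              [0.0, 0.0, 1.0]])
a = np.deg2rad(20.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([10.0, -20.0, 800.0])
P = K @ np.hstack([R, t.reshape(3, 1)])

C = camera_center(P)
x_axis, y_axis, z_axis = camera_axes(P, K[0, 2], K[1, 2])
```

For a projection matrix of the form ${\mathbf K}[{\mathbf R}\;{\mathbf t}]$, the recovered axes coincide with the rows of ${\mathbf R}$ and the center with $-{\mathbf R}^T{\mathbf t}$, which provides a direct consistency check.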

#### 3.3 Unified coordinate system

If we apply the calibration procedure described above to a multicamera system with multiple groups, there are several sets of world coordinate points and world coordinate systems. The eight-camera system with two groups is shown in Fig. 2. These world coordinate systems need to be mapped into one unified world coordinate system. The world coordinate system of any group can be transformed into that of a selected group. In this paper, we choose group B as the benchmark and transform the coordinates of group A into group B.

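Among all these rotating calibration points, some can be seen by the cameras of both groups A and B simultaneously. With these overlapping points, the transformation between adjacent groups can be calculated. The transformation from group A to group B is given by the following:

$${M_B} = \begin{bmatrix} {\mathbf R}_{AB} & {\mathbf t}_{AB}\\ {\mathbf 0}^T & 1 \end{bmatrix} {M_A}$$

where ${M_A}$ and ${M_B}$ are the homogeneous coordinates of the same calibration point in the world coordinate systems of groups A and B, and ${\mathbf R}_{AB}$ and ${\mathbf t}_{AB}$ are the rotation matrix and translation vector between the two systems.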
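One common way to compute such a rigid transformation from overlapping points is the SVD-based Kabsch method, sketched below. The paper does not name the algorithm it uses, so this choice is an assumption, and the function name `rigid_transform` and the synthetic points are hypothetical.

```python
import numpy as np

def rigid_transform(A, B):
    """Least-squares rotation R and translation t with B ~ R @ A + t (Kabsch)."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)                  # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, cb - R @ ca

# Synthetic overlapping points seen by both groups (illustration only).
a = np.deg2rad(40.0)
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a), np.cos(a), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([15.0, -8.0, 3.0])
rng = np.random.default_rng(1)
pts_A = rng.uniform(-40.0, 40.0, size=(30, 3))   # coordinates in group A's system
pts_B = pts_A @ R_true.T + t_true                # same points in group B's system

R_AB, t_AB = rigid_transform(pts_A, pts_B)
```

At least three non-collinear overlapping points are needed; using all points visible to both groups averages out the triangulation noise.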

#### 3.4 Bundle adjustment

An initial estimate of the camera parameters is obtained using the aforementioned calibration procedure. Owing to the influence of noise, cumulative errors, and solution inaccuracy, these results need to be further optimized for improved accuracy, which is done with the widely used bundle adjustment. The bundle adjustment minimizes the sum of the squared reprojection errors of all points using a nonlinear least-squares algorithm. Levenberg-Marquardt (LM) is the most commonly used algorithm, because it combines the gradient descent and Gauss-Newton methods and uses an effective damping strategy, so that rapid convergence can be obtained for a wide range of initial estimates. The cost function for the bundle adjustment is given by:

$$\min_{d,\,{\mathbf K},\,{\mathbf R},\,{\mathbf t}}\ \sum_{i=1}^{N} {\left\| {m_i} - {\tilde{m}_i} \right\|^2} \tag{15}$$

where *N* is the number of image points of all cameras, and ${m_i}$ and ${\tilde{m}_i}$ are the actual and predicted image points, respectively. Equation (15) is solved using the bundle adjustment based on the LM algorithm, which yields the optimal parameters of all cameras. These camera parameters include the distortion coefficients *d*, intrinsic matrix ${\mathbf K}$, rotation matrix ${\mathbf R}$, and translation vector ${\mathbf t}$.
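The idea of minimizing the reprojection error with a damped Gauss-Newton iteration can be illustrated on a deliberately small problem. The sketch below is a toy example under strong assumptions: a single distortion-free camera with ${\mathbf R}$ fixed to the identity, only the focal length and translation refined, and a forward-difference Jacobian. It is not the full bundle adjustment over all cameras, and all names and values are hypothetical.

```python
import numpy as np

def predict(params, world_pts, u0, v0):
    """Predicted pixels for a distortion-free camera with R = I (toy model)."""
    f, tx, ty, tz = params
    Xc = world_pts + np.array([tx, ty, tz])
    return np.column_stack([f * Xc[:, 0] / Xc[:, 2] + u0,
                            f * Xc[:, 1] / Xc[:, 2] + v0])

def levenberg_marquardt(residual_fn, x0, n_iter=50, lam=1e-3):
    """Minimal LM loop with a forward-difference Jacobian."""
    x = np.asarray(x0, dtype=float)
    r = residual_fn(x)
    for _ in range(n_iter):
        J = np.empty((r.size, x.size))
        for k in range(x.size):
            step = 1e-6 * max(1.0, abs(x[k]))
            x_step = x.copy()
            x_step[k] += step
            J[:, k] = (residual_fn(x_step) - r) / step
        H = J.T @ J
        # Damped normal equations: small lam ~ Gauss-Newton, large lam ~ gradient descent.
        dx = np.linalg.solve(H + lam * np.diag(np.diag(H)), -J.T @ r)
        r_new = residual_fn(x + dx)
        if r_new @ r_new < r @ r:      # accept the step and reduce the damping
            x, r, lam = x + dx, r_new, lam * 0.1
        else:                          # reject the step and increase the damping
            lam *= 10.0
    return x

# Synthetic, noise-free observations for illustration.
world_pts = np.array([[x, y, z] for x in (-60.0, 0.0, 60.0)
                      for y in (-40.0, 0.0, 40.0)
                      for z in (-200.0, 0.0, 200.0)])
u0, v0 = 646.0, 482.0
true_params = np.array([3200.0, 0.0, 0.0, 800.0])     # f, tx, ty, tz
image_pts = predict(true_params, world_pts, u0, v0)

def residual_fn(p):
    return (image_pts - predict(p, world_pts, u0, v0)).ravel()

refined = levenberg_marquardt(residual_fn, [3000.0, 5.0, -5.0, 820.0])
```

In a real calibration the parameter vector would contain the distortion coefficients, intrinsic parameters, and poses of all cameras, and a library solver with analytic or sparse Jacobians would normally replace this hand-written loop.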

## 4. Experiments and results

The proposed method was verified experimentally. The camera we used is AVT Guppy PRO F125B with a fixed focal length lens of 12 mm. The image resolution of the given camera is 1292×964 pixels. The calibration plate we used is a planar chessboard pattern with 8×8 evenly distributed corner points, and the distance between the adjacent points is 10 mm in both horizontal and vertical directions. The calibration plate and rotation device are shown in Fig. 5.

#### 4.1 Accuracy of 3D rotating points

The world coordinate system determined by the rotating calibration plate and the world coordinates of the calibration points in a group are shown in Fig. 6. Because of the limited field of view, this set of calibration points covers a rotation angle of only approximately a quarter of a circle, which is consistent with the fact that the cameras of group A can see only approximately a quarter of a circle when the calibration plate rotates in the actual experiment.

Ideally, the 3D point sequence obtained by rotating a calibration point should lie in a plane. However, because of errors, these points are distributed around a plane. The distances between the 3D points and the corresponding fitted plane are calculated, as shown in Fig. 7. Fourteen images of the calibration plate were captured, with 64 calibration points on each plate, so 896 distances are calculated. The maximum distance is 0.0162 mm, and the mean is 0.004 mm. Therefore, these corresponding points can be considered to lie approximately on the same plane, with small errors. In addition, the distances between the 3D calibration points on a calibration plate are known a priori, so the errors of the distances between the first point and all other points on each calibration plate can be calculated. These errors are shown in Fig. 8. The maximum error is 0.0554 mm, and the mean is 0.0112 mm. Figures 7 and 8 show that the error of the 3D points is relatively small.

Further, the accuracy of the 3D calibration points obtained from the rotating calibration plate and multi-view stereo vision is assessed using the fitted rotation trajectories shown in Fig. 6. The relocation error is defined as the difference between the distance from a 3D point to the corresponding circle center and the radius of the circle. The average relocation error of the *j-th* fitted circle is given by:

$${e_j} = \frac{1}{{N_j}}\sum_{i=1}^{{N_j}} {\Big|\, {\left\| {M_{ji}} - {O_j} \right\|} - {r_j} \Big|}$$

where ${N_j}$ is the number of 3D points on the *j-th* fitted circle, ${M_{ji}}$ is the *i-th* 3D point of the *j-th* fitted circle, ${O_j}$ is the circle center of the *j-th* fitted circle, and ${r_j}$ is its radius.
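The average relocation error of one fitted circle, taken here as the mean absolute difference between the point-to-center distance and the radius, can be computed in a few lines. The use of the absolute value is an assumption about the definition, and the function name and synthetic points are hypothetical.

```python
import numpy as np

def average_relocation_error(points, center, radius):
    """Mean absolute difference between point-to-center distances and the radius."""
    dist = np.linalg.norm(points - center, axis=1)
    return np.mean(np.abs(dist - radius))

# Synthetic quarter-circle trajectory with alternating radial offsets of 0.02 mm.
angles = np.linspace(0.0, np.pi / 2, 14)
offsets = 0.02 * (-1.0) ** np.arange(14)
points = np.column_stack([(50.0 + offsets) * np.cos(angles),
                          (50.0 + offsets) * np.sin(angles),
                          np.zeros(14)])

err = average_relocation_error(points, center=np.zeros(3), radius=50.0)
```

For this synthetic trajectory every point is displaced radially by 0.02 mm, so the average relocation error equals that offset.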

The average relocation error between each calibration point and its corresponding circular trajectory is calculated for an object distance of ∼800 mm. As shown in Fig. 9, the maximum error is 0.0439 mm, and the average is 0.0175 mm. There are many possible reasons for the error, for example, if the axis of rotation is curved, or if the rotation of the calibration plate is not around a single axis, but a cluster of axes. However, because the calibration plate is very light, the error can be ignored.

#### 4.2 Four-camera experiment

The four-camera system is shown in Fig. 10. The reprojection errors of all points of the four-camera system are shown in Fig. 11. In this system, the reprojection error is less than 0.2 pixel, and for most of the points it is less than 0.15 pixel. The mean and standard deviation of the reprojection error along the *u-axis* are 6.76×10^{−4} and 0.032 pixels, respectively, and the mean and standard deviation of the reprojection error along the *v-axis* are −6.56×10^{−4} and 0.031 pixels, respectively.
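Statistics of this kind can be obtained from the detected and reprojected image points as sketched below. The function name `reprojection_statistics` and the sample values are hypothetical, and the population standard deviation is assumed.

```python
import numpy as np

def reprojection_statistics(observed, predicted):
    """Mean and standard deviation of the reprojection error along u and v."""
    err = observed - predicted
    return {
        "mean_u": err[:, 0].mean(),
        "std_u": err[:, 0].std(),
        "mean_v": err[:, 1].mean(),
        "std_v": err[:, 1].std(),
        "max_norm": np.linalg.norm(err, axis=1).max(),
    }

# Hypothetical detected and reprojected points (in pixels) for illustration.
observed = np.array([[100.1, 200.0], [300.0, 400.2],
                     [499.9, 600.0], [700.0, 799.8]])
predicted = np.array([[100.0, 200.0], [300.0, 400.0],
                      [500.0, 600.0], [700.0, 800.0]])

stats = reprojection_statistics(observed, predicted)
```

A mean close to zero along both axes indicates that the residuals are not biased in any direction, while the standard deviation characterizes their spread.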

#### 4.3 Eight-camera experiment

The eight-camera system is shown in Fig. 12. The unified world coordinate points are shown in Fig. 13. As can be seen in Fig. 14, the calibration points that can be seen simultaneously by group A and group B, shown as red and green points, overlap with each other.

The reprojection errors of all points of the eight-camera system are shown in Fig. 15. In the eight-camera system, the reprojection error is less than 0.3 pixel, and for most of the points it is less than 0.2 pixel. The mean and standard deviation of the reprojection error along the *u-axis* are −0.001 and 0.037 pixels, respectively, and the mean and standard deviation of the reprojection error along the *v-axis* are 5.95×10^{−4} and 0.033 pixels, respectively.

#### 4.4 Twelve-camera experiment

Additionally, the proposed calibration method was tested in a twelve-camera system, as shown in Fig. 16. There are common fields of view between adjacent groups. The twelve cameras were divided into three groups. The intrinsic parameters and distortion coefficients of the twelve cameras are listed in Table 1, and the Euler angles and camera positions are listed in Table 2.

Figure 17 shows the calculated world coordinate system and the orientations of twelve cameras. As can be seen from Fig. 17, the calculated camera orientations and positions are consistent with the actual distribution in Fig. 16.

The reprojection errors of all points of the twelve cameras are shown in Fig. 18. The reprojection error is less than 0.4 pixel, and for most of the points it is less than 0.2 pixel. The mean and standard deviation of the reprojection error along the *u-axis* are 4.33×10^{−4} and 0.051 pixels, respectively, and the mean and standard deviation of the reprojection error along the *v-axis* are 0.001 and 0.044 pixels, respectively. Therefore, the accuracy and reliability of the proposed method have been experimentally verified for all three multicamera systems.

## 5. Summary

In this paper, we propose a convenient method for multicamera calibration. The calibration plate is mounted on a rotation device, and the cameras capture images of the planar calibration plate at different rotation positions. The world coordinates of the rotating calibration points are then calculated based on multi-view stereo vision. Finally, the positions and orientations of the cameras are calculated based on the pinhole model. This method does not require a special 3D calibration object, only a 2D calibration board, and precise movement of the calibration plate is not needed: the plate must be rotated around a fixed axis, but the rotation interval angle can be unknown. The method can therefore be implemented relatively easily. The calibration points used are no longer limited to a few specific planes, which can increase the stability of the solution and reduce the influence of noise. Experiments have verified the reliability and accuracy of the proposed method. The relocation accuracy of the method with regard to the rotating calibration points is within 0.1 mm at an object distance of ∼800 mm, and the reprojection error is less than 0.5 pixel.

## Funding

National Natural Science Foundation of China (61701239); Natural Science Foundation of Jiangsu Province (BK20170852).

## Disclosures

The authors declare no conflicts of interest.

## References

**1. **D. C. Brown, “Decentering distortion of lenses,” Photometric Eng. **32**(3), 444–462 (1966).

**2. **W. Faig, “Calibration of close-range photogrammetry systems: Mathematical formulation,” Photogramm. Eng. Remote Sens. **41**(12), 1479–1486 (1975).

**3. **A. Izaguirre, P. Pu, and J. Summers, “A new development in camera calibration calibrating a pair of mobile cameras,” in Proceedings. 1985 IEEE International Conference on Robotics and Automation (1985), pp. 74–79.

**4. **S. Ganapathy, “Decomposition of transformation matrices for robot vision,” Pattern Recognit. Lett. **2**(6), 401–412 (1984).

**5. **Y. I. Abdel-Aziz and H. M. Karara, “Direct linear transformation from comparator coordinates into object space coordinates in close-range photogrammetry,” Photogramm. Eng. Remote Sens. **81**(2), 103–107 (2015).

**6. **R. Y. Tsai, “A versatile camera calibration technique for high accuracy 3-D machine vision metrology using off-the-shelf tv cameras and lenses,” IEEE J. Robot. Automat. **3**(4), 323–344 (1987).

**7. **R. K. Lenz and R. Y. Tsai, “Techniques for calibration of the scale factor and image center for high accuracy 3-D machine vision metrology,” in Proceedings. 1987 IEEE International Conference on Robotics and Automation (1987), pp. 68–75.

**8. **J. Weng, P. Cohen, and M. Herniou, “Camera calibration with distortion models and accuracy evaluation,” IEEE Trans. Pattern Anal. Machine Intell. **14**(10), 965–980 (1992).

**9. **S. J. Maybank and O. D. Faugeras, “A theory of self-calibration of a moving camera,” Int. J. Comput. Vis. **8**(2), 123–151 (1992).

**10. **R. I. Hartley, “An algorithm for self calibration from several views,” in IEEE Conference on Computer Vision and Pattern Recognition (1994), pp. 908–912.

**11. **Q. T. Luong and O. D. Faugeras, “Self-calibration of a moving camera from point correspondences and fundamental matrices,” Int. J. Comput. Vis. **22**(3), 261–289 (1997).

**12. **R. Hartley and A. Zisserman, *Multiple View Geometry in Computer Vision*, 2nd ed. (Cambridge University Press, 2006).

**13. **Z. Y. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. Pattern Anal. Machine Intell. **22**(11), 1330–1334 (2000).

**14. **B. Wieneke, “Volume self-calibration for 3D particle image velocimetry,” Exp. Fluids **45**(4), 549–556 (2008).

**15. **F. Scarano, “Tomographic PIV: principles and practice,” Meas. Sci. Technol. **24**(1), 012001 (2013).

**16. **J. Wang, Y. Song, and Z. H. Li, “Multi-directional 3D flame chemiluminescence tomography based on lens imaging,” Opt. Lett. **40**(7), 1231–1234 (2015).

**17. **H. Y. Xiong, Y. Xu, and Z. J. Zhong, “Accurate extrinsic calibration method of line structured-light sensor based on standard ball,” in 2009 IEEE International Workshop on Imaging Systems and Techniques (2009), pp. 193–197.

**18. **F. Nicolas, V. Todoroff, and A. Plyer, “A direct approach for instantaneous 3D density field reconstruction from background-oriented schlieren (BOS) measurements,” Exp. Fluids **57**(1), 13 (2016).

**19. **F. Pedersini, A. Sarti, and S. Tubaro, “Multi-camera acquisitions for high-accuracy 3D reconstruction,” in European Workshop on 3d Structure from Multiple Images of Large-scale Environments (2000), pp. 124–138.

**20. **V. Kolmogorov and R. Zabih, “Multi-camera scene reconstruction via graph cuts,” in Proceedings of the 7th European Conference on Computer Vision (2002), pp. 82–96.

**21. **T. Zhao, M. Aggarwal, and R. Kumar, “Real-time wide area multi-camera stereo tracking,” in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) (2005), pp. 976–983.

**22. **K. Kim and L. S. Davis, “Multi-camera tracking and segmentation of occluded people on ground plane using search-guided particle filtering,” in Proceedings of the 9th European conference on Computer Vision (2006), pp. 98–109.

**23. **E. Shen and R. Hornsey, “Multi-Camera network calibration with a non-planar target,” IEEE Sens. J. **11**(10), 2356–2364 (2011).

**24. **S. Grauer, A. Unterberger, A. M. Kempf, and K. Mohri, “Instantaneous 3D flame imaging by background-oriented schlieren tomography,” Combust. Flame **196**, 284–299 (2018).

**25. **Y. Jin, Y. Song, X. J. Qu, and Z. H. Li, “Hybrid algorithm for three-dimensional flame chemiluminescence tomography based on imaging overexposure compensation,” Appl. Opt. **55**(22), 5917–5923 (2016).

**26. **M. C. Feng, S. Huang, and J. S. Wang, “Accurate calibration of a multi-camera system based on flat refractive geometry,” Appl. Opt. **56**(35), 9724–9734 (2017).