On-line three-dimensional coordinate measurement of dynamic binocular stereo vision based on rotating camera in large FOV

Open Access

Abstract

A dynamic binocular stereo vision (DBSV) system based on non-zoom rotating cameras in a large field of view (FOV) is established herein. A novel two-point method is proposed to estimate the initial parameters of each camera quickly. The intrinsic parameters and roll angle of each camera remain constant, and the pitch and yaw angles after the camera rotates are directly estimated from the output of a high-precision two-axes platform, which makes it possible to measure three-dimensional (3D) coordinates online after rotation. Moreover, a target matching algorithm based on moving DLT is proposed to achieve automatic alignment of the cameras. The accuracy of 3D coordinate measurement is evaluated on various synthetic and real data, and the DBSV is suitable for occasions where extremely high accuracy is not required in a large FOV.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Nowadays, rotating cameras that can rotate horizontally and vertically are widely used in video surveillance [1,2] and in target recognition and tracking [3,4]. Dynamic binocular stereo vision (DBSV) based on rotating cameras can be applied to three-dimensional (3D) coordinate measurement in a large field of view (FOV), which greatly expands the visual measurement range compared with static binocular stereo vision. A key problem of on-line 3D coordinate measurement is how to calibrate the intrinsic and extrinsic parameters of the cameras quickly. The intrinsic parameters of a non-zoom camera are constant, but the extrinsic parameters (i.e., the yaw angle in the horizontal direction and the pitch angle in the vertical direction) change during the rotation.

The intrinsic parameters of the rotating camera can be calibrated offline in advance by using high-precision traditional calibration methods. The traditional calibration methods mainly include the direct linear transformation (DLT) method, Tsai’s two-step method, Weng’s iterative method, and Zhang’s method. The DLT method [5] ignores the nonlinear distortion of the lens and calculates the intrinsic and extrinsic parameters of the camera linearly by directly using the imaging model of the camera. Tsai’s two-step method [6] first calculates the camera parameters by using the DLT method, then considers the radial distortion coefficients and further improves the calibration accuracy by a nonlinear optimization method. Weng’s iterative method [7] adds the radial and tangential distortion coefficients of the lens to the camera’s perspective imaging model on the basis of Tsai’s two-step method, so that the camera model can adapt to occasions with a large FOV and large distortion. Zhang’s method [8] is the checkerboard calibration method proposed by Zhengyou Zhang, which requires the camera to acquire at least three checkerboard images with different orientations and estimates the intrinsic parameters of the camera by using the homography matrices between the images and the checkerboard. The traditional calibration methods achieve high accuracy, but a reference object with known two-dimensional (2D) or 3D information is required. There are also some active vision calibration methods [9,10] that can calibrate the intrinsic parameters of the camera by using the motion parameters of the camera and images acquired by the camera at different positions. For instance, Gao Yang [11] presented a novel camera calibration technique in the field of large-scale vision measurement, which employed a precise two-axis rotary table and a single stationary optical reference point. The camera was calibrated by moving it to different angular positions of the rotary table while taking photos of the stationary optical point in front of it.

Since the intrinsic parameters are constant, the calibration of the extrinsic parameters of the rotating camera is the key factor determining the accuracy of 3D coordinate measurement of the DBSV. In general, high-precision extrinsic calibration methods rely on significant features. Truong A.M. [12] proposed a novel accurate and fully automatic extrinsic calibration framework for camera networks with partially overlapping views, which considered the pedestrians in the observed scene as the calibration objects and analyzed the pedestrian tracks to obtain extrinsic parameters. Van Crombrugge I. et al. [13] proposed an extrinsic calibration method using Gray code, which was projected on a plane using a standard projector. The extrinsic parameters were calibrated by a bundle adjustment method that optimized the poses of the plane and the cameras with respect to the projector. Jia C.C. [14] proposed an extrinsic calibration method for a multi-RGB-D-camera system by using a tower calibration pattern with circular markers in a limited FOV. Yan F. [15] proposed a high-accuracy camera calibration method using a high-accuracy small light-spot target, which could overcome the influence of image blur and noise and was not limited by the depth of field or target size. However, the above methods will not work if the significant features disappear from the scene. Therefore, some methods utilize high-precision angle measuring instruments to estimate extrinsic parameters. For example, Feng W.W. [16] proposed a method using an inertial measurement unit (IMU) to calibrate stereo vision systems for large-scale measurements, which determined the relative rotation between the two cameras through a coordinate transform after aligning the IMU sensor with the camera.

For the calibration of the rotating camera, researchers have also done much related work. The intrinsic and extrinsic parameters of a rotating camera are commonly estimated from images acquired at different rotation positions. Kim M. [17] suggested a robust camera calibration method using the rotation information of a pan-tilt-zoom (PTZ) camera, which calculated multiple homographies between the image at the reference position and the images at differently rotated positions. The camera matrix was calculated from these homographies and rotation matrices in a single linear equation. Gudys A. [18] introduced a method of camera calibration and navigation for pan-tilt (PT) cameras based on continuous tracking, which allowed the camera pose to be calculated recursively in real time on the basis of the current and previous camera images and the previous pose. To improve efficiency, researchers have tried to reduce the number of images required for calibration. Canlin Li [19] proposed a novel stratified self-calibration method based on rotation movement, which calibrated the intrinsic parameters from more than three images of the same scene captured by panning and rotating the camera with small relative rotation angles among the captured images, under the assumption of constant intrinsic parameters. Chaoning Zhang [20] proposed a deep-learning-based approach to automatically estimate the focal length and distortion parameters of a PTZ camera as well as the rotation angles from an image pair. The methods using image pairs are not only computationally intensive, but also not suitable for occasions where few feature points are shared between the views before and after rotation. To reduce the feature points required for calibration, Mei Wensheng [21] presented an imaging geometry model of a rotating panoramic camera, and calibrated the coordinates of the rotating center, the initial rotating azimuth, and the exterior orientation elements of each image using more than three control points. Yunting Li [22] calculated the extrinsic parameters of a PT camera linearly by using a single control point, and Y. Wang [23] estimated the extrinsic parameters of a rotating camera iteratively by using a pair of point correspondences before and after rotation. Some other methods utilize rotation information to estimate the intrinsic and extrinsic parameters of the rotating camera. Yang Wenguang [24] constructed a new camera calibration model by installing a telephoto camera on an accurate 2D rotating platform, which calculated the motion matrix of the camera from the readings of the rotating platform. Bruckner M. [25] presented a method for active self-calibration of multi-camera systems consisting of PTZ cameras, which optimized the relative poses by actively rotating and zooming each camera pair, and exploited the rotation knowledge provided by the camera’s PT unit to robustly estimate the camera intrinsic parameters for different zoom steps as well as the rotation between the PT unit and the camera.

In this paper, a DBSV based on non-zoom rotating cameras in a large FOV is established. To overcome the inconvenience of manual operation in a large FOV, we present a two-point method to calibrate the initial parameters of each camera offline in advance by using two control points with known 3D information. The lens distortion is ignored and the image center is taken as the principal point of the camera, under the condition that extremely high accuracy of 3D coordinate measurement is not required. The intrinsic parameters and the roll angle of each rotating camera remain constant, and only the pitch angle and yaw angle change after the camera is rotated. Since each camera is mounted on a high-precision two-axes platform, we directly estimate the pitch angle and yaw angle from the output of the two-axes platform, which realizes the on-line calibration of the extrinsic parameters of the camera after rotation. In addition, the DBSV established in this paper has the ability of automatic alignment. Automatic alignment means that each camera can automatically aim at a specified target by controlling the rotation of the two-axes platform. To achieve automatic alignment, we present a novel target matching algorithm using local homography matrices, which can estimate the position of the target on the right image plane once the target is specified on the left image plane.

This paper is organized as follows. A model of the DBSV based on non-zoom rotating cameras in a large FOV is established in section 2. A two-point method to calibrate the initial parameters of the camera is presented in section 3, as well as an on-line extrinsic parameters calibration method. A novel target matching algorithm using local homography matrices is proposed in section 4. In section 5, computer simulations and real experiments are performed to investigate the accuracy of 3D coordinate measurement and validate the effectiveness of the proposed methods. The conclusion is presented in section 6.

2. Dynamic binocular stereo vision based on rotating camera in large FOV

2.1 Model of two-axes rotating camera

The two-axes rotating camera model is presented in Fig. 1. The camera is located at the position of the black rectangle before rotation, and the camera rotates to the position of the blue dotted rectangle after rotation. The intrinsic parameter matrix of the camera is defined as K, as shown in Eq. (1).

$${\boldsymbol K} = \left[ {\begin{array}{{ccc}} {{f_x}}&0&{{u_0}}\\ 0&{{f_y}}&{{v_0}}\\ 0&0&1 \end{array}} \right], $$
where fx and fy represent the camera’s focal length in the x-direction and y-direction, respectively, and o(u0, v0) represents the camera’s principal point.

Fig. 1. Model of two-axes rotating camera.

The east-ground-north coordinate system at the optical center is defined as Oc-EGN (Pegn), and the camera coordinate systems before and after rotation are denoted as Oc-XcYcZc (Pc) and Oc-Xc'Yc'Zc’ (Pc’), respectively. We define the three attitude angles of the rotating camera as the roll angle, pitch angle, and yaw angle, which are the angles of rotation about the north direction, the east direction, and the direction toward the ground, respectively. It is worth noting that attitude angles measured in the clockwise direction are positive. The rotation angles in the horizontal and vertical directions are denoted as Pan and Tilt, and the camera’s attitude angles before and after rotation are denoted as r, p, y and r’, p’, y’, respectively. The mapping relationship between the east-ground-north coordinate system and the camera coordinate system can be expressed by

$$\left\{ \begin{array}{l} {P_c} = {\boldsymbol R}{P_{egn}}\\ {\boldsymbol R} = {{\boldsymbol R}_z}(r){{\boldsymbol R}_x}(p){{\boldsymbol R}_y}(y) \end{array} \right., $$
where ${{\boldsymbol R}_z}({\times} ) = \left[ {\begin{array}{{ccc}} {\cos ({\times} )}&{\sin ({\times} )}&0\\ { - \sin ({\times} )}&{\cos ({\times} )}&0\\ 0&0&1 \end{array}} \right]$, ${{\boldsymbol R}_x}(\# ) = \left[ {\begin{array}{{ccc}} 1&0&0\\ 0&{\cos (\# )}&{\sin (\# )}\\ 0&{ - \sin (\# )}&{\cos (\# )} \end{array}} \right]$, ${{\boldsymbol R}_y}({\ast} ) = \left[ {\begin{array}{{ccc}} {\cos ({\ast} )}&0&{ - \sin ({\ast} )}\\ 0&1&0\\ {\sin ({\ast} )}&0&{\cos ({\ast} )} \end{array}} \right]$, and ×, #, * respectively represents the corresponding attitude angle.

The mapping relationship between the camera coordinate systems before and after rotation can be represented by

$$\left\{ \begin{array}{l} {P_c}^{\prime} = {{\boldsymbol R}_{ab}}{P_c}\\ {{\boldsymbol R}_{ab}} = {{\boldsymbol R}_x}(Tilt){{\boldsymbol R}_y}(Pan) \end{array} \right., $$
where r'=r, p'=p + Tilt, y'=y + Pan.

To simplify the camera model, the image center is taken as the principal point and the distortion coefficients of the camera are ignored, because the distortion of the lens used in this paper is extremely small; in addition, the physical size of a unit pixel is set to its factory value. It must be emphasized that these assumptions are suitable for measurement situations where extremely high accuracy is not required in a large FOV.

A feature point in the 3D world is denoted as Q, and its projection points on the image plane before and after rotation are q(u, v) and q'(u’, v’), respectively. According to the perspective projection imaging model of the camera, we have

$$\left\{ \begin{array}{l} \lambda \left[ {\begin{array}{{c}} {(u - {u_0}){d_x}}\\ {(v - {v_0}){d_y}}\\ 1 \end{array}} \right] = \tilde{{\boldsymbol K}}{P_c},\mu \left[ {\begin{array}{{c}} {(u^{\prime} - {u_0}){d_x}}\\ {(v^{\prime} - {v_0}){d_y}}\\ 1 \end{array}} \right] = \tilde{{\boldsymbol K}}{P_c}^{\prime}\\ \tilde{{\boldsymbol K}} = \left[ {\begin{array}{{ccc}} {{f_x}}&0&0\\ 0&{{f_y}}&0\\ 0&0&1 \end{array}} \right] \end{array} \right., $$
where λ and μ are scale factors, and dx and dy are the physical sizes of a unit pixel in the x-direction and y-direction.

According to Eqs. (3,4), we can get

$$\kappa \left[ {\begin{array}{{c}} {U^{\prime}}\\ {V^{\prime}}\\ 1 \end{array}} \right] = \tilde{{\boldsymbol K}}{{\boldsymbol R}_{ab}}{\tilde{{\boldsymbol K}}^{ - 1}}\left[ {\begin{array}{{c}} U\\ V\\ 1 \end{array}} \right], $$
where $\kappa $ is the scale factor, and U'=(u'-u0)dx, V'=(v'-v0)dy, U=(u-u0)dx, V=(v-v0)dy.

We define ${\boldsymbol H} = \tilde{{\boldsymbol K}}{{\boldsymbol R}_{ab}}{\tilde{{\boldsymbol K}}^{ - 1}}$, and H is the inter-image homography matrix of the rotating camera.
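
For concreteness, a minimal numerical sketch (not the authors' code) of this inter-image homography is given below. The focal length, pixel size, and platform readings are illustrative placeholders, and image coordinates are taken relative to the principal point so that the focal length can be expressed in pixel units.

```python
import numpy as np

def Rx(a):
    """Rotation about the camera x-axis (Tilt), following Eq. (2)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]])

def Ry(a):
    """Rotation about the camera y-axis (Pan), following Eq. (2)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]])

# Illustrative values: 25 mm lens with 4.8 um pixels -> focal length in pixel units
f = 25e-3 / 4.8e-6
K_tilde = np.diag([f, f, 1.0])

Pan, Tilt = np.deg2rad(10.0), np.deg2rad(-5.0)   # hypothetical platform readings
R_ab = Rx(Tilt) @ Ry(Pan)                        # Eq. (3)
H = K_tilde @ R_ab @ np.linalg.inv(K_tilde)      # inter-image homography, Eq. (5)

# Map a pixel offset (relative to the principal point) from the image before
# rotation to the image after rotation
U, V = 120.0, -40.0
p = H @ np.array([U, V, 1.0])
U_after, V_after = p[0] / p[2], p[1] / p[2]
```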

2.2 Model of dynamic binocular stereo vision

To achieve online measurement of 3D coordinates in a large FOV, a DBSV is established in this paper. The composition of the DBSV is shown in Fig. 2(a); it mainly includes two independent front-ends and a terminal. Each front-end is composed of a camera, a two-axes platform, a GPS positioning instrument, a digital two-axes inclinometer, an image sending router, and a wireless serial port. The terminal is composed of a host computer, the left and right image receiving routers, a switch, and a wireless serial port. Each camera is mounted on the two-axes platform so that it can rotate in the horizontal and vertical directions. The two-axes platform is composed of an azimuth stepping motor and a pitch stepping motor, and the accuracy of both stepping motors is 0.002°/step [26]. The two-axes inclinometer is mechanically connected to the platform and is used to level the platform according to its output. The ground position of each camera is obtained by the GPS in advance, and the positioning accuracy of the GPS can reach 4 centimeters in real-time kinematic (RTK) mode [27]. The images acquired by each camera are transmitted to the host computer through the wireless bridge formed by the image routers. The rotation of the two-axes platform is controlled by the host computer through the wireless serial port after generating the motion parameters of the stepping motors, so that the FOV of each camera can be adjusted. It should be noted that the established DBSV is applied to 3D coordinate measurement in outdoor areas without tall obstacles, because signal occlusion would make the GPS unable to work.

Fig. 2. (a) The composition of dynamic binocular stereo vision, (b) Model of dynamic binocular stereo vision based on rotating camera.

The mathematical model of the DBSV based on rotating cameras is presented in Fig. 2(b). The left and right cameras are located at the positions of the black rectangles at the initial time, and rotate to the positions of the blue dotted rectangles after the i-th rotation. Since the physical sizes of a unit pixel of the camera in the x-direction and y-direction are equal in this paper, we define the intrinsic parameter matrices of the left and right cameras as K1 and K2.

$${{\boldsymbol K}_1} = \left[ {\begin{array}{{ccc}} {{f_1}}&0&{{u_l}}\\ 0&{{f_1}}&{{v_l}}\\ 0&0&1 \end{array}} \right] {{\boldsymbol K}_2} = \left[ {\begin{array}{{ccc}} {{f_2}}&0&{{u_r}}\\ 0&{{f_2}}&{{v_r}}\\ 0&0&1 \end{array}} \right]$$
where f1 and f2 are the focal lengths of the left and right cameras, respectively, and ol(ul, vl) and or(ur, vr) are the corresponding principal points.

We define the east-ground-north coordinate system at the ground position of the left camera as the world coordinate system. Since the ground position of each camera is previously obtained by GPS, the world coordinates of the cameras are denoted as (S1x, S1y, S1z) and (S2x, S2y, S2z), respectively, which can be calculated by referring to Ref. [28]. The attitude angles (i.e. roll angle, pitch angle and yaw angle) of the left and right cameras at the initial time and after the i-th rotation are respectively denoted as (r01, p01, y01), (r02, p02, y02) and (ri1, pi1, yi1), (ri2, pi2, yi2). The approximate values of the roll angle and the pitch angle of each camera can be obtained by the inclinometer and the pitch stepping motor respectively, but the approximate value of the yaw angle cannot be measured by the instruments in this paper. The rotation angles of the left and right cameras in the horizontal and vertical directions are denoted as Pan1, Tilt1 and Pan2, Tilt2. We define the rotation matrices of the left and right cameras relative to the east-ground-north coordinate system at the corresponding camera’s position at the initial time and after the i-th rotation as R01, Ri1 and R02, Ri2.

Given an arbitrary point Q(Xw, Yw, Zw) in 3D world and the corresponding projection points q1(ui1,vi1), q2(ui2,vi2) on the left and right image planes after the i-th rotation, Eqs. (7,8) can be obtained according to the perspective imaging model.

$$\left\{ \begin{array}{l} \lambda \left[ {\begin{array}{{c}} {({u_{i1}} - {u_l}){d_x}}\\ {({v_{i1}} - {v_l}){d_y}}\\ 1 \end{array}} \right] = {{\tilde{{\boldsymbol K}}}_1}{{\boldsymbol R}_{i1}}\left[ {\begin{array}{{c}} {{X_w} - {S_{1x}}}\\ {{Y_w} - {S_{1y}}}\\ {{Z_w} - {S_{1z}}} \end{array}} \right]\\ {{\boldsymbol R}_{i1}} = {{\boldsymbol R}_z}({r_{i1}}){{\boldsymbol R}_x}({p_{i1}}){{\boldsymbol R}_y}({y_{i1}}) \end{array} \right., $$
$$\left\{ \begin{array}{l} \mu \left[ {\begin{array}{{c}} {({u_{i2}} - {u_r}){d_x}}\\ {({v_{i2}} - {v_r}){d_y}}\\ 1 \end{array}} \right] = {{\tilde{{\boldsymbol K}}}_2}{{\boldsymbol R}_{i2}}\left[ {\begin{array}{{c}} {{X_w} - {S_{2x}}}\\ {{Y_w} - {S_{2y}}}\\ {{Z_w} - {S_{2z}}} \end{array}} \right]\\ {{\boldsymbol R}_{i2}} = {{\boldsymbol R}_z}({r_{i2}}){{\boldsymbol R}_x}({p_{i2}}){{\boldsymbol R}_y}({y_{i2}}) \end{array} \right., $$
where the definition of ${\tilde{{\boldsymbol K}}_1}$ and ${\tilde{{\boldsymbol K}}_2}$ is similar to $\tilde{{\boldsymbol K}}$ in Eq. (4), and ri1=r01, pi1=p01+Tilt1, yi1=y01+Pan1, ri2=r02, pi2=p02+Tilt2, yi2=y02+ Pan2.

Equations (7,8) are abbreviated here as λHi1=Mi1 and μHi2=Mi2, and the four linear equations in Eq. (9) can be obtained after eliminating the scale factors λ and μ.

$$\left\{ \begin{array}{l} e{q_i}(1):{{\boldsymbol M}_{i1}}(1) - {{\boldsymbol H}_{i1}}(1){{\boldsymbol M}_{i1}}(3) = 0\\ e{q_i}(2):{{\boldsymbol M}_{i1}}(2) - {{\boldsymbol H}_{i1}}(2){{\boldsymbol M}_{i1}}(3) = 0\\ e{q_i}(3):{{\boldsymbol M}_{i2}}(1) - {{\boldsymbol H}_{i2}}(1){{\boldsymbol M}_{i2}}(3) = 0\\ e{q_i}(4):{{\boldsymbol M}_{i2}}(2) - {{\boldsymbol H}_{i2}}(2){{\boldsymbol M}_{i2}}(3) = 0 \end{array} \right., $$
where Mim(n), Him(n) (m=1, 2, n=1, 2, 3) represents the n-th element of corresponding matrix.

Therefore, the world coordinates of point Q can be obtained by the least squares method (LSM) once the intrinsic and extrinsic parameters of each camera and the image coordinates of the point are determined.
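
As a rough sketch of this least-squares step (our notation, not the authors' code), the four equations of Eq. (9) can be stacked and solved for the world coordinates; the rotation matrices, camera positions, and principal-point offsets are assumed to be available from the calibration described in the following sections.

```python
import numpy as np

def triangulate(K1, R1, S1, q1, K2, R2, S2, q2):
    """Least-squares 3D point from two rotated views (Eq. (9)).
    K*: diag(f, f, 1) with f in pixels; R*: world-to-camera rotation after the
    i-th rotation; S*: camera position in the world frame (numpy arrays);
    q*: (u - u0, v - v0) pixel offsets of the projections."""
    rows, rhs = [], []
    for K, R, S, (U, V) in ((K1, R1, S1, q1), (K2, R2, S2, q2)):
        P = K @ R                       # 3x3 projection part
        for coord, idx in ((U, 0), (V, 1)):
            a = P[idx] - coord * P[2]   # one linear equation: a . (Q - S) = 0
            rows.append(a)
            rhs.append(a @ S)
    A, b = np.array(rows), np.array(rhs)
    Q, *_ = np.linalg.lstsq(A, b, rcond=None)   # LSM over 4 equations, 3 unknowns
    return Q
```

Four equations over three unknowns make the system overdetermined, so the least-squares solution averages out part of the pixel noise.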

3. Camera calibration method

3.1 Initial parameters estimation by only using two control points

As mentioned in section 2.1, the image center is taken as the principal point, the distortion of the lens is ignored, and the physical size of a unit pixel is set to its factory value. As a result, the unknown initial parameters of each rotating camera are only the focal length and the three initial attitude angles. In the following, a method to estimate the initial parameters of the camera using only two control points is introduced. First, we take the factory value of the camera’s focal length and the approximate values of the roll and pitch angles as initial values, and calculate a rough value of the camera’s initial yaw angle from the first control point. Second, we refine the focal length and the three initial attitude angles by solving the linear equations established by the two control points.

For the left camera, given the first control point Q1(Xw1, Yw1, Zw1) in initial common FOV and the projection point q1(u1, v1) on the image plane of the left camera, we can get

$$\lambda {[{\tilde{{\boldsymbol K}}_1}{{\boldsymbol R}_z}({r_{01}}){{\boldsymbol R}_x}({p_{01}})]^{ - 1}}\left[ {\begin{array}{{c}} {({u_1} - {u_l}){d_x}}\\ {({v_1} - {v_l}){d_y}}\\ 1 \end{array}} \right] = {{\boldsymbol R}_y}({y_{01}})\left[ {\begin{array}{{c}} {{X_{w1}} - {S_{1x}}}\\ {{Y_{w1}} - {S_{1y}}}\\ {{Z_{w1}} - {S_{1z}}} \end{array}} \right]$$

The above Eq. (10) can be abbreviated as

$$\lambda {{\boldsymbol J}^{ - 1}}\left[ {\begin{array}{{c}} {({u_1} - {u_l}){d_x}}\\ {({v_1} - {v_l}){d_y}}\\ 1 \end{array}} \right] = \left[ {\begin{array}{{c}} {C{y_{01}}X - S{y_{01}}Z}\\ Y\\ {S{y_{01}}X + C{y_{01}}Z} \end{array}} \right], $$
where ${{\boldsymbol J}_{3 \times 3}} = {\tilde{{\boldsymbol K}}_1}{{\boldsymbol R}_z}({r_{01}}){{\boldsymbol R}_x}({p_{01}})$, [X, Y, Z]T=[Xw1-S1x, Yw1-S1y, Zw1-S1z]T, Sy01, Cy01 is the abbreviation of sin(y01) and cos(y01), respectively.

According to Eq. (11), we have

$$\left[ {\begin{array}{{cc}} {{Q_1}X + {Q_3}Z}&{{Q_1}Z - {Q_3}X}\\ {{Q_2}X}&{{Q_2}Z} \end{array}} \right]\left[ {\begin{array}{{c}} {S{y_{01}}}\\ {C{y_{01}}} \end{array}} \right] = \left[ {\begin{array}{{c}} 0\\ {{Q_3}Y} \end{array}} \right], $$
where Q3×1=J−1[(u1-ul)dx, (v1-vl)dy, 1]T, Qi (i=1, 2, 3) represents the i-th element of matrix Q.

Equation (12) can be abbreviated as A2×2[Sy01 Cy01]T=b2×1. Given the initial values of the focal length and the roll and pitch angles, the sine and cosine of the initial yaw angle can first be estimated by [Sy01 Cy01]T=A−1b. The tangent of the yaw angle can be calculated by Ty01=Sy01/Cy01, where Ty01 is the abbreviation of tan(y01). Then the rough value of the initial yaw angle can be obtained by

$$\left\{ \begin{array}{l} \begin{array}{{cc}} {{{\tilde{y}}_{01}} = \textrm{ta}{\textrm{n}^{ - 1}}(|{T{y_{01}}} |)}&{if(S{y_{01}} \ge 0,C{y_{01}} > 0)} \end{array}\\ \begin{array}{{cc}} {{{\tilde{y}}_{01}} = \pi - \textrm{ta}{\textrm{n}^{ - 1}}(|{T{y_{01}}} |)}&{if(S{y_{01}} > 0,C{y_{01}} \le 0)} \end{array}\\ \begin{array}{{cc}} {{{\tilde{y}}_{01}} = \textrm{ta}{\textrm{n}^{ - 1}}(|{T{y_{01}}} |) - \pi }&{if(S{y_{01}} \le 0,C{y_{01}} < 0)} \end{array}\\ \begin{array}{{cc}} {{{\tilde{y}}_{01}} ={-} \textrm{ta}{\textrm{n}^{ - 1}}(|{T{y_{01}}} |)}&{if(S{y_{01}} < 0,C{y_{01}} \ge 0)} \end{array} \end{array} \right.. $$
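
A hedged sketch of this first step is shown below (our variable names, illustrative placeholder values rather than the authors' data); atan2 reproduces the quadrant rules of Eq. (13) in a single call.

```python
import numpy as np

def rough_yaw(f0, r0, p0, uv, principal, d, Qw, S):
    """Rough initial yaw from one control point (Eqs. (10)-(13)).
    f0, r0, p0: factory focal length and inclinometer/pitch-motor readings;
    uv: image point of the control point; d: pixel size; Qw, S: world
    coordinates of the control point and of the camera."""
    cr, sr, cp, sp = np.cos(r0), np.sin(r0), np.cos(p0), np.sin(p0)
    Rz = np.array([[cr, sr, 0.0], [-sr, cr, 0.0], [0.0, 0.0, 1.0]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cp, sp], [0.0, -sp, cp]])
    J = np.diag([f0, f0, 1.0]) @ Rz @ Rx                      # Eq. (11)
    q = np.linalg.solve(J, np.array([(uv[0] - principal[0]) * d,
                                     (uv[1] - principal[1]) * d, 1.0]))
    X, Y, Z = np.asarray(Qw, float) - np.asarray(S, float)
    A = np.array([[q[0] * X + q[2] * Z, q[0] * Z - q[2] * X],
                  [q[1] * X,            q[1] * Z]])
    Sy, Cy = np.linalg.solve(A, np.array([0.0, q[2] * Y]))    # Eq. (12)
    return np.arctan2(Sy, Cy)                                 # Eq. (13)

# Hypothetical usage with placeholder values (not the authors' data)
y01_rough = rough_yaw(25e-3, 0.0, 0.03, (1120.0, 350.0), (960.0, 300.0), 4.8e-6,
                      Qw=(150.0, -2.0, 130.0), S=(0.0, 0.0, 0.0))
```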

Similarly to Eq. (7), given two control points in the initial common FOV, λ1H1=M1 and λ2H2=M2 can be established for the left camera. Then the four linear equations in Eq. (14), which relate to the unknown initial parameters of the left camera, can be obtained.

$$\left\{ \begin{array}{l} eq(1):{{\boldsymbol M}_1}(1) - {{\boldsymbol H}_1}(1){{\boldsymbol M}_1}(3) = 0\\ eq(2):{{\boldsymbol M}_1}(2) - {{\boldsymbol H}_1}(2){{\boldsymbol M}_1}(3) = 0\\ eq(3):{{\boldsymbol M}_2}(1) - {{\boldsymbol H}_2}(1){{\boldsymbol M}_2}(3) = 0\\ eq(4):{{\boldsymbol M}_2}(2) - {{\boldsymbol H}_2}(2){{\boldsymbol M}_2}(3) = 0 \end{array} \right., $$
where Hm(n), Mm(n) (m=1, 2; n=1, 2, 3) represents the n-th element of the m-th matrix.

We consider the factory value of the focal length of the camera as the initial value of focal length, and consider the rough yaw angle and the approximate values of the roll and pitch angles as the initial values of the camera’s corresponding attitude angles. Then, the refined focal length and three initial attitude angles can be calculated iteratively by

$$F({f_1},{r_{01}},{p_{01}},{y_{01}}) = \sum\limits_{i = 1}^4 {{{|{eq(i)} |}^2} \to \min } . $$

Similar to the left camera, the refined initial parameters of the right camera can also be calculated from the same two control points.
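
The sketch below illustrates this second, iterative step under the same assumptions; it synthesizes two control points from a known "true" camera so the refinement can be checked, and uses scipy's least_squares as a stand-in for whichever optimizer the authors used. All numbers are placeholders, not the values of Tables 1 and 2.

```python
import numpy as np
from scipy.optimize import least_squares

def Rz(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

def Rx(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]])

def Ry(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]])

def residuals(params, ctrl_pts, img_pts, S, principal, d):
    """eq(1)..eq(4) of Eq. (14) for two control points of the left camera."""
    f, r, p, y = params
    K_t, R = np.diag([f, f, 1.0]), Rz(r) @ Rx(p) @ Ry(y)
    res = []
    for Qw, (u, v) in zip(ctrl_pts, img_pts):
        M = K_t @ R @ (np.asarray(Qw) - np.asarray(S))
        U, V = (u - principal[0]) * d, (v - principal[1]) * d
        res += [M[0] - U * M[2], M[1] - V * M[2]]
    return res

# Synthetic self-check: two flag corners roughly 200 m in front of a camera with
# known "true" parameters; the refinement starts from rough values (Eq. (15)).
f_true, r_true, p_true, y_true = 25e-3, 0.002, 0.03, 0.60      # meters / radians
S1, principal, d = np.zeros(3), (960.0, 300.0), 4.8e-6
R_true, K_true = Rz(r_true) @ Rx(p_true) @ Ry(y_true), np.diag([f_true, f_true, 1.0])
ctrl_pts, img_pts = [], []
for Pc in (np.array([3.0, -2.0, 200.0]), np.array([-5.0, 1.0, 180.0])):
    ctrl_pts.append(S1 + R_true.T @ Pc)        # world coordinates of the control point
    M = K_true @ Pc
    img_pts.append((M[0] / M[2] / d + principal[0], M[1] / M[2] / d + principal[1]))

x0 = [25e-3, 0.0, 0.0, 0.55]       # factory focal length; rough roll/pitch/yaw guesses
sol = least_squares(residuals, x0, args=(ctrl_pts, img_pts, S1, principal, d),
                    x_scale='jac')
f1, r01, p01, y01 = sol.x          # refined initial parameters of the left camera
```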

3.2 Extrinsic parameters estimation after rotation

Since the camera can rotate horizontally and vertically, the pitch angle and yaw angle of each camera change after the camera is rotated. The pitch and yaw angles after rotation equal the sum of the initial attitude angles and the corresponding rotation angles in the vertical and horizontal directions. Therefore, the key problem of extrinsic parameters estimation after rotation is to solve the rotation angles of each camera.

3.2.1 Extrinsic parameters self-calibration using a pair of point correspondences

For the left camera, consider a single feature point Q and its projection points q(u, v) and q'(u’, v’) on the image plane before and after rotation. Similarly to Eq. (5), Eq. (16) can be obtained according to the inter-image homography of the rotating camera.

$$\lambda \left[ {\begin{array}{{c}} {U^{\prime}}\\ {V^{\prime}}\\ 1 \end{array}} \right] = \left[ {\begin{array}{{ccc}} {C{p_1}}&0&{ - {f_1}S{p_1}}\\ {S{p_1}S{t_1}}&{C{t_1}}&{{f_1}C{p_1}S{t_1}}\\ {\frac{{S{p_1}C{t_1}}}{{{f_1}}}}&{ - \frac{{S{t_1}}}{{{f_1}}}}&{C{p_1}C{t_1}} \end{array}} \right]\left[ {\begin{array}{{c}} U\\ V\\ 1 \end{array}} \right], $$
where U'=u'-ul, V'=v'-vl, U = u-ul, V = v-vl, Sp1=sin(Pan1), Cp1=cos(Pan1), St1=sin(Tilt1), Ct1=cos(Tilt1).

Equation (16) is equivalent to

$$\left\{ \begin{array}{l} S{p_1}C{t_1}UU^{\prime} - S{t_1}VU^{\prime} + {f_1}C{p_1}C{t_1}U^{\prime} - {f_1}C{p_1}U + f_1^2S{p_1} = 0\\ S{p_1}C{t_1}UV^{\prime} - S{t_1}VV^{\prime} + {f_1}C{p_1}C{t_1}V^{\prime} - {f_1}S{p_1}S{t_1}U - {f_1}C{t_1}V - f_1^2C{p_1}S{t_1} = 0 \end{array} \right.. $$

The rotation angles in the horizontal and vertical directions read from the output of the two-axes platform can be taken as initial values, and the refined Pan1 and Tilt1 can then be calculated from Eq. (17). Similarly, Pan2 and Tilt2 can be solved from the point correspondences {q, q'} of the right camera.
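
A hedged sketch of this refinement follows: the correspondence {q, q'} is synthesized from Eq. (16) with a known rotation, slightly-off platform readings serve as the initial guess, and scipy's least_squares stands in for whichever solver the authors used; all numbers are illustrative.

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(angles, U, V, U2, V2, f):
    """The two equations of Eq. (17) for the left camera (f in pixel units)."""
    Pan, Tilt = angles
    Sp, Cp, St, Ct = np.sin(Pan), np.cos(Pan), np.sin(Tilt), np.cos(Tilt)
    r1 = Sp*Ct*U*U2 - St*V*U2 + f*Cp*Ct*U2 - f*Cp*U + f**2*Sp
    r2 = Sp*Ct*U*V2 - St*V*V2 + f*Cp*Ct*V2 - f*Sp*St*U - f*Ct*V - f**2*Cp*St
    return [r1, r2]

f1 = 25e-3 / 4.8e-6                              # focal length in pixels (placeholder)
Pan_true, Tilt_true = np.deg2rad(2.6), np.deg2rad(-1.1)
Sp, Cp = np.sin(Pan_true), np.cos(Pan_true)
St, Ct = np.sin(Tilt_true), np.cos(Tilt_true)
H = np.array([[Cp,         0.0,    -f1*Sp],      # Eq. (16)
              [Sp*St,      Ct,      f1*Cp*St],
              [Sp*Ct/f1,  -St/f1,   Cp*Ct]])
U, V = 145.0, 40.0                               # q relative to the principal point
p = H @ np.array([U, V, 1.0])
U2, V2 = p[0] / p[2], p[1] / p[2]                # synthesized q' after rotation

x0 = np.deg2rad([2.5, -1.0])                     # platform readings as initial guess
Pan1, Tilt1 = least_squares(residuals, x0, args=(U, V, U2, V2, f1)).x
```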

3.2.2 Extrinsic parameters self-calibration using high precision two-axes platform

The above method minimizes the number of feature points required for the extrinsic calibration of the rotating camera, and it can obtain refined results even if the output accuracy of the two-axes platform is not high. However, the extraction of the feature point is time-consuming, the pixel accuracy of the feature point significantly affects the calibration results, and the method using point correspondences before and after rotation will not work if the feature point disappears from the FOV.

To avoid using point correspondences between images, we directly estimate the rotation angles of each camera from the output of the high-precision two-axes platform in this paper. Therefore, the extrinsic parameters are determined immediately after each camera is rotated, which makes it possible to measure 3D coordinates online.
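
In code form this on-line update is a one-liner per angle; the sketch below (our naming, with the 0.002°/step resolution from Ref. [26]) simply adds the platform rotation to the calibrated initial attitude angles.

```python
import numpy as np

def extrinsics_after_rotation(r0, p0, y0, pan_steps, tilt_steps, step_deg=0.002):
    """r0, p0, y0: calibrated initial attitude angles in radians;
    pan_steps, tilt_steps: step counts sent to the azimuth and pitch motors."""
    pan = np.deg2rad(pan_steps * step_deg)    # horizontal rotation angle
    tilt = np.deg2rad(tilt_steps * step_deg)  # vertical rotation angle
    return r0, p0 + tilt, y0 + pan            # roll unchanged; p' = p + Tilt, y' = y + Pan
```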

4. Automatic alignment of rotating camera

In addition to achieving 3D coordinate measurement, the DBSV established in this paper also has the ability of automatic alignment. Automatic alignment means that each camera can automatically aim at one target by adjusting its FOV. Once the target is specified on the left image plane, the position of this target on the right image plane should be determined by image matching. Then the host computer can control each two-axes platform to rotate so that the target is placed in the center of each camera’s FOV, according to the motion parameters of the stepping motors generated from the image coordinates of the target.

4.1 Principles of automatic alignment

In Fig. 3, point Q is a specified target in the 3D world with projection point q(u, v) on the image plane. We assume the specified target will lie in the center of the camera’s FOV after the camera rotates by Pan degrees in the horizontal direction and Tilt degrees in the vertical direction. Equivalently, if point Q is rotated around the Y-axis and the X-axis so that it moves to Q’ and the projection of Q’ coincides with the image center, the required rotation angles are exactly Pan and Tilt, but with opposite sign. Therefore, the rotation angles required for automatic alignment can be calculated by Eq. (18) according to the triangle similarity of the pinhole camera model.

$$\left\{ \begin{array}{l} Pan ={-} {\tan^{ - 1}}(\frac{{\Delta x}}{f}) ={-} {\tan^{ - 1}}({d_{xy}}\frac{{{u_0} - u}}{f})\\ Tilt ={-} {\tan^{ - 1}}(\frac{{\Delta y}}{f}) ={-} {\tan^{ - 1}}({d_{xy}}\frac{{{v_0} - v}}{f}) \end{array} \right., $$
where f, (u0, v0), and dxy represent the focal length of the camera, the principal point, and the physical size of a unit pixel, respectively.
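
As a small sketch of Eq. (18) (pixel coordinates are illustrative; the 0.002°/step conversion uses the motor resolution quoted in section 2.2):

```python
import numpy as np

def alignment_angles(target_uv, principal, f, dxy):
    """Pan/Tilt (radians) that bring the target to the image center, Eq. (18)."""
    u, v = target_uv
    u0, v0 = principal
    pan = -np.arctan(dxy * (u0 - u) / f)
    tilt = -np.arctan(dxy * (v0 - v) / f)
    return pan, tilt

f, dxy, principal = 25e-3, 4.8e-6, (960.0, 300.0)      # placeholder camera parameters
pan, tilt = alignment_angles((1380.0, 215.0), principal, f, dxy)
step = np.deg2rad(0.002)                               # stepping-motor resolution [26]
pan_steps, tilt_steps = int(round(pan / step)), int(round(tilt / step))
```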

Fig. 3. The automatic alignment of rotating camera.

In summary, the rotation angles of each camera required for automatic alignment in horizontal and vertical directions can be obtained once the positions of the specified target on each camera’s image plane are determined. Then the step count of the azimuth and pitch stepping motors (i.e. the motion parameters of the two-axes platform) for automatic alignment can be calculated. The key issue involved in automatic alignment is to achieve the matching of the specified target on the left and right images. Therefore, a novel target matching method based on image registration is proposed below.

4.2 Target matching algorithm based on Moving DLT

As shown in Fig. 4, a spatial point Q is on the plane $\pi $ and its projection points are denoted as q(x, y) and q'(x’, y’). Given the point matches of multiple spatial points on the plane $\pi $, the global homography matrix H between the images can be calculated, and the mapping between the matching points {q, q'} can be described by

$$s\tilde{x}^{\prime} = {\boldsymbol H}\tilde{x}, $$
where s is the scale factor, $\tilde{x}$=[x, y, 1]T, $\tilde{x}^{\prime}$=[x’, y’, 1]T.

Fig. 4. The mapping based on global homography matrix.

We define h = [h11, h12, h13, h21, h22, h23, h31, h32, h33]T, where hij (i, j=1, 2, 3) represents the element of the matrix H in the i-th row and j-th column. Then Eq. (19) can be represented by

$$\left\{ \begin{array}{l} x{h_{11}} + y{h_{12}} + {h_{13}} - x^{\prime}x{h_{31}} - x^{\prime}y{h_{32}} - x^{\prime}{h_{33}} = 0\\ x{h_{21}} + y{h_{22}} + {h_{23}} - y^{\prime}x{h_{31}} - y^{\prime}y{h_{32}} - y^{\prime}{h_{33}} = 0 \end{array} \right.. $$

Given n pairs of point matches $\{{{q_i},{q_i}^{\prime}} \}_{i = 1}^n$, each element of matrix H can be estimated by

$$\tilde{{\boldsymbol h}} = \mathop {\arg \min }\limits_{\boldsymbol h} \sum\limits_{i = 1}^n {{{||{{{\boldsymbol a}_i}{\boldsymbol h}} ||}^2}} = \mathop {\arg \min }\limits_{\boldsymbol h} {||{{\boldsymbol Ah}} ||^2}, s.t ||{\boldsymbol h} ||= 1$$
where ai${\in} $R2×9 represents the two rows of coefficients in Eq. (20) for the i-th group of data {qi, qi'}, and A${\in} $R2n×9 is obtained by stacking vertically $\{{{{\boldsymbol a}_i}} \}_{i = 1}^n$.
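
As an illustrative sketch (assuming matched feature points are already available from a standard detector; point normalization is omitted for brevity), Eq. (21) amounts to stacking the coefficient rows of Eq. (20) and taking the right singular vector associated with the smallest singular value:

```python
import numpy as np

def global_homography(pts_l, pts_r):
    """Global homography by DLT (Eqs. (20)-(21)); returns a 3x3 H with ||h|| = 1."""
    rows = []
    for (x, y), (x2, y2) in zip(pts_l, pts_r):
        rows.append([x, y, 1, 0, 0, 0, -x2 * x, -x2 * y, -x2])
        rows.append([0, 0, 0, x, y, 1, -y2 * x, -y2 * y, -y2])
    A = np.asarray(rows, dtype=float)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3)     # singular vector of the smallest singular value

# Four hypothetical matches (in practice: many detector matches between the views)
pts_l = [(100.0, 120.0), (400.0, 110.0), (380.0, 420.0), (90.0, 430.0)]
pts_r = [(80.0, 118.0), (385.0, 115.0), (360.0, 418.0), (70.0, 425.0)]
H = global_homography(pts_l, pts_r)
```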

In Fig. 4, a spatial point on the extension of Olq is denoted as Q’, which is not on the plane $\pi $. The corresponding matching point on the right image plane estimated by the global homography matrix is still q’, but its true matching point is q'’. This deviation is mainly caused by the distance between the actual spatial point and the corresponding plane of the homography matrix and the translation of the right camera relative to the left camera. Generally, the global homography matrix can be used to estimate the position of the corresponding matching point if the translation between cameras is sufficiently small relative to the depth of the scene.

To improve the accuracy of target positioning, a local homography matrix H* is used to determine the image coordinates of the point to be matched. Given the matching points {x*, x*'} on the left and right images, we can get

$${\tilde{x}_\ast }^{\prime} = {{\boldsymbol H}_\ast }{\tilde{x}_\ast }. $$

The local homography matrix can be estimated by

$${{\boldsymbol h}_\ast } = \mathop {\arg \min }\limits_{\boldsymbol h} \sum\limits_{i = 1}^n {{{||{w_\ast^i{{\boldsymbol a}_i}{\boldsymbol h}} ||}^2}} $$

The weight $w_\ast ^i$ changes with x*, which can be calculated by

$$w_\ast ^i = \textrm{exp(} - {||{{x_\ast } - {x_i}} ||^\textrm{2}}\textrm{/}{\sigma ^\textrm{2}}\textrm{)}, $$
where $\sigma $ is the scale parameter, and xi represents the image coordinate of the i-th feature point on the left image.

Obviously, Eq. (24) assigns higher weights to feature points closer to x*, so the result better reflects the mapping relationship of the feature points near x* compared with the global homography matrix. The local homography matrices generated by traversing the entire image using Eq. (23) vary smoothly, so the method is called Moving DLT [29].

Equation (23) can be described as

$$\left\{ \begin{array}{l} {{\boldsymbol h}_\ast } = \mathop {\arg \min }\limits_{\boldsymbol h} {||{{W_\ast }{\boldsymbol Ah}} ||^2}\\ {W_\ast } = diag(\left[ {\begin{array}{{ccccc}} {w_\ast^1}&{w_\ast^1}&{\ldots }&{w_\ast^n}&{w_\ast^n} \end{array}} \right]) \end{array} \right.$$

The weights W* will be of little significance and the solution of Eq. (25) may be unstable when x* lies in an area where the feature points are sparse. To prevent numerical problems in the estimation of the local homography matrix, a threshold $\gamma $ in Eq. (26) is used to offset the weights; the closer $\gamma $ is to 1, the closer the local homography matrix is to the global homography matrix.

$$w_\ast ^i = \textrm{max}(\textrm{exp}( - {||{{x_\ast } - {x_i}} ||^\textrm{2}}\textrm{/}{\sigma ^\textrm{2}}),\gamma )$$

It is computationally intensive to estimate the local homography matrix of each pixel by traversing the entire image. Therefore, the source image is usually evenly divided into a grid of C1×C2 cells, as shown in Fig. 5(b). Then the corresponding positions in the target image [as shown in Fig. 5(a)] of all pixels in a cell can be estimated by the local homography matrix of the center pixel of that cell.
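
A minimal sketch of this cell-wise Moving DLT procedure is given below; it reuses the global-DLT coefficient rows from the previous sketch, and sigma, gamma, the image size, and the grid resolution are illustrative choices rather than the authors' settings.

```python
import numpy as np

def moving_dlt(pts_l, pts_r, cell_centers, sigma=50.0, gamma=0.01):
    """One local homography H* per grid-cell center (Eqs. (23)-(26))."""
    pts_l = np.asarray(pts_l, dtype=float)
    rows = []
    for (x, y), (x2, y2) in zip(pts_l, pts_r):
        rows.append([x, y, 1, 0, 0, 0, -x2 * x, -x2 * y, -x2])
        rows.append([0, 0, 0, x, y, 1, -y2 * x, -y2 * y, -y2])
    A = np.asarray(rows)
    H_local = []
    for cx, cy in cell_centers:
        d2 = np.sum((pts_l - [cx, cy]) ** 2, axis=1)
        w = np.maximum(np.exp(-d2 / sigma ** 2), gamma)   # offset weights, Eq. (26)
        W = np.repeat(w, 2)                               # each match gives two rows
        _, _, Vt = np.linalg.svd(W[:, None] * A)          # argmin ||W A h||, ||h|| = 1
        H_local.append(Vt[-1].reshape(3, 3))
    return H_local

# Centers of a 20x20 grid over an illustrative 1920x600 left image (cf. Fig. 5)
xs, ys = np.linspace(48.0, 1872.0, 20), np.linspace(15.0, 585.0, 20)
centers = [(x, y) for y in ys for x in xs]
# H_cells = moving_dlt(pts_l, pts_r, centers)   # pts_l/pts_r as in the previous sketch
```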

Fig. 5. (a) The target image (right image), (b) The source image (left image) divided into 20×20 cells.

5. Computer simulation and experiment

5.1 Computer simulation

The DBSV established in this paper is applied to 3D coordinate measurement in a large FOV. The pixel extraction accuracy significantly affects the accuracy of 3D coordinate measurement. In order to investigate the accuracy of the 3D coordinates with respect to the pixel extraction accuracy, we simulate 121 control points in the neighborhood of the center of the common FOV at different attitude angles, and add Gaussian noise with a mean of 0 and a variance of 0.5 to the image coordinates. The focal length, principal point, and unit pixel size of both virtual cameras are 25 millimeters, (960 pixels, 300 pixels), and 4.8 microns, respectively, and the world coordinates of the virtual cameras are (0, 0, 0) and (30, 0, 0) (unit: meters). In this simulation, the top view of the DBSV is shown in Fig. 6(a); the roll angle of the two cameras remains zero, the pitch angle varies from -25° to 25°, and the yaw angle varies from -45° to 45°. The root mean square error (RMSE) between the real and estimated 3D coordinates of the simulated control points is used to evaluate the accuracy of 3D coordinate measurement. Figure 6(b) reflects the RMSE caused by the same level of Gaussian noise under different attitude angles, and Fig. 6(c) reflects the angle between the optical axes under different attitude angles. The RMSE in Fig. 6(b) does not exceed 0.02 meters overall, but increases significantly when the angle between the optical axes is smaller than 20°.

Fig. 6. (a) The top view of dynamic binocular stereo vision, (b) The RMSE versus Gaussian noise, (c) The angle between optical axes.

In this paper, the extrinsic parameters of each rotating camera are obtained from the output of the two-axes platform. In order to investigate the accuracy of 3D coordinate measurement with respect to the error of the attitude angles, we add four different levels of attitude angle error (0.002°, 0.004°, 0.006°, and 0.008°) to the roll angle, pitch angle, and yaw angle, and estimate the average RMSE of the 3D coordinates under different attitude angles. Since the roll angle is close to zero, the average RMSE in Fig. 7(a) is extremely small and does not exceed 9×10−3 millimeters even when the roll angle error reaches its maximum. Figures 7(b) and 7(c) show that the average RMSE increases significantly with increasing pitch angle and yaw angle error. However, the average RMSE does not exceed 16 millimeters and 21 millimeters when the pitch angle error and the yaw angle error reach their maxima, respectively. Similarly to Fig. 6(b), the RMSE caused by the same level of attitude angle error is more significant when the angle between the optical axes is smaller than 20°.

Fig. 7. The RMSE versus error of attitude angles under different attitude angles: (a) RMSE versus error of roll angle, (b) RMSE versus error of pitch angle, (c) RMSE versus error of yaw angle.

The positioning accuracy of GPS largely determines the accuracy of the 3D coordinate measurement. In order to investigate the accuracy of 3D coordinates with respect to the GPS positioning accuracy, we add four different levels of positioning error of 2 centimeters, 4 centimeters, 6 centimeters, and 8 centimeters to the world coordinates of each camera. In Fig. 8, the average of the RMSE of the 3D coordinates under different attitude angles increases with the increasing positioning error, but does not exceed 0.13 meters when the error reaches the maximum. Moreover, the RMSE caused by the same level of positioning error hardly fluctuates under different attitude angles.

Fig. 8. The RMSE versus positioning error.

5.2 Experimental results

In order to evaluate the calibration accuracy of the initial and extrinsic parameters of the rotating cameras and the accuracy of 3D coordinate measurement in a large FOV, outfield experiments are performed, as shown in Fig. 9(a). A single front-end of the DBSV established in this paper is shown in Fig. 9(b). The baseline distance and measuring distance are about 30 meters and 200 meters, respectively.

Fig. 9. (a) The schematic diagram of outfield experiment, (b) The single front-end of dynamic binocular stereo vision.

The initial parameters of each camera are calibrated in advance by using the proposed two-point method. In order to validate the feasibility of the two-point method, two random control points out of ten in the initial common FOV are used to estimate the focal length and initial attitude angles of each camera, and the RMSE of the 3D coordinates of the remaining control points is used to evaluate the calibration accuracy. We place a white rectangular flag at different positions in the common FOV each time, and select the lower left corner of the flag as the control point. The image coordinates of these control points on each image plane are manually extracted after the area of the white flag is enlarged. The red dots on the left and right image planes in Fig. 10 are these control points, and the world coordinates of the ten control points are all obtained by handheld GPS. The world coordinates of the cameras and the control points and the image coordinates of the control points are shown in Table 1.

Fig. 10. (a) The control points on left image plane, (b) The control points on right image plane.


Table 1. The coordinates of cameras and control points.

This experiment is repeated 12 times, and we take the calibration result of one of the trials as the initial parameters of the cameras (as shown in Table 2) and load them into the established DBSV since the cameras are non-zoom.


Table 2. The initial parameters of cameras.

In Fig. 11, the RMSE of the 3D coordinates of the remaining control points is less than 0.28 meters at the measuring distance of 200 meters, which validates the effectiveness of the two-point method.

Fig. 11. The RMSE of 3D coordinates.

To achieve automatic alignment of each camera, we estimate the position of a target on the right image plane once the target is specified on the left image plane. The white rectangular flag on the left image in Fig. 12(a) is the specified target. The position of the flag on the right image is estimated by the global homography matrix and by the local homography matrix, as shown in Fig. 12(b). Obviously, the position of the flag on the right image estimated by the local homography matrix better matches its real position. After the positions of the flag on each image are determined, the motion parameters of the stepping motors required for alignment can be generated. Then the white flag can be placed in the center of each camera’s FOV by controlling the rotation of the two-axes platforms, as shown in Figs. 12(c) and 12(d).

Fig. 12. (a) The specified target on left image, (b) The target matching based on global and local homography matrix, (c) Automatic alignment of left camera, (d) Automatic alignment of right camera.

After automatic alignment, we estimate the pitch angle and yaw angle of each camera by summing the rotation angles obtained from the high-precision two-axes platform and the initial attitude angles. In order to validate the feasibility of our method, we compare the pitch and yaw angles estimated by our method with those estimated by the self-calibration method in section 3.2.1 and by the SPCM [22], under the condition that the focal length and the roll angle are known. The self-calibration method estimates the rotation angles of the camera in the horizontal and vertical directions by using a pair of point correspondences before and after rotation, and then calculates the extrinsic parameters of the camera by summing the rotation angles and the initial attitude angles. The SPCM is a traditional calibration method that calculates the extrinsic parameters by using a single control point with known 3D information, and it achieves good accuracy. The calibration results from the SPCM are considered as reference values in this paper.

To avoid using a calibration object, we use the intersection of a set of straight lines in the natural scene as the single feature point required for the self-calibration method. The straight lines before and after rotation are extracted by the Hough algorithm after extracting the region of interest (ROI), as shown in Fig. 13. In addition, the ground position of the single control point required for the SPCM is also obtained by handheld GPS.

Fig. 13. The extraction of straight lines: (a) Straight lines detected before rotation, (b) Straight lines detected after rotation.

The experiment is repeated 10 times, and the pitch and yaw angles of the left and right cameras estimated by the three methods are listed in Table 3. The pitch and yaw angles from our method are comparable with those from the self-calibration method and the SPCM, which supports the correctness of our method. Although there are slight deviations between the results of our method and the reference values from the SPCM, our method does not require point correspondences before and after rotation or control points with 3D information. The absolute error of the pitch and yaw angles estimated by our method relative to the reference values is shown in Fig. 14, and the average absolute error does not exceed 0.061°.

Fig. 14. The absolute error of the pitch angle and yaw angle relative to the reference values.


Table 3. The extrinsic parameters of cameras.

To evaluate the accuracy of 3D coordinate measurement, we reconstruct the 3D coordinates of 12 control points by using the extrinsic parameters estimated by the above three methods each time the cameras are rotated. A total of 120 control points are accumulated over the 10 trials. The absolute error of the 3D coordinates of all control points on each axis is shown in Fig. 15(a). Although the measurement accuracy of our method is slightly lower than that of the SPCM, the extrinsic parameters of the cameras are determined immediately after the cameras are rotated. Figure 15(b) shows the average error of the 3D coordinates on each axis estimated by our method, which is less than 0.28 meters, 0.19 meters, and 0.097 meters, respectively.

Fig. 15. The error of 3D coordinate: (a) Absolute error of control points on each axis, (b) Average error on each axis.

6. Conclusion

A DBSV based on non-zoom rotating cameras in a large FOV is established in this paper. The ground position of each camera is obtained by GPS in advance. The camera model is simplified so that the unknown initial parameters of each camera are reduced to only the focal length and the three attitude angles. Moreover, the approximate values of the roll angle and pitch angle of each camera are obtained in real time by the inclinometer and the pitch stepping motor, respectively. To quickly estimate the initial parameters of the camera in a large FOV, a novel two-point method is proposed under the condition that the approximate value of the yaw angle of each camera is unknown. First, the rough value of the initial yaw angle of each camera is estimated by using the first control point. Second, the refined focal length and three initial attitude angles are calculated iteratively by using the two control points. Compared with other mature calibration methods, the two-point method does not require a calibration object with known size, which not only reduces the calibration cost, but also overcomes the inconvenience of manual operation in the complex environment of a large FOV. The intrinsic parameters and the roll angle of each camera remain constant, but the pitch angle and yaw angle change after the camera is rotated. The pitch and yaw angles are estimated by summing the rotation angles from the output of the high-precision two-axes platform and the corresponding initial attitude angles, which avoids using point correspondences between images and makes it possible to measure 3D coordinates online after rotation. In addition, a target matching algorithm based on Moving DLT is proposed to achieve the automatic alignment of each camera, which improves the positioning accuracy of the target by using local homography matrices. The accuracy of 3D coordinate measurement of the established DBSV is comparable with that of state-of-the-art methods.

It should be emphasized that the DBSV established in this paper is suitable for occasions where extremely high accuracy of 3D coordinate measurement is not required in a large FOV. To improve the accuracy, more control points would be required to compute the distortion coefficients of the camera when calibrating each camera’s initial parameters, and a higher-precision inclinometer and two-axes platform should be selected. However, more control points would increase the difficulty of manual operation in a large FOV, and higher-precision instruments would increase hardware costs.

Funding

Anhui University of Technology (QZ202014).

Disclosures

The authors declare no conflicts of interest.

References

1. R. A. Persad, C. Armenakis, and G. Sohn, “Calibration of a PTZ surveillance camera using 3D indoor model,” In Proceedings of International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives 38, 6 (2010).

2. J. Davis and X. Chen, “Calibrating pan-tilt cameras in wide-area surveillance networks,” In Proceedings of the Ninth IEEE International Conference on Computer Vision 1, 144–149 (2003).

3. M. Wensheng, H. Shuaipeng, L. Mousi, Q. Hongyu, and X. Fang, “Rotation panorama photogrammetry method based on digital camera,” Geomatics and Information Science of Wuhan University 42(2), 243–249 (2017). [CrossRef]  

4. X. Kaishun, C. Shanxing, and W. Yilin, “Modeling and simulation for target tracking system based on dual-pan-tilt camera,” Journal of System Simulation 27(2), 362–368 (2015).

5. Y. I. Abdel-Aziz and H. M. Karara, “Direct linear transformation into object space coordinates in close-range photogrammetry,” Photogramm. Eng. Remote Sens. 81(2), 103–107 (2015). [CrossRef]  

6. L. Zhang and D. Wang, “Automatic calibration of computer vision based on RAC calibration algorithm,” Metallurgical and Mining Industry 7(7), 308–312 (2015).

7. Z. Zhang, “Flexible camera calibration by viewing a plane from unknown orientations,” In Proceedings of the Seventh IEEE International Conference on Computer Vision 1, 666–673 (1999).

8. Z. Y. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. Pattern Anal. Machine Intell. 22(11), 1330–1334 (2000). [CrossRef]  

9. S. D. Ma, “A self-calibration technique for active vision system,” IEEE Trans. Robot. Automat. 12(1), 114–120 (1996). [CrossRef]  

10. Hua Li, Guanghui Wang, Fuchao Wu, and Zhangyi Hu, “A new self-calibration technique via epipoles,” In Proceedings of the Fifth Asian Conference on Computer Vision 2, 670–675 (2002).

11. G. Yang, L. Jiarui, C. Jiaqi, and X. Qiuyu, “An efficient and flexible camera calibration technique for large-scale vision measurement based on precise two-axis rotary table,” Nanotechnology and Precision Engineering 1(1), 59–65 (2018). [CrossRef]  

12. A. M. Truong, W. Philips, N. Deligiannis, L. Abrahamyan, and J. Z. Guan, “Automatic multi-camera extrinsic parameter calibration based on pedestrian torsors,” Sensors 19(22), 4989 (2019). [CrossRef]  

13. I. Van Crombrugge, R. Penne, and S. Vanlanduit, “Extrinsic camera calibration for non-overlapping cameras with gray code projection,” Opt. Lasers Eng. 134, 106305 (2020). [CrossRef]  

14. C. C. Jia, T. Yang, C. J. Wang, B. H. Fan, and F. G. He, “An extrinsic calibration method for multiple RGB-D cameras in a limited field of view,” Meas. Sci. Technol. 31(4), 045901 (2020). [CrossRef]  

15. F. Yan, Z. Liu, X. Pan, and Y. Shen, “High-accuracy calibration of cameras without depth of field and target size limitations,” Opt. Express 28(19), 27443–27458 (2020). [CrossRef]  

16. W. W. Feng, Z. L. Su, Y. S. Han, H. B. Liu, Q. F. Yu, S. P. Liu, and D. S. Zhang, “Inertial measurement unit aided extrinsic parameters calibration for stereo vision systems,” Opt. Lasers Eng. 134, 106252 (2020). [CrossRef]  

17. M. Kim, S. Kim, and J. Choi, “Robust and incremental stitching and calibration with known rotation on pan-tilt-zoom camera,” In Proceedings of 20th IEEE International Conference on Image Processing, 2247–2251 (2013).

18. A. Gudys, K. Wereszczynski, J. Segen, M. Kulbacki, and A. Drabik, “Camera calibration and navigation in networks of rotating cameras,” In Proceedings of 7th Asian Conference on Intelligent Information and Database Systems (ACIIDS) 9012, 237–247 (2015).

19. C. Li and R. Su, “A novel stratified self-calibration method of camera based on rotation movement,” Journal of Software 9(5), 1281–1287 (2014). [CrossRef]  

20. C. Zhang, F. Rameau, and J. Kim, “DeepPTZ: deep self-calibration for PTZ cameras,” In Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 1030–1038 (2020).

21. M. Wensheng, H. Shuaipeng, L. Mousi, Q. Hongyu, and X. Fang, “Rotation panorama photogrammetry method based on digital camera,” Geomatics and Information Science of Wuhan University 42(2), 243–249 (2017). [CrossRef]  

22. L. Yunting, Z. Jun, and H. Wenwen, “Method for pan-tilt camera calibration using single control point,” J. Opt. Soc. Am. A 32(1), 156–163 (2015). [CrossRef]  

23. Y. Wang, X. Wang, Z. Wan, and J. Zhang, “A method for extrinsic parameter calibration of rotating binocular stereo vision using a single feature point,” Sensors 18(11), 3666 (2018). [CrossRef]  

24. W. G. Yang, W. X. Qian, Y. Qian, and F. Wang, “Camera internal parameter calibration based on rotating platform and image matching,” In Proceedings of Conference on Optics and Photonics for Information Processing XIII 11136, 8 pp. (2019).

25. M. Bruckner, F. Bajramovic, and J. Denzler, “Intrinsic and extrinsic active self-calibration of multi-camera systems,” Machine Vision and Applications 25(2), 389–403 (2014). [CrossRef]  

26. https://www.orientalmotor.com.cn/products/limo/list/detail/?brand_tbl_code=LM&product_name=DG60-ASAK.

27. https://www.nfyq.cn/wap/view.php?aid=89.

28. Y. Wang and X. Wang, “An improved two-point calibration method for stereo vision with rotating cameras in large FOV,” J. Mod. Opt. 66(10), 1106–1115 (2019). [CrossRef]  

29. J. Zaragoza, T. J. Chin, and Q. H. Tran, “As-projective-as-possible image stitching with moving DLT,” IEEE Trans. Pattern Anal. Mach. Intell. 36(7), 1285–1298 (2014). [CrossRef]  
