## Abstract

A new target form, namely, a grid spherical target (GST), is proposed for camera calibration, an important task in the field of computer vision. Because of the grid on the sphere, the GST combines the advantages of spherical and checkerboard targets, and the latitude and longitude circles on the GST, together with the intersection points between them, are used to calibrate the camera. First, the Image of Absolute Conic is obtained from the elliptic curves of the latitude and longitude circles of the GST in the images. After the initial intrinsic and extrinsic parameters of the camera are computed from the Image of Absolute Conic, optimum solutions of the intrinsic and extrinsic parameters are found through nonlinear optimization using the latitude circles and the intersection points of the latitude and longitude lines. Finally, the effectiveness of the GST-based method is demonstrated in simulation and physical experiments.

© 2017 Optical Society of America

## Corrections

27 June 2017: Typographical corrections were made to Refs. 2, 3, 10, 12, 14, 18, 19, 21, 23, 29, and 31.

## 1. Introduction

Camera calibration is an active research topic in the field of computer vision [1–3]. Its accuracy directly affects the measuring accuracy of a vision system, in applications such as industrial inspection, vision-based autonomous robot navigation, and virtual reality. Thus, a simple and high-accuracy camera calibration method is of great significance. Current camera calibration methods fall primarily into two major classes: (1) target-based calibration methods, which are applicable to the calibration of vision measuring systems for precise dimensional measurement; and (2) camera self-calibration methods that do not require targets, which are applied to vision-based autonomous robot navigation and scene reconstruction. The first class is the topic of this paper.

Camera calibration with a 3D target composed of two planes [4–6] can achieve high accuracy, but it is now relatively rarely used because of the difficulty of machining such a target. Zhang [7] introduced a simple, flexible, and high-accuracy method using a freely moving planar checkerboard target, which has been widely applied to camera calibration [8]. Other researchers have adopted calibration methods using various features on planar targets [9–11]. Calibration methods [12–14] using 1D targets can calibrate only a single camera under certain circumstances, which limits the applications of 1D targets in machine vision-based measuring systems. Besides the abovementioned target forms, many camera calibration methods [15–21] have been proposed based on spherical target images; these methods determine the intrinsic camera parameters from the spherical contours (SCs). Given the particularity of a spherical target compared with conventional 3D targets, a camera shooting a spherical target at any angle can obtain a complete contour image, which makes this approach well suited to the synchronous calibration of the intrinsic parameters of multiple cameras shooting at different angles. However, it is strongly affected by image noise, so its accuracy does not satisfy the requirements of high-accuracy measuring system calibration. Other methods using a spinning surface [22], parallel circles [23], and coaxial circles [24] have also been described, but they cannot achieve high-accuracy calibration.

The calibration method using a 2D target is flexible and convenient, with high calibration accuracy, and has been widely applied; however, it cannot synchronously calibrate multiple cameras at different positions. A spherical target yields a complete outer contour from any viewing angle, which makes it suitable for the synchronous calibration of multiple cameras at different positions, but its calibration accuracy can hardly meet high-accuracy measurement requirements. Moreover, a constraint must be met on the location of the spherical target in the image: if its projection lies too close to the image center, the calibration will be inaccurate, whereas if it lies too far from the center, the contour extraction and ellipse fitting may be affected by lens distortion, degrading the calibration accuracy.

Based on the above analysis, this paper presents a new target form, namely, the grid spherical target (GST), which combines the strengths of the spherical and planar targets. The GST differs from the spherical targets proposed in [15–21] mainly in that the camera is calibrated using the latitude and longitude circles and the intersection points between them. The GST-based method allows multiple cameras to shoot simultaneously and calibrate their respective intrinsic parameters, while achieving the same calibration accuracy as a planar target. During the calibration, the linear solutions of the intrinsic and extrinsic parameters are obtained from the elliptic curves of the latitude and longitude circles in the image. Then, based on the image coordinates of the intersection points of the longitude and latitude lines and on the lens distortion, a nonlinear optimization is used to obtain the optimum solutions of the intrinsic and extrinsic parameters.

The GST method has some limitations. It is difficult to manufacture with high accuracy compared with checkerboards, which currently restricts its wide application. The GST in this paper is manufactured by a five-axis NC machine. In the future, with improved manufacturing technology, it could be made from ceramics, with the grid produced by laser etching, which would improve its manufacturing accuracy. In addition, an advantage of the GST is the calibration of cameras at different positions; however, if two cameras are placed along the same horizontal axis through the sphere, the images acquired by both cameras would be identical. To avoid this ambiguity when calibrating multiple cameras in the future, marker points should be pasted at certain longitudes, such as 40°, 60°, 90°, 120°, 180°, and 240°. These marker points would make the views of different cameras distinguishable in the images.

The rest of this paper is organized as follows. Section 2 introduces a mathematical model for grid spherical imaging. Section 3 describes the principle of the proposed method in detail. Section 4 verifies the proposed calibration method through simulation and physical experiments. Conclusions are drawn in Section 5.

## 2. Mathematical model for grid spherical imaging

Figure 1 shows the model for grid spherical target imaging. ${O}_{\text{c}}{x}_{\text{c}}{y}_{\text{c}}{z}_{\text{c}}$ represents the camera coordinate system; ${O}_{\text{t}}{x}_{\text{t}}{y}_{\text{t}}{z}_{\text{t}}$ represents the target coordinate system; ${C}_{\text{li}(i)}$ represents the equation of the $i$^{th} circle of longitude under the coordinate system ${O}_{\text{li(}i\text{)}}{x}_{\text{li(}i\text{)}}{y}_{\text{li(}i\text{)}}{z}_{\text{li(}i\text{)}}$; and ${e}_{\text{li}(i)}$ represents the elliptic curve of ${C}_{\text{li}(i)}$ in the image, where $i = 1,2,\dots,N$ and $N$ is the number of circles of longitude. ${C}_{\text{la}(j)}$ is the equation of the $j$^{th} circle of latitude under the coordinate system ${O}_{\text{la(}j\text{)}}{x}_{\text{la(}j\text{)}}{y}_{\text{la(}j\text{)}}{z}_{\text{la(}j\text{)}}$, and ${e}_{\text{la}(j)}$ represents the elliptic curve of ${C}_{\text{la}(j)}$ in the image, where $j = 1,2,\dots,M$ and $M$ is the number of circles of latitude. The origin and $z$-axis of every ${O}_{\text{li(}i\text{)}}{x}_{\text{li(}i\text{)}}{y}_{\text{li(}i\text{)}}{z}_{\text{li(}i\text{)}}$ coincide with those of ${O}_{\text{t}}{x}_{\text{t}}{y}_{\text{t}}{z}_{\text{t}}$. The coordinate axes of every ${O}_{\text{la(}j\text{)}}{x}_{\text{la(}j\text{)}}{y}_{\text{la(}j\text{)}}{z}_{\text{la(}j\text{)}}$ are parallel to those of ${O}_{\text{t}}{x}_{\text{t}}{y}_{\text{t}}{z}_{\text{t}}$, and their origins differ only by a fixed offset along the $z$-axis.

${q}_{}={[{x}_{},{y}_{},{z}_{},1]}^{T}$ represents the 3D coordinates of an arbitrary point on the target under the target coordinate frame ${O}_{\text{t}}{x}_{\text{t}}{y}_{\text{t}}{z}_{\text{t}}$. ${p}_{}={[{u}_{},{v}_{},1]}^{T}$ represents the undistorted homogeneous image coordinates of $q$ in the image. According to the pinhole camera model, they satisfy the following equation:

$\lambda p=K\left[\begin{array}{cc}R& t\end{array}\right]q,\text{ }K=\left[\begin{array}{ccc}{a}_{x}& \gamma & {u}_{0}\\ 0& {a}_{y}& {v}_{0}\\ 0& 0& 1\end{array}\right]$ (1)

where $\lambda$ is a non-zero scale factor, ${u}_{0}$ and ${v}_{0}$ are the coordinates of the principal point, ${a}_{x}$ and ${a}_{y}$ are the scale factors along the image $u$ and $v$ axes, respectively, and $\gamma$ is the skew of the two image axes. $R$ and $t$ represent the rotation matrix and the translation vector from ${O}_{\text{t}}{x}_{\text{t}}{y}_{\text{t}}{z}_{\text{t}}$ to ${O}_{\text{c}}{x}_{\text{c}}{y}_{\text{c}}{z}_{\text{c}}$, respectively.
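The pinhole relation above can be sketched numerically. A minimal Python/NumPy illustration follows; the intrinsic values mirror the simulation settings of Section 4.1, while the rotation, translation, and test point are arbitrary assumptions for the sketch:

```python
import numpy as np

def project(q, K, R, t):
    """Project a 3D target point q (3-vector) to pixel coordinates
    via the pinhole model p ~ K [R | t] q."""
    pc = R @ q + t          # point expressed in the camera frame
    p = K @ pc              # homogeneous image coordinates
    return p[:2] / p[2]     # normalize by the depth component

# Intrinsic matrix: a_x, a_y on the diagonal, zero skew, principal
# point (u0, v0) in the last column.
K = np.array([[1024.0, 0.0, 400.0],
              [0.0, 960.0, 300.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                       # identity rotation for the sketch
t = np.array([0.0, 0.0, 500.0])     # target 500 mm in front of the camera

uv = project(np.array([10.0, -5.0, 0.0]), K, R, t)
```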

## 3. Method principle

In the calibration process, initial solutions of the intrinsic and extrinsic parameters are obtained from the elliptic curves of the latitude and longitude circles in the image. Then, a nonlinear optimization using the image coordinates of the intersection points of the longitude and latitude lines is adopted to obtain the optimum solutions of the intrinsic and extrinsic parameters.

#### 3.1 Extracting and fitting the image of the elliptic curves of the latitude and longitude circles

The Steger method [25] is used to extract the centers of the latitude and longitude lines on the target in the image, and the extracted points are then linked into line segments. Two arbitrary segments are selected, and the first and last points of one segment are compared with those of the other. If the distance between the two nearest neighboring endpoints is less than 4 pixels, and the included angle between the normal vectors (**n**_{1}, **n**_{2} in Fig. 2(a)) of the segments (s_{1}, s_{2} in Fig. 2(a)) is smaller than a threshold value, the two segments are assigned to the same set of curves. This process is repeated until all segments are assigned to sets of curves. The elliptic curve equation of each set of curves is then obtained by ellipse fitting [26].
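The fitting step can be illustrated with a plain algebraic conic fit. Note that [26] uses a constrained direct ellipse fit, which is more robust to noise, so the unconstrained SVD fit below is only a sketch; the sampled ellipse is synthetic:

```python
import numpy as np

def fit_conic(pts):
    """Algebraic least-squares fit of a conic
    a x^2 + b xy + c y^2 + d x + e y + f = 0 to 2D points, taken as the
    SVD null vector of the design matrix (coefficients up to scale)."""
    x, y = pts[:, 0], pts[:, 1]
    D = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    _, _, Vt = np.linalg.svd(D)
    return Vt[-1]

# Synthetic test curve: ellipse centered at (3, 1), semi-axes 5 and 2,
# rotated by 30 degrees (all values arbitrary).
th = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
c, s = np.cos(np.pi / 6), np.sin(np.pi / 6)
pts = np.column_stack([3 + 5 * c * np.cos(th) - 2 * s * np.sin(th),
                       1 + 5 * s * np.cos(th) + 2 * c * np.sin(th)])
coef = fit_conic(pts)
```

For noise-free points the fitted conic satisfies the ellipse condition $b^2 - 4ac < 0$ and passes through every sample.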

For every elliptic curve, the long-axis length, short-axis length, center, and the angle θ between the long axis and the y-axis in the image (as shown in Fig. 2(b)) are calculated. Based on these parameters, the elliptic curves are divided into two classes: (1) elliptic curves whose angle θ, long-axis length, and center are essentially the same, with relative deviations smaller than a threshold value, which belong to the circles of longitude; and (2) all remaining elliptic curves, which belong to the circles of latitude. The process is shown in Fig. 3.
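The two-class division can be sketched as follows. The tolerance and the ellipse parameter values are illustrative assumptions, not the paper's thresholds:

```python
import numpy as np

def split_longitude_latitude(ellipses, tol=0.05):
    """Classify fitted ellipses (cx, cy, long_axis, short_axis, theta)
    into longitude and latitude curves.  Longitude ellipses share nearly
    the same center, long-axis length, and tilt theta, so the largest
    group of mutually similar ellipses is taken as the longitude class;
    all remaining ellipses are latitude curves.  A toy sketch with a
    single relative tolerance."""
    E = np.asarray(ellipses, dtype=float)
    sig = E[:, [0, 1, 2, 4]]                 # center, long axis, theta
    scale = np.maximum(np.abs(sig).max(axis=0), 1e-9)
    best = np.array([], dtype=int)
    for i in range(len(E)):
        close = np.all(np.abs(sig - sig[i]) / scale < tol, axis=1)
        members = np.flatnonzero(close)
        if len(members) > len(best):
            best = members
    longitude = best.tolist()
    latitude = [i for i in range(len(E)) if i not in longitude]
    return longitude, latitude

# Three longitude-like ellipses (shared center/long axis/theta, varying
# short axis) and two latitude-like ellipses (made-up values).
ellipses = [(400, 300, 200, 60, 0.00),
            (401, 300, 200, 90, 0.01),
            (400, 301, 199, 140, 0.005),
            (400, 200, 150, 80, 1.2),
            (400, 380, 120, 60, 1.4)]
lon, lat = split_longitude_latitude(ellipses)
```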

#### 3.2 Obtaining the linear solutions of the intrinsic and extrinsic parameters

##### 3.2.1 Obtaining the linear solutions of the intrinsic camera parameters

According to the geometric basics of projection, every circle in space intersects the absolute conic (AC) [27] at two imaginary points, $I={(1,i,0,0)}^{\text{T}}$ and $J={(1,-i,0,0)}^{\text{T}}$, also called the circular points (CPs) [27]; the projections of the CPs in the image are called the images of the circular points (ICPs). Circles that are parallel or coplanar intersect the absolute conic at the same two CPs. According to computer vision knowledge [27], the projected curve of the AC in the image is called the Image of Absolute Conic (IAC) $\omega$, which is related to the intrinsic camera parameters by $\omega ={K}^{-\text{T}}{K}^{-1}$; hence, once $\omega$ is known, $K$ can also be obtained.

As shown in Fig. 4(a), all latitude circles on the GST are parallel in space, so the intersection points between any latitude circle image ${e}_{\text{la(}j\text{)}}$ and $\omega$ are the same two ICPs ${m}_{I}$ and ${m}_{J}$ (${m}_{I}$ and ${m}_{J}$ are complex conjugates). The line ${l}_{\text{la}}$ connecting ${m}_{I}$ and ${m}_{J}$ is the vanishing line [27] of the plane of a latitude circle in the image, and all latitude circles share the same vanishing line ${l}_{\text{la}}$. Meanwhile, ${l}_{\text{la}}$ intersects ${e}_{\text{la(}j\text{)}}$ or $\omega$ at ${m}_{I}$ and ${m}_{J}$. As shown in Fig. 4(b), ${l}_{\text{li(}i\text{)}}$ represents the vanishing line of the plane of each longitude circle, and ${l}_{\text{li(}i\text{)}}$ intersects ${e}_{\text{li(}i\text{)}}$ or $\omega$ at two ICPs. If ${l}_{\text{la}}$ and ${l}_{\text{li(}i\text{)}}$ are known, the ICPs on $\omega$ can be obtained as the intersection points between ${l}_{\text{li(}i\text{)}}$ and ${e}_{\text{li(}i\text{)}}$, or between ${l}_{\text{la}}$ and ${e}_{\text{la(}j\text{)}}$.

Assume that ${c}_{\text{la}(j)}={[{x}_{\text{la}(j)},{y}_{\text{la}(j)},{z}_{\text{la}(j)},1]}^{\text{T}}$ is the center of the latitude circle ${C}_{\text{la}(j)}$, ${o}_{\text{la}(j)}={[{u}_{\text{la}(j)},{v}_{\text{la}(j)},1]}^{\text{T}}$ is the undistorted homogeneous coordinate of ${c}_{\text{la}(j)}$ in the image, and the undistorted homogeneous coordinate of the center of the largest latitude circle in the image is ${\tilde{o}}_{\text{la}(j)}$. Likewise, ${c}_{\text{li}(i)}={[{x}_{\text{li}(i)},{y}_{\text{li}(i)},{z}_{\text{li}(i)},1]}^{\text{T}}$ is the center of a longitude circle ${C}_{\text{li}(i)}$, and ${o}_{\text{li}(i)}={[{u}_{\text{li}(i)},{v}_{\text{li}(i)},1]}^{\text{T}}$ is the undistorted homogeneous coordinate of ${c}_{\text{li}(i)}$ in the image. The undistorted homogeneous coordinate of the center of the GST in the image is denoted by $o$; this point is also the point ${o}_{\text{li}(i)}$ corresponding to the center of any longitude circle and the point ${\tilde{o}}_{\text{la}(j)}$ corresponding to the center of the largest latitude circle, namely, ${\tilde{o}}_{\text{la}(j)}={o}_{\text{li}(i)}=o$.

Assume that $E=\left[\begin{array}{ccc}1& 0& -a\\ 0& 1& -b\\ -a& -b& {a}^{2}+{b}^{2}-{r}^{2}\end{array}\right]$ is the matrix expression of a circle in a space plane, whose analytic expression is ${(x-a)}^{2}+{(y-b)}^{2}={r}^{2}$. The homogeneous coordinate of the circle center is $O=\left[\begin{array}{c}a\\ b\\ 1\end{array}\right]$, and Eq. (3) shows that the product of $E$ and $O$ is a homogeneous vector of the line at infinity [27]:

$EO=\lambda {L}_{\infty}$ (3)

where ${L}_{\infty}=\left[\begin{array}{c}0\\ 0\\ 1\end{array}\right]$ is the line at infinity and $\lambda$ is a non-zero factor.
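Equation (3) is easy to check numerically for an arbitrary circle: multiplying $E$ by the homogeneous circle center $O$ yields a multiple of ${L}_{\infty}$, with $\lambda = -r^2$. The center and radius below are arbitrary:

```python
import numpy as np

a, b, r = 2.0, -3.0, 5.0        # arbitrary circle center (a, b) and radius r
E = np.array([[1.0, 0.0, -a],
              [0.0, 1.0, -b],
              [-a, -b, a * a + b * b - r * r]])
O = np.array([a, b, 1.0])       # homogeneous circle center

line = E @ O                    # should equal lambda * [0, 0, 1]^T, lambda = -r^2
```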

Assume that $H$ is the homography matrix from the plane of the circle to the camera image plane. In Eqs. (4)-(6), $e$ is the matrix expression of the circle in the image, ${o}_{c}$ is the image coordinate of $O$, and ${l}_{\infty}$ is the image of ${L}_{\infty}$.

According to Eqs. (4)-(6), Eq. (7) can be obtained,

where ${l}_{\infty}$ is the image of ${L}_{\infty}$, namely, the vanishing line of the plane of the circle in the image. Thus, a pole-polar relation exists between ${o}_{\text{la}(j)}$ and ${l}_{\text{la}}$ with respect to ${e}_{\text{la}(j)}$, as in Eq. (8); by solving Eq. (8), ${l}_{\text{la}}$ is obtained,

where $\lambda$ is a non-zero constant. Given ${\tilde{o}}_{\text{la}(j)}={o}_{\text{li}(i)}=o$, ${l}_{\text{la}}$ can be obtained from Eq. (9), where $\lambda$ is a non-zero constant and ${\tilde{e}}_{\text{la}(j)}$ represents the elliptic curve of the largest circle of latitude in the image. Similarly, a pole-polar relation exists between ${o}_{\text{li}(i)}$ and ${l}_{\text{li(}i\text{)}}$ with respect to ${e}_{\text{li}(i)}$, as shown in Eq. (10),

where $\lambda$ is a non-zero constant. Given ${\tilde{o}}_{\text{la}(j)}={o}_{\text{li}(i)}=o$, the vanishing line ${l}_{\text{li(}i\text{)}}$ can be obtained by Eq. (11). In addition, $o$ in Eqs. (9) and (11) can be obtained using the method in [17], which notes that if there are three spheres in different positions, three equations can be listed as follows:

As shown in Fig. 5, ${C}_{1}$, ${C}_{2}$, ${C}_{3}$ are the contour images of the spheres, ${C}_{1}^{\ast}$, ${C}_{2}^{\ast}$, ${C}_{3}^{\ast}$ are their respective duals, and ${O}_{1}$, ${O}_{2}$, ${O}_{3}$ are the images of the sphere centers. ${l}_{12}$ is an eigenvector of ${C}_{2}{C}_{1}^{\ast}$ corresponding to the eigenvalue ${\beta}_{2}/{\beta}_{1}$, and ${l}_{12}$ can be uniquely identified among the eigenvectors of ${C}_{2}{C}_{1}^{\ast}$ since it is the only line having two intersection points with both conics. ${l}_{13}$ and ${l}_{23}$ can likewise be obtained from ${C}_{3}{C}_{1}^{\ast}$ and ${C}_{2}{C}_{3}^{\ast}$, respectively. Then ${O}_{1}$ can be calculated from the cross product of ${l}_{12}$ and ${l}_{13}$; ${O}_{2}$ and ${O}_{3}$ can be obtained similarly. Thus, $o$ can be obtained from three GSTs.

Because ${l}_{\text{li(}i\text{)}}$ intersects ${e}_{\text{li(}i\text{)}}$ at two ICPs, and ${l}_{\text{la}}$ intersects an arbitrary ${e}_{\text{la}(j)}$ at the same pair of ICPs, shooting at least three longitude circles, or the largest latitude circle combined with two longitude circles, suffices to determine $\omega$, and the camera intrinsic parameters can then be solved by Cholesky decomposition.
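The final Cholesky step can be sketched numerically. Since $\omega = {K}^{-\text{T}}{K}^{-1}$, we have ${\omega}^{-1} = K{K}^{\text{T}}$, and the upper-triangular factor of ${\omega}^{-1}$ with positive diagonal recovers $K$ up to scale. The intrinsic values below are arbitrary illustrations:

```python
import numpy as np

# Illustrative intrinsics (arbitrary values, including a small skew).
K = np.array([[1024.0, 2.0, 400.0],
              [0.0, 960.0, 300.0],
              [0.0, 0.0, 1.0]])
omega = np.linalg.inv(K).T @ np.linalg.inv(K)   # the IAC: omega = K^-T K^-1

# omega^-1 = K K^T, so K is the upper-triangular factor of omega^-1.
# np.linalg.cholesky returns a LOWER-triangular L with M = L L^T, so we
# factor the row/column-reversed matrix and reverse back, a standard
# trick for obtaining an upper-triangular U with U U^T = M.
P = np.fliplr(np.eye(3))                        # exchange (anti-diagonal) matrix
L = np.linalg.cholesky(P @ np.linalg.inv(omega) @ P)
U = P @ L @ P                                   # upper triangular, U U^T = omega^-1
K_rec = U / U[2, 2]                             # normalize so K_rec[2, 2] = 1
```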

##### 3.2.2 Linear solution of extrinsic camera parameters

Let ${v}_{1}$ be the vanishing point in the image of the normal direction of the plane ${O}_{\text{t}}{x}_{\text{t}}{y}_{\text{t}}$, i.e., of the planes on which the circles of latitude of the target lie, and let ${v}_{2}$ be the vanishing point of the normal direction of the plane ${O}_{\text{t}}{x}_{\text{t}}{z}_{\text{t}}$, i.e., of the plane on which a longitude circle of the target lies. ${v}_{1}$ and ${v}_{2}$ can be solved from Eq. (13):

Assume that the unit vectors along the $z$-axis and $y$-axis of ${O}_{\text{t}}{x}_{\text{t}}{y}_{\text{t}}{z}_{\text{t}}$ are ${\overrightarrow{d}}_{1}={[0,0,1]}^{\text{T}}$ and ${\overrightarrow{d}}_{2}={[0,1,0]}^{\text{T}}$, respectively. The relationships between ${v}_{1}$ and ${\overrightarrow{d}}_{1}$, and between ${v}_{2}$ and ${\overrightarrow{d}}_{2}$, are expressed as Eq. (14):

Substituting ${\overrightarrow{d}}_{1}={[0,0,1]}^{\text{T}}$ and ${\overrightarrow{d}}_{2}={[0,1,0]}^{\text{T}}$ into Eq. (14), Eq. (15) can be obtained:

${r}_{1}$, ${r}_{2}$, ${r}_{3}$ are orthogonal unit vectors, thereby yielding Eq. (16):

Then, the rotation matrix $R$ from ${O}_{\text{t}}{x}_{\text{t}}{y}_{\text{t}}{z}_{\text{t}}$ to ${O}_{\text{c}}{x}_{\text{c}}{y}_{\text{c}}{z}_{\text{c}}$ is obtained using Eqs. (13)–(16).
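The recovery of $R$ from the two vanishing points, via $v \propto K R \overrightarrow{d}$, can be sketched synthetically. The intrinsics and rotation below are illustrative; in a real image, the sign of each normalized column must additionally be fixed from scene constraints:

```python
import numpy as np

def rotation_from_vanishing_points(K, v1, v2):
    """Recover R from the vanishing points of the target z-axis (v1)
    and y-axis (v2): v ~ K R d implies K^-1 v ~ R d, so normalizing
    K^-1 v1 and K^-1 v2 gives the third and second columns of R, and
    the first column follows from orthogonality (r1 = r2 x r3)."""
    Kinv = np.linalg.inv(K)
    r3 = Kinv @ v1
    r3 /= np.linalg.norm(r3)
    r2 = Kinv @ v2
    r2 /= np.linalg.norm(r2)
    r1 = np.cross(r2, r3)
    return np.column_stack([r1, r2, r3])

# Synthetic check with illustrative intrinsics and a known rotation.
K = np.array([[1024.0, 0.0, 400.0],
              [0.0, 960.0, 300.0],
              [0.0, 0.0, 1.0]])
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, 0.0, s],
                   [0.0, 1.0, 0.0],
                   [-s, 0.0, c]])                 # rotation about the y-axis
v1 = K @ R_true @ np.array([0.0, 0.0, 1.0])       # vanishing point of z-axis
v2 = K @ R_true @ np.array([0.0, 1.0, 0.0])       # vanishing point of y-axis
R_rec = rotation_from_vanishing_points(K, v1, v2)
```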

As shown in Fig. 6, the expression of the $j$^{th} circle of latitude under its coordinate system ${O}_{\text{la(}j\text{)}}{x}_{\text{la(}j\text{)}}{y}_{\text{la(}j\text{)}}{z}_{\text{la(}j\text{)}}$ is assumed to be ${C}_{\text{la}(j)}=\left[\begin{array}{ccc}1& 0& 0\\ 0& 1& 0\\ 0& 0& -{d}_{j}^{2}\end{array}\right]$, where ${d}_{j}$ represents the radius of the $j$^{th} circle of latitude. For the $j$^{th} circle of latitude ${C}_{\text{la}(j)}$, the coordinate system ${O}_{\text{la(}j\text{)}}{x}_{\text{la(}j\text{)}}{y}_{\text{la(}j\text{)}}{z}_{\text{la(}j\text{)}}$ is parallel to the target coordinate system ${O}_{\text{t}}{x}_{\text{t}}{y}_{\text{t}}{z}_{\text{t}}$, and only the fixed value $D$ separates them in the $z$ direction. The value $D$ represents an inherent dimension of the target and is a known quantity. Therefore, the rotation matrix $\tilde{R}$ from ${O}_{\text{la(}j\text{)}}{x}_{\text{la(}j\text{)}}{y}_{\text{la(}j\text{)}}{z}_{\text{la(}j\text{)}}$ to ${O}_{\text{c}}{x}_{\text{c}}{y}_{\text{c}}{z}_{\text{c}}$ and the rotation matrix $R$ from ${O}_{\text{t}}{x}_{\text{t}}{y}_{\text{t}}{z}_{\text{t}}$ to ${O}_{\text{c}}{x}_{\text{c}}{y}_{\text{c}}{z}_{\text{c}}$ are the same, but the translation vectors $\tilde{t}$ and $t$ are different, with $\tilde{t}=t+R{\left[\begin{array}{ccc}0& 0& D\end{array}\right]}^{\text{T}}$, as shown in Fig. 6. The sign of $D$ should be noted: it is positive on the positive side of the ${z}_{\text{t}}$-axis and negative on the negative side. According to the foundations of multi-view geometry [27], the relationship between ${C}_{\text{la}(j)}$ and ${e}_{\text{la}(j)}$ is as below:

where $\lambda$ is a non-zero constant, and $H$ represents the homography matrix from the plane of the $j$^{th} circle of latitude to the camera image plane.

By solving Eq. (17):

By consolidating Eq. (18):

Equation (19) can be obtained from an arbitrary circle of latitude. From the projected curves of any three circles of latitude on the target, a set of no fewer than three equations similar to Eq. (19) is obtained. Using the least-squares method, the translation vector $t$ from ${O}_{\text{t}}{x}_{\text{t}}{y}_{\text{t}}{z}_{\text{t}}$ to ${O}_{\text{c}}{x}_{\text{c}}{y}_{\text{c}}{z}_{\text{c}}$ can be solved.

#### 3.3 Non-linear Optimization

The 3D coordinates of the intersection points on the GST under the target coordinate system are given. The intersection points are applied to solve the camera intrinsic parameters, thereby improving the calibration accuracy.

The nonlinear optimization is a two-stage process. First, refined solutions of the intrinsic and extrinsic parameters are obtained from the curves of the longitude and latitude lines, and the serial numbers of the intersection points are identified by perspective projection positioning. Then, the identified intersection points of the latitude and longitude lines are introduced into the objective function, and the optimum solutions of the intrinsic and extrinsic parameters are obtained by nonlinear optimization.

At the first stage, the objective function is established by minimizing the difference between the projected curve of each circle of latitude and the curve fitted to the extracted image points after image distortion correction, as expressed in Eq. (20):

$l = 1,2,3,\dots,L$, where $L$ represents the number of times the target is placed; $N$ represents the total number of elliptic curves of latitude circles extracted at every position of the target; ${e}_{\text{la(}jl)}$ represents the elliptic curve corresponding to the $j$^{th} latitude circle in the image acquired at the $l$^{th} placement of the target, fitted after image distortion correction; ${H}_{\text{la(}jl)}$ represents the homography matrix from the plane of the $j$^{th} latitude circle to the camera image plane; and ${C}_{\text{la(}j)}$ represents the expression of the $j$^{th} latitude circle under the coordinate system ${O}_{\text{la(}j\text{)}}{x}_{\text{la(}j\text{)}}{y}_{\text{la(}j\text{)}}{z}_{\text{la(}j\text{)}}$.

The 3D coordinates of the intersection points on the GST under the target coordinate system are given, and solving the intrinsic camera parameters according to the intersection points may improve the calibration accuracy. Therefore, the identified intersection points of the latitude and longitude lines are introduced into the objective function for optimization, and the optimum solutions of the intrinsic and extrinsic parameters are obtained by nonlinear optimization.

The objective function for optimization is established with the minimum difference between the projected point of the intersection point on the target and the image intersection point after image distortion correction, as shown in Eq. (21):

where $n$ is the number of extracted intersection points at every position of the target, $m$ represents the number of times the target is placed, and $d({\widehat{p}}_{ij},{p}_{ij})$ is the distance between ${\widehat{p}}_{ij}$ and ${p}_{ij}$.

According to the objective function in Eq. (21), the nonlinear optimum solutions of the intrinsic and extrinsic parameters are obtained by the Levenberg-Marquardt (LM) method.
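The LM refinement can be sketched with a minimal numeric-Jacobian loop. This simplified version optimizes only four intrinsics for a single pose and omits distortion, whereas Eq. (21) jointly optimizes the distortion coefficients and the extrinsics of every placement; all numerical values are synthetic:

```python
import numpy as np

def project(params, R, t, Q):
    """Pinhole projection of 3D points Q (N x 3) with intrinsics
    params = (fx, fy, u0, v0); distortion is omitted in this sketch."""
    fx, fy, u0, v0 = params
    K = np.array([[fx, 0.0, u0], [0.0, fy, v0], [0.0, 0.0, 1.0]])
    P = (K @ (R @ Q.T + t[:, None])).T
    return P[:, :2] / P[:, 2:]

def refine_lm(params, R, t, Q, obs, iters=50, mu=1e-3):
    """Minimal Levenberg-Marquardt loop with forward-difference
    Jacobians, minimizing the reprojection residual against the
    observed image points obs."""
    p = np.asarray(params, dtype=float).copy()
    for _ in range(iters):
        r = (project(p, R, t, Q) - obs).ravel()
        J = np.empty((r.size, p.size))
        for k in range(p.size):              # numeric Jacobian column k
            dp = np.zeros_like(p)
            dp[k] = 1e-4
            J[:, k] = ((project(p + dp, R, t, Q) - obs).ravel() - r) / 1e-4
        A = J.T @ J + mu * np.eye(p.size)    # damped normal equations
        step = np.linalg.solve(A, -J.T @ r)
        if np.linalg.norm(step) < 1e-10:
            break
        p += step
    return p

# Synthetic intersection points on a plane, observed without noise.
rng = np.random.default_rng(0)
Q = np.column_stack([rng.uniform(-40, 40, (30, 2)), np.zeros(30)])
R, t = np.eye(3), np.array([0.0, 0.0, 500.0])
true = np.array([1024.0, 960.0, 400.0, 300.0])
obs = project(true, R, t, Q)

# Start from a perturbed guess; LM drives it back to the true intrinsics.
refined = refine_lm(true + np.array([30.0, -25.0, 8.0, -6.0]), R, t, Q, obs)
```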

## 4. Experimental results

The effectiveness of the method is demonstrated in simulation and physical experiments. The simulation experiments focus on analyzing the root-mean-square errors (RMSEs) of the results, relative to the true values, of the GST-based method at different noise levels, compared with those of Zhang’s camera calibration method (Zhang’s) and the spherical contour (SC)-based method. The physical experiments compare the GST-based and SC-based methods against the results of Zhang’s.

#### 4.1 Simulation experiments

Simulation experiments are designed to analyze the calibration accuracy of the GST-based method, Zhang’s, and the SC-based method at different noise levels. The GST has a diameter of 120 mm and longitude/latitude intervals of 20/15 degrees; 6 × 6 feature points on it are selected for nonlinear optimization. Zhang’s adopts a planar target of 6 × 6 feature points, with a horizontal spacing of 10 mm and a vertical spacing of 16 mm between adjacent feature points. The diameter of the SC target is 100 mm, and the clear imaging range of the camera is 180–230 mm from the lens. The targets are placed within this range, and the intrinsic parameter matrix of the camera is $K=\left[\begin{array}{ccc}1024& 0& 400\\ 0& 960& 300\\ 0& 0& 1\end{array}\right]$. The lens distortion coefficients are ${k}_{1}=-0.1$ and ${k}_{2}=0.08$.

For Zhang’s and the GST-based method, the targets are each placed 10 times. The SC-based method adopts the SDP method in [28], by which the camera intrinsic parameters are solved from 10 spherical targets at different positions. Each method is tested 100 times at each noise level. Noise is added to the ideal simulated extracted points on the latitude and longitude circles in the image, so the calibration results in the simulation include the effect of the curve fitting method under noise.

The three calibration methods are assessed according to the camera calibration results, RMSEs relative to the true value, and reprojection errors of the feature points of the target on the image plane.

In Fig. 7, the RMSEs of ${f}_{x}$, ${f}_{y}$, ${u}_{0}$, and ${v}_{0}$ obtained by the SC-based method are about two to three times greater than those of Zhang’s and the GST-based method, while the RMSEs of ${f}_{x}$, ${f}_{y}$, ${u}_{0}$, and ${v}_{0}$ by Zhang’s and the proposed method are basically the same. In terms of the RMSEs of ${k}_{1}$ and ${k}_{2}$, the GST-based method is slightly poorer than Zhang’s: the RMSEs obtained by Zhang’s are about 2/3 of those of the proposed method.

The reprojection error of the target feature points on the image plane is the error between the feature points projected onto the image coordinate system, on the basis of the calibration results and the camera model, and the coordinates of the feature points extracted in the image. Figure 8 shows the reprojection errors in the $u$ and $v$ directions under the image coordinate system obtained by the GST-based method and Zhang’s, where $u$ is the $x$-coordinate and $v$ is the $y$-coordinate under the image coordinate system. In Fig. 8, similar to Zhang’s method, the reprojection error of the GST-based method increases linearly with the noise level.
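The reprojection-error statistic described above amounts to an RMSE over the point-wise differences; a small helper (the function name is hypothetical) makes the definition concrete:

```python
import numpy as np

def reprojection_rmse(projected, extracted):
    """RMSE of the reprojection error, per image axis and overall.
    projected: feature points projected through the calibrated camera
    model (N x 2); extracted: the same points extracted from the image."""
    d = projected - extracted
    rmse_u = float(np.sqrt(np.mean(d[:, 0] ** 2)))    # u-direction RMSE
    rmse_v = float(np.sqrt(np.mean(d[:, 1] ** 2)))    # v-direction RMSE
    rmse = float(np.sqrt(np.mean(np.sum(d ** 2, axis=1))))
    return rmse_u, rmse_v, rmse

# Toy check with made-up residuals of (3, 4) and (0, 0) pixels.
ru, rv, r = reprojection_rmse(np.array([[3.0, 4.0], [0.0, 0.0]]),
                              np.zeros((2, 2)))
```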

#### 4.2 Physical experiments

In the physical experiments, the effectiveness of the GST is validated by comparing the calibration errors of the GST and SC-based methods relative to Zhang’s. The focal length of the camera lens is 16 mm, with a clear imaging range of 300–400 mm and an image resolution of 1628 × 1236 pixels. The planar checkerboard target has 10 × 10 feature points, with a horizontal/vertical spacing of 17 mm between adjacent feature points and a target accuracy of 0.1 mm. The diameter of the GST is 100 mm, and the longitude interval is 20 degrees. The latitude lines are arranged on parallel latitude planes with the same vertical interval of 10 mm, and the accuracy is 0.1 mm. The diameter of the spherical target in the SC-based method is 50 mm.

The physical site map is shown in Figs. 9(a)-9(c), where Fig. 9(a) represents a camera using the checkerboard target for calibration, Fig. 9(b) represents a camera using the GST for calibration, and Fig. 9(c) represents a camera using the spherical target for calibration.

The Steger method has been used in [29–31]. In [31], it is applied to rail wear measurement, and Section 4.1 of [29] shows that the light stripe can be extracted with similar accuracy in good, dim, and strong lighting environments. To verify this further, Figs. 10(a) and 10(b) show two sets of experiments demonstrating curve fitting under different lighting conditions. The points on the latitude and longitude lines in the image are first extracted by the Steger method and then fitted by the curve fitting method. Figures 10(a) and 10(b) show that both sets of experiments achieve the same curve fitting results under uniformly dark and uniformly bright lighting. Under non-uniform lighting, however, the curves cannot be fitted well because fewer points are extracted on the latitude or longitude circles. Therefore, in an actual calibration scene, the lighting should be relatively uniform to obtain high-accuracy curve fitting.

During the experiment, the planar target, GST, and spherical target are each placed 10 times, at distances of about 300–400 mm from the lens. Zhang’s method directly adopts the Matlab toolbox in [32] for calibration. Given the field of view (F.O.V.) of the camera, 6 × 7 or 7 × 7 feature points of the planar target are extracted from the image at every position (as shown in Fig. 11(a)). For the GST method, 64 or 72 intersection points are extracted at every position from the extracted light stripes; the target image is shown in Fig. 11(b). For the spherical target, the contour is extracted by the Canny algorithm [33], the SDP method in [17] is adopted, and the solution is obtained from 10 spheres; the target image is shown in Fig. 11(c).

Owing to the high machining cost of a high-accuracy GST, a GST with an accuracy of 0.1 mm is used here to verify the effectiveness of the proposed method. Accordingly, a checkerboard target with the same accuracy is chosen as the planar target for the contrast test.

The calibration results of the experiments are shown in Table 1. Zhang’s method is extensively applied and highly accurate; therefore, taking its results as the true values, the calibration errors of the GST and SC-based methods relative to Zhang’s are analyzed.

The GST-based method and Zhang’s are further analyzed in terms of reprojection errors. As shown in Figs. 12(a) and 12(b), their reprojection errors are 0.21 pixels. Figures 12(c) and 12(d) show the statistical distributions of the reprojection errors of the two methods: the reprojection error of Zhang’s lies essentially within 0–0.5 pixels, whereas that of the GST-based method lies within 0–0.8 pixels. However, the reprojection error of the GST-based method is concentrated more within the range of 0–0.2 pixels than that of Zhang’s.

To further verify the effectiveness of the proposed method, the three methods are tested repeatedly. In these experiments, the spherical target, the planar target, and the GST are each placed 10 times, with a target image captured at each position. From the 10 images, 10 sets of images are chosen for camera calibration, each set containing seven images. Figures 13(a)–13(j) show the box charts and fitted curves of the statistical analysis of the calibration results of the 10 image sets using the planar target, the GST, and the SC.
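The subset selection and spread statistics in this repeatability test can be sketched as below. This is an illustrative outline only (not the authors' code): it forms 7-image subsets of the 10 captured views and summarizes the dispersion of a calibration parameter, such as the effective focal length, across repeated runs.

```python
import random
import statistics
from itertools import combinations

def repeatability(estimates):
    """Mean and sample standard deviation of repeated parameter estimates,
    i.e., the center and spread summarized by each box chart."""
    return statistics.mean(estimates), statistics.stdev(estimates)

# Every possible 7-image subset of the 10 captured views.
all_sets = list(combinations(range(10), 7))
# The experiment evaluates 10 such sets; here they are drawn at random.
chosen = random.sample(all_sets, 10)
```

Calibrating once per chosen set and passing the resulting estimates to `repeatability` yields the means and dispersions compared in Fig. 13.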

Figures 13(a) and 13(b) show the statistical results of the effective focal length for the three methods. To clarify the results, Figs. 13(d) and 13(e) show a comparison between Zhang's method and the GST with their respective results enlarged. In the figures, the black asterisks indicate the results of the 10 sets of images for each method. First, along the ordinate direction, the box of the SC is the longest among the three methods, indicating the highest data dispersion, and the mean values (marked with purple dotted horizontal lines) of the SC-based results differ greatly from those of the other two methods. As shown in Figs. 13(d) and 13(e), along the ordinate direction, the box length of the GST is basically the same as that of Zhang's. According to the fitted normal distribution curves, their mean values are separated by 2.5 and 10 pixels, respectively, but their degrees of dispersion are similar. Figures 13(c) and 13(g) compare the principal point coordinates obtained by the three methods. The dispersion of the SC-based method is high, and its mean values deviate more than those of the GST and Zhang's. According to the normal distributions and boxes in Fig. 13(f), the mean values of *u*_{0} from Zhang's method and the GST differ by no more than five pixels. The normal distributions and boxes in Fig. 13(i) are similar in the dispersion of *v*_{0}, but a large difference of about 20 pixels is observed between their mean values. Figure 13(h) shows the statistical results of *r* obtained by the three methods; the dispersion and deviation of the SC-based method are larger than those of the other two methods. In Fig. 13(j), the mean values of *r* obtained by Zhang's method and the GST differ by about 2.5, but they are similar in dispersion.

Figures 14(a) and 14(b) show a comparison of the reprojection errors in the *u* and *v* directions. In the *v* direction, the mean reprojection errors of the GST and Zhang's method differ by 0.05 pixels, and the dispersion obtained by the GST is lower than that of Zhang's. In the *u* direction, the mean reprojection errors differ by 0.015 pixels, and the dispersion obtained by the GST is slightly lower. Thus, the repeatability of the calibration results shows that the GST method performs similarly to Zhang's method.

## 5. Conclusions

This paper presents a camera calibration method based on the GST. By combining the latitude and longitude circles with their intersection points, this method calibrates the intrinsic and extrinsic parameters. To account for target machining errors, the 3D coordinates of the intersection points are introduced into the nonlinear optimization as parameters to be optimized, thereby further improving the calibration accuracy. The proposed method combines the strengths of spherical target-based and planar target-based calibration while overcoming their respective shortcomings, and it achieves the same calibration accuracy as Zhang's method. In physical and simulation experiments, this method is compared with the current mainstream calibration methods, namely, Zhang's method and the SC-based method, and the results prove its effectiveness. Moreover, the GST introduced in this method has important application value in the calibration of vision measuring systems, including structured-light vision sensors and multi-vision sensors without a common F.O.V.

## Funding

National Natural Science Foundation of China (NSFC) (51575033), National Key Scientific Instrument and Equipment Development Projects of China (2012YQ140032) and Central University in Beijing Major Achievements Transformation Project (Online Dynamic Detection System for Train Pantograph and Catenary).

## References and links

**1. **A. S. Poulin-Girard, S. Thibault, and D. Laurendeau, “Influence of camera calibration conditions on the accuracy of 3D reconstruction,” Opt. Express **24**(3), 2678–2686 (2016). [CrossRef] [PubMed]

**2. **B. Sun, J. G. Zhu, L. H. Yang, S. R. Yang, and Z. Y. Niu, “Calibration of line-scan cameras for precision measurement,” Appl. Opt. **55**(25), 6836–6843 (2016). [CrossRef] [PubMed]

**3. **W. M. Li, S. Y. Shan, and H. Liu, “High-precision method of binocular camera calibration with a distortion model,” Appl. Opt. **56**(8), 2368–2377 (2017). [CrossRef] [PubMed]

**4. **R. Y. Tsai, “A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses,” IEEE J. Robot. Autom. **3**(4), 323–344 (1987). [CrossRef]

**5. **J. Heikkila, “Geometric camera calibration using circular control points,” IEEE Trans. Pattern Anal. Mach. Intell. **22**(10), 1066–1077 (2000). [CrossRef]

**6. **J. Y. Weng, P. Cohen, and M. Herniou, “Camera calibration with distortion model and accuracy evaluation,” IEEE Trans. Pattern Anal. Mach. Intell. **14**(10), 965–980 (1992). [CrossRef]

**7. **Z. Y. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. Pattern Anal. Mach. Intell. **22**(11), 1330–1334 (2000). [CrossRef]

**8. **Z. Liu, Q. Wu, X. Chen, and Y. Yin, “High-accuracy calibration of low-cost camera using image disturbance factor,” Opt. Express **24**(21), 24321–24336 (2016). [CrossRef] [PubMed]

**9. **J. S. Kim, P. Gurdjos, and I. S. Kweon, “Geometric and algebraic constraints of projected concentric circles and their applications to camera calibration,” IEEE Trans. Pattern Anal. Mach. Intell. **27**(4), 637–642 (2005). [CrossRef] [PubMed]

**10. **Y. H. Wu, X. J. Li, F. C. Wu, and Z. Y. Hu, “Coplanar circles, quasi-affine invariance and calibration,” Image Vis. Comput. **24**(4), 319–326 (2006). [CrossRef]

**11. **D. Douxchamps and K. Chihara, “High-accuracy and robust localization of large control markers for geometric camera calibration,” IEEE Trans. Pattern Anal. Mach. Intell. **31**(2), 376–383 (2009). [CrossRef] [PubMed]

**12. **Z. Y. Zhang, “Camera calibration with one-dimensional objects,” IEEE Trans. Pattern Anal. Mach. Intell. **26**(7), 892–899 (2004). [CrossRef] [PubMed]

**13. **F. C. Wu, Z. Y. Hu, and H. J. Zhu, “Camera calibration with moving one-dimensional objects,” Pattern Recognit. **38**(5), 755–765 (2005). [CrossRef]

**14. **F. Qi, Q. H. Li, Y. P. Luo, and D. C. Hu, “Camera calibration with one-dimensional objects moving under gravity,” Pattern Recognit. **40**(1), 343–345 (2007). [CrossRef]

**15. **M. Agrawal and L. Davis, “Complete camera calibration using spheres: a dual-space approach,” Analysis Int. Math. J. Analysis Appl. **34**(3), 257–282 (2003).

**16. **H. Teramoto and G. Xu, “Camera calibration by a single image of balls: from conic to the absolute conic,” in Proceedings of the Asian Conference on Computer Vision (ACCV, 2002), pp. 499-506.

**17. **H. Zhang, K.-Y. K. Wong, and G. Zhang, “Camera calibration from images of spheres,” IEEE Trans. Pattern Anal. Mach. Intell. **29**(3), 499–503 (2007). [CrossRef] [PubMed]

**18. **X. H. Ying and H. B. Zha, “Geometric interpretations of the relation between the image of the absolute conic and sphere images,” IEEE Trans. Pattern Anal. Mach. Intell. **28**(12), 2031–2036 (2006). [CrossRef] [PubMed]

**19. **K.-Y. K. Wong, G. Q. Zhang, and Z. H. Chen, “A stratified approach for camera calibration using spheres,” IEEE Trans. Image Process. **20**(2), 305–316 (2011). [CrossRef] [PubMed]

**20. **M. H. Ruan and D. Huber, “Calibration of 3D Sensors Using a Spherical Target,” International Conference on 3D Vision (IEEE, 2015), pp. 187–193.

**21. **J. H. Sun, H. B. He, and D. B. Zeng, “Global Calibration of Multiple Cameras Based on Sphere Targets,” Sensors (Basel) **16**(1), 77 (2016). [CrossRef] [PubMed]

**22. **K. K. Wong, P. R. S. Mendonca, and R. Cipolla, “Camera calibration from surfaces of revolution,” IEEE Trans. Pattern Anal. Mach. Intell. **25**(2), 147–161 (2003). [CrossRef]

**23. **Y. H. Wu, H. J. Zhu, Z. Y. Hu, and F. C. Wu, “Camera Calibration from the Quasi-Affine invariance of two parallel circles,” in Proceedings of European Conference on Computer Vision (ECCV, 2004), pp.190–202.

**24. **C. Colombo, D. Comanducci, and A. D. Bimbo, “Camera calibration with two arbitrary coaxial circles,” in Proceedings of European Conference on Computer Vision (ECCV, 2006), pp. 265–276. [CrossRef]

**25. **C. Steger, “An unbiased detector of curvilinear structures,” IEEE Trans. Pattern Anal. Mach. Intell. **20**(2), 113–125 (1998). [CrossRef]

**26. **R. Halir and J. Flusser, “Numerically stable direct least squares fitting of ellipses,” in Proceedings of the 6th International Conference in Central Europe on Computer Graphics and Visualization (WSCG, 1998).

**27. **R. I. Hartley and A. Zisserman, *Multiple View Geometry in Computer Vision* (Cambridge University Press, 2000).

**28. **M. Agrawal and L. S. Davis, “Camera calibration using spheres: a semi-definite programming approach,” in Proceedings of the IEEE International Conference on Computer Vision (IEEE, 2003), pp.782–789. [CrossRef]

**29. **Z. Liu, X. J. Li, and Y. Yin, “On-site calibration of line-structured light vision sensor in complex light environments,” Opt. Express **23**(23), 29896–29911 (2015). [CrossRef] [PubMed]

**30. **Z. Liu, X. J. Li, F. J. Li, and G. J. Zhang, “Calibration method for line-structured light vision sensor based on a single ball target,” Opt. Lasers Eng. **69**, 20–28 (2015). [CrossRef]

**31. **Z. Liu, F. J. Li, B. K. Huang, and G. J. Zhang, “Real-time and accurate rail wear measurement method and experimental analysis,” J. Opt. Soc. Am. A **31**(8), 1721–1729 (2014). [CrossRef] [PubMed]

**32. **J. Y. Bouguet, “The MATLAB open source calibration toolbox,” http://www.vision.caltech.edu/bouguetj/calib_doc/.

**33. **J. Canny, “A computational approach to edge detection,” IEEE Trans. Pattern Anal. Mach. Intell. **8**(6), 679–698 (1986). [CrossRef] [PubMed]