Abstract

This paper presents a new linear framework that obtains 3D scene reconstruction and camera calibration simultaneously from uncalibrated images using scene geometry. Our strategy exploits constraints of parallelism, coplanarity, collinearity, and orthogonality, which are frequently available in general man-made scenes. This approach gives more stable results with fewer images and obtains them with only linear operations. We show that all the geometric constraints used independently in previous works can be implemented easily in the proposed linear method. We also study situations that cannot be handled by previous approaches and show that the proposed method, which can handle these cases, is more flexible in use. The proposed method uses a stratified approach in which affine reconstruction is performed first and then upgraded to metric reconstruction. In this procedure, the additional constraints newly derived in this paper play an important role in affine reconstruction in practical situations.

© 2013 OSA

1. Introduction

Developing a method that obtains camera calibration and 3D scene reconstruction simultaneously from images is one of the important topics in the field of computer vision, and many approaches to these problems have been proposed.

First, classical approaches utilize known 3D positions of points, usually printed on an accurately manufactured calibration rig [1–3]. However, such position information can be obtained only through specific acquisition systems for large 3D scenes and is rarely available in general situations. Second, camera calibration and scene reconstruction can be acquired simultaneously from the image sequences alone, using only constraints on the camera intrinsic parameters. This approach is known as self-calibration and provides great flexibility [4, 5]. However, to achieve high quality, a lengthy image sequence and a set of many accurate feature correspondences are necessary. Third, the information from restricted camera motion can be used to calibrate cameras [6–8]. Finally, there have been many methods that use the properties of scene geometry or special objects [9–19]. As these methods do not require a calibration rig but instead use the geometry of a scene or the shapes of objects as one, accurate calibration results can be acquired without a lengthy image sequence. The proposed algorithm belongs to this last category.

In this paper, we propose a linear stratified approach that can use full geometric constraints. The method obtains affine structure from parallelograms in general position, and all available geometric information, including constraints from partial knowledge of the camera parameters, can be used to upgrade the result to a metric one. The parallelograms need not be part of a parallelepiped. To cope with parallelograms in general position in practical situations, we extract additional constraints that come from parallelograms and play an important role in the affine reconstruction step. It can easily be verified that all of the constraints used in the previous related works can be implemented in the proposed method. Detailed comparisons between the proposed method and the previous ones are also presented in this paper.

2. Related works

Many methods using geometric constraints of a scene have been introduced. The work most closely related to the present paper was presented by Wilczkowiak et al., who suggested an elegant linear formulation for camera calibration and 3D modeling using parallelepipeds in a scene [13]. A non-linear optimization method using differential evolution can also be applied to a cuboid, which is a special case of a parallelepiped [16]. For these methods, at least six vertices of a parallelepiped must be visible in all views to obtain an affine or metric reconstruction. However, the parallelepiped is not the only primitive that can give full affine information. Parallelograms are the most general primitives in man-made environments, such as architecture, but pairs of parallelograms do not always form two faces of a parallelepiped in a scene.

It is also possible to obtain a projective reconstruction linearly from at least one plane visible in all views [12]. However, to obtain a metric reconstruction linearly, the method requires that the plane be the plane at infinity, determined from three orthogonal vanishing points as in [9]. Given calibration and rotation parameters, a linear estimation method using the metric information of scene geometry was also suggested in [14]. Quadratic constraints from coplanar parallelograms were discovered for camera calibration in [15]; these constraints can be converted to linear constraints given the metric information of the parallelograms. There have also been many methods using other constraints of scene geometry: symmetric polyhedra [17], ellipses [18], and spheres [19]. If there are only symmetric polyhedra and no parallelograms in a scene, the method described in [17] is a good alternative to ours. However, that method assumes the simplest camera model, in which only the focal length is unknown, and needs a non-linear optimization approach initialized by solving quadratic equations.

As mentioned above, in-depth comparisons between the proposed method and the previous ones are presented in the relevant sections below.

3. The infinity homography from parallelism and coplanarity

3.1. Preliminaries and motivations

In general, it is possible to estimate the infinite homography, $H_\infty$, with at least four vanishing point correspondences in two views by solving the relevant linear equations of

$$v_i' \cong H_\infty v_i, \quad \text{for } i = 1, \cdots, 4, \qquad (1)$$
where $v_i$ and $v_i'$ are the homogeneous coordinates of corresponding vanishing points in the two views and '$\cong$' denotes equality up to scale. A vanishing point can be estimated as the intersection of imaged lines that are parallel in 3D space. One of the vanishing points can be replaced by the epipole relating the two views, as the epipole is mapped between the views by the infinite homography [20].
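As a small illustration of this preliminary step, a vanishing point can be computed with homogeneous cross products: the line through two image points is their cross product, and the intersection of two lines is again a cross product. A minimal numpy sketch (the function names and coordinates are illustrative, not from the paper):

```python
import numpy as np

def line_through(p, q):
    # Homogeneous line through two image points given as (x, y).
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def vanishing_point(seg1, seg2):
    # Intersection of two imaged lines that are parallel in 3D.
    return np.cross(line_through(*seg1), line_through(*seg2))

# Two image segments drawn so that their supporting lines meet at (4, 3).
v = vanishing_point(((0., 0.), (4., 3.)), ((0., 1.), (2., 2.)))
v = v / v[2]  # normalize; for lines parallel in the image, v[2] would be 0
```

If the imaged lines are parallel in the image as well, the third homogeneous coordinate of the result is zero, i.e., the vanishing point is at infinity.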

Two sets of parallel lines with different directions form an imaged parallelogram in a camera image. If the two sets of parallel lines lie on the same supporting plane in 3D, the imaged parallelogram corresponds to an actual parallelogram on that plane (see Fig. 1(a)). This is common in typical man-made scenes. However, an imaged parallelogram consisting of two sets of parallel lines on different supporting planes can also exist, as in Figs. 1(b) and 1(c). In this case, the parallelogram does not actually exist in 3D. The proposed framework for computing $H_\infty$ utilizes the coplanarity constraint related to the former case.

 

Fig. 1 Examples of imaged parallelograms in camera images due to the two sets of parallel lines with different directions: Only (a) depicts the imaged parallelogram corresponding to an actual parallelogram existing on a plane in 3D.


3.2. Image rectification for a novel framework

The rectification homographies $H_{ri}$, for i = 1, 2, are defined here for two views. A rectification homography maps an imaged parallelogram to a rectangle whose edges are parallel to the vertical and horizontal axes of the image plane (see Fig. 2). The position and shape of the rectangle can be chosen arbitrarily. A rectification homography is allocated to each imaged parallelogram in each view and is easily computed from the four vertex correspondences.
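Since the four vertex correspondences determine the rectification homography exactly, it can be computed with the standard direct linear transformation (DLT). A minimal numpy sketch (the function name and sample coordinates are illustrative, not from the paper):

```python
import numpy as np

def rectification_homography(quad, rect):
    """DLT from the 4 vertex correspondences quad[i] -> rect[i]."""
    rows = []
    for (x, y), (u, v) in zip(quad, rect):
        rows.append([-x, -y, -1., 0., 0., 0., u*x, u*y, u])
        rows.append([0., 0., 0., -x, -y, -1., v*x, v*y, v])
    # The homography is the null vector of the 8x9 system.
    return np.linalg.svd(np.asarray(rows))[2][-1].reshape(3, 3)

# An imaged parallelogram (illustrative pixel coordinates) mapped to an
# axis-aligned unit square; the target rectangle is an arbitrary choice.
quad = np.array([[10., 20.], [110., 35.], [130., 140.], [15., 120.]])
rect = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
Hr = rectification_homography(quad, rect)
```

Because the position and shape of the target rectangle are free parameters in the proposed framework, the unit square is as good a choice as any.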

 

Fig. 2 Relationship between the original $H_\infty$ and the newly defined $H_{\infty r}$, which is the infinite homography between the rectified images.


The image transformed by a rectification homography can be considered an image on a new camera image plane. A rectified camera is defined as the camera having this new image; it can be regarded as a virtual camera obtained by rotating the original camera about its optical center and changing the intrinsic parameters. It is worthwhile to note that the image plane of a rectified camera is parallel to the related parallelogram in 3D.

In the images transformed by $H_{ri}$, the two vanishing points deduced from the transformed parallelogram, which is a rectangle, are $[1, 0, 0]^T$ and $[0, 1, 0]^T$ in homogeneous coordinates without any calculation, because the parallel lines intersect at infinity (see Fig. 2).

Since the vanishing points deduced from the rectangle are transformations of the original vanishing points, it is possible to conclude that the infinite homography, $H_{\infty r}$, between the rectified cameras satisfies the following equations:

$$\lambda_x [1\ 0\ 0]^T = H_{\infty r} [1\ 0\ 0]^T \qquad (2)$$
and
$$\lambda_y [0\ 1\ 0]^T = H_{\infty r} [0\ 1\ 0]^T, \qquad (3)$$
where $\lambda_x$ and $\lambda_y$ are arbitrary scale factors. From Eqs. (2) and (3), $H_{\infty r}$ has the following form:
$$H_{\infty r} = \begin{bmatrix} \lambda_x & 0 & u \\ 0 & \lambda_y & v \\ 0 & 0 & w \end{bmatrix}, \qquad (4)$$
where u, v, and w are arbitrary scalar values.

From Fig. 2, the relationship between the original $H_\infty$ and the newly defined $H_{\infty r}$ is as follows:

$$H_\infty \cong H_{r2}^{-1} H_{\infty r} H_{r1}. \qquad (5)$$
This gives equations relating the elements of $H_\infty$ and $H_{\infty r}$.

As Eq. (5) can be obtained for each parallelogram (each pair of sets of parallel lines), the number of equations constraining the variables is 9m + 1 (the 1 is for scale factor determination), where m is the number of parallelograms. Since the number of variables is 9 for $H_\infty$ and 5m for the $H_{\infty r}^k$ (k = 1, ⋯, m), 9m + 1 ≥ 9 + 5m is required for the estimation to be possible. Thus, the required number of parallelograms is m ≥ 2. It is worthwhile to note that this requirement is identical to that of the method using vanishing point correspondences, because two parallelograms give four vanishing points. It can be concluded that, in the new framework, there is no change in the number of constraints. Accordingly, although Eq. (5) can be used to compute $H_\infty$, no advantage is gained without the additional coplanarity constraint motivated in Section 3.1.

3.3. An additional constraint from an actual parallelogram

This section verifies that one additional constraint applies to $H_{\infty r}$ when all lines in the two sets of parallel lines related to the imaged parallelogram lie on the same plane in 3D.

First, the camera calibration matrix of a rectified camera is deduced. The camera calibration matrix is

$$K = \begin{bmatrix} f_u & s & u_0 \\ 0 & f_v & v_0 \\ 0 & 0 & 1 \end{bmatrix},$$
where fu and fv denote the focal length expressed in horizontal and vertical pixel dimensions, respectively, s is the skew parameter, and (u0, v0) are the pixel coordinates of the principal point.

The aspect ratio of a parallelogram is defined as the ratio of its height to the length of its bottom side. Let the four vertices of a parallelogram in 3D be $[0, 0, 1]^T$, $[1, 0, 1]^T$, $[r\cot\theta + 1, r, 1]^T$, and $[r\cot\theta, r, 1]^T$, where r is the aspect ratio of the parallelogram and θ is the angle between its two sides. $[0, 0, 1]^T$, $[1, 0, 1]^T$, $[1, a_i, 1]^T$, and $[0, a_i, 1]^T$, for i = 1, 2, are chosen as the four vertices of the two rectangles in the views of the two rectified cameras, where $a_i$, for i = 1, 2, are the aspect ratios of the two rectangles.

In fact, it will be shown that the variables r and θ disappear from the equation for the additional constraint; thus, the metric information of the parallelogram is not necessary.

The homography mapping the vertices of the parallelogram to those of the rectangle in the ith camera can then be computed as

$$H_i = [h_{i1}\ h_{i2}\ h_{i3}] \cong \begin{bmatrix} 1 & -\cot\theta & 0 \\ 0 & a_i/r & 0 \\ 0 & 0 & 1 \end{bmatrix}. \qquad (6)$$
The circular points of the plane containing the parallelogram are imaged on the rectified image at $h_{i1} \pm j h_{i2}$. Let $\omega_{ri}$ be the image of the absolute conic (IAC) on the rectified image plane of the ith camera. It is then possible to obtain two equations that are linear in $\omega_{ri}$ [10]:
$$(h_{i1} \pm j h_{i2})^T \omega_{ri} (h_{i1} \pm j h_{i2}) = 0 \qquad (7)$$
or
$$h_{i1}^T \omega_{ri} h_{i2} = 0 \quad \text{and} \quad h_{i1}^T \omega_{ri} h_{i1} = h_{i2}^T \omega_{ri} h_{i2}.$$
From Eq. (7), it can be seen that $\omega_{ri}$ has the form
$$\omega_{ri} \cong \begin{bmatrix} \sin^2\theta & (r/a_i)\sin\theta\cos\theta & \alpha \\ (r/a_i)\sin\theta\cos\theta & r^2/a_i^2 & \beta \\ \alpha & \beta & \gamma \end{bmatrix},$$
where α, β, and γ are arbitrary scalar values.

If $K_{ri}$ represents the camera calibration matrix of the ith rectified camera, then, because $\omega_{ri} = K_{ri}^{-T} K_{ri}^{-1}$ [20], after some manipulation it is possible to obtain

$$K_{ri} \cong \begin{bmatrix} \sin\theta & -\cos\theta & \alpha' \\ 0 & (a_i/r)\sin\theta & \beta' \\ 0 & 0 & \gamma' \end{bmatrix}, \qquad (8)$$
where α′, β′, and γ′ are arbitrary scalar values.

As the circular points are fixed under a similarity transformation, the aforementioned selection of the vertices of the parallelogram and the rectangle does not lose generality.

Theorem 1. The rotation matrix between two rectified cameras related to the same parallelogram is

$$R_I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad \text{or} \quad R_{\bar I} = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Proof: Without loss of generality, it is assumed that the plane supporting the parallelogram is Z = 0 of the world coordinate system and that $[r_1, r_2, r_3, t]$ is the rotation and translation relating the world coordinate system to the camera coordinate system. It can then be seen that

$$H_i \cong K_{ri} [r_1\ r_2\ t] \qquad (9)$$
or
$$\mu K_{ri}^{-1} H_i = [r_1\ r_2\ t],$$
where Hi corresponds to Eq. (6), Kri corresponds to Eq. (8), and μ is an arbitrary scale factor.

By substituting Eq. (6) and Eq. (8) into Eq. (9), the result is

$$\mu \begin{bmatrix} \sin\theta & 0 & \alpha'' \\ 0 & \sin\theta & \beta'' \\ 0 & 0 & \gamma'' \end{bmatrix} = [r_1\ r_2\ t],$$
where α″, β″, and γ″ are arbitrary scalar values.

Considering that $\|r_1\| = \|r_2\| = 1$ and $r_3 = r_1 \times r_2$, it follows that $[r_1, r_2, r_3] = R_I$ or $R_{\bar I}$. Since the rotation matrix between the two rectified cameras is one of $R_I^T R_I$, $R_{\bar I}^T R_{\bar I}$, $R_I^T R_{\bar I}$, and $R_{\bar I}^T R_I$, it is $R_I$ or $R_{\bar I}$.

Theorem 2. Assume that the parallelograms imaged in two views correspond to the same parallelogram actually existing in 3D, and that the rectified parallelograms (rectangles) in the two views have aspect ratios $a_1$ and $a_2$, respectively. Then $\lambda_y = (a_2/a_1)\lambda_x$ in Eq. (4).

Proof: Let $K_{r1}$ and $K_{r2}$ be the camera calibration matrices of the two rectified cameras and $R_r$ the rotation matrix between them. From Theorem 1 and Eq. (8), $H_{\infty r}$ can be computed as follows [21]:

$$H_{\infty r} \cong K_{r2} R_r K_{r1}^{-1} \cong \begin{bmatrix} 1 & 0 & u' \\ 0 & a_2/a_1 & v' \\ 0 & 0 & w' \end{bmatrix},$$
where $u'$, $v'$, and $w'$ are arbitrary scalar values.

As mentioned above, it is worthwhile to note that $H_{\infty r}$ is independent of the angle between the two sides of the parallelogram, θ, and of the aspect ratio of the parallelogram, r. From Theorem 2, if the two cameras observe the same parallelogram actually existing in 3D, the additional linear constraint $\lambda_y = (a_2/a_1)\lambda_x$ can be added to Eq. (5), because the aspect ratios $a_1$ and $a_2$ are values that can be chosen arbitrarily. If the parallelogram does not actually exist in 3D, it cannot be assumed that r is identical for the two cameras, due to the parallax between them (refer to Figs. 1(b) and 1(c)); in that case the constraint cannot be used.
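Theorem 2 can be checked numerically: build two synthetic views of a planar parallelogram, rectify each view to an axis-aligned rectangle, and form the infinite homography between the rectified views from Eq. (5). A hedged numpy sketch under assumed camera parameters (all numeric values are illustrative choices of ours):

```python
import numpy as np

def H_from_points(src, dst):
    # 4-point DLT: homography H with dst ~ H src (points as (x, y)).
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1., 0., 0., 0., u*x, u*y, u])
        rows.append([0., 0., 0., -x, -y, -1., v*x, v*y, v])
    return np.linalg.svd(np.asarray(rows))[2][-1].reshape(3, 3)

def project(K, R, C, X):
    p = K @ R @ (X - C)
    return p[:2] / p[2]

# A planar parallelogram in 3D: p0, p0+e1, p0+e1+e2, p0+e2 (plane z = 0).
p0, e1, e2 = np.zeros(3), np.array([1., 0., 0.]), np.array([0.3, 1.2, 0.])
verts = np.array([p0, p0 + e1, p0 + e1 + e2, p0 + e2])

# Two cameras with different (assumed) intrinsics and poses.
K1 = np.array([[1000., 0., 500.], [0., 900., 400.], [0., 0., 1.]])
K2 = np.array([[1100., 5., 480.], [0., 1000., 390.], [0., 0., 1.]])
cy, sy = np.cos(0.3), np.sin(0.3)
R1 = np.eye(3)
R2 = np.array([[cy, 0., sy], [0., 1., 0.], [-sy, 0., cy]])
C1, C2 = np.array([0.5, 0.5, -5.]), np.array([2., 0.5, -5.])

# Rectify each view: the two target rectangles have aspect ratios a1, a2.
a1, a2 = 1.0, 2.0
img1 = np.array([project(K1, R1, C1, X) for X in verts])
img2 = np.array([project(K2, R2, C2, X) for X in verts])
rect = lambda a: np.array([[0., 0.], [1., 0.], [1., a], [0., a]])
Hr1, Hr2 = H_from_points(img1, rect(a1)), H_from_points(img2, rect(a2))

# True infinite homography between views, and the induced H_inf_r.
Hinf = K2 @ R2 @ R1.T @ np.linalg.inv(K1)
Hr = Hr2 @ Hinf @ np.linalg.inv(Hr1)
Hr /= Hr[2, 2]
```

Up to numerical error, Hr takes the form of Eq. (4) with the off-pattern entries zero, and the ratio of its (2,2) to (1,1) entries equals a2/a1, as Theorem 2 predicts.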

Assume that m parallelograms are viewed by two cameras and that $m_r (\le m)$ is the number of actual parallelograms. From Eq. (5) and the additional constraint, the infinite homography, $H_\infty$, can be computed by solving the set of homogeneous equations

$$H_\infty \cong (H_{r2}^k)^{-1} H_{\infty r}^k H_{r1}^k, \quad \text{for } k = 1, \cdots, m,$$
or
$$A\,[H_{11}, H_{12}, \cdots, H_{33}, \lambda_x^1, (\lambda_y^1), u^1, v^1, w^1, \cdots, \lambda_x^m, (\lambda_y^m), u^m, v^m, w^m]^T = 0, \qquad (12)$$
where A is a $9m \times (9 + 5m - m_r)$ matrix composed of the elements of the rectification homographies $H_{r1}^k$ and $H_{r2}^k$, and $H_{ij}$ is the (i, j) element of $H_\infty$. The parentheses in Eq. (12) indicate that if the kth parallelogram is an actual parallelogram, the variable $\lambda_y^k$ is not needed, owing to the additional constraint.
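To make the estimation concrete, the sketch below builds the stacked homogeneous system of Eq. (12) on synthetic data and recovers H∞ from its null space. For simplicity it uses the unconstrained parameterization (five unknowns per parallelogram); for an actual parallelogram, the λy column would simply be folded into the λx column scaled by a2/a1. The data generation is our own illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def pattern(lx, ly, u, v, w):
    # The structure of H_inf_r from Eq. (4).
    return np.array([[lx, 0., u], [0., ly, v], [0., 0., w]])

# Synthetic ground truth: an infinite homography H_inf and, for each of
# m = 2 parallelograms, rectification homographies consistent with
# Eq. (5): H_r2 = H_inf_r @ H_r1 @ inv(H_inf).
H_true = rng.standard_normal((3, 3))
pairs = []
for _ in range(2):
    Hr1 = rng.standard_normal((3, 3))
    Hr2 = pattern(*rng.standard_normal(5)) @ Hr1 @ np.linalg.inv(H_true)
    pairs.append((Hr1, Hr2))

# Unknowns: the 9 entries of H_inf (row-major), then (lx, ly, u, v, w)
# per parallelogram.  Each parallelogram contributes the 9 scalar
# equations of  H_r2 @ H_inf - H_inf_r @ H_r1 = 0.
n = 9 + 5 * len(pairs)
rows = []
for k, (Hr1, Hr2) in enumerate(pairs):
    base = 9 + 5 * k
    for i in range(3):
        for j in range(3):
            r = np.zeros(n)
            for t in range(3):              # (H_r2 @ H_inf)[i, j]
                r[3 * t + j] += Hr2[i, t]
            if i == 0:                      # row i of H_inf_r @ H_r1
                r[base + 0] -= Hr1[0, j]    # lx
                r[base + 2] -= Hr1[2, j]    # u
            elif i == 1:
                r[base + 1] -= Hr1[1, j]    # ly
                r[base + 3] -= Hr1[2, j]    # v
            else:
                r[base + 4] -= Hr1[2, j]    # w
            rows.append(r)

x = np.linalg.svd(np.asarray(rows))[2][-1]  # null vector, up to scale
H_est = x[:9].reshape(3, 3)
```

With m = 2 parallelograms there are 18 equations in 19 unknowns, matching the counting argument of Section 3.2, and the null space is one-dimensional in the generic case.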

3.4. Comparison with related works

As mentioned in Section 3.1, four vanishing points, or three vanishing points together with the fundamental matrix, are required to obtain the infinite homography. However, it is difficult to acquire more than three vanishing points giving independent constraints in general man-made scenes, as in Fig. 3, and it is also burdensome to obtain many point correspondences between all view pairs for fundamental matrix estimation. Even in that case, the proposed method can obtain the infinite homography thanks to the additional constraints arising from the actual parallelograms. A degenerate case of the proposed method occurs only when all parallelograms lie on planes parallel to each other. Moreover, even if four vanishing points are available, the additional constraints can improve the accuracy of camera calibration when the metric constraints are not sufficient. This issue is explored further in Sections 7.1.2 and 7.2.1.

 

Fig. 3 Examples of the position of vanishing points in general man-made scenes. Since all parallel lines are orthogonal or parallel to the ground plane, there are only three independent vanishing points. The vanishing points v2, v3, v4, v5, and v6 are collinear and on the vanishing line of the ground plane.


It should be noted that these results can also be obtained by the method of [13], because a parallelepiped likewise gives only three vanishing points; its coplanarity constraints are merged into the canonical projection matrix suggested in [13]. However, as mentioned previously, the proposed method can deal with parallelograms in general position in practical situations thanks to the additional constraints, and is consequently free from the restriction that the parallelograms form the faces of a parallelepiped.

Additional quadratic constraints can also be derived from parallelograms and used to calibrate camera parameters directly [15]. However, metric information about a parallelogram is needed to convert these non-linear constraint equations into linear ones; for example, the parallelogram should be a diamond or a rectangle. In fact, those linear constraint equations can also be derived within the proposed method: they are equivalent to the constraint equations of Section 5.1 when a parallelogram's two sides are equal in length or orthogonal, which means that the parallelogram is a diamond or a rectangle.

4. Reconstruction up to affine transformation

Once the infinite homographies, $H_\infty^{0i}$, between one reference view and the ith view are computed, the camera matrices of an affine reconstruction are $P_0 = [I\,|\,0]$ and $P_i = [H_\infty^{0i}\,|\,t_i]$ [20]. If the ith view does not share images of two or more parallelograms with the reference view, some manipulation may be needed using an intermediate view: given the infinite homographies $H_\infty^{ji}$ and $H_\infty^{0j}$, $H_\infty^{0i}$ can be computed as

$$H_\infty^{0i} = H_\infty^{ji} H_\infty^{0j}.$$
The variables remaining to be solved are the camera translations t and the 3D points $\tilde{X}$. The equation for point projection is
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \cong P \begin{bmatrix} \tilde{X} \\ 1 \end{bmatrix} \cong \begin{bmatrix} h_1^T & t_1 \\ h_2^T & t_2 \\ h_3^T & t_3 \end{bmatrix} \begin{bmatrix} \tilde{X} \\ 1 \end{bmatrix}. \qquad (13)$$
By taking the vector product of the two sides of Eq. (13), two independent homogeneous equations
$$\begin{bmatrix} u h_3^T - h_1^T & -1 & 0 & u \\ v h_3^T - h_2^T & 0 & -1 & v \end{bmatrix} \begin{bmatrix} \tilde{X} \\ t \end{bmatrix} = 0 \qquad (14)$$
are obtained. Thus, a set of 2nm equations in 3n + 3(m − 1) unknowns is generated with m views and n 3D points. It is worthwhile to note that the translation t of the reference view $P_0$ is assumed to be 0 without loss of generality. These equations can be solved linearly to obtain the t and $\tilde{X}$ of an affine reconstruction. Note that the 3D points need not all be visible in all views in this process.
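A sketch of this linear solve on synthetic data (the cameras and points are illustrative values of ours): stack the two rows of Eq. (14) for every observation and take the null vector, which contains the affine structure and the translations up to a common scale.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic affine cameras P_i = [H_i | t_i]: the H_i are the (known)
# infinite homographies; the reference translation t_0 is fixed to 0.
m, n = 3, 6
Hs = [np.eye(3)] + [np.eye(3) + 0.2 * rng.standard_normal((3, 3))
                    for _ in range(m - 1)]
ts = [np.zeros(3)] + [rng.standard_normal(3) for _ in range(m - 1)]
X = rng.standard_normal((n, 3)) + np.array([0., 0., 5.])  # points in front

# Unknown vector: [X_1, ..., X_n, t_1, ..., t_{m-1}].
N = 3 * n + 3 * (m - 1)
rows = []
for i in range(m):
    h1, h2, h3 = Hs[i]
    for j in range(n):
        p = Hs[i] @ X[j] + ts[i]
        u, v = p[0] / p[2], p[1] / p[2]
        r1, r2 = np.zeros(N), np.zeros(N)
        r1[3*j:3*j+3] = u * h3 - h1            # first row of Eq. (14)
        r2[3*j:3*j+3] = v * h3 - h2            # second row of Eq. (14)
        if i > 0:                               # t_0 = 0 is not an unknown
            c = 3 * n + 3 * (i - 1)
            r1[c:c+3] = [-1., 0., u]
            r2[c:c+3] = [0., -1., v]
        rows += [r1, r2]

sol = np.linalg.svd(np.asarray(rows))[2][-1]    # structure + motion, up to scale
```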

4.1. Parameter reduction using affine invariance

If there are parallelograms or parallelepipeds in the views, the number of variables can be reduced using geometric constraints. In the affine reconstruction process, all corner points of parallelograms or parallelepipeds are represented by one reference point and two or three direction vectors. Moreover, in the above affine reconstruction formulation, the coordinates of a vanishing point in the first view can be regarded as a direction vector parallel to the line segments corresponding to that vanishing point. By representing the points on the edges and surfaces of parallelograms and parallelepipeds as weighted sums of these vectors, the coplanarity and collinearity constraints can be satisfied. If the length ratios between line segments parallel to each other are given, these constraints can be implemented as well using ratios of the scalar weights. All these constraints are affine invariant.

Let D be a column vector containing the scalar weights for the above direction vectors and the coordinates of the 3D points that cannot be represented by weighted sums of these vectors. Then $\tilde{X}$ can be represented by

$$\tilde{X} = \Sigma D, \qquad (15)$$
where Σ contains the direction vectors and encodes the geometric constraints described above. Eq. (14) can then be rewritten as
$$\begin{bmatrix} (u h_3^T - h_1^T)\Sigma & -1 & 0 & u \\ (v h_3^T - h_2^T)\Sigma & 0 & -1 & v \end{bmatrix} \begin{bmatrix} D \\ t \end{bmatrix} = 0. \qquad (16)$$
After solving these equations, $\tilde{X}$ can be obtained from Eq. (15). Owing to this parameterization, the reconstructed vertices form exact parallelograms and parallelepipeds, and the coplanarity and collinearity constraints for the points on their edges and surfaces are satisfied exactly.

If non-linear optimization is performed to refine the linear estimation results, this parameter reduction helps satisfy the geometric constraints described in this subsection. If these constraints are not embedded through the reduction, extra complex terms describing the geometric constraints between the 3D points must be added to the cost function.

4.2. Comments on linear reconstruction formula

The linear reconstruction formula described above is similar to the method presented in [12]. In that work, a reference plane visible in all views is needed for linear projective reconstruction. If three orthogonal vanishing points, which span the plane at infinity, are detected in all views, the internal parameters and a metric reconstruction can be obtained linearly. It was assumed that the camera had zero skew and a known aspect ratio, or fixed internal parameters. However, the assumptions of a known aspect ratio or fixed internal parameters are not valid for arbitrary archival images or images gathered from the internet while modelling a scene.

A problem of methods utilizing orthogonal vanishing points, as in [12] and [9], is the degenerate case in which one or two of the vanishing points are at infinity. This case often occurs when the parallel lines are nearly parallel to the image plane. This problem does not occur in the proposed method while estimating the infinite homography and reconstructing the affine structure from parallelism. Moreover, orthogonality can still be used, without omission, in the metric reconstruction process described in the next section.

The structure parameter reduction is similar to [11], which presents a single-view-based approach, and is also suggested in the linear step of [14]. However, in [14], given that the calibration and rotation parameters are estimated in advance by other calibration methods, the parameterization using scene geometry is done in metric space, not in affine space. It is also straightforward that the regularity and symmetry information used in [14] can be implemented throughout the proposed method by using the length-ratio constraints of Section 4.1 and the orthogonality constraints of Section 5.1. This means that, in the proposed method, full geometric information contributes to camera calibration and reconstruction simultaneously, because the calibration results are obtained at the last step.

5. Upgrade to metric reconstruction

Let $H_{EA}$ be the projective transformation from metric to affine space, which has the following form:

$$H_{EA} = \begin{bmatrix} A & 0 \\ 0^T & 1 \end{bmatrix}. \qquad (17)$$
To convert the affine reconstructions obtained in the previous section to metric ones, we must find constraints on A. This is equivalent to obtaining constraints on the absolute conic, $\Omega_A$, in affine space. A can be obtained from $\Omega_A (= A^{-T} A^{-1})$ by Cholesky factorization. The constraints described in the following subsections can be used to estimate the matrix $\Omega_A$.

5.1. Constraints from scene geometry

Assume that there are four points $E_i$, for i = 1, ⋯, 4, in metric space and two vectors $d_{E1} = E_2 - E_1$ and $d_{E2} = E_4 - E_3$. If the four points in affine space corresponding to the above four points are $A_i$, for i = 1, ⋯, 4, with $d_{A1} = A_2 - A_1$ and $d_{A2} = A_4 - A_3$, then $d_{E1} = A^{-1} d_{A1}$ and $d_{E2} = A^{-1} d_{A2}$. Assume that $d_{E1}$ and $d_{E2}$ are orthogonal to each other in metric space. Then, since $d_{E1}^T d_{E2} = 0$, we obtain the following constraint:

$$d_{A1}^T \Omega_A d_{A2} = 0. \qquad (18)$$
This equation resembles the relation between the image of the absolute conic (IAC) and orthogonal vanishing points. However, Eq. (18) uses finite 3D points reconstructed in affine space and does not require particular points such as vanishing points.

If the ratio of $\|d_{E1}\|$ to $\|d_{E2}\|$ is known to be r, then

$$d_{E1}^T d_{E1} / d_{E2}^T d_{E2} = r^2$$
or
$$d_{A1}^T \Omega_A d_{A1} = r^2\, d_{A2}^T \Omega_A d_{A2}. \qquad (19)$$
Moreover, if the angle between $d_{E1}$ and $d_{E2}$ is additionally known to be θ, then
$$\cos\theta = \frac{d_{E1}^T d_{E2}}{(d_{E1}^T d_{E1})^{1/2} (d_{E2}^T d_{E2})^{1/2}}$$
or
$$d_{A1}^T \Omega_A d_{A2} = r \cos\theta\, d_{A2}^T \Omega_A d_{A2}. \qquad (20)$$
It is worthwhile to note that all equations derived in this section can be solved linearly.
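As a sketch of how the orthogonality constraints of Eq. (18) determine ΩA, the code below synthesizes affine-space directions from metric orthogonal pairs, solves the resulting homogeneous linear system, and extracts A by Cholesky factorization. The ground-truth A and the number of pairs are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Ground truth affine-to-metric relation: directions map as d_A = A d_E,
# so Omega_A = inv(A).T @ inv(A).
A = np.eye(3) + 0.3 * rng.standard_normal((3, 3))
Om_true = np.linalg.inv(A).T @ np.linalg.inv(A)

# Each pair of directions known to be orthogonal in metric space gives
# one linear equation d_A1^T Omega_A d_A2 = 0 (Eq. (18)) in the six
# independent entries of the symmetric matrix Omega_A.
idx = [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]
rows = []
for _ in range(8):
    d1 = rng.standard_normal(3)
    d2 = rng.standard_normal(3)
    d2 -= (d1 @ d2) / (d1 @ d1) * d1       # make d2 orthogonal to d1
    x, y = A @ d1, A @ d2                   # the same vectors, affine space
    rows.append([x[i]*y[j] + (x[j]*y[i] if i != j else 0.)
                 for i, j in idx])

w = np.linalg.svd(np.asarray(rows))[2][-1]  # null vector: Omega_A entries
Om = np.zeros((3, 3))
for (i, j), wij in zip(idx, w):
    Om[i, j] = Om[j, i] = wij
if np.trace(Om) < 0:                        # fix sign so Om is positive definite
    Om = -Om

# Recover A up to scale and rotation: Om = L @ L.T  =>  inv(A) ~ Q @ L.T.
L = np.linalg.cholesky(Om)
A_est = np.linalg.inv(L.T)
```

Five independent constraints suffice for the five degrees of freedom of ΩA up to scale; extra pairs simply overdetermine the system, which is what makes mixing orthogonality, ratio, and angle constraints in one linear solve convenient.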

5.2. Constraints from partial knowledge of camera parameters

The absolute dual quadric (ADQ), $Q_A^*$, in affine space can be computed from $H_{EA} Q_E^* H_{EA}^T$ [20], where $Q_E^*$ is the ADQ in metric space, and has the following form:

$$Q_A^* = \begin{bmatrix} (\Omega_A)^{-1} & 0 \\ 0^T & 0 \end{bmatrix}. \qquad (21)$$

The IAC, $\omega_i$, in the ith view can be related to $Q_A^*$ as follows [20]:

$$\omega_i \cong (P_i Q_A^* P_i^T)^{-1} = (H_\infty^{0i})^{-T} \Omega_A (H_\infty^{0i})^{-1}. \qquad (22)$$

If the internal camera parameters are constant across views, we can set all the $\omega_i$ to a common ω in Eq. (22); ω can then be eliminated using the relation $\omega = \Omega_A$ in the reference view. This results in linear equations in $\Omega_A$, provided each $H_\infty^{0i}$ is normalized so that $\det(H_\infty^{0i}) = 1$.

If it is known that the pixels are rectangular (zero camera skew), that the principal point is at the origin, or that the aspect ratio is r, the following linear equations in $\Omega_A$, respectively, can be obtained from Eq. (22):

$$\left\{\begin{array}{l} \big((H_\infty^{0i})^{-T} \Omega_A (H_\infty^{0i})^{-1}\big)_{12} = 0 \\ \big((H_\infty^{0i})^{-T} \Omega_A (H_\infty^{0i})^{-1}\big)_{13} = \big((H_\infty^{0i})^{-T} \Omega_A (H_\infty^{0i})^{-1}\big)_{23} = 0 \\ r^2 \big((H_\infty^{0i})^{-T} \Omega_A (H_\infty^{0i})^{-1}\big)_{11} = \big((H_\infty^{0i})^{-T} \Omega_A (H_\infty^{0i})^{-1}\big)_{22}. \end{array}\right. \qquad (23)$$
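The entries of $H^{-T}\Omega_A H^{-1}$ are linear in the six independent entries of ΩA, so these per-view constraints stack into one homogeneous system. A hedged numpy sketch on synthetic views (the intrinsics and view count are our assumptions), using zero skew and a principal point at the origin:

```python
import numpy as np

rng = np.random.default_rng(3)

def random_rotation():
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    return Q if np.linalg.det(Q) > 0 else -Q

# Ground truth: affine frame related to the metric one by A_true.
A_true = np.eye(3) + 0.3 * rng.standard_normal((3, 3))
Om_true = np.linalg.inv(A_true).T @ np.linalg.inv(A_true)

# Three views with zero skew and principal point at the origin, so that
# omega_i = H^-T Omega_A H^-1 must have zero (1,2), (1,3), (2,3) entries.
Hs = []
for _ in range(3):
    K = np.diag([rng.uniform(800, 1200), rng.uniform(800, 1200), 1.0])
    Hs.append(K @ random_rotation() @ np.linalg.inv(A_true))

idx = [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]
def sym(i, j):
    E = np.zeros((3, 3)); E[i, j] = E[j, i] = 1.0
    return E

rows = []
for H in Hs:
    G = np.linalg.inv(H)
    for a, b in [(0, 1), (0, 2), (1, 2)]:   # skew / principal point zeros
        r = np.array([(G.T @ sym(i, j) @ G)[a, b] for i, j in idx])
        rows.append(r / np.linalg.norm(r))  # balance the row scales

w = np.linalg.svd(np.asarray(rows))[2][-1]  # null vector: Omega_A entries
Om = np.zeros((3, 3))
for (i, j), wij in zip(idx, w):
    Om[i, j] = Om[j, i] = wij
if np.trace(Om) < 0:
    Om = -Om
```

Each such view supplies three linear equations, so two generic views already determine ΩA up to scale; three are used here for robustness.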

5.3. Comparison with related works

The constraints in Sections 5.1 and 5.2 can be derived from basic projective geometry. These constraints are essentially equivalent to the constraints in [13], which are extracted from prior knowledge about the intrinsic parameters of the cameras and about parallelepipeds. The difference is that, in the proposed method, the 3D points used to derive the constraints are not only the vertices of parallelograms or parallelepipeds but can be any 3D points in the scene.

6. Outline of the algorithm

In summary, the complete algorithm is composed of the following steps:

  1. Compute the rectification homographies for all parallelograms commonly viewed at the cameras.
  2. Establish a linear equation system on H and Hrk’s for camera pairs and solve it.
  3. Compute the infinite homographies between one reference view and the other views.
  4. Establish a linear equation system on D and t based on affine invariant geometric constraints related to parallelograms and parallelepipeds and solve it.
  5. Construct a linear equation system on ΩA based on prior knowledge of scene geometry and intrinsic camera parameters and solve it.
  6. Extract A from ΩA using Cholesky factorization.
  7. Upgrade the affine reconstruction results obtained in step 4 to metric ones with $H_{EA}$.

7. Experimental results

7.1. Results on synthetic data

7.1.1. Performance evaluation

Simulated experiments were performed in order to analyze the performance of the algorithm carefully under various noise magnitudes, parallelogram sizes, and singular configurations. The scenario is shown in Fig. 4. Tests were performed with synthetic 1024×768 images taken by three cameras with the following intrinsic parameters: (fu, fv, s, u0, v0) = (1200, 1000, 0, 512, 384). Two parallelograms were placed in front of the three cameras and arranged so that there are only three vanishing points. The distances from the cameras to the parallelograms were about 8 m. The cameras were placed horizontally and had arbitrary roll angles. The distances between neighboring cameras were about 6 m. The constraints used in this experiment were the orthogonality of the edges of the parallelograms and the zero skew angle of the cameras.

 

Fig. 4 The environment of the simulated experiment for performance evaluation. The arrows illustrate the systematic variation of the relevant parameters.


The data normalization described in [22] was used for all experiments in this paper. For each parameter setting, 1,000 independent trials were performed and the results shown are averages. The estimated results are relative quantities determined up to scale. To obtain the position error, the estimated positions of the vertices of the parallelograms were fitted to the ground truth using the method of [23].

First, the performance was evaluated with respect to the noise magnitude. Zero-mean uniformly distributed noise over [−n, n] pixels was added to the image projections of the vertices of the parallelograms, where n = 0.0, 0.25, 0.5, ⋯, 2. The edge size of the parallelograms and the angle between the planes of the two parallelograms were set to 1.5 m and 120°, respectively. The results are shown in Figs. 5(a) and 5(d). We can see that the errors increase linearly with the noise magnitude.

 

Fig. 5 The results from the simulated experiments to analyze the relation between the performance and noise magnitude ((a) and (d)), size of parallelograms ((b) and (e)), and angle between the two planes ((c) and (f)). [tx, ty, tz] and [X, Y, Z] indicate the position error of the estimated cameras and parallelograms, respectively.


Second, we tested the performance while varying the edge size of the parallelograms from 0.5 m to 2.0 m. In these experiments, the angle between the planes of the two parallelograms was set to 120° and zero-mean uniformly distributed noise over the interval [−0.5, 0.5] pixels was added. The results are shown in Figs. 5(b) and 5(e). We can see that the algorithm acquires reasonable results for edge sizes over 1.0 m.

Third, we tested the performance while varying the angle between the planes of the two parallelograms from 50° to 170°. In these experiments, the edge size of the parallelograms was set to 1.5 m and zero-mean uniformly distributed noise over the interval [−0.5, 0.5] pixels was added. The results are shown in Figs. 5(c) and 5(f). We can see that the results are unstable when the angle is near 100° or 170°. In the case of 100°, the parallelograms are nearly perpendicular to the image planes of the side cameras; in the case of 170°, the two parallelograms are almost parallel.

7.1.2. Effect of additional constraints

As mentioned in Section 3.4, the additional constraints can contribute to obtaining the infinite homography even in the case of three vanishing points. In these simulated experiments, the accuracy gained from the additional constraints is evaluated: the camera parameters are estimated when there is no geometric information in the scene except for two parallelograms. The prior knowledge used is that the cameras have static intrinsic parameters.

Comparisons were made with the following algorithms:

  • Corr: The method using vanishing point correspondences.
  • NAC: The method using the proposed framework not including the additional constraints.
  • AC: The proposed method including the additional constraints.
  • F: Using the infinite homography obtained through fundamental matrix estimation and the projective reconstruction of the cameras and the plane at infinity [20].
  • Zhang: The classical method of Zhang [2].

The simulated camera had the following parameters: fu = 1200, fv = 1000, s = 0, u0 = 512, and v0 = 384. The resolution of the simulated image was 1024 × 768. Two parallelograms of side 1 m were placed around the origin. The angle between the planes containing the parallelograms was 90°.

Zero-mean uniformly distributed noise over [−n, n] pixels was added to the image projections of the vertices of the parallelograms, where n = 0.25, 0.5, 0.75, ⋯, 2. For each method and noise level, 1,000 independent trials were performed and the results shown are averages. For each trial, three images were taken from cameras randomly placed 5 m ahead of the origin and pointing towards the origin so that the faces of the parallelograms could be viewed. The number of features for the fundamental matrix estimation of F was 20, and the features were randomly distributed over the 3D region containing the two parallelograms. For Zhang, it was assumed that the metric information of the vertices of the parallelograms was given. The estimation results of these experiments are shown in Fig. 6. Since the performances are highly similar both for fu and fv and for u0 and v0, the estimation results for fv and v0 are not depicted here.

 

Fig. 6 The results of the camera parameter estimates with simulated data: the mean absolute error of the calibration parameters is shown as a function of the noise level for the various methods. The cases in which three and four vanishing points exist are indicated by 3vps and 4vps, respectively. (a) and (d) refer to fu. (b) and (e) refer to the skew angle. (c) and (f) refer to u0.

The results with four vanishing point correspondences are shown in Figs. 6(a), 6(b), and 6(c). If none of the sides of the parallelograms is parallel to the plane containing the other parallelogram, there are four vanishing point correspondences, three of which are not collinear. It is clear that the performance of AC is superior to that of Corr and NAC. It is worthwhile to note that the performances of Corr and NAC are similar, as expected in Section 3.2. The results of AC are similar to those of F, which requires the fundamental matrix estimation from many feature correspondences.

If one side of one of the parallelograms is parallel to a side of the other parallelogram, as in Fig. 4, only three vanishing points exist. A comparison of the performances of the methods when the number of vanishing points is three or four is shown in Figs. 6(d), 6(e), and 6(f). Since the case of three vanishing points is degenerate for Corr and NAC, no results are depicted for these methods. However, AC and F can provide a solution even in this case. The estimation results of AC are nearly identical for the two cases, whereas those of F with three vanishing points are degraded compared to the case of four vanishing points. From these results, it can be concluded that AC is superior to F in practical situations in which there are only three vanishing points.

The camera parameters from AC are comparable to those of Zhang. However, it is worthwhile to note that AC uses only parallelism and does not require the metric information that is necessary when using Zhang.

7.2. Results on real images

Various experiments with real images were also performed to test the algorithms. All line segments in the images were extracted automatically [24] (see Fig. 7), and the line segments corresponding to the parallelograms' sides were selected manually. The vertices of the parallelograms were extracted as the intersections of these lines.
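Since each vertex is the intersection of two extracted side lines, it can be computed directly in homogeneous coordinates: both the line through two image points and the intersection of two lines are cross products. A minimal sketch (the segment endpoints below are purely illustrative):

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two image points given as (x, y)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def intersect(l1, l2):
    """Intersection of two homogeneous lines, returned as (x, y).
    Assumes the lines are not parallel (third coordinate nonzero)."""
    x = np.cross(l1, l2)
    return x[:2] / x[2]

# A parallelogram vertex recovered from two extracted side segments.
side1 = line_through((0.0, 0.0), (10.0, 0.0))   # along y = 0
side2 = line_through((3.0, -5.0), (3.0, 7.0))   # along x = 3
v = intersect(side1, side2)
```

Computing vertices this way uses the full extent of each detected segment, which is typically more stable than clicking corner points directly.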

 

Fig. 7 Line extraction examples for the Plant Scene experiment presented in Section 7.2.2.

7.2.1. Tower scene

The resolution of the images was 1024 × 768. The images used in this experiment are shown in Fig. 8. The two parallelograms denoted in Fig. 8(a) with the white dotted lines were used as the input for the algorithms. The prior information used for the algorithm was that the cameras have static intrinsic parameters. Fig. 8(d) shows the reconstructed model and the camera poses. Rendered new views of the reconstructed model are shown in Figs. 8(e) and 8(f).

 

Fig. 8 Three images used in the Tower Scene experiment, the reconstructed model, and the camera poses.

Since no ground truth or reference for the reconstruction results is available in this experiment, the accuracy of the reconstruction of known geometry that is nevertheless not used by the algorithm is a useful measure of performance. The line a and the plane defined by the two lines b and c depicted in Fig. 8(c) were reconstructed. The angle between the line a and the normal to the plane is known to be zero. This angle measured 2.22° with the proposed method. In this experiment, there were four vanishing points, so the method using vanishing point correspondences could also be used; with that method, the measurement was 16.81°. From these results, it can be seen that the proposed method gives more accurate calibration results owing to the additional constraints, even in the case of four vanishing points, when only the static camera constraint is used. If sufficient geometric constraints had been available in this case, the results from the two methods would have been comparable.

7.2.2. Plant scene

The resolution of the images was 1024 × 768, and the camera parameters were not static while the images were captured. The four captured images are shown in Fig. 9. Image 0 corresponds to the reference camera. The infinite homographies H01, H12, and H23 are computed in this experiment.
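Because the pairwise infinite homographies relate consecutive images, the homography from the reference Image 0 to any later image follows by composition, e.g. H02 = H12H01 up to scale. A small sketch with hypothetical matrices:

```python
import numpy as np

def chain(*Hs):
    """Compose pairwise homographies in order: chain(H01, H12) -> H02.
    The result is normalized so the bottom-right entry is 1."""
    H = np.eye(3)
    for Hi in Hs:
        H = Hi @ H
    return H / H[2, 2]

# Hypothetical pairwise homographies: a scaling followed by a translation.
H01 = np.diag([2.0, 2.0, 1.0])
H12 = np.array([[1.0, 0.0, 3.0],
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
H02 = chain(H01, H12)   # maps image 0 directly to image 2
```

The normalization only fixes the arbitrary projective scale; any nonzero multiple of H02 represents the same homography.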

 

Fig. 9 Four captured images for the Plant Scene experiment. (a) Image 0. (b) Image 1. (c) Image 2. (d) Image 3.

H01 can be obtained from the two parallelograms corresponding to the frontal face and the right face B of the building. However, one parallelogram is insufficient between Image 1 and Image 2, because at least two parallelograms that are not parallel to each other are required to obtain sufficient constraints. In this case, the right and left faces (B and C) of the building can be treated as the same parallelogram, because the proposed method for computing the infinite homography is independent of the translational motion of the parallelograms. Thus, faces A, B, and C are used to compute H12. In contrast to the method described in [13], the proposed method can use parallelograms and partially overlapped images, and is therefore more flexible in use. For H23, faces A and C are used. The metric constraints used in this experiment were the orthogonality of the edges of the building, the right angle between line segments a and b, and the length ratio of c to d. Fig. 10 shows the reconstructed model and the camera poses from new viewpoints.

 

Fig. 10 Reconstructed model and camera poses for the Plant Scene experiment.

7.2.3. Scene of the Bank of China

Five images of the Bank of China in Hong Kong were gathered from the internet (see Fig. 11). The resolution of the images varied from 419 × 783 to 1536 × 2048.

 

Fig. 11 Five images for the Bank of China scene experiment. (a) Image 0. (b) Image 1. (c) Image 2. (d) Image 3. (e) Image 4.

H01, H02, H23, and H24 are computed using {A, C}, {A, B, D}, {D, E}, and {A, D, F, G}, respectively. It is worthwhile to note that there is no parallelepiped of which at least six vertices are visible in both Image 0 and Image 2, so the method of [13] cannot be applied; however, common parallelograms can be found. The same holds for the pair of Image 2 and Image 3. {A, G} and {D, F} can each be considered as the same parallelogram in the estimation of H24, as explained in Section 7.2.2. The metric constraints used in this experiment were the orthogonality of the edges of the building and the orthogonality of the crossing diagonal lines. The reconstructed model and the camera poses are shown in Fig. 12.

 

Fig. 12 Reconstructed model and camera poses for the Bank of China experiment.

7.2.4. Scene of the Casa da Música

In this section, the reconstruction of the Casa da Música in Porto is presented. Ten images were collected from the internet (see Fig. 13). The resolution of the images varied from 640 × 480 to 2313 × 2736. It is also noted that there is no parallelepiped of which at least six vertices are visible across the images, so the method of [13] cannot be applied.

 

Fig. 13 Ten images for the Casa da Música scene experiment. (a) Image 0. (b) Image 1. (c) Image 2. (d) Image 3. (e) Image 4. (f) Image 5. (g) Image 6. (h) Image 7. (i) Image 8. (j) Image 9.

The infinite homographies H0i (i = 1,⋯ ,5) were computed using the faces A, B, and C. The infinite homographies H67, H78, and H89 were based on the faces D, E, F, and G. In this experiment, since no two parallelograms were commonly viewed across the image groups {0,⋯ ,5} and {6,⋯ ,9}, the reconstruction process was applied separately to each image group. The two reconstruction results were then merged so that the four vertices a, b, c, and d were aligned (see the points depicted in Figs. 13(a), 13(c), 13(h), and 13(i)).
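The merging step amounts to fitting a similarity transform to the four corresponding vertices. A closed-form SVD-based (Procrustes) fit, in the spirit of the absolute-orientation solution of [23], could look as follows; the point sets here are synthetic:

```python
import numpy as np

def align_similarity(P, Q):
    """Closed-form similarity fit s * R @ p + t ~= q over corresponding
    rows of P and Q (N x 3), via the SVD (orthogonal Procrustes) solution."""
    muP, muQ = P.mean(axis=0), Q.mean(axis=0)
    Pc, Qc = P - muP, Q - muQ
    U, S, Vt = np.linalg.svd(Pc.T @ Qc)            # cross-covariance of the sets
    d = np.sign(np.linalg.det(Vt.T @ U.T))         # guard against reflections
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    s = np.trace(np.diag(S) @ D) / np.sum(Pc**2)   # Umeyama's scale estimate
    t = muQ - s * R @ muP
    return s, R, t

# Synthetic check: recover a known scale, rotation, and translation.
P = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
R0 = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)  # 90 deg about z
Q = 2.0 * P @ R0.T + np.array([1.0, 2.0, 3.0])
s, R, t = align_similarity(P, Q)
```

With s, R, and t in hand, every point of the second reconstruction is mapped as s * R @ X + t before the two models are rendered together.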

Metric constraints used in this experiment were: right angles for the parallelograms and zero camera skew. The reconstructed model and the camera poses are shown in Fig. 14.

 

Fig. 14 Reconstructed model and camera poses for the Casa da Música experiment.

8. Conclusions

In this paper, a novel framework was introduced to reconstruct a 3D scene using uncalibrated views and geometric information of the scene. The proposed method is a stratified and linear method. It was shown that all of the constraints derived in the previous works can be implemented in the proposed method.

The proposed method estimates the infinite homography together with the additional constraints. These additional constraints, which arise from the parallelograms, become apparent when the infinite homography is estimated via rectification of parallelograms. It was shown that the novel approach including the additional constraints has two advantages. First, camera parameters of greater accuracy can be acquired when the geometric constraints are not sufficient. Second, even if only three vanishing points exist in the views, the infinite homography can be computed without estimating the fundamental matrix.

The proposed method was tested with both simulated data and real images to show its advantages. A study of the situations that cannot be dealt with by the previous approaches was presented, and it was shown that the proposed method is more flexible in that it can handle these cases.

Acknowledgments

This work was supported by the strategic technology development program of MCST/MKE/KEIT [KI001798, Development of Full 3D Reconstruction Technology for Broadcasting Communication Fusion].

References and links

1. R. Tsai, “A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf tv cameras and lenses,” IEEE Trans. Robot. Autom. 3, 323–344 (1987). [CrossRef]  

2. Z. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. Pattern Anal. Mach. Intell. 22, 1330–1334 (2000). [CrossRef]  

3. J.-H. Kim and B.-K. Koo, “Convenient calibration method for unsynchronized camera networks using an inaccurate small reference object,” Opt. Express 20, 25292–25310 (2012). [CrossRef]   [PubMed]  

4. M. Pollefeys and L. V. Gool, “Stratified self-calibration with the modulus constraint,” IEEE Trans. Pattern Anal. Mach. Intell. 21, 707–724 (1999). [CrossRef]  

5. M. Pollefeys, L. V. Gool, M. Vergauwen, F. Verbiest, K. Cornelis, J. Tops, and R. Koch, “Visual modeling with a hand-held camera,” Int. J. Comput. Vision 59, 207–232 (2004). [CrossRef]  

6. T. Moons, L. V. Gool, M. Proesmans, and E. Pauwels, “Affine reconstruction from perspective image pairs with a relative object-camera translation in between,” IEEE Trans. Pattern Anal. Mach. Intell. 18, 77–83 (1996). [CrossRef]  

7. P. Hammarstedt, F. Kahl, and A. Heyden, “Affine reconstruction from translational motion under various auto-calibration constraints,” J. Math. Imaging Vis. 24, 245–257 (2006). [CrossRef]  

8. L. Agapito, E. Hayman, and I. Reid, “Self-calibration of rotating and zooming cameras,” Int. J. Comput. Vision 45, 107–127 (2001). [CrossRef]  

9. R. Cipolla, T. Drummond, and D. P. Robertson, “Camera calibration from vanishing points in images of architectural scenes,” in “Proc. British Machine Vision Conference,” (Nottingham, England, 1999), pp. 382–391.

10. D. Liebowitz and A. Zisserman, “Combining scene and auto-calibration constraints,” in “Proc. IEEE International Conference on Computer Vision,” (Kerkyra, Greece, 1999), pp. 293–300. [CrossRef]  

11. D. Jelinek and C. J. Taylor, “Reconstruction of linearly parameterized models from single images with a camera of unknown focal length,” IEEE Trans. Pattern Anal. Mach. Intell. 23, 767–773 (2001). [CrossRef]  

12. C. Rother and S. Carlsson, “Linear multi view reconstruction and camera recovery using a reference plane,” Int. J. Comput. Vision 49, 117–141 (2002). [CrossRef]  

13. M. Wilczkowiak, P. Sturm, and E. Boyer, “Using geometric constraints through parallelepipeds for calibration and 3D modelling,” IEEE Trans. Pattern Anal. Mach. Intell. 27, 194–207 (2005). [CrossRef]   [PubMed]  

14. E. Grossmann and J. Santos-Victor, “Least-squares 3D reconstruction from one or more views and geometric clues,” Comput. Vis. Image Und. 99, 151–174 (2005). [CrossRef]  

15. F. C. Wu, F. Q. Duan, and Z. Y. Hu, “An affine invariant of parallelograms and its application to camera calibration and 3D reconstruction,” in “Proc. European Conference on Computer Vision,” (2006), pp. 191–204.

16. L. G. de la Fraga and O. Schutze, “Direct calibration by fitting of cuboids to a single image using differential evolution,” Int. J. Comput. Vision 80, 119–127 (2009). [CrossRef]  

17. N. Jiang, P. Tan, and L.-F. Cheong, “Symmetric architecture modeling with a single image,” ACM T. Graphic. (Proc. SIGGRAPH Asia) 28 (2009). [CrossRef]  

18. F. Mai, Y. S. Hung, and G. Chesi, “Projective reconstruction of ellipses from multiple images,” Pattern Recogn. 43, 545–556 (2010). [CrossRef]  

19. K.-Y. K. Wong, G. Zhang, and Z. Chen, “A stratified approach for camera calibration using spheres,” IEEE Trans. Image Process. 20, 305–316 (2011). [CrossRef]  

20. R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Second Edition (Cambridge University Press, 2003).

21. Q.-T. Luong and T. Viéville, “Canonical representations for the geometries of multiple perspective views,” Comput. Vis. Image Und. 64, 193–229 (1996). [CrossRef]  

22. R. Hartley, “In defence of the 8-point algorithm,” in “Proc. International Conference on Computer Vision,” (Sendai, Japan, 1995), pp. 1064–1070.

23. B. K. P. Horn, H. M. Hilden, and S. Negahdaripour, “Closed form solution of absolute orientation using orthonormal matrices,” J. Opt. Soc. Am. A 5, 1127–1135 (1988). [CrossRef]

24. D. A. Forsyth and J. Ponce, Computer Vision: A Modern Approach (Prentice Hall, 2003).

References

  • View by:
  • |
  • |
  • |

  1. R. Tsai, “A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf tv cameras and lenses,” IEEE Trans. Robot. Autom. 3, 323–344 (1987).
    [Crossref]
  2. Z. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. Pattern Anal. Mach. Intell. 22, 1330–1334 (2000).
    [Crossref]
  3. J.-H. Kim and B.-K. Koo, “Convenient calibration method for unsynchronized camera networks using an inaccurate small reference object,” Opt. Express 20, 25292–25310 (2012).
    [Crossref] [PubMed]
  4. M. Pollefeys and L. V. Gool, “Stratified self-calibration with the modulus constraint,” IEEE Trans. Pattern Anal. Mach. Intell. 21, 707–724 (1999).
    [Crossref]
  5. M. Pollefeys, L. V. Gool, M. Vergauwen, F. Verbiest, K. Cornelis, J. Tops, and R. Koch, “Visual modeling with a hand-held camera,” Int. J. Comput. Vision 59, 207–232 (2004).
    [Crossref]
  6. T. Moons, L. V. Gool, M. Proesmans, and E. Pauwels, “Affine reconstruction from perspective image pairs with a relative object-camera translation in between,” IEEE Trans. Pattern Anal. Mach. Intell. 18, 77–83 (1996).
    [Crossref]
  7. P. Hammarstedt, F. Kahl, and A. Heyden, “Affine reconstruction from translational motion under various auto-calibration constraints,” J. Math. Imaging Vis. 24, 245–257 (2006).
    [Crossref]
  8. L. Agapito, E. Hayman, and I. Reid, “Self-calibration of rotating and zooming cameras,” Int. J. Comput. Vision 45, 107–127 (2001).
    [Crossref]
  9. R. Cipolla, T. Drummond, and D. P. Robertson, “Camera calibration from vanishing points in images of architectural scenes,” in “Proc. British Machine Vision Conferece,” (Nottingham, England, 1999), pp. 382–391.
  10. D. Liebowitz and A. Zisserman, “Combining scene and auto-calibration constraints,” in “Proc. IEEE International Conference on Computer Vision,” (Kerkyra, Greece, 1999), pp. 293–300.
    [Crossref]
  11. D. Jelinek and C. J. Taylor, “Reconstruction of linearly parameterized models from single images with a camera of unknown focal length,” IEEE Trans. Pattern Anal. Mach. Intell. 23, 767–773 (2001).
    [Crossref]
  12. C. Rother and S. Carlsson, “Linear multi view reconstruction and camera recovery using a reference plane,” Int. J. Comput. Vision 49, 117–141 (2002).
    [Crossref]
  13. M. Wilczkowiak, P. Sturm, and E. Boyer, “Using geometric constraints through parallelepipeds for calibration and 3D modelling,” IEEE Trans. Pattern Anal. Mach. Intell. 27, 194–207 (2005).
    [Crossref] [PubMed]
  14. E. Grossmann and J. Santos-Victor, “Least-squares 3D reconstruction from one or more views and geometric clues,” Comput. Vis. Image Und. 99, 151–174 (2005).
    [Crossref]
  15. F. C. Wu, F. Q. Duan, and Z. Y. Hu, “An affine invariant of parallelograms and its application to camera calibration and 3D reconstruction,” in “Proc. European Conference on Computer Vision,” (2006), pp. 191–204.
  16. L. G. de la Fraga and O. Schutze, “Direct calibration by fitting of cuboids to a single image using differential evolution,” Int. J. Comput. Vision 80, 119–127 (2009).
    [Crossref]
  17. N. Jiang, P. Tan, and L.-F. Cheong, “Symmetric architecture modeling with a single image,” ACM T. Graphic. (Proc. SIGGRAPH Asia)  28 (2009).
    [Crossref]
  18. F. Mai, Y. S. Hung, and G. Chesi, “Projective reconstruction of ellipses from multiple images,” Pattern Recogn. 43, 545–556 (2010).
    [Crossref]
  19. K.-Y. K. Wong, G. Zhang, and Z. Chen, “A stratified approach for camera calibration using spheres,” IEEE Trans. Image Process. 20, 305–316 (2011).
    [Crossref]
  20. R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Second Edition (Cambridge University Press, 2003).
  21. Q.-T. Luong and T. Viéville, “Canonical representations for the geometries of multiple perspective views,” Comput. Vis. Image Und. 64, 193–229 (1996).
    [Crossref]
  22. R. Hartley, “In defence of the 8-point algorithm,” in “Proc. International Conference on Computer Vision,” (Sendai, Japan, 1995), pp. 1064–1070.
  23. B. K. P. Horn, H. M. Hilden, and S. Negahdaripour, “Closed form solution of absolute orientation using orthonormal matrices,” J. Opt. Soc. Am 5, 1127–1135 (1988).
    [Crossref]
  24. D. A. Forsyth and J. Ponce, Computer Vision: A Modern Approach (Prentice Hall, 2003).

2012 (1)

2011 (1)

K.-Y. K. Wong, G. Zhang, and Z. Chen, “A stratified approach for camera calibration using spheres,” IEEE Trans. Image Process. 20, 305–316 (2011).
[Crossref]

2010 (1)

F. Mai, Y. S. Hung, and G. Chesi, “Projective reconstruction of ellipses from multiple images,” Pattern Recogn. 43, 545–556 (2010).
[Crossref]

2009 (2)

L. G. de la Fraga and O. Schutze, “Direct calibration by fitting of cuboids to a single image using differential evolution,” Int. J. Comput. Vision 80, 119–127 (2009).
[Crossref]

N. Jiang, P. Tan, and L.-F. Cheong, “Symmetric architecture modeling with a single image,” ACM T. Graphic. (Proc. SIGGRAPH Asia)  28 (2009).
[Crossref]

2006 (1)

P. Hammarstedt, F. Kahl, and A. Heyden, “Affine reconstruction from translational motion under various auto-calibration constraints,” J. Math. Imaging Vis. 24, 245–257 (2006).
[Crossref]

2005 (2)

M. Wilczkowiak, P. Sturm, and E. Boyer, “Using geometric constraints through parallelepipeds for calibration and 3D modelling,” IEEE Trans. Pattern Anal. Mach. Intell. 27, 194–207 (2005).
[Crossref] [PubMed]

E. Grossmann and J. Santos-Victor, “Least-squares 3D reconstruction from one or more views and geometric clues,” Comput. Vis. Image Und. 99, 151–174 (2005).
[Crossref]

2004 (1)

M. Pollefeys, L. V. Gool, M. Vergauwen, F. Verbiest, K. Cornelis, J. Tops, and R. Koch, “Visual modeling with a hand-held camera,” Int. J. Comput. Vision 59, 207–232 (2004).
[Crossref]

2002 (1)

C. Rother and S. Carlsson, “Linear multi view reconstruction and camera recovery using a reference plane,” Int. J. Comput. Vision 49, 117–141 (2002).
[Crossref]

2001 (2)

L. Agapito, E. Hayman, and I. Reid, “Self-calibration of rotating and zooming cameras,” Int. J. Comput. Vision 45, 107–127 (2001).
[Crossref]

D. Jelinek and C. J. Taylor, “Reconstruction of linearly parameterized models from single images with a camera of unknown focal length,” IEEE Trans. Pattern Anal. Mach. Intell. 23, 767–773 (2001).
[Crossref]

2000 (1)

Z. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. Pattern Anal. Mach. Intell. 22, 1330–1334 (2000).
[Crossref]

1999 (1)

M. Pollefeys and L. V. Gool, “Stratified self-calibration with the modulus constraint,” IEEE Trans. Pattern Anal. Mach. Intell. 21, 707–724 (1999).
[Crossref]

1996 (2)

T. Moons, L. V. Gool, M. Proesmans, and E. Pauwels, “Affine reconstruction from perspective image pairs with a relative object-camera translation in between,” IEEE Trans. Pattern Anal. Mach. Intell. 18, 77–83 (1996).
[Crossref]

Q.-T. Luong and T. Viéville, “Canonical representations for the geometries of multiple perspective views,” Comput. Vis. Image Und. 64, 193–229 (1996).
[Crossref]

1988 (1)

B. K. P. Horn, H. M. Hilden, and S. Negahdaripour, “Closed form solution of absolute orientation using orthonormal matrices,” J. Opt. Soc. Am 5, 1127–1135 (1988).
[Crossref]

1987 (1)

R. Tsai, “A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf tv cameras and lenses,” IEEE Trans. Robot. Autom. 3, 323–344 (1987).
[Crossref]

Agapito, L.

L. Agapito, E. Hayman, and I. Reid, “Self-calibration of rotating and zooming cameras,” Int. J. Comput. Vision 45, 107–127 (2001).
[Crossref]

Boyer, E.

M. Wilczkowiak, P. Sturm, and E. Boyer, “Using geometric constraints through parallelepipeds for calibration and 3D modelling,” IEEE Trans. Pattern Anal. Mach. Intell. 27, 194–207 (2005).
[Crossref] [PubMed]

Carlsson, S.

C. Rother and S. Carlsson, “Linear multi view reconstruction and camera recovery using a reference plane,” Int. J. Comput. Vision 49, 117–141 (2002).
[Crossref]

Chen, Z.

K.-Y. K. Wong, G. Zhang, and Z. Chen, “A stratified approach for camera calibration using spheres,” IEEE Trans. Image Process. 20, 305–316 (2011).
[Crossref]

Cheong, L.-F.

N. Jiang, P. Tan, and L.-F. Cheong, “Symmetric architecture modeling with a single image,” ACM T. Graphic. (Proc. SIGGRAPH Asia)  28 (2009).
[Crossref]

Chesi, G.

F. Mai, Y. S. Hung, and G. Chesi, “Projective reconstruction of ellipses from multiple images,” Pattern Recogn. 43, 545–556 (2010).
[Crossref]

Cipolla, R.

R. Cipolla, T. Drummond, and D. P. Robertson, “Camera calibration from vanishing points in images of architectural scenes,” in “Proc. British Machine Vision Conferece,” (Nottingham, England, 1999), pp. 382–391.

Cornelis, K.

M. Pollefeys, L. V. Gool, M. Vergauwen, F. Verbiest, K. Cornelis, J. Tops, and R. Koch, “Visual modeling with a hand-held camera,” Int. J. Comput. Vision 59, 207–232 (2004).
[Crossref]

de la Fraga, L. G.

L. G. de la Fraga and O. Schutze, “Direct calibration by fitting of cuboids to a single image using differential evolution,” Int. J. Comput. Vision 80, 119–127 (2009).
[Crossref]

Drummond, T.

R. Cipolla, T. Drummond, and D. P. Robertson, “Camera calibration from vanishing points in images of architectural scenes,” in “Proc. British Machine Vision Conferece,” (Nottingham, England, 1999), pp. 382–391.

Duan, F. Q.

F. C. Wu, F. Q. Duan, and Z. Y. Hu, “An affine invariant of parallelograms and its application to camera calibration and 3D reconstruction,” in “Proc. European Conference on Computer Vision,” (2006), pp. 191–204.

Forsyth, D. A.

D. A. Forsyth and J. Ponce, Computer Vision: A Modern Approach (Prentice Hall, 2003).

Gool, L. V.

M. Pollefeys, L. V. Gool, M. Vergauwen, F. Verbiest, K. Cornelis, J. Tops, and R. Koch, “Visual modeling with a hand-held camera,” Int. J. Comput. Vision 59, 207–232 (2004).
[Crossref]

M. Pollefeys and L. V. Gool, “Stratified self-calibration with the modulus constraint,” IEEE Trans. Pattern Anal. Mach. Intell. 21, 707–724 (1999).
[Crossref]

T. Moons, L. V. Gool, M. Proesmans, and E. Pauwels, “Affine reconstruction from perspective image pairs with a relative object-camera translation in between,” IEEE Trans. Pattern Anal. Mach. Intell. 18, 77–83 (1996).
[Crossref]

Grossmann, E.

E. Grossmann and J. Santos-Victor, “Least-squares 3D reconstruction from one or more views and geometric clues,” Comput. Vis. Image Und. 99, 151–174 (2005).
[Crossref]

Hammarstedt, P.

P. Hammarstedt, F. Kahl, and A. Heyden, “Affine reconstruction from translational motion under various auto-calibration constraints,” J. Math. Imaging Vis. 24, 245–257 (2006).
[Crossref]

Hartley, R.

R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Second Edition (Cambridge University Press, 2003).

R. Hartley, “In defence of the 8-point algorithm,” in “Proc. International Conference on Computer Vision,” (Sendai, Japan, 1995), pp. 1064–1070.

Hayman, E.

L. Agapito, E. Hayman, and I. Reid, “Self-calibration of rotating and zooming cameras,” Int. J. Comput. Vision 45, 107–127 (2001).
[Crossref]

Heyden, A.

P. Hammarstedt, F. Kahl, and A. Heyden, “Affine reconstruction from translational motion under various auto-calibration constraints,” J. Math. Imaging Vis. 24, 245–257 (2006).
[Crossref]

Hilden, H. M.

B. K. P. Horn, H. M. Hilden, and S. Negahdaripour, “Closed form solution of absolute orientation using orthonormal matrices,” J. Opt. Soc. Am 5, 1127–1135 (1988).
[Crossref]

Horn, B. K. P.

B. K. P. Horn, H. M. Hilden, and S. Negahdaripour, “Closed form solution of absolute orientation using orthonormal matrices,” J. Opt. Soc. Am 5, 1127–1135 (1988).
[Crossref]

Hu, Z. Y.

F. C. Wu, F. Q. Duan, and Z. Y. Hu, “An affine invariant of parallelograms and its application to camera calibration and 3D reconstruction,” in “Proc. European Conference on Computer Vision,” (2006), pp. 191–204.

Hung, Y. S.

F. Mai, Y. S. Hung, and G. Chesi, “Projective reconstruction of ellipses from multiple images,” Pattern Recogn. 43, 545–556 (2010).
[Crossref]

Jelinek, D.

D. Jelinek and C. J. Taylor, “Reconstruction of linearly parameterized models from single images with a camera of unknown focal length,” IEEE Trans. Pattern Anal. Mach. Intell. 23, 767–773 (2001).
[Crossref]

Jiang, N.

N. Jiang, P. Tan, and L.-F. Cheong, “Symmetric architecture modeling with a single image,” ACM T. Graphic. (Proc. SIGGRAPH Asia)  28 (2009).
[Crossref]

Kahl, F.

P. Hammarstedt, F. Kahl, and A. Heyden, “Affine reconstruction from translational motion under various auto-calibration constraints,” J. Math. Imaging Vis. 24, 245–257 (2006).
[Crossref]

Kim, J.-H.

Koch, R.

M. Pollefeys, L. V. Gool, M. Vergauwen, F. Verbiest, K. Cornelis, J. Tops, and R. Koch, “Visual modeling with a hand-held camera,” Int. J. Comput. Vision 59, 207–232 (2004).
[Crossref]

Koo, B.-K.

Liebowitz, D.

D. Liebowitz and A. Zisserman, “Combining scene and auto-calibration constraints,” in “Proc. IEEE International Conference on Computer Vision,” (Kerkyra, Greece, 1999), pp. 293–300.
[Crossref]

Luong, Q.-T.

Q.-T. Luong and T. Viéville, “Canonical representations for the geometries of multiple perspective views,” Comput. Vis. Image Und. 64, 193–229 (1996).
[Crossref]

Mai, F.

F. Mai, Y. S. Hung, and G. Chesi, “Projective reconstruction of ellipses from multiple images,” Pattern Recogn. 43, 545–556 (2010).
[Crossref]

Moons, T.

T. Moons, L. V. Gool, M. Proesmans, and E. Pauwels, “Affine reconstruction from perspective image pairs with a relative object-camera translation in between,” IEEE Trans. Pattern Anal. Mach. Intell. 18, 77–83 (1996).
[Crossref]

Negahdaripour, S.

B. K. P. Horn, H. M. Hilden, and S. Negahdaripour, “Closed form solution of absolute orientation using orthonormal matrices,” J. Opt. Soc. Am 5, 1127–1135 (1988).
[Crossref]

Pauwels, E.

T. Moons, L. V. Gool, M. Proesmans, and E. Pauwels, “Affine reconstruction from perspective image pairs with a relative object-camera translation in between,” IEEE Trans. Pattern Anal. Mach. Intell. 18, 77–83 (1996).
[Crossref]

Pollefeys, M.

M. Pollefeys, L. V. Gool, M. Vergauwen, F. Verbiest, K. Cornelis, J. Tops, and R. Koch, “Visual modeling with a hand-held camera,” Int. J. Comput. Vision 59, 207–232 (2004).
[Crossref]

M. Pollefeys and L. V. Gool, “Stratified self-calibration with the modulus constraint,” IEEE Trans. Pattern Anal. Mach. Intell. 21, 707–724 (1999).
[Crossref]

Ponce, J.

D. A. Forsyth and J. Ponce, Computer Vision: A Modern Approach (Prentice Hall, 2003).

Proesmans, M.

T. Moons, L. V. Gool, M. Proesmans, and E. Pauwels, “Affine reconstruction from perspective image pairs with a relative object-camera translation in between,” IEEE Trans. Pattern Anal. Mach. Intell. 18, 77–83 (1996).
[Crossref]

Reid, I.

L. Agapito, E. Hayman, and I. Reid, “Self-calibration of rotating and zooming cameras,” Int. J. Comput. Vision 45, 107–127 (2001).
[Crossref]

Robertson, D. P.

R. Cipolla, T. Drummond, and D. P. Robertson, “Camera calibration from vanishing points in images of architectural scenes,” in “Proc. British Machine Vision Conferece,” (Nottingham, England, 1999), pp. 382–391.

Rother, C.

C. Rother and S. Carlsson, “Linear multi view reconstruction and camera recovery using a reference plane,” Int. J. Comput. Vision 49, 117–141 (2002).
[Crossref]

Santos-Victor, J.

E. Grossmann and J. Santos-Victor, “Least-squares 3D reconstruction from one or more views and geometric clues,” Comput. Vis. Image Und. 99, 151–174 (2005).
[Crossref]

Schutze, O.

ACM T. Graphic. (1)

N. Jiang, P. Tan, and L.-F. Cheong, “Symmetric architecture modeling with a single image,” ACM T. Graphic. (Proc. SIGGRAPH Asia) 28 (2009).

Comput. Vis. Image Und. (2)

E. Grossmann and J. Santos-Victor, “Least-squares 3D reconstruction from one or more views and geometric clues,” Comput. Vis. Image Und. 99, 151–174 (2005).

Q.-T. Luong and T. Viéville, “Canonical representations for the geometries of multiple perspective views,” Comput. Vis. Image Und. 64, 193–229 (1996).

IEEE Trans. Image Process. (1)

K.-Y. K. Wong, G. Zhang, and Z. Chen, “A stratified approach for camera calibration using spheres,” IEEE Trans. Image Process. 20, 305–316 (2011).

IEEE Trans. Pattern Anal. Mach. Intell. (5)

D. Jelinek and C. J. Taylor, “Reconstruction of linearly parameterized models from single images with a camera of unknown focal length,” IEEE Trans. Pattern Anal. Mach. Intell. 23, 767–773 (2001).

M. Wilczkowiak, P. Sturm, and E. Boyer, “Using geometric constraints through parallelepipeds for calibration and 3D modelling,” IEEE Trans. Pattern Anal. Mach. Intell. 27, 194–207 (2005).

Z. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. Pattern Anal. Mach. Intell. 22, 1330–1334 (2000).

M. Pollefeys and L. V. Gool, “Stratified self-calibration with the modulus constraint,” IEEE Trans. Pattern Anal. Mach. Intell. 21, 707–724 (1999).

T. Moons, L. V. Gool, M. Proesmans, and E. Pauwels, “Affine reconstruction from perspective image pairs with a relative object-camera translation in between,” IEEE Trans. Pattern Anal. Mach. Intell. 18, 77–83 (1996).

IEEE Trans. Robot. Autom. (1)

R. Tsai, “A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses,” IEEE Trans. Robot. Autom. 3, 323–344 (1987).

Int. J. Comput. Vision (4)

C. Rother and S. Carlsson, “Linear multi view reconstruction and camera recovery using a reference plane,” Int. J. Comput. Vision 49, 117–141 (2002).

L. Agapito, E. Hayman, and I. Reid, “Self-calibration of rotating and zooming cameras,” Int. J. Comput. Vision 45, 107–127 (2001).

M. Pollefeys, L. V. Gool, M. Vergauwen, F. Verbiest, K. Cornelis, J. Tops, and R. Koch, “Visual modeling with a hand-held camera,” Int. J. Comput. Vision 59, 207–232 (2004).

L. G. de la Fraga and O. Schutze, “Direct calibration by fitting of cuboids to a single image using differential evolution,” Int. J. Comput. Vision 80, 119–127 (2009).

J. Math. Imaging Vis. (1)

P. Hammarstedt, F. Kahl, and A. Heyden, “Affine reconstruction from translational motion under various auto-calibration constraints,” J. Math. Imaging Vis. 24, 245–257 (2006).

J. Opt. Soc. Am. A (1)

B. K. P. Horn, H. M. Hilden, and S. Negahdaripour, “Closed-form solution of absolute orientation using orthonormal matrices,” J. Opt. Soc. Am. A 5, 1127–1135 (1988).

Opt. Express (1)

Pattern Recogn. (1)

F. Mai, Y. S. Hung, and G. Chesi, “Projective reconstruction of ellipses from multiple images,” Pattern Recogn. 43, 545–556 (2010).

Other (6)

F. C. Wu, F. Q. Duan, and Z. Y. Hu, “An affine invariant of parallelograms and its application to camera calibration and 3D reconstruction,” in Proc. European Conference on Computer Vision (2006), pp. 191–204.

R. Cipolla, T. Drummond, and D. P. Robertson, “Camera calibration from vanishing points in images of architectural scenes,” in Proc. British Machine Vision Conference (Nottingham, England, 1999), pp. 382–391.

D. Liebowitz and A. Zisserman, “Combining scene and auto-calibration constraints,” in Proc. IEEE International Conference on Computer Vision (Kerkyra, Greece, 1999), pp. 293–300.

D. A. Forsyth and J. Ponce, Computer Vision: A Modern Approach (Prentice Hall, 2003).

R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd ed. (Cambridge University Press, 2003).

R. Hartley, “In defence of the 8-point algorithm,” in Proc. International Conference on Computer Vision (1995), pp. 1064–1070.



Figures (14)

Fig. 1

Examples of imaged parallelograms in camera images formed by two sets of parallel lines with different directions: only (a) depicts the imaged parallelogram corresponding to an actual parallelogram lying on a plane in 3D.

Fig. 2

Relationship between the original H and the newly defined H_r, which is the infinite homography between the rectified images.

Fig. 3

Examples of the positions of vanishing points in a general man-made scene. Since all parallel lines are orthogonal or parallel to the ground plane, there are only three independent vanishing points. The vanishing points v2, v3, v4, v5, and v6 are collinear and lie on the vanishing line of the ground plane.

Fig. 4

The environment of the simulated experiment for performance evaluation. The arrows illustrate the systematic variation of the relevant parameters.

Fig. 5

The results from the simulated experiments analyzing the relation between the performance and the noise magnitude ((a) and (d)), the size of the parallelograms ((b) and (e)), and the angle between the two planes ((c) and (f)). [tx, ty, tz] and [X, Y, Z] indicate the position errors of the estimated cameras and parallelograms, respectively.

Fig. 6

The results of the camera parameter estimates with simulated data: the mean absolute error of the calibration parameters is shown as a function of the noise level for the various methods. The cases in which three and four vanishing points exist are indicated by 3vps and 4vps, respectively. (a) and (d) refer to f_u. (b) and (e) refer to the skew angle. (c) and (f) refer to u_0.

Fig. 7

Line extraction examples for the Plant Scene experiment presented in Section 7.2.2.

Fig. 8

Three images used in the Tower Scene experiment, together with the reconstructed model and camera poses.

Fig. 9

Four captured images for the Plant Scene experiment. (a) Image 0. (b) Image 1. (c) Image 2. (d) Image 3.

Fig. 10

Reconstructed model and camera poses for the Plant Scene experiment.

Fig. 11

Five images for the Bank of China scene experiment. (a) Image 0. (b) Image 1. (c) Image 2. (d) Image 3. (e) Image 4.

Fig. 12

Reconstructed model and camera poses for the Bank of China experiment.

Fig. 13

Ten images for the Casa da Música scene experiment. (a) Image 0. (b) Image 1. (c) Image 2. (d) Image 3. (e) Image 4. (f) Image 5. (g) Image 6. (h) Image 7. (i) Image 8. (j) Image 9.

Fig. 14

Reconstructed model and camera poses for the Casa da Música experiment.

Equations (32)


$$\mathbf{v}'_i \sim H \mathbf{v}_i, \quad \text{for } i = 1, \ldots, 4,$$
$$\lambda_x [\,1\ 0\ 0\,]^T = H_r [\,1\ 0\ 0\,]^T$$

$$\lambda_y [\,0\ 1\ 0\,]^T = H_r [\,0\ 1\ 0\,]^T,$$

$$H_r = \begin{bmatrix} \lambda_x & 0 & u \\ 0 & \lambda_y & v \\ 0 & 0 & w \end{bmatrix},$$

$$H \sim H_{r2}^{-1} H_r H_{r1}.$$

$$K = \begin{bmatrix} f_u & s & u_0 \\ 0 & f_v & v_0 \\ 0 & 0 & 1 \end{bmatrix},$$
$$H_i = [\,\mathbf{h}_{i1}\ \mathbf{h}_{i2}\ \mathbf{h}_{i3}\,] \begin{bmatrix} 1 & \cot\theta & 0 \\ 0 & a_i/r & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
$$(\mathbf{h}_{i1} \pm j\,\mathbf{h}_{i2})^T\, \omega_{ri}\, (\mathbf{h}_{i1} \pm j\,\mathbf{h}_{i2}) = 0$$

$$\mathbf{h}_{i1}^T \omega_{ri} \mathbf{h}_{i2} = 0 \quad \text{and} \quad \mathbf{h}_{i1}^T \omega_{ri} \mathbf{h}_{i1} = \mathbf{h}_{i2}^T \omega_{ri} \mathbf{h}_{i2}.$$

$$\omega_{ri} \sim \begin{bmatrix} \sin^2\theta & -(r/a_i)\sin\theta\cos\theta & \alpha \\ -(r/a_i)\sin\theta\cos\theta & r^2/a_i^2 & \beta \\ \alpha & \beta & \gamma \end{bmatrix},$$

$$K_{ri} \sim \begin{bmatrix} \sin\theta & \cos\theta & \alpha \\ 0 & (a_i/r)\sin\theta & \beta \\ 0 & 0 & \gamma \end{bmatrix},$$
$$R_I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad \text{or} \quad R_{\bar I} = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
$$H_i \sim K_{ri} [\,\mathbf{r}_1\ \mathbf{r}_2\ \mathbf{t}\,]$$

$$\mu K_{ri}^{-1} H_i = [\,\mathbf{r}_1\ \mathbf{r}_2\ \mathbf{t}\,],$$

$$\mu \begin{bmatrix} \sin\theta & 0 & \alpha \\ 0 & \sin\theta & \beta \\ 0 & 0 & \gamma \end{bmatrix} = [\,\mathbf{r}_1\ \mathbf{r}_2\ \mathbf{t}\,],$$

$$H_r \sim K_{r2} R_r K_{r1}^{-1} \sim \begin{bmatrix} 1 & 0 & u \\ 0 & a_2/a_1 & v \\ 0 & 0 & w \end{bmatrix},$$

$$H \sim H_{r2k}^{-1} H_{rk} H_{r1k}, \quad \text{for } k = 1, \ldots, m$$
$$A\,[\,H_{11}, H_{12}, \ldots, H_{33}, \lambda_{x1}, (-\lambda_{y1}), u_1, v_1, w_1, \ldots, \lambda_{xm}, (-\lambda_{ym}), u_m, v_m, w_m\,]^T = \mathbf{0},$$
$$H_{0i} = H_{ji} H_{0j}.$$
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \sim P \begin{bmatrix} \tilde{X} \\ 1 \end{bmatrix} \sim \begin{bmatrix} \mathbf{h}_1^T & t_1 \\ \mathbf{h}_2^T & t_2 \\ \mathbf{h}_3^T & t_3 \end{bmatrix} \begin{bmatrix} \tilde{X} \\ 1 \end{bmatrix}.$$

$$\begin{bmatrix} u\,\mathbf{h}_3^T - \mathbf{h}_1^T & -1 & 0 & u \\ v\,\mathbf{h}_3^T - \mathbf{h}_2^T & 0 & -1 & v \end{bmatrix} \begin{bmatrix} \tilde{X} \\ \mathbf{t} \end{bmatrix} = 0$$

$$\tilde{X} = \Sigma D,$$

$$\begin{bmatrix} (x\,\mathbf{h}_3^T - \mathbf{h}_1^T)\,\Sigma & -1 & 0 & x \\ (y\,\mathbf{h}_3^T - \mathbf{h}_2^T)\,\Sigma & 0 & -1 & y \end{bmatrix} \begin{bmatrix} D \\ \mathbf{t} \end{bmatrix} = 0.$$
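As an illustration of how a homogeneous linear system of this projection type is solved in practice, the sketch below performs standard DLT-style triangulation: it stacks two rows per view of the form $u\,\mathbf{p}_3^T - \mathbf{p}_1^T$ and $v\,\mathbf{p}_3^T - \mathbf{p}_2^T$ (with $\mathbf{p}_k^T$ the rows of the full projection matrix) and takes the SVD null vector. This is a generic sketch assuming each projection matrix $P_i = [H_i \mid \mathbf{t}_i]$ is already known; the function name and the numpy dependency are our own, not the paper's implementation.

```python
import numpy as np

def triangulate(points_2d, projections):
    """Recover a 3D point from its images: stack, per view, the rows
    u*p3 - p1 and v*p3 - p2 (pk = k-th row of the 3x4 matrix P) and
    take the SVD null vector of the stacked homogeneous system."""
    rows = []
    for (u, v), P in zip(points_2d, projections):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    X = Vt[-1]           # null vector = homogeneous 3D point
    return X[:3] / X[3]  # dehomogenize
```

With noise-free correspondences from two or more views the null space is one-dimensional and the point is recovered exactly; with noise, the smallest singular vector gives the algebraic least-squares solution.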
$$H_{EA} = \begin{bmatrix} A & \mathbf{0} \\ \mathbf{0}^T & 1 \end{bmatrix}.$$

$$\mathbf{d}_{A1}^T \Omega_A \mathbf{d}_{A2} = 0.$$

$$\mathbf{d}_{E1}^T \mathbf{d}_{E1} \,/\, \mathbf{d}_{E2}^T \mathbf{d}_{E2} = r^2$$

$$\mathbf{d}_{A1}^T \Omega_A \mathbf{d}_{A1} = r^2\, \mathbf{d}_{A2}^T \Omega_A \mathbf{d}_{A2}.$$

$$\cos\theta = \frac{\mathbf{d}_{E1}^T \mathbf{d}_{E2}}{(\mathbf{d}_{E1}^T \mathbf{d}_{E1})^{1/2} (\mathbf{d}_{E2}^T \mathbf{d}_{E2})^{1/2}}$$

$$\mathbf{d}_{A1}^T \Omega_A \mathbf{d}_{A2} = r \cos\theta\, \mathbf{d}_{A2}^T \Omega_A \mathbf{d}_{A2}.$$

$$Q^*_A = \begin{bmatrix} (\Omega_A)^{-1} & \mathbf{0} \\ \mathbf{0}^T & 0 \end{bmatrix}.$$

$$\omega_i \sim (P_i Q^*_A P_i^T)^{-1} = H_{0i}^{-T} \Omega_A H_{0i}^{-1}.$$

$$\begin{cases} (H_{0i}^{-T} \Omega_A H_{0i}^{-1})_{12} = 0 \\ (H_{0i}^{-T} \Omega_A H_{0i}^{-1})_{13} = (H_{0i}^{-T} \Omega_A H_{0i}^{-1})_{23} = 0 \\ r^2 (H_{0i}^{-T} \Omega_A H_{0i}^{-1})_{11} = (H_{0i}^{-T} \Omega_A H_{0i}^{-1})_{22}. \end{cases}$$
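The orthogonality, length-ratio, and angle constraints above are all linear in the six independent entries of the symmetric matrix $\Omega_A$, so $\Omega_A$ can be recovered up to scale as the null vector of a stacked linear system. The sketch below illustrates this idea for the direction-pair constraints $\mathbf{d}_{A1}^T \Omega_A \mathbf{d}_{A2} = 0$ and $\mathbf{d}_{A1}^T \Omega_A \mathbf{d}_{A1} = r^2\, \mathbf{d}_{A2}^T \Omega_A \mathbf{d}_{A2}$; the function name and the numpy dependency are our own, and this is a sketch of the linear-solving step, not the paper's full pipeline.

```python
import numpy as np

def omega_from_constraints(orth_pairs, ratio_triples):
    """Solve for the symmetric 3x3 matrix Omega_A (up to scale) from
    constraints linear in its 6 parameters (o11,o12,o13,o22,o23,o33):
      d1^T Omega d2 = 0                  (orthogonal direction pairs)
      d1^T Omega d1 = r^2 d2^T Omega d2  (known length-ratio pairs)"""
    def quad_row(a, b):
        # coefficients of a^T Omega b in the 6 symmetric parameters
        return np.array([a[0] * b[0],
                         a[0] * b[1] + a[1] * b[0],
                         a[0] * b[2] + a[2] * b[0],
                         a[1] * b[1],
                         a[1] * b[2] + a[2] * b[1],
                         a[2] * b[2]])
    rows = [quad_row(d1, d2) for d1, d2 in orth_pairs]
    rows += [quad_row(d1, d1) - r**2 * quad_row(d2, d2)
             for d1, d2, r in ratio_triples]
    _, _, Vt = np.linalg.svd(np.array(rows))
    o = Vt[-1]  # null vector = Omega_A up to scale
    return np.array([[o[0], o[1], o[2]],
                     [o[1], o[3], o[4]],
                     [o[2], o[4], o[5]]])
```

Once $\Omega_A$ is estimated, an upgrade matrix $A$ with $\Omega_A = A^T A$ follows from a Cholesky-type factorization, completing $H_{EA}$.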