A perspective projection model for prism-based stereovision is presented in this paper. The prism is considered as a single optical lens. By analyzing each plane individually and then combining the results, an affine transformation matrix that expresses the relationship between an object point and its image is derived. Next, the perspective projection model between an object point and its image is established. Methods for model parameter calibration, distortion correction and 3D reconstruction are also provided. The proposed method is based on optical geometry and can be extended to a multi-ocular prism. Experimental results are presented to demonstrate the reliability and accuracy of our method, which can be used in high-precision morphological measurement.
© 2015 Optical Society of America
Stereovision is an important branch of computer vision. It aims to recover the shape, size or location of an object, in a non-contact manner, from two or more images taken from different viewpoints. Stereovision methods can be classified in a variety of ways. According to the number of cameras used, they can be divided into multi-camera based methods and single-camera based methods.
The multi-camera based method employs two or more cameras to capture images from different views. It is currently used for space exploration, panoramic reconstruction, robot navigation, and other volume measurements of static targets and surface reconstruction. However, multi-camera visual measurement still has some inherent problems: 1) the use of multiple associated cameras leads to high cost; 2) to obtain high accuracy, the baseline must be long enough, resulting in a large system size; 3) the working conditions of the individual cameras cannot be completely consistent, which affects the measurement accuracy; 4) the capture times of the cameras cannot be synchronized exactly, which causes errors in dynamic target measurement. To overcome these limitations, the concept of single-camera stereovision has been proposed and has attracted many researchers.
Single-camera stereovision systems can be classified into two categories. One recovers depth by exploiting known cues such as the environment or the movement of the camera [4–6]. The other recovers depth through the reflective or refractive properties of optical devices such as mirrors or prisms. Among these methods, prism-based stereovision has the simplest system structure. Because only one camera is used, it reduces the cost and eliminates the multi-camera synchronization problem. The system is also more compact, so it can be used in applications with space constraints. These advantages give the system good application potential, and it has developed rapidly in recent decades.
In this paper, a novel perspective projection model for prism-based stereovision is proposed. In contrast to previous studies, an affine transformation matrix is used to establish the relationship between an object point and its virtual image. The perspective projection model of prism-based stereovision can then be expressed directly by inserting the affine matrix into a pinhole camera model. Methods for model parameter calibration, distortion correction and 3D reconstruction are also provided. The proposed model connects prism imaging with the homography between the model plane and its image; it not only expresses the projection procedure of prism-based stereovision, but also solves the 3D reconstruction problem with high precision.
2. Literature review
Lee and Kweon first proposed a prism-based single-lens stereovision system in 2000. In their study, a ray tracing method was used to calculate the relationship between an object point and its image point. This groundbreaking method broadened the application of prisms in common optical systems and laid a foundation for using a prism as an imaging lens. However, they ignored the nonlinear distortion of the image and derived the transformation model by approximate calculation, which affected the accuracy of 3D reconstruction.
Lim and Xiao introduced a new understanding of the system by proposing the concept of a “virtual camera.” They assumed that one image captured by this system is divided into two halves, which is equivalent to two images captured by a two-camera system. This idea converted the single-lens stereovision system using a prism into a conventional stereovision system using two cameras, which made the system easier to understand and implement. Furthermore, they extended this system from binocular to multi-ocular, and proposed an epipolar rectification method, a stereo matching method and an error analysis method. Since then, the concept of the virtual camera has been developed and used widely.
Such systems have many significant advantages compared with multi-camera stereovision systems, such as potential cost-effectiveness, compactness, low computational cost, and elimination of the complicated camera synchronization process. This multitude of benefits extends the scope of application of the system, and researchers have therefore devoted considerable attention to it. The single prism has been extended to a prism microarray, which made the system smaller. After that, an optical lens was designed by making use of a four-ocular prism. To solve the problem of nonlinear distortion, a distortion rectification method that adds an auxiliary camera was proposed, and a concave mirror has been used for discrepancy compensation. To increase the accuracy of the system, the color aberration of the prism was analyzed, with the conclusion that the nonlinear magnification and color aberration become severe as the prism angle increases. A transformation matrix that expresses the relationship between an object point and its image was derived by geometrical optics, and a position estimation method for a multi-ocular prism was proposed. Recently, this method was improved, which enhanced the precision of this kind of stereovision system.
With the improvement of the theory, prism-based stereovision has been applied in many fields. A free-form prism was designed for microscopic measurement, with which a complete three-dimensional image can be captured at one time; the method has been applied to particle image velocimetry; other work focused on recognizing partially occluded objects; and a prism has been integrated into a smartphone together with an app that performs 3D reconstruction on the phone.
Through the efforts of many researchers, prism-based single-camera stereovision has developed rapidly. It has progressed from approximate evaluation to accurate calculation [11,19,21]; from two views [8,9] to multiple views [10,15,20]; and from theoretical study [8–20] to various applications [21–25]. The proposed methods cover the main processing flow of stereovision, such as calibration methods [8–10,19–21], stereo matching algorithms [11,12] and 3D reconstruction techniques [14,15,21–23]. However, there is still no complete model that describes the projection of an object point from world coordinates to image coordinates. Building on this earlier research, we introduce a perspective projection model for prism-based stereovision, with which the homography between the model plane and its image can be expressed and calculated easily.
3. Virtual image formed by prism
In this section, the prism is considered as an optical lens, and its imaging model as well as imaging procedure will be derived so that the refraction of light by prism could be expressed through an object point and its virtual image.
3.1 Refraction of an arbitrary plane
As shown in Fig. 1, Π is an arbitrary plane in 3D space that separates two different media with refractive indices n and n’. P is a point in front of this plane. PB is a line segment perpendicular to plane Π. PC is an incident ray intersecting Π at point C, and the backward extension of its refracted ray intersects PB at point P’. According to Snell’s law, α is the angle of incidence and α’ is the angle of refraction.
The relationship between points P and P’ can be expressed as
Equation (1) can be rewritten as
According to Eq. (2), a point refracted through an arbitrary plane can be expressed by an affine transformation matrix A and a translation vector t. Moreover, matrix A can also be regarded as a scaling matrix along the N direction with scale factor q.
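The mapping described above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes the plane is written as N·X + d = 0 and takes q = tan α / tan α’, which follows from Fig. 1 because |BC| = |PB| tan α = |P’B| tan α’; the paper's own expression for q may take a different form. All function names are hypothetical.

```python
import math


def scale_factor(n_in, n_out, alpha):
    """Assumed scale factor q = tan(alpha) / tan(alpha') for a nonzero
    incidence angle alpha (radians), with n_in*sin(alpha) = n_out*sin(alpha')."""
    alpha_out = math.asin(n_in * math.sin(alpha) / n_out)  # Snell's law
    return math.tan(alpha) / math.tan(alpha_out)


def refraction_affine(normal, d, q):
    """Return (A, t) such that P' = A P + t for the plane normal . X + d = 0."""
    length = math.sqrt(sum(c * c for c in normal))
    n = [c / length for c in normal]  # unit normal N
    d = d / length                    # rescale d so the plane is unchanged
    # A = I + (q - 1) N N^T scales by q along N and leaves
    # directions inside the plane unchanged.
    A = [[(1.0 if i == j else 0.0) + (q - 1.0) * n[i] * n[j]
          for j in range(3)] for i in range(3)]
    t = [(q - 1.0) * d * n[i] for i in range(3)]
    return A, t


def apply_affine(A, t, p):
    """Evaluate A p + t for a 3-vector p."""
    return [sum(A[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]


# Plane z = 2 (normal (0, 0, 1), d = -2) with an illustrative q = 1.5.
A, t = refraction_affine([0.0, 0.0, 1.0], -2.0, 1.5)
virtual = apply_affine(A, t, [1.0, 2.0, 5.0])   # point 3 units from the plane
on_plane = apply_affine(A, t, [1.0, 2.0, 2.0])  # points on the plane do not move
```

In this form the distance of the virtual point from the plane is q times that of the object point, while its foot point B on the plane is unchanged.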
3.2 Imaging model of multi-ocular prism
A multi-ocular prism used in a stereovision system is composed of one back plane and multiple inclined planes. If occlusion is not considered, the number of views in the image plane is equal to the number of inclined planes. Therefore, no matter how many inclined planes a prism has, a ray from an object point in 3D space reaches its image point after exactly two refractions through the prism. From Eq. (1), a 3D point and its virtual images formed by the two refractions of the prism can be expressed by combining these two processes.
For the sake of simplicity, only the refraction of light by a binocular prism in the horizontal (O−XZ) plane is discussed. As shown in Fig. 2, P is an object point in 3D space, P’ is the virtual image point of P formed by refraction at the back plane of the prism, and Pl and Pr are the virtual image points of P’ formed by refraction at the left and right inclined planes, respectively. From Eq. (2), the virtual image points P’, Pl and Pr can be expressed as
The imaging model of a bi-prism can be represented by Eq. (5). The virtual image of any object point can be regarded as the combination of an affine transformation and a translation, and this result can also be extended to a multi-ocular prism.
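Combining the two refractions amounts to composing two affine maps. The sketch below shows this composition in general form; the matrices and vectors are arbitrary placeholder values chosen for easy checking, not prism parameters from the paper, and the function names are hypothetical.

```python
def mat_mul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(A, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]


def apply_affine(A, t, p):
    """Evaluate A p + t."""
    return [a + b for a, b in zip(mat_vec(A, p), t)]


def compose(A2, t2, A1, t1):
    """Single affine map equal to applying (A1, t1) first, then (A2, t2):
    A2 (A1 P + t1) + t2 = (A2 A1) P + (A2 t1 + t2)."""
    A = mat_mul(A2, A1)
    t = [a + b for a, b in zip(mat_vec(A2, t1), t2)]
    return A, t


# Placeholder maps: first refraction (back plane), second (one inclined plane).
A_back = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 2.0]]
t_back = [0.0, 0.0, 1.0]
A_incl = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 3.0]]
t_incl = [0.0, 0.0, -1.0]

P = [1.0, 2.0, 3.0]
two_steps = apply_affine(A_incl, t_incl, apply_affine(A_back, t_back, P))
A_total, t_total = compose(A_incl, t_incl, A_back, t_back)
one_step = apply_affine(A_total, t_total, P)
```

Because the composition of affine maps is again affine, each additional inclined plane of a multi-ocular prism only adds one more (A, t) pair composed with the back-plane map.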
4. Perspective projection model for prism based stereovision
A coordinate system is attached to a pinhole camera, with its origin O coinciding with the pinhole, and a bi-prism is placed in front of the image plane. Without loss of generality, the prism can be placed arbitrarily, provided that the two virtual images are captured by the camera image plane simultaneously, as shown in Fig. 3.
4.1 Perspective projection model
A 3D point is denoted by P and a 2D point is denoted by p. Using homogeneous coordinates for both, the relationship between a 3D point P and its image projection p is given by
As shown in Fig. 3, points Pl and Pr are the virtual images of P formed by the prism in camera coordinates, and pl and pr are the projections of P on the image plane. From Eqs. (5), (6) and (7), we have
So far, the projection model of prism-based stereovision can be regarded as the combination of an intrinsic matrix, a scale-translate matrix and a rotate-translate matrix. The scaling factors of these matrices are discussed in the next subsection.
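The chain of three matrices can be sketched as follows. This is an illustration under stated assumptions rather than the paper's Eq. (9): it assumes the affine part (A, t) acts in camera coordinates after the rigid transform (R, T), and that K is a standard 3×3 pinhole intrinsic matrix. The function name and all numerical values are hypothetical.

```python
def mat_vec(M, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def project(K, A, t, R, T, P):
    """Pixel coordinates of world point P, assuming p ~ K (A (R P + T) + t)."""
    cam = [c + s for c, s in zip(mat_vec(R, P), T)]     # world -> camera frame
    virt = [c + s for c, s in zip(mat_vec(A, cam), t)]  # object point -> virtual image
    x, y, w = mat_vec(K, virt)                          # pinhole projection
    return x / w, y / w


I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
zero = [0.0, 0.0, 0.0]
# Illustrative intrinsics: focal length 800 px, principal point (960, 540).
K = [[800.0, 0.0, 960.0], [0.0, 800.0, 540.0], [0.0, 0.0, 1.0]]
P = [1.0, 2.0, 4.0]

no_prism = project(K, I3, zero, I3, zero, P)
# A placeholder affine part that doubles the depth of the virtual point.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 2.0]]
with_prism = project(K, A, zero, I3, zero, P)
```

With the identity affine part the function reduces to the ordinary pinhole model, which makes the effect of the prism term easy to isolate.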
4.2 Scaling factor
From Eqs. (1) and (9), an image point is determined by an affine transformation (with scaling factor q) and a translation. Therefore, the parameter q affects the result of the perspective projection. In fact, when the prism is considered as an optical lens, the position of the virtual image changes with the viewpoint, and this change is closely related to the scaling factor q.
According to Section 3.1, the scaling factors for matrix MRB in Eq. (9) can be defined as
Eq. (10) can be written as
From Eq. (11) we have
By making use of the vector form of Snell’s law, the scaling factors qr and qb are given by (the detailed derivation can be found in Appendix A):
5. Parameter calibration and 3D reconstruction of prism based stereovision
5.1 Model parameter calibration
From Eq. (9), the projection from an object point to the image plane through the prism can be formulated as the combination of an intrinsic matrix, an affine matrix and an extrinsic matrix. Since the intrinsic matrix of the camera can be obtained easily by single-camera calibration, the calibration problem reduces to finding the affine matrix and the extrinsic matrix. In Euclidean coordinates, the projection model of the bi-prism based stereovision system can be written as
They can be regarded as projection matrices and calculated by a standard single-camera calibration method, such as the Direct Linear Transformation. We can then decompose the extrinsic matrix and obtain the following equations:
5.2 Distortion correction
As discussed in Section 4.2, the scaling factor of the projection model is determined by the position of the 3D object point, so different object points have different projection matrices, which leads to image distortion. The form of the distortion is determined by the parameter k shown in Eq. (13). According to Fig. 3, if we set and , then k can be written as
After simplification, we have
Thus, the form of the distortion is a series of elliptic curves. We approximate the distortion caused by the prism with a radial distortion model, which can be written as
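For illustration, a common polynomial radial distortion model and its numerical inverse are sketched below. The two-coefficient form and the names k1 and k2 are assumptions borrowed from standard camera calibration practice; the exact expression used in the paper may differ. Coordinates are normalized image coordinates measured from the distortion center.

```python
def distort(x, y, k1, k2=0.0):
    """Apply the assumed model x_d = x (1 + k1 r^2 + k2 r^4), same for y."""
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * factor, y * factor


def undistort(xd, yd, k1, k2=0.0, iterations=20):
    """Invert the model by fixed-point iteration.

    Converges for mild distortion, which is the case this sketch assumes.
    """
    x, y = xd, yd  # initial guess: the distorted point itself
    for _ in range(iterations):
        r2 = x * x + y * y
        factor = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / factor, yd / factor
    return x, y


xd, yd = distort(0.1, 0.2, k1=-0.1)   # illustrative coefficient
xu, yu = undistort(xd, yd, k1=-0.1)   # recovers the original point
```

Because the two halves of a bi-prism image come from different virtual cameras, one set of coefficients per view would be a natural choice, although that too is an assumption here.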
5.3 3D reconstructions
Once the scaling factor has been determined, all object points share the same projection transformation, and the coordinates of 3D points in world coordinates can be calculated uniquely. According to Fig. 3, for a pair of corresponding points in the two views, let
The equations of the refracted rays in world coordinates are
Thus the coordinate of 3D points can be determined by solving Eq. (24).
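With noisy image points the two rays generally do not intersect exactly. A common substitute, shown here as a sketch and not necessarily the solution method behind Eq. (24), is to take the midpoint of the shortest segment joining the two rays. Each ray is given by an origin and a direction; the function name is hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(o1, d1, o2, d2, eps=1e-12):
    """Midpoint of the shortest segment between lines o1 + s*d1 and o2 + t*d2."""
    w = [p - q for p, q in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("rays are parallel; depth cannot be recovered")
    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    p1 = [o + s * k for o, k in zip(o1, d1)]
    p2 = [o + t * k for o, k in zip(o2, d2)]
    return [(u + v) / 2.0 for u, v in zip(p1, p2)]


# Two rays from illustrative virtual camera centers that meet at (0, 0, 5).
point = triangulate_midpoint([-1.0, 0.0, 0.0], [1.0, 0.0, 5.0],
                             [1.0, 0.0, 0.0], [-1.0, 0.0, 5.0])
```

The parallel-ray check corresponds to the zero-disparity case, in which no depth can be recovered.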
6. Experimental results and analysis
The camera used in the experiments is a Prosilica GT1910 AVT CCD camera with a Computar M2518-MPW low-distortion lens. The image resolution is 1920 × 1080. The three bi-prisms with different angles used in the experiments are made of K9 optical glass (refractive index 1.51630). The camera and prism are mounted on a mechanical stand fitted with vernier calipers along the X, Y and Z axes as well as a rotational stage. The relative position between the camera and the prism is known and adjustable, as shown in Fig. 4.
To validate the proposed projection model, three experiments were conducted. In the first experiment, the reprojection errors of the points on the calibration board were calculated to evaluate the precision of the proposed model. In the second experiment, the reconstruction errors of discrete 3D points were analyzed to estimate the performance of the proposed model for 3D measurement. Finally, a complicated surface was reconstructed in 3D to estimate the performance of the proposed model for 3D morphology reconstruction.
6.1 Reprojection error
Single camera calibration. In this experiment, a circular board was used for feature point extraction. The calibration board carried a 20 × 20 array of feature points, with a horizontal and vertical spacing of 15 mm. Because there is only one camera in our stereovision system, the camera calibration step is straightforward. The well-known calibration algorithm by Zhang was implemented to calculate the intrinsic and external parameters of the camera.
Calculating the projection matrix for each view. The bi-prism was placed in front of the camera, and the images captured by our system were divided into two segments. The calibration board again provided the feature points in 3D space, and three bi-prisms with angles of 10°, 15° and 20° were used so that the experimental results could be compared. From Eq. (16), we can obtain the projection matrix for each view of the prism. An initial guess of the parameters can be obtained with the technique described in Section 3.1. After that, the imaging matrix and the external matrix can be decomposed.
Reprojection error. Figure 5 shows the reprojection errors of the bi-prism based stereovision system with prism angles of 10°, 15° and 20°, respectively. The circular board was placed at fixed positions parallel to the camera image plane with a sampling interval of 500 mm, so that the average reprojection errors at different distances can be compared. As shown in these three figures, the accuracy of the proposed method is sufficient over a large working range of prism-based stereovision systems. However, the accuracy decreased as the distance between the camera's focal point and the object increased. In addition, the reprojection errors for the small-angle prism are lower than those for the large-angle prism.
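The average reprojection error used in this comparison can be computed as the mean Euclidean distance, in pixels, between detected feature points and the points predicted by the calibrated model. The sketch below assumes both sets are given as matching lists of (u, v) pairs; the function name and the sample values are hypothetical.

```python
import math


def mean_reprojection_error(observed, reprojected):
    """Mean pixel distance between matching observed and reprojected points."""
    if len(observed) != len(reprojected) or not observed:
        raise ValueError("point lists must be non-empty and of equal length")
    total = sum(math.hypot(u - ur, v - vr)
                for (u, v), (ur, vr) in zip(observed, reprojected))
    return total / len(observed)


# Two illustrative points: one matches exactly, one is off by (3, 4) pixels.
error = mean_reprojection_error([(100.0, 50.0), (203.0, 104.0)],
                                [(100.0, 50.0), (200.0, 100.0)])
```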
6.2 3D reconstruction of discrete points
The 3D reconstruction of discrete points can be used in stereovision measurement. We therefore designed a depth measurement experiment and a length measurement experiment to test and evaluate the feasibility of the proposed method. The calibration board was reused because its feature points can be extracted easily and accurately. The reconstruction programs were implemented in VC2008 with OpenCV 2.3.1, and all results were acquired under the same conditions.
- (1) Depth measurement. The calibration board was set at known positions, and the distance from the board to the image plane of the camera was measured by laser with a distance tolerance of 0.01 mm. During the experiment, the images of the calibration board were captured in sequence. As shown in Fig. 6, we compared our depth measurement results with the real distances of the feature points. There were 400 sampling points for each distance, and we use the average error to evaluate the performance of our system with prism angles of 10°, 15° and 20°, respectively.
- (2) Length measurement. The spacing of the feature points on the calibration board was known, so the data acquired during the depth measurement were reused for length measurement. To test and evaluate the feasibility of our method, we compared our results with those of a two-camera method, as shown in Tables 1-3.
- (3) Error analysis. If we do not consider the machining and assembly errors, one unavoidable problem with the biprism is chromatic aberration. Chromatic aberration arises because the refractive index of a material differs for light of different colors. In this study, the effect of chromatic aberration was neglected and a compromise value of the refractive index was employed in the proposed projection model. However, the chromatic aberration becomes more severe as the angle θ increases. According to Eq. (17), the angle mainly affects the scaling factor q. For the error caused by chromatic aberration when the prism angle is changed, we have
Thus, as θ increases, the chromatic aberration becomes larger and, as a result, the measurement error increases. However, when θ is close to 0, the prism reduces to a thin glass plate, and the refraction effect that produces the two views vanishes. In that case there is virtually no disparity that can be measured, and hence no depth. Thus, the value of θ cannot be too small.
In our work, another important parameter, the field of view (FOV), was also found to be influenced by θ. As shown in Fig. 7, the FOV of the camera is 2β, the angle of the prism is θ, and the angles between the surface normal and the refracted rays that radiate from the boundaries of the two half image planes are α1 and α2, respectively. The FOV of the prism-based stereovision system can be regarded as the combination of α1 and α2. If the refractive index is n, we have
Thus, the FOV of the prism based stereovision system is mainly decided by the range of β and θ.
Figure 8 shows the FOV of the single-lens prism-based stereovision system for increasing values of θ. When θ is equal to 0, the FOV is the same as that of a single camera (Fig. 8(a)); as θ becomes larger, the common view of the two half image planes becomes larger too. In practical applications, the object always needs to be captured by both image planes. However, in the situation depicted in Fig. 8(b), the common view of the two image planes is too small to be useful for stereovision. Thus, the FOV of the system should fall in the range between Fig. 8(c) and Fig. 8(f).
Regarding the distance between the object and the camera, the experimental results of depth measurement show that the average error of the recovered depth increases with the distance between the object and the camera. This is because the chromatic aberration is proportional to the distance, so the depth recovery of distant objects is more susceptible to the effect of chromatic aberration. However, note from Fig. 8 the relationship between the prism angle θ and the intended measurement distance. Configurations such as those shown in Fig. 8(b)-(c) are used to measure larger depths. They have small θ, corresponding to a divergent FOV. Conversely, for measuring shorter depths, the configurations shown in Fig. 8(d)-(f) with large values of θ are used, and they correspond to a convergent FOV.
Considering these factors comprehensively, we can draw the following conclusions:
For a commonly used single-lens prism-based stereovision system, the value of θ should be as small as possible in order to reduce the reconstruction error caused by chromatic aberration; but to make sure that both image planes can capture the object completely, θ should not fall below a minimum value, which can be obtained from the equation α1 = α2.
For measuring smaller depths, a system with a larger value of θ should be used, but θ should not exceed a maximum value, which can be determined from the equation α1 = 0. When θ is larger than this maximum value, the effect of chromatic aberration becomes pronounced enough to make the measurement system ineffective.
6.3 Morphology reconstruction
Figure 9 shows an input biprism image and the recovered shape for a textured model of Willis Tower. The distance from the front of the object to the camera was 1100 mm and the prism angle used was 10°. A simple cross-correlation technique was used to find the corresponding points.
The height of the model is 27.4 mm. The error in height between the recovered shape and the true shape is 1.5 mm.
Another morphology reconstruction result can be found in Fig. 10. In this experiment, the distance from the front of the object to the camera was 850 mm and the prism angle used was 10°. The height of the model is 16.5 mm. The error in height between the recovered shape and the true shape is 1.16 mm. The experimental results agree well with the original model, which demonstrates that the proposed method can be applied not only to three-dimensional measurement but also to morphology retrieval.
Prism-based single-lens stereovision systems have many advantages compared with traditional systems that use two or more cameras. Building on earlier research, we have introduced a new projection model for stereovision systems using a prism. Our method is based on projective geometry and translates the refraction of the prism into an affine transformation matrix. Moreover, the method can easily be extended to a multi-ocular prism. The experimental results show that the method is efficient and robust, and that it converges well with small reprojection errors. It has also been demonstrated that a 3D map of the structure can be reconstructed from a sequence of biprism stereo images.
Snell’s law can be expressed in a number of ways, one of which is the vector form. Let the incident ray be specified by the unit vector V and the refracted ray by the direction V’. At the point where the ray intersects the surface between two homogeneous and isotropic media with refractive indices n and n’, let the surface normal have direction N. Then V’ can be expressed in terms of V and N by the following equation
When the media are specified, n and n’ are constants, and the equation can be written simply as
According to Eq. (12), if ,then we have
Finally, qL and qB can be derived in the same way.
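The vector form of Snell's law used in this appendix can be checked numerically. The sketch below uses the standard expression V’ = μV + (sqrt(1 − μ²(1 − (V·N)²)) − μ(V·N))N with μ = n/n’. The convention that N is oriented along the direction of propagation (V·N > 0) is an assumption and may differ from the sign convention of the paper; the function name is hypothetical.

```python
import math


def refract(v, normal, n_in, n_out):
    """Unit direction of the refracted ray from the vector form of Snell's law."""
    len_v = math.sqrt(sum(c * c for c in v))
    len_n = math.sqrt(sum(c * c for c in normal))
    v = [c / len_v for c in v]
    n = [c / len_n for c in normal]
    cos_i = sum(a * b for a, b in zip(v, n))
    if cos_i < 0.0:  # flip N so that it points along the propagation direction
        n = [-c for c in n]
        cos_i = -cos_i
    mu = n_in / n_out
    k = 1.0 - mu * mu * (1.0 - cos_i * cos_i)  # equals cos^2 of the refraction angle
    if k < 0.0:
        raise ValueError("total internal reflection: no refracted ray")
    coeff = math.sqrt(k) - mu * cos_i
    return [mu * a + coeff * b for a, b in zip(v, n)]


# Ray entering glass (illustrative index 1.5) from air at 30 degrees incidence.
alpha = math.radians(30.0)
out = refract([math.sin(alpha), 0.0, math.cos(alpha)], [0.0, 0.0, 1.0], 1.0, 1.5)
sin_out = out[0]  # sine of the refraction angle, expected sin(30 deg) / 1.5
```

The tangential component of the result is μ times that of V, which is the scalar law n sin α = n’ sin α’, and the result stays a unit vector.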
First, we consider the prism to be placed in an ideal position: (1) the prism has consistent optical properties and its back plane is parallel to the real image plane, (2) the apex of the prism is perpendicular to the camera optical axis, and (3) the X and Y axes of the camera coordinate system are parallel to those of the image plane coordinate system and the Z axis coincides with the camera’s optical axis. Then, the surface normals of the prism can be written as
From Eq. (1), we have
In most cases, these conditions can hardly be achieved in practical applications. There will be some alignment errors, and the results will not be accurate if the above assumptions are used. If we use a rotation matrix R to denote the skew of the prism, then the real surface normal of each plane is
So the affine transformation changes to
According to Eq. (16), let ER denote the first three rows of HR; then
It could also be acquired that
Because R is an orthogonal matrix, so and ARBR are similar matrices, then they have the same eigenvalue. Let
Then the rotation matrix R can be expressed as
Given n 3D object points and m inclined planes of the prism, there should be n × m points in the image plane. Assuming that the image points are corrupted by independent and identically distributed noise, the maximum likelihood estimate of the prism position can be obtained by minimizing the following function: Eq. (9). By making use of the derivation above, this function can be written as
The authors are grateful for the financial support from the National Natural Science Foundation of China (Grant Nos. 61501101, 61472069, 61402089), the Fundamental Research Funds for the Central Universities (Grant No. N130319002), and the general program of the education department of Liaoning province (Grant No. L2014086).
References and links
1. J. Y. Rau, J. P. Jhan, and Y. C. Hsu, “Analysis of oblique aerial images for land cover and point cloud classification in an urban environment,” IEEE Trans. Geosci. Rem. Sens. 53(3), 1304–1319 (2015).
2. F. Huang, “Sensitivity analysis of pose recovery from multi-center panoramas,” Multimedia Tools Appl. 72(2), 1193–1213 (2014).
4. X. Cao and H. Foroosh, “Camera calibration and light source orientation from solar shadows,” Comput. Vis. Image Underst. 105(1), 60–72 (2007).
7. A. Goshtasby and W. A. Gruver, “Design of a single-lens stereo camera system,” Pattern Recognit. 26(6), 923–937 (1993).
8. D. H. Lee and I. S. Kweon, “A novel stereo camera system by a biprism,” IEEE Trans. Robot. 16(5), 528–541 (2000).
9. K. B. Lim and Y. Xiao, “Virtual stereovision system: new understanding on single-lens stereovision using a biprism,” J. Electron. Imaging 14(4), 41–52 (2005).
10. Y. Xiao and K. B. Lim, “A prism-based single-lens stereovision system: from trinocular to multi-ocular,” Image Vis. Comput. 25(11), 1725–1736 (2007).
11. D. L. Wang, K. B. Lim, and W. L. Kee, “Geometrical approach for rectification of single-lens stereovision system with a triprism,” Mach. Vis. Appl. 24(4), 821–833 (2013).
12. K. B. Lim, W. L. Kee, and D. L. Wang, “Virtual camera calibration and stereo correspondence of single-lens bi-prism stereovision system using geometrical approach,” Signal Process. Image Commun. 28(9), 1059–1071 (2013).
13. W. L. Kee, K. B. Lim, Z. L. Tun, and B. Yading, “New understanding on the effects of angle and position of biprism on single-lens biprism stereovision system,” J. Electron. Imaging 23(3), 033005 (2014).
15. W. S. Sun, C. L. Tien, C. Y. Chen, and D.-C. Chen, “Single-lens camera based on a pyramid prism array to capture four images,” Opt. Rev. 20(2), 145–152 (2013).
16. K. Genovese, L. Casaletto, J. A. Rayas, V. Flores, and A. Martinez, “Stereo-digital image correlation (DIC) measurements with a single camera using a biprism,” Opt. Lasers Eng. 51(3), 278–285 (2013).
18. X. Y. Li and R. Wang, “Analysis and optimization of the stereo system with a biprism adapter,” in Proceedings of International Conference on Optical Instruments and Technology: Optical Systems and Modern Optoelectronic Instruments, Y. T. Wang, ed. (Academic, 2009), pp. 75061V.
20. X. Cui, K. B. Lim, Y. Zhao, and W. L. Kee, “Single-lens stereovision system using a prism: position estimation of a multi-ocular prism,” J. Opt. Soc. Am. A 31(5), 1074–1082 (2014).
21. L. F. Wu, J. G. Zhu, and H. M. Xie, “A modified virtual point model of the 3D DIC technique using a single camera and a bi-prism,” Meas. Sci. Technol. 25(11), 115008 (2014).
23. Q. Gao, H. P. Wang, and J. J. Wang, “A single camera volumetric particle image velocimetry and its application,” Sci. China Technol. Sci. 55(9), 2501–2510 (2012).
24. M. Zhang, Y. R. Piao, J. J. Lee, D. H. Shin, and B. G. Lee, “Visualization of partially occluded 3D object using wedge prism-based axially distributed sensing,” Opt. Commun. 313, 204–209 (2014).
25. J. B. M. Numhauser and Z. Zalevsky, “Stereovision imaging in smart mobile phone using add on prisms,” 3D Research 5(1), 1–10 (2014).
26. Z. Y. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. Pattern Anal. Mach. Intell. 22(11), 1330–1334 (2000).