## Abstract

In this paper, we present a novel computational imaging system using a dual off-axis color filtered aperture (DCA) for distance estimation in a single-camera framework. The DCA consists of two off-axis apertures that are covered by red and cyan color filters. The two apertures generate misaligned color channels in which the amount of misalignment of points in the image plane is a function of the distance from the camera of the corresponding points in the object plane. The primary contribution of this paper is the derivation of a mathematical model of the relationship between the color shifting values and the distance of an object from the camera when the camera parameters and the baseline distance between the two apertures in the DCA are given. The proposed computational imaging system can be implemented simply by inserting an appropriately sized DCA into any general optical system. Experimental results show that the DCA camera is able to estimate the distances of objects within a single-camera framework.

© 2013 OSA

## 1. Introduction

Over the past few decades, a variety of approaches have been proposed for the capture of three-dimensional (3-D) images or depth information because of their importance in such applications as robot vision, 3-D broadcasting, human computer interfaces, intelligent visual surveillance, and intelligent driver assistant systems [1–4]. Since conventional imaging systems only acquire intensity and color information of a 2-D projection of the 3-D world, the traditional approach for acquiring 3-D images or depth information involves the use of a two-lens camera (or two separate cameras) that captures two images of the same scene from different viewpoints [5, 6]. In spite of their wide-scale use, one of the disadvantages of a dual-lens system is its cost.

An alternative to a dual-lens or two-camera system is to attach a special 3-D lens to a single-lens camera. For example, Panasonic’s LUMIX G 12.5 mm f/12 lens allows a camera to capture two separate images having different disparities by dividing the image sensor into left-eye and right-eye images. Although this lens allows for the capture and display of 3-D images and the estimation of depth information, the resolution of the image is reduced by a factor of two.

Other approaches for distance estimation rely on various types of cues such as focus, shading, and motion. For example, depth-from-focus systems involve the measurement of the sharpness of an image at each pixel from a set of multiple images that have been captured with different focus settings [7, 8]. Alternatively, determining depth from de-focus involves the use of a pair of images with different focus settings and the measurement of the degree of de-focus blur [9, 10]. Since both of these approaches, depth from focus and depth from de-focus, require two or more images that have been acquired by a single static camera, they are not practical for real-time video processing applications. On the other hand, Zhuo has used a single de-focused image for depth estimation by measuring the amount of blur using a Gaussian gradient ratio on edges [11]. One of the difficulties, however, arises from the ambiguity that exists between de-focus blur and motion blur. A completely different set of approaches that provide highly accurate depth estimates are time-of-flight cameras or those imaging systems that use structured light [12–14]. However, since they require additional illumination sources, they do not fit into our design goal of a passive distance estimation camera.

Recently, computational cameras have been used to capture information that is not available with the use of a traditional camera. Computational imaging systems use novel optics to project rays from the light field of the scene, altered by specially designed elements, onto the image sensor [15, 16]. A computer is then used to extract the desired information from the captured image. One type of computational camera involves the modification of the geometric structure of the aperture. Coded aperture cameras, for example, intentionally generate specific image blurs and then decode the acquired image to produce an image depth map [17, 18]. Another approach is to use a pair of apertures that are located off the center of the optical axis to move projected points on the image plane as a function of the distance of an object from the camera. In astronomical telescopes, the Hartmann mask and the Scheiner disk exploit this shifting property of two off-axis apertures, which split the projected point when the object is out of focus. Dou has proposed an off-axis aperture camera that uses the aperture size as a variable to estimate depth for 3-D shape reconstruction [19]. In [20], Maik presents an auto-focusing camera that uses three off-axis apertures that are covered with red, green, and blue filters, and Kim has proposed a multi-focusing image restoration algorithm using distance estimates from three color filtered apertures [21]. Kim does not, however, establish the theoretical relationships between measured disparities and the object distance and relies, instead, on the use of pre-defined look-up tables.

In this paper, we present an analysis of the effect of an off-axis aperture, and derive a relationship that shows how points are shifted in the image plane as a function of the location of the aperture and the distance of an object from the camera. We then show how a camera that has dual off-axis apertures that are covered with color filters may be used to estimate the distance by finding the relative shifts between the projections of a point on an object through the two apertures. One of the advantages of using three color filtered apertures is that three different color shifting vectors may be estimated, one from each pair of apertures. Another advantage is that disparities along two orthogonal axes can be determined, whereas with a two-aperture camera the disparity can only be measured along one axis. With a dual-aperture camera, on the other hand, a higher resolution distance estimate is possible since, for equivalently sized apertures, the distance between the apertures may be made larger. It is also possible to make the apertures larger in a dual-aperture camera, thereby producing a brighter image. Finally, an interesting feature of the two-aperture camera is that with the appropriate color filters it is possible to create a 3-D image that may be viewed using a pair of anaglyphic glasses without any additional processing.

This paper is organized as follows. First, in Section 2, we present the basic principles of the DCA camera. In Section 3, the relationship between disparity and the distance of an object is derived. Section 4 presents some experimental results, and in Section 5 we conclude the paper.

## 2. Dual off-axis color filtered apertures

The aperture of an optical system is an opening that adjusts the amount of the light that enters the image sensor. In a traditional imaging system, the center of the aperture is generally aligned with the optical axis of the lens as shown in Fig. 1. As a result, the convergence pattern of any point on an object that is located within the plane of focus of the camera, *A _{i}*, will form a point in the image plane as shown in Fig. 1(a). However, as the object moves away from the plane of focus, the convergence pattern becomes circular with a radius that depends on the distance the object is away from the plane of focus as shown in Fig. 1(b).

If a single on-axis aperture is replaced with two off-axis apertures, then a point on an object that lies outside the plane of focus will be projected onto two different points in the image plane as shown in Fig. 2(a). As the object moves from the plane located at *A _{f}* to *A _{i}*, the two projected points gradually converge to a single point in the image plane and then, as the object continues to move towards the near-focus position, *A _{n}*, the projected points again separate but in the opposite direction. Since the distance between the two convergence patterns depends on the distance of the object from the camera, it is possible to estimate this distance if the separation Δ*y* can be determined. In order to estimate Δ*y*, the apertures are covered with two different color filters, such as red and cyan, so that the two projections will be in two different color channels as illustrated in Fig. 2(b). Note that a point on the object at the in-focus position has a convergence point where both color channels are aligned. However, if the object moves away from the in-focus position, then color misalignment occurs in the image plane. On the other hand, if the object moves from the in-focus position towards the camera, then color misalignment again occurs but in a direction opposite to that for the far-focus position. Thus, the distance of the object can be estimated from the color shift vector, which is the amount of misalignment between the color channels. In the following sections, we derive the relationships necessary to estimate the distance from the color shift vector.

## 3. Geometrical analysis of the relationship between disparity and distance

In this section, we develop the properties of a dual off-axis aperture system, and establish the mathematical model of the mutual relationship between the disparity of two acquired color channels and the physical distance of the object from the camera.

#### 3.1. Off-axis imaging

Image formation of a thin lens with a focal length *f* and an on-axis aperture centered at the vertex of the lens, **c** = (0, 0, 0), is shown in Fig. 3. If the image plane of the camera is at a distance *v*_{0} to the right of the vertex, then Gauss’ thin lens equation says that the plane of focus will be at a distance *z*_{0} where

$$\frac{1}{z_0}+\frac{1}{v_0}=\frac{1}{f}. \tag{1}$$

Similarly, if an object is located at a distance *z* from the lens, then it will be in focus if the image plane is at *v* = *fz*/(*z* − *f*). However, if the image plane is at *v*_{0}, then the object will be blurred as illustrated in Fig. 3(b), and the diameter of the circle of confusion is approximately computed as [22]

$$d_c = d\,\frac{|v_0-v|}{v}, \tag{2}$$

where *d* is the diameter of the aperture.
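These two relations are easy to check numerically. The following is an illustrative Python sketch (not code from the paper); all lengths are in millimeters:

```python
def focus_distance(z, f):
    """Image distance v at which an object at distance z is in focus,
    from Gauss' thin lens equation 1/z + 1/v = 1/f."""
    return f * z / (z - f)

def confusion_diameter(z, z0, f, d):
    """Approximate circle-of-confusion diameter (Eq. (2)) for an object at z
    when the camera is focused at z0; d is the aperture diameter."""
    v = focus_distance(z, f)    # where the object would come into focus
    v0 = focus_distance(z0, f)  # where the image plane actually sits
    return d * abs(v0 - v) / v

# An object in the plane of focus is imaged to a point (zero blur),
# and the blur grows as the object moves farther from that plane.
assert confusion_diameter(20000.0, 20000.0, 150.0, 10.0) == 0.0
assert confusion_diameter(5000.0, 20000.0, 150.0, 10.0) > \
       confusion_diameter(10000.0, 20000.0, 150.0, 10.0)
```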

Now consider the imaging system shown in Fig. 4 that has an off-axis aperture centered at **c** = (0, *c _{y}*, 0). Note that when the aperture is centered on the optical axis, the ray from the point **p** that passes through the center of the aperture will pass through the point *π _{y}*(*v*) and will be projected to the point ${\pi}_{y}^{0}({v}_{0})$ in the image plane. However, if the aperture is moved to **c** = (0, *c _{y}*, 0), then the projected point will move away from ${\pi}_{y}^{0}({v}_{0})$ to *π _{y}*(*v*_{0}), and the amount that the point moves will depend on the value of *c _{y}*. The location of the projection *π _{y}*(*v*_{0}) may be found using similar triangles as follows. Following the ray from the aperture at height *c _{y}* through the convergence point *π _{y}*(*v*) to *π _{y}*(*v*_{0}), we have

$$\frac{\pi_y(v_0)-c_y}{\pi_y(v)-c_y}=\frac{v_0}{v}.$$

Since

$$\pi_y(v)=-\frac{v}{z}\,y,$$

then

$$\pi_y(v_0)=-\frac{v_0}{z}\,y+c_y\left(1-\frac{v_0}{v}\right), \tag{6}$$

which defines the location of the projection of **p** along the *y*-axis for an aperture at **c** = (0, *c _{y}*, 0).

In a real imaging system, the aperture will not be in the plane of the lens but will be, instead, some distance *c _{z}* from the lens as illustrated in Fig. 5. Note that moving the off-axis aperture away from the lens results in an *equivalent aperture* that is a function of the location of the point **p**. Again using similar triangles, it follows that

$${c}_{y}^{\mathit{eq}}=\frac{c_y z - y\,c_z}{z-c_z}.$$

Replacing *c _{y}* in Eq. (6) with ${c}_{y}^{\mathit{eq}}$ and solving for *π _{y}*(*v*_{0}) gives

$$\pi_y(v_0)=-\frac{v_0}{z}\,y+\frac{c_y z - y\,c_z}{z-c_z}\left(1-\frac{v_0}{v}\right), \tag{9}$$

which defines the *y*-coordinate of the projection of the point **p** = (*x*, *y*, *z*) onto the image plane at *v*_{0} for an aperture at **c** = (0, *c _{y}*, *c _{z}*). By symmetry, it follows that if the aperture is at **c** = (*c _{x}*, 0, *c _{z}*), then the *x*-coordinate of the projection will be

$$\pi_x(v_0)=-\frac{v_0}{z}\,x+\frac{c_x z - x\,c_z}{z-c_z}\left(1-\frac{v_0}{v}\right).$$

Note that if **p** = (*x*, *y*, *z*_{0}), i.e., **p** is in the plane of focus, then *v* = *v*_{0} and, for any aperture **c**,

$$\pi_y(v_0)=-\frac{v_0}{z_0}\,y,$$

which is the same as the perspective projection of **p** for a pinhole camera. However, when the point **p** is not in the plane of focus, then the location of the projected point will be a function of the location of the aperture. In addition, the point **p** will generate a blurred disk around the projected point *π*(**p**) with a diameter given approximately by Eq. (2).
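This behavior can be verified numerically with a direct transcription of the off-axis projection of Eq. (9). The sketch below is illustrative (units in millimeters), not the authors' implementation:

```python
def project_y(p, c, z0, f):
    """y-coordinate of the projection of p = (x, y, z) onto the image plane
    through an off-axis aperture c = (cx, cy, cz), following Eq. (9)."""
    _, y, z = p
    _, cy, cz = c
    v = f * z / (z - f)                   # image distance for an object at z
    v0 = f * z0 / (z0 - f)                # image plane set for focus at z0
    cy_eq = (cy * z - y * cz) / (z - cz)  # equivalent aperture in the lens plane
    return -(v0 / z) * y + cy_eq * (1.0 - v0 / v)

# In the plane of focus (z = z0) the projection is independent of the aperture,
# reducing to the pinhole projection -(v0/z0) * y.
p_in = (0.0, 100.0, 20000.0)
a = project_y(p_in, (0.0, 10.0, 5.0), 20000.0, 150.0)
b = project_y(p_in, (0.0, -10.0, 5.0), 20000.0, 150.0)
assert abs(a - b) < 1e-9

# Out of the plane of focus, the aperture location moves the projected point.
p_out = (0.0, 100.0, 10000.0)
assert project_y(p_out, (0.0, 10.0, 5.0), 20000.0, 150.0) != \
       project_y(p_out, (0.0, -10.0, 5.0), 20000.0, 150.0)
```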

#### 3.2. Relative shift versus object distance

Having found the projection of a point in the image plane for an off-axis aperture, it is possible to find the difference in the projections of **p** from two displaced apertures. Specifically, suppose that one aperture is at **c**_{1} = (*c _{x}*, *c _{y}*, *c _{z}*) and another is shifted a distance Δ*c _{y}* along the *y*-axis to **c**_{2} = (*c _{x}*, *c _{y}* + Δ*c _{y}*, *c _{z}*). From Eq. (9) it then follows that the point **p**, when projected through the two apertures, will be separated by a distance Δ*y* along the *y*-axis in the image plane where

$$\Delta y=\Delta c_y\,\frac{z}{z-c_z}\left(1-\frac{v_0}{v}\right).$$

Note that if **p** is in the plane of focus, then *v* = *v*_{0} and Δ*y* = 0. However, when *z* > *z*_{0} (the point **p** lies beyond the plane of focus), then *v* < *v*_{0}, and Δ*y* < 0, whereas when *z* < *z*_{0} (the point **p** is closer to the lens than the plane of focus), then *v* > *v*_{0}, and Δ*y* > 0. Substituting the following relationships (see Eq. (1))

$$v=\frac{fz}{z-f},\qquad v_0=\frac{f z_0}{z_0-f},$$

into the expression above gives

$$\Delta y=\frac{f\,\Delta c_y\,(z_0-z)}{(z-c_z)(z_0-f)}, \tag{14}$$

which expresses the separation in terms of the object distance *z*, the plane of focus *z*_{0}, and the aperture baseline Δ*c _{y}*. As we will see in Section 3.4, Eq. (14) provides a method for estimating the distance of an object using a dual-aperture camera from the shift Δ*y*.
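The sign behavior of Eq. (14) is easy to verify with a direct transcription; this is an illustrative sketch with all quantities in millimeters:

```python
def shift_mm(z, z0, f, dcy, cz):
    """Separation (in mm) of the two projections of a point at distance z
    for apertures separated by dcy, from Eq. (14)."""
    return f * dcy * (z0 - z) / ((z - cz) * (z0 - f))

z0, f, dcy, cz = 20000.0, 150.0, 20.0, 0.0
assert shift_mm(z0, z0, f, dcy, cz) == 0.0     # in focus: no color shift
assert shift_mm(30000.0, z0, f, dcy, cz) < 0   # beyond the plane of focus
assert shift_mm(10000.0, z0, f, dcy, cz) > 0   # closer than the plane of focus
```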

#### 3.3. Conversion of shift vectors from millimeters to pixels

When *c _{z}*, *z*, and *z*_{0} are expressed in millimeters in Eq. (14), then Δ*y* will also be in millimeters. Since color channel shifts will be determined in terms of the number of pixels, to convert Δ*y* from millimeters to pixels it is necessary to know the type of image sensor used in the camera. For a camera with an *N*_{1} × *N*_{2} array of pixels and a CMOS array that is *W* × *H* mm in size, the distance between two pixels (assuming square pixels) will be

$$\alpha=\frac{W}{N_1}=\frac{H}{N_2}.$$

Dividing Eq. (14) by *α* gives the shift Δ*y* in pixels,

$$\Delta y=\frac{f\,\Delta c_y\,(z_0-z)}{\alpha\,(z-c_z)(z_0-f)}. \tag{16}$$

A plot of Δ*y* versus *z* is shown in Fig. 6 for a 150 mm lens, a plane of focus set to twenty meters, an aperture shift Δ*c _{y}* = 20 mm and *c _{z}* = 0. Note that as the object gets closer to the lens, the *rate of change* of Δ*y* versus *z* increases significantly.
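For a concrete example of the conversion, the sketch below uses assumed sensor values (a 22.2 mm wide array with 4272 horizontal pixels, typical APS-C numbers that give α ≈ 0.0052 mm; these values are illustrative, not taken from the text):

```python
W_mm, N1 = 22.2, 4272   # assumed sensor width (mm) and horizontal pixel count
alpha = W_mm / N1       # distance between pixel centers, roughly 0.0052 mm

def shift_pixels(z, z0, f, dcy, cz):
    """Color shift in pixels for an object at distance z (Eq. (16))."""
    return f * dcy * (z0 - z) / (alpha * (z - cz) * (z0 - f))

# As in Fig. 6, the shift is far larger (and changes faster) for near objects:
near = shift_pixels(2000.0, 20000.0, 150.0, 20.0, 0.0)
mid = shift_pixels(10000.0, 20000.0, 150.0, 20.0, 0.0)
assert abs(near) > abs(mid)
```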

#### 3.4. Distance estimation

Equation (14) specifies the amount of shift that occurs in the projection of a point through two apertures that are separated by a distance Δ*c _{y}* along the *y*-axis. The geometry of the apertures in the dual off-axis aperture camera is shown in Fig. 7, where the red and cyan (blue plus green) filtered apertures are separated by a distance Δ*c _{y}*. When used in a digital camera with a Bayer color filter array, the red color plane image will be formed by the light that passes through the red filtered aperture, whereas the green and blue color plane images will be formed by the light that passes through the cyan filtered aperture. Due to the shift between the two apertures, objects outside of the plane of focus of the camera will be shifted by Δ*y* between the red and green color planes and between the red and blue color planes. Therefore, if the correspondence between points in the red and green color planes or between the red and blue color planes may be found, then these correspondences may be used to estimate the distances of objects in the scene. More specifically, solving Eq. (16) for *z*, we have

$$z=\frac{f\,\Delta c_y\,z_0+\alpha\,\Delta y\,(z_0-f)\,c_z}{\alpha\,\Delta y\,(z_0-f)+f\,\Delta c_y}. \tag{17}$$

Thus, the distance *z* of a point **p** = (*x*, *y*, *z*) may be found from the color shift Δ*y* produced by the dual apertures along with some camera parameters that include the focal length, the location of the apertures, and the plane of focus of the camera. How to find these values is discussed in the following subsection.
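The inversion can be checked by a round trip through the forward model of Eq. (16); the following illustrative sketch (millimeters, arbitrary parameter values) recovers the distance exactly:

```python
def shift_pixels(z, z0, f, dcy, cz, alpha):
    """Forward model: pixel shift for an object at distance z (Eq. (16))."""
    return f * dcy * (z0 - z) / (alpha * (z - cz) * (z0 - f))

def estimate_z(dy, z0, f, dcy, cz, alpha):
    """Inverse model: object distance from a measured pixel shift (Eq. (17))."""
    a = alpha * dy * (z0 - f)
    return (f * dcy * z0 + a * cz) / (a + f * dcy)

z0, f, dcy, cz, alpha = 20000.0, 150.0, 20.0, 30.0, 0.0052
for z_true in (1000.0, 8000.0, 35000.0):
    dy = shift_pixels(z_true, z0, f, dcy, cz, alpha)
    assert abs(estimate_z(dy, z0, f, dcy, cz, alpha) - z_true) < 1e-5
```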

#### 3.5. Camera calibration

Before Eq. (17) may be used to estimate the distance of an object from the shift Δ*y*, the values of *f*, *α*, Δ*c _{y}*, *c _{z}*, and *z*_{0} need to be determined. For a fixed focal length camera, *f* will be given in the lens specification. If a zoom lens is used, an additional calibration step would be required. The value of *α* that converts shifts in millimeters to shifts in pixels may be determined from the image sensor specifications as discussed in Section 3.3. The location of the apertures along the *z*-axis, *c _{z}*, may be found using a simple calibration procedure as follows. Suppose that two objects are placed at known distances, *z*_{0} and *z*_{1}, from the camera. Focusing the camera on the object located at *z*_{0} sets the plane of focus to a known value of *z*_{0}. Finding the shift Δ*y* for the object at *z*_{1}, and solving Eq. (17) for *c _{z}*, we have

$$c_z=\frac{z_1\left[\alpha\,\Delta y\,(z_0-f)+f\,\Delta c_y\right]-f\,\Delta c_y\,z_0}{\alpha\,\Delta y\,(z_0-f)}.$$

Finally, the baseline between the two apertures, Δ*c _{y}*, may be found with a precise physical measurement or with an additional calibration step [23].

The remaining variable necessary to find the distance *z* from the shift Δ*y* is the plane of focus, *z*_{0}. One approach would be to set the camera focus to infinity (or some very large distance). When *z*_{0} ≫ 0, then the distance given in Eq. (17) may be found approximately as follows

$$z\approx c_z+\frac{f\,\Delta c_y}{\alpha\,\Delta y}.$$

Another approach is to focus the camera on an object that is at a known distance, *z*_{0}, from the camera.
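The calibration step can be simulated: generate a shift for a known ground-truth *c _{z}* through the forward model of Eq. (16), then recover *c _{z}* from that measurement. The sketch below uses illustrative parameter values in millimeters:

```python
def calibrate_cz(z1, dy, z0, f, dcy, alpha):
    """Recover the aperture position cz from a shift dy measured for an
    object at a known distance z1 (Eq. (17) solved for cz)."""
    a = alpha * dy * (z0 - f)
    return (z1 * (a + f * dcy) - f * dcy * z0) / a

# simulate a measurement with a known ground-truth cz via Eq. (16)
z0, z1, f, dcy, alpha, cz_true = 20000.0, 5000.0, 150.0, 20.0, 0.0052, 35.0
dy = f * dcy * (z0 - z1) / (alpha * (z1 - cz_true) * (z0 - f))
assert abs(calibrate_cz(z1, dy, z0, f, dcy, alpha) - cz_true) < 1e-6
```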

#### 3.6. Distance resolution

Equation (17) may be used to find the distance of an object from the shift, Δ*y*, between the two projected points generated by the dual apertures. To determine the resolution of such an estimate, note that the derivative of Δ*y* with respect to *z* is

$$\frac{d\,\Delta y}{dz}=\frac{f\,\Delta c_y\,(c_z-z_0)}{\alpha\,(z_0-f)(z-c_z)^2},$$

which specifies how quickly the shift changes as the point **p** moves in the object plane. We may then define the *resolution* to be the magnitude of the inverse of this derivative,

$$R(z,\Delta c_y)=\frac{\alpha\,(z_0-f)(z-c_z)^2}{f\,\Delta c_y\,(z_0-c_z)},$$

which is the distance an object must move to produce a one-pixel change in Δ*y*. Note that for the case in which *z*_{0} ≫ *f* and *z* ≫ *f*, the resolution is approximately

$$R(z,\Delta c_y)\approx\frac{\alpha\,z^2}{f\,\Delta c_y}, \tag{22}$$

so the resolution degrades quadratically with distance (larger values of *R*). Therefore, in order to maintain a constant resolution, as the distance *z* gets larger the focal length of the lens should increase. A plot of the resolution *R* versus *z* (in meters) is shown in Fig. 8(a) for Δ*c _{y}* = 20, *c _{z}* = 0, *f* = 150, and *α* = 0.0052 (all in millimeters). Thus, an object at fifteen meters must move 0.5 meters in order to produce a shift of one pixel. Therefore, if we can estimate the shift Δ*y* to quarter-pixel accuracy, then this would lead to a distance estimate that is accurate to within 0.125 meters. Note that, by comparison, an object at a distance of 45 meters must move 3.5 meters to produce a shift of one pixel, which means a distance estimate that is accurate to within one meter for an estimate of Δ*y* to within a quarter pixel of accuracy. Figure 8(b) shows a plot of 100 × *R*(*z*, Δ*c _{y}*)/*z*, which shows the accuracy of a distance estimate as a percentage of the distance that is being estimated. Thus, for example, at 50 meters the estimate will be accurate to within 9% for single-pixel resolution whereas for an object at 10 meters it will be accurate to within 2%.
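A small sketch of the resolution computation (illustrative values, millimeters); the quadratic growth predicted by Eq. (22) is visible in the ratio of two evaluations:

```python
def resolution(z, z0, f, dcy, cz, alpha):
    """Distance (mm) an object at z must move to change the pixel shift
    by one, i.e. the magnitude of the inverse of d(dy)/dz."""
    return alpha * (z0 - f) * (z - cz) ** 2 / (f * dcy * (z0 - cz))

z0, f, dcy, cz, alpha = 100000.0, 150.0, 20.0, 0.0, 0.0052
# Tripling the distance roughly multiplies R by nine (quadratic growth):
r15 = resolution(15000.0, z0, f, dcy, cz, alpha)
r45 = resolution(45000.0, z0, f, dcy, cz, alpha)
assert abs(r45 / r15 - 9.0) < 0.01
```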

## 4. Experimental results

In this section, we evaluate the effectiveness of the DCA camera for distance estimation and verify the analysis of the off-axis aperture imaging relationships. For the experiments, a 12.2 megapixel Canon 450D DSLR camera with an APS-C CMOS sensor was used. To estimate distances at close range (less than five meters), the camera was equipped with a Canon EF-S 18–55 mm lens, whereas for longer distances (five to twenty meters) a Tamron AF 55–200 mm lens was used. Both lenses were configured with a dual color filtered aperture with red and cyan filters. The size of the apertures and the aperture displacements Δ*c _{y}* are given in Table I, along with the specifications of the camera.

#### 4.1. Short range distance estimation

For short-range distance estimation, a white LED flashlight was used as the object. Fig. 10 shows the color shifting that results when the object is at distances between 40 cm and 500 cm when *f* = 50 mm and *z*_{0} = 115 cm. Note that when the object is in the near-focus position (*z* < 115 cm) the red channel is shifted to the left of the cyan (green and blue) channel as shown in Figs. 10(a) and 10(b), and when the object is in the far-focus position (*z* > 115 cm) the shifting is in the opposite direction as shown in Figs. 10(c)–10(h).

To estimate the distance of the object, it is necessary to estimate the color shift Δ*y*. There are many ways that the shifts may be found automatically, but here we manually determined the color shifting value by moving the red channel along the *y*-axis until the color misalignment disappeared. Fig. 11 shows the results of the registration, and Fig. 12 shows a plot of the color shifting values as a function of distance along with a plot of the relationship between Δ*y* and *z* given in Eq. (16). It is clear from this plot that there is a close agreement between the experimental and theoretical values.
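The manual registration described above can also be automated with a simple exhaustive search. The helper below is a hypothetical illustration (not the procedure used in the paper): it slides a 1-D red brightness profile against the cyan profile and keeps the shift with the smallest mean squared error over the overlapping samples:

```python
def estimate_color_shift(red, cyan, max_shift=20):
    """Integer shift s (in pixels) such that the red profile moved by s
    best matches the cyan profile; red and cyan are 1-D brightness
    profiles taken along the y-axis."""
    n = len(red)
    best_s, best_err = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        total, count = 0.0, 0
        for r in range(n):
            if 0 <= r - s < n:            # compare only overlapping samples
                total += (red[r - s] - cyan[r]) ** 2
                count += 1
        err = total / count
        if err < best_err:
            best_s, best_err = s, err
    return best_s

# synthetic check: the red profile is the cyan profile displaced by 7 pixels
g = lambda i: float((i * i) % 31)
cyan = [g(i) for i in range(64)]
red = [g(i + 7) for i in range(64)]
assert estimate_color_shift(red, cyan) == 7
```

In practice this one-dimensional search would be applied per block or per edge region, and refined to sub-pixel accuracy by interpolating the error curve around its minimum.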

#### 4.2. Long range distance estimation

To evaluate the performance of the DCA camera for objects at longer distances, the printed cross pattern shown in Fig. 9(c) was used as the object, and images were taken of this object at distances that ranged from 5 to 30 meters at one-meter intervals. Equation (22) shows that the resolution of a distance estimate decreases (*R* increases) as the distance *z* increases for a fixed focal length. Therefore, for this experiment a longer focal length lens, a Tamron AF 55–200 mm lens, was used. Fig. 13 shows a plot of the experimentally measured color shifting values versus distance for objects at distances that varied from 5 to 30 meters with a 180 mm lens. As shown in the figure, these results are again consistent with those given by Eq. (16).

#### 4.3. Remarks on the limitation of image quality

From the viewpoint of acquired image quality, the major drawback of the DCA camera is its small, fixed-size apertures. The reduced amount of incoming light through a small aperture decreases the signal-to-noise ratio, but a quantitative analysis of this aperture effect is omitted since image quality is beyond the scope of this paper. The point spread function (PSF) of the fixed aperture also affects the image quality and the accuracy of the estimated distance of an object. A quantitative analysis of the PSF can be performed by computing the blur disk in the image plane as in Eq. (2).

## 5. Conclusion

In this paper, we presented a dual off-axis color filtered aperture (DCA)-based computational imaging system that can estimate the distance of an object within a single-camera framework. The DCA camera, with its red and cyan color filtered apertures, provides the geometric disparity between two color channels in a single image. As a major contribution, we presented a mathematical model of the relationship between the color shifting value and the actual distance of the object, based on the principle that the color shifting property makes it possible to estimate distance from the geometric structure of the DCA-based lens. Another contribution is that the proposed imaging system can be implemented by simply inserting the DCA into any general optical system.

Experimental results show that the distances of objects in the scene can be estimated using the proposed mathematical model of the DCA. The proposed imaging system can be used for various photographic applications such as multi-focusing, refocusing, and depth-based image segmentation. This system can also be applied to video processing applications such as video surveillance systems, intelligent driver assistant systems, and robot vision.

## Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation (NRF) of Korea funded by the Ministry of Education, Science and Technology (2009-0081059) and by the Distinguished Foreign Professor Program of Chung-Ang University in 2011.

## References and links

**1. **Y. Lim, J. Park, K. Kwon, and N. Kim, “Analysis on enhanced depth of field for integral imaging microscope,” Opt. Express **20**, 23480–23488 (2012). [CrossRef] [PubMed]

**2. **V. Aslantas, “A depth estimation algorithm with a single image” Opt. Express **15**, 5024–5029 (2007). [CrossRef] [PubMed]

**3. **Y. Frauel and B. Javidi, “Digital three-dimensional image correlation by use of computer-reconstructed integral imaging,” Appl. Opt. **41**, 5488–5496 (2002). [CrossRef] [PubMed]

**4. **T. Poon and T. Kim, “Optical image recognition of three-dimensional objects,” Appl. Opt. **38**, 370–381 (1999). [CrossRef]

**5. **U. Dhond and J. Aggarwal, “Structure from stereo-a review,” IEEE Trans. Sys., Man, Cyber. **19**, 1498–1510 (1989). [CrossRef]

**6. **D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” Int. J. Comput. Vis. **47**, 7–42 (2002). [CrossRef]

**7. **C. Tomasi and T. Kanade, “Shape and motion from image streams under orthography: a factorization method,” Int. J. Comput. Vis. **9**, 137–154 (1992). [CrossRef]

**8. **N. Asada, H. Fujiwara, and T. Matsuyama, “Edge and depth from focus,” Int. J. Comput. Vis. **26**, 153–163 (1998). [CrossRef]

**9. **P. Favaro and S. Soatto, “A geometric approach to shape from defocus,” IEEE Trans. Pattern Anal. Mach. Intell. **27**, 406–417 (2005). [CrossRef] [PubMed]

**10. **A. Pentland, “A new sense for depth of field,” IEEE Trans. Pattern Anal. Mach. Intell. **9**, 523–531 (1987). [CrossRef] [PubMed]

**11. **S. Zhuo and T. Sim, “On the recovery of depth from a single defocused image,” in *Proceeding of International Conference on Computer Analysis of Images and Patterns* (Seville, 2011), pp. 889–897.

**12. **P. Axelsson, “Processing of laser scanner data-algorithms and applications,” ISPRS Journal of Photogrammetry and Remote Sensing **54**, 138–147 (1999). [CrossRef]

**13. **S. Nayar, M. Watanabe, and M. Noguchi, “Real-time focus range sensor,” IEEE Trans. Pattern Anal. Mach. Intell. **18**, 1186–1198 (1996). [CrossRef]

**14. **L. Zhang and S. Nayar, “Projection defocus analysis for scene capture and image display,” ACM Trans. Graphics **25**, 907–915 (2006). [CrossRef]

**15. **S. Nayar, “Computational cameras: redefining the image,” Computer **39**, 30–38 (2006). [CrossRef]

**16. **C. Zhou and S. Nayar, “Computational cameras: convergence of optics and processing,” IEEE Trans. Image Process. **20**, 3322–3340 (2011). [CrossRef] [PubMed]

**17. **C. Zhou, S. Lin, and S. Nayar, “Coded aperture pairs for depth from defocus,” in *Proceedings of IEEE International Conference on Computer Vision* (Institute of Electrical and Electronics Engineers, Kyoto, 2009), pp. 325–332.

**18. **A. Levin, R. Fergus, F. Durand, and W. Freeman, “Image and depth from a conventional camera with a coded aperture,” ACM Trans. Graphics **26**, 70–79 (2007). [CrossRef]

**19. **Q. Dou and P. Favaro, “Off-axis aperture camera: 3D shape reconstruction and image restoration,” in *Proceedings of IEEE Conference on Computer Vision and Pattern Recognition* (Institute of Electrical and Electronics Engineers, Anchorage, 2008), pp. 1–7.

**20. **V. Maik, D. Cho, J. Shin, D. Har, and J. Paik, “Color shift model-based segmentation and fusion for digital autofocusing,” J. Imaging Sci. Technol. **51**, 368–379 (2007). [CrossRef]

**21. **S. Kim, E. Lee, M. Hayes, and J. Paik, “Multifocusing and depth estimation using a color shift model-based computational camera,” IEEE Trans. Image Process. **21**, 4152–4166 (2012). [CrossRef] [PubMed]

**22. **S. Bae and F. Durand, “Defocus magnification,” Comput. Graph. Forum. **26**, 571–579 (2007). [CrossRef]

**23. **S. Lee, J. Paik, and M. Hayes, “Distance estimation with a two or three aperture SLR digital camera,” in *Proceedings of Advanced Concepts for Intelligent Vision Systems* (Poznan, Poland, 2013).