A three-dimensional optical correlator using a lens array is proposed and demonstrated. The proposed method captures three-dimensional objects using the lens array and transforms them into sub-images. Through successive two-dimensional correlations between the sub-images, a three-dimensional optical correlation is accomplished. As a result, the proposed method is capable of detecting out-of-plane rotations of three-dimensional objects as well as three-dimensional shifts.
© 2005 Optical Society of America
Most of the traditional techniques for optical information processing have dealt with two-dimensional (2D) objects or 2D images. However, three-dimensional (3D) information processing has recently attracted attention because it can reflect the real world without any loss of information. As a basic building block of 3D information processing, considerable effort has been concentrated on the 3D optical correlator. A 3D optical correlator provides a direct way to recognize and locate 3D objects distributed in a 3D space. Several schemes have been proposed to realize a 3D optical correlator using holography [1,2], the 3D optical Fourier transform [3,4], Fourier transform profilometry [5], and integral imaging [6–8]. Most of them, however, can only find the 3D shift of the reference object, not its out-of-plane rotations. Moreover, they need to be more compact in order to be practical. In this paper, we propose a 3D optical correlator using a sub-image array. With a relatively compact configuration, the proposed method is capable of detecting the out-of-plane rotations of a reference 3D object as well as its 3D shifts.
A schematic diagram of the proposed method is shown in Fig. 1. The reference and signal 3D objects are imaged by lens arrays. Each lens constituting the lens arrays forms a corresponding image of the object space. Each lens of the lens array is referred to as an elemental lens, and the image formed by the elemental lens is referred to as an elemental image. Thus, an elemental image is an ordinary picture of the object space. The captured perspective of the object and its position in the elemental image depend on the position of the corresponding elemental lens relative to the position of the object. By imaging through the lens array, these elemental images are obtained and captured by charge-coupled devices (CCDs). The captured elemental image arrays of the reference and signal objects are digitally transformed into their respective sub-image arrays. Some sub-images of the reference object are then selected and correlated with every sub-image of the signal object by means of a conventional joint transform correlator (JTC) scheme, yielding information on the 3D shift and out-of-plane rotation of the signal object with respect to the reference object.
The most distinctive part of the proposed method is that the correlation operation is performed on sub-images instead of elemental images. A sub-image is the collection of pixels at the same position in all of the elemental images, or equivalently at the same relative position with respect to the optic axes of the corresponding elemental lenses [9,10]. Figure 2 illustrates the generation of sub-images. In Fig. 2(a), five elemental lenses, and thus five elemental images, are shown. The pixels at the same location in the elemental images are collected to form the corresponding sub-image. For example, the pixels at the positions marked by blue dots in every elemental image form one sub-image, and the pixels at the green-dot positions form another. Since there are five elemental images in Fig. 2(a), each sub-image consists of five pixels. Figure 2(b) shows the 2D case. The pixels located at [1,1] in every elemental image (in Fig. 2(b), 6(H)×4(V) elemental images are shown) are collected to form the [1,1]-th sub-image, and the pixels at [i,j] form the [i,j]-th sub-image. Each sub-image in Fig. 2(b) consists of 6(H)×4(V) pixels, since there are 6(H)×4(V) elemental images.
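The elemental-to-sub-image rearrangement described above is a pure pixel permutation, so it can be sketched digitally in a few lines. The helper below is illustrative (the function name and array layout are our own assumptions, not from the paper):

```python
import numpy as np

def elemental_to_subimages(captured, nx, ny, px, py):
    """Rearrange a captured elemental-image array into a sub-image array.

    captured : 2D array of shape (ny*py, nx*px) holding an nx x ny grid of
               elemental images, each px x py pixels.
    Returns an array of shape (py, px, ny, nx): sub[v, u] is the (u, v)-th
    sub-image, built from pixel (u, v) of every elemental image.
    """
    # Split into (elemental row, pixel row, elemental col, pixel col) ...
    grid = captured.reshape(ny, py, nx, px)
    # ... then bring the pixel indices to the front: same-position pixels
    # across all elemental images now form one sub-image.
    return grid.transpose(1, 3, 0, 2)
```

For the experimental configuration below (50×50 elemental lenses, 20×20 pixels per elemental image), this rearrangement yields 20×20 sub-images of 50×50 pixels each.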
The sub-image has two useful features that can be exploited in the proposed method to realize a 3D correlation. One is the fact that each sub-image represents a specific angle at which the object is observed, regardless of the 3D position of the object. For example, in Fig. 2(a), the i-th sub-image (the collection of blue dots in Fig. 2(a)) contains the perspective of the object observed at an angle given by

θ_sub,i = tan⁻¹(y_i / g),    (1)

where y_i is the position of the i-th pixel with respect to the optic axis of the corresponding elemental lens and g is the gap between the lens array and its image plane. Note that in an ordinary imaging system the angle of observation is determined by the relative position of the imaging lens with respect to the object. This dependency of the observation angle on the object position, however, is removed in the sub-image. Figure 3 demonstrates this point. In the case of an ordinary imaging system, shown in Fig. 3(a), the captured perspective of the object changes as the object moves from position 1 to position 2. With reference to Fig. 3(a), when the object is located at position 1, the imaging lens observes the object at an angle θ_observation and the corresponding oblique perspective of the object is captured. On the contrary, when the object is located at position 2, the imaging lens faces the object at an angle of 0° and thus the center perspective of the object is captured. In the sub-image, however, the perspective of the object contained in each sub-image is the same regardless of the object shift, as shown in Fig. 3(b): the sub-image corresponding to the red pixels observes the object at an angle of 0° and the sub-image corresponding to the blue pixels observes the object at θ_sub,i for both positions 1 and 2. This angle-invariance of the sub-image makes it possible to select a certain observation angle deterministically, regardless of the object position.
Another useful property of the sub-image is that the size of the perspective is invariant, regardless of the object depth. In an ordinary imaging system, the perspective size is inversely proportional to the object depth: if the object moves farther from the imaging lens, as shown in Fig. 4(a), the object perspective in the captured image becomes smaller. In the sub-image, however, the object is sampled along parallel lines with a sampling period equal to the elemental lens pitch φ, as shown in Fig. 2, and thus the size of the object perspective in the sub-image is constant. When the object depth changes, only the position of the object perspective changes in each sub-image; the size itself does not. For example, suppose that an object whose transverse size covers 5 elemental lenses is imaged by the lens array, as shown in Fig. 4(b). The size of the object perspective in a sub-image is determined by the number of that sub-image's parallel lines that intersect the object. In Fig. 4(b), it is 5 pixels for the sub-image corresponding to the red dots and 6 pixels for the sub-image corresponding to the blue dots. When the object moves longitudinally, as shown in the lower diagram of Fig. 4(b), the number of parallel lines intersecting the object is still 5 for the red dots and 6 for the blue dots, and thus the size of the object perspective in those sub-images is unchanged. Only the position of the object perspective in the sub-image changes (by 2 pixels for the sub-image corresponding to the blue dots and 0 pixels for the sub-image corresponding to the red dots in Fig. 4(b)). This size-invariance removes the need for scale-invariant detection techniques, such as a Mellin transform, even when the signal object shifts in the depth direction.
Using these two features of the sub-image, i.e., the size-invariance and the angle-invariance, the 3D shift and the out-of-plane rotation can be detected by a JTC scheme as follows. Suppose that the reference object is located at (y_r, z_r) and the signal object is located at (y_s, z_s), as shown in Fig. 2(a). First, let us assume for simplicity that the signal object has no out-of-plane rotation, i.e., θ_y-z = 0° in Fig. 2(a). Since there is no out-of-plane rotation, the perspective of the object contained in the i-th sub-image of the signal object is the same as that contained in the i-th sub-image of the reference object. Note that this is true irrespective of where the signal object is located with respect to the reference object, due to the observation-angle invariance of the sub-image. Also note that the sizes of the perspectives in these two sub-images are the same due to the size-invariance property. The position of the perspective in the i-th sub-image is given by u_r,i = (1/φ)(y_r + z_r tan θ_sub,y-z,i) for the reference object and u_s,i = (1/φ)(y_s + z_s tan θ_sub,y-z,i) for the signal object. Their position difference Δu_r,i;s,i can be written as

Δu_r,i;s,i = u_s,i − u_r,i = (1/φ)[(y_s − y_r) + (z_s − z_r) tan θ_sub,y-z,i].    (2)
Since the i-th sub-images of the reference and signal objects contain the same perspective of the object with the same size, the position difference Δu_r,i;s,i can be detected by correlating the i-th sub-images of the reference and signal objects using the JTC. In Eq. (2), only y_s and z_s are unknown, and thus the 3D shift of the signal object can be found through two correlation operations with two different values of i.
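As a sanity check, this two-correlation solution can be inverted digitally. The sketch below assumes the peak displacement Δu = u_s,i − u_r,i is measured (in sub-image pixels) at two observation angles; the function and variable names are illustrative, not from the paper:

```python
import math

def locate_signal(y_r, z_r, phi, meas):
    """Solve Eq. (2) for the signal-object position (y_s, z_s).

    meas: two pairs (theta_i, delta_u_i), where delta_u_i is the measured
    correlation-peak displacement (in sub-image pixels) for the i-th
    sub-image pair and theta_i its observation angle in radians.
    Valid for the no-rotation case.
    """
    (t1, du1), (t2, du2) = meas
    # phi*du = (y_s - y_r) + (z_s - z_r)*tan(theta); subtracting the two
    # measurements eliminates (y_s - y_r) and isolates (z_s - z_r).
    dz = phi * (du1 - du2) / (math.tan(t1) - math.tan(t2))  # z_s - z_r
    z_s = z_r + dz
    y_s = y_r + phi * du1 - dz * math.tan(t1)
    return y_s, z_s
```

Two distinct observation angles suffice because Eq. (2) is linear in the two unknowns; using more sub-image pairs would allow a least-squares solution instead.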
When there is an out-of-plane rotation θ_y-z of the signal object, we cannot find the 3D shift by correlating the reference-object sub-image with the signal-object sub-image of the same index, because they will, in general, contain different perspectives of the object. In this case, the sub-image pair that contains the same perspective of the object should be found first; in other words, the out-of-plane rotation should be detected first. The 3D shift can then be found taking the out-of-plane rotation into account. The out-of-plane rotation angle θ_y-z of the signal object is detected by correlating one arbitrarily chosen sub-image of the reference object with every sub-image of the signal object successively. Among them, the sub-image pair yielding the strongest correlation peak satisfies θ_sub,y-z,i − θ_sub,y-z,j = θ_y-z, where θ_sub,y-z,i is the observation angle of the i-th sub-image of the reference object and θ_sub,y-z,j is that of the j-th sub-image of the signal object, since they contain the same perspective of the object. Therefore, by finding the sub-image pair that produces the strongest correlation peak, the out-of-plane rotation angle θ_y-z is detected. After θ_y-z is detected, the 3D position of the signal object can be detected by correlating two more sub-image pairs, as in the no-rotation case. In this case, however, we correlate the i-th sub-image of the reference object with the j-th sub-image of the signal object, where θ_sub,y-z,i − θ_sub,y-z,j = θ_y-z, since they contain the same perspective. The position difference Δu_r,i;s,j of the object perspectives in the i-th reference sub-image and the j-th signal sub-image is given by

Δu_r,i;s,j = u_s,j − u_r,i = (1/φ)[(y_s − y_r) + z_s tan θ_sub,y-z,j − z_r tan θ_sub,y-z,i].    (3)
Therefore, the 3D position of the signal object can be found through Eq. (3) by selecting two sub-image pairs corresponding to θ_sub,y-z,i for the reference object and θ_sub,y-z,i − θ_y-z for the signal object and measuring the positions of their correlation peaks. Figure 5 shows the overall procedure of the proposed method.
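Digitally, the rotation-detection step amounts to a brute-force correlation scan over the signal sub-images. The sketch below replaces the optical JTC with an FFT-based cross-correlation (a digital stand-in for the optical step, not the paper's implementation; names are illustrative):

```python
import numpy as np

def detect_rotation(ref_sub, signal_subs, angles, i_ref):
    """Correlate one reference sub-image with every signal sub-image and
    return the rotation estimate theta_ref - theta_best (cf. the relation
    theta_sub,i - theta_sub,j = theta_y-z in the text).

    signal_subs: sequence of 2D arrays, all the same shape as ref_sub.
    angles[j]:   observation angle of the j-th sub-image (radians).
    """
    R = np.fft.fft2(ref_sub)
    peaks = []
    for s in signal_subs:
        # Circular cross-correlation via the FFT correlation theorem;
        # the peak is normalized by the signal sub-image energy.
        corr = np.fft.ifft2(R * np.conj(np.fft.fft2(s)))
        peaks.append(np.abs(corr).max() / float((s.astype(float) ** 2).sum()))
    j_best = int(np.argmax(peaks))
    return angles[i_ref] - angles[j_best]
```

Because the sub-image perspectives are both angle- and size-invariant, the matching pair differs from the reference only by a translation, which is exactly what a correlation peak detects.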
In the detection of out-of-plane rotation, the minimum angle resolvable by the proposed method is determined by the difference between the observation angles of neighboring sub-images. Specifically, the angular resolution Δθ is given by Δθ = θ_sub,i+1 − θ_sub,i. Since the observation angle θ_sub,i of the i-th sub-image is given by Eq. (1), the angular resolution Δθ becomes

Δθ = tan⁻¹[(y_i + s)/g] − tan⁻¹(y_i/g) ≈ s/g,    (4)

where s is the pixel pitch at the image plane of the lens array and g is the gap between the lens array and the image plane, and the approximation holds near the optic axis. The angular range Ω that can be detected by the proposed method is determined by the range of the sub-image observation angles (the range of θ_sub). Since y_i is restricted to −φ/2 < y_i < φ/2, the angular range Ω becomes

Ω = 2 tan⁻¹(φ/2g),    (5)

where Eq. (1) is used.
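For concreteness, the resolution and range can be evaluated with the parameters quoted in the experimental section (1 mm lens pitch, 20 pixels per elemental image); the gap g is not stated there, so taking it as roughly the 3.3 mm focal length is our assumption:

```python
import math

# Assumed geometry: observation angle of a sub-image ~ atan(y_i / g),
# with g the lens-array-to-image-plane gap (assumed ~ the focal length).
g = 3.3           # mm, gap (assumption: ~ the 3.3 mm focal length)
phi = 1.0         # mm, elemental lens pitch (from the experiment)
s = phi / 20.0    # mm, pixel pitch: 20 x 20 pixels per elemental image

dtheta = math.atan(s / g)                 # angular resolution near the axis
omega = 2.0 * math.atan(phi / (2.0 * g))  # full detectable angular range

print(f"resolution ~ {math.degrees(dtheta):.2f} deg, "
      f"range ~ {math.degrees(omega):.1f} deg")
```

Under these assumptions the resolution comes out below 1° and the range at well over 10°, consistent with the 0° to 6° rotations tested in the experiment.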
3. Experimental results
In the experiment, we used a 3D object consisting of two man-dolls, longitudinally separated by 30 mm, as the reference and signal objects, as shown in Fig. 6. The two man-dolls can be considered two extreme ends of one 3D object. We captured this 3D object with a lens array consisting of 50×50 rectangular elemental lenses with a 3.3 mm focal length and a 1 mm lens pitch. The captured elemental images of the reference and signal objects were transformed into sub-image arrays, as shown in Fig. 6. In our experiment, one elemental image consisted of 20×20 CCD pixels, and thus 20×20 sub-images were generated for each of the reference and signal objects. In Fig. 6, we can see that each sub-image contains the corresponding perspective of the reference or signal object. A noteworthy point is that the sizes of the perspectives are the same in the sub-images of the reference and signal objects, although their longitudinal positions z_r and z_s are different, which confirms the size-invariance of the sub-image. In order to verify the out-of-plane rotation detection capability, we fixed the reference object at (x_r, y_r, z_r) = (0 mm, 0 mm, 25 mm) and rotated the signal object located at (x_s, y_s, z_s) = (5 mm, 0 mm, 40 mm) with θ_x-z = 0°, 2°, 4°, and 6° and θ_y-z = 0°. One reference sub-image was correlated with each sub-image of the signal object. In the correlation operation, the joint power spectrum (JPS) was obtained optically using a He-Ne laser, a spatial light modulator (SLM) with a 0.036 mm pixel pitch, and a CCD; it was then Fourier transformed digitally to produce the correlation peak. Figure 7 shows an example of the JPS captured on the CCD and the correlation peak obtained by Fourier transforming the JPS digitally. The correlation peak intensity profile over the sub-image index of the signal object is plotted in Fig. 8. The correlation peak was normalized by the energy of the signal sub-image. In Fig. 8, it can be seen that the peak intensity profile correctly reflects the out-of-plane rotation of the signal object. By finding the signal sub-image that yields the maximum correlation peak, the out-of-plane rotation angle (θ_x-z, θ_y-z) can be detected. In our experiment, the detection was somewhat prone to errors due to the insufficient resolution of each elemental image and the changes in illumination and shading with rotation. However, as shown in Fig. 8, the measured profile still follows the correct tendency, and the accuracy could be enhanced considerably if more robust optical correlation methods were used.
After the rotation angle is detected, the 3D location of the signal object is detected by finding the correlation-peak positions of two sub-image pairs using Eq. (3). Figures 9 and 10 show the detected positions of the correlation peak for various locations of the signal object, with and without out-of-plane rotation. Equations (2) and (3) indicate that the slope of the Δu-vs.-tan θ_sub,x-z,i line corresponds to (z_s − z_r) and that its Δu-offset corresponds to (x_s − x_r) and z_s. The experimental results shown in Figs. 9 and 10 demonstrate this point clearly. The slope increases as the signal object moves farther from the reference object longitudinally (see the second to sixth graphs in Figs. 9 and 10), and the Δu-offset reflects (x_s − x_r) in the case of no rotation (see the first graph in Fig. 9) or (x_s − x_r) and z_s in the case of rotation (see every graph in Fig. 10). This provides convincing support for the 3D shift-detection capability of the proposed method.
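This slope/offset reading of the graphs is simply a line fit to Eq. (2). A least-squares sketch for the no-rotation case, written in the y–z notation of Eq. (2) (the function name and measurement format are illustrative assumptions):

```python
import numpy as np

def fit_shift(tan_thetas, delta_us, y_r, z_r, phi):
    """Fit the Eq. (2) line
        delta_u = [(y_s - y_r) + (z_s - z_r) * tan(theta)] / phi
    over several sub-image pairs and read off the signal-object position.
    """
    # Least-squares line fit: slope = (z_s - z_r)/phi, offset = (y_s - y_r)/phi.
    slope, offset = np.polyfit(tan_thetas, delta_us, 1)
    return y_r + phi * offset, z_r + phi * slope  # (y_s, z_s)
```

Fitting over many sub-image pairs in this way averages out peak-localization noise, which the minimal two-pair solution cannot do.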
We proposed a novel optical 3D correlator using a sub-image array. The proposed method can detect the out-of-plane rotation of an object as well as its 3D shift by correlating the sub-images of the reference and signal objects using a JTC. The feasibility of the proposed method was verified experimentally.
This work was supported by the Information Display R&D Center, one of the 21st Century Frontier R&D Programs funded by the Ministry of Commerce, Industry and Energy of Korea.
References and links
1. T.-C. Poon and T. Kim, “Optical image recognition of three-dimensional objects,” Appl. Opt. 38, 370–381 (1999). [CrossRef]
2. B. Javidi and E. Tajahuerce, “Three-dimensional object recognition by use of digital holography,” Opt. Lett. 25, 610–612 (2000). [CrossRef]
4. J. Rosen, “Three-dimensional joint transform correlator,” Appl. Opt. 37, 7538–7544 (1998). [CrossRef]
5. J. Esteve-Taboada, D. Mas, and J. Garcia, “Three-dimensional object recognition by Fourier transform profilometry,” Appl. Opt. 38, 4760–4765 (1999). [CrossRef]
6. O. Matoba, E. Tajahuerce, and B. Javidi, “Real-time three-dimensional object recognition with multiple perspectives imaging,” Appl. Opt. 40, 3318–3325 (2001). [CrossRef]
8. J.-H. Park, S. Jung, H. Choi, and B. Lee, “Detection of the longitudinal and the lateral positions of a three-dimensional object using a lens array and joint transform correlator,” Opt. Mem. Neur. Net. 11, 181–188 (2002).
9. C. Wu, A. Aggoun, M. McCormick, and S. Y. Kung, “Depth extraction from unidirectional image using a modified multi-baseline technique,” in Conference on Stereoscopic Display and Virtual Reality Systems IX, A. J. Woods, J. O. Merritt, S. A. Benton, and M. T. Bolas, eds., Proc. SPIE 4660, 135–145 (2002).
10. J.-H. Park, S. Jung, H. Choi, Y. Kim, and B. Lee, “Depth extraction by use of a rectangular lens array and one-dimensional elemental image modification,” Appl. Opt. 43, 4882–4895 (2004). [CrossRef] [PubMed]