In this paper, we present a novel volumetric computational reconstruction (VCR) method for an improved 3D object correlator. VCR basically consists of magnification and superposition, and we introduce scale-variant magnification as a new technique for VCR. To motivate the technique, we discuss an interference problem among elemental images in VCR: a large magnification causes interference among the magnified elemental images when they are superimposed, and this interference limits the resolution of the reconstructed images. To overcome the problem, we propose a method that calculates the minimum magnification factor for which VCR remains valid. Magnification by this new factor enables the proposed method to reconstruct resolution-enhanced images. To confirm its feasibility, we apply the method to a VCR-based 3D object correlator. Experimental results indicate that our method outperforms the conventional VCR method.
©2008 Optical Society of America
1. Introduction
Integral imaging has been an attractive technique [1–6] for autostereoscopic three-dimensional (3D) display because it offers full parallax, a continuous viewing angle, and full-color display. In general, an integral imaging system consists of two parts: pickup and reconstruction. In the pickup part, rays emanating from a 3D object and passing through a lenslet array are recorded digitally using an imaging sensor such as a charge-coupled device (CCD). The recorded rays are called elemental images. For the reconstruction part, two kinds of methods have been studied for volumetric 3D image display. One is optical reconstruction [2–6] and the other is volumetric computational reconstruction (VCR) [7–13]. The optical reconstruction method, in which 3D images are optically reconstructed by use of a lenslet array and a display panel, suffers from problems caused by physical limitations of optical devices, such as diffraction and aberration. To overcome these problems, a VCR method based on a pinhole array model has been introduced [7, 9–10].
Recently, 3D object correlation and recognition using integral imaging have been actively investigated [12–15]. For example, Frauel and Javidi suggested a method to estimate the longitudinal depth of 3D objects. Park et al. proposed a 3D optical correlator using sub-images derived from the elemental images. One of the authors presented a resolution-enhanced 3D object correlator that uses VCR to extract the location data of 3D objects.
In the conventional VCR method shown in Fig. 1(a), however, some serious problems remain to be solved. For example, the reconstructed images contain artifacts, and the computational load is high because of the large magnification factor at far distances. In addition, images reconstructed by the VCR method suffer from blurring even when they are reconstructed at the position where the 3D object is located [7–9]. This is because the overlapping process of the conventional VCR method causes interference among adjacent pixels when the magnified elemental images are superimposed on each other. We call this interference interpixel interference. It prevents VCR from reconstructing high-resolution images of 3D objects. Note that there is another interference problem in the pickup process, caused by the imperfect condition of the lenslet array and the camera. In this paper, we assume that the pickup process is ideal, that is, the input elemental images have no noise or distortion arising from the lenslet array and the camera, and we consider only the interpixel interference in VCR. To the best of our knowledge, the interpixel interference in VCR has not been reported in the literature, whereas the interference in the pickup is well known.
To overcome the interpixel interference problem, we propose scale-variant magnification (SVM) for VCR and apply the resulting VCR method to a 3D object correlator to evaluate its performance. We first explain the interpixel interference problem in VCR. To avoid it, we provide an algorithm for scale-variant magnification in VCR. Minimizing the interpixel interference enables VCR to provide resolution-enhanced plane images. To confirm the feasibility of the proposed method, we apply it to a VCR-based 3D object correlator and show various experimental results.
2. Principle of VCR
The principle of the conventional VCR method is shown in Fig. 1(a). VCR reconstructs a plane image on an output plane located at distance z, which we call the reconstructed output plane (ROP) in this paper. First, each elemental image is inversely projected through its corresponding pinhole and digitally magnified by a factor M=z/g, where z is the distance between the virtual pinhole array and the ROP, and g is the distance between the pinhole array and the elemental image plane. Second, the magnified elemental images are superimposed on the ROP as shown in Fig. 1(c). Because VCR uses a computer to construct plane images, elemental images are treated as discrete signals. Let d be the sampling interval of the input elemental images. The number of samples in each magnified elemental image must then increase so that the sampling interval on the ROP remains d regardless of the distance z. With this magnification, physical properties such as object size and length on the ROP are the same as those on the original image plane. One can choose a different sampling interval on the ROP, wider or narrower (that is, resample on the ROP, not in the elemental images). However, changing the sampling interval causes a mismatch between the sizes of the original and reconstructed objects. In addition, a wider sampling interval inevitably loses information, and a narrower interval only increases the computational load without enhancing image quality. These consequences should be understood before the sampling interval is changed. To completely reconstruct a plane image on the ROP at distance z, magnification and superposition must be repeated for all elemental images.
Finally, normalization with respect to the plane image on ROP is required to eliminate the granular noise [9–10]. An iterative process for the ROPs of other locations produces a sequence of the plane images along the z-direction.
3. Proposed VCR method using SVM
Figure 1(b) illustrates the principle of the proposed VCR method using SVM (scale-variant magnification). VCR consists of three processes: magnification, superposition, and normalization. The main difference between the conventional VCR method and ours is that our method uses a new scale-variant magnification factor Md(z) to reduce interference between adjacent pixels in the superposition process. That is, each pixel of the elemental images is magnified by a factor of Md, so that the number of magnified pixels is Md and their total size is d×Md, as shown in Fig. 1(b). In most cases, Md is much smaller than the conventional magnification factor, so our method uses less-magnified elemental images. This is illustrated in Fig. 1(d): a smaller magnification factor results in fewer overlapped elemental images on the ROP. As the magnified pixels of all elemental images are accumulated on the ROP, the empty spaces (blank pixels) disappear. Thus, no empty space remains on the ROP, and our method minimizes the interference by reducing the overlapped area of the elemental images.
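The three processes (magnification, superposition, and normalization) can be sketched in one dimension as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: positions are measured in units of the sampling interval d, each pinhole is assumed to sit over pixel N//2 of its elemental image, magnification is done by simple pixel replication, and z is assumed to be an integer multiple of g. The function name `vcr_1d` and the parameter `width` are hypothetical; `width` equal to M=z/g mimics the conventional method, and `width` equal to Md mimics the proposed one.

```python
from collections import defaultdict


def vcr_1d(elemental, z, g, width):
    """1D sketch of VCR: replicate each pixel `width` times on the ROP,
    superimpose all elemental images, then normalize by the overlap count.

    elemental: list of K lists, each holding the N pixel values of one image.
    Returns (plane, count): dicts keyed by ROP sample index (units of d).
    """
    if z % g != 0:
        raise ValueError("this sketch assumes z is an integer multiple of g")
    m = z // g  # conventional magnification factor M = z/g
    acc = defaultdict(float)  # accumulated intensity per ROP sample
    count = defaultdict(int)  # number of contributions per ROP sample
    for k, image in enumerate(elemental):
        n_pix = len(image)
        for n, value in enumerate(image):
            s = n - n_pix // 2         # pixel offset from the pinhole axis
            y = k * n_pix - m * s      # where the ray lands (inverse projection)
            for j in range(width):     # magnification by pixel replication
                acc[y + j] += value    # superposition
                count[y + j] += 1
    plane = {i: acc[i] / count[i] for i in acc}  # normalization
    return plane, dict(count)
```

With `width=m`, adjacent magnified pixels of one elemental image tile the ROP without gaps, and several elemental images overlap at each sample. With a smaller `width`, fewer contributions land on each sample, which is the effect the text attributes to SVM.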
To understand the interference problem in the conventional VCR method, we consider a computational integral imaging system for a point object, as shown in Fig. 2. Figure 2(a) shows a point object recorded with a lenslet array, together with its elemental images. Now consider a pixel-to-pixel mapping, without magnification, from the pixels of the elemental images to pixels on the ROP, as shown in Fig. 2(b). We call the mapped pixels on the ROP effective pixels. The number of effective pixels equals the number of pixels in the elemental images because the pixel-to-pixel mapping is one-to-one. However, the extent of the ROP increases with the distance z, so empty areas may appear on the ROP. In the conventional VCR method, the effective pixels are magnified by a factor M=z/g and the magnified pixels are then superimposed on the ROP, as shown in Fig. 2(c). In this case, the reconstructed image of the point object is obtained on the ROP at the position where the point object is placed, and the superposition of the magnified pixels fills the empty space smoothly. However, the effective pixels interfere with each other because of the large magnification. Note that effective pixels represent the component pixels of 3D objects and carry their exact intensity values. They should therefore not be overlapped if a high-resolution image is to be obtained.
To avoid this interference problem, we design a new magnification factor Md(z) as a function of the distance z such that the magnified pixels do not interfere with adjacent effective pixels. This implies that Md(z) is determined by the distance between the two nearest effective pixels. Figure 2(d) shows how the new factor Md(z) is determined.
To calculate the factor Md(z) exactly with respect to z, we consider a ray diagram showing the rays that emanate from the elemental images and pass through the pinholes, as shown in Fig. 3(a). To formulate the relation between the elemental images and a plane image on the ROP, we use a ray analysis based on the ABCD matrix. We present a one-dimensional analysis to simplify the formulation of SVM; the extension to two dimensions is straightforward. In Fig. 3(a), we assume the use of a pinhole array and consider only the rays that start from pixels of the elemental images and pass through each pinhole. The elemental image region corresponding to each pinhole has N pixels. Let us denote the index of the pixels in each elemental image region by n=1,2,…, N and the index of the elemental images and their corresponding pinholes by k=1,2,…, K. Now consider a ray emanating from the n-th pixel of the k-th elemental image and passing through the k-th pinhole. When the ray reaches the ROP at z, it is regarded as a point (vector) on the output plane at z, as shown in Fig. 3(a). Here, we define the vertical component (or height) of the point vector (or ray on the ROP) by
where p is the size of each elemental image.
Now, to calculate the distance between two effective pixels on the ROP, we consider two rays on the ROP as shown in Fig. 3(a). The first ray comes from the n1-th pixel of the k1-th elemental image and the second ray comes from the n2-th pixel of the k2-th elemental image. Then the vertical distance between the two points (two rays on the ROP) is given by
The normalized absolute distance between the two points on ROP is obtained by
where 0<|k2−k1|<K and 0<|n2−n1|<N. Using Eq. (3), we obtain values of α for all k2, k1, n2, and n1 over the entire set of elemental images. If α=0, the two rays are incident at the same point on the ROP. This case is unrelated to the magnification, so we exclude α=0 from the minimization. Over all remaining α, the minimum value αmin is the minimum distance between two adjacent effective pixels (or rays) on the ROP; in other words, it is the size of the empty space to be filled. Therefore, we use the minimum value as the new magnification factor, Md=αmin.
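The displayed forms of Eqs. (1)–(3) are not reproduced in this text. One reconstruction consistent with the geometry described above is sketched below. It rests on assumptions that the text does not confirm: the k-th pinhole is taken to lie on the axis of its elemental image region at height (k−1/2)p, the pixel size is d=p/N, and the ray through a pinhole is inverted and scaled by z/g. The sign and offset conventions in the original equations may differ, although any constant offset cancels in the difference.

```latex
% Assumed reconstruction, not the authors' typeset equations.
% d = p/N is the pixel size; z/g is the conventional magnification M.
\begin{align}
y_{k,n}(z) &= \left(k-\tfrac{1}{2}\right)p
              - \frac{z}{g}\left(n-\frac{N+1}{2}\right)d, \tag{1}\\
\Delta y &= y_{k_2,n_2}(z)-y_{k_1,n_1}(z)
          = (k_2-k_1)\,p - \frac{z}{g}\,(n_2-n_1)\,d, \tag{2}\\
\alpha &= \frac{|\Delta y|}{d}
        = \left|(k_2-k_1)\,N - \frac{z}{g}\,(n_2-n_1)\right|. \tag{3}
\end{align}
```

As a check on this form, for N=34, g=3 mm, and z=51 mm it gives α=|34(k2−k1)−17(n2−n1)|, whose smallest nonzero value is 17, in agreement with the value M=Md=17 reported for z=51 mm in Section 4.1.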
Note that the factor Md is obtained by a minimization algorithm, so it is hard to provide an analytical formula for it as in the conventional method. However, the process of determining the SVM factor is simple. Figure 4 shows the proposed minimization process. Its basic concept is to calculate all possible values of α (except α=0) and then choose the minimum among them. See Fig. 4 for the details of the minimization process.
Let us discuss the meaning and effect of the new magnification. For example, Fig. 3(b) shows the factor Md with respect to distance for N=34 and K=30. The result indicates that Md is much smaller than the conventional magnification factor M except at z=51 mm, an exceptional case in which many rays overlap at the same point. Another characteristic is that Md is unity at many distances. This implies that the effective pixels cover the entire region of the ROP. Consequently, our method sometimes requires no magnification process at all, because a perfect plane image on the ROP is obtained by the pixel-to-pixel mapping alone. We call the condition Md=1 perfect coverage. Figure 3(b) shows that there are many cases of perfect coverage. The visual quality of a plane image on the ROP improves when the perfect coverage condition is satisfied. In the next section, we discuss experimental results, including the improvement due to the perfect coverage condition.
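The minimization can be sketched as an exhaustive search. The example below is an illustration under an assumption, not the procedure of Fig. 4: since the displayed form of the normalized distance is not reproduced in this text, the sketch takes α=|(k2−k1)N−(z/g)(n2−n1)|, which follows from a pinhole model with pixel size d=p/N. The function name `svm_factor` is hypothetical, and exact rational arithmetic is used so that the test for α=0 is reliable.

```python
from fractions import Fraction


def svm_factor(z, g, n_pixels, n_images):
    """Smallest nonzero normalized distance alpha between two effective
    pixels on the ROP, searched over all index differences.

    Assumed form: alpha = |dk * N - (z / g) * dn|, with
    0 < |dk| < K and 0 < |dn| < N.
    """
    m = Fraction(z) / Fraction(g)  # conventional magnification M = z/g
    best = None
    # Negating both dk and dn leaves alpha unchanged, so dk > 0 suffices.
    for dk in range(1, n_images):
        for dn in range(-(n_pixels - 1), n_pixels):
            if dn == 0:
                continue
            alpha = abs(dk * n_pixels - m * dn)
            if alpha == 0:
                continue  # rays landing on the same point are excluded
            if best is None or alpha < best:
                best = alpha
    return best
```

Under this assumed form, `svm_factor(51, 3, 34, 30)` evaluates to 17 and `svm_factor(21, 3, 34, 30)` evaluates to 1, which agrees with the values the text reports for z=51 mm (M=Md=17) and z=21 mm (perfect coverage).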
4. Experimental results
4.1 Computational experiments
We conducted computational experiments with three test images to objectively evaluate the performance of the proposed VCR method. Figure 5(a) illustrates the experimental setup for the computational tests. The pinhole array is composed of 30×30 pinholes and is located at z=0 mm. The interval between pinholes is 1.08 mm, and the gap g between the elemental images and the pinhole array is 3 mm. Three images named Lena, Car, and Cow are used as test images, as shown in Fig. 5(b). The size of each image is 1020×1020 pixels. After a test image is placed at distance z, its elemental images are synthesized by computational pickup based on simple ray-geometric analysis. There are 30×30 elemental images in total, each with 34×34 pixels. The synthesized elemental images are then applied to the two VCR methods. In our VCR method (where N=34 and K=30), the elemental images are magnified by the factor Md(z) shown in Fig. 3(b) and superimposed on the ROP at z to reconstruct a plane image. Finally, the normalization process is applied to the reconstructed plane image to eliminate the granular noise [9–10]. The reconstructed plane image is denoted by Rz(x,y), where z is the distance between the reconstructed plane image and the pinhole array.
To quantitatively evaluate our VCR method, the mean square error (MSE) is employed as an image quality measure. We calculate MSEs between an original test image O(x,y) and its reconstructed plane image Rz(x,y). The MSE is defined as
where x and y are the pixel coordinates of images of size u×v. For the three test images, the average MSE is calculated according to the distance z. The MSE results are presented in Fig. 6. As z increases, the MSE of the conventional method increases due to the increasing interference for a large value M. However, our method uses Md(z) in place of M. The MSE of our method is much less than that of the conventional method. Thus Fig. 6 indicates that the proposed method improves image quality when it is compared with the conventional method.
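The displayed form of Eq. (4) is not reproduced in this text. The sketch below assumes the usual definition of the MSE, namely the mean of the squared pixel differences between O(x,y) and Rz(x,y) over all u×v pixels. The function name `mse` and the nested-list image representation are illustrative choices only.

```python
def mse(original, reconstructed):
    """Mean square error between two images of identical size u x v,
    given as nested lists of pixel intensities (assumed standard definition)."""
    u = len(original)
    v = len(original[0])
    if len(reconstructed) != u or any(len(row) != v for row in reconstructed):
        raise ValueError("images must have the same size")
    total = 0.0
    for row_o, row_r in zip(original, reconstructed):
        for o, r in zip(row_o, row_r):
            total += (o - r) ** 2  # squared difference at one pixel
    return total / (u * v)
```

A perfect reconstruction, as reported for the perfect coverage condition, corresponds to a return value of exactly zero.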
Figures 7(a) and 7(b) show plane images reconstructed at the distances z=21 mm and z=51 mm by the conventional method and the proposed method, respectively. According to Fig. 6, the plane image reconstructed by the proposed method on the ROP at z=21 mm has zero MSE. Accordingly, the visual quality of the plane image reconstructed at z=21 mm by our method is clearly better than that of the image reconstructed by the conventional method. On the other hand, the plane image reconstructed on the ROP at z=51 mm by our method has the same MSE as the one reconstructed by the conventional method, and the visual quality of the two images is the same. Therefore, from the results of Figs. 6 and 7, we can state that the worst-case performance of our method equals the performance of the conventional method. We can also say that our method can provide a perfect reconstruction of the original test image, which is impossible with the conventional method.
We discussed the perfect coverage condition in the previous section. This condition means that the SVM factor Md is equal to 1. Comparing Fig. 3(b) with Fig. 6 shows that the locations of zero MSE coincide with those of Md=1. This means that perfect coverage provides perfect reconstruction. On the other hand, in the special case of z=51 mm, the SVM factor Md is the same as the conventional magnification factor M (M=Md=17). In this case, the MSE results of the two VCR methods are equal, and so are the visual results in Fig. 7. Based on these facts, we can state that the performance of VCR is closely related to the magnification factor: the performance of VCR improves as the magnification factor decreases.
4.2 Computational experiments for 3D object correlator
The basic concept of the VCR-based 3D object correlator is as follows. First, elemental images of a reference object are prepared with a lenslet array and a digital camera before the correlation process begins. From these elemental images, VCR produces a plane image on the ROP at a predefined distance, which we call the reference plane image in this paper. A reference object region (template) can then be easily defined on the reference plane image. Once the template is prepared, the 3D object correlator can operate. VCR can also produce a series of plane images from the input elemental images, which can be regarded as a 3D image or a volumetric image. Thus, 3D object recognition can be accomplished by cross-correlating the template with the series of plane images reconstructed from the input elemental images. The performance of correlation or recognition depends largely on the image quality of VCR.
Next, we carried out computational experiments on the 3D object correlator using the three test images shown in Fig. 5(b). The experimental setup is shown in Fig. 8. The reference test images are located at the known distance z=30 mm, and their elemental images are captured through a pinhole array by the computer-generated pickup method. The pinhole array used in these experiments consists of 30×30 pinholes with a pinhole interval of 1.08 mm. The synthesized elemental images have 1020×1020 pixels, and each elemental image is composed of 34×34 pixels. To obtain templates for object correlation, the elemental images of the test images are applied to both the conventional and the proposed VCR methods. The input signal objects are then tested at various distances, that is, they are placed at distances z that increase in steps of 3 mm. The pickup in Fig. 8(a) captures the elemental images of the input signal objects, and these elemental images are applied to the two VCR methods.
To show the improved performance of the proposed method for 3D object recognition, we use the correlation peak and its sharpness as performance measures. The correlation between an original test image O(x,y) and its reconstructed plane image Rz(x,y) is defined as
where * denotes the complex conjugate. Using Eq. (5), we calculate the correlation coefficients between O(x,y) and Rz(x,y) at distance z and take their maximum, which we call the correlation peak at distance z and denote by Cpeak(z). The correlation peak Cpeak(z) is thus a function of the distance z, and its behavior, such as its sharpness, can serve as a performance measure for 3D object recognition. Figure 9 shows the Cpeak(z) curves of the conventional method and the proposed method. Each curve represents the average of the correlation peaks for the three test images. The results in Fig. 9 indicate that, for both methods, the highest correlation peak occurs at z=30 mm, where the test image is originally located. However, the proposed method provides a sharper Cpeak(z) than the conventional method.
To understand the effect of the sharpness of the correlation, let us set the recognition threshold to 0.8, for example. Applying this threshold to Fig. 9, the conventional method locates an object in the range of 24 mm to 36 mm, whereas the proposed method locates it in the range of 28 mm to 32 mm. The range thus narrows from 12 mm to 4 mm, which indicates that the accuracy (or tolerance) of our method in locating an object is 3 times better than that of the conventional method.
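The displayed form of Eq. (5) is not reproduced in this text, so the following is only a sketch of how a correlation peak might be computed. It assumes real-valued images (for which the complex conjugate has no effect), a direct cross-correlation over all relative shifts, and normalization by the square root of the product of the two image energies so that identical images give a peak of 1. The normalization and the function name `correlation_peak` are assumptions. The direct search is slow and is suitable only for small images.

```python
import math


def correlation_peak(o, r):
    """Maximum over all shifts of the energy-normalized cross-correlation
    of two same-size, real-valued, non-negative images (nested lists)."""
    h, w = len(o), len(o[0])
    energy_o = sum(val * val for row in o for val in row)
    energy_r = sum(val * val for row in r for val in row)
    norm = math.sqrt(energy_o * energy_r)
    if norm == 0:
        return 0.0
    best = 0.0
    for dy in range(-(h - 1), h):
        for dx in range(-(w - 1), w):
            c = 0.0
            # Sum only over pixels where the shifted image overlaps.
            for y in range(max(0, -dy), min(h, h - dy)):
                for x in range(max(0, -dx), min(w, w - dx)):
                    c += o[y][x] * r[y + dy][x + dx]
            best = max(best, c / norm)
    return best
```

Evaluating such a function for the plane image reconstructed at each distance z, and plotting the result against z, would give a curve of the kind the text calls Cpeak(z).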
4.3 Experiments for real 3D object
To show that our method can be implemented in practice, we also carried out experiments with a real 3D object. The experimental setup is identical to that shown in Fig. 8. In the optical pickup, the elemental images of the 3D object ‘tree’ are recorded using a lenslet array and a CCD camera. The ‘tree’ object is located at the distance z=45 mm from the lenslet array. The lenslet array used in the experiments consists of 30×30 lenslets, and the size of each lenslet is 1.08×1.08 mm. The recorded elemental images have 1020×1020 pixels, as shown in Fig. 10(a), and each elemental image is composed of 34×34 pixels. The 3D object correlator is then implemented in a computer, as shown on the right of Fig. 8. We reconstruct the template of the target 3D object and a series of plane images using the two VCR methods depicted in Fig. 1. The plane images obtained with the conventional VCR method and the proposed VCR method are shown in Fig. 10(a) and 10(b), respectively. They are reconstructed at z=42 mm, 45 mm, and 48 mm. The plane image at z=45 mm, where the original 3D object is located, is well focused.
To measure the correlation performance of the two VCR methods, the plane image at z=45 mm is used as the template for the correlation. Correlation peaks are then calculated by correlating the template with the series of plane images from each of the two VCR methods. The correlation results are shown in Fig. 11. The curve of correlation peaks from our method along the distance z indicates that the ‘tree’ object is located at z=45 mm, because the maximum correlation peak is obtained at this distance. As in the computational experiments above, the proposed method provides a much sharper correlation-peak characteristic than the conventional method. This is evident when the images from the conventional method are compared with those from the proposed method, as shown in Fig. 10(a) and (b). The three plane images from the conventional method look very similar, so it is hard to determine the accurate location of the tree object. In contrast, among the three images in Fig. 10(b), the image at 45 mm looks very different from the others and is regarded as the best-focused image. Consequently, we select 45 mm as the location of the tree object. Figure 11 shows the same behavior. For the conventional method, the correlation peaks from 42 mm to 48 mm are almost the same, so it is hard to say that the tree object is located at 45 mm; one can only say that it lies between 42 mm and 48 mm. In contrast, the curve of correlation peaks from our method is sharp, so one can find that the maximum of the correlation peaks is located at 45 mm.
5. Conclusion
We have proposed a resolution-enhanced VCR method using SVM for an improved 3D object correlator. In this paper, we introduced the interference problem caused by large magnification in the superposition process of VCR. To overcome this problem, the magnification factor must be minimized, and on this basis we proposed an algorithm to calculate the minimum magnification factor. In addition, we found that a perfect coverage condition can exist. Under this condition, the proposed VCR method achieves a perfect reconstruction without magnification and with less computational load than the conventional method. Computational experiments indicate that the new magnification (SVM) enables the proposed method to reconstruct resolution-enhanced images. It was also successfully applied to a VCR-based 3D object correlator. Therefore, we expect that the proposed VCR method may be used in applications including 3D pattern recognition.
This work was supported in part by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD, Basic Research Promotion Fund) (KRF-2007-331-D00332) and in part by the post Brain Korea 21 program.
References and links
1. G. Lippmann, “La photographie intégrale,” C. R. Acad. Sci. 146, 446–451 (1908).
2. F. Okano, H. Hoshino, J. Arai, and I. Yuyama, “Three-dimensional video system based on integral photography,” Opt. Eng. 38, 1072–1077 (1999).
3. J.-S. Jang and B. Javidi, “Improved viewing resolution of three-dimensional integral imaging by use of nonstationary micro-optics,” Opt. Lett. 27, 324–326 (2002).
4. B. Lee, S. Y. Jung, S.-W. Min, and J.-H. Park, “Three-dimensional display by use of integral photography with dynamically variable image planes,” Opt. Lett. 26, 1481–1482 (2001).
5. M. Martínez-Corral, B. Javidi, R. Martínez-Cuenca, and G. Saavedra, “Multifacet structure of observed reconstructed integral images,” J. Opt. Soc. Am. A 22, 597–603 (2005).
7. S.-H. Hong, J.-S. Jang, and B. Javidi, “Three-dimensional volumetric object reconstruction using computational integral imaging,” Opt. Express 12, 483–491 (2004), http://www.opticsinfobase.org/abstract.cfm?URI=oe-12-3-483.
8. D.-H. Shin, E.-S. Kim, and B. Lee, “Computational reconstruction technique of three-dimensional object in integral imaging using a lenslet array,” Jpn. J. Appl. Phys. 44, 8016–8018 (2005).
9. S.-H. Hong and B. Javidi, “Improved resolution 3D object reconstruction using computational integral imaging with time multiplexing,” Opt. Express 12, 4579–4588 (2004), http://www.opticsinfobase.org/abstract.cfm?URI=oe-12-19-4579.
10. H. Yoo and D.-H. Shin, “Improved analysis on the signal property of computational integral imaging system,” Opt. Express 15, 14107–14114 (2007), http://www.opticsinfobase.org/abstract.cfm?URI=oe-15-21-14107.
11. D.-H. Shin and H. Yoo, “Image quality enhancement in 3D computational integral imaging by use of interpolation methods,” Opt. Express 15, 12039–12049 (2007), http://www.opticsinfobase.org/abstract.cfm?URI=oe-15-19-12039.
13. J.-S. Park, D.-C. Hwang, D.-H. Shin, and E.-S. Kim, “Resolution-enhanced three-dimensional image correlator using computationally reconstructed integral images,” Opt. Commun. 26, 72–79 (2007).
15. J.-H. Park, J. Kim, and B. Lee, “Three-dimensional optical correlator using a sub-image array,” Opt. Express 13, 5116–5126 (2005), http://www.opticsinfobase.org/abstract.cfm?URI=oe-13-13-5116.