As a promising three-dimensional passive imaging modality, Integral Imaging (II) has been investigated widely within the research community. In virtually all such investigations, there is an implicit assumption that the collection of elemental images lies on a simple geometric surface (e.g. flat, concave, etc.), also known as the pickup surface. In this paper, we present a generalized framework for 3D II with arbitrary pickup surface geometry and a randomly distributed sensor configuration. In particular, we study the case of Synthetic Aperture Integral Imaging (SAII) with random camera locations in space, where all cameras have parallel optical axes but different distances from the 3D scene. We assume that the sensors are randomly distributed in a 3D pickup volume. For 3D reconstruction, a finite number of sensors with known coordinates are randomly selected from within this volume. The mathematical framework for 3D scene reconstruction is developed based on an affine transform representation of imaging in the geometrical optics regime. We demonstrate the feasibility of the proposed methods with experimental results. To the best of our knowledge, this is the first report on 3D imaging using randomly distributed sensors.
©2008 Optical Society of America
1. Introduction
Three-dimensional imaging and display systems [1-6] have a variety of applications in life sciences, defense and homeland security, object recognition [9, 10] and consumer products. Therefore, numerous passive 3D imaging techniques have been studied, including multi-perspective imaging. Among the various techniques that can quantitatively measure one or more of the psychological depth cues, one major thrust is Integral Imaging (II) [12-17] (also known as integral photography), which is based on the original work of Lippmann with lenticular sheets [20, 21] and is classified under multi-perspective 3D imaging systems. Integral Imaging provides autostereoscopic images essentially by recording the intensity and direction of light rays (i.e. the light field [23, 24]) in the form of a set of elemental images from slightly different perspectives. This technique is promising compared with other techniques because of features including continuous viewing angle, full parallax and full color display without the need for coherent sources of illumination, and its relative simplicity of implementation.
Integral imaging display is pursued in two major streams, namely full optical [22, 25] and computational reconstruction [26-30]. Both streams share the same optical imaging (pickup) stage to obtain the elemental images [31, 32]. Optical reconstruction techniques are essentially based on reversing the pickup stage optically by displaying elemental images on an LCD panel and either projecting them on a lenslet array display or using parallax barriers for 3D visual perception. Developments in this area include the use of gradient index lens arrays to handle the orthoscopic to pseudoscopic conversion, as well as resolution improvement methods such as the moving lenslet technique (MALT) and electronically synthesized moving Fresnel lenslets. However, the optical reconstruction approach suffers from low resolution, low sampling rate, quality degradation due to diffraction, limited dynamic range and reduced overall visual quality due to the limitations of electro-optical projection devices. Computational reconstruction techniques, on the other hand, operate in the electronic rather than the optical domain, which alleviates the problems mentioned above and provides the flexibility of digital manipulation of integral image data at the cost of a computational burden.
Among the different pickup configurations proposed for integral imaging, Synthetic Aperture Integral Imaging (SAII) [36-38] is the one in which the sensor scans an area in a regular grid pattern and each elemental image is acquired in full frame. This yields a larger field of view and high resolution elemental images, because each elemental image makes full use of the detector array and the optical aperture. There have been efforts to compress the integral image data [39, 40]. In addition, SAII potentially allows one to create large pickup apertures, orders of magnitude larger than what is practical with lenslet arrays. Large pickup apertures are important in obtaining the required range resolution at longer distances. With such a process to acquire elemental images, a natural question is what happens if the elemental images are taken neither on a regular grid nor on the same plane. To the best of our knowledge, this is the first time that 3D imaging using randomly distributed sensors has been presented.
In this paper, we present a generalized framework for 3D integral imaging with arbitrary 3D pickup geometry and randomly distributed sensor coordinates in all three spatial dimensions. For 3D reconstruction, a finite number of sensors with known coordinates are randomly selected from within this pickup volume. In particular, we study the case of Synthetic Aperture Integral Imaging (SAII) where the sensor distribution is not controlled, that is, it is random; however, the locations of the sensors in space are known. We assume that all cameras have parallel optical axes but different distances from the 3D object. A computational reconstruction framework based on the back projection method is developed using a variable affine transform between the image space and the object space. It can be shown that the affine coordinate transformation corresponds to an orthographic projection similar to what is needed in light back-projection based II reconstruction.
We believe that the modeling proposed in this study, together with our earlier analysis of the performance of integral imaging systems under uncertain pickup location, gives rise to novel applications of Integral Imaging, e.g. 3D aerial imaging, collaborative 3D imaging, etc. We envision one possible use of the proposed 3D imaging method as the case where a single image acquisition platform scans over an area along different trajectories at different times, or where multiple platforms concurrently collect the images in the pickup space. Either way, the sensor locations would be randomly distributed in space in all three dimensions.
2. Sensor distribution
Virtually all instances of integral imaging methods have been investigated under the assumption that elemental images are captured on a known geometrical surface (e.g. planar, concave, etc.) and in a regular pattern. In what follows, a generic scheme for integral imaging is proposed in which the pickup location is random and/or sparse.
To tackle the problem of 3D integral imaging with sparsely distributed elemental images, a synthetic aperture case is studied in which each sensor is positioned independently and randomly in 3D space looking at the scene [see Fig. 1]. The pickup location of the ith elemental image, 𝒫i, is measured in a universal frame of reference in Cartesian coordinates, Φ: (x,y,z), which is used later for 3D computational reconstruction of the scene. The origin of the frame of reference is arbitrary, but it has to be fixed during all position measurements. However, since the proposed mathematical image reconstruction framework relies only on the relative distances of elemental images in space, it stays consistent if the origin moves and all position measurements are adjusted accordingly. Also, a local coordinate system, Ψi: (ui,νi,wi), is defined for each sensor with its origin lying at the sensor’s midpoint. We assume that the sensor size (Lx,Ly), the effective focal length of the i-th imaging optics, gi, and the position of each sensor in the pickup stage are known. In our analysis, we make no assumption on the distribution of elemental images in space, in order to achieve a generic reconstruction scheme. Without loss of generality, to demonstrate the feasibility of the proposed technique, the random pickup locations, 𝒫i=(xi,yi,zi), are chosen from three independent uniform random variables as follows:
where ~ signifies that the left operand follows the distribution in the right operand, and U(a,b); b>a denotes a uniform distribution with probability density function f(x)=(b-a)⁻¹ for a≤x≤b and f(x)=0 elsewhere. The actual distribution of elemental images is dictated by the specific application of interest. We have used a uniform distribution to give all locations in the pickup volume an equal chance of being selected as sensor positions.
Since the choice of origin for the universal frame of reference is arbitrary, without loss of generality, it is taken such that the z axis lies on the optical axis of the reference elemental image, i.e. the elemental image acquired from the perspective from which reconstruction is desired [see Fig. 1]. Also, as the numbering sequence of elemental images is arbitrary, we label this elemental image as 𝓔0.
3. Computational image reconstruction
Several methods have been investigated for computational reconstruction of II data. In the Fourier domain, digital refocusing has been proposed by applying the Fourier slice theorem to 4D light fields. This technique is relatively fast, with complexity O(n² log n), n being the total number of samples. However, this method is intrinsically based on the assumption of periodic sampling of the light field and thus may require heuristic adjustments if elemental images are not ordered regularly. In the spatial domain, a fast, ray tracing based reconstruction from the observer's point of view has been proposed with complexity O(m), m being the number of elemental images. Although fast and simple, this method yields low resolution reconstructions. Yet another spatial domain reconstruction method is based on a series of 2D image back projections. This method offers a much better reconstruction resolution than the ray tracing approach, at the expense of a higher complexity of O(n), since usually n≫m. For instance, m is typically in the range of 100-200 elemental images, while n can be as large as 10⁷ pixels. In the context of this paper, we stay within the spatial domain in order to provide a generic reconstruction algorithm with minimum assumptions about the pickup geometry.
Computational reconstruction based on back-projection [26-30] relies on certain assumptions which are only valid for lenslet based integral imaging systems. In this section, we develop a more generic reconstruction method based on the affine transform, which has its roots in rigorous affine space theory and is popularly used for orthographic image projection. A spatial affine transform consists of any combination of linear transformations (e.g. scale, rotation, shear and reflection) followed by a translation in space, which can be written in a general form for 3D Cartesian space as:
in which 𝒜 and 𝒫 denote the compound linear transformation and translation matrices, respectively. Since the numbering sequence of elemental images is completely arbitrary, we take the view point from which we want to reconstruct the 3D scene to be one of the elemental images we label by 𝓔0. Assume that the reconstruction at distance zr from the reference sensor at 𝒫0 is desired. According to Fig. 1, the distance of the i-th elemental image from the desired reconstruction plane is:
For the case of back projection of the i-th elemental image, matrix 𝒜i and translation vector 𝒫i can be written as:
in which 𝒫i denotes the position of the i-th sensor and Mi=zi/gi is the associated magnification between the i-th elemental image and its projection at distance zi. According to Fig. 1, the position of the midpoint of the plane that we are interested in reconstructing can be written as:
In order to include both linear transformation and translation of Eq. (2) in one matrix expression, matrix 𝒜i needs to be augmented by vector 𝒫r-𝒫i, which yields the following expression:
where Φ=[x,y,z]T and Ψi=[ui,νi,wi]T denote the points in the reconstruction space and i-th elemental image space, respectively. Each elemental image captured (as shown in Fig. 1) can be expressed as a scalar function in space. For instance, for the ith elemental image:
where Lx and Ly are the sizes of the CCD imaging sensor in the horizontal (x) and vertical (y) directions, respectively. Using Eq. (4) one can expand Eq. (6) to obtain the relationship between Φ and Ψi explicitly as:
It is clear from Eq. (9) that one needs to magnify the ith elemental image by the ratio Mi, which may get very large for zi≫gi. Such magnification does not add new information to that already contained in the collection of original elemental images. Here, we formulate a substitute reconstruction method which, with no sacrifice in information content, is more tractable in terms of required computational resources and speed, especially in synthetic aperture applications in which each elemental image can be (and usually is) captured on a full frame imaging sensor array of megapixel size. In practice, magnifying the elemental images by large ratios is memory inefficient, if not impractical, especially when one tries to reconstruct full frame elemental images captured with the SAII technique. Since elemental images are taken at arbitrary z, their corresponding magnification varies. Nevertheless, using Eq. (3) it is possible to decompose the magnification factor, Mi, into a fixed magnification and a differential term as follows:
in which Δpzi=pzi-pz0 denotes the difference between the longitudinal position of the ith sensor and that of the 0-th (reference) sensor. Also, Δgi=gi-gr signifies the difference between the focal lengths of the optics used to capture the ith and reference images, respectively. It should be noted that Δpzi and Δgi can be either positive or negative, resulting in a varying γi. From Eq. (10) it is clear that if all gi>0, γi stays bounded. In the experimental results section, we will show that for most applications of concern, γi remains close to 1.
i.e. instead of magnifying the elemental images by large factors, Mi, and translating them laterally by large amounts (pxi-px0 and pyi-py0), we only adjust the size of the elemental images by the factor γi and reduce the lateral translation by the factor Mi. Thus, Eq. (11) can be simply interpreted as magnifying (or demagnifying) the ith elemental image by the factor γi and shifting it by a scaled version of its lateral distance from the reference elemental image.
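The decomposition can be checked numerically with a short sketch. It assumes that the distance in Eq. (3) takes the form zi=zr+Δpzi, so that Mi=Mr·γi with Mr=zr/gr and γi=(1+Δpzi/zr)/(gi/gr); this sign convention is an assumption, chosen because it reproduces the range of γi reported in the experimental section. The function names `gamma_factor` and `augmented_affine`, and the sign of the lateral shift, are hypothetical choices made for illustration only.

```python
import numpy as np

def gamma_factor(dpz, z_r, g_i, g_r):
    """Differential magnification gamma_i = M_i / M_r.

    Assumes z_i = z_r + dpz (sign convention is an assumption),
    M_i = z_i / g_i and M_r = z_r / g_r. All lengths share one unit.
    """
    return (1.0 + dpz / z_r) / (g_i / g_r)

def augmented_affine(scale, shift_xy):
    """3x3 homogeneous matrix: uniform lateral scale, then a translation."""
    sx, sy = shift_xy
    return np.array([[scale, 0.0, sx],
                     [0.0, scale, sy],
                     [0.0, 0.0, 1.0]])

# Values quoted in the experimental section, in millimetres.
z_r, g = 160.0, 25.0
gammas = [gamma_factor(d, z_r, g, g) for d in (-9.1, 10.4)]

# Recover the full magnification from the decomposition M_i = M_r * gamma_i.
M_r = z_r / g
M_i = [M_r * gm for gm in gammas]

# Scale by gamma_i and shift by the lateral offset reduced by M_i,
# here for a 40 mm offset in x (shift direction is an assumption).
T = augmented_affine(gammas[1], (40.0 / M_i[1], 0.0))
corner = T @ np.array([11.35, 7.55, 1.0])  # sensor corner (Lx/2, Ly/2, 1)
```

With equal focal lengths the sketch gives γi between about 0.943 and 1.065 at zr=160mm, so each elemental image is resized by only a few percent instead of being magnified by Mi.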
In conventional reconstruction methods, where elemental images are magnified by the ratio M=z/g, each back-projected elemental image is M² times the size of the original elemental image. This technique requires a number of algebraic operations, as well as memory resources, that grows quadratically with the reconstruction distance. Such a resource demanding technique makes the end to end system costly and slows down the reconstruction process because of the large number of machine cycles needed, in addition to memory access delay.
The final reconstruction plane is obtained by superimposing the back-projected elemental images. Since the sensors are not required to reside on a regular grid, the Granular Noise (GN) does not have the periodic form that arises with regular pickup grids. For instance, in the x direction, assuming a pixel pitch equal to u, the shift (pxi-px0)gi zi⁻¹u⁻¹ has a fractional part that varies randomly for different elemental images. This is in part due to the non-periodic nature of (pxi-px0) as well as the varying magnification, Mi=zi/gi, of each elemental image. Essentially, in contrast to lenslet based II, for the sensor positions considered in this paper, which are randomly chosen from the sample space, it is very unlikely that a reconstruction plane satisfying the GN-free condition exists.
In this paper, we round the necessary shift for each elemental image to the nearest integer. This yields approximate reconstruction results. However, since elemental images are captured on full frame CCDs of megapixel size, the error introduced by this sub-pixel approximation is negligible and is not expected to degrade the reconstruction results significantly, as can be seen in the experimental results section. The final reconstruction result, for the area in which all the sensors have a common field of view, is then given by:
where N is the total number of elemental images. The described reconstruction technique, like its conventional counterpart, generates high resolution reconstructions; however, it is generalized to deal with elemental images captured at arbitrary locations in space.
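The rounding and superposition steps can be illustrated with a minimal sketch. It is not the exact form of Eq. (12): the γi resizing is omitted for brevity, the direction of the shift is an assumption, and the helper names `pixel_shift` and `shift_and_average` are hypothetical.

```python
import numpy as np

def pixel_shift(p_i, p_0, g_i, z_i, pitch):
    """Nearest-integer pixel shift for the lateral offset (p_i - p_0).

    The raw shift is (p_i - p_0) * g_i / (z_i * pitch); all lengths
    must share one unit.
    """
    return int(np.rint((p_i - p_0) * g_i / (z_i * pitch)))

def shift_and_average(images, shifts):
    """Average equally sized 2D images after integer (dy, dx) shifts.

    Pixels that no shifted image covers are left at zero.
    """
    h, w = images[0].shape
    acc = np.zeros((h, w), dtype=float)
    hits = np.zeros((h, w), dtype=float)
    for img, (dy, dx) in zip(images, shifts):
        if abs(dy) >= h or abs(dx) >= w:
            continue  # shifted entirely out of the frame
        dst = (slice(max(dy, 0), min(h, h + dy)),
               slice(max(dx, 0), min(w, w + dx)))
        src = (slice(max(-dy, 0), min(h, h - dy)),
               slice(max(-dx, 0), min(w, w - dx)))
        acc[dst] += img[src]
        hits[dst] += 1.0
    return np.divide(acc, hits, out=np.zeros_like(acc), where=hits > 0)

# Toy example: the same point seen one pixel apart in two elemental images.
a = np.zeros((5, 6)); a[2, 2] = 1.0
b = np.zeros((5, 6)); b[2, 3] = 1.0
recon = shift_and_average([a, b], [(0, 0), (0, -1)])

# Shift for a 40 mm lateral offset, g = 25 mm, z = 200 mm, 7 um pitch.
shift_px = pixel_shift(40.0, 0.0, 25.0, 200.0, 0.007)
```

In the toy example the two copies of the point align after the shift, so the averaged value at that pixel stays at 1, whereas a point at a different depth would be spread over several pixels. Slicing is used instead of a circular shift so that pixels leaving the frame do not wrap around.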
4. Experimental results
We demonstrate the feasibility of 3D imaging with randomly distributed sensors using the synthetic aperture method along with the proposed reconstruction technique. To achieve a reasonable resemblance to realistic cases, toy models of a U.S. Army tank and a sports car are used in the lab to create a 3D scene. The tank can be enclosed in a box of size 5×2.5×2 cm³, whereas the car model fits in a volume of 4×3×2.5 cm³. The tank and the car are placed approximately 19cm and 24cm away from the reference elemental image, respectively.
As discussed in Section 2, to obtain random pickup positions, a set of 100 positions, 𝒫i=(pxi,pyi,pzi), is obtained from three uniform random variable generators. For the lateral position variables, pxi and pyi, a parallax of 8cm is considered, i.e. pxi,pyi~U(-4cm,4cm). The variation in the longitudinal direction is set to 20% of object distances, i.e. pzi~U(25cm,27cm), assuming a desired reconstruction range within [19cm,24cm] from the reference sensor. Figure 2 illustrates the random distribution of the cameras in the pickup stage.
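A set of positions with these statistics can be drawn as in the following sketch. The seed and the variable names are arbitrary choices for illustration, and the actual positions used in the experiment are not reproduced here.

```python
import numpy as np

# Seed chosen only so that the sketch is repeatable (an assumption).
rng = np.random.default_rng(seed=0)

N = 100  # number of elemental images
px = rng.uniform(-4.0, 4.0, size=N)    # lateral x position in cm
py = rng.uniform(-4.0, 4.0, size=N)    # lateral y position in cm
pz = rng.uniform(25.0, 27.0, size=N)   # longitudinal position in cm

# One row per sensor: (pxi, pyi, pzi).
positions = np.column_stack((px, py, pz))

# Longitudinal offsets relative to a reference sensor, here row 0.
dpz = positions[:, 2] - positions[0, 2]
```

Because the three coordinates are drawn independently, every location in the 8×8×2 cm pickup volume is equally likely to be selected.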
The ith elemental image is then taken with a digital camera at its associated random pickup position, 𝒫i. The focal lengths of all lenses are set to be equal, i.e. gi=25mm. However, according to Eq. (11), the set of 100 randomly generated positions from Eq. (1) results in γi ∈ [0.943,1.065] due to the variability of Δpzi ∈ [-0.91cm,1.04cm] over the entire reconstruction range zr ∈ [160mm,300mm]. This has the same effect as variability in gi. The CMOS sensor size is 22.7×15.1 mm² with 7µm pixel pitch. The field of view (FOV) for each elemental image is then 48°×33° in the horizontal and vertical directions, respectively, which covers an area of 18×12 cm² at 20cm from the pickup location in the object space. A single camera is translated between the acquisition points such that it passes each location only once, and at each location a full frame image of size 3072×2048 pixels is captured. The camera is translated in x, y and z using off-the-shelf translation components.
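The quoted field of view and object-space coverage follow from the sensor size and focal length under a simple pinhole model, as the sketch below shows. The pinhole approximation and the function names are assumptions made for illustration.

```python
import math

def field_of_view_deg(sensor_size, focal_length):
    """Full field of view in degrees for a pinhole camera model."""
    return math.degrees(2.0 * math.atan(sensor_size / (2.0 * focal_length)))

def footprint(sensor_size, focal_length, distance):
    """Width of the object-space area imaged at the given distance."""
    return sensor_size * distance / focal_length

g = 25.0             # focal length in mm
Lx, Ly = 22.7, 15.1  # sensor size in mm

fov_h = field_of_view_deg(Lx, g)   # about 48.8 degrees
fov_v = field_of_view_deg(Ly, g)   # about 33.6 degrees
foot_h = footprint(Lx, g, 200.0)   # about 182 mm at 20 cm
foot_v = footprint(Ly, g, 200.0)   # about 121 mm at 20 cm
```

These values agree with the 48°×33° field of view and the 18×12 cm² coverage stated above.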
Each image in Fig. 3 shows one of the elemental images taken under the above conditions. It can be seen that the objects appear slightly different to the camera from each perspective, and in some elemental images one object partially occludes the other. The elemental images are then used in Eq. (11) and Eq. (12) to reconstruct the 3D scene at different distances from the viewpoint 𝒫0=(-0.2,1.8,25.9) cm with varying zr ∈ [160mm,300mm]. Two such reconstruction planes are shown in Fig. 4, at zr=185mm and zr=240mm, respectively.
The method proposed in this paper eliminates the need for large image magnifications and is thus essentially Mr² times more efficient in computation and memory. For the parameters used in the experimental results, for which Mr=zr/g ranges from 6.4 to 12, this translates into a reduction of roughly 41 to 144 times in memory requirements and algorithm complexity.
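The reduction factors can be reproduced from the stated parameters with a short calculation. The per-image pixel counts below are an illustrative estimate and assume that a magnified elemental image would be stored at the original pixel pitch.

```python
g = 25.0                   # focal length in mm
frame_pixels = 3072 * 2048  # pixels in one full frame elemental image

results = []
for z_r in (160.0, 300.0):  # ends of the reconstruction range in mm
    m_r = z_r / g           # reference magnification M_r
    factor = m_r ** 2       # area (and memory) growth of one magnified image
    results.append((z_r, m_r, factor, frame_pixels * factor))

for z_r, m_r, factor, pixels in results:
    print(f"z_r={z_r:.0f} mm  M_r={m_r:.1f}  M_r^2={factor:.2f}  "
          f"pixels per magnified image={pixels:.3g}")
```

At zr=300mm a single magnified frame would hold roughly 9×10⁸ pixels, compared with about 6.3×10⁶ pixels when the image is only resized by γi.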
5. Conclusion
In this paper, we have presented a novel concept based on a generalized framework for 3D multi-perspective imaging with arbitrary 3D pickup geometry and randomly distributed sensor coordinates in all three spatial dimensions. A finite number of sensors with known coordinates are randomly selected from within a pickup volume and are used for 3D object reconstruction. In this approach, while the sensor distribution is random, that is, it is not controlled, the locations of the sensors in space are assumed to be known at the reconstruction stage. A computational reconstruction framework based on the back projection method is developed using a variable affine transform between the image space and the object space. The conventional back projection method [27-30], which is designed for a planar pickup surface, suffers from high computation and memory costs, which increase quadratically with reconstruction distance. We introduced a coordinate transformation to alleviate this problem, which in effect keeps the memory and computational demands constant for all reconstruction planes. We believe that the approach presented in this paper, along with our earlier analysis of uncertain pickup locations, takes computational 3D imaging methods one step closer to implementation in novel applications such as 3D aerial imaging.
References and links
1. B. Javidi and F. Okano, Eds., Three Dimensional Television, Video, and Display Technology (Springer-Verlag, Berlin, Germany, 2002).
3. T. Okoshi, “Three-dimensional displays,” Proceedings of the IEEE 68, 548–564 (1980).
5. Y. Igarashi, H. Murata, and M. Ueda, “3D display system using a computer-generated integral photograph,” Jpn. J. Appl. Phys. 17, 1683–1684 (1978).
6. A. R. L. Travis, “The display of three dimensional video images,” Proceedings of the IEEE 85, 1817–1832 (1997).
7. K. Itoh, W. Watanabe, H. Arimoto, and K. Isobe, “Coherence-based 3-D and spectral imaging and laser-scanning microscopy,” Proceedings of the IEEE 94, 608–628 (2006).
8. B. Javidi, Ed., Optical Imaging Sensors and Systems for Homeland Security Applications (Springer, New York, 2006).
9. Y. Frauel, E. Tajahuerce, O. Matoba, A. Castro, and B. Javidi, “Comparison of passive ranging integral imaging and active imaging digital holography for three-dimensional object recognition,” Appl. Opt. 43, 452–462 (2004).
11. L. Lipton, Foundations of the Stereoscopic Cinema, A Study in Depth (Van Nostrand Reinhold, New York, 1982).
13. F. Okano, J. Arai, H. Hoshino, and I. Yuyama, “Three-dimensional video system based on integral photography,” Opt. Eng. 38, 1072–1078 (1999).
14. H. Arimoto and B. Javidi, “Integral three-dimensional imaging with digital reconstruction,” Opt. Lett. 26, 157–159 (2001).
15. A. Stern and B. Javidi, “Three-dimensional image sensing, visualization, and processing using integral imaging,” Proceedings of the IEEE 94, 591–607 (2006).
16. R. Martínez, A. Pons, G. Saavedra, M. Martínez-Corral, and B. Javidi, “Optically-corrected elemental images for undistorted integral image display,” Opt. Express 14, 9657–9663 (2006).
17. R. Martínez-Cuenca, G. Saavedra, M. Martínez-Corral, and B. Javidi, “Extended Depth-of-Field 3-D Display and Visualization by Combination of Amplitude-Modulated Microlenses and Deconvolution Tools,” IEEE J. Display Technol. 1, 321–327 (2005).
18. T. Okoshi, Three-Dimensional Imaging Techniques (Academic, 1976).
19. M. G. Lippmann, “La photographie intégrale,” Comptes-rendus de l’Académie des Sciences 146, 446–451 (1908).
20. A. P. Sokolov, Autostereoscopy and Integral Photography by Professor Lippmann's Method (Moscow State Univ. Press, Moscow, 1911).
21. H. E. Ives, “Optical properties of a Lippmann lenticuled sheet,” J. Opt. Soc. Am. 21, 171–176 (1931).
22. K. Perlin, S. Paxia, and J. S. Kollin, “An autostereoscopic display,” in Proceedings of the 27th Ann. Conf. on Computer Graphics and Interactive Techniques (ACM Press/Addison-Wesley, 2000), pp. 319–326.
23. M. Levoy and P. Hanrahan, “Light field rendering,” in Proc. of ACM SIGGRAPH (New Orleans, 1996), pp. 31–42.
24. M. Levoy, “Light fields and computational imaging,” IEEE Computer 39, 46–55 (2006).
25. J.-Y. Son, V. V. Saveljev, Y.-J. Choi, J.-E. Bahn, S.-K. Kim, and H. Choi, “Parameters for designing autostereoscopic imaging systems based on lenticular, parallax barrier, and integral photography plates,” Opt. Eng. 42, 3326–3333 (2003).
32. R. Martínez-Cuenca, G. Saavedra, M. Martínez-Corral, and B. Javidi, “Enhanced depth of field integral imaging with sensor resolution constraints,” Opt. Express 12, 5237–5242 (2004).
33. J. Arai, F. Okano, H. Hoshino, and I. Yuyama, “Gradient-index lens-array method based on real-time integral photography for three-dimensional images,” Appl. Opt. 37, 2034–2045 (1998).
34. J.-S. Jang and B. Javidi, “Improved viewing resolution of 3-D integral imaging with nonstationary micro-optics,” Opt. Lett. 27, 324–326 (2002).
35. J.-S. Jang and B. Javidi, “Three-dimensional integral imaging with electronically synthesized lenslet arrays,” Opt. Lett. 27, 1767–1769 (2002).
36. J.-S. Jang and B. Javidi, “Three-dimensional synthetic aperture integral imaging,” Opt. Lett. 27, 1144–1146 (2002).
37. Y. Hwang, S. Hong, and B. Javidi, “Free View 3-D Visualization of Occluded Objects by Using Computational Synthetic Aperture Integral Imaging,” IEEE J. Display Technol. 3, 64–70 (2007).
39. R. Zaharia, A. Aggoun, and M. McCormick, “Adaptive 3D-DCT compression algorithm for continuous parallax 3-D integral imaging,” Signal Process. Image Commun. 17, 231–242 (2002).
41. R. Ng, “Fourier slice photography,” in Proceedings of ACM SIGGRAPH 24, pp. 735–744 (2005).