In this paper, we propose generalized sampling approaches for measuring a multi-dimensional object using a compact compound-eye imaging system called thin observation module by bound optics (TOMBO). This paper shows the proposed system model, physical examples, and simulations to verify TOMBO imaging using generalized sampling. In the system, an object is sheared and multiplied by a weight distribution with physical coding, and the coded optical signal is integrated on to a detector array. A numerical estimation algorithm employing a sparsity constraint is used for object reconstruction.
©2010 Optical Society of America
A compound-eye imaging system is a promising computational imaging modality. Compound-eye optics have enabled light-field acquisition and device compactness[2, 3]. Thin observation module by bound optics (TOMBO) is a representative example of a compound-eye imaging system.
An advantage of compound-eye imaging systems is that they permit diverse data acquisition schemes. Different lenslets may create different encodings. For example, time detection based on the encoding concept has been proposed, and range detection in  can be considered as a system based on the concept. These compact systems reconstruct a three-dimensional object from a two-dimensional measurement where the size is the same as that of an axial plane of the object.
This paper proposes generalized sampling approaches for multi-dimensional object acquisition using TOMBO. In the proposed system, an object is acquired with coding and multiplexing in a two-dimensional snapshot. In particular, the coding schemes in [5, 6] are extended for multi-dimensional data acquisition of various objects.
There could be multiple choices for coding schemes for multi-dimensional object acquisition such as coded aperture imaging and multi-shot imaging. These schemes differ by design constraints. This paper considers the compactness of hardware and single-shot object acquisition capability as critical design constraints. As indicated by a large literature, TOMBO imaging modality is one of such techniques that can implement a compact system that can meet our design constraints. This motivates us to investigate its potential as a compressive imaging technique.
In this paper, the mathematical model of the proposed system and examples of the coding schemes for spectral and polarization imaging techniques are presented. Simulation results of the proposed system are shown. The implementations are inspired by [7, 8, 9]. The previously presented systems have a tradeoff between the spatial and axial resolutions. For example, in [7, 8], the number of the spectral or polarization channels is roughly proportional to that of the lenses. Increasing the number of lenses reduces the spatial resolution. The approaches proposed in this paper may compensate for the tradeoff by leveraging compressive sampling.
A constrained optimization technique to incorporate sparsity in some basis of an object estimate is used for reconstruction. The reconstruction method is inspired by compressive sampling . In compressive sampling, the systems should satisfy some assumptions stated in section2 for accurate reconstruction. The proposed system is compared to a theoretical baseline sensing system which is a Gaussian random sensing matrix. Several systems based on sparse reconstruction have been demonstrated and have shown promising results[11, 12, 13].
Section 2 provides a brief background on TOMBO and compressive sampling. Section 3 describes a general model for multi-dimensional TOMBO imaging. Section 4 presents examples of coding schemes. Simulation results are given in section 5.
In a simplified conceptual model, TOMBO consists of lenslets and a detector array as shown in Fig. 1. An imaging structure associated with a lenslet is called a unit. Each unit produces a low-resolution (LR) image on the detector array.
When the number of units is Nu × Nu in a square arrangement, the focal-length and the diameter of the lenslet need to be Nu times smaller than those of the corresponding conventional full-aperture system to obtain the same field of views. This results in a LR image whose size is Nu times smaller than that of an image produced by the full-aperture system. The thickness and depth-of-field of TOMBO are Nu times shorter and Nu 2 times longer, respectively. This allows for compact hardware with a large depth-of-field. Objects are often assumed to be located within the depth-of-field, and the lenslets are assumed to be aberration-free[4, 14]. These assumptions are made throughout this paper, unless otherwise stated.
2.2. Compressive sampling
The proposed system model in this paper forms an underdetermined linear system of equations as described in section 3. Compressive sampling (CS) is a theoretical framework for solving an underdetermined system[10, 15]. The reconstruction method in this paper is inspired by CS.
A linear system model can be written as
where , , , , and are a measurement vector, a sensing matrix, an object vector, a basis matrix, and a transform coefficient vector, respectively. denotes a Ni × Nj matrix of real numbers. We consider the case where Ng ≪ Nf.
Let s denote the number of non-zero coefficients in β. CS indicates that, for accurate reconstruction, Θ should satisfy a sufficient condition for any s-sparse β. The condition is called the restricted isometric property (RIP) defined by,
where cs ∈ (0,1) is a constant and ||·||2 2 denotes an ℓ2-norm. Λ is a subset of indices supporting s nonzero coefficients in β. β Λ and Θ Λ are elements of β and columns of Θ that support the s coefficients. If cs is close to 0, Eq. (2) indicates that Θ Λ preserves the Euclidean length of defined as
is called coherence. Φ(i, :), Ψ(:, j), and 〈,〉 are the i-th row of Φ, the j-th column of Ψ, and an inner product, respectively. When the coherence is small, Φ and Ψ are said to be incoherent. The number of measurement components required for accurate reconstruction is given as
where c is a constant. According to CS theory, if Θ satisfies RIP (Eq. (2)), measurements are with high probability sufficient to accurately estimate β. An accurate estimate of the s nonzero coefficients in β can be obtained by solving
where || · ||1 denotes ℓ1 norm.
3. A mathmatical model for proposed acquisition schemes
Let F (x,y,z 0, ⋯, denote a continuous density function representing a multi-dimensional object. x and y represent spatial dimensions, and z 0, ⋯, represent the other dimensions dependent on the application. x = 0 and y = 0 are defined as the center of a detector array. For simplicity, the y dimension is omitted. Extending to higher dimensions may be readily achieved with small modifications of the model.
3.1. Continuous model
In the proposed system, a multi-dimensional object is integrated on to detectors with one of two coding schemes, as demonstrated in Fig. 2. In one of the coded integrations inspired by , an object is sheared by an optical element, and the sheared optical signal is integrated on to a detector array. In the shear-transformation, each axial plane in an object is shifted along the x axis as shown in Fig. 2(a). In , the shift corresponds to a parallax. In another coded integration inspired by , an object is multiplied with a weight distribution, and the weighted optical signal is integrated on to a detector array as shown in Fig. 2(b). The weight distribution is a continuous function of z. In , the weight distribution corresponds to an exposure time. The two schemes are referred to as sheared integration (SI) and weighted integration (WI), respectively.
We denote integrated data associated with the u-th unit as Gu(ν), where ν denotes the spatial dimension in a unit as shown in Fig. 1. ν in the u-th unit is defined as ν = x − Ou, where Ou is the center of the u-th unit. Gu(ν) is expressed as
where Sn,u(zn) and Wn,u(zn) show a shift in SI and a weight distribution in WI of the zn dimension in the u-th unit, respectively. Lu is the center position of the u-th lenslet on the ν axis as shown in Fig. 1. For simplicity, Nn = 1 is assumed, and a subscript n is omitted. Eq. (6) can be rewritten as
3.2. Discretization model
A discrete object and a discrete integrated data are denoted by F̃ (l,m) = F(lΔx,mΔz) and G̃′u(i) = Gu(iΔx) using notation similar to that in , where a tilde indicates a discrete data. G̃′ is an intermediate data before sampling by the detectors. l, m, and i are integer variables of the x, z, and ν axes in a discretization model, respectively. Δx and Δz are the pixel pitches along the x and z axes in a discrete object. G̃′u(i) is sampled by detectors. The measurement data is expressed as G̃u(j) = ∑i G̃′u(i) rect((iΔx − jΔν − Du)/Δν), where Nν, j, Δν, and Du are the number of detectors in a unit, an index for the detectors in a unit, the pixel pitch of the detectors, and the center of the center detector on the ν axis in the u-th unit. Then, the measurement data G̃ can be written as
where S̄u(m) = ⌊(Lu + Su(mΔz) − Du)/Δx + 0.5⌊ and W̄u(m) = Wu(mΔz)Δz· ⌊·⌋ is the floor function.
3.3. System matrix
We assume that Δx = Δν/Nu and Nx = Nν Nu, which are both typical assumptions in TOMBO imaging [4, 5, 6, 14]. Thus, the numbers of elements in the measurement data and the object are Ng = Nx and Nf = Nx Nz.
As indicated in Eq. (8), the m-th axial plane in F̃ is shifted by S̄u(m) and multiplied with W̄u(m). which denotes the coding operation for the m-th axial plane in F̃ in the u-th unit is expressed as,
where C′ m,u (p,q) is the (p,q)-th element in the matrix C′ m,u.
represents the coding operation implemented by the TOMBO system on the object f (F̃ in vector form) in the u-th unit and is written as
where O is a Nx × Nx zero matrix.
The matrix which sums all of the axial layers is defined by
where denotes an identity matrix.
The downsampling matrix can be defined by
where 1 and 0 denote a Nu × 1 vector whose elements are all 1 and a Nu × 1 vector whose elements are all 0, respectively. A superscript T indicates a transpose of a matrix.
Therefore, the measurement data G̃ on the u-th unit is TQCuf, which means, firstly, an object is coded in each unit by Cu, secondly, the coded data is integrated on to a detector array by Q, and lastly, the integrated data is downsampled with detectors by T. The sensing matrix is expressed by
4. Implementation of proposed acquisition schemes
The proposed coding schemes can be implemented for a wide array of practical applications. Each application would rely on some physical optical elements to implement the coding scheme expressed by Eqs. (6) or (7). In this section, we present examples of the coding scheme for spectral imaging and polarization imaging. Using similar schemes, physical coding strategies for range, time, spectrum, polarization, large dynamic range, and wide field-of-view may be available.
Physical codings for spectral imaging using SI and WI are illustrated in Fig. 3. SI for spectral imaging can be implemented by using dispersive elements (e.g., prisms). The elements in each unit have different dispersion directions as shown in Fig. 3(a). The dispersion results in different shifts for each spectral slice. In Eq. (7), z represents the wavelength. The shift corresponds to Su(z).
WI for spectral imaging may be implemented with multi-band pass filters placed above or below the lenslet as shown in Fig. 3(b). Each of the filters has different pass-bands. Pass-bands and stop-bands are represented with Wu(z) = 1 and Wu(z) = 0 in Eq. (7), respectively. A stack of bandstop filters or a patch of bandpass filters may be used to substitute for the multi-band pass filter.
Figure 4 shows a conceptual diagram for polarization imaging with the proposed codings. SI for polarization imaging may be performed with birefringent linear polarizers. The elements split an incident ray into two polarized rays. Hence, an image at each polarization angle is shifted. Each unit has different shift for each polarization angle as shown in Fig. 4(a). Here, z represents a linear polarization angle. The shift corresponds to Su(z) in Eq. (7).
WI for polarization imaging may be performed with polarization plates. Polarization plates with different linear polarization angle are placed above or below the lenslet as shown in Fig. 4(b). The weight distribution is expressed as Wu(z) = cos2(Pu − z), where Pu is the polarization angle in the u-th unit. A patch of polarization plates, where each plate has a different polarization angle, allows flexibility in the design of a weight distribution.
5. Simulation of the proposed concept
The concept of multi-dimensional TOMBO imaging was verified through application independent simulations. These general simulations could readily modified for a specific application like those mentioned in the previous section.
A method called two-step iterative shrinkage/thresholding algorithm (TwIST)  was used for reconstruction. TwIST is an interative convex optimization algorithm that uses two previous estimates to improve convergence properties for the problem described by Eq. (5).
For simplicity, a shift in SI was assumed as Su(z) = (Auz + Bu)Δx. Au and Bu are a gradient and a bias, respectively, of the shear-transformation in the u-th unit defined as Au = (−2u/(Nu − 1) +1)A 0 and Bu = −AuNzΔz/2. For example, A 0 = 1.0 of Nu = 3 indicates that A 0 = 1.0, A 1 = 0.0, and A 2 = −1.0. A shift at the center axial plane, where z = NzΔz/2, is Su(z) = 0.0. Au and Bu of the y axis is the same as those of the x axis. A weight distribution in WI was assumed to be a binary pattern. In the m-th axial plane, h units were set as Wu(mΔz) = 1 in Eq. (7). The h units were randomly chosen, while the other Nu 2 − h units were set as Wu(mΔz) = 0. In this case, the maximum number of separable axial planes is fixed . A lenslet’s position Lu in Eq. (8) was randomly set in each unit. The range was [−Δν/2, Δν/2], where Δν is the pixel pitch of the detectors. The position of the center detector in a unit is Du = 0.
Figure 5 shows a simulation of four-dimensional data acquisition using the two TOMBO coding schemes. An object whose size is 128 × 128 × 4 × 2 and measurement data whose size is 128 × 128 are shown in Figs. 5(a) and 5(b). The compression ratio is 8, which is calculated as Nf/Ng, where Nf and Ng are the numbers of elements in an object and measurement data, respectively. In Fig. 5, the object and the simulation results are reshaped to 128 × 128 × 8 for display. SI with A 0 = 1.0 and WI with h = 3 were used for the z 0 and z 1 axes, respectively. The measurement signal-to-noise ratio (SNR) in the presence of additive white Gaussian noise and the number of units were 30 dB and 2 × 2, respectively. The object estimate sparsity in gradients was enforced using the total variation (TV) . Two-dimensional TV was applied independently for each axial plane as , where lx and ly are indices for the x and y axes in a discrete object. ∇[·]lx,ly is a two-dimensional gradient vector for the x and y directions, and ∣ · ∣ denotes the magnitude of the gradient vector. The object consists of multiple Shepp-Logan phantoms, which is sparse in two-dimensional TV domain. The total number of non-zero gradient values was s = 3242. The reconstruction results with TwIST and the Richardson Lucy method (RL)[21, 22] are compared in Figs. 5(c) and 5(d). Their peak signal-to-noise ratios (PSNR) were 32.1 dB and 19.4 dB, respectively. The PSNR is found by computing 20log10( ), where MAX and MSE represent the maximum of the signal values and the mean squared error between two signals, respectively.
CS object reconstruction accuracy may be estimated using a correlation between columns of Θ, which is the multiplication of a sensing matrix Φ and a basis matrix Ψ, in Eq. (1). When a correlation between two columns in Θ is high, it is difficult to resolve the two components in a transform coefficient vector β corresponding to the columns in Θ. So that, the reconstruction accuracy depends on not only Φ but also Ψ.
When a two-dimensional basis is used for each axial plane as in the previous simulation, the reconstruction accuracy along the axial direction in an object estimate may be roughly predicted based on the correlation between columns of Φ corresponding to two axial planes. Let denote
respectively, where and are a two-dimensional basis matrix for each axial plane and a Nx × Nx zero matrix. If the correlation between a column in ϕm and that on another axial plane is high, then the corresponding correlation between a column in ϕmψ and that on another axial plane may be high. In this case, it is difficult to resolve the axial planes. When ∣A 0∣ is small or h is large, the correlation between a column in ϕm and that on another axial plane is high. For example, Fig. 5(e) shows a reconstruction result where SI with A 0 = 0.2 was used for the z 0 axis. The reconstruction accuracy along the axial direction with A 0 = 0.2 was lower than that with A 0 = 1.0.
Figure 6 shows another simulation of five-dimensional data acquisition with the discrete wavelet transform (DWT). The sizes of the object in Fig. 6(a) and the measurement data in Fig. 6(b) were 128 × 128 × 2 × 2 × 2 and 128 × 128. The compression ratio is 8. Two-dimensional DWT was applied for each axial plane. The object consists of multiple natural images where the small coefficients in two-dimensional DWT were truncated. In two-dimensional DWT domain, the total number of non-zero DWT coefficients across all the planes was s = 2000. SI with A 0 = 3.0, WI with h = 12, and WI with h = 12 were used for the z 0, z 1, and z 2 axes, respectively. The measurement SNR and the number of the units were 30 dB and 4 × 4, respectively. The reconstruction results with TwIST and RL are compared in Figs. 6(c) and 6(d). Their PSNRs were 24.5 dB and 15.4 dB, respectively. Figure 6(e) shows a reconstruction result where WI with h = 15 was used for the z 1 and z 2 axes. The reconstruction accuracy along the axial direction with h = 15 was lower than that with h = 12.
Figure 7 illustrates the sensitivity of the reconstructions to noise as represented by a curve relating the measurement SNR to the reconstruction PSNRs. Also, the performance is compared to that of an ideal Gaussian random compressive sensing matrix, which is known to require a (optimally) small number of measurements to satisfy the RIP compared to what the proposed systems would require. Since the proposed systems usually have a worse RIP meaning that more measurements are required to obtain higher reconstruction accuracy, they present worse reconstruction accuracy compared to that of the Gaussian random matrix. However, such random sensing matrices would be very difficult to physically implement in general with the current technology. In addition, it is not clear how such random sensing systems may provide the compactness of physical systems and snapshot acquisition functionality, which are benefits of the proposed approach.
We proposed a generalized sampling approach for multi-dimensional object acquisition using TOMBO. The sampling uses multi-dimensional sheared and weighted integration in each unit. The mathematical model and some examples of the proposed measurement approach were presented. The simulation demonstrated reconstruction of an object with the number of elements totaling eight times that of the measurement data. A method inspired by compressive sampling was used in the reconstruction. These schemes enable us to acquire a multi-dimensional object with a single two-dimensional measurement by a compact imaging system. Also, these schemes extend abilities of compound-eye imaging systems to various applications.
A useful avenue for future work is to analyze theoretical properties of the proposed systems. It would be interesting to see how many more measurements would be required in general for the proposed systems to produce a certain accuracy, which is related to the validity of the sparsity assumption in the proposed systems. Also, it would be very useful to find a more efficient sparsity transformation that provides a better RIP and a better sparse representation of the objects of interest. Furthermore, we plan to investigate other coding schemes that may provide a better RIP overall to better exploit the sparsity assumption.
References and links
1. R. Ng, “Fourier slice photography,” in “SIGGRAPH ’05: ACM SIGGRAPH 2005 Papers,” (ACM, New York, NY, USA, 2005), pp. 735–744.
3. R. Athale, D. M. Healy, D. J. Brady, and M. A. Neifeld, “Reinventing the camera,” Opt. Photon. News 19, 32–37 (2008). [CrossRef]
4. J. Tanida, T. Kumagai, K. Yamada, S. Miyatake, K. Ishida, T. Morimoto, N. Kondou, D. Miyazaki, and Y. Ichioka, “Thin observation module by bound optics (TOMBO): concept and experimental verification,” Appl. Opt. 40, 1806–1813 (2001). [CrossRef]
6. R. Horisaki, S. Irie, Y. Ogura, and J. Tanida, “Three-dimensional information acquisition using a compound imaging system,” Optical Review 14, 347–350 (2007). [CrossRef]
7. W. Zhou and J. Leger, “Grin-optics-based hyperspectral imaging micro-sensor,” Proc. SPIE 6765, 676502 (2007). [CrossRef]
8. R. J. Plemmons, S. Prasad, S. Matthews, M. Mirotznik, R. Barnard, B. Gray, V. P. Pauca, T. C. Torgersen, J. van der Gracht, and G. Behrmann, “Periodic: Integrated computational array imaging technology,” in “Computational Optical Sensing and Imaging,” (2007), p. CMA1.
9. R. Horstmeyer, R. Athale, and G. Euliss, “Modified light field architecture for reconfigurable multimode imaging,” Proc. SPIE 7468, 746804 (2009). [CrossRef]
10. E. J. Candes and M. B. Wakin, “An introduction to compressive sampling,” Signal Processing Magazine, IEEE 25, 21–30 (2008). [CrossRef]
11. M. Wakin, J. Laska, M. Duarte, D. Baron, S. Sarvotham, D. Takhar, K. Kelly, and R. Baraniuk, “An architecture for compressive imaging,” in “ICIP06,” (2006), pp. 1273–1276.
14. K. Nitta, R. Shogenji, S. Miyatake, and J. Tanida, “Image reconstruction for thin observation module by bound optics by using the iterative backprojection method,” Appl. Opt. 45, 2893–2900 (2006). [CrossRef] [PubMed]
15. Y. Tsaig and D. L. Donoho, “Compressed sensing,” IEEE Transactions on Information Theory 52, 1289–1306 (2006). [CrossRef]
16. E. J. Candes and T. Tao, “Decoding by linear programming,” IEEE Trans. Info. Theory 51, 4203–4215 (2005). [CrossRef]
18. E. Hecht, Optics (Addison Wesley, 2001), 4th ed.
19. J. M. Bioucas-Dias and M. A. T. Figueiredo, “A new TwIST: Two-step iterative shrinkage/thresholding algorithms for image restoration,” IEEE Trans. Image Proc. 16, 2992–3004 (2007). [CrossRef]
20. L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Phys. D 60, 259–268 (1992). [CrossRef]
21. W. H. Richardson, “Bayesian-based iterative method of image restoration,” J. Opt. Soc. Am. 62, 55–59 (1972). [CrossRef]
22. L. B. Lucy, “An iterative technique for the rectification of observed distributions,” Astron. J. 79, 745–754 (1974). [CrossRef]
23. Q. Huynh-Thu and M. Ghanbari, “Scope of validity of PSNR in image/video quality assessment,” Electron. Lett. 44, 800–801 (2008). [CrossRef]
24. R. Gribonval and M. Nielsen, “Sparse representations in unions of bases,” IEEE Trans. Info. Theory 49, 3320–3325 (2003). [CrossRef]