In this paper, we address 3D object visualization and recognition with multi-wavelength digital holography. Color features of 3D objects are obtained by recording with multiple wavelengths. A perfect superimposition technique generates reconstructed images of the same size. Two statistical pattern recognition techniques, principal component analysis and mixture discriminant analysis, analyze the multi-spectral information in the reconstructed images. Class-conditional probability density functions are estimated during the training process. A maximum likelihood decision rule assigns each unlabeled image to one of the trained classes. It is shown that a small number of training images is sufficient for color object classification.
© 2007 Optical Society of America
1. Introduction
There has been growing interest in object recognition in two-dimensional (2D) and three-dimensional (3D) environments [1–10]. In digital holography, 3D information is recorded by optical interferometry. The recorded 3D information can be numerically processed to reconstruct holographic images at different longitudinal distances and perspectives. Various techniques using digital holography have been developed for 3D visualization, image encryption, and object recognition [3, 5–17]. Most of this work uses a single wavelength. One study has addressed recognition using color digital holography with the conventional correlation method.
In this paper, we address 3D color object visualization and statistical pattern recognition using multi-wavelength digital holography, in which multi-spectral 3D spatial information is recorded and reconstructed [12–15]. Holographic images are numerically reconstructed by the Fresnel transform method. The size of the reconstructed images can be controlled independently so that images reconstructed at different distances and wavelengths superimpose perfectly. This technique is well suited to fast processing, such as real-time display, and to analysis tasks including pattern recognition.
Statistical pattern recognition analyzes the multi-spectral information from color objects simultaneously. Two statistical pattern recognition approaches are adopted: principal component analysis (PCA) and mixture discriminant analysis (MDA). The PCA is a well-known dimensionality reduction technique that extracts low-dimensional features from high-dimensional data [18–21]. The high-dimensional vectors are projected onto a subspace spanned by eigenvectors of the covariance matrix of the training vectors. We extract low-dimensional features that are characterized by different colors. Since the images are reconstructed by means of the superimposition reconstruction technique, no matching process is needed between the images of different wavelengths.
The multi-spectral features are trained simultaneously by the MDA and are complementary to one another for classification. The population distribution of one class is modeled by the weighted sum of the probability density functions of several components (clusters). In the Gaussian mixture model, the probability density function of each component is assumed to be Gaussian. Expectation maximization (EM) is a common approach for estimating the parameters of the Gaussian mixture model [19–27]. The MDA has often been adopted for cluster analysis, classification tasks, and density estimation [22–27]. The class-conditional probability density function for the population of each class is obtained using the PCA and the MDA during training. For decision making, the class of an unlabeled image is determined by the maximum likelihood (ML) decision rule.
The organization of the paper is as follows. 3D object recording and reconstruction with multi-wavelength digital holography are described in Section 2. The PCA and MDA with the EM algorithm are described in Section 3. The ML decision rule and evaluation metrics are also discussed in Section 3. Experimental results are presented in Section 4. Conclusions follow in Section 5.
2. Multi-wavelength digital holography
The use of multi-wavelength digital holography has been studied for 3D color display [12–15]. Figure 1 illustrates the optical setup for multi-wavelength digital holography. In our approach, two lasers (l1 and l2) with different wavelengths are used: one in the red region ($\lambda_1 = 632.8$ nm) and the other in the green region ($\lambda_2 = 532.0$ nm). The optical configuration lets the two laser beams propagate along the same paths for both the reference and the object beams. The reflecting prism in the path of the red laser beam allows the optical paths of the two interfering beams to be matched to within the coherence length of the laser. The object (a toy warrior), shown in Fig. 1, is placed at a distance of 500 mm from the CCD (charge-coupled device) array. Two holograms are recorded, one at each wavelength, and are reconstructed separately by the Fresnel transformation method.
Different details are visible in the reconstructed images because different color information is recorded at each wavelength. Moreover, the sizes of the reconstructed images differ because the reconstruction pixel size depends on the wavelength. The pixel size ($\Delta x$, $\Delta y$) of the CCD array differs from the pixel size ($\Delta x'$, $\Delta y'$) in the image plane, where $\Delta x$ and $\Delta y$ are the resolutions of the CCD in the x and y directions, respectively, and $\Delta x'$ and $\Delta y'$ are the resolutions of the image plane in the x and y directions, respectively. They are related by $\Delta x' = d_s\lambda/(N_x\Delta x)$ and $\Delta y' = d_s\lambda/(N_y\Delta y)$, where $d_s$ is the longitudinal depth, $\lambda$ is the wavelength, and $N_x$ and $N_y$ are the numbers of CCD pixels in the x and y directions, respectively.
Recently, it has been demonstrated that the size of the reconstructed images can be controlled independently so that the reconstructed images superimpose perfectly. For perfect superimposition, we increase the number of pixels by zero padding. Let $N_{x1}$ be the increased size of the hologram plane in the x direction after zero padding. If one hologram has been recorded with wavelength $\lambda_1$ and the other with $\lambda_2$, and $\lambda_1 > \lambda_2$, then the number of pixels of the hologram recorded with $\lambda_1$ is changed to
$$N_{x1} = N_{x2}\,\frac{\lambda_1}{\lambda_2}, \tag{1}$$
where $N_{x2}$ is the size of the hologram plane in the x direction for the wavelength $\lambda_2$. Consequently, we obtain the same resolution in the reconstructed images for the holograms of different wavelengths:
$$\Delta x_1' = \frac{d_s\lambda_1}{N_{x1}\Delta x} = \frac{d_s\lambda_2}{N_{x2}\Delta x} = \Delta x_2', \tag{2}$$
where $\Delta x_1'$ and $\Delta x_2'$ are the resolutions of the image plane in the x direction for the wavelengths $\lambda_1$ and $\lambda_2$, respectively. The reconstructed image size in the y direction is controlled in the same way. In the experiments, the size of the hologram plane ($N_{x2}$) is 1024 pixels; therefore, $N_{x1}$ becomes 1218 pixels.
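The zero-padding rule can be checked numerically. The following is a minimal sketch, assuming the padded size scales with the wavelength ratio (as implied by the pixel-size relation above) and using the experimental values quoted in this paper (1024 pixels, 6.7 µm pitch, 500 mm distance). The function names are illustrative only.

```python
def image_pixel_size(distance, wavelength, n_pixels, ccd_pitch):
    """Pixel size in the image plane: d * lambda / (N * pitch)."""
    return distance * wavelength / (n_pixels * ccd_pitch)


def padded_size(n_short, lam_long, lam_short):
    """Zero-padded hologram size for the longer wavelength, in whole pixels."""
    return round(n_short * lam_long / lam_short)


LAM_RED, LAM_GREEN = 632.8e-9, 532.0e-9  # wavelengths in metres
PITCH = 6.7e-6                           # CCD pixel size in metres
DIST = 0.5                               # reconstruction distance in metres
N_GREEN = 1024                           # hologram size for the green laser

N_RED = padded_size(N_GREEN, LAM_RED, LAM_GREEN)
dx_red_raw = image_pixel_size(DIST, LAM_RED, N_GREEN, PITCH)  # without padding
dx_red = image_pixel_size(DIST, LAM_RED, N_RED, PITCH)        # with padding
dx_green = image_pixel_size(DIST, LAM_GREEN, N_GREEN, PITCH)

print(N_RED)  # 1218
print(dx_red_raw, dx_red, dx_green)
```

Because the padded size must be an integer, the two pixel sizes agree only up to the rounding of $N_{x1}$, which here is a relative difference well below 0.01%.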
3. Statistical pattern recognition techniques
In this section, first, the PCA is presented, and then the MDA with the EM algorithm is discussed. Finally, the ML decision rule and evaluation metrics are presented.
3.1 Principal component analysis
Let a column vector composed of the pixel values of a reconstructed image be one realization of a random vector $x \in R^{d\times 1}$, where $R^{d\times 1}$ is the d-dimensional Euclidean space and d equals the number of pixels in the reconstructed image. The PCA projects the d-dimensional vectors onto an l-dimensional subspace ($l \le d$) [18–21]. For a real d-dimensional random vector x, let the mean vector be $\mu_x = E(x)$ and the covariance matrix be $\Sigma_{xx} = E[(x-\mu_x)(x-\mu_x)^t]$, where the superscript t denotes transpose. The PCA subspace is spanned by the orthonormal eigenvectors of the covariance matrix, that is, $\Sigma_{xx}E = EV$, where the columns of E are the normalized eigenvectors $e_i$, i.e. $E = [e_1, \ldots, e_d]$, and the diagonal matrix V is composed of the eigenvalues $v_i$, i.e. $V = \mathrm{diag}(v_1, \ldots, v_d)$. For the PCA, the projection matrix $W_p$ is the same as the eigenvector matrix E. Therefore, the vector y projected by $W_p$ is
$$y = W_p^t x. \tag{3}$$
The PCA diagonalizes the covariance matrix of y, i.e. $\Sigma_{yy} = E[(y-\mu_y)(y-\mu_y)^t] = V$, where $\mu_y = E(y)$. If we choose the projection matrix $W_p = [e_1, \ldots, e_l]$, the subspace is spanned by the corresponding l eigenvectors. It is a well-known property of the PCA that choosing the l eigenvectors with the largest eigenvalues minimizes the mean-squared error between a vector x and a restored vector $\hat{x}$. The mean-squared error is defined as
$$MSE(\hat{x}) = E\|x - \hat{x}\|^2 = \sum_{i=l+1}^{d} v_i, \tag{4}$$
and the restored vector $\hat{x}$ is defined as
$$\hat{x} = W_p y = \sum_{i=1}^{l} (e_i^t x)\, e_i, \tag{5}$$
where the $v_i$ are the eigenvalues ordered as $v_d \le v_{d-1} \le \cdots \le v_2 \le v_1$. The PCA can reduce the dimension of the vectors while retaining the dominant features of the object structure and reducing redundant and noisy data.
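The projection and restoration steps can be illustrated for a one-dimensional PCA subspace. The following is a minimal sketch in plain Python; it is not the authors' implementation. It assumes that the data are centered on the sample mean before projection and that the leading eigenvector is found by power iteration. The toy three-pixel "images" and all function names are hypothetical.

```python
import math


def mean_vector(samples):
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def covariance_matrix(samples, mu):
    n, d = len(samples), len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for x in samples:
        c = [xi - mi for xi, mi in zip(x, mu)]
        for a in range(d):
            for b in range(d):
                cov[a][b] += c[a] * c[b] / n
    return cov


def leading_eigenpair(cov, iterations=200):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    d = len(cov)
    e = [1.0 / math.sqrt(d)] * d  # start vector; must not be orthogonal to e_1
    for _ in range(iterations):
        w = [sum(cov[a][b] * e[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(v * v for v in w))
        e = [v / norm for v in w]
    eigenvalue = sum(e[a] * cov[a][b] * e[b] for a in range(d) for b in range(d))
    return eigenvalue, e


def project(x, mu, e):
    """Scalar feature: component of the centered vector along e."""
    return sum((xi - mi) * ei for xi, mi, ei in zip(x, mu, e))


def restore(y, mu, e):
    """Back-projection of the scalar feature into the original space."""
    return [mi + y * ei for mi, ei in zip(mu, e)]


# Toy data: 3-pixel vectors that vary mostly along the direction (1, 2, 0).
samples = [[float(t), 2.0 * t, 0.1 * (-1) ** i] for i, t in enumerate(range(-5, 6))]
mu = mean_vector(samples)
cov = covariance_matrix(samples, mu)
v1, e1 = leading_eigenpair(cov)

features = [project(x, mu, e1) for x in samples]
mse = sum(
    sum((xi - ri) ** 2 for xi, ri in zip(x, restore(y, mu, e1)))
    for x, y in zip(samples, features)
) / len(samples)
trace = sum(cov[a][a] for a in range(len(cov)))
print(v1, mse, trace - v1)
```

With centered data, the restoration error equals the sum of the discarded eigenvalues, which the last line checks as the trace of the covariance matrix minus the retained eigenvalue.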
3.2 Mixture discriminant analysis
To deal with the different visible features, the population distribution of the color holographic images is modeled as a mixture of several component probability density functions [22–27]. If the class j is composed of $G_j$ components, the class-conditional probability density function is
$$p_j(y) = \sum_{k=1}^{G_j} P(w_{jk})\, p(y \mid w_{jk}), \quad j = 1, \ldots, N_c, \tag{6}$$
where the vector y is the vector projected onto the subspace by the PCA, $w_{jk}$ denotes the event that the vector y belongs to the component k of the class j, $N_c$ is the number of classes under investigation, and $P(w_{jk})$ is the probability that the event $w_{jk}$ occurs.
In the Gaussian mixture model, the probability density function of each component is assumed to be multivariate Gaussian:
$$p(y \mid w_{jk}) = N(\mu_{jk}, \Sigma_{jk}), \tag{7}$$
where $N(\cdot)$ denotes the multivariate Gaussian distribution, and $\mu_{jk}$ and $\Sigma_{jk}$ are the mean vector and the covariance matrix of the component k in the class j, respectively. Therefore, solving the MDA with the Gaussian mixture model is equivalent to estimating three unknown parameters ($P(w_{jk})$, $\mu_{jk}$, $\Sigma_{jk}$) for each Gaussian component of each class.
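Let the log-likelihood function of the joint density with $n_j$ training images of the class j be
$$L_j = \sum_{t=1}^{n_j} \log\left[\sum_{k=1}^{G_j} P(w_{jk})\, p(y_{jt} \mid w_{jk})\right], \tag{8}$$
where $y_{jt}$ is the t-th training vector of the class j. Setting the derivatives of Eq. (8) with respect to the parameters to zero gives the maximum likelihood conditions
$$\hat{P}(w_{jk} \mid y_{jt}) = \frac{\hat{P}(w_{jk})\, p(y_{jt} \mid w_{jk}; \hat{\mu}_{jk}, \hat{\Sigma}_{jk})}{\sum_{m=1}^{G_j} \hat{P}(w_{jm})\, p(y_{jt} \mid w_{jm}; \hat{\mu}_{jm}, \hat{\Sigma}_{jm})}, \tag{9}$$
$$\hat{\mu}_{jk} = \frac{\sum_{t=1}^{n_j} \hat{P}(w_{jk} \mid y_{jt})\, y_{jt}}{\sum_{t=1}^{n_j} \hat{P}(w_{jk} \mid y_{jt})}, \tag{10}$$
$$\hat{\Sigma}_{jk} = \frac{\sum_{t=1}^{n_j} \hat{P}(w_{jk} \mid y_{jt})\, (y_{jt} - \hat{\mu}_{jk})(y_{jt} - \hat{\mu}_{jk})^t}{\sum_{t=1}^{n_j} \hat{P}(w_{jk} \mid y_{jt})}, \tag{11}$$
$$\hat{P}(w_{jk}) = \frac{1}{n_j} \sum_{t=1}^{n_j} \hat{P}(w_{jk} \mid y_{jt}), \tag{12}$$
where $\hat{\mu}_{jk}$, $\hat{\Sigma}_{jk}$, and $\hat{P}(w_{jk})$ are the estimators for the mean, covariance, and mixing weight, respectively.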
As shown in Eqs. (9)–(12), the maximum likelihood solution is a set of highly non-linear coupled equations. The EM algorithm is a popular approach to solving them. It uses a set of training data to estimate the parameters iteratively until convergence. Figure 2 shows the block diagram of the EM algorithm for the class j; i is the iteration number, and $i_{max}$ and $\varepsilon$ are the termination criteria for the iteration. The EM procedure is as follows. For initialization, we set the initial parameter estimates for $k = 1, \ldots, G_j$. During the E step, we compute $\hat{P}(w_{jk} \mid y_{jt})$ for $k = 1, \ldots, G_j$ and $t = 1, \ldots, n_j$ as in Eq. (9). During the M step, we compute $\hat{\mu}_{jk}$, $\hat{\Sigma}_{jk}$, and $\hat{P}(w_{jk})$ for $k = 1, \ldots, G_j$ as in Eqs. (10)–(12). The E and M steps are iterated until the convergence criterion with threshold $\varepsilon$ is satisfied or $i > i_{max}$. In the experiments, we set $\varepsilon$ to $10^{-50}$ and $i_{max}$ to 100. The E and M steps are repeated for each class, $j = 1, \ldots, N_c$.
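The E and M steps can be sketched for the simplest case of a one-dimensional PCA subspace. This is an illustrative sketch under stated assumptions, not the authors' code: the initial means are supplied by the caller rather than obtained by the Linde-Buzo-Gray procedure, the stopping rule tests the change in log-likelihood, and the variance floor `var_floor` is a hypothetical safeguard against collapsing components.

```python
import math


def gaussian(y, mean, var):
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, init_means, i_max=100, eps=1e-12, var_floor=1e-6):
    """Fit a 1-D Gaussian mixture; returns (weights, means, variances)."""
    n, g = len(data), len(init_means)
    means = list(init_means)
    weights = [1.0 / g] * g
    centre = sum(data) / n
    variances = [sum((y - centre) ** 2 for y in data) / n] * g
    previous = -math.inf
    for _ in range(i_max):
        # E step: posterior probability of each component for each sample.
        posteriors, log_likelihood = [], 0.0
        for y in data:
            joint = [weights[k] * gaussian(y, means[k], variances[k]) for k in range(g)]
            total = sum(joint)
            log_likelihood += math.log(total)
            posteriors.append([p / total for p in joint])
        # M step: re-estimate mean, variance and mixing weight of each component.
        for k in range(g):
            mass = sum(p[k] for p in posteriors)
            means[k] = sum(p[k] * y for p, y in zip(posteriors, data)) / mass
            spread = sum(p[k] * (y - means[k]) ** 2 for p, y in zip(posteriors, data)) / mass
            variances[k] = max(spread, var_floor)
            weights[k] = mass / n
        # Stop when the log-likelihood no longer changes (assumed criterion).
        if abs(log_likelihood - previous) < eps:
            break
        previous = log_likelihood
    return weights, means, variances


# Toy features for one class: two clusters, e.g. one per wavelength.
cluster_a = [-0.2, -0.1, 0.0, 0.1, 0.2]
cluster_b = [9.8, 9.9, 10.0, 10.1, 10.2]
weights, means, variances = em_gmm_1d(cluster_a + cluster_b, init_means=[1.0, 9.0])
print(weights, means, variances)
```

On this well-separated toy data the estimates settle on two equally weighted components centered near 0 and 10. Real features may need a more careful initialization, which is why the choice of initial parameters matters.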
The dimension of the vector y on the PCA subspace must be determined in advance. In the experiments, several dimensions for y are tested. The number of clusters $G_j$ in Eq. (6) is another factor to be considered carefully. We set the number of clusters equal to the number of wavelengths, which is two in the experiments. The initialization of the parameters is another important factor. We use the Linde-Buzo-Gray initialization, which satisfies Lloyd's optimality conditions [28,29]. In our experiments, the Linde-Buzo-Gray initialization performed better than initialization using typical k-means clustering.
3.3 Maximum likelihood decision rule
Our decision rule evaluates the class-conditional probability density functions at an unknown test image. Let a test vector $y_{test}$ be the unlabeled vector on the PCA subspace as in Eq. (3). In the experiments, a test vector corresponds to a reconstructed image from one hologram. Therefore, a test vector $y_{test}$ contains a single spectral feature of the object.
The class-conditional density functions determine the class of the test vector as
$$y_{test} \in C_{\hat{j}} \quad \text{if} \quad \hat{j} = \arg\max_{j = 1, \ldots, N_c} \hat{p}_j(y_{test}), \tag{13}$$
where $C_{\hat{j}}$ is the set of the class $\hat{j}$, and $\hat{p}_j(y)$ is the class-conditional probability density function with the estimated parameters.
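The decision step amounts to evaluating each class's mixture density at the test vector and picking the largest. The following is a minimal sketch for a one-dimensional feature. The class labels and mixture parameters are hypothetical placeholders, not values estimated from the experiments.

```python
import math


def gaussian(y, mean, var):
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def class_density(y, components):
    """Mixture density: sum of weight * Gaussian over (weight, mean, var) triples."""
    return sum(w * gaussian(y, m, v) for w, m, v in components)


def classify(y_test, class_models):
    """Return the label whose class-conditional density is largest at y_test."""
    return max(class_models, key=lambda label: class_density(y_test, class_models[label]))


# Hypothetical trained models: each class has two components, e.g. one per wavelength.
class_models = {
    "object_1": [(0.5, -3.0, 0.5), (0.5, 1.0, 0.5)],
    "object_2": [(0.5, 4.0, 0.5), (0.5, 8.0, 0.5)],
}

print(classify(0.8, class_models))
print(classify(7.5, class_models))
```

Because each class model carries one component per wavelength, a test feature from either wavelength can fall near one of the components of its class, which is how a single-wavelength test image can still be classified.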
To evaluate the performance, we calculate two measures, the correct classification rate and the false classification rate, which are defined in Eqs. (14) and (15), respectively.
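As an illustration of how such rates are commonly computed, the sketch below assumes the following definitions, which may differ in detail from Eqs. (14) and (15). The correct classification rate of a class is the fraction of its test images decided as that class. The false classification rate is the fraction of test images from the other classes decided as that class. The labels are toy data.

```python
def classification_rates(true_labels, decided_labels, target):
    """Correct and false classification rates for one target class."""
    own = [d for t, d in zip(true_labels, decided_labels) if t == target]
    others = [d for t, d in zip(true_labels, decided_labels) if t != target]
    correct_rate = sum(d == target for d in own) / len(own)
    false_rate = sum(d == target for d in others) / len(others)
    return correct_rate, false_rate


# Toy labels for three classes.
truth = ["a", "a", "a", "a", "b", "b", "b", "b", "c", "c"]
decided = ["a", "a", "a", "b", "b", "b", "a", "b", "c", "a"]
rates_a = classification_rates(truth, decided, "a")
print(rates_a)
```

Here three of the four class "a" images are recognized, and two of the six images from the other classes are wrongly assigned to "a".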
4. Experiments and simulation results
Color information of three objects (toy warriors) is recorded with two different wavelengths, as illustrated in Fig. 1. The height of the objects is around 12 mm. They are placed at a distance of 500 mm from the CCD array, since this is the minimum distance at which the whole object can be reconstructed. The power levels of the $\lambda_1$ and $\lambda_2$ lasers are 35 mW and 50 mW, respectively, and their beam diameters are 1.2 mm and 1.8 mm, respectively. We adjust the reflecting prism (RP) to equalize the paths of the object beam and the reference beam so that the optical path difference is within the coherence length of the laser. The coherence length of the red laser is about 20 cm, while that of the green laser is on the order of meters. The CCD array is a JAI Mod. CV M4. The spatial resolution of the CCD is 1280×1024 pixels, and the size of each pixel is 6.7 µm.
Holographic images are computationally reconstructed at image planes of different depths. The perfect superimposition technique is used to obtain holographic images of the three objects. For each hologram, one hundred images are reconstructed at 1 mm intervals from 451.0 mm to 550.0 mm. Figures 3–5 show movies of the reconstructed images of the three objects captured at the two wavelengths. Different features are visible in images of the same object depending on the wavelength. Each movie shows only the 50 images reconstructed at 451, 453, …, 549 mm. In the figures, the holographic images are cropped to 500×400 pixels, assuming that the images are segmented. We present the color (RGB) images in Fig. 6, where the objects are reconstructed at 500 mm. The red component is the holographic image obtained with the red laser ($\lambda_1$), and the green component is the one obtained with the green laser ($\lambda_2$).
For training, two images are randomly chosen from each set of 100 holographic images; therefore, the total number of training images is four for each class. We classify the remaining 196 images of each class, which are not used for training. The same experiment is repeated for 100 runs with different randomly selected training images. The correct and false classification rates in Eqs. (14) and (15) are averaged over the 100 runs. Note that we train on the multi-spectral information of the objects but classify unknown images that each carry a single spectral feature. Although multi-spectral information is embedded in the system, the objects can therefore be recognized from a single spectral feature, which requires fewer resources for recording and reconstruction.
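The sampling protocol for one run and one class can be sketched as follows. This assumes the two training images are drawn per wavelength, independently for each wavelength, and that images are identified by wavelength and reconstruction depth. The seed and all names are illustrative.

```python
import random


def split_run(depths_mm, wavelengths, n_train_per_wavelength, rng):
    """Randomly pick training images per wavelength; the rest become test images."""
    train, test = [], []
    for wavelength in wavelengths:
        chosen = set(rng.sample(depths_mm, n_train_per_wavelength))
        for depth in depths_mm:
            (train if depth in chosen else test).append((wavelength, depth))
    return train, test


depths_mm = list(range(451, 551))  # 100 reconstruction planes, 1 mm apart
rng = random.Random(0)             # fixed seed only to make the sketch repeatable
train, test = split_run(depths_mm, ("red", "green"), 2, rng)
print(len(train), len(test))  # 4 196
```

Repeating this split with a fresh random draw for each of the 100 runs and averaging the resulting rates reproduces the structure of the protocol described above.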
The same experiments are performed with 6, 10, and 20 training images from one class. Figure 7 shows the average correct and false classification rates when the dimension of the PCA subspace is one. As the number of training images increases, the correct classification rates rise and the false classification rates fall, since the class-conditional probability density functions are estimated more accurately. Figure 8 illustrates the results when the dimension of the PCA subspace is two.
Note that no image processing technique is applied to the reconstructed images for noise reduction, although noise is caused by the DC term and coherent light diffraction. Better visualization and recognition results may be expected if a conventional noise reduction technique is applied to the images.
5. Conclusions
In this paper, the multi-spectral information of 3D objects is recorded and reconstructed with multi-wavelength digital holography. Images of the same size are reconstructed using the perfect superimposition technique. Multi-wavelength digital holography provides spatial and spectral information about 3D objects. The statistical pattern recognition techniques handle the color information of the objects simultaneously. The PCA extracts low-dimensional feature vectors from the reconstructed images, and the MDA trains on the multiple features to estimate the class-conditional density function. The proposed system is shown to discriminate between different color objects with only a few training images.
References and links
1. A. Mahalanobis and F. Goudail, “Methods for automatic target recognition by use of electro-optic sensors: introduction to the feature issue,” Appl. Opt. 43, 207–209 (2004).
3. B. Javidi, ed., Optical Imaging Sensors and Systems for Homeland Security Applications (Springer, New York, 2005).
4. F. Goudail and P. Refregier, “Statistical algorithms for target detection in coherent active polarimetric images,” J. Opt. Soc. Am. A 18, 3049–3060 (2001).
5. B. Javidi and E. Tajahuerce, “Three-dimensional object recognition by use of digital holography,” Opt. Lett. 25, 610–612 (2000).
6. Y. Frauel, E. Tajahuerce, M. Castro, and B. Javidi, “Distortion-tolerant three-dimensional object recognition with digital holography,” Appl. Opt. 40, 3887–3893 (2001).
7. Y. Frauel and B. Javidi, “Neural network for three-dimensional object recognition based on digital holography,” Opt. Lett. 26, 1478–1480 (2001).
9. B. Javidi, I. Moon, S. Yeom, and E. Carapezza, “Three-dimensional imaging and recognition of microorganism using single-exposure on-line (SEOL) digital holography,” Opt. Express 13, 4492–4506 (2005).
10. S. Yeom, I. Moon, and B. Javidi, “Real-time 3D sensing, visualization and recognition of dynamic biological micro-organisms,” Proc. IEEE 94, 550–566 (2006).
11. J. Maycock, T. Naughton, B. Hennelly, J. McDonald, and B. Javidi, “Three-dimensional scene reconstruction of partially occluded objects using digital holograms,” Appl. Opt. 45, 2975–2985 (2006).
12. I. Yamaguchi, T. Matsumura, and J. Kato, “Phase-shifting color digital holography,” Opt. Lett. 27, 1108–1110 (2002).
13. J. Kato, I. Yamaguchi, and T. Matsumura, “Multicolor digital holography with an achromatic phase shifter,” Opt. Lett. 27, 1403–1405 (2002).
14. P. Ferraro, S. De Nicola, G. Coppola, A. Finizio, D. Alfieri, and G. Pierattini, “Controlling image size as a function of distance and wavelength in Fresnel-transform reconstruction of digital holograms,” Opt. Lett. 29, 854–855 (2004).
15. D. Alfieri, G. Coppola, S. De Nicola, P. Ferraro, A. Finizio, G. Pierattini, and B. Javidi, “Method for superposing reconstructed images from digital holograms of the same object recorded at different distance and wavelength,” Opt. Commun. 260, 113–116 (2006).
18. A. K. Jain, Fundamentals of digital image processing (Prentice-Hall Inc., 1989).
19. R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification 2nd ed. (Wiley Interscience, New York, 2001).
20. K. Fukunaga, Introduction to Statistical Pattern Recognition 2nd ed. (Academic Press, Boston, 1990).
21. C. M. Bishop, Neural Networks for Pattern Recognition (Oxford University Press, New York, 1995).
22. C. Fraley and A. E. Raftery, “Model-based clustering, discriminant analysis, and density estimation,” J. Am. Stat. Assoc. 97, 611–631 (2002).
23. T. Hastie and R. Tibshirani, “Discriminant analysis by Gaussian mixtures,” J. Royal Statistical Society B 58, 155–176 (1996).
24. G. J. McLachlan, Discriminant analysis and statistical pattern recognition (Wiley, New York, 1992).
25. M. M. Dundar and D. Landgrebe, “A model-based mixture-supervised classification approach in hyperspectral data analysis,” IEEE Trans. on Geoscience and Remote Sensing 40, 2692–2699 (2002).
26. M. H. C. Law, M. A. T. Figueiredo, and A. K. Jain, “Simultaneous feature selection and clustering using mixture models,” IEEE Trans. on Pattern Anal. Mach. Intell. 26, 1154–1166 (2004).
27. B. J. Frey and N. Jojic, “Transformation-invariant clustering using the EM algorithm,” IEEE Trans. on Pattern Anal. Mach. Intell. 25, 1–17 (2003).
28. Y. Linde, A. Buzo, and R. M. Gray, “An algorithm for vector quantizer design,” IEEE Trans. Commun. COM-28, 84–95 (1980).
29. J. Jang, S. Yeom, and B. Javidi, “Compression of ray information in three-dimensional integral imaging,” Opt. Eng. 44, 12700-1~10 (2005).