Abstract

In this paper, we propose a new supervised manifold learning approach, supervised preserving projection (SPP), for the depth images of a 3D imaging sensor based on the time-of-flight (TOF) principle. We present a novel manifold formulation for learning the scene information that the TOF camera produces along with its depth images. First, we use local surface patches to approximate the underlying manifold structures represented by the scene information. The fundamental idea is that, because TOF data suffer from nonstatic noise and distance ambiguity, surface patches approximate the local neighborhood structures of the underlying manifold more efficiently than individual TOF data points and are robust to the nonstatic noise of TOF data. Second, we propose SPP to preserve the pairwise similarity between local neighboring patches in TOF depth images. Moreover, SPP accomplishes the low-dimensional embedding by incorporating the scene region class labels that accompany the training samples, and it obtains a predictive mapping by exploiting the local geometrical properties of the dataset. The proposed approach combines the advantages of classical linear and nonlinear manifold learning, and real-time estimates for test samples are obtained from the low-dimensional embedding and the predictive mapping. Experiments show that our approach effectively extracts information from three scenes and is robust to the nonstatic noise of 3D imaging sensor data.

© 2013 Optical Society of America

References

  1. A. Kolb, E. Barth, R. Koch, and R. Larsen, “Time-of-flight sensors in computer graphics,” in Eurographics 2009 (2009), pp. 119–134.
  2. V. Ganapathi, C. Plagemann, D. Koller, and S. Thrun, “Real time motion capture using a single time-of-flight camera,” in IEEE Conference on Computer Vision and Pattern Recognition (2010), pp. 755–762.
  3. L. Zhu, J. Zhou, J. Song, Z. Yan, and Q. Gu, “A practical algorithm for learning scene information from monocular video,” Opt. Express 16, 1448–1459 (2008).
  4. J. B. Tenenbaum, V. de Silva, and J. C. Langford, “A global geometric framework for nonlinear dimensionality reduction,” Science 290, 2319–2323 (2000).
  5. M. Belkin and P. Niyogi, “Laplacian eigenmaps for dimensionality reduction and data representation,” Neural Comput. 15, 1373–1396 (2003).
  6. S. T. Roweis and L. K. Saul, “Nonlinear dimensionality reduction by locally linear embedding,” Science 290, 2323–2326 (2000).
  7. X. He and P. Niyogi, “Locality preserving projections,” Adv. Neural Inf. Process. Syst. 16, 100–200 (2004).
  8. X. He, D. Cai, S. Yan, and H. Zhang, “Neighborhood preserving embedding,” in Tenth IEEE International Conference on Computer Vision (2005), Vol. 2, pp. 1208–1213.
  9. C. BenAbdelkader, “Robust head pose estimation using supervised manifold learning,” in Computer Vision—ECCV 2010, Vol. 6136 of Lecture Notes in Computer Science (Springer, 2010), pp. 518–531.
  10. Z. Zhang and H. Zha, “Principal manifolds and nonlinear dimension reduction via local tangent space alignment,” SIAM J. Sci. Comput. 26, 313–338 (2004).
  11. T. Lin and H. Zha, “Riemannian manifold learning,” IEEE Trans. Pattern Anal. Mach. Intell. 30, 796–809 (2008).
  12. L. Jovanov, A. Pižurica, and W. Philips, “Fuzzy logic-based approach to wavelet denoising of 3D images produced by time-of-flight cameras,” Opt. Express 18, 22651–22676 (2010).
  13. M. Belkin, P. Niyogi, and V. Sindhwani, “Manifold regularization: a geometric framework for learning from labeled and unlabeled examples,” J. Mach. Learn. Res. 7, 2399–2434 (2006).
  14. M. Sezgin and B. Sankur, “Survey over image thresholding techniques and quantitative performance evaluation,” J. Electron. Imaging 13, 146–168 (2004).
  15. V. N. Balasubramanian, J. Ye, and S. Panchanathan, “Biased manifold embedding: a framework for person-independent head pose estimation,” in IEEE Conference on Computer Vision and Pattern Recognition (2007), pp. 1–7.
  16. S. Yan, D. Xu, B. Zhang, and H. Zhang, “Graph embedding and extensions: a general framework for dimensionality reduction,” IEEE Trans. Pattern Anal. Mach. Intell. 29, 40–51 (2007).
  17. X. He, M. Ji, and H. Bao, “Graph embedding with constraints,” in Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI ’09) (2009), pp. 1065–1070.

Figures (8)

Fig. 1. 3D data points and the manifold of the Swiss roll dataset. (a) Discrete data point approximation of the manifold. (b) Local surface patch approximation of the manifold.

Fig. 2. Patches (i.e., the training samples) from the different scene regions in the TOF depth images; the black rectangles mark the patches. (a) Slope region, (b) ground region, and (c) pedestrian region.

Fig. 3. Process of depth ambiguity elimination. (a) Intensity image; (b) depth image after median filtering; (c) grayscale image of the depth-difference matrix; (d) edge binary image of (c); (e) first ambiguity boundary line; (f) early nonambiguity extraction; (g) foreground of the intensity image; (h) foreground of the intensity image without ambiguity; and (i) depth image after ambiguity elimination.

Fig. 4. Examples of the different image types. (a) TOF scene depth images, (b) TOF pseudocolor depth images, and (c) TOF pseudocolor depth images obtained by removing the distance ambiguity.

Fig. 5. TOF depth information representation of the scene images corresponding to Fig. 4. (a) Images with distance ambiguity and (b) images without distance ambiguity.

Fig. 6. 2D and 3D embeddings of the scene region patches in the training samples, obtained with the proposed SPP approach. Blue points represent the ground region patches, red points the pedestrian region patches, and green points the slope region patches.

Fig. 7. Comparison of scene region estimation results for test patches in TOF depth images. The general outlines of the three scene regions are marked; the approaches are listed in Table 1. (a) TOF pseudocolor depth images of the scenes, (b) estimation results obtained with SPP, (c) estimation results obtained with supervised LE [9] + LapRLS [13], and (d) estimation results obtained with LE [15] + LapRLS [13].

Fig. 8. Analysis of the test patches from the scene region estimation results in TOF depth images. (All numbers in the tables are in millimeters.)

Tables (1)

Table 1. SPP and Other Approach Implementations

Equations (13)

(1)  d = \frac{L\varphi}{2\pi},
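
Eq. (1) maps the measured phase shift φ to a distance d within the nonambiguity range L; true distances beyond L wrap around, which causes the distance ambiguity addressed by Eqs. (2)–(6). A minimal sketch of the conversion, assuming a hypothetical 20 MHz modulation frequency (the actual sensor parameters are not given in this excerpt):

import numpy as np

C = 3.0e8                 # speed of light, m/s
F_MOD = 20.0e6            # assumed modulation frequency, Hz (hypothetical)
L = C / (2.0 * F_MOD)     # nonambiguity range: 7.5 m for these values

def phase_to_distance(phi):
    """Eq. (1): d = L * phi / (2*pi), with phi in radians in [0, 2*pi).
    True distances beyond L alias back into [0, L) (distance ambiguity)."""
    return L * np.asarray(phi) / (2.0 * np.pi)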
(2)  \Phi(i,j) = \max\big(|I(i,j) - I(i+1,j)|,\; |I(i,j) - I(i,j-1)|\big),
(3)  T(x,y) = \begin{cases} 0, & \tilde{\Phi}(x,y) < t^{*} \\ 1, & \tilde{\Phi}(x,y) \ge t^{*}, \end{cases}
(4)  \sigma_t^2 = \omega_0(\mu_0 - \mu)^2 + \omega_1(\mu_1 - \mu)^2,
(5)  \sigma_t^2 = \omega_0\,\omega_1\,(\mu_0 - \mu_1)^2.
(6)  t^{*} = \arg\max_t\,\sigma_t^2.
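
Eqs. (2)–(6) locate the ambiguity boundary by thresholding the normalized depth-difference image Φ̃ with Otsu's criterion [14]: t* is the threshold that maximizes the between-class variance of Eq. (5). A minimal sketch, assuming Φ̃ is simply Φ rescaled to [0, 1] (the paper's exact normalization is not shown in this excerpt):

import numpy as np

def depth_difference(I):
    """Eq. (2): Phi(i,j) = max(|I(i,j)-I(i+1,j)|, |I(i,j)-I(i,j-1)|).
    np.roll wraps at the borders; a real implementation would pad instead."""
    down = np.abs(I - np.roll(I, -1, axis=0))  # |I(i,j) - I(i+1,j)|
    left = np.abs(I - np.roll(I, 1, axis=1))   # |I(i,j) - I(i,j-1)|
    return np.maximum(down, left)

def otsu_binarize(phi, bins=256):
    """Eqs. (3)-(6): binarize the normalized difference image with Otsu."""
    phi = (phi - phi.min()) / (np.ptp(phi) + 1e-12)  # assumed normalization
    hist, edges = np.histogram(phi.ravel(), bins=bins)
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    best_t, best_var = edges[0], -1.0
    for k in range(1, bins):
        w0, w1 = p[:k].sum(), p[k:].sum()            # class weights
        if w0 == 0.0 or w1 == 0.0:
            continue
        mu0 = (p[:k] * centers[:k]).sum() / w0       # class means
        mu1 = (p[k:] * centers[k:]).sum() / w1
        var_b = w0 * w1 * (mu0 - mu1) ** 2           # Eq. (5)
        if var_b > best_var:                         # Eq. (6)
            best_var, best_t = var_b, edges[k]
    return (phi >= best_t).astype(np.uint8)          # Eq. (3)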
(7)  \tilde{d}(i,j) = d(i,j)\, f(i,j),
(8)  S_{ij} = \begin{cases} \exp\!\big(-\tilde{d}(i,j)^{2}/2\sigma^{2}\big), & \text{if } x_i \text{ and } x_j \text{ are neighbors} \\ 0, & \text{otherwise.} \end{cases}
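
Eq. (8) builds a Gaussian similarity graph over neighboring patches. A minimal sketch, assuming the neighborhoods are k-nearest neighbors and treating d̃(i,j) as the Euclidean distance between the flattened patch vectors x_i and x_j; k and σ are free parameters not specified in this excerpt:

import numpy as np

def similarity_matrix(X, k=10, sigma=1.0):
    """Eq. (8): S_ij = exp(-d(i,j)^2 / (2*sigma^2)) for neighboring pairs.
    X: (n, p) array, one flattened surface patch per row."""
    n = X.shape[0]
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # squared distances
    S = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]        # k nearest, skipping self
        S[i, nbrs] = np.exp(-d2[i, nbrs] / (2.0 * sigma ** 2))
    return np.maximum(S, S.T)                    # symmetrize the graph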
(9)  \min_{y}\ \sum_{i,j}^{n} \|y_i - y_j\|^{2} S_{ij} = y^{T} L y \quad \text{s.t. } y^{T} D y = 1,
(10)  \Lambda_{ij} = \begin{cases} \exp\!\big(-|l_i - l_j|^{2}/2\sigma^{2}\big), & \text{if } x_i \text{ and } x_j \text{ are neighbors} \\ 0, & \text{otherwise.} \end{cases}
(11)  \min_{y}\ \sum_{i,j} \|y_i - y_j\|^{2} S_{ij} + \alpha \sum_{i,j} \|y_i - y_j\|^{2} \Lambda_{ij} = y^{T}(L + \alpha\tilde{L})y \quad \text{s.t. } y^{T}(D + \alpha\tilde{D})y = 1.
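
Eq. (11) augments the locality-preserving objective of Eq. (9) with the label graph of Eq. (10), where L = D − S and L̃ = D̃ − Λ are graph Laplacians with degree matrices D and D̃. A sketch of one way to solve it, under the assumption (as in LPP [7]) that a linear projection y_i = Wᵀx_i is substituted, reducing the problem to a generalized eigenvalue problem; this is an illustrative solver, not necessarily the authors' exact procedure:

import numpy as np
from scipy.linalg import eigh

def spp_embedding(X, S, Lam, alpha=1.0, dim=2):
    """Eq. (11) with a linear map y = W^T x (LPP-style assumption).
    X: (p, n) data, one patch per column; S, Lam: (n, n) weights from
    Eqs. (8) and (10). Minimizes w^T X (L + alpha*Lt) X^T w subject to
    w^T X (D + alpha*Dt) X^T w = 1."""
    D = np.diag(S.sum(axis=1))
    Dt = np.diag(Lam.sum(axis=1))
    Lp = D - S                                  # Laplacian of S
    Lt = Dt - Lam                               # Laplacian of Lambda
    A = X @ (Lp + alpha * Lt) @ X.T
    B = X @ (D + alpha * Dt) @ X.T
    B += 1e-9 * np.eye(B.shape[0])              # jitter for numerical stability
    vals, vecs = eigh(A, B)                     # generalized eigenproblem
    W = vecs[:, :dim]                           # eigenvectors of smallest eigenvalues
    return W, W.T @ X                           # projection matrix, embedding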
(12)  \min_{f}\ \sum_{i} \|y_i - f(x_i)\|^{2} + \lambda \sum_{i,j} \|f(x_i) - f(x_j)\|^{2} S_{i,j} + \gamma \|f\|^{2}.
(13)  \min_{f}\ \sum_{i} \|y_i - f(x_i)\|^{2} + \lambda \sum_{i,j} \|f(x_i) - f(x_j)\|^{2} S_{i,j} + \varphi \sum_{i,j} \|f(x_i) - f(x_j)\|^{2} \Lambda_{i,j} + \gamma \|f\|^{2}.
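
Eq. (13) extends the LapRLS objective of Eq. (12) [13] with a smoothness term over the label graph Λ. A sketch of the minimizer under the simplifying assumption of a linear map f(x) = Wᵀx, which admits a closed form; LapRLS proper uses a kernel expansion, for which the representer theorem yields an analogous solution in the expansion coefficients:

import numpy as np

def predictive_mapping(X, Y, S, Lam, lmbda=0.1, phi=0.1, gamma=1e-3):
    """Eq. (13) with a linear model f(x) = W^T x (simplifying assumption).
    X: (p, n) training patches, Y: (n, d) low-dimensional embeddings,
    S, Lam: graph weights from Eqs. (8) and (10). Setting the gradient to
    zero gives (X X^T + lmbda*X Lp X^T + phi*X Lt X^T + gamma*I) W = X Y."""
    Lp = np.diag(S.sum(axis=1)) - S              # Laplacian of S
    Lt = np.diag(Lam.sum(axis=1)) - Lam          # Laplacian of Lambda
    p = X.shape[0]
    A = (X @ X.T + lmbda * (X @ Lp @ X.T)
         + phi * (X @ Lt @ X.T) + gamma * np.eye(p))
    W = np.linalg.solve(A, X @ Y)                # closed-form minimizer
    return W                                     # predict with W.T @ x_new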
