Abstract

This paper presents a probabilistic object recognition and pose estimation method that generates multiple interpretations in cluttered indoor environments. Handling pose ambiguity and uncertainty is the main challenge for most recognition systems, and we approach it in a probabilistic manner. First, given a three-dimensional (3D) polyhedral object model, parallel and perpendicular line pairs detected from stereo images and 3D point clouds generate pose hypotheses as multiple interpretations, explicitly accounting for the ambiguity caused by partial occlusion and the fragmentation of 3D lines. Unlike previous methods, each pose interpretation is represented as a region rather than a point in pose space, reflecting the measurement uncertainty. Then, for each pose interpretation, additional features around the estimated pose are used as further evidence to compute its probability under the Bayesian principle, in terms of likelihood and unlikelihood. Finally, a fusion strategy is applied to the top-ranked, high-probability interpretations, which are further verified and refined to give a more accurate pose estimate in real time. Experimental results demonstrate the performance and potential of the proposed approach in real, cluttered domestic environments.
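The Bayesian scoring step described above can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the function name, the per-feature likelihood/unlikelihood numbers, and the independence assumption are all hypothetical.

```python
import math

# Toy sketch (not the paper's exact method): score each pose interpretation
# by combining per-feature likelihood ratios via the odds form of Bayes'
# rule, i.e. posterior odds = prior odds * prod(p(e|object) / p(e|background)),
# then keep the top-ranked hypotheses for verification and refinement.

def interpretation_probability(likelihoods, unlikelihoods, prior=0.5):
    """Posterior probability from independent evidence (hypothetical example)."""
    log_odds = math.log(prior / (1 - prior))
    for l, u in zip(likelihoods, unlikelihoods):
        log_odds += math.log(l) - math.log(u)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Three hypothetical interpretations: each feature's likelihood under the
# object hypothesis vs. under the background ("unlikelihood").
hypotheses = {
    "H1": ([0.9, 0.8, 0.7], [0.2, 0.3, 0.4]),
    "H2": ([0.5, 0.4], [0.5, 0.6]),
    "H3": ([0.6, 0.9], [0.5, 0.1]),
}
ranked = sorted(hypotheses,
                key=lambda h: interpretation_probability(*hypotheses[h]),
                reverse=True)
print(ranked)  # ['H1', 'H3', 'H2']
```

Working in log-odds avoids numerical underflow when many pieces of evidence are multiplied together.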

© 2011 Optical Society of America


References


  1. M. DaneshPanah, B. Javidi, and E. A. Watson, “Three dimensional object recognition with photon counting imagery in the presence of noise,” Opt. Express 18, 26450–26460 (2010).
    [CrossRef] [PubMed]
  2. S.-H. Hong and B. Javidi, “Distortion-tolerant 3D recognition of occluded objects using computational integral imaging,” Opt. Express 14, 12085–12095 (2006).
    [CrossRef] [PubMed]
  3. B. Javidi, R. Ponce-Diaz, and S. H. Hong, “Three-dimensional recognition of occluded objects by using computational integral imaging,” Opt. Lett. 31, 1106–1108 (2006).
    [CrossRef] [PubMed]
  4. V. Lepetit and P. Fua, Monocular Model-based 3D Tracking of Rigid Objects, Foundations and Trends in Computer Graphics and Vision (2005), Vol. 1, pp. 1–89.
  5. S. Kim and I. Kweon, “Automatic model-based 3D object recognition by combining feature matching with tracking,” Mach. Vision Appl. 16, 267–272 (2005).
    [CrossRef]
  6. Z. Lu, S. Baek, and S. Lee, “Robust 3D line extraction from stereo point clouds,” in 2008 IEEE Conference on Robotics, Automation and Mechatronics (IEEE, 2008).
    [CrossRef]
  7. I. Shimshoni and J. Ponce, “Probabilistic 3D object recognition,” Int. J. Comput. Vis. 36, 51–70 (2000).
    [CrossRef]
  8. P. David and D. DeMenthon, “Object recognition in high clutter images using line features,” in Tenth IEEE International Conference on Computer Vision (IEEE, 2005).
  9. L. G. Roberts, “Machine perception of three-dimensional solids,” in Optical and Electrooptical Information Processing, J. T. Tippett, ed. (MIT, 1965).
  10. M. A. Fischler and R. C. Bolles, “Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography,” Commun. ACM 24, 381–395 (1981).
    [CrossRef]
  11. J. S. Beis and D. G. Lowe, “Indexing without invariants in 3D object recognition,” IEEE Trans. Pattern Anal. Mach. Intell. 21, 1000–1015 (1999).
    [CrossRef]
  12. M. S. Costa and L. G. Shapiro, “3D object recognition and pose with relational indexing,” Comput. Vis. Image Underst. 79, 364–407 (2000).
    [CrossRef]
  13. P. David, D. Dementhon, R. Duraiswami, and H. Samet, “SoftPOSIT: simultaneous pose and correspondence determination,” Int. J. Comput. Vis. 59, 259–284 (2004).
    [CrossRef]
  14. M. A. Vicente, P. O. Hoyer, and A. Hyvarinen, “Equivalence of some common linear feature extraction techniques for appearance-based object recognition tasks,” IEEE Trans. Pattern Anal. Mach. Intell. 29, 896–900 (2007).
    [CrossRef]
  15. C. M. Do, R. Martinez-Cuenca, and B. Javidi, “Three-dimensional object-distortion-tolerant recognition for integral imaging using independent component analysis,” J. Opt. Soc. Am. A 26, 245–251 (2009).
    [CrossRef]
  16. S. Min, S. Hao, S. Savarese, and F.-F. Li, “A multi-view probabilistic model for 3D object classes,” in IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2009).
  17. S. Ekvall, D. Kragic, and F. Hoffmann, “Object recognition and pose estimation using color cooccurrence histograms and geometric modeling,” Image Vis. Comput. 23, 943–955 (2005).
    [CrossRef]
  18. C. Harris and M. Stephens, “A combined corner and edge detector,” in Proceedings of The Fourth Alvey Vision Conference (1988).
  19. C. Schmid and R. Mohr, “Local grayvalue invariants for image retrieval,” IEEE Trans. Pattern Anal. Mach. Intell. 19, 530–535 (1997).
    [CrossRef]
  20. D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vis. 60, 91–110 (2004).
    [CrossRef]
  21. A. E. Johnson and M. Hebert, “Using spin images for efficient object recognition in cluttered 3D scenes,” IEEE Trans. Pattern Anal. Mach. Intell. 21, 433–449 (1999).
    [CrossRef]
  22. A. Frome, D. Huber, R. Kolluri, T. Bulow, and J. Malik, “Recognizing objects in range data using regional point descriptors,” in Proceedings of the European Conference on Computer Vision (ECCV, 2004), Vol. 3023, pp. 224–237.
  23. Z. Zhang and O. D. Faugeras, “Determining motion from 3d line segment matches: a comparative study,” Image Vis. Comput. 9, 10–19 (1991).
    [CrossRef]
  24. C. Guerra and V. Pascucci, “Matching sets of 3D segments,” (SPIE, 1999).
  25. B. Kamgar-Parsi, “Algorithms for matching 3D line sets,” IEEE Trans. Pattern Anal. Mach. Intell. 26, 582–593 (2004).
    [CrossRef] [PubMed]
  26. Throughout this paper, bold font denotes a vector or a matrix.
  27. C. Bregler and J. Malik, “Tracking people with twists and exponential maps,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (1998).
  28. H. Schneiderman and T. Kanade, “Object detection using the statistics of parts,” Int. J. Comput. Vis. 56, 151–177 (2004).
    [CrossRef]
  29. R. Fergus, L. Fei-Fei, P. Perona, and A. Zisserman, “Learning object categories from Google’s image search,” in Tenth IEEE International Conference on Computer Vision (IEEE, 2005).
  30. G. Dashan, H. Sunhyoung, and N. Vasconcelos, “Discriminant saliency, the detection of suspicious coincidences, and applications to visual recognition,” IEEE Trans. Pattern Anal. Mach. Intell. 31, 989–1005 (2009).
    [CrossRef]
  31. P. Wang and H. Qiao, “Adaptive probabilistic tracking with reliable particle selection,” Electron. Lett. 45, 1160–1161 (2009).
    [CrossRef]
  32. K. Nummiaro, E. Koller-Meier, and L. Van Gool, “An adaptive color-based particle filter,” Image Vis. Comput. 21, 99–110 (2003).
    [CrossRef]
  33. D. Comaniciu, V. Ramesh, and P. Meer, “Kernel-based object tracking,” IEEE Trans. Pattern Anal. Mach. Intell. 25, 564–577 (2003).
    [CrossRef]
  34. C. Genest and J. V. Zidek, “Combining probability distributions: A critique and an annotated bibliography,” Statist. Sci. 1, 114–135 (1986).
    [CrossRef]
  35. D. F. Dementhon and L. S. Davis, “Model-based object pose in 25 lines of code,” Int. J. Comput. Vis. 15, 123–141 (1995).
    [CrossRef]
  36. P. P. Loutrel, “A solution to the hidden-line problem for computer-drawn polyhedra,” IEEE Trans. Comput. C-19, 205–213 (1970).
    [CrossRef]

Supplementary Material (1)

» Media 1: MOV (2911 KB)



Figures (17)

Fig. 1

Flow chart of the proposed method. First, images are captured by a stereo camera and 3D lines are extracted; 3D parallel and perpendicular lines are then selected based on the model constraints. Second, multiple interpretations are generated by matching the image features with those of the model. If a generated interpretation passes the visibility test [36], its probability and pose distribution are computed. Finally, a set of top-ranked interpretations is further verified and refined for the final recognition decision.

Fig. 2

Multiple interpretations: (a) target object, (b) interpretations based on parallel line pairs (the parallel line pair is superimposed on the object model O), (c) interpretations based on perpendicular line pairs. H_1, …, H_m represent hypotheses from different feature sets of the model.

Fig. 3

Illustration of subinterpretations, where F_1, F_2, …, F_k are extended feature sets (represented by the dashed blue lines).

Fig. 4

Pose uncertainty estimation: given two 3D lines L_1 and L_2 with four corresponding endpoints (p_1^1, p_1^2) and (p_2^1, p_2^2), respectively, the error bound of each line is modeled as an elliptic cylinder. The pose uncertainty is therefore estimated based on the centroid point p_m^c.
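As a toy illustration of the geometry in Fig. 4, the centroid of the four line endpoints and a crude per-axis spread can be computed as below. The details are assumptions for illustration only: the paper bounds each line's error with an elliptic cylinder, whereas this sketch uses a simple standard deviation as a stand-in.

```python
# Hypothetical sketch: estimate the centroid p_m_c of two 3D lines'
# endpoints and a per-axis spread as a crude uncertainty proxy.
# (The paper's actual model is an elliptic-cylinder error bound.)

def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def per_axis_spread(points, center):
    # Population standard deviation per axis (illustrative choice).
    n = len(points)
    return tuple(
        (sum((p[i] - center[i]) ** 2 for p in points) / n) ** 0.5
        for i in range(3)
    )

endpoints = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0),   # line L_1: p_1^1, p_1^2
             (0.0, 1.0, 1.0), (1.0, 1.0, 1.0)]   # line L_2: p_2^1, p_2^2
p_m_c = centroid(endpoints)
spread = per_axis_spread(endpoints, p_m_c)
```

With the coplanar example endpoints above, the spread along z is zero, reflecting that uncertainty is concentrated in the plane of the two lines.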

Fig. 5

Generated interpretation and supporting 3D line evidence: (a) shows the neighboring measured 3D lines that belong to the estimated pose, where blue 3D lines are neighboring and red ones are not; (b) shows the geometric constraints required of the supporting 3D line evidence; and (c) illustrates the coverage of the supporting line features.

Fig. 6

Illustration of the distribution and coverage of supporting line evidence.

Fig. 7

Statistical analysis of the distribution value for the target object and for nonobjects.

Fig. 8

Multipart HSV color histogram: each region is divided into four subregions. (a) is the target model H_t^j(u), (b) is the hypothetical observation H_o^j(u), and (c) is the background H_b^j(u), which is used for the dissimilarity measurement.
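A multipart color histogram comparison in the spirit of Fig. 8 can be sketched as follows. The bin count, the flat list-of-hues input, and the Bhattacharyya-based measure are illustrative assumptions, not necessarily the paper's choices.

```python
# Sketch (assumed details): compare two regions by computing a hue
# histogram for each of their four subregions separately, then averaging
# the per-subregion dissimilarities, so spatial color layout matters.

def hue_histogram(region, bins=8):
    # region: flat list of hue values in [0, 1); returns a normalized histogram.
    hist = [0] * bins
    for h in region:
        hist[min(int(h * bins), bins - 1)] += 1
    total = float(sum(hist)) or 1.0
    return [c / total for c in hist]

def bhattacharyya_distance(p, q):
    # 0 for identical distributions, up to 1 for disjoint ones.
    return 1.0 - sum((pi * qi) ** 0.5 for pi, qi in zip(p, q))

def multipart_distance(subregions_a, subregions_b):
    dists = [bhattacharyya_distance(hue_histogram(a), hue_histogram(b))
             for a, b in zip(subregions_a, subregions_b)]
    return sum(dists) / len(dists)

model = [[0.1, 0.1], [0.5, 0.5], [0.9, 0.9], [0.3, 0.3]]   # 4 subregions
same  = [[0.1, 0.1], [0.5, 0.5], [0.9, 0.9], [0.3, 0.3]]
other = [[0.9, 0.9], [0.1, 0.1], [0.3, 0.3], [0.5, 0.5]]
assert multipart_distance(model, same) < multipart_distance(model, other)
```

Note that `other` contains the same colors as `model` but in different subregions, so a global histogram would not separate them; the per-subregion comparison does.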

Fig. 9

Selected textured and textureless everyday objects, including a milk box, biscuit box, refrigerator, trash can, dishwasher, book, etc.

Fig. 10

Recognition results of multiple interpretation generation; results are illustrated in both 2D images and 3D point clouds. Each row shows selected interpretations for one object; from the first to the sixth row, the objects are: kitchen refuse bin, refrigerator, biscuit box, milk box, tissue box, and dishwasher. The first column shows the true recognition result with correct pose estimation; the second and third columns show multiple interpretations with incorrect results. (Media 1)

Fig. 11

Recognition of 3D nonbox objects: sweet corn bottle (top left), Pocari Sweat can (top right), Gatorade can (bottom left), and coffee cup (bottom right). The appearance of each target object is shown in the top middle of its image, and the model of the recognized object is overlaid in pink on the original image.

Fig. 12

Probability computation of interpretations. (a) is the original 2D image overlaid with 2D lines; the right-most box on the white table is the target object. (b)–(d) are three generated interpretations with their estimated poses (in green) in 3D space. The probabilities of the interpretations are 0.32, 0.27, and 0.76, respectively; only (d) is the true interpretation of the target object. The figure is best viewed in color and with PDF magnification.

Fig. 13

Multiple interpretation generation and probability assignment. Top row: the left image is the original 2D image and the right image shows the 3D point clouds; multiple interpretations are generated, where the target object is the milk box bounded by a white ellipse, and two more sets of interpretations belong to nonobjects bounded by the dashed yellow ellipses. Bottom row: selecting one interpretation from each set in the top right image, the probabilities are 0.37, 0.89, and 0.42, respectively; the highest probability correctly indicates the true object. The figure is best viewed in color and with PDF magnification.

Fig. 14

Effectiveness of color features for objects with similar geometric shapes. (a) shows the correct recognition and pose estimation of the true object with a probability of 0.92; (b) and (c) are incorrect interpretations with probabilities of 0.57 and 0.64, respectively. Surface templates for the color features are attached in each image, where (a) is Seoul milk, (b) is Maeil milk, and (c) is Namyang milk.

Fig. 15

Pose accuracy evaluation.

Fig. 16

Selected images for performance evaluation. The first row is case 1, with only a single object in the foreground but a still-cluttered background; the second row is case 2, where the object is partially occluded; the third row is case 3, the most difficult case, where partial occlusion and several similar objects coexist with the target object. Both 2D and 3D recognition results are shown for each image; for clarity, the 3D results are enlarged to fit the window. The figure is best viewed in color and with PDF magnification. (See Media 1.)

Fig. 17

Performance statistics, showing that correct recognition can be obtained from only a small number of top-ranked interpretations.

Tables (2)


Table 1. Pose Accuracy Analysis; Test Objects Are Shown in the First Column of Fig. 10


Table 2 Average Computation Time for Each Image

Equations (20)


$$P(x, H_m, O \mid F) = P(H_m, O \mid F)\,P(x \mid F, H_m, O),$$
$$P(x, H_m, O \mid F) = \sum_{k=1}^{K} P(x, H_m, O \mid F_k)\,P(F_k \mid F),$$
$$F_k = T_m^k H_m,$$
$$I_m^k = \{\, \pi_m^k,\; \mathcal{N}(\theta_m^k, \Sigma_m^k) \,\},$$
$$P(H_m, O \mid F_k) = \frac{P(F_k \mid H_m, O)\,P(H_m, O)}{P(F_k)} = \frac{1}{1+\alpha},$$
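Dividing the numerator and denominator of the Bayes posterior by the likelihood gives the compact form 1/(1 + α), where α is the ratio of the unlikelihood to the likelihood. A minimal sketch of this computation (the function name is ours, not from the paper):

```python
def posterior(likelihood, unlikelihood):
    """Posterior P(H_m, O | F_k) = 1 / (1 + alpha), where alpha is the
    unlikelihood-to-likelihood ratio. Hypothetical helper for illustration."""
    alpha = unlikelihood / likelihood
    return 1.0 / (1.0 + alpha)
```

Note that this equals likelihood / (likelihood + unlikelihood), so strong supporting evidence drives the posterior toward 1.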
$$e_j = \min\!\left\{ E_{\max},\; \frac{1}{N_j}\sum_{i=1}^{N_j}\left( \mu\,\frac{d_{ij}^2}{\bar{d}^2} + (1-\mu)\,\frac{\tan^2\theta_{ij}}{\tan^2\bar{\theta}} \right) \right\}, \qquad c_j = \max\!\left\{ C_{\min},\; \sum_{i=1}^{N_j} \frac{l_{ij}}{L_j} \right\},$$
$$e = \frac{1}{N_r}\sum_{j=1}^{N_r} e_j, \qquad c = \frac{1}{N_r}\sum_{j=1}^{N_r} c_j.$$
$$P_{\text{line}}(F_k \mid H_m, O) = c\,(1 - e^2).$$
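As a sketch of how the per-region error, per-region coverage, and the resulting line-evidence likelihood above might be computed (names and the default values of μ, E_max, and C_min are our assumptions for illustration):

```python
import math

def region_error(dists, angles, d_bar, theta_bar, mu=0.5, e_max=1.0):
    # Per-region error e_j: weighted normalized distance and angle deviations
    # of matched line segments, clipped from above at E_max.
    n = len(dists)
    err = sum(mu * d * d / (d_bar * d_bar)
              + (1.0 - mu) * math.tan(t) ** 2 / math.tan(theta_bar) ** 2
              for d, t in zip(dists, angles)) / n
    return min(e_max, err)

def region_coverage(seg_lengths, model_len, c_min=0.0):
    # Per-region coverage c_j: matched segment length over model edge
    # length L_j, floored at C_min.
    return max(c_min, sum(seg_lengths) / model_len)

def p_line_likelihood(errors, coverages):
    # Average e and c over the N_r visible regions, then c * (1 - e^2):
    # high coverage and low error give a likelihood near 1.
    e = sum(errors) / len(errors)
    c = sum(coverages) / len(coverages)
    return c * (1.0 - e * e)
```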
$$D(\mathcal{N}(\theta_m^k)) = -\sum_{j=1}^{N_r} \tilde{c}_j \log(\tilde{c}_j).$$
$$\tilde{D}(\mathcal{N}(\theta_m^k)) = \frac{D(\mathcal{N}(\theta_m^k))}{\log(N_r)}.$$
$$P_{\text{line}}(F_k \mid H_m, \bar{O}) = 1 - \tilde{D}(\mathcal{N}(\theta_m^k)).$$
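The entropy-based unlikelihood above penalizes poses whose supporting lines are concentrated in only a few model regions: evenly spread support gives normalized entropy D̃ near 1 and hence an unlikelihood near 0. A small sketch (our own naming):

```python
import math

def line_unlikelihood(coverages):
    # Normalize the per-region coverages into a distribution c~_j.
    total = sum(coverages)
    probs = [c / total for c in coverages]
    # Shannon entropy D, normalized by log(N_r) so D~ lies in [0, 1].
    d = -sum(p * math.log(p) for p in probs if p > 0.0)
    d_norm = d / math.log(len(coverages))
    # Unlikelihood: near 0 for uniform support, near 1 for concentrated support.
    return 1.0 - d_norm
```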
$$P_{\text{color}}(F_k \mid H_m, O) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{d^2(H_t(u), H_o(u))}{2\sigma^2}\right), \qquad P_{\text{color}}(F_k \mid H_m, \bar{O}) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{1 - d^2(H_b(u), H_o(u))}{2\sigma^2}\right),$$
$$d(H_1, H_2) = \sqrt{\sum_{j=1}^{r} c_j \left( 1 - \sum_{u=1}^{N} \sqrt{H_1^j(u)\,H_2^j(u)} \right)},$$
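The dissimilarity above is a region-weighted Bhattacharyya distance over the multipart histograms; identical normalized histograms give d = 0. A minimal sketch (the function name and the uniform default weights are our assumptions):

```python
import math

def bhattacharyya_distance(h1, h2, weights=None):
    # h1, h2: lists of normalized sub-region histograms H^j(u);
    # weights: per-region weights c_j (uniform if not supplied).
    r = len(h1)
    if weights is None:
        weights = [1.0 / r] * r
    s = 0.0
    for w, a, b in zip(weights, h1, h2):
        bc = sum(math.sqrt(x * y) for x, y in zip(a, b))  # Bhattacharyya coefficient
        s += w * (1.0 - bc)
    return math.sqrt(max(s, 0.0))  # clamp tiny negative values from rounding
```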
$$P(F_k \mid H_m, O) = P_{\text{line}}(F_k \mid H_m, O)\,P_{\text{color}}(F_k \mid H_m, O), \qquad P(F_k \mid H_m, \bar{O}) = P_{\text{line}}(F_k \mid H_m, \bar{O})\,P_{\text{color}}(F_k \mid H_m, \bar{O}).$$
$$P(H_m, O \mid F_k) = \frac{1}{1+\alpha} = \frac{1}{1 + \dfrac{P_{\text{line}}(F_k \mid H_m, \bar{O})}{P_{\text{line}}(F_k \mid H_m, O)} \cdot \dfrac{P_{\text{color}}(F_k \mid H_m, \bar{O})}{P_{\text{color}}(F_k \mid H_m, O)}}.$$
$$d(I_1, I_2) = \sqrt{(\theta_1 - \theta_2)^{T} \left( \frac{\Sigma_1 + \Sigma_2}{2} \right)^{-1} (\theta_1 - \theta_2)},$$
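This Mahalanobis-type distance with the averaged covariance measures how close two pose interpretations I = (θ, Σ) are in pose space. A sketch using NumPy (the library choice and naming are ours):

```python
import numpy as np

def interpretation_distance(theta1, cov1, theta2, cov2):
    # Mahalanobis-type distance between interpretations I1 = (theta1, Sigma1)
    # and I2 = (theta2, Sigma2), using the averaged covariance.
    diff = np.asarray(theta1, float) - np.asarray(theta2, float)
    avg = (np.asarray(cov1, float) + np.asarray(cov2, float)) / 2.0
    return float(np.sqrt(diff @ np.linalg.inv(avg) @ diff))
```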
$$P(h_1, h_2, O \mid f_1, f_2) = \frac{P(f_1, f_2 \mid h_1, h_2, O)\,P(h_1, h_2, O)}{P(f_1, f_2)} = \frac{1}{1+\alpha_{12}},$$
$$P(x \mid f_1, f_2, h_1, h_2, O) = P(x \mid f_1, h_1, O)\,P(x \mid f_2, h_2, O).$$
$$P(x \mid f_1, f_2, h_1, h_2, O) = P(x \mid f_1, h_1, O)^{\omega_1}\,P(x \mid f_2, h_2, O)^{\omega_2},$$
$$\Sigma^{-1} = \omega_1 \Sigma_1^{-1} + \omega_2 \Sigma_2^{-1}, \qquad \theta = \Sigma\left( \omega_1 \Sigma_1^{-1} \theta_1 + \omega_2 \Sigma_2^{-1} \theta_2 \right).$$
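The fusion above combines two weighted Gaussian pose estimates in information (inverse-covariance) form, in the style of covariance intersection. A NumPy sketch (equal weights are assumed as a default for illustration):

```python
import numpy as np

def fuse_poses(theta1, cov1, theta2, cov2, w1=0.5, w2=0.5):
    # Sigma^{-1} = w1 * Sigma1^{-1} + w2 * Sigma2^{-1}
    # theta = Sigma (w1 * Sigma1^{-1} theta1 + w2 * Sigma2^{-1} theta2)
    i1 = np.linalg.inv(np.asarray(cov1, float))
    i2 = np.linalg.inv(np.asarray(cov2, float))
    cov = np.linalg.inv(w1 * i1 + w2 * i2)
    theta = cov @ (w1 * i1 @ np.asarray(theta1, float)
                   + w2 * i2 @ np.asarray(theta2, float))
    return theta, cov
```

With equal weights and equal covariances this reduces to the simple average of the two pose vectors; unequal covariances pull the fused pose toward the more certain estimate.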
