Abstract

One of the most important problems in computer vision is the computation of the two-dimensional projective transformation (homography) that maps features of planar objects across different images and videos. This computation is required by many applications, such as image mosaicking, image registration, and augmented reality, and real-time performance imposes strong constraints on the methods used. In this paper, we address the real-time detection and tracking of planar objects in a video sequence, where the object of interest is given by a reference image template. Most existing approaches to homography estimation proceed in two steps: feature extraction, followed by a combinatorial optimization method that matches features between the reference template and the scene frame. This paper makes two main contributions. First, we detect both planar and nonplanar objects via efficient classification of object features in the input images, applied prior to the matching step. Second, for the tracking of planar objects, we propose a fast method for computing the homography that is based on the transferred object features and their associated local raw brightness. The advantages of the proposed schemes are fast matching as well as fast and robust object registration, given by either a homography or a three-dimensional pose.

© 2012 Optical Society of America




Figures (8)

Fig. 1.

Object detection and tracking consists of estimating the homography between the object model (reference template) and the current input video frame.

Fig. 2.

Main stages of the proposed approach for object detection and matching.

Fig. 3.

(a) Reference template associated with the object. The extracted SIFT features are shown in red. (b) Input video frame together with the extracted SIFT features (shown in green). (c) Features that are labeled as object points (the classification results given by the trained SVMs). (d) Obtained matches after applying the RANSAC technique. (e) Augmentation results based on the computed homography.
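Panel (d)'s RANSAC step keeps only matches consistent with a single homography. As a hedged, self-contained illustration of panel (c)'s pruning idea, the sketch below substitutes a simple nearest-reference-descriptor distance threshold for the paper's trained SVMs (the descriptors, threshold, and function name are invented for the example and are not the paper's method):

```python
import numpy as np

def label_object_features(scene_desc, object_desc, thresh=0.5):
    """Label a scene feature as an object point if its nearest
    reference descriptor lies within `thresh` (a simplified
    stand-in for the paper's SVM classification prestep)."""
    # pairwise Euclidean distances: (n_scene, n_reference)
    d = np.linalg.norm(scene_desc[:, None, :] - object_desc[None, :, :],
                       axis=2)
    return d.min(axis=1) <= thresh
```

Only the features labeled True would then be passed to the matching and RANSAC stages, which is what makes the subsequent combinatorial search cheaper.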

Fig. 4.

(a) Reference template associated with the object. The extracted SIFT features are shown in red. (b) Input frame together with the extracted and classified features (shown in green and red). The red features are the image features classified as object features (positive features) by a one-class SVM. (c) The same extracted and classified features (shown in green and red) using a two-class SVM.

Fig. 5.

The proposed tracking scheme is based on a fast featureless registration technique that uses a subset of object pixels. (1) Reference object features are extracted offline. (2) Some reference object features are transferred to frame I_{t-1} using the known homography H_{t-1}, and small rectangular patches are centered on the transferred features in frame I_{t-1}. (3) A featureless approach (the SSD technique) is used to infer the homography between frame I_{t-1} and frame I_t.

Fig. 6.

The accuracy of the estimated homography (image registration) is assessed by the alignment of the two images: the warped image and the destination image. (a) Two consecutive images. The left half shows the first image (source image) and the right half shows the second image (destination image). (b) The left half depicts the warped version of the source image computed by the estimated homography.

Fig. 7.

Image augmentation associated with a video sequence (14 frames shown). Warping is performed using object tracking based on the proposed fast featureless registration.

Fig. 8.

The transfer error is defined by the difference in 2D location between the transferred object (obtained by the estimated homography) and the ground-truth object location (obtained by the ground-truth homography).
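This transfer error can be sketched as the mean 2D distance between points mapped by the estimated homography and by the ground-truth homography. A minimal NumPy sketch (the choice of evaluation points, here arbitrary object points, is an assumption; the paper's exact evaluation points are not given in this excerpt):

```python
import numpy as np

def transfer_error(H_est, H_gt, points):
    """Mean 2D distance between points transferred by the estimated
    homography and by the ground-truth homography."""
    def apply(H, pts):
        # homogeneous transfer followed by dehomogenization
        q = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
        return q[:, :2] / q[:, 2:3]
    return float(np.mean(np.linalg.norm(apply(H_est, points)
                                        - apply(H_gt, points), axis=1)))
```

With identical homographies the error is zero; a pure 2-pixel horizontal shift of the estimate yields an error of exactly 2 pixels.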

Tables (3)

Table 1. Comparing Object Matching with and without the Prestep of Feature Classification

Table 2. Image Transfer Errors Associated with Two Homography Computation Techniques

Table 3. Average CPU Time Associated with Two Homography Computation Techniques

Equations (7)


H = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix}.

\begin{pmatrix} u_2 \\ v_2 \\ 1 \end{pmatrix} \cong H \begin{pmatrix} u_1 \\ v_1 \\ 1 \end{pmatrix},

u_2 = \lambda (u_1 h_{11} + v_1 h_{12} + h_{13}), \quad v_2 = \lambda (u_1 h_{21} + v_1 h_{22} + h_{23}), \quad 1 = \lambda (u_1 h_{31} + v_1 h_{32} + 1).

\begin{bmatrix} u_1 & v_1 & 1 & 0 & 0 & 0 & -u_2 u_1 & -u_2 v_1 \\ 0 & 0 & 0 & u_1 & v_1 & 1 & -v_2 u_1 & -v_2 v_1 \end{bmatrix} \begin{bmatrix} h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32} \end{bmatrix} = \begin{bmatrix} u_2 \\ v_2 \end{bmatrix}.
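With h_{33} fixed to 1, the linear system above stacks into a 2n x 8 system for n >= 4 correspondences, solvable in the least-squares sense. A minimal NumPy sketch of this direct linear solution (the function name is illustrative; a production implementation would also normalize the point coordinates first):

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate H (h33 fixed to 1) from n >= 4 point correspondences
    by stacking two rows per pair and solving the 2n x 8 system in
    the least-squares sense."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[0]
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((u1, v1), (u2, v2)) in enumerate(zip(src, dst)):
        A[2 * i]     = [u1, v1, 1, 0, 0, 0, -u2 * u1, -u2 * v1]
        A[2 * i + 1] = [0, 0, 0, u1, v1, 1, -v2 * u1, -v2 * v1]
        b[2 * i], b[2 * i + 1] = u2, v2
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

Given exact correspondences generated by a known homography with h_{33} = 1, this solver recovers that homography.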
\min_{H} \sum_{p \in O} \bigl( I_1(p) - I_2(Hp) \bigr)^2,

f(H_{t-1,t}) = \sum_{p \in O} \bigl( I_{t-1}(p) - I_t(H_{t-1,t}\, p) \bigr)^2,

\min_{H_{t-1,t}} \sum_{p \in O} \bigl( I_{t-1}(p) - I_t(H_{t-1,t}\, p) \bigr)^2.
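The SSD cost above can be evaluated directly given a candidate homography and the object pixel set O. The sketch below is a simplified illustration using nearest-neighbor sampling; the actual minimization would use sub-pixel interpolation and a nonlinear optimizer over the homography parameters:

```python
import numpy as np

def ssd_cost(I_prev, I_cur, H, pixels):
    """Sum of squared brightness differences between object pixels p
    (as integer (u, v) = (col, row) coordinates) in the previous frame
    and their transferred locations Hp in the current frame.
    Transferred points falling outside I_cur are ignored."""
    pts = np.hstack([pixels, np.ones((len(pixels), 1))]) @ H.T
    uv = pts[:, :2] / pts[:, 2:3]          # dehomogenize
    u = np.rint(uv[:, 0]).astype(int)      # nearest-neighbor sampling
    v = np.rint(uv[:, 1]).astype(int)
    h, w = I_cur.shape
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    diff = I_prev[pixels[ok, 1], pixels[ok, 0]] - I_cur[v[ok], u[ok]]
    return float(np.sum(diff ** 2))
```

As a sanity check, the cost of the identity homography between a frame and itself is zero, and any motion of the brightness pattern raises it.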
