Abstract

We propose a new hierarchical architecture for visual pattern classification. The new architecture consists of a set of fixed, directional filters and a set of adaptive filters arranged in a cascade structure. The fixed filters are used to extract primitive features such as orientations and edges that are present in a wide range of objects, whereas the adaptive filters can be trained to find complex features that are specific to a given object. Both types of filter are based on the biological mechanism of shunting inhibition. The proposed architecture is applied to two problems: pedestrian detection and car detection. Evaluation results on benchmark data sets demonstrate that the proposed architecture outperforms several existing ones.

© 2010 Optical Society of America

References

  1. K. Fukushima, “Neocognitron: a hierarchical neural network capable of visual pattern recognition,” Neural Networks 1, 119-130 (1988).
  2. D. H. Hubel and T. N. Wiesel, “Receptive fields, binocular interaction and functional architecture in the cat's visual cortex,” J. Physiol. (London) 160, 106-154 (1962).
  3. Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, “Backpropagation applied to handwritten zip code recognition,” Neural Comput. 1, 541-551 (1989).
  4. S. L. Phung and A. Bouzerdoum, “A pyramidal neural network for visual pattern recognition,” IEEE Trans. Neural Netw. 18, 329-343 (2007).
  5. M. Riesenhuber and T. Poggio, “Hierarchical models of object recognition in cortex,” Nat. Neurosci. 2, 1019-1025 (1999).
  6. G. Kreiman, C. Koch, and I. Fried, “Category-specific visual responses of single neurons in the human medial temporal lobe,” Nat. Neurosci. 3, 946-953 (2000).
  7. E. B. Goldstein, Sensation and Perception (Wadsworth, 2007).
  8. L. J. Borg-Graham, C. Monier, and Y. Fregnac, “Visual input evokes transient and strong shunting inhibition in visual cortical neurons,” Nature 393, 369-373 (1998).
  9. T. Hammadou and A. Bouzerdoum, “Novel image enhancement technique using shunting inhibitory cellular neural networks,” IEEE Trans. Consumer Electron. 47, 934-940 (2001).
  10. M. Riedmiller, “Advanced supervised learning in multilayer perceptrons--from backpropagation to adaptive learning algorithms,” Comput. Standards Interfaces 16, 265-275 (1994).
  11. S. E. Fahlman, “An empirical study of learning speed in backpropagation networks,” Computer Science Technical Report CMU-CS-88-162 (Carnegie Mellon University, 1988).
  12. R. Collins, A. Lipton, H. Fujiyoshi, and T. Kanade, “Algorithms for cooperative multisensor surveillance,” Proc. IEEE 89, 1456-1477 (2001).
  13. C. Papageorgiou and T. Poggio, “Trainable pedestrian detection,” in Proceedings of the International Conference on Image Processing (IEEE, 1999), Vol. 4, pp. 35-39.
  14. I. Haritaoglu and M. Flickner, “Attentive billboards,” in Proceedings of the International Conference on Image Analysis and Processing (Italian Group of Researchers in Pattern Recognition, 2001), pp. 162-167.
  15. S. Munder and D. M. Gavrila, “An experimental study on pedestrian classification,” IEEE Trans. Pattern Anal. Mach. Intell. 28, 1863-1868 (2006).
  16. F. H. C. Tivive and A. Bouzerdoum, “A shunting inhibitory convolutional neural network for gender classification,” in Proceedings of the International Conference on Pattern Recognition (International Association of Pattern Recognition, 2006), pp. 421-424.
  17. C. Wohler and J. K. Anlauf, “An adaptable time-delay neural-network algorithm for image sequence analysis,” IEEE Trans. Neural Netw. 10, 1531-1536 (1999).
  18. S. Agarwal, A. Awan, and D. Roth, “Learning to detect objects in images via a sparse, part-based representation,” IEEE Trans. Pattern Anal. Mach. Intell. 26, 1475-1490 (2004).
  19. J. Fang and G. Qiu, “Car/non-car classification in an informative sample subspace,” in Proceedings of the International Conference on Pattern Recognition (International Association of Pattern Recognition, 2006), Vol. 2, pp. 962-965.
  20. Z. Zhu, Y. Zhao, and H. Lu, “Sequential architecture for efficient car detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2007), pp. 1-8.
  21. K-K. Sung and T. Poggio, “Example-based learning for view-based human face detection,” IEEE Trans. Pattern Anal. Mach. Intell. 20, 39-51 (1998).
  22. C. Garcia and M. Delakis, “Convolutional face finder: a neural architecture for fast and robust face detection,” IEEE Trans. Pattern Anal. Mach. Intell. 26, 1408-1423 (2004).
  23. J. Mutch and D. G. Lowe, “Multiclass object recognition with sparse, localized features,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2006), Vol. 1, pp. 11-18.
  24. M. Fritz, B. Leibe, B. Caputo, and B. Schiele, “Integrating representative and discriminative models for object category detection,” in Proceedings of the IEEE International Conference on Computer Vision (IEEE, 2005), Vol. 2, pp. 1363-1370.

Figures (5)

Fig. 1. Overview of the proposed hierarchical structure.

Fig. 2. Subsampling operations performed in (a) Stage 1 and (b) Stage 2.

Fig. 3. Image patterns from the Daimler–Chrysler pedestrian detection database.

Fig. 4. Performance comparison of different classifiers on the Daimler–Chrysler pedestrian detection database. CR denotes classification rate.

Fig. 5. Car detector outputs for some images in the UIUC database. The detection score (sc) is also shown.

Tables (3)

Table 1. Comparison of Thresholding Approaches on UIUC Test Set 1

Table 2. Performance of the Proposed Car Detector on the UIUC Database for Different Initial Cutoff Values $V_0$

Table 3. Comparison of Recall Rates of Different Car Detectors in the UIUC Database

Equations (26)

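$$Z_{1,i} = \frac{D_i * I}{G * I},$$
$$G(x,y) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right).$$
$$D_i(x,y) = \cos(\theta_i)\, G_x(x,y) + \sin(\theta_i)\, G_y(x,y),$$
$$G_x(x,y) = \partial G(x,y)/\partial x = -\frac{x}{2\pi\sigma^{4}} \exp\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right),$$
$$G_y(x,y) = \partial G(x,y)/\partial y = -\frac{y}{2\pi\sigma^{4}} \exp\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right).$$
$$\theta_i = \frac{(i-1)\pi}{N_1}$$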
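
The Stage 1 filter definitions can be illustrated with a minimal sketch that samples the directional kernel D_i on a small integer grid from the Gaussian derivatives G_x and G_y. The function names, the grid half-width, and the choice of sigma are illustrative assumptions, not values taken from the article.

```python
import math

def gaussian(x, y, sigma):
    """2-D isotropic Gaussian G(x, y)."""
    return math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)

def gaussian_dx(x, y, sigma):
    """Partial derivative of G with respect to x."""
    return -x / (2 * math.pi * sigma ** 4) * math.exp(-(x * x + y * y) / (2 * sigma ** 2))

def gaussian_dy(x, y, sigma):
    """Partial derivative of G with respect to y."""
    return -y / (2 * math.pi * sigma ** 4) * math.exp(-(x * x + y * y) / (2 * sigma ** 2))

def orientations(n1):
    """Angles theta_i = (i - 1) * pi / N1 for i = 1..N1."""
    return [(i - 1) * math.pi / n1 for i in range(1, n1 + 1)]

def directional_kernel(theta, sigma, half_width):
    """Sample D(x, y) = cos(theta) G_x + sin(theta) G_y on an integer grid.

    half_width is a hypothetical parameter: the kernel covers
    x, y in [-half_width, half_width]. Rows index y, columns index x.
    """
    coords = range(-half_width, half_width + 1)
    return [
        [math.cos(theta) * gaussian_dx(x, y, sigma)
         + math.sin(theta) * gaussian_dy(x, y, sigma)
         for x in coords]
        for y in coords
    ]

# Example: four orientations, one 5x5 kernel per orientation.
kernels = [directional_kernel(t, sigma=1.0, half_width=2) for t in orientations(4)]
```

Each kernel is antisymmetric about its center along the filter direction, which is what makes it respond to edges of that orientation.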
$$Z_{1,i} \rightarrow \{ Z_{2,4i-3},\; Z_{2,4i-2},\; Z_{2,4i-1},\; Z_{2,4i} \}, \quad i = 1, 2, \ldots, N_1.$$
$$Z_{2,i} \rightarrow \begin{cases} \text{on-response map } Z_{3,2i-1} = \max(Z_{2,i}, 0) \\ \text{off-response map } Z_{3,2i} = -\min(Z_{2,i}, 0) \end{cases}.$$
$$Z_{4,i} = \frac{Z_{3,i}}{Z_{3,i} + \mu_i}.$$
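
As a rough illustration of the rectification and normalization steps, the sketch below splits a signed map into on- and off-response maps and then applies the divisive normalization Z / (Z + mu). It assumes the off-response map is the negated negative part (so both maps are non-negative) and treats mu as a positive constant supplied by the caller; this listing does not say how mu is obtained. Function names are hypothetical.

```python
def on_off_maps(z2):
    """Split a signed 2-D map into non-negative on- and off-response maps.

    on  = max(z, 0)
    off = max(-z, 0), which equals -min(z, 0) (assumed sign convention)
    """
    on = [[max(v, 0.0) for v in row] for row in z2]
    off = [[max(-v, 0.0) for v in row] for row in z2]
    return on, off

def normalize(z3, mu):
    """Divisive normalization Z4 = Z3 / (Z3 + mu) for a non-negative map.

    mu > 0 is assumed so the denominator cannot vanish.
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    return [[v / (v + mu) for v in row] for row in z3]

z2 = [[1.0, -2.0], [0.0, 3.0]]
on_map, off_map = on_off_maps(z2)
z4_on = normalize(on_map, mu=1.0)
```

With non-negative inputs and positive mu, every normalized value lies in [0, 1), so large responses saturate rather than dominate.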
$$Z_{5,i} = \frac{g(P_k * Z_{4,i} + b_k) + c_k}{a_k + f(Q_k * Z_{4,i} + d_k)},$$
$$a_k \geq \varepsilon - \inf(f),$$
$$\{ Z_{5,4i-3},\; Z_{5,4i-2},\; Z_{5,4i-1},\; Z_{5,4i} \} \rightarrow Z_{6,i}.$$
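
The shunting inhibitory response can be sketched for a single output position, where each convolution reduces to a weighted sum over one input patch. The activation functions are not specified in this listing, so the sketch assumes g = tanh and f = exp purely for illustration; with f = exp, inf(f) = 0, so the constraint on a reduces to a >= epsilon. All names are hypothetical.

```python
import math

def shunting_response(patch, p, q, a, b, c, d, f=math.exp, g=math.tanh):
    """Response of one shunting inhibitory unit at a single position.

    numerator   = g(p . patch + b) + c
    denominator = a + f(q . patch + d)
    p and q are flattened kernels already aligned with the patch.
    f and g are assumed activation functions (exp and tanh here).
    """
    excitation = sum(w * x for w, x in zip(p, patch)) + b
    inhibition = sum(w * x for w, x in zip(q, patch)) + d
    denominator = a + f(inhibition)
    if denominator <= 0:
        raise ValueError("denominator must stay positive; increase a")
    return (g(excitation) + c) / denominator

# A zero patch leaves only the bias terms: (tanh(0) + 1) / (1 + exp(0)).
r = shunting_response([0.0, 0.0], p=[0.3, -0.2], q=[0.1, 0.4],
                      a=1.0, b=0.0, c=1.0, d=0.0)
```

The lower bound on a keeps the denominator away from zero, so the division is always well defined.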
$$y = \sum_{i=1}^{N_3} w_i Z_{6,i} + b,$$
$$\text{minimize} \quad E(\mathbf{w}) = \| Z_6 \mathbf{w} - \mathbf{d} \|^{2}.$$
$$\mathbf{w} = (Z_6^{T} Z_6)^{-1} Z_6^{T} \mathbf{d}.$$
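
The least-squares fit of the output weights can be illustrated by solving the normal equations (Z^T Z) w = Z^T d directly. The sketch below uses plain Gaussian elimination and only suits small, well-conditioned problems; handling the bias b by appending a column of ones to the feature matrix is an assumption made here for illustration.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def least_squares(z, d):
    """Return w minimizing ||z w - d||^2 via the normal equations."""
    zt = transpose(z)
    ztz = matmul(zt, z)
    ztd = [sum(a * b for a, b in zip(row, d)) for row in zt]
    return solve(ztz, ztd)

# Toy data generated by d = 2 * feature + 1; the last column of ones
# stands in for the bias term (an assumption of this sketch).
features = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
targets = [1.0, 3.0, 5.0, 7.0]
w = least_squares(features, targets)
```

On this toy data the recovered weights are approximately 2 for the feature and 1 for the bias column.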
$$v(t+1) = v(t) + \Delta v(t) + \mu(t)\, \Delta v(t-1).$$
$$\Delta v(t) = -\operatorname{sign}[g(t,v)]\, \gamma(t),$$
$$\gamma(t) = \begin{cases} \max(0.5\,\gamma(t-1),\, 10^{-10}), & \text{if } g(t,v)\, g(t-1,v) < 0 \\ \min(1.2\,\gamma(t-1),\, 10), & \text{if } g(t,v)\, g(t-1,v) > 0 \\ \gamma(t-1), & \text{if } g(t,v)\, g(t-1,v) = 0 \end{cases}.$$
$$\mu(t) = \left| \frac{g(t,v)}{g(t-1,v) - g(t,v)} \right|.$$
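
A single parameter update under these rules can be sketched for one scalar weight v. The sketch assumes g is the gradient of the error with respect to v and that the sign-based step moves against the gradient; returning a zero momentum rate when the two gradients are equal is an added guard that this listing does not specify.

```python
def sign(x):
    return (x > 0) - (x < 0)

def step_size(gamma_prev, g_t, g_prev):
    """Adapt the step size from the sign of consecutive gradients."""
    product = g_t * g_prev
    if product < 0:
        return max(0.5 * gamma_prev, 1e-10)
    if product > 0:
        return min(1.2 * gamma_prev, 10.0)
    return gamma_prev

def momentum_rate(g_t, g_prev):
    """mu(t) = |g(t) / (g(t-1) - g(t))|, with an assumed guard for equal gradients."""
    denominator = g_prev - g_t
    if denominator == 0:
        return 0.0  # assumption: drop the momentum term in this case
    return abs(g_t / denominator)

def update(v, g_t, g_prev, gamma_prev, dv_prev):
    """One update of weight v; returns (v_next, gamma, dv)."""
    gamma = step_size(gamma_prev, g_t, g_prev)
    dv = -sign(g_t) * gamma
    v_next = v + dv + momentum_rate(g_t, g_prev) * dv_prev
    return v_next, gamma, dv

# Gradients with the same sign: the step size grows from 0.1 to 0.12.
v_next, gamma, dv = update(v=1.0, g_t=2.0, g_prev=4.0, gamma_prev=0.1, dv_prev=-0.1)
```

The caller carries gamma and dv forward to the next iteration, together with the current gradient as g_prev.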
$$\text{recall} = \frac{TP}{n_P},$$
$$\text{precision} = \frac{TP}{TP + FP},$$
$$F\text{-measure} = \frac{2 \times \text{recall} \times \text{precision}}{\text{recall} + \text{precision}}.$$
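
The three detection metrics are simple enough to check by hand; a minimal sketch follows. Here n_pos stands for the total number of positive objects, and returning 0.0 when a denominator is zero is a convention chosen for this sketch rather than something the listing states. The example counts are made up.

```python
def recall(tp, n_pos):
    """Fraction of the n_pos positive objects that were detected."""
    return tp / n_pos if n_pos else 0.0

def precision(tp, fp):
    """Fraction of detections that are true positives."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def f_measure(r, p):
    """Harmonic mean of recall and precision."""
    return 2 * r * p / (r + p) if (r + p) else 0.0

# Hypothetical counts: 8 correct detections, 2 false alarms, 10 objects.
r = recall(tp=8, n_pos=10)
p = precision(tp=8, fp=2)
f = f_measure(r, p)
```

When recall and precision are equal, as in this example, the F-measure equals that common value.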
$$S_0 = \{ y_0(m,n) \mid y_0(m,n) > V_0 \}.$$
$$T_0 = \frac{1}{|S_0|} \sum_{s_i \in S_0} s_i.$$
$$S_k = S_{k-1} \cup \{ y_k(m,n) \mid y_k(m,n) > V_0 \}.$$
$$T_k = \frac{1}{|S_k|} \sum_{s_j \in S_k} s_j.$$
