Abstract

We propose a novel object-of-interest (OOI) segmentation algorithm for various images that is based on human attention and semantic region clustering. As object-based image segmentation is beyond current computer vision techniques, the proposed method segments an image into regions, which are then merged into a semantic object. At the same time, an attention window (AW) is created based on the saliency map and saliency points of the image. Within the AW, a support vector machine is used to select the salient regions, which are then clustered into the OOI using the proposed region merging. Unlike other algorithms, the proposed method allows multiple OOIs to be segmented according to the saliency map.

© 2006 Optical Society of America


References


  1. Q. Tian, Y. Wu, and T. S. Huang, "Combine user defined region-of-interest and spatial layout for image retrieval," in Proceedings of the 2000 IEEE International Conference on Image Processing (IEEE Press, 2000), pp. 746-749.
  2. S. Kim, S. Park, and M. Kim, "Central object extraction for object-based image retrieval," in Proceedings of the International Conference on Image and Video Retrieval (Association for Computing Machinery, 2003), pp. 39-49.
  3. W. Wang, Y. Song, and A. Zhang, "Semantics retrieval by region saliency," in Proceedings of the International Conference on Image and Video Retrieval (Association for Computing Machinery, 2002), pp. 29-37.
  4. W. Osberger and A. J. Maeder, "Automatic identification of perceptually important regions in an image," in Proceedings of the IEEE International Conference on Pattern Recognition (IEEE Press, 1998), pp. 701-704.
  5. Y.-F. Ma and H.-J. Zhang, "Content-based image attention analysis by using fuzzy growing," in Proceedings of the International Conference on ACM Multimedia (Association for Computing Machinery, 2003), pp. 374-381.
  6. J. Z. Wang, J. Li, R. M. Gray, and G. Wiederhold, "Unsupervised multiresolution segmentation for images with low depth of field," IEEE Trans. Pattern Anal. Mach. Intell. 23, 85-90 (2001).
  7. C. Kim, "Segmenting a low-depth-of-field image using morphological filters and region merging," IEEE Trans. Image Process. 14, 1503-1511 (2005).
  8. B. Suh, H. Ling, B. Bederson, and D. W. Jacob, "Automatic thumbnail cropping and its effectiveness," in Proceedings of the 16th ACM Symposium on User Interface Software and Technology (Association for Computing Machinery, 2003), pp. 95-104.
  9. G. Lei and G. B. Long, "A watermarking scheme using image object region," in Proceedings of the IEEE International Conference on Computational Intelligence and Multimedia Applications (IEEE Press, 2003), pp. 419-423.
  10. B. C. Ko and H. Byun, "Frip: a region-based image retrieval tool using automatic image segmentation and stepwise boolean and matching," IEEE Trans. Multimedia 7, 105-113 (2005).
  11. J. M. Wolfe, "Guided search 2.0: a revised model of visual search," Psychon. Bull. Rev. 1, 202-238 (1994).
  12. L. Itti, C. Koch, and E. Niebur, "A model of saliency-based visual attention for rapid scene analysis," IEEE Trans. Pattern Anal. Mach. Intell. 20, 1254-1259 (1998).
  13. E. Loupias and N. Sebe, "Wavelet-based salient points for image retrieval," Research Report RR 99.11 (RFV-INSA Lyon, 1999).
  14. B. Moghaddam and M.-H. Yang, "Gender classification with support vector machines," in Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition (IEEE Press, 2000), pp. 306-311.
  15. L. Fei-Fei, R. Fergus, and P. Perona, "Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories," CVPR 2004, Workshop on Generative-Model Based Vision (IEEE, 2004).
  16. C. Yim and A. C. Bovik, "Multiresolution 3-D range segmentation using focused cues," IEEE Trans. Image Process. 7, 1283-1299 (1998).
  17. Z. Ye and C. C. Lu, "Unsupervised multiscale focused objects detection using hidden Markov tree," in Proceedings of the International Conference on Computer Vision, Pattern Recognition, and Image Processing (IEEE Press, 2002), pp. 812-815.
  18. The OOI segmentation results can be found at http://video.kmu.ac.kr/cvpr/OOI-results/OOIResults.htm

Figures (9)

Fig. 1

Flow diagram of OOI segmentation.

Fig. 2

(a) Original image, (b) color map, (c) orientation map, (d) luminance map, (e) saliency map, (f) saliency points.

Fig. 3

Algorithm for creating optimal AW using saliency map and saliency points.

Fig. 4

Final attention window.

Fig. 5

Region merging procedure: (a) original image, (b) AW creation and initial seed region selection (two white regions), (c) regions eliminated by the first condition, (d) final OOIs.

Fig. 6

Experimental results for OOI segmentation. (See Ref. [15])

Fig. 7

Performance evaluation criteria for object extraction performance.

Fig. 8

Comparison of OOI segmentation: (a) test images, (b) results using Ref. [6], (c) results using Ref. [15], (d) results using Ref. [16], (e) results using Ref. [7], (f) results using proposed algorithm, (g) manually segmented OOIs.

Fig. 9

Performance evaluation using Eqs. (15): (a) results for bird, (b) results for leopard, (c) results for butterfly.

Tables (1)

Table 1. Comparison of OOI Segmentation Results Among Three Methods, Where $S_U$ Is the Undersegmentation Error, $S_O$ Is the Oversegmentation Error, and $A$ Is the Average Accuracy

Equations (18)

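$$\bar{C} = \frac{1}{4}\left(\sum_{c \in \{a^*, b^*\}} \sum_{s \in \{11 \times 11,\, 13 \times 13\}} C(c, s)\right).$$
$$\bar{O} = \frac{1}{6}\left(\sum_{c \in \{HH, HL, LH\}} \sum_{s \in \{11 \times 11,\, 13 \times 13\}} O(c, s)\right).$$
$$\bar{L} = \frac{1}{2}\left(\sum_{s \in \{11 \times 11,\, 13 \times 13\}} L(s)\right).$$
$$w_i = \frac{V_i}{\sum_{i=1}^{3} V_i},$$
$$C_m = w_1 \times \bar{C}(x, y) + w_2 \times \bar{L}(x, y) + w_3 \times \bar{O}(x, y).$$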
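The weighted combination above can be illustrated with a minimal sketch. It assumes the three averaged maps are already available as equally sized 2-D lists and that the scores V_i are given nonnegative numbers; the function name `combine_feature_maps` and the sample values are hypothetical and not taken from the article.

```python
def combine_feature_maps(color, luminance, orientation, v):
    """Weighted sum of three equally sized 2-D maps (lists of rows).

    v holds the three scores V_1..V_3 (assumed given); each weight is
    w_i = V_i / (V_1 + V_2 + V_3), applied in the order
    color (w_1), luminance (w_2), orientation (w_3).
    """
    total = sum(v)
    if total == 0:
        raise ValueError("the scores V_i must not sum to zero")
    w1, w2, w3 = (vi / total for vi in v)
    return [
        [w1 * c + w2 * l + w3 * o for c, l, o in zip(c_row, l_row, o_row)]
        for c_row, l_row, o_row in zip(color, luminance, orientation)
    ]


# Hypothetical 1x2 maps and scores, for illustration only.
saliency = combine_feature_maps(
    [[1.0, 0.0]], [[0.0, 1.0]], [[0.0, 0.0]], v=[1, 1, 2]
)
```

Because the weights sum to one, the combined map stays in the same value range as the inputs.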
$$f_i^{L} = \frac{(1 - L_i)}{\sum_{i=1}^{N} (1 - L_i)}.$$
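$$f_i^{H_x} = \frac{H_{x_i}}{\sum_{i=1}^{N} H_{x_i}}, \quad f_i^{H_y} = \frac{H_{y_i}}{\sum_{i=1}^{N} H_{y_i}}.$$
$$f_i^{S} = \frac{S_i}{\sum_{i=1}^{N} S_i}.$$
$$f_i^{S_p} = \frac{S_{p_i}}{\sum_{i=1}^{N} S_{p_i}}.$$
$$f_i^{C_m} = \frac{C_{m_i}}{\sum_{i=1}^{N} C_{m_i}}.$$
$$\bar{f} = [f^{L}, f^{H_x}, f^{H_y}, f^{S}, f^{S_p}, f^{C_m}]^{T}.$$
$$K(x, y) = \exp\left(-\frac{\lVert x - y \rVert^2}{2\sigma^2}\right),$$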
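As a rough illustration of the two ingredients above, the sketch below normalizes a per-region feature by its sum over all N regions and evaluates a kernel between two feature vectors. It assumes the kernel is the standard Gaussian radial basis function; the helper names `normalize` and `rbf_kernel` and the default `sigma` are hypothetical choices, not values from the article.

```python
import math


def normalize(values):
    """Divide each region's raw feature value by the sum over all regions."""
    total = sum(values)
    if total == 0:
        raise ValueError("feature values must not sum to zero")
    return [v / total for v in values]


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2 * sigma ** 2))


# Hypothetical region sizes for three regions.
sizes = normalize([1, 1, 2])
```

Identical vectors give a kernel value of 1, and the value decays toward 0 as the vectors move apart, at a rate set by `sigma`.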
{ OOI k = OOI k R i , if ( d i > θ 2 and O b i , k > θ 3 ) B = B R i else } ,
O b i , k = card ( B R i OOI k ) card ( P m ) ,
P m = min ( B R i OOI k ) ,
$$S_U = \frac{(M - (M \cap E))}{S_M} \times 100,$$
$$S_O = \frac{(E - (M \cap E))}{S_E} \times 100,$$
$$A = 100 - (S_U + S_O) \times 100,$$
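The undersegmentation and oversegmentation errors can be sketched as follows. The sketch assumes M and E are the manually segmented and the extracted OOI, represented as sets of pixel coordinates, and that S_M and S_E are their pixel counts; this reading and the function name `segmentation_errors` are assumptions rather than details confirmed by the article.

```python
def segmentation_errors(manual, extracted):
    """Return (S_U, S_O) as percentages for two sets of pixel coordinates.

    S_U: share of the manual region missed by the extracted region.
    S_O: share of the extracted region lying outside the manual region.
    """
    m, e = set(manual), set(extracted)
    if not m or not e:
        raise ValueError("both regions must contain at least one pixel")
    overlap = m & e
    s_u = len(m - overlap) / len(m) * 100
    s_o = len(e - overlap) / len(e) * 100
    return s_u, s_o


# Hypothetical 4-pixel manual region and 3-pixel extracted region.
s_u, s_o = segmentation_errors(
    {(0, 0), (0, 1), (1, 0), (1, 1)},
    {(0, 0), (0, 1), (0, 2)},
)
```

In this toy case two of the four manual pixels are missed and one of the three extracted pixels falls outside the manual region.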
