Abstract

Motion is one of the most important types of information contained in natural video, but direct use of motion information in the design of video quality assessment algorithms has not been deeply investigated. Here we propose to incorporate a recent model of human visual speed perception [Nat. Neurosci. 9, 578 (2006)] and model visual perception in an information communication framework. This allows us to estimate both the motion information content and the perceptual uncertainty in video signals. Improved video quality assessment algorithms are obtained by incorporating the model as spatiotemporal weighting factors, where the weight increases with the information content and decreases with the perceptual uncertainty. Consistent improvement over existing video quality assessment algorithms is observed in our validation with the Video Quality Experts Group (VQEG) Phase I test data set.
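As a rough illustration of the weighted pooling described above, the sketch below (Python/NumPy) forms a per-pixel weight as max{0, information content - perceptual uncertainty} and uses it to average a local quality map over space and time. It is a minimal sketch under assumed array names and shapes, not the authors' implementation.

```python
import numpy as np

def pool_quality(q, info, uncert):
    """Spatiotemporal weighted pooling of a local quality map.

    q      : local quality map over (time, height, width), e.g., local SSIM values
    info   : estimated motion information content, same shape as q
    uncert : estimated perceptual uncertainty, same shape as q
    """
    # The weight increases with information content and decreases with
    # perceptual uncertainty; negative weights are clipped to zero.
    w = np.maximum(0.0, info - uncert)
    return float(np.sum(w * q) / (np.sum(w) + 1e-12))  # epsilon guards an all-zero weight map
```

Here q(x, y, t) stands for the local value of an existing base metric (the squared error underlying PSNR, or the local SSIM index), so the speed-perception model acts purely as a pooling stage on top of that metric.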

© 2007 Optical Society of America


References

  1. T. N. Pappas, R. J. Safranek, and J. Chen, "Perceptual criteria for image quality evaluation," in Handbook of Image and Video Processing, 2nd ed., A.Bovik, ed. (Academic, 2005), pp. 923-939.
  2. Z. Wang, H. R. Sheikh, and A. C. Bovik, "Objective video quality assessment," in Handbook of Video Databases: Design and Applications, B.Furht and O.Marques, eds. (CRC Press, 2003), pp. 1041-1078.
  3. B. A. Wandell, Foundations of Vision (Sinauer, 1995).
  4. H. R. Sheikh and A. C. Bovik, "A visual information fidelity approach to video quality assessment," in First International Workshop on Video Processing and Quality Metrics for Consumer Electronics (2005), CD only.
  5. Z. Wang, L. Lu, and A. C. Bovik, "Video quality assessment based on structural distortion measurement," Signal Process. 19, 121-132 (2004).
  6. Z. Wang, H. R. Sheikh, A. C. Bovik, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Trans. Image Process. 13, 600-612 (2004).
  7. Z. K. Lu, W. Lin, X. K. Yang, E. P. Ong, and S. S. Yao, "Modeling visual attention's modulatory aftereffects on visual sensitivity and quality evaluation," IEEE Trans. Image Process. 14, 1928-1942 (2005).
  8. K. Seshadrinathan and A. C. Bovik, "A structural similarity metric for video based on motion models," in IEEE International Conference on Acoustics, Speech, and Signal Processing (IEEE, 2007), Vol. I, pp. 869-872.
  9. K. Seshadrinathan and A. C. Bovik, "An information theoretic video quality metric based on motion models," in Third International Workshop on Video Processing and Quality Metrics for Consumer Electronics (2007), CD only.
  10. Z. Wang and E. P. Simoncelli, "Translation insensitive image similarity in complex wavelet domain," in IEEE International Conference on Acoustics, Speech, and Signal Processing (2005), pp. 573-576.
  11. H. R. Sheikh and A. C. Bovik, "Image information and visual quality," IEEE Trans. Image Process. 15, 430-444 (2006).
  12. A. A. Stocker and E. P. Simoncelli, "Noise characteristics and prior expectations in human visual speed perception," Nat. Neurosci. 9, 578-585 (2006).
  13. E. P. Simoncelli, E. H. Adelson, and D. J. Heeger, "Probability distributions of optical flow," IEEE International Conference on Computer Vision and Pattern Recognition (1991), pp. 310-315.
  14. Y. Weiss, E. P. Simoncelli, and E. H. Adelson, "Motion illusions as optimal percepts," Nat. Neurosci. 5, 598-604 (2002).
  15. F. Hürlimann, D. C. Kiper, and M. Carandini, "Testing the Bayesian model of perceived speed," Vision Res. 42, 2253-2257 (2002).
  16. E. P. Simoncelli and B. Olshausen, "Natural image statistics and neural representation," Annu. Rev. Neurosci. 24, 1193-1216 (2001).
  17. R. Raj, W. S. Geisler, R. A. Frazor, and A. C. Bovik, "Contrast statistics for foveated visual systems: fixation selection by minimizing contrast entropy," J. Opt. Soc. Am. A 22, 2039-2049 (2005).
  18. J. Najemnik and W. S. Geisler, "Optimal eye movement strategies in visual search," Nature (London) 434, 387-391 (2005).
  19. Z. Wang and X. L. Shang, "Spatial pooling strategies for perceptual image quality assessment," in Proceedings of IEEE International Conference on Image Processing (2006), pp. 2945-2948.
  20. VQEG, "Final report from the video quality experts group on the validation of objective models of video quality assessment," http://www.vqeg.org/ (2000).
  21. Z. Wang, L. Lu, and A. C. Bovik, "Foveation scalable video coding with automatic fixation selection," IEEE Trans. Image Process. 12, 243-254 (2003).
  22. M. J. Black and P. Anandan, "The robust estimation of multiple motions: Parametric and piecewise-smooth flow fields," Comput. Vis. Image Underst. 63, 75-104 (1996).
  23. T. Vlachos, "Simple method for estimation of global motion parameters using sparse translational motion vector fields," Electron. Lett. 34, 90-91 (1998).
  24. E. Peli, "Contrast in complex images," J. Opt. Soc. Am. A 7, 2032-2040 (1990).
  25. P. C. Teo and D. J. Heeger, "Perceptual image distortion," Proc. SPIE 2179, 127-141 (1994).
  26. D. J. Heeger and P. C. Teo, "A model of perceptual image fidelity," in Proceedings of IEEE International Conference on Image Processing (1995), pp. 343-345.

Figures (5)

Fig. 1

Illustration of absolute motion, background motion, and relative motion estimated from two consecutive frames of a video sequence.

Fig. 2

Bayesian visual speed perception in an information communication framework. v: stimulus speed; m: noisy measurement; v̂: estimated speed; c: stimulus contrast. Adapted from Stocker and Simoncelli [12].

Fig. 3

(a), (b) Two consecutive frames extracted from the “Mobile Calendar” sequence; (c) estimated absolute motion field; (d) estimated relative motion field; (e) estimated local information content map; (f) estimated local perceptual uncertainty map; (g) estimated local weighting factor map.

Fig. 4

Illustration of quality maps. (a) Original image; (b) distorted image (by JPEG compression); (c) absolute error map—brighter indicates better quality (smaller absolute difference); (d) SSIM index map—brighter indicates better quality (larger SSIM value). The SSIM index appears to be a better indicator of local image quality.

Fig. 5

Scatter plots of subjective/objective scores on the VQEG Phase I test database (all video sequences included). The vertical and horizontal axes represent the subjective and the objective scores, respectively. Each sample point represents one test video sequence. (a) PSNR; (b) PSNR with proposed weighting method; (c) SSIM; (d) SSIM with proposed weighting method. All SSIM values were raised to the eighth power for visualization purposes only.

Tables (1)

Table 1 ROCC Results of VQA Algorithms on VQEG Phase I Database

Equations (15)

\[ v_r = v_a - v_g . \]
\[ p(v_r) = \tau\, v_r^{-\alpha} , \]
\[ I = -\log p(v_r) = \alpha \log v_r + \beta , \]
\[ p(m \mid v_s) = \frac{1}{\sqrt{2\pi}\,\sigma m}\, \exp\!\left[ -\frac{(\log m - \log v_s)^2}{2\sigma^2} \right] , \]
\[ \sigma = \frac{\lambda}{c^{\gamma}} , \]
\[ U = -\int p(m \mid v_g) \log p(m \mid v_g)\, \mathrm{d}m = \frac{1}{2} + \frac{1}{2}\log\!\left(2\pi\sigma^{2}\right) + \log v_g = \log v_g - \gamma \log c + \delta , \]
\[ w = I - U = \left( \alpha \log v_r + \beta \right) - \left( \log v_g - \gamma \log c + \delta \right) . \]
\[ Q = \frac{\sum_t \sum_x \sum_y w(x,y,t)\, q(x,y,t)}{\sum_t \sum_x \sum_y w(x,y,t)} . \]
\[ c = \frac{\sigma_p}{\mu_p + \mu_0} , \]
\[ c' = 1 - e^{-\left( c/\theta \right)^{\rho}} , \]
\[ w = \max\!\left\{ 0,\; \left[ \alpha \log\!\left(1 + \frac{v_r}{v_0}\right) + \beta \right] - \left[ \log\!\left(1 + \frac{v_g}{v_0}\right) - \gamma \log\!\left(1 + \frac{c'}{c_0}\right) + \delta \right] \right\} . \]
\[ q(x,y,t) = \left[ I_r(x,y,t) - I_d(x,y,t) \right]^{2} , \]
\[ \mathrm{PSNR} = 10 \log_{10}\!\left( \frac{L^{2}}{\mathrm{MSE}} \right) , \]
\[ q(x,y,t) = \frac{\left( 2\mu_r \mu_d + C_1 \right)\left( 2\sigma_{rd} + C_2 \right)}{\left( \mu_r^{2} + \mu_d^{2} + C_1 \right)\left( \sigma_r^{2} + \sigma_d^{2} + C_2 \right)} , \]
\[ r = 1 - \frac{6 \sum_{i=1}^{N} d_i^{2}}{N\left(N^{2} - 1\right)} . \]
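As a reading aid for the clipped weight above (the max{0, ...} expression), the sketch below transcribes it into Python/NumPy. It is illustrative only: the argument names and the constants alpha, beta, gamma, delta, v0, c0 are placeholders, not the values estimated in the paper, and the speed and contrast maps are assumed to have been computed beforehand (e.g., from an optical flow estimate and a local contrast measure such as those cited above).

```python
import numpy as np

def weighting_factor(v_r, v_g, c_prime,
                     alpha=1.0, beta=1.0, gamma=1.0, delta=1.0,
                     v0=1.0, c0=1.0):
    """Per-pixel weight w = max{0, I - U}, transcribed from the equations above.

    v_r, v_g : relative and global (background) speed maps (same shape)
    c_prime  : perceptually mapped contrast map (c' above)
    The six constants are placeholders, not the fitted values from the paper.
    """
    info = alpha * np.log(1.0 + v_r / v0) + beta          # motion information content I
    uncert = (np.log(1.0 + v_g / v0)
              - gamma * np.log(1.0 + c_prime / c0) + delta)  # perceptual uncertainty U
    return np.maximum(0.0, info - uncert)
```

The pooled score Q then follows from the weighted average defined earlier, and rank-order agreement with subjective scores (Table 1) can be checked either with the closed-form Spearman expression above or with scipy.stats.spearmanr.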
