Suitable methodology in subjective video quality assessment: a resolution dependent paradigm


DESCRIPTION

Presentation of my scientific paper at the Third International Workshop on Image Media Quality and its Applications (IMQA2008).


1. Stéphane Péchard, Romuald Pépion, Patrick Le Callet (IRCCyN, France). Suitable methodology in subjective video quality assessment: a resolution dependent paradigm

2. Outline: 1. Introduction. 2. Comparison of subjective scores. 3. Impact of the number of observers on precision. 4. Conclusion.

3. DSIS... ACR... SSCQE... DSCQS... SAMVIQ... SDSCE... SCACJ... Which one?

4. Many factors: order of the sequences, number of viewings, type of scale.

5. ACR: random order, only one viewing, discrete scale, no explicit reference; used by VQEG.

6. SAMVIQ: user-driven order, multiple viewings, continuous scale, explicit reference; used by EBU.

7. Subjective quality tests.

8. 192 HDTV sequences: 24 reference contents, each with 7 distorted versions (24 x 8 = 192 sequences in total).

9. Controlled environment, recommended number of viewers, minimal instructions.

10. [Figure: the two quality scales. ACR uses five discrete grades (excellent, good, fair, poor, bad = 5 to 1); SAMVIQ uses a continuous 0-100 scale carrying the same five labels. Linearized, the ACR grades cover only 80% of the SAMVIQ range.]

11. Quality scale adjustment (a sketch of this computation follows the slides).

12. [Scatter plot of MOS_ACR against MOS_SAMVIQ on HDTV: CC = 0.899, RMSE = 14.06.]

13. Brotherton [2006] found CC = 0.94 between the two methods at CIF resolution; at HDTV resolution we obtain CC = 0.89. Why?

14. MOS_ACR > MOS_SAMVIQ.

15. ACR is less critical than SAMVIQ: distortions are better perceived with SAMVIQ, but the inverse holds for the references.

16. What can explain this? The scale difference? The number of viewings? The explicit reference?

17. The scale difference? Corriveau observed that ACR scores sit closer to the extremities. But the reference MOS stay within [68.52; 87.04], so this is not the explanation. ACR uses 96.3% of its scale, SAMVIQ 82.1%.

18. The number of viewings? SAMVIQ allows unlimited viewings, hence more precise scores; with distortions, MOS_ACR > MOS_SAMVIQ. This explains the shape of the plot, but not the CC.

19. [Figure]

20. The presence of an explicit reference? No obvious impact. In SAMVIQ there is no difference between the references, and no scores above the explicit reference: only identical assessments. The psychological conditions are nonetheless not the same.

21. More tests! HDTV, VGA and QVGA.

22. Results:

    Resolution   Visual field (deg)   CC      RMSE
    QVGA         13                   0.969   6.73
    VGA          19                   0.942   9.31
    HDTV         33                   0.899   14.06

23. [Figure]

24. Only the number of viewings can imply such an impact. [Diagram: in ACR each sequence X_i is rated on its own; with multiple viewings, each sequence X_i is implicitly compared with the sequences already seen (X_1, ..., X_{i-1}).]

25. First conclusion: ACR and SAMVIQ are equivalent up to a certain resolution.

26. Precision versus number of observers N: measured by the 95% confidence interval, which depends on N. ACR needs a high N; SAMVIQ gives high precision.

27. Problem: the rejection algorithms differ. ACR uses the ITU one, SAMVIQ its own. Three modes: 1. without rejection; 2. with the ACR rejection; 3. with the SAMVIQ rejection (a sketch of the ITU-style screening follows the slides).

28. Number of validated observers:

                                          mode 1   mode 2   mode 3
    ACR (all sequences available)         28       27       23
    SAMVIQ (some sequences available)     18-25    15-25    15-22

29. Analysis: confidence intervals for several panel sizes N_P. For ACR, N_P in {28, 25, 22, 20, 18, 15, 12, 10, 8}; for each N_P, the CI is averaged over all possible observer combinations, giving a mean CI (see the CI sketch after the slides).

30. [Plot: mean confidence interval versus number of observers, ACR.]

31. [Plot: mean confidence interval versus number of observers, SAMVIQ.]

32. Close values? But the linearized ACR scale spans only 80% of the SAMVIQ scale (0-80 against 0-100), so a CI of 10 is relatively larger on ACR: ACR confidence intervals must be multiplied by 100/80 = 1.25 to be comparable.

33. Mean confidence intervals:

    Number of observers   8        10       12       15      18      20      22      25
    SAMVIQ                10.296   9.284    8.519    7.658   7.014   6.893   6.701   5.964
    ACR (adjusted)        12.815   11.567   10.619   9.55    8.749   8.315   7.94    7.461

34. [Figure]

35. [Plot: confidence interval of the MCI, ACR.]

36. [Plot: confidence interval of the MCI, SAMVIQ.]

37. Rejection modes analysis: the CI without rejection is smaller than the CIs with rejection, because the mean is computed over more observers in mode 1. The differences remain small nevertheless.

38. Rejection modes analysis: the CI with the SAMVIQ rejection is larger than the CI with the ACR rejection, for the same reason: the number of validated observers is smaller with SAMVIQ than with ACR.

39. Conclusion on the ACR-SAMVIQ comparison: the two methods behave differently, and the relation weakens as resolution increases; SAMVIQ is more accurate thanks to multiple viewings, which give the observer more information to process.

40. Conclusion on precision: strong impact of the number of observers, weak impact of the rejection algorithm; ACR requires more than 22 observers to reach the same precision as SAMVIQ with 15. Interesting for laboratories selecting the best methodology.

41. Questions?
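
Slides 11-12 rest on a scale adjustment followed by a correlation/RMSE computation. The sketch below is a minimal version of that step, assuming the linear mapping suggested by the deck (ACR grades 1 to 5 placed at 0, 20, 40, 60, 80 on the continuous axis); `mos_acr` and `mos_samviq` are hypothetical placeholder values, not the study's data.

```python
import numpy as np

def acr_to_continuous(grades):
    """Map ACR grades (1..5) linearly onto the 0-100 axis.

    Assumption from the deck: the five labels land at 0, 20, 40, 60, 80,
    i.e. the ACR scale covers only 80% of the SAMVIQ range.
    """
    return (np.asarray(grades, dtype=float) - 1.0) * 20.0

def agreement(mos_a, mos_b):
    """Pearson correlation coefficient and RMSE between two MOS sets."""
    mos_a, mos_b = np.asarray(mos_a, float), np.asarray(mos_b, float)
    cc = np.corrcoef(mos_a, mos_b)[0, 1]
    rmse = np.sqrt(np.mean((mos_a - mos_b) ** 2))
    return cc, rmse

# Hypothetical per-sequence MOS, one value per test sequence.
mos_acr = acr_to_continuous([4.2, 3.1, 2.5, 4.8, 1.9])
mos_samviq = np.array([71.0, 48.0, 35.0, 83.0, 22.0])
cc, rmse = agreement(mos_acr, mos_samviq)
print(f"CC = {cc:.3f}, RMSE = {rmse:.2f}")
```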
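
Slides 26, 29 and 33 rely on the 95% confidence interval of the MOS and on averaging it over every panel of N_P observers. A minimal sketch, assuming a Student-t interval on the per-sequence mean; the `scores` matrix is random placeholder data, not the real votes.

```python
import numpy as np
from itertools import combinations
from scipy import stats

def ci95_halfwidth(votes):
    """Half-width of the 95% Student-t confidence interval of the mean."""
    votes = np.asarray(votes, dtype=float)
    n = votes.shape[0]
    t = stats.t.ppf(0.975, n - 1)
    return t * votes.std(ddof=1) / np.sqrt(n)

def mean_ci(scores, n_p):
    """Average the CI over all panels of n_p observers and all sequences.

    scores: (observers, sequences) matrix of individual votes.
    """
    obs = range(scores.shape[0])
    cis = [ci95_halfwidth(scores[list(panel)][:, j])
           for panel in combinations(obs, n_p)
           for j in range(scores.shape[1])]
    return np.mean(cis)

rng = np.random.default_rng(0)
scores = rng.normal(60, 12, size=(10, 6))   # hypothetical votes
for n_p in (8, 10):
    print(n_p, round(mean_ci(scores, n_p), 3))
# To compare ACR with SAMVIQ, the ACR intervals are then multiplied
# by 1.25, compensating for the shorter (0-80) linearized ACR axis.
```

For the panel sizes in the deck the number of combinations grows quickly; sampling a random subset of panels instead of enumerating them all gives the same estimate in practice.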
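
Slide 27 refers to the ITU rejection algorithm used with ACR. The sketch below follows the observer screening of ITU-R BT.500 as commonly described (a per-sequence kurtosis test selecting a 2-sigma or sqrt(20)-sigma band, then per-observer counts P and Q checked against the 0.05 and 0.3 thresholds); it is a simplified, single-session reading, not the exact procedure used in the study.

```python
import numpy as np

def bt500_screening(scores):
    """ITU-R BT.500-style observer screening (simplified sketch).

    scores: (observers, sequences) matrix of votes.
    Returns a boolean mask of observers to keep.
    """
    n_obs, n_seq = scores.shape
    mean = scores.mean(axis=0)
    std = scores.std(axis=0, ddof=1)
    # Kurtosis beta2 = m4 / m2^2 per sequence; 2..4 means roughly normal.
    m2 = ((scores - mean) ** 2).mean(axis=0)
    m4 = ((scores - mean) ** 4).mean(axis=0)
    beta2 = m4 / m2 ** 2
    c = np.where((beta2 >= 2) & (beta2 <= 4), 2.0, np.sqrt(20.0))
    p = (scores >= mean + c * std).sum(axis=1)  # votes far above the panel
    q = (scores <= mean - c * std).sum(axis=1)  # votes far below the panel
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.abs(p - q) / (p + q)
    reject = ((p + q) / n_seq > 0.05) & (np.nan_to_num(ratio) < 0.3)
    return ~reject

rng = np.random.default_rng(1)
scores = rng.normal(60, 12, size=(28, 24))  # hypothetical ACR votes
print("validated observers:", bt500_screening(scores).sum())
```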