Comparison of subjective test methodologies VQEG Berlin meeting June 2009 P. Le Callet, R. Pépion

Comparison of subjective test methodologies

VQEG Berlin meeting June 2009

P. Le Callet, R. Pépion

Context, methodologies and issues

Context:
- HRCs (coder, processing, transmission …)
- Resolutions
- Applications and services

Methodologies:
- ACR (5, 11 categories …)
- Pair Comparison
- SAMVIQ
- DSCQS

Issue: the value (e.g. accuracy, stability) of a protocol might depend on the context and the targeted goals.

Outline

• Study 1: ACR 5 vs SAMVIQ, HD, H264
• Study 2: Preference Tests vs SAMVIQ, «processing»
• Study 3: ACR5 vs ACR 11 vs SAMVIQ, encoded + processing
• Study 4: ACR5 for encoded + transmission error

Study 1: ACR 5 vs SAMVIQ HD

Motivations: HDTV quality is high and spans a short range => the quality measure should be precise and discriminative

Absolute Category Rating (ACR)
- random order
- only one viewing
- category scale (… Good)
- no explicit reference

Subjective Assessment Methodology for Video Quality (SAMVIQ)
- user-driven order
- multiple viewing (natural?)
- continuous scale
- explicit reference

Previous and new studies

[Brotherton, 2006] correlation on CIF (352x288):

CC(MOS_ACR, MOS_SAMVIQ) = 0.94

New studies:
- Resolutions: QVGA, VGA and HD 1080i50 (viewing distance according to the resolution)
- HRC: coding artefacts only (H264 AVC and SVC)

Resolution   Visual field   CC(MOS_ACR, MOS_SAMVIQ)   RMSDiff
QVGA         13°            0.969                     6.73
VGA          19°            0.942                     9.31
HDTV         33°            0.899                     14.06

ACR and SAMVIQ seem to provide “equivalent” results up to a certain resolution.
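The two indicators reported per resolution (correlation and RMS difference between the MOS of both methods) can be sketched as follows. This is a minimal illustration, not the authors' processing: the MOS values are hypothetical, and the linear mapping of the 1..5 ACR scale onto 0..100 before taking the RMS difference is an assumption.

```python
import math

def pearson_cc(x, y):
    """Pearson linear correlation coefficient between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rms_diff(x, y):
    """Root mean square of the per-item differences between two score lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def acr5_to_100(mos):
    """Assumed linear mapping of the 1..5 ACR scale onto the 0..100 SAMVIQ scale."""
    return (mos - 1.0) * 25.0

# Hypothetical MOS values for five PVS (not data from the study).
mos_acr = [1.5, 2.4, 3.1, 3.9, 4.6]
mos_samviq = [15.0, 32.0, 50.0, 71.0, 88.0]

acr_rescaled = [acr5_to_100(m) for m in mos_acr]
cc = pearson_cc(mos_acr, mos_samviq)
rms = rms_diff(acr_rescaled, mos_samviq)
print(f"CC = {cc:.3f}, RMSDiff = {rms:.2f}")
```

The correlation is insensitive to the linear rescaling, whereas the RMS difference depends on it, which is why the mapping has to be stated with any RMSDiff figure.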

Accuracy vs Number of observers

[Figure: confidence interval (0 to 14) as a function of the number of observers (5 to 30), for SAMVIQ and ACR; the value 24 is annotated on the plot]

Stéphane Péchard, Romuald Pépion and Patrick Le Callet, « Suitable methodology in subjective video quality assessment: a resolution dependent paradigm », Proceedings of the Third International Workshop on Image Media Quality and its Applications, IMQA2008, Chiba, Japan, September 2008
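How the confidence interval shrinks with the number of observers can be sketched by resampling observer subsets. This is an illustration under assumptions: the votes are synthetic, the 95% interval uses the normal approximation (1.96 x standard deviation / sqrt(N)), and the helper names are hypothetical.

```python
import math
import random
import statistics

def ci95_half_width(scores):
    """Half-width of the 95% confidence interval of the mean (normal approximation)."""
    return 1.96 * statistics.stdev(scores) / math.sqrt(len(scores))

def ci_vs_observers(scores, sizes, draws=200, seed=0):
    """Average CI half-width over random observer subsets of each requested size."""
    rng = random.Random(seed)
    result = {}
    for n in sizes:
        widths = [ci95_half_width(rng.sample(scores, n)) for _ in range(draws)]
        result[n] = sum(widths) / draws
    return result

# Synthetic votes of 30 observers for one PVS on a 0..100 scale (not study data).
vote_rng = random.Random(1)
votes = [min(100.0, max(0.0, vote_rng.gauss(60.0, 15.0))) for _ in range(30)]

curve = ci_vs_observers(votes, sizes=[5, 10, 15, 20, 25, 30])
for n, width in curve.items():
    print(n, round(width, 2))
```

Running the same procedure on the votes of two methodologies, expressed on a common scale, gives the number of observers each one needs to reach a given precision.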

Study 2: Preference Test vs SAMVIQ « processing »

Motivations: HDTV pre/post processing, comparison between formats on a 1080p display => no other impairments

Processing chain (diagram), from 1080p SRC to 1080p PVS:
- Pre-processing (interlacing and down-scaling): 1080i, 720p
- Pre-processing (deinterlacing + down-scaling): 720p
- Post-processing (deinterlacing + up-scaling)
- Post-processing (up-scaling)

Study 2: some results

[Figure: SAMVIQ MOS (0 to 100) per condition: explicit reference 1080p50, hidden reference 1080p50, 1080i50 TDeint, 720p (source 1080p50) Lanczos, 720p (source 1080p50) NN, 720p (source 1080i50) Lanczos, 720p (source 1080i50) NN, low anchor]

[Figure: preference (-3 to 1.5) per condition: 1080i50 TDeint, 720p50 (source 1080p50) Lanczos, 720p50 (source 1080p50) NN, 720p50 (source 1080i50) Lanczos, 720p50 (source 1080i50) NN, 720p50 native]

Preference test: 1080p SRC compared to other PVS, 7-category preference test.

Generally good agreement, but further analysis is required (Thurstone-Mosteller, CI …).
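As an indication of what a Thurstone-Mosteller analysis involves, here is a minimal Case V sketch working from a pairwise win-count matrix. It assumes the 7-category votes have first been reduced to binary preferences and that every pair of conditions has been compared at least once; the counts are hypothetical and the function name is illustrative.

```python
from statistics import NormalDist

def thurstone_case_v(wins):
    """Thurstone-Mosteller Case V scale values from a pairwise win-count matrix.

    wins[i][j] is the number of times condition i was preferred over condition j.
    Returns zero-mean scale values; a larger value means a more preferred condition.
    """
    n = len(wins)
    inv = NormalDist().inv_cdf
    scale = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue
            total = wins[i][j] + wins[j][i]
            p = wins[i][j] / total
            # Clip unanimous proportions so the inverse normal CDF stays finite.
            p = min(max(p, 0.5 / total), 1.0 - 0.5 / total)
            z_sum += inv(p)
        # Least-squares Case V solution: row mean of z-scores (diagonal counted as 0).
        scale.append(z_sum / n)
    return scale

# Hypothetical counts for three conditions judged by 20 observers per pair.
wins = [
    [0, 14, 18],
    [6, 0, 13],
    [2, 7, 0],
]
scale = thurstone_case_v(wins)
print([round(s, 3) for s in scale])
```

The resulting scale is an interval scale with an arbitrary origin, so it can be compared with SAMVIQ MOS through correlation but not through absolute values.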

Study 3: ACR5 vs ACR 11 vs SAMVIQ, encoded + processing

Motivations: comparison of 1080p50 with other HD and SD formats on a 1080p display => compression + processing

Compression: H264 coder. All formats (e.g. 1080p or i, 720p …) are coded at 3, 6 and 9 Mb/s and decoded before post processing.

Processing: all formats are displayed in 1080p50 after decoding.
- 1 deinterlacer: Smooth (VirtualDub/Avisynth)
- 2 upscalers: Bilinear and Lanczos (VirtualDub/Avisynth)

Study 3: PVS generation

[Diagram: formats 1080i50, 720p50, 1280x1080i50, 1280x1080p50 and SD; processing blocks Deint, Upscale 1 and Upscale 2; bitrates 3 Mb/s, 6 Mb/s and 9 Mb/s (9 Mb/s not for SD)]

29 HRC (8x3 HD + 2x2 SD + 1 Ref) x 3 SRC = 87 PVS
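The HRC and PVS counts can be checked with a few lines of arithmetic. The reading of "8x3" as 8 HD processing chains at 3 bitrates and of "2x2" as 2 SD chains at 2 bitrates is an interpretation of the slide; the variable names are illustrative.

```python
# Interpretation of "8x3 HD + 2x2 SD + 1 Ref" (assumed reading of the slide).
hd_chains = 8      # HD format/processing combinations
hd_bitrates = 3    # 3, 6 and 9 Mb/s
sd_chains = 2      # SD processing combinations
sd_bitrates = 2    # 9 Mb/s is not used for SD
references = 1     # reference condition
n_src = 3          # source contents

n_hrc = hd_chains * hd_bitrates + sd_chains * sd_bitrates + references
n_pvs = n_hrc * n_src
print(n_hrc, n_pvs)  # prints: 29 87
```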

ACR5 vs ACR11: correlation

Correlation between ACR 5 and ACR 11: 0.98 (5 contents)

[Figure: scatter plot of MOS ACR 5 levels (1 to 5) against MOS ACR 11 levels (0 to 10), with an equivalence line; contents: Canal, DepartCross, FootRennes, Manege, Stockholm]
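An equivalence line between the two scales can be sketched as a linear mapping between their end points. This is an assumption about how such a line is defined, not a statement of how the plotted line was computed; the function names are hypothetical.

```python
def acr11_to_acr5(mos11):
    """Map a 0..10 ACR 11 score linearly onto the 1..5 ACR 5 scale (assumed mapping)."""
    return 1.0 + 0.4 * mos11

def to_percent(score, low, high):
    """Express a score as a percentage of its scale range, for cross-scale comparison."""
    return 100.0 * (score - low) / (high - low)

# Points lying above the line were rated relatively higher on the 5-level scale.
for mos11 in (0, 2.5, 5, 7.5, 10):
    print(mos11, acr11_to_acr5(mos11))
```

A PVS whose ACR 5 MOS sits far from `acr11_to_acr5` of its ACR 11 MOS is one where the two scales disagree, even if the overall correlation is high.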

Study 3: SAMVIQ vs ACR11, PVS generation

[Diagram: formats 1080i50, 720p50, 1280x1080i50, 1280x1080p50 and SD; processing blocks Deint, Upscale 1 and Upscale 2]

10 HRC (8 HD + 1 SD + 1 Ref) x 2 SRC = 20 PVS

Study 3: ACR11 vs SAMVIQ (on 20 PVS)

• Good correlation between ACR and SAMVIQ (0.97) => may be questionable for high quality scores

4 contents

[Figure: scatter plot of MOS SAMVIQ (0 to 100) against MOS ACR 11 levels (0 to 10), with an equivalence line; contents: Canal, DepartCross, FootRennes, Manege]

Study 3: score distribution

[Figure: score distributions for ACR5, ACR11 and SAMVIQ; axes graduated from 0 to 70 and from 0 to 100]

Study 3: CI distribution

Study 4: ACR5 encoded + transmission error

• Goal: analyse the relation between the position of the transmission error and the MOS on SD sequences.

• Each content is coded at 4 or 6 Mb/s, and several simulated transmission errors are tested.

• Advanced FEC and error concealment technique (ROI based)

Study 4: ACR5 encoded + transmission error

x 14 HRC (transmission errors) = 84 PVS

Study 4: ACR5 encoded + transmission error

Reminder: coding only (study 3)