Objective and Subjective Degradations of Transcoded
Voice for Heterogeneous Radio Networks Interoperability
Ľubica Blašková1, Jan Holub1, Michael Street2, Filip Szczucki2 and Ondřej Tomíška1
1 FEE CTU, Prague; 2 NATO C3 Agency, The Hague
Presentation Outline
• Voice Transcoding: Issue of Modern Heterogeneous Networks
• Speech Transmission Quality Measurements
• Experiments Performed
• Results
• Conclusions
Voice Transcoding I
• Effective voice communications will remain a key service for those operating in a tactical environment
• Multinational operations routinely require different tactical communication systems from different nations to connect together, or for wired networks to connect to wireless sub-systems
• Different networks apply differing voice encoding methods to the voice signal
• It is recognised that use of multiple voice coders in series degrades the quality and intelligibility of the resulting voice
Voice Transcoding II
In today's complex telecommunication chains, speech is often encoded several times in one communication direction. Examples:
• GSM-to-GSM call: GSM (FR/EFR/HR) – G.711 – GSM (FR/EFR/HR)
• GSM-to-DECT call: GSM-G.711-DECT
• UMTS-to-PSTN: AMR-(G.729)-G.711
• Skype-to-GSM (typ.): Skype-G.729-GSM
• etc.
Voice Transcoding III
For both ad-hoc and permanently interoperating (special) networks, even more coder types must be taken into account:
• TETRA (ACELP)
• STANAG 4591 (MELPe)
Problem: the lower the bit rate, the higher the risk that the coders will not interoperate satisfactorily
Speech Transmission Quality
Perceived connection quality is influenced by many transmission impairments (delay, echo, various kinds of noise, speech (de)coding distortions and artifacts, temporal and amplitude clipping, ...). It is assessed and measured in MOS (Mean Opinion Score):
5 Excellent
4 Good
3 Fair
2 Poor
1 Bad
• Listening / Conversational Tests (ITU-T P.800)
• Intrusive Algorithmic Methods (P.862 PESQ): a dedicated test call must be established
• Non-intrusive Algorithmic Methods (P.563 3SQM): real voice samples acquired on real connections are processed
• Estimation Algorithmic Methods (P.564, E-model): jitter, delay, packet loss etc. are mapped to MOS
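To illustrate the estimation approach, the sketch below converts an E-model transmission rating factor R into an estimated MOS using the conversion published in ITU-T G.107. The function name `r_to_mos` and the lower clamp to 1.0 are assumptions made for this example. Computing R itself from delay, packet loss and codec impairments is not shown.

```python
def r_to_mos(r: float) -> float:
    """Map an E-model rating factor R to an estimated MOS (ITU-T G.107 conversion)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
    # Assumption for this sketch: clamp the small dip below 1.0 that the
    # polynomial produces for very low R values.
    return max(1.0, mos)


if __name__ == "__main__":
    for r in (50.0, 70.0, 93.2):
        print(f"R = {r:5.1f} -> estimated MOS = {r_to_mos(r):.2f}")
```

The mapping is nonlinear: the same change in R shifts the estimated MOS far more in the middle of the scale than near either end.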
Experiments Performed
Speech database recording:
• Two noise conditions: no background noise, and Hoth noise at +10 dB SNR
• 2 male, 2 female speakers
• ACELP, MELPe, G.729, GSM FR coding (different typical combinations)
• 5 recordings per combination per speaker
Subjective Testing
• ITU-T P.800 methodology
• 38 untrained listeners
• Listening chamber: <190 ms, <10 dB SPL (A)
• Results shown for the "no noise" condition
Technology            MOS-LQSn   CI95%
ACELP                 4.25       0.166
MELPe                 2.21       0.184
GSM                   3.64       0.198
G.729                 4.00       0.188
ACELP-MELPe           1.30       0.121
MELPe-ACELP           2.89       0.199
ACELP-GSM             3.68       0.199
GSM-ACELP             3.85       0.191
ACELP-G.729           4.22       0.204
G.729-ACELP           3.63       0.203
MELPe-G.729           3.44       0.200
G.729-MELPe           2.46       0.168
MELPe-GSM             3.08       0.197
GSM-MELPe             2.33       0.189
MELPe-G.729-ACELP     3.13       0.203
ACELP-G.729-MELPe     1.75       0.182
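A MOS value and its 95% confidence interval, as listed in the table, can be derived from individual listener votes. The following is a minimal sketch assuming a normal approximation (half-width 1.96 · s / √n). The slides do not state how the authors computed CI95%, and the votes in the example are invented for illustration only.

```python
import math
import statistics


def mos_with_ci95(votes):
    """Return (mean opinion score, 95% CI half-width) for a list of 1..5 votes."""
    n = len(votes)
    mean = statistics.mean(votes)
    # Sample standard deviation (n - 1 in the denominator).
    s = statistics.stdev(votes)
    # Assumption: normal approximation, z = 1.96 for a two-sided 95% interval.
    half_width = 1.96 * s / math.sqrt(n)
    return mean, half_width


if __name__ == "__main__":
    # Hypothetical votes on the 5-point scale, not data from the experiment.
    example_votes = [4, 5, 4, 3, 5, 4, 4, 5]
    mos, ci = mos_with_ci95(example_votes)
    print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Because the half-width shrinks with √n, halving the interval requires roughly four times as many votes per condition.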
Objective Testing (PESQ-LQ, 3SQM)
[Figure: two plots of objective against subjective scores, both axes from 1 to 5.
Panel "PESQ-LQ after 2nd-order regression": y-axis MOS-LQon (PESQ-LQ, regr.), x-axis MOS-LQsn.
Panel "3SQM (no regression)": y-axis MOS-LQon (3SQM, no regr.), x-axis MOS-LQsn.]
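A second-order regression of objective scores onto the subjective scale can be sketched as a least-squares quadratic fit. This is an illustration only: the slides do not give the authors' fitting procedure, the direction of the fit (objective score as input, subjective MOS as target) is assumed, the helper names are hypothetical, and the sample values are synthetic rather than experimental data.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y ~ c0 + c1*x + c2*x**2 via the normal equations."""
    # Power sums of x build the 3x3 normal matrix; the right-hand side
    # holds the sums of x**i * y.
    s = [sum(xi ** k for xi in x) for k in range(5)]
    a = [[s[i + j] for j in range(3)] for i in range(3)]
    b = [sum((xi ** i) * yi for xi, yi in zip(x, y)) for i in range(3)]
    n = 3
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(a[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (b[row] - tail) / a[row][row]
    return coeffs


def apply_quadratic(coeffs, score):
    """Map a raw objective score through the fit, clamped to the 1..5 MOS scale."""
    c0, c1, c2 = coeffs
    # Assumption: the clamp to [1, 5] is a choice made for this sketch.
    return min(5.0, max(1.0, c0 + c1 * score + c2 * score * score))


if __name__ == "__main__":
    # Synthetic (objective, subjective) pairs, invented for illustration.
    objective = [1.2, 1.8, 2.5, 3.1, 3.6, 4.2]
    subjective = [1.3, 1.6, 2.4, 3.3, 3.9, 4.3]
    coeffs = fit_quadratic(objective, subjective)
    print([round(apply_quadratic(coeffs, o), 2) for o in objective])
```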
Objective Testing II (PESQ-LQ, left: male voices, right: female voices)
[Figure: two plots of PESQ-LQ (regressed) on the y-axis against MOS-LQSn on the x-axis, both axes from 1 to 5. Panels: "Male Voices" and "Female Voices".]
Results
                          PESQ (P.862 + P.862.1, regressed)   3SQM (P.563)
Correlation               0.836                               0.370
Maximum pos. difference   1.060                               3.288
Maximum neg. difference   -1.550                              -2.722
RMSE                      0.560                               1.072
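The four figures of merit in the table can be computed from paired subjective and objective scores as in the sketch below. It uses the Pearson correlation coefficient and assumes the difference is taken as objective minus subjective; the slides do not state the sign convention or the correlation type, and the function name is hypothetical.

```python
import math


def agreement_metrics(subjective, objective):
    """Compare objective MOS estimates with subjective MOS values."""
    n = len(subjective)
    # Assumption: difference is objective minus subjective.
    diffs = [o - s for s, o in zip(subjective, objective)]
    mean_s = sum(subjective) / n
    mean_o = sum(objective) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(subjective, objective))
    var_s = sum((s - mean_s) ** 2 for s in subjective)
    var_o = sum((o - mean_o) ** 2 for o in objective)
    return {
        "correlation": cov / math.sqrt(var_s * var_o),  # Pearson r
        "max_pos_diff": max(diffs),
        "max_neg_diff": min(diffs),
        "rmse": math.sqrt(sum(d * d for d in diffs) / n),
    }


if __name__ == "__main__":
    # Invented example values, not data from the experiment.
    subj = [1.0, 2.0, 3.0, 4.0]
    obj = [1.5, 2.0, 2.5, 4.5]
    for name, value in agreement_metrics(subj, obj).items():
        print(f"{name}: {value:.3f}")
```

Correlation alone can hide a systematic offset, which is why the maximum differences and the RMSE are reported alongside it.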
Conclusions:
• All tandem setups perform with decreased speech transmission quality
• Both directions (coders "A-to-B" and "B-to-A") must always be tested, as the results can differ significantly
• Neither PESQ-LQ nor 3SQM can be used reliably for objective voice QoS monitoring when multiple coders are used in tandem and at least one of them is a low bit-rate coder. However, PESQ-LQ after proper regression shows at least reasonable correlation with subjective data (0.84)
• In our experiment, male and female transmitted voices were subjectively rated almost equally
• Objective methods underestimate MOS scores for female voice transmissions
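The conclusion about testing both directions can be checked against the two-coder MOS-LQSn values reported in the subjective testing table. The sketch below pairs each "A-B" chain with its "B-A" counterpart and ranks the pairs by absolute MOS gap. The helper name `direction_gaps` is hypothetical; the scores are copied from the table (no-noise condition).

```python
# MOS-LQSn per (first coder, second coder), from the subjective testing table.
TANDEM_MOS = {
    ("ACELP", "MELPe"): 1.30,
    ("MELPe", "ACELP"): 2.89,
    ("ACELP", "GSM"): 3.68,
    ("GSM", "ACELP"): 3.85,
    ("ACELP", "G.729"): 4.22,
    ("G.729", "ACELP"): 3.63,
    ("MELPe", "G.729"): 3.44,
    ("G.729", "MELPe"): 2.46,
    ("MELPe", "GSM"): 3.08,
    ("GSM", "MELPe"): 2.33,
}


def direction_gaps(scores):
    """Return [((a, b), |MOS(a-b) - MOS(b-a)|), ...] sorted by largest gap first."""
    gaps = []
    for (first, second), mos_forward in scores.items():
        # Visit each unordered pair once (alphabetical order of coder names).
        if first < second and (second, first) in scores:
            gap = abs(mos_forward - scores[(second, first)])
            gaps.append(((first, second), gap))
    return sorted(gaps, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for (a, b), gap in direction_gaps(TANDEM_MOS):
        print(f"{a} / {b}: MOS gap between directions = {gap:.2f}")
```

With these values the ACELP / MELPe pair shows the largest asymmetry (1.30 in one direction against 2.89 in the other), well beyond the listed confidence intervals.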
Thank you for your attention !
http://measure.feld.cvut.cz
www.mesaqin.com