Page 1: ETSI Codecs

ETSI/SMG11 "Speech Aspects"

Presentation of SMG11 Activities to Tiphon

Page 2: ETSI Codecs

Outline

• SMG11

• GSM Speech Codecs

• GSM Enhanced Full Rate Codec

• Tandem Free Operation

• Adaptive Multi-Rate (AMR) Codec

• Narrowband AMR

• Wideband AMR

• UMTS Matters

• Next Meetings

Page 3: ETSI Codecs

SMG11

• ETSI STC SMG11 is the competent body responsible for speech aspects of the GSM and UMTS standards (since 1996)

• SMG11 Chairman: Mr. Kari Järvinen, Nokia

• SMG11 plenary meets four times a year

• Additional extraordinary meetings as needed

• SMG11 currently consists of three sub-groups

• TFO sub-group (Tandem Free Operation issues)

• AMR sub-group (Adaptive Multi-Rate codec issues)

• SQ sub-group (Speech Quality issues)

• Sub-groups may have ad-hoc meetings between SMG11 plenary meetings

• Typical attendance at SMG11 plenary meetings is between 30 and 50 participants

• The SMG11 e-mail reflector and the sub-group reflectors are used extensively between the meetings

Page 4: ETSI Codecs

GSM Speech Codecs

• GSM has so far standardised three codecs

• 13 kbps GSM-FR (1987); good cellular quality and robust operation in the presence of background noise

• 5.6 kbps GSM-HR (1994); possibility for higher system capacity at the expense of slightly lower speech quality in some conditions (particularly in background noise)

• 12.2 kbps GSM-EFR (1996); high quality even exceeding the G.726 "wireline reference" under clear channel conditions and in background noise

• SMG11 is currently in the process of defining an Adaptive Multi-Rate (AMR) codec, which will be the fourth GSM speech codec

| Subjective speech quality | GSM FR | GSM HR | GSM EFR | No coding |
|---|---|---|---|---|
| Clean conditions (MOS) | 3.71 | 3.85 | 4.43 | 4.61 |
| Vehicle noise (DMOS) | 3.83 | 3.45 | 4.25 | 4.42 |
| Street noise (DMOS) | 3.92 | 3.56 | 4.18 | 4.35 |

Source: TR 06.85 v5.0.0 (1998-07), "Subjective tests on the interoperability of the HR/FR/EFR speech codecs; single, tandem and tandem free operation"

Page 5: ETSI Codecs

GSM EFR Codec

• Selected as a basis for a new high quality speech service for PCS 1900 in the US in 1995 (formal standardization procedure in TIA and T1 completed in 1996)

• ETSI standardized the same codec for GSM in 1996

• Provides high quality speech service for GSM, GSM 1800 (DCS 1800), and GSM 1900 (PCS 1900) systems in all continents

• Technical summary

• Source coding rate 12.2 kbps (channel coding 10.6 kbps)

• Based on the Algebraic CELP (ACELP) algorithm

• Speech frame size and algorithmic delay 20 ms

• Optional VAD/DTX function with comfort noise generation

• Example implementation for error concealment

• Complexity (encoder/decoder) approximately 18 MIPS (processor dependent)

• Memory requirement (incl. RAM and ROM) approximately 16-19k 16-bit words
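The rate figures in the technical summary can be cross-checked with a little arithmetic. The sketch below assumes the 22.8 kbps gross rate of the GSM full-rate channel (the figure quoted on the AMR slide of this presentation) together with the 20 ms frame; the function and variable names are illustrative and are not taken from any GSM specification.

```python
# Illustrative arithmetic only; names are not from any GSM specification.
FRAME_MS = 20              # EFR speech frame size (from the slide)
SOURCE_RATE_KBPS = 12.2    # EFR source coding rate (from the slide)
GROSS_RATE_KBPS = 22.8     # GSM full-rate channel gross rate (from the AMR slide)


def bits_per_frame(rate_kbps: float, frame_ms: int = FRAME_MS) -> int:
    """Bits carried in one frame: kbit/s multiplied by ms gives bits."""
    return round(rate_kbps * frame_ms)


source_bits = bits_per_frame(SOURCE_RATE_KBPS)    # speech bits per 20 ms frame
gross_bits = bits_per_frame(GROSS_RATE_KBPS)      # channel bits per 20 ms frame
channel_coding_kbps = round(GROSS_RATE_KBPS - SOURCE_RATE_KBPS, 1)

print(source_bits, gross_bits, channel_coding_kbps)
```

Under these assumptions each 20 ms frame carries 244 source bits out of 456 channel bits, and the difference between the gross and source rates reproduces the 10.6 kbps channel coding figure on the slide.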

Page 6: ETSI Codecs

GSM EFR Speech Quality

• GSM EFR speech quality is characterized in the ETSI Technical Report "Performance Characterisation of the GSM EFR speech codec", GSM 06.55.

• Additional performance data can be found in the ETSI Technical Report "Subjective tests on the interoperability of the HR/FR/EFR speech codecs; single, tandem and tandem free operation", GSM 06.85

• The GSM EFR codec has been included in numerous other formal and informal subjective listening tests, and extensive test data is available

• The examples in the following slides are an extract of test results from COMSAT Laboratories obtained during the PCS 1900 EFR codec standardization, comparing

• 12.2 kbps EFR codec

• 32 kbps G.726 codec

• 8 kbps G.729

• (13 kbps GSM FR codec)

Page 7: ETSI Codecs

GSM EFR Performance

• Basic speech quality at different input levels and tandeming

| Test condition | G.726 at 32 kbit/s | GSM EFR | G.729 |
|---|---|---|---|
| Clean speech, high level, -16 dBOL (MOS) | 3.7 | 3.8 | 3.4 |
| Clean speech, medium level, -26 dBOL (MOS) | 3.6 | 3.6 | 3.3 |
| Clean speech, low level, -36 dBOL (MOS) | 3.0 | 2.9 | 2.7 |
| Self-tandem = codec-codec tandem (MOS) | 3.1 | 3.4 | 2.9 |
| Tandem with G.726 at 32 kbit/s (MOS) | 3.2 | 3.6 | 3.3 |

Page 8: ETSI Codecs

GSM EFR Performance

• Performance in background noise

| Test condition | G.726 at 32 kbit/s | GSM EFR | G.729 |
|---|---|---|---|
| Background noise, Home noise 20 dB (DMOS) | 4.5 | 4.6 | 4.3 |
| Background noise, Car noise 10 dB (DMOS) | 4.4 | 4.5 | 3.9 |
| Background noise, Car noise 20 dB (DMOS) | 4.6 | 4.6 | 4.1 |
| Background noise, Street noise 10 dB (DMOS) | 3.7 | 4.1 | 3.7 |
| Background noise, Office noise 20 dB (DMOS) | 4.3 | 4.5 | 3.7 |

Page 9: ETSI Codecs

GSM EFR Performance

• Performance in error conditions

| Test condition | Frame error rate | BER class 2 | GSM FR | GSM EFR |
|---|---|---|---|---|
| Clean speech, No errors (MOS) | 0.0% | ≈0% | 3.4 | 4.1 |
| Clean speech, 13 dB C/I, 30 mph (MOS) | ≈0.0% | ≈2% | 3.3 | 4.0 |
| Clean speech, 10 dB C/I, 30 mph (MOS) | ≈0.5% | ≈4% | 3.0 | 3.8 |
| Clean speech, 7 dB C/I, 30 mph (MOS) | ≈3.0% | ≈8% | 2.3 | 3.2 |

• In GSM, part of the coded bits are protected by a convolutional code, and residual errors are detected via a CRC. The frame error rate for this part is indicated above. The rest of the data is unprotected and is subject to the class 2 BER indicated above.

• The frame error rates are not directly comparable to quality figures with no residual errors

Page 10: ETSI Codecs

Tandem Free Operation (TFO)

• Motivation: "Unnecessary" dual speech encoding and decoding in mobile-to-mobile calls can significantly decrease speech quality

• TFO avoids the speech decoding and re-encoding otherwise performed in the network

• Applicable to all three GSM codecs (FR, HR, and EFR)

• The same speech codec must be used in both mobile stations for TFO to work

• TFO standardization is ongoing in the ETSI SMG11 TFO sub-group

• Work started (TFO sub-group established) in early 1996

• Target: specifications ready by 4Q/1998 (ETSI GSM release 98)

• Current work is concentrating on completing four Annexes to the Stage 3 description: in-band signalling, operation with In-Path Equipment (IPE), SDL definition, test vectors

• The TFO Stage 3 GSM 04.53 will be forwarded to the SMG#27 plenary in October 1998

• Formal subjective tests to evaluate the audible effects of TFO signalling are being carried out by Coherent

Page 11: ETSI Codecs

MS-to-MS Call, no TFO

[Figure: MS-to-MS call without TFO. Path from MSa (A-side PLMN) to MSb (B-side PLMN) through a BSS, TRAU and MSC on each side. Speech is carried as 8 or 16 kbit/s voice coded speech and as 64 kbit/s PCM coded speech, with an encoding and a decoding step on each side.]

Page 12: ETSI Codecs

Effect of Tandeming

MOS value

| Speech codec | One encoding and decoding (normal) | Two encodings and decodings (tandem) |
|---|---|---|
| Enhanced Full Rate | 4.43 | 4.29 |
| Full Rate | 3.71 | 3.13 |
| Half Rate | 3.85 | 3.15 |

Source: TR 06.85 v2.0.0 (1998-06), "Subjective tests on the interoperability of the HR/FR/EFR speech codecs; single, tandem and tandem free operation"

Note: The above results are from clean conditions (no background noise, no channel errors)

Page 13: ETSI Codecs

Effect of Tandeming in Error Conditions

MOS value

| Speech codec | One encoding and decoding (normal) | Two encodings and decodings (tandem) |
|---|---|---|
| Enhanced Full Rate | 4.12 | 3.45 |
| Full Rate | 3.41 | 2.64 |
| Half Rate | 3.68 | 2.77 |

Source: TR 06.85 v2.0.0 (1998-06), "Subjective tests on the interoperability of the HR/FR/EFR speech codecs; single, tandem and tandem free operation"

Note: EP1 error condition was used (moderate errors).

Page 14: ETSI Codecs

Effect of Tandeming in Background Noise

MOS value

| Speech codec | One encoding and decoding (normal) | Two encodings and decodings (tandem) |
|---|---|---|
| Enhanced Full Rate | 4.25 | 3.87 |
| Full Rate | 3.83 | 3.34 |
| Half Rate | 3.45 | 2.38 |

Source: TR 06.85 v2.0.0 (1998-06), "Subjective tests on the interoperability of the HR/FR/EFR speech codecs; single, tandem and tandem free operation"

Note: Vehicle noise of 10 dB was used.
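To make the tandeming penalty easier to compare across the three test conditions, the short sketch below subtracts the tandem score from the single-encoding score for each codec. The scores are copied from the three tandeming slides; the dictionary layout and the function name `tandem_loss` are illustrative choices, not part of TR 06.85.

```python
# Scores copied from the three tandeming slides; each entry is
# (one encoding and decoding, two encodings and decodings).
TANDEM_SCORES = {
    "clean": {"EFR": (4.43, 4.29), "FR": (3.71, 3.13), "HR": (3.85, 3.15)},
    "errors (EP1)": {"EFR": (4.12, 3.45), "FR": (3.41, 2.64), "HR": (3.68, 2.77)},
    "vehicle noise": {"EFR": (4.25, 3.87), "FR": (3.83, 3.34), "HR": (3.45, 2.38)},
}


def tandem_loss(condition: str, codec: str) -> float:
    """Score lost when a second encoding/decoding stage is added."""
    single, tandem = TANDEM_SCORES[condition][codec]
    return round(single - tandem, 2)


for condition, codecs in TANDEM_SCORES.items():
    losses = {codec: tandem_loss(condition, codec) for codec in codecs}
    print(f"{condition}: {losses}")
```

In these figures the EFR codec loses the least in every condition, and the half-rate codec in vehicle noise loses the most, which is the degradation that TFO is meant to avoid.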

Page 15: ETSI Codecs

TFO Modes

• Two modes in TFO

• Establishment mode: the necessary conditions for TFO are verified with inaudible bit stealing

• Verify whether both transcoders support TFO

• Possible change of speech codecs to enable TFO

• Duration typically 0.5-1.0 seconds

• TFO mode: speech is transmitted compressed through the whole network with bit stealing that guarantees smooth transitions in all situations

• TFO includes the proper means to ensure TFO also when In Path Equipment such as Echo Cancellers and DCMEs are used in the fixed network

Page 16: ETSI Codecs

MS-to-MS Call, with TFO

[Figure: MS-to-MS call with TFO. Path from MSa (A-side PLMN) to MSb (B-side PLMN) through a BSS, TRAU and MSC on each side. Labels on the path: 8 or 16 kbit/s and 56 or 48 kbit/s, with encoding and decoding steps.]

Page 17: ETSI Codecs

TFO Mode

Bit layout of each PCM sample (X = PCM coded speech, Y = voice coded speech):

X X X X X X X Y : 56 kbit/s PCM coded speech, 8 kbit/s voice coded speech

X X X X X X Y Y : 48 kbit/s PCM coded speech, 16 kbit/s voice coded speech

• Coded speech is transmitted in the LSBs of the PCM samples on the A interface, together with the decoded PCM samples

• Both types of speech representation (PCM and coded) are available at the receiving end

• Minor speech degradation in a TFO to non-TFO transition due to bit stealing (increased noise), when the 48/56 kbit/s speech samples are used for a very short period
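The bit-stealing idea on this slide can be sketched in a few lines. The example below only illustrates overwriting the one or two least significant bits of 8-bit PCM samples with coded-speech bits; it is not the TFO frame format, which also has to carry synchronisation and signalling. The function names, the MSB-first bit order within each sample and the sample values are assumptions made for illustration, as is the standard 8000 samples/s rate of 64 kbit/s PCM.

```python
SAMPLES_PER_SECOND = 8000  # 64 kbit/s PCM = 8000 samples/s of 8 bits each


def embed_coded_bits(pcm_samples, coded_bits, n_lsb):
    """Overwrite the n_lsb least significant bits of each 8-bit PCM sample."""
    if n_lsb not in (1, 2):
        raise ValueError("n_lsb must be 1 (8 kbit/s) or 2 (16 kbit/s)")
    if len(coded_bits) != len(pcm_samples) * n_lsb:
        raise ValueError("need exactly n_lsb coded bits per PCM sample")
    mask = (1 << n_lsb) - 1
    out = []
    for i, sample in enumerate(pcm_samples):
        chunk = 0
        for bit in coded_bits[i * n_lsb:(i + 1) * n_lsb]:
            chunk = (chunk << 1) | bit          # MSB-first order (assumed)
        out.append((sample & ~mask & 0xFF) | chunk)
    return out


def extract_coded_bits(samples, n_lsb):
    """Recover the stolen bits at the receiving end."""
    bits = []
    for sample in samples:
        for shift in range(n_lsb - 1, -1, -1):
            bits.append((sample >> shift) & 1)
    return bits


def rates_kbps(n_lsb):
    """(PCM rate, coded-speech rate) in kbit/s for n_lsb stolen bits."""
    return (8 - n_lsb) * SAMPLES_PER_SECOND // 1000, n_lsb * SAMPLES_PER_SECOND // 1000


pcm = [0b10110101, 0b01100110, 0b11110000, 0b00001111]   # made-up samples
coded = [1, 0, 0, 1, 1, 1, 0, 0]                         # two bits per sample
mixed = embed_coded_bits(pcm, coded, n_lsb=2)
print(extract_coded_bits(mixed, 2) == coded, rates_kbps(1), rates_kbps(2))
```

Because only the low-order bits are replaced, the upper bits of every sample still form a usable (slightly noisier) PCM signal, which is why both representations remain available to the receiver.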

Page 18: ETSI Codecs

Adaptive Multi-Rate Codec

• Source codec rates probably between 4 kbit/s and 14.4 kbit/s (no fixed source rate requirements)

• Operation in both GSM full rate (22.8 kbps) and half rate (11.4 kbps) channels

• Main advantages in GSM

• Increased robustness against channel errors

• Enhanced quality in the half-rate channel in good channel conditions

• Codec rate selected dynamically depending on radio conditions and local capacity requirements

• Codec bit rate selected by an adaptation algorithm specific to the system application, e.g. GSM or UMTS

• Generic speech codec applicable to many mobile systems

• High AMR performance targets and the flexibility obtained by the switchable codec bit-rates (modes) have made it an interesting candidate for UMTS and IMT-2000.

• Ability to adapt the bit-rate in a wide range may also be of interest for VoIP applications
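The following sketch illustrates the general idea of rate adaptation described above: the better the radio conditions, the more of the fixed gross channel rate can be given to source coding instead of channel coding. Everything specific in it is hypothetical, including the mode names, source rates and C/I thresholds, since the slides state that no source rates were fixed and that the adaptation algorithm is system specific. Only the 22.8 kbps full-rate gross rate comes from the slide.

```python
from dataclasses import dataclass

FR_GROSS_KBPS = 22.8  # GSM full-rate channel gross rate (from the slide)


@dataclass(frozen=True)
class CodecMode:
    name: str
    source_rate_kbps: float   # hypothetical value, for illustration only
    min_ci_db: float          # hypothetical switching threshold


# Hypothetical mode set; the real AMR modes were not yet selected.
HYPOTHETICAL_FR_MODES = [
    CodecMode("high", 12.0, 13.0),
    CodecMode("medium", 8.0, 7.0),
    CodecMode("low", 5.0, float("-inf")),   # always usable as a fallback
]


def select_mode(ci_db, modes=HYPOTHETICAL_FR_MODES):
    """Pick the highest-rate mode whose C/I threshold is met."""
    usable = [m for m in modes if ci_db >= m.min_ci_db]
    return max(usable, key=lambda m: m.source_rate_kbps)


def channel_coding_kbps(mode, gross_kbps=FR_GROSS_KBPS):
    """Share of the gross rate left for error protection."""
    return round(gross_kbps - mode.source_rate_kbps, 1)


for ci in (16.0, 10.0, 3.0):
    mode = select_mode(ci)
    print(ci, mode.name, mode.source_rate_kbps, channel_coding_kbps(mode))
```

A practical adaptation algorithm would also need hysteresis and a channel quality estimate rather than a known C/I; this sketch leaves both out to keep the trade-off visible.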

Page 19: ETSI Codecs

Adaptive Multi-Rate Codec Schedule

• Qualification testing has been completed on schedule

• Substantial improvements demonstrated, justifying the AMR technique

• 5 codecs advanced to the selection phase

• Good expectation that all, or nearly all, requirements will be met

• Selection phase to end by September 1998

• The AMR speech codec specifications are planned to be completed by December 1998

• The AMR codec will be selected from among the five proposals that passed the qualification phase

• Alcatel/BT/Cellnet/France Telecom/Nortel/Rockwell

• Ericsson/Nokia 1

• Ericsson/Nokia 2

• Lucent

• NEC

Page 20: ETSI Codecs

Delivery Dates of AMR Specifications

Target date: December 1998 (required)

• source codec

• channel codec

• bad frame handling

• in-band signalling of codec mode - transmission aspects and definition of parameters

• in-band signalling of channel metric and side information - transmission aspects (bit allocation and channel protection)

Target date: December 1998 (objective)

• VAD/DTX/comfort noise generation

• definition of channel metric and side information parameters

• example of codec mode adaptation

• layer 3 signalling

Target date: June 1999

• AMR TRAU frames

• channel performance tables (GSM 05.05)

• TFO

• test sequences

Target date: December 1999

• performance characterisation

• [minimum performance of adaptation algorithms]

Page 21: ETSI Codecs

AMR Speech Quality Requirements

• Static error conditions: without background noise

| C/I | Full-rate channel: ideal case performance (requirement) | Full-rate channel: worst case performance (objective) | Half-rate channel: ideal case performance (requirement) | Half-rate channel: worst case performance (objective) |
|---|---|---|---|---|
| no errors | EFR no errors | G.728 no errors | G.728 no errors | FR no errors |
| 19 dB | EFR no errors | G.728 no errors | G.728 no errors | FR no errors |
| 16 dB | EFR no errors | G.728 no errors | G.728 no errors | FR at 10 dB |
| 13 dB | EFR no errors | G.728 no errors | FR at 13 dB | FR at 7 dB |
| 10 dB | G.728 no errors | EFR at 10 dB | FR at 10 dB | FR at 4 dB |
| 7 dB | G.728 no errors | EFR at 7 dB | FR at 7 dB | |
| 4 dB | EFR at 10 dB | EFR at 4 dB | FR at 4 dB | |

Table 1a: Clean speech requirements and objectives under static test conditions.

Page 22: ETSI Codecs

AMR Speech Quality Requirements

• Static error conditions: in the presence of background noise

| C/I | Full-rate channel: ideal case performance (requirement) | Full-rate channel: worst case performance (objective) | Half-rate channel: ideal case performance (requirement) | Half-rate channel: worst case performance (objective) |
|---|---|---|---|---|
| no errors | EFR no errors | G.729 and FR no errors | better than G.729 and FR no errors | G.729 and FR no errors |
| 19 dB | EFR no errors | G.729 and FR no errors | better than G.729 and FR no errors | G.729 and FR no errors |
| 16 dB | EFR no errors | G.729 and FR no errors | better than G.729 and FR no errors | FR at 10 dB |
| 13 dB | EFR no errors | G.729 and FR no errors | FR at 13 dB | FR at 7 dB |
| 10 dB | G.729 and FR no errors | FR at 10 dB | FR at 10 dB | FR at 4 dB |
| 7 dB | G.729 and FR no errors | FR at 7 dB | FR at 7 dB | |
| 4 dB | FR at 10 dB | FR at 4 dB | FR at 4 dB | |

Table 1b: Background noise requirements and objectives under static test conditions.

Page 23: ETSI Codecs

AMR Speech Quality Requirements

• Dynamic conditions (no background noise):

Full-Rate Channel

• Requirement: Same or better than the EFR under the same conditions, and also the same or better than all the AMR full rate tested modes under the same conditions

• Objective 1: Same or better than the EFR using the error pattern + 3 dB

• Objective 2: Same or better than the EFR using the error pattern + 6 dB

Table 2a: Requirements and objectives under dynamic test conditions for the full-rate channel

Half-Rate Channel

• Requirement: Same or better than the FR under the same conditions, and also the same or better than all the AMR half rate tested modes under the same conditions

• Objective 1: Same or better than the FR on a full rate channel using the error pattern + 3 dB

• Objective 2: Same or better than the FR on a full rate channel using the error pattern + 6 dB

Table 2b: Requirements and objectives under dynamic test conditions for the half-rate channel

Page 24: ETSI Codecs

AMR Design Constraints

• Some AMR design constraints (simplified to a general form)

• Only very moderate complexity increase compared to existing GSM codecs

• Maximum source coding rate for FR channel modes is 14.4 kbit/s (due to 16 kbit/s sub-multiplexing)

• In-band signalling for codec modes. Independent adaptation on the up- and down-links.

• The AMR codec shall support Tandem Free Operation

• The AMR codec shall support DTX operation

• The AMR codec and its control will operate without any changes to the air-interface channel multiplexing, with the possible exception of the interleave depth.

• It shall be possible to operate power control independently of the AMR adaptation. Not included in qualification and selection tests.

Page 25: ETSI Codecs

AMR Design Constraints

• Some AMR design constraints (continued)

• Codec mode control relating to capacity or radio link quality should be located in the network (BSS).

• Transmission delay: The total algorithmic round trip delay is limited to that of EFR + 10 ms in the AMR FR channel, and HR + 10 ms in the AMR HR channel.

• Frame size: 5 ms, 10 ms or 20 ms

• The AMR in-band signalling shall be expandable to signal the use of future AMR modes, including signalling the use of the existing GSM FR, GSM HR and GSM EFR speech coders, one or two wideband modes and all AMR speech codec modes in FR channel mode (to guarantee proper TFO operation).
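Two of the design constraints, the 14.4 kbit/s source rate ceiling and the permitted frame sizes, translate directly into a per-frame bit budget. The sketch below works that out and compares it with the 16 kbit/s sub-multiplexed rate quoted as the reason for the ceiling. What the remaining 1.6 kbit/s is used for is not stated on the slides, so the code reports it only as unallocated headroom; the function names are illustrative.

```python
MAX_FR_SOURCE_KBPS = 14.4        # ceiling from the design constraints
SUBMUX_KBPS = 16.0               # sub-multiplexed rate quoted as its reason
ALLOWED_FRAME_MS = (5, 10, 20)   # permitted frame sizes


def max_source_bits(frame_ms: int) -> int:
    """Largest number of source bits per frame at the 14.4 kbit/s ceiling."""
    if frame_ms not in ALLOWED_FRAME_MS:
        raise ValueError(f"frame size must be one of {ALLOWED_FRAME_MS} ms")
    return round(MAX_FR_SOURCE_KBPS * frame_ms)


def submux_headroom_bits(frame_ms: int) -> int:
    """Bits per frame left on a 16 kbit/s link above the source-rate ceiling."""
    if frame_ms not in ALLOWED_FRAME_MS:
        raise ValueError(f"frame size must be one of {ALLOWED_FRAME_MS} ms")
    return round((SUBMUX_KBPS - MAX_FR_SOURCE_KBPS) * frame_ms)


for frame_ms in ALLOWED_FRAME_MS:
    print(frame_ms, max_source_bits(frame_ms), submux_headroom_bits(frame_ms))
```

With a 20 ms frame, for example, the ceiling corresponds to 288 source bits per frame out of the 320 bits that a 16 kbit/s link carries in the same time.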

Page 26: ETSI Codecs

Qualification tests

• The expected performance of the AMR candidates was evaluated in qualification tests

• Tests conducted in FR and HR channels, including

• Clear speech: no errors and C/I 19 dB to 1 dB

• Speech in background noise with channel errors: street noise (@15 dB SNR), car noise (@15 dB SNR)

• Tandeming

• Speech level dependency

• Switching between codec modes

• Dynamic C/I: 5 error profiles (3 profiles for downlink test, 2 profiles for uplink test)

Page 27: ETSI Codecs

Overall performance aims

[Figure: performance curves plotted against C/I (dB) with ideal frequency hopping. Curves: AMR-FR envelope, AMR-HR envelope, EFR, HR.]

Introduce improvements where they are needed

• low C/I in FR mode

• high C/I in HR mode.

Page 28: ETSI Codecs

Qualification results overview

• Major benefits of the AMR technique demonstrated, especially

• low C/I in FR mode (1 - 2 delta MOS)

• high C/I in HR mode (same as G.728 - wireline)

• dynamic conditions in FR mode (up to 1.6 delta MOS)

• Several codecs close to meeting all the requirements

• Most challenging condition - background noise in HR mode

Page 29: ETSI Codecs

Static C/I - examples

[Figure: two panels, "Experiment 1a - Family of Curves" and "Experiment 1b - Family of Curves", for the FR channel and the HR channel. MOS (scale 1.00 to 5.00) against conditions: No Errors @ -26 dBovl, then C/I = 19, 16, 13, 10, 7, 4 and 1 dB. Curves: Rate A, Rate B, Rate C, Spec.]

Page 30: ETSI Codecs

Background noise; static C/I - examples

[Figure: two panels on a MOS scale of 1.00 to 5.00. "Experiment 2a - Family of Curves in FR" (FR channel; conditions FR No Errors, FR EC16, FR EC10, FR EC4; curves Rate A, Rate B, Rate C, Spec. FR) and "Experiment 2a - Family of Curves in HR" (HR channel; conditions HR No Errors, HR EC19, HR EC13, HR EC7; curves Rate A, Rate B, Rate C, Spec. HR). Failed conditions are marked.]

Page 31: ETSI Codecs

Dynamic C/I - examples

• Dynamic test designed to evaluate AMR performance in a "realistic" radio environment with codec adaptation turned on

• Consistent results demonstrated by all candidates

• adaptation mechanism finds best codec mode

• in FR mode, significant improvement compared to fixed rate codec reference, EFR (up to 1.6 delta MOS)

• in HR mode, quality equivalent to GSM FR or better (improvement sensitive to dynamic profile)

[Figure: two panels of MOS against dynamic error condition (DEC1 to DEC5). "Experiment 4a Test Results" (typical result in FR; legend: Ytest, EFR, Rate C, Rate B, Rate A, Rate D) and "Experiment 4b Test Results" (typical result in HR; legend: Ytest, FR, Rate A, Rate B, Rate C, Rate D).]

Page 32: ETSI Codecs

Examples of dynamic conditions

• Dynamic error profiles from Radio Simulator (SMG2)

• One minute long

• Up and down links

• Correlation of C/I between up and down links controlled

[Figure: four panels plotted against time [s] (0 to 60). "C and I profile etsiq3" and "C and I profile etsiq11" show C and I [dBm] for DL C, UL C, DL I and UL I. "C/(I+N) profile etsiq3" and "C/(I+N) profile etsiq11" show C/(I+N) [dB] for DL and UL.]

Page 33: ETSI Codecs

Wideband AMR

• The narrowband AMR work will continue with the specification of a wideband mode

• No target date for finalized specification yet

• Feasibility phase on-going

• Discussion on Design Constraints and Recommended audio bandwidth

• Preliminary working assumption for optimum audio bandwidth (to be confirmed)

• 100 Hz to 7 kHz (possibly also 100 Hz to 5 kHz)

• In some types of background noise, advantages to reducing low frequencies

• So far, there has been little activity on wideband AMR due to the work load on the narrowband AMR

• Several organisations indicated they are studying wideband AMR.

• Results probably not available until the end of 1998.

Page 34: ETSI Codecs

UMTS Matters

• Liaisons with ARIB (Japan)

• Set-up collaboration on UMTS/IMT-2000 speech coding matters

• ARIB representatives attending SMG11 meetings

• AMR in UMTS and IMT-2000

• Working assumption for UMTS (decision from SMG#26, subject to re-evaluation after the AMR selection)

• A possible candidate for IMT-2000 in ARIB, if standardized on schedule

• WCDMA simulations

• Initial simulation results with the GSM EFR codec and the AMR concept in a WCDMA channel have been presented to SMG11

Page 35: ETSI Codecs

New Work Item: Noise Suppression

• A new Work Item on Noise Suppression with AMR was approved by SMG in June

• Optional DSP feature to reduce audio background noise

• Can improve ease of conversation

• Located ahead of the speech codec

• Effective in many but not all background noise environments

• Optimised for the AMR speech codec

• Standardisation to guarantee minimum performance level

• The work has not started yet, and the scope of the work and possible standardization has not been fully defined and agreed to

Page 36: ETSI Codecs

Next SMG11 Plenary Meetings

• SMG11#7: 28 September - 2 October 1998; Sophia Antipolis; host Texas Instruments

• SMG11#8: 11 - 15 January 1999

• SMG11#9: 3 - 5 June 1999