

08.19 D6895e4 Subject to change

Data Sheet

VoCAS (Code 6985) Voice Control Analysis System

Overview

Automatic Speech Recognition (ASR) systems find widespread application in telephone networks, cars, mobile appliances, multimedia and IoT devices such as smart speakers. ASR performance depends mainly on the variety of talkers, languages and dialects as well as on background noise. Testing with many variations is therefore beneficial for developing and improving ASR systems. On the other hand, more variation makes testing more time-consuming and exact reproduction more difficult.

The voice control analysis system VoCAS was developed by HEAD acoustics to organize and automate testing of ASR systems and devices. It offers easy setup and allows comprehensive, variation-rich testing under realistic and fully reproducible conditions.

Description

Assessing an ASR system’s performance requires a vast selection of input sample data and subsequent evaluation of the processed results. VoCAS covers both points in an easy-to-use software solution. An HMS artificial head measurement system simulates a talker to the ASR system, e.g. a smart speaker. The user selects the spoken sentences, chooses from a wide range of test conditions and designs the result evaluation.

As a starting point, VoCAS provides a small demo audio database with seven talkers and two spoken languages. When desired, the software also records new sentences and imports existing audio data. Files are organized in databases and can be edited and tagged extensively.

Building custom test sequences

Test sequences in VoCAS are visualized as a top-down flow diagram. The user can add and rearrange sequence elements as well as alter their respective settings to create the desired test scenario. In most cases, ‘Playback’ acts as a central element while the surrounding actions set up the test and evaluation conditions. The selection of what to play is made in a clearly arranged multiple choice list.

VoCAS automatically generates the full set of test cycles resulting from the current selection and lists it in the parameter set list below the flow sequence. Every list entry contains all respective test attributes. Any parameter set can be deselected to skip undesired test runs.
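The way a selection expands into a parameter set list can be sketched as follows. This is an illustration only, not VoCAS code: it assumes that each combination of selected attribute values yields one test cycle, and the attribute names and values are hypothetical.

```python
from itertools import product

# Hypothetical attribute selection; names and values are illustrative only.
selection = {
    "talker": ["female_1", "male_1"],
    "language": ["en", "de"],
    "noise": ["none", "cafe"],
}

def build_parameter_sets(selection):
    """Return one dict per combination of the selected attribute values."""
    names = list(selection)
    combos = product(*(selection[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]

parameter_sets = build_parameter_sets(selection)

# Deselect undesired runs, e.g. skip German sentences in cafe noise.
active = [
    p for p in parameter_sets
    if not (p["language"] == "de" and p["noise"] == "cafe")
]

print(len(parameter_sets), "parameter sets,", len(active), "selected")
```

The number of test cycles grows multiplicatively with each added attribute, which is why deselecting individual parameter sets is useful for keeping test runs short.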

Adding background noise and room reverberation

Besides the talker, ASR performance depends heavily on how interfering sounds are handled. To test this, VoCAS offers full integration of the HEAD acoustics background noise simulation systems 3PASS lab and 3PASS flex (or HAE-BGN / HAE-car) as well as the reverberation simulation 3PASS reverb. In this combination, VoCAS enables fully automated testing in lifelike conditions while retaining precise repeatability.

Evaluating the results

To judge the performance of the device under test (DUT), VoCAS supports user-created evaluation dialogs. The evaluation summary is displayed in a widely configurable, dedicated window. Evaluation results can be exported to Microsoft Excel for further processing.
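As an illustration of the kind of summary that can be derived from such verdicts, the following sketch computes a pass rate per test condition. The verdict data and the function name are hypothetical; this is not the VoCAS evaluation window or its export format.

```python
from collections import defaultdict

# Hypothetical verdicts: (noise condition, verdict entered after playback).
results = [
    ("none", "pass"), ("none", "pass"), ("none", "fail"),
    ("cafe", "pass"), ("cafe", "fail"), ("cafe", "fail"), ("cafe", "fail"),
]

def pass_rate_by_condition(results):
    """Return the fraction of 'pass' verdicts for each condition."""
    counts = defaultdict(lambda: [0, 0])  # condition -> [passes, total]
    for condition, verdict in results:
        counts[condition][1] += 1
        if verdict == "pass":
            counts[condition][0] += 1
    return {cond: passes / total for cond, (passes, total) in counts.items()}

summary = pass_rate_by_condition(results)
for condition, rate in summary.items():
    print(f"{condition}: {rate:.0%}")
```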

Exemplary test sequence (top), selected playback attributes (right side) and the resulting parameter set (bottom) in VoCAS

Exemplary VoCAS test setup for a smart speaker including 3PASS flex to simulate background noise (figure labels: ASR, labCORE, labBGN, USB, Pulse, channels 1 to 8)

Ebertstraße 30a
52134 Herzogenrath, Germany
Tel.: +49 2407 577-0
Fax: +49 2407 577-99
Email: [email protected]
Web: www.head-acoustics.com

Virtualizing with simulation mode

When real-time acoustic testing is not an option, VoCAS can also virtualize test environments. Based on a one-time measurement of a test room’s impulse response, VoCAS can calculate the acoustic sum at the DUT position (talker, background noise, reverberation) and save the result into audio files. With the test room’s acoustics taken into account via software, these files are lifelike representations of actual acoustic testing in this room. Combined with the multiple-choice attributes list to select the desired playback material combinations, VoCAS conveniently generates a large number of clearly sorted audio test files in arbitrary variants.

Each test cycle becomes a standalone file which can be fed directly into the ASR algorithm to evaluate its performance. This purely electronic processing runs significantly faster than real-time acoustic measurements. Additionally, new audio source files (e.g. newly recorded sentences) can be processed in simulation mode at any time without further access to the test environment.
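The data sheet does not describe the signal processing behind simulation mode. A common way to apply a measured room impulse response is linear convolution, and the sketch below assumes that approach: the talker signal is convolved with the impulse response and the background noise is added sample by sample. All function names and the toy sample values are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def simulate_at_dut(talker, noise, room_ir, noise_gain=1.0):
    """Sum of the reverberated talker and the background noise."""
    reverberated = convolve(talker, room_ir)
    length = max(len(reverberated), len(noise))
    # Zero-pad the shorter signal so both cover the same time span.
    padded_talker = reverberated + [0.0] * (length - len(reverberated))
    padded_noise = list(noise) + [0.0] * (length - len(noise))
    return [t + noise_gain * n for t, n in zip(padded_talker, padded_noise)]

# Toy data: short talker signal, direct sound plus one reflection, constant noise.
talker = [1.0, 0.5, 0.0]
room_ir = [1.0, 0.0, 0.5]
noise = [0.1, 0.1, 0.1, 0.1, 0.1]

mixed = simulate_at_dut(talker, noise, room_ir)
print(mixed)
```

Because the room enters only through the impulse response, the same measurement can be reused for any new talker or noise file, which matches the statement that new sentences can be processed without further access to the test environment.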

Key Features

Audio file sources

• Wizard-guided import of DAT, MP3, WAV, RAW and PCM audio files

• Recording of new sentences for custom testing

• Editing of audio files (level adjustment, trimming, splitting, FIR/IIR filtering etc.)

• Manual or automatic audio file tagging

Testing features

• Graphically presented, freely configurable test sequence flow chart

• Multiple-choice list for selecting playback attributes

• Customizable list of attributes for

– Playback configurations (incl. gain, Lombard effect)

– Background noise configurations

– Reverberation simulation configurations

• Supports Python™ scripts for further automation

• Result acquisition, presentation and export to Microsoft Excel

• Simulation mode to evaluate ASR algorithm performance without access to test environment

Remote controlling options

• Sample-accurate remote control of background noise simulation systems 3PASS lab, 3PASS flex, HAE-BGN or HAE-car

• Sample-accurate remote control of reverberation simulation system 3PASS reverb

• Software control interface for motorized turntable HRT I

• Remote control of basic VoCAS functions (e.g. start / stop test case) via REST server

Applications

• Flexible, objective and reproducible evaluation of the performance of Automatic Speech Recognition (ASR) systems

• Benchmarking of different ASR systems or ASR software versions

Flow sequence elements

VoCAS offers a diverse selection of configurable flow sequence elements to create arbitrary test scenarios. Each element performs one of the following actions:

• Displays a user-defined text message in a pop-up window and halts the test sequence until closed

• Operates the motorized turntable HRT I in user-defined rotation steps

• Activates audio monitoring function on the hardware platform

• Triggers background noise simulation (with 3PASS lab / flex or HAE-BGN / car) with the user-selected noises

• Triggers room reverberation simulation (with 3PASS reverb) of the user-selected room

• Initiates SPL measurement over a user-defined duration with a dedicated microphone

• Triggers recording of the test run with a dedicated microphone

• Initiates playback of the next sentence in the parameter set list

• Opens a user-configured dialogue to assess the ASR system’s performance (e.g. “pass” / “fail”) after each playback and halts the test sequence until closed

• Halts the test sequence for a selectable duration

• Processes a user-created Python™ script for further automation
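Conceptually, a test sequence is an ordered list of such elements that is executed for each entry of the parameter set list. The following sketch models that idea in plain Python. It is not the VoCAS scripting interface, and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class TestRun:
    """State of one test cycle (hypothetical structure)."""
    sentence: str
    log: list = field(default_factory=list)

# Each sequence element is modeled as a function acting on the current run.
def start_background_noise(run):
    run.log.append("background noise on")

def playback(run):
    run.log.append(f"playback: {run.sentence}")

def evaluation_dialog(run):
    run.log.append("evaluation dialog shown")

def run_sequence(sequence, sentences):
    """Execute every element of the sequence, top down, for each sentence."""
    runs = []
    for sentence in sentences:
        run = TestRun(sentence)
        for element in sequence:
            element(run)
        runs.append(run)
    return runs

sequence = [start_background_noise, playback, evaluation_dialog]
runs = run_sequence(sequence, ["turn on the light", "play some music"])
for run in runs:
    print(run.log)
```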


Exemplary VoCAS test setup for voice recognition systems in vehicles (figure labels: Left, Front, Right, Rear, Subwoofer, labBGN, labCORE, ASR, USB, Pulse)

Microsoft and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

General Requirements

Software

• Microsoft Windows 7 Professional, Windows 8/8.1 Pro or Windows 10 Pro (English or German version, including all current service packs)

Hardware

• PC with multi-core processor (1.6 GHz or faster), 4 GB RAM, 40 GB free disk space, 4 USB ports

• labCORE (Code 7700), modular multi-channel hardware platform, with labCORE modules

– coreBUS (Code 7710), I/O Bus Mainboard

– coreOUT-Amp2 (Code 7720), Output Module, Power Amplifier (2 Channels)

– coreIN-Mic4 (Code 7730), Input Module, Microphone (4 Channels)

• HMS II.3-33/-34 (Code 1230.1/2) or HMS II.6 (Code 1389), artificial head measurement system with pinna simulator

For preparation measurements

• Measurement microphone with LEMO 7-pin connector

Options

• HRT I (Code 6498), HEAD acoustics Remote-operated Turntable

Background noise simulation

• 3PASS lab (Code 6990), for testing at fixed microphone positions (e.g. mobile phones), including necessary system components (cf. separate data sheet) or

• 3PASS flex (Code 6995), for testing multi-microphone systems, microphone arrays, beamforming microphones, including necessary system components (cf. separate data sheet) or

• HAE-BGN (Code 6970), automated equalization for background noise simulation in labs, including necessary system components (cf. separate data sheet) or

• HAE-car (Code 6971), automated equalization for background noise simulation in car cabins, including necessary system components (cf. separate data sheet)

Reverberation simulation

• 3PASS reverb (Code 6996), option for 3PASS lab and 3PASS flex – simulation of reverberation scenarios

Delivery Items

VoCAS (Code 6985) comprises the following components:

• Setup DVD, including demo project and demo audio database

• USB dongle