Voices That Matter 2010 - Core Audio

Audio and OpenAL for iPhone Games

Kevin Avila
Registered HEX Offender

• About Me
• Core Audio
• iPhone Services
• OpenAL
• Tips & Tricks

Introduction

About Me

email: eddienull@me.com
twitter: eddienull


Core Audio

Core Audio: Why?

"Easy" and "CoreAudio" can't be used in the same sentence. CoreAudio is very powerful, very complex, and under-documented. Be prepared for a steep learning curve, APIs with millions of tiny little pieces, and puzzling things out from sample code rather than reading high-level documentation.

– Jens Alfke, coreaudio-api list, Feb 9, 2009

• Problem domain is hard

• Performance is hard

• Low latency is hard

• Reusability is hard


• Doing without would suck

• Slowness would suck

• Latency would suck

• Non-reusability would suck


Theory

How it Works

Core Audio: Introduction

• Overview of Core Audio
• Terminology
• Fundamental Concepts
  • ASBD
  • Properties
• Fundamental API
  • AudioFormat
  • AudioConverter
  • AudioFile

Terminology

Core Audio: Terminology

• Sample: a data point for one channel
• Frame: the set of samples presented at one time, one sample per channel
  • 1 for mono
  • 2 for stereo
  • 4 for quad
• Packet: a collection of frames

The Basics

Core Audio Overview: The Canonical Format

• AudioSampleType
  • Used for I/O
  • 32-bit float (Mac)
  • 16-bit integer (iPhone)
• AudioUnitSampleType
  • Used for DSP
  • 32-bit float (Mac)
  • 8.24 fixed (iPhone)
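To make the "8.24 fixed" entry concrete, here is a minimal sketch of converting samples to and from a 32-bit value with 8 integer bits and 24 fractional bits. The helper names are hypothetical and are not part of Core Audio; the sketch also assumes float input stays within roughly -128.0 to 128.0 so the result fits in 32 bits.

```c
#include <stdint.h>

/* Hypothetical helpers, not Core Audio API.
   8.24 fixed point: 8 integer bits, 24 fractional bits. */
#define FIXED_8_24_ONE 16777216 /* 1 << 24, the fixed-point value of 1.0 */

/* Convert a float sample (nominally -1.0 to 1.0) to 8.24 fixed point. */
int32_t float_to_fixed_8_24(float sample)
{
    return (int32_t)(sample * (float)FIXED_8_24_ONE);
}

/* Convert an 8.24 fixed-point sample back to float. */
float fixed_8_24_to_float(int32_t sample)
{
    return (float)sample / (float)FIXED_8_24_ONE;
}

/* Widen a 16-bit integer sample to 8.24. Full scale (32768) maps to
   1 << 24, so the scale factor is 1 << 9 = 512. Multiplying instead of
   shifting keeps the behavior well defined for negative samples. */
int32_t int16_to_fixed_8_24(int16_t sample)
{
    return (int32_t)sample * 512;
}
```

The extra integer bits give DSP code headroom above full scale before values overflow, which a plain 16-bit integer sample does not have.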

Core Audio Overview: AudioStreamBasicDescription (“ASBD”)

• Contains the minimal information needed to describe audio data
• Some formats may not use all of the fields
  • Unused fields need to be set to zero

    struct AudioStreamBasicDescription {
        Float64 mSampleRate;
        UInt32  mFormatID;
        UInt32  mFormatFlags;
        UInt32  mBytesPerPacket;
        UInt32  mFramesPerPacket;
        UInt32  mBytesPerFrame;
        UInt32  mChannelsPerFrame;
        UInt32  mBitsPerChannel;
        UInt32  mReserved;
    };
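The size-related ASBD fields follow directly from the sample, frame, and packet definitions. The sketch below works them out for interleaved, uncompressed PCM using a local stand-in struct, so it compiles without the Core Audio headers. PcmLayout and both function names are hypothetical illustrations, not Core Audio types.

```c
#include <stdint.h>

/* Local stand-in for the size-related ASBD fields (hypothetical type,
   not the real AudioStreamBasicDescription). */
typedef struct {
    double   sampleRate;
    uint32_t bytesPerPacket;
    uint32_t framesPerPacket;
    uint32_t bytesPerFrame;
    uint32_t channelsPerFrame;
    uint32_t bitsPerChannel;
} PcmLayout;

/* Fill in the layout for interleaved, uncompressed PCM. */
PcmLayout make_interleaved_pcm_layout(double sampleRate,
                                      uint32_t channels,
                                      uint32_t bitsPerChannel)
{
    PcmLayout layout = {0};
    layout.sampleRate       = sampleRate;
    layout.channelsPerFrame = channels;
    layout.bitsPerChannel   = bitsPerChannel;
    /* One frame holds one sample per channel. */
    layout.bytesPerFrame    = channels * (bitsPerChannel / 8);
    /* Uncompressed PCM carries exactly one frame per packet. */
    layout.framesPerPacket  = 1;
    layout.bytesPerPacket   = layout.bytesPerFrame * layout.framesPerPacket;
    return layout;
}

/* Number of bytes needed to hold the given number of frames. */
uint64_t pcm_bytes_for_frames(const PcmLayout *layout, uint64_t frames)
{
    return frames * layout->bytesPerFrame;
}
```

For 16-bit stereo at 44100 Hz this gives 4 bytes per frame, so one second of audio occupies 176400 bytes.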

Core Audio Overview: Properties

• Key/value pairs used to describe and manipulate API attributes
• Scope and element selectors are used by some APIs to further qualify a property
• The value of a property can be of whatever type the API needs
• APIs share several common functions
  • GetProperty, SetProperty, and GetPropertyInfo
  • AddPropertyListener and RemovePropertyListener

Core Audio Overview: Scopes and Elements

• An element is the same as a bus

[Diagram labels: input scope element 0, output scope element 0, output scope element 1, global scope]

Core Audio Overview: AudioFormat

• Provides information about installed codecs
• Fills out ASBDs based on Format ID
• Provides more information about a format’s parameters

• ASBDs can be complicated, so let the system do the work for you!

    // Zero the struct first so unused fields are zero, then set what we know
    AudioStreamBasicDescription asbd = {0};
    asbd.mSampleRate = 44100.0;
    asbd.mFormatID = kAudioFormatMPEG4AAC;
    asbd.mChannelsPerFrame = 2;

    // Ask AudioFormat to fill in the remaining fields
    UInt32 propSize = sizeof(AudioStreamBasicDescription);
    AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &propSize, &asbd);

Core Audio Overview: AudioConverter

• Converts
  • bit depths
  • sample rate
  • interleaving & deinterleaving
  • channel ordering
  • PCM <-> compressed/encoded
• Can use all installed codecs

Core Audio Overview: AudioFile

• Parses a file and provides access to the raw data
  • Uses properties to query information about the file
• ExtendedAudioFile
  • High-level access to an audio file
  • Combines AudioFile + AudioConverter

For Example...

Core Audio Overview: Simple File Reading

    // Open the audio file
    ExtAudioFileRef inputFile = NULL;
    ExtAudioFileOpenURL(fileURL, &inputFile);

    // Get the file’s audio data format
    AudioStreamBasicDescription inputFileFormat;
    UInt32 propSize = sizeof(AudioStreamBasicDescription);
    ExtAudioFileGetProperty(inputFile, kExtAudioFileProperty_FileDataFormat, &propSize, &inputFileFormat);

    // Configure the output audio format to native canonical format
    AudioStreamBasicDescription outputFormat = {0};
    outputFormat.mSampleRate = inputFileFormat.mSampleRate;
    outputFormat.mFormatID = kAudioFormatLinearPCM;
    outputFormat.mFormatFlags = kAudioFormatFlagsCanonical;
    outputFormat.mChannelsPerFrame = inputFileFormat.mChannelsPerFrame;
    outputFormat.mBitsPerChannel = 16;

    // Let AudioFormat fill in the remaining fields (propSize is reused, not redeclared)
    propSize = sizeof(AudioStreamBasicDescription);
    AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &propSize, &outputFormat);

    // Set the desired decode data format
    ExtAudioFileSetProperty(inputFile, kExtAudioFileProperty_ClientDataFormat, sizeof(outputFormat), &outputFormat);

    // Get the total frame count
    SInt64 inputFileLengthInFrames = 0;
    propSize = sizeof(SInt64);
    ExtAudioFileGetProperty(inputFile, kExtAudioFileProperty_FileLengthFrames, &propSize, &inputFileLengthInFrames);

    // Read all the data into memory
    UInt32 dataSize = (UInt32)(inputFileLengthInFrames * outputFormat.mBytesPerFrame);
    void *theData = malloc(dataSize);
    AudioBufferList dataBuffer;
    dataBuffer.mNumberBuffers = 1;
    dataBuffer.mBuffers[0].mDataByteSize = dataSize;
    dataBuffer.mBuffers[0].mNumberChannels = outputFormat.mChannelsPerFrame;
    dataBuffer.mBuffers[0].mData = theData;

    // ExtAudioFileRead takes the frame count as an in/out UInt32
    UInt32 framesToRead = (UInt32)inputFileLengthInFrames;
    ExtAudioFileRead(inputFile, &framesToRead, &dataBuffer);

iPhone Services

iPhone Services Overview

• Audio Sessions
  • Categories
  • Interruptions
  • Routes
• Hardware Acceleration

Audio Session: Fundamental Concepts

• Describes an application’s interaction with the audio system
• Represents the current state of audio on the device
  • Current Settings
  • Preferences
• State Transitions
  • Interruptions
  • Route Changes

Audio Session: Settings & Preferences

• Session Settings
  • Influences all audio activity
    • Except UI sound effects
• Current Session Characteristics

Audio Session: Categories

• Identify a set of audio features for your application
  • Mixable with others
  • Have input or output
  • Silence on Screen Lock or Ringer Switch

Audio Session: Basic Setup

    // Get the session instance
    AVAudioSession *mySession = [AVAudioSession sharedInstance];

    // Implement delegates to handle notifications
    mySession.delegate = self;

    // Establish appropriate category
    [mySession setCategory:AVAudioSessionCategoryAmbient error:nil];

    // Activate the session
    [mySession setActive:YES error:nil];

Audio Session: Interruptions

• System forces session to ‘inactive’
  • Unable to resume until interrupt task is complete
• AVAudioSession delegates
  • -(void) beginInterruption
    • Update UI to reflect non-active audio state
  • -(void) endInterruption
    • Resume audio, update UI

Audio Session: Defining Interruption Delegates

    -(void) beginInterruption
    {
        if (isPlaying)
        {
            // Remember that playback was cut off so it can be resumed later
            wasInterrupted = YES;
            isPlaying = NO;
        }
    }

    -(void) endInterruption
    {
        if (wasInterrupted)
        {
            // Reactivate the session before restarting playback
            [[AVAudioSession sharedInstance] setActive:YES error:nil];
            [self startSound];
            wasInterrupted = NO;
        }
    }

Audio Session: Routes

• The pathway for audio signals
  • Where is audio output to?
  • Where is audio input from?
• “Last in Wins” rule
• Notification when route changes
  • Reason why the route changed

    AudioSessionAddPropertyListener(kAudioSessionProperty_AudioRouteChange, audioRouteChangeListenerCallback, self);

Audio Session: Defining a Property Listener Callback

    void audioRouteChangeListenerCallback(void *inUserData,
                                          AudioSessionPropertyID inPropertyID,
                                          UInt32 inPropertyValueSize,
                                          const void *inPropertyValue)
    {
        MyAudioController *controller = (MyAudioController *)inUserData;

        // Ignore any property other than a route change
        if (inPropertyID != kAudioSessionProperty_AudioRouteChange) return;

        if (controller.isPlaying != NO) {
            // The property value is a dictionary describing the change
            NSDictionary *routeChangeDictionary = (NSDictionary *)inPropertyValue;
            SInt32 routeChangeReason = [[routeChangeDictionary objectForKey:
                (NSString *)CFSTR(kAudioSession_AudioRouteChangeKey_Reason)] intValue];

            // Pause when the old output device (e.g. headphones) went away
            if (routeChangeReason == kAudioSessionRouteChangeReason_OldDeviceUnavailable) {
                [controller pause];
            }
        }
    }

Audio Session: Hardware Accelerated Codecs

• Low CPU cost, low power consumption
• Supported HW decoders:
  • AAC / HE-AAC
  • Apple Lossless
  • MP3
  • IMA4 (IMA/ADPCM)
• Supported HW encoders:
  • AAC (3GS)

Audio Session: Enabling Hardware Acceleration

• Must set “Mix With Others” to false
• Overrides are not persistent across category changes

    // Override our current category’s ‘mix with others’ attribute.
    // 0 (false) turns mixing off, as the first bullet requires.
    UInt32 value = 0;
    AudioSessionSetProperty(kAudioSessionProperty_OverrideCategoryMixWithOthers, sizeof(value), &value);

OpenAL

• Powerful API for 3D audio mixing
  • Designed for games
  • Cross-platform, used everywhere!
• Models audio in 3D space, as heard by a single listener
• Designed as a complement to OpenGL
  • Mimics OpenGL conventions
  • Uses the same coordinate system
• Implemented using Core Audio’s 3D Mixer AU

OpenAL: Fundamental Concepts

• Context
  • The spatial environment
  • Contains the listener
• Source
  • 3D point emitting audio
  • Many attributes to control rendering
• Buffer
  • Container for audio data
  • alBufferData() - copies data to internal buffers
  • alBufferDataStatic() - application owns memory

OpenAL: Architecture

[Diagram labels: OpenAL, iPhone Hardware, OpenAL Device, Context, Listener, Buffer ... Buffer n, Source ... Source n]

OpenAL: Listener

• Only 1 per context
• Positionable just like Sources
• Represents the user’s experience in the 3D environment
• Orientation described by two vectors:
  • AT = Direction the Listener is facing
  • UP = Direction pointing up from the top of the Listener’s head

    // Orient the Listener facing +Z
    ALfloat listenerOrientation[6] = {
        0.0, 0.0, 1.0,   // AT
        0.0, 1.0, 0.0    // UP
    };
    alListenerfv(AL_ORIENTATION, listenerOrientation);

OpenAL: Positioning

• Applies to Listener
• Applies to Sources (Mono-only)
• Cartesian coordinate system

    // alSourcefv takes the source name as its first argument
    ALfloat sourcePosition[] = { 0.0, 25.0, 0.0 };
    alSourcefv(mySource, AL_POSITION, sourcePosition);

    ALfloat listenerPosition[] = { 0.0, 2.0, 0.0 };
    alListenerfv(AL_POSITION, listenerPosition);

OpenAL: Cartesian Coordinates

• x:0, y:0, z:+1 = Listener facing positive Z
• x:0, y:0, z:-1 = Listener facing negative Z
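One way to see what flipping the facing direction does is to derive the listener's right-hand direction from the AT and UP vectors. The sketch below assumes OpenAL's right-handed coordinate system, where right is the cross product AT x UP; the helper names are hypothetical and this is not how any particular OpenAL implementation is written.

```c
/* Hypothetical helper: cross product of two 3-component vectors, out = a x b. */
void cross3(const float a[3], const float b[3], float out[3])
{
    out[0] = a[1] * b[2] - a[2] * b[1];
    out[1] = a[2] * b[0] - a[0] * b[2];
    out[2] = a[0] * b[1] - a[1] * b[0];
}

/* Derive the listener's right-hand direction from a 6-float orientation
   array laid out as AT (first three floats) followed by UP (last three). */
void listener_right(const float orientation[6], float right[3])
{
    cross3(&orientation[0], &orientation[3], right);
}
```

Under that assumption, a listener facing -Z with +Y up has +X on its right, while a listener facing +Z has -X on its right, so a source at positive X would be heard on the left.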

OpenAL: Basic Setup

    // Open an OpenAL Device
    oalDevice = alcOpenDevice(NULL);

    // Create a new OpenAL Context (and listener)
    oalContext = alcCreateContext(oalDevice, NULL);

    // Set our new context to be the current OpenAL Context
    alcMakeContextCurrent(oalContext);

OpenAL: Creating Buffers and Sources

    // Create an OpenAL buffer to hold our audio data
    alGenBuffers(1, &oalBuffer);

    // Fill the OpenAL buffer with data
    alBufferDataStatic(oalBuffer, AL_FORMAT_MONO16, audioData, audioDataSize, 44100);

    // Create an OpenAL Source object
    alGenSources(1, &oalSource);

    // Attach the OpenAL Buffer to the OpenAL Source
    alSourcei(oalSource, AL_BUFFER, oalBuffer);

Distance Attenuation

OpenAL: Attenuation by Distance

• Describes the reduction in volume based on the distance to the listener
• Set distance model on the context with alDistanceModel()
  • AL_INVERSE_DISTANCE
  • AL_INVERSE_DISTANCE_CLAMPED
  • AL_NONE
• Configure Source attributes
  • AL_REFERENCE_DISTANCE
  • AL_MAX_DISTANCE
  • AL_ROLLOFF_FACTOR
  • AL_SOURCE_RELATIVE

[Graph: gain plotted against distance from the Listener, with the reference distance and maximum distance marked]

    // Set the distance model to be used
    alDistanceModel(AL_INVERSE_DISTANCE_CLAMPED);

    // Set the Source’s Reference Distance
    alSourcef(mySource, AL_REFERENCE_DISTANCE, 2.0);

    // Set the Maximum Distance
    alSourcef(mySource, AL_MAX_DISTANCE, 30.0);

    // Set the Rolloff Factor
    alSourcef(mySource, AL_ROLLOFF_FACTOR, 2.0);
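To show how the three source attributes interact, here is a sketch of the gain calculation for the inverse-distance-clamped model as the OpenAL 1.1 specification describes it. The function name is hypothetical, and a real implementation may apply further gain limits that are not modeled here.

```c
/* Hypothetical helper: gain for the inverse-distance-clamped model.
   The distance is first clamped to [referenceDistance, maxDistance],
   then gain = ref / (ref + rolloff * (distance - ref)). */
float inverse_distance_clamped_gain(float distance,
                                    float referenceDistance,
                                    float maxDistance,
                                    float rolloffFactor)
{
    if (distance < referenceDistance) distance = referenceDistance;
    if (distance > maxDistance)       distance = maxDistance;
    return referenceDistance /
           (referenceDistance + rolloffFactor * (distance - referenceDistance));
}
```

With the values used above (reference distance 2.0, maximum distance 30.0, rolloff factor 2.0), the gain is 1.0 at or inside 2 units, about one third at 4 units, and stops falling once the source is 30 or more units away.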

The Doppler Effect

OpenAL: The Doppler Effect

• No Motion = No Doppler
  • Doppler only describes the warping of sound due to motion
• The default Doppler factor is 0.0 (disabled)
  • Enabling it has a small CPU cost
• The normal value is 1.0

    alDopplerFactor(1.0);

• Doppler velocity describes the speed of sound in your universe (units per second)

    // 1000 units per second
    alDopplerVelocity(1000);
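As a rough illustration of how the Doppler factor and the speed of sound combine, the sketch below follows the frequency-shift formula given in the OpenAL 1.1 specification, reduced to one dimension. It treats the 1000 units per second above as the speed of sound, omits the specification's velocity clamping, and assumes both velocities stay well below the speed of sound. The function name is hypothetical, and an actual implementation may compute the shift differently.

```c
/* Hypothetical helper: Doppler-shifted frequency in one dimension.
   Both velocities are measured along the direction from the source to the
   listener: a positive sourceVelocity means the source is approaching the
   listener, a positive listenerVelocity means the listener is moving away.
   Assumes dopplerFactor * sourceVelocity is less than speedOfSound. */
double doppler_shifted_frequency(double frequency,
                                 double speedOfSound,
                                 double dopplerFactor,
                                 double listenerVelocity,
                                 double sourceVelocity)
{
    return frequency * (speedOfSound - dopplerFactor * listenerVelocity)
                     / (speedOfSound - dopplerFactor * sourceVelocity);
}
```

With a factor of 0.0 both velocity terms vanish and the frequency is unchanged, which matches the "disabled" default; with no motion the result is also the original frequency, whatever the factor.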

OpenAL: Putting it all together

    - (void)initOpenAL
    {
        ALenum error;

        // Open the default device and create a context on it
        device = alcOpenDevice(NULL);
        if (device != NULL)
        {
            context = alcCreateContext(device, 0);
            if (context != NULL)
            {
                alcMakeContextCurrent(context);

                ALfloat listenerPosition[] = { 0.0, 2.0, 0.0 };
                alListenerfv(AL_POSITION, listenerPosition);

                alGenBuffers(1, &buffer);
                if ((error = alGetError()) != AL_NO_ERROR)
                    exit(1);

                alGenSources(1, &source);
                if (alGetError() != AL_NO_ERROR)
                    exit(1);
            }
        }

        [self initBuffer];
        [self initSource];
    }

    - (void) initBuffer
    {
        ALenum error = AL_NO_ERROR;
        ALenum format;
        ALsizei size;
        ALsizei freq;

        // Load the decoded audio data along with its size and sample rate
        data = MyGetOpenALAudioData(fileURL, &size, &freq);

        // Hand the data to OpenAL without copying; the application keeps ownership
        alBufferDataStatic(buffer, AL_FORMAT_MONO16, data, size, freq);

        if ((error = alGetError()) != AL_NO_ERROR)
        {
            NSLog(@"error attaching audio to buffer: %x\n", error);
        }
    }

    - (void) initSource
    {
        ALenum error = AL_NO_ERROR;

        // Attach the buffer and loop it
        alSourcei(source, AL_BUFFER, buffer);
        alSourcei(source, AL_LOOPING, AL_TRUE);

        float sourcePosAL[] = { sourcePos.x, kDefaultDistance, sourcePos.y };
        alSourcefv(source, AL_POSITION, sourcePosAL);

        alSourcef(source, AL_REFERENCE_DISTANCE, 5.0f);

        if ((error = alGetError()) != AL_NO_ERROR)
        {
            NSLog(@"Error attaching buffer to source: %x\n", error);
        }
    }

• You’re now ready to go!

    alSourcePlay(source);
    if ((error = alGetError()) != AL_NO_ERROR)
    {
        NSLog(@"error starting source: %x\n", error);
    }

Tips & Tricks

Sample Rates

• High-quality audio for laser *pew pew!* and *beeps* is unnecessary
  • Example: ‘afconvert -f caff -d LEI16@8000’
• The more SRCs (sample rate conversions), the less performance you get

What Next?

Coming soon eventually

The End
