2015 Pebble Developer Retreat Voice on Pebble Andrew Stapleton, Firmware Engineer

#PDR15 - Voice API

Page 1: #PDR15 - Voice API

2015 Pebble Developer Retreat

Voice on Pebble

Andrew Stapleton, Firmware Engineer

Page 2: #PDR15 - Voice API

Voice

• Intro
• Basic overview
• Dictation API - Intro
• Dictation API - Example app
• How it works
• Do’s and don’ts with the API
• Development Help

Page 3: #PDR15 - Voice API

Voice Overview

[Diagram: blocks labelled Microphone, MCU, PDM -> PCM, Speex Encoder, Dictation UI and Recognizer, with steps numbered 1 to 6]

Page 4: #PDR15 - Voice API

Examples

• General text input
• Voice notes
• Voice commands (tell your phone what to do)
• Translation tool
• Search/contextual query interface
• Messaging
• Twitter interface

Page 5: #PDR15 - Voice API

API - The Basics

DictationSession *dictation_session_create(uint32_t buffer_size,
    DictationSessionStatusCallback callback, void *callback_context);

typedef void (*DictationSessionStatusCallback)(DictationSession *session,
    DictationSessionStatus status, char *transcription, void *context);

DictationSessionStatus dictation_session_start(DictationSession *session);

Page 6: #PDR15 - Voice API

Dictation UI Flow

Page 7: #PDR15 - Voice API

Dictation UI Flow

[Figure: screens numbered 1 to 4, with the labels UI Started, Speech ends, User accepts and User rejects]

Page 8: #PDR15 - Voice API

Dictation UI Flow

[Figure: screens numbered 1 to 4 with an "x4" annotation; failure outcomes labelled FailureTranscriptionError, FailureSystemAborted and FailureSystemAborted]

Page 9: #PDR15 - Voice API

Dictation UI Flow

[Figure: screens numbered 1 to 3; failure outcomes labelled FailureConnectivityError and FailureDisabled]

Page 10: #PDR15 - Voice API

API - Advanced Usage

void dictation_session_enable_error_dialogs(DictationSession *session, bool is_enabled);

void dictation_session_enable_confirmation(DictationSession *session, bool is_enabled);

DictationSessionStatus dictation_session_stop(DictationSession *session);

Page 11: #PDR15 - Voice API

Dictation UI Flow

• No confirmation dialog

[Figure: screens numbered 1 to 3, with the labels UI Started and Speech ends]

Page 12: #PDR15 - Voice API

API - Demo

• Translation tool
  • Use dictation session to get text to be translated from user
  • Use Google Translate API to translate the text
  • Display response in the form of text to user

Page 13: #PDR15 - Voice API

#define BUFFER_SIZE (512)

// Single dictation session, created once in init() and reused
static DictationSession *session;

static void init(void) {
  session = dictation_session_create(BUFFER_SIZE, handle_dictation_result, NULL);
  if (!session) {
    APP_LOG(APP_LOG_LEVEL_ERROR, "No phone connected, platform is not supported or "
                                 "phone app does not support dictation APIs!");
  }
  // more initialization
}

Page 14: #PDR15 - Voice API

static void select_click_handler(ClickRecognizerRef recognizer, void *context) {
  // Start dictation when the user presses Select
  dictation_session_start(session);
}

static void handle_dictation_result(DictationSession *session, DictationSessionStatus status,
                                    char *transcription, void *context) {
  if (status == DictationSessionStatusSuccess) {
    // Release the previous result before building a new one
    if (dictation_result) {
      free(dictation_result);
    }
    const char *preamble = "ENG: ";
    size_t len = strlen(transcription);
    dictation_result = malloc(len + strlen(preamble) + 1);
    if (!dictation_result) {
      APP_LOG(APP_LOG_LEVEL_ERROR, "Out of memory for dictation result");
      return;
    }
    strcpy(dictation_result, preamble);
    strcat(dictation_result, transcription);
    text_layer_set_text(q_text_layer, dictation_result);
  } else {
    // handle errors
  }
}

Page 15: #PDR15 - Voice API

API - Demo

Page 16: #PDR15 - Voice API

API - Demo

Page 17: #PDR15 - Voice API

static void init(void) {
  session = dictation_session_create(BUFFER_SIZE, handle_dictation_result, NULL);
  if (!session) {
    APP_LOG(APP_LOG_LEVEL_ERROR, "No phone connected, platform is not supported or "
                                 "phone app does not support dictation APIs!");
  } else {
    // Only configure the session if it was actually created
    dictation_session_enable_confirmation(session, false /* is_enabled */);
  }
  // more initialization
}

Page 18: #PDR15 - Voice API

API - Demo

Page 19: #PDR15 - Voice API

How it Works - Microphone

[Diagram: blocks labelled Microphone, MCU, PDM -> PCM, Speex Encoder, Dictation UI and Recognizer, with steps numbered 1 to 6]

Page 20: #PDR15 - Voice API

How it Works - Microphone

• Single MEMS (microelectromechanical system) microphone
• PDM output @ ~1MHz
  • Pulse Density Modulation
  • 1 bit signal that encodes 16-bit data
• Pass 1 bit signal through decimation and low pass filter to convert to 16-bit PCM (Pulse code modulation) data at 16kHz

[Figure: Decimation + LPF]
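The decimation and low-pass step can be sketched in a few lines of C. This is not the watch firmware's filter: it is the simplest possible version, a block average over 64 bits, and real designs typically use a higher-order filter. The 64:1 ratio (a 1.024 MHz clock down to 16 kHz) and the function name pdm_to_pcm are assumptions made for illustration; the slides only give "~1MHz" and "16kHz".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed ratio: a 1.024 MHz PDM clock divided by 64 gives 16 kHz PCM.
 * The slides only say "~1MHz", so the exact figure is a guess. */
#define DECIMATION 64

/* Convert n_out * DECIMATION PDM bits (packed 8 per byte) into n_out
 * signed 16-bit PCM samples by averaging each block of bits. */
void pdm_to_pcm(const uint8_t *pdm, int16_t *pcm, size_t n_out) {
  assert(pdm != NULL && pcm != NULL);
  for (size_t i = 0; i < n_out; i++) {
    int ones = 0;
    /* Count the 1 bits in this block: the pulse density. */
    for (size_t b = 0; b < DECIMATION / 8; b++) {
      uint8_t byte = pdm[i * (DECIMATION / 8) + b];
      for (; byte != 0; byte >>= 1) {
        ones += byte & 1;
      }
    }
    /* Map density 0..64 onto -32768..32767 (top value clamped). */
    int32_t v = (int32_t)(2 * ones - DECIMATION) * (32768 / DECIMATION);
    if (v > INT16_MAX) {
      v = INT16_MAX;
    }
    pcm[i] = (int16_t)v;
  }
}
```

A block of all 1 bits maps to the maximum sample value, all 0 bits to the minimum, and an even mix to zero, which is how a 1 bit stream carries a multi-bit amplitude.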

Page 21: #PDR15 - Voice API

How it Works - Encoder

[Diagram: blocks labelled Microphone, MCU, PDM -> PCM, Speex Encoder, Dictation UI and Recognizer, with steps numbered 1 to 6]

Page 22: #PDR15 - Voice API

How it Works - Encoder

• Why encode?
  • Bluetooth throughput limited
• Why Speex?
  • Developed specifically for voice encoding
  • Outperforms most telephony codecs (compression ratio v quality)
  • Tunable quality
  • Recovers from dropped frames

Page 23: #PDR15 - Voice API

How it Works - Encoder

• CELP (Code-excited linear prediction) coding
• Converts PCM to a sequence of frames
• Converts 16kHz, 16-bit PCM signal (256kbps) to a 12.8kbps sequence of frames
• ~50% CPU cost
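The bitrate figures on this slide can be checked with a little arithmetic, sketched below. The helper names are made up for illustration. The 20 ms frame duration used in the usage note is Speex's usual frame length, not something stated on the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Raw PCM bitrate in bits per second. */
uint32_t pcm_bitrate_bps(uint32_t sample_rate_hz, uint32_t bits_per_sample) {
  assert(bits_per_sample > 0);
  return sample_rate_hz * bits_per_sample;
}

/* How many times smaller the encoded stream is than the raw one. */
uint32_t compression_ratio(uint32_t raw_bps, uint32_t encoded_bps) {
  assert(encoded_bps > 0);
  return raw_bps / encoded_bps;
}

/* Bytes in one encoded frame for a given bitrate and frame duration. */
uint32_t frame_bytes(uint32_t bitrate_bps, uint32_t frame_ms) {
  return (uint32_t)((uint64_t)bitrate_bps * frame_ms / 8000);
}
```

With the slide's numbers, pcm_bitrate_bps(16000, 16) gives 256000 (256kbps), and compression_ratio(256000, 12800) gives 20, so the encoder cuts the data sent over Bluetooth by a factor of 20. Assuming 20 ms frames, frame_bytes(12800, 20) gives 32 bytes per frame.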

Page 24: #PDR15 - Voice API

How it Works - The rest

[Diagram: blocks labelled Microphone, MCU, PDM -> PCM, Speex Encoder, Dictation UI and Recognizer, with steps numbered 1 to 6]

Page 25: #PDR15 - Voice API

Do’s and Don’ts

• Only create one session instance (~1.5kB RAM + buffer space) - it can be reused.
• While a session is in progress:
  • No heavy processing
  • No AppMessage
• Clean up the session to recover precious RAM
• If you decide to disable error messages, provide some useful feedback for the user.

Page 26: #PDR15 - Voice API

Do’s and Don’ts

• Common failures:
  • user not speaking clearly (helps to enunciate and speak slowly)
  • background noise.
• Encourage users to keep phrases brief
• Voice language setting may be different from watch language

Page 27: #PDR15 - Voice API

Development Tools

• Dictation API works in local emulator!
• Coming to CloudPebble soon!
• To use with the emulator:
  • Fire up the emulator
  • With the pebble tool:

      $ pebble transcribe <status code> -t <transcription string>

      $ pebble transcribe 0 -t "What is the current time in London England"

  • Use voice-enabled app like you would on a watch

Page 28: #PDR15 - Voice API

More Info

• API Documentation: http://developer.getpebble.com/docs/c/preview/Foundation/Dictation/
• Guide: https://developer.getpebble.com/guides/pebble-apps/sensors/dictation/
• Example: https://github.com/pebble-hacks/voice-demo

Page 29: #PDR15 - Voice API

Questions?