© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Scott Totman - Capital One, VP of Mobile and Innovation

Mike Hines - Amazon, Developer Evangelist

October 2015

MBL308

How Capital One Developed a Skill for Alexa

(MBL308) Extending Alexa’s Built-in Skills. See How Capital One Did It




Alexa Skills Kit

TODAY’S AGENDA

About Alexa

Capital One skill demo

Building the Capital One skill

Alexa skill best practices

About Alexa

What is Alexa?

Alexa is a cloud-based voice service that can answer questions, play music, read the news, and more.

Echo is an always-on, always-connected, hands-free device that connects to Alexa.

Alexa architecture

Amazon Alexa service

• User audio is streamed to the service

• Audio responses are rendered on device

• GUI cards are rendered in the Amazon Alexa app

Alexa is always learning

• Alexa gets smarter by learning new skills.

• Developers can create new skills for Alexa.

Creating your own Alexa skills

Alexa skills have two parts:

• Configuration data in Amazon Developer Portal

• Hosted service responding to user requests

Alexa Skills Kit architecture

Components:

• Amazon Alexa service

• Developer’s application service

• Amazon’s Developer Portal

Flow:

• Application, intents, sample data, and developer service URL endpoint are configured through the portal

• User audio is streamed to the Alexa service

• User intents and arguments are sent to the developer service

• Text response and/or GUI card data is returned

• Audio responses are rendered on-device

• GUI cards are rendered in the Amazon Alexa app
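The flow above can be sketched as a single handler: the Alexa service sends the developer service a JSON request naming the intent and its slot values, and the service returns text to speak plus optional card data. This is a minimal sketch, not Capital One's code. The envelope fields follow the ASK JSON interface as commonly documented; `handleRequest`, the `GetBalance` intent, the `Account` slot, and the hard-coded balance are all hypothetical.

```javascript
// Hypothetical handler for an ASK-style JSON request (a sketch, not production code).
function handleRequest(event) {
  var request = event.request;

  // Build the response envelope: speech for the device, optional card for the app.
  function respond(text, cardTitle) {
    var response = {
      outputSpeech: { type: "PlainText", text: text },
      shouldEndSession: true
    };
    if (cardTitle) {
      response.card = { type: "Simple", title: cardTitle, content: text };
    }
    return { version: "1.0", response: response };
  }

  if (request.type === "IntentRequest" && request.intent.name === "GetBalance") {
    // Slot values arrive as recognized text; "Account" is an assumed slot name.
    var slots = request.intent.slots || {};
    var account = (slots.Account && slots.Account.value) || "credit card";
    // A real service would call a backend API here; the amount is made up.
    return respond("Your " + account + " balance is 42 dollars.", "Balance");
  }
  return respond("Sorry, I did not understand that.");
}

// Example request, shaped like an intent request sent by the Alexa service.
var sampleEvent = {
  version: "1.0",
  request: {
    type: "IntentRequest",
    intent: {
      name: "GetBalance",
      slots: { Account: { name: "Account", value: "checking" } }
    }
  }
};
var reply = handleRequest(sampleEvent);
console.log(reply.response.outputSpeech.text);
```

The same text feeds both the spoken response and the card here; a real skill would usually put more detail on the card than in the audio.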

Building an Alexa skill

HOSTED SERVICE

• You define interactions for your voice app through intent schemas.

• Each intent consists of two fields. The intent field gives the name of the intent. The slots field lists the slots associated with that intent.

• Slots can also include types, such as LITERAL, NUMBER, DATE, etc.
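As an illustration, the horoscope intent used in the sample utterances might be declared as follows. The layout (an `intents` array whose entries carry `intent` and `slots`) follows the description above, and the slot names `Sign` and `Date` come from those utterances. The `validateSchema` helper is a hypothetical check added only for this sketch.

```javascript
// Intent schema for a horoscope skill, written as a JavaScript object.
// In the Developer Portal it would be entered as JSON (see JSON.stringify below).
var intentSchema = {
  intents: [
    {
      intent: "GetHoroscope",
      slots: [
        { name: "Sign", type: "LITERAL" },
        { name: "Date", type: "DATE" }
      ]
    }
  ]
};

// Hypothetical helper: every intent needs a name and a list of named, typed slots.
function validateSchema(schema) {
  return Array.isArray(schema.intents) && schema.intents.every(function (entry) {
    return typeof entry.intent === "string" &&
      Array.isArray(entry.slots) &&
      entry.slots.every(function (slot) {
        return typeof slot.name === "string" && typeof slot.type === "string";
      });
  });
}

console.log(JSON.stringify(intentSchema, null, 2));
console.log(validateSchema(intentSchema));
```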

Building an Alexa skill

HOSTED SERVICE

• The mappings between intents and the typical utterances that invoke those intents are provided in a tab-separated text document of sample utterances.

• Each possible phrase is assigned to one of the defined intents.

• GetHoroscope what is the horoscope for {pisces|Sign}

• GetHoroscope what will the horoscope for {leo|Sign} be {next tuesday|Date}
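To make the file format concrete, here is a small sketch that splits one sample-utterance line into its intent name and its `{example|SlotName}` pairs. The function name `parseSampleUtterance` is hypothetical; it is a reading aid for the format, not part of ASK.

```javascript
// Split one sample-utterance line into its intent name and {example|SlotName} pairs.
function parseSampleUtterance(line) {
  // The intent name is the first token; a tab (or spaces) separates it from the phrase.
  var match = /^(\S+)\s+(.+)$/.exec(line.trim());
  if (!match) {
    return null;
  }
  var slots = [];
  var slotPattern = /\{([^|}]+)\|([^}]+)\}/g;
  var found;
  while ((found = slotPattern.exec(match[2])) !== null) {
    slots.push({ example: found[1], slot: found[2] });
  }
  return { intent: match[1], phrase: match[2], slots: slots };
}

var parsed = parseSampleUtterance(
  "GetHoroscope\twhat will the horoscope for {leo|Sign} be {next tuesday|Date}"
);
console.log(parsed.intent, parsed.slots);
```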

Capital One’s Journey

Capital One’s Alexa approach

June: A few developers buy Echos

July: Full day tech offsite & side of desk project kickoff

August: Rapid prototyping and expanding Capital One skill

Goal: Pair Alexa with the Capital One app and allow users to get their credit card balance

Consumer insights: Design thinking/test + learn

Customers like it!

• Hands-free convenience is valuable

• Interested in using Echo for informational purposes

• Open to making payments/transactions

But…

• Concerns about local security

• Users don’t want financial information captured by a third party (Amazon)

Prototyping: Prerequisites & new development

Leverage existing API model built for Android/iPhone apps

Piggy-back off “glance” services built for Apple Watch

Build new JS service as the ASK orchestrator*

*Used Alexa app node library (Thanks Matt Kruse!)

Capital One skills focus

Skill development segmented into three priority buckets:

Read-only information

• Default accounts (credit card, bank, loans)

• Account balances

• Bill due date

• Last payment

• Last transactions

• Interest rate

Transactional skills

• Pay bill(s)

• Transfer $

Experimenting

• App usage patterns

• OAuth

• Customer service/support

• Customer acquisition

• Alexa adoption

• Alexa evolution

Demo

Alexa challenges discovered during prototyping

Numerical utterances, device latency, and security were our most significant challenges

Numerical utterances

Challenge:

• “Twenty-two” is hard to turn into 22 instead of 20 and 2

• “Three hundred and forty-four dollars”

• Needed to call out words like ‘hundred’, ‘and’

Solution:

• Programmatically create utterances (big list)!

• Optional words

• ASK support for CURRENCY data type
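The "programmatically create utterances" bullet could be sketched as a small generator that combines example number words into `PayAmount` lines. This is an illustrative sketch only: the word list, the function name `generatePayAmountUtterances`, and the choice of which phrase shapes to emit are assumptions, not the script Capital One used.

```javascript
// Generate PayAmount sample utterances by combining example number words.
// The word list and the phrase shapes are assumptions for illustration.
var exampleWords = ["twenty-one", "twenty-two", "twenty-three"];

function generatePayAmountUtterances(words) {
  var utterances = [];
  words.forEach(function (dollars) {
    // Dollars only.
    utterances.push("PayAmount {" + dollars + "|DOLLARS} dollars");
    // Dollars and cents ("and" is one of the optional words that must be spelled out).
    utterances.push(
      "PayAmount {" + dollars + "|DOLLARS} dollars and {" + dollars + "|CENTS} cents"
    );
    // Every thousands/hundreds combination for this dollar value.
    words.forEach(function (hundreds) {
      words.forEach(function (thousands) {
        utterances.push(
          "PayAmount {" + thousands + "|THOUSANDS} thousand {" +
          hundreds + "|HUNDREDS} hundred {" + dollars + "|DOLLARS} dollars"
        );
      });
    });
  });
  return utterances;
}

var generated = generatePayAmountUtterances(exampleWords);
console.log(generated.length);
console.log(generated.slice(0, 3).join("\n"));
```

Even three example words yield dozens of lines, which is why a hand-written list quickly becomes impractical.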

PayAmount {one|THOUSANDS} thousand {one|HUNDREDS} hundred {one|DOLLARS} dollars and {eighty eight|CENTS} cents

PayAmount {one|THOUSANDS} thousand {one|HUNDREDS} hundred {one|DOLLARS} dollars and {ninety nine|CENTS} cents

PayAmount {twenty-one|THOUSANDS} thousand {twenty-one|HUNDREDS} hundred {twenty-one|DOLLARS} dollars

PayAmount {twenty-two|THOUSANDS} thousand {twenty-one|HUNDREDS} hundred {twenty-one|DOLLARS} dollars

PayAmount {twenty-three|THOUSANDS} thousand {twenty-one|HUNDREDS} hundred {twenty-one|DOLLARS} dollars

PayAmount {twenty-one|THOUSANDS} thousand {twenty-two|HUNDREDS} hundred {twenty-one|DOLLARS} dollars

PayAmount {twenty-two|THOUSANDS} thousand {twenty-two|HUNDREDS} hundred {twenty-one|DOLLARS} dollars

PayAmount {twenty-three|THOUSANDS} thousand {twenty-two|HUNDREDS} hundred {twenty-one|DOLLARS} dollars

PayAmount {twenty-one|THOUSANDS} thousand {twenty-three|HUNDREDS} hundred {twenty-one|DOLLARS} dollars

PayAmount {twenty-two|THOUSANDS} thousand {twenty-three|HUNDREDS} hundred {twenty-one|DOLLARS} dollars

PayAmount {twenty-three|THOUSANDS} thousand {twenty-three|HUNDREDS} hundred {twenty-one|DOLLARS} dollars

PayAmount {twenty two|DOLLARS} dollar {twenty two|CENTS} cents

PayAmount {sixty-seven|THOUSANDS} thousand and {sixty-eight|DOLLARS} dollars

PayAmount {fifty-seven|THOUSANDS} thousand {fifty-eight|DOLLARS} dollars and {fifty-eight|CENTS} cents

PayAmount {twenty-one|THOUSANDS} thousand {twenty-two|HUNDREDS} hundred {twenty-two|DOLLARS} dollars

PayAmount {one|THOUSANDS} thousand {one|HUNDREDS} hundred {one|DOLLARS} dollars and {sixty six|CENTS} cents

PayAmount {eighty eight|DOLLARS} dollar {thirty three|CENTS} cents

PayAmount {twenty-three|THOUSANDS} thousand {twenty-two|HUNDREDS} hundred {twenty-two|DOLLARS} dollars

PayAmount {twenty-one|THOUSANDS} thousand {twenty-three|HUNDREDS} hundred {twenty-two|DOLLARS} dollars

PayAmount {twenty-two|THOUSANDS} thousand {twenty-three|HUNDREDS} hundred {twenty-two|DOLLARS} dollars

PayAmount {twenty-three|THOUSANDS} thousand {twenty-three|HUNDREDS} hundred {twenty-two|DOLLARS} dollars

PayAmount {twenty-one|THOUSANDS} thousand {twenty-one|HUNDREDS} hundred {twenty-three|DOLLARS} dollars

PayAmount {twenty-two|THOUSANDS} thousand {twenty-one|HUNDREDS} hundred {twenty-three|DOLLARS} dollars

PayAmount {twenty-three|THOUSANDS} thousand {twenty-one|HUNDREDS} hundred {twenty-three|DOLLARS} dollars

PayAmount {twenty-one|THOUSANDS} thousand {twenty-two|HUNDREDS} hundred {twenty-three|DOLLARS} dollars

PayAmount {twenty-two|THOUSANDS} thousand {twenty-two|HUNDREDS} hundred {twenty-three|DOLLARS} dollars

Sample numerical utterances…out of 712
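On the service side, the slot values captured by utterances like these still have to be combined into one amount. The sketch below assumes each slot arrives as spelled-out words (for example "twenty-two" or "eighty eight") and returns the total in cents to avoid floating-point rounding. The helper names `wordsToNumber` and `slotsToCents` are hypothetical, and only values from 0 to 99 per slot are handled.

```javascript
// Number words covering 0-99; larger magnitudes arrive in separate slots.
var UNITS = [
  "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
  "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
  "seventeen", "eighteen", "nineteen"
];
var TENS = {
  twenty: 20, thirty: 30, forty: 40, fifty: 50,
  sixty: 60, seventy: 70, eighty: 80, ninety: 90
};

// "twenty-two" or "twenty two" -> 22 (not 20 and 2).
function wordsToNumber(text) {
  return text.toLowerCase().split(/[\s-]+/).filter(Boolean).reduce(function (total, word) {
    var unit = UNITS.indexOf(word);
    if (unit !== -1) {
      return total + unit;
    }
    if (Object.prototype.hasOwnProperty.call(TENS, word)) {
      return total + TENS[word];
    }
    throw new Error("Unrecognized number word: " + word);
  }, 0);
}

// Combine the THOUSANDS, HUNDREDS, DOLLARS, and CENTS slots into a total in cents.
function slotsToCents(slots) {
  function part(name) {
    return slots[name] ? wordsToNumber(slots[name]) : 0;
  }
  return part("THOUSANDS") * 100000 + part("HUNDREDS") * 10000 +
    part("DOLLARS") * 100 + part("CENTS");
}

var cents = slotsToCents({
  THOUSANDS: "one", HUNDREDS: "one", DOLLARS: "one", CENTS: "eighty eight"
});
console.log(cents);
```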

Latency

Challenge:

• Coding visually is great for websites, not for voice

• Pauses while the service looks up data are a much bigger deal for voice

Solution:

• Keep APIs fast

• Leverage Alexa session data

• Keep explanations terse…but not rude
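One way to read "leverage Alexa session data" is to carry already-fetched data in the session attributes so that later turns skip the slow lookup. The sketch below assumes the ASK response envelope with a `sessionAttributes` field; `fetchAccountsFromApi`, `getAccounts`, and the account data are hypothetical stand-ins.

```javascript
// Sketch: reuse data across turns via session attributes instead of re-calling a slow API.
var apiCalls = 0;

// Stand-in for a slow backend lookup (hypothetical data).
function fetchAccountsFromApi() {
  apiCalls += 1;
  return [{ name: "credit card", dueDate: "the fifth" }];
}

// Return accounts from the session when present; otherwise fetch once and remember them.
function getAccounts(sessionAttributes) {
  if (!sessionAttributes.accounts) {
    sessionAttributes.accounts = fetchAccountsFromApi();
  }
  return sessionAttributes.accounts;
}

// Build a response that keeps the session open and hands the attributes back to Alexa,
// which returns them with the next request in the same session.
function buildResponse(text, sessionAttributes) {
  return {
    version: "1.0",
    sessionAttributes: sessionAttributes,
    response: {
      outputSpeech: { type: "PlainText", text: text },
      shouldEndSession: false
    }
  };
}

var attributes = {};
var count = getAccounts(attributes).length;
var first = buildResponse("You have " + count + " account. What next?", attributes);
// Second turn: the attributes come back with the request, so no new API call is needed.
var second = getAccounts(first.sessionAttributes);
console.log(apiCalls);
```

Session attributes travel through the Alexa service, so given the customer concerns noted earlier, a skill handling financial data might keep only non-sensitive fields there and hold the rest server-side.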

Security

Challenge:

• Account linking didn’t exist as an available solution

• Figure out how to connect an Echo with a customer account

• No guarantee of privacy on Echo end

Solution:

• Make vulnerabilities dependent on compromised account

• Pairing code for secure account linking

• 2nd factor authentication for moving money

Pairing process workflow

1. Open session

2. Device ID not recognized

3. Generate 6-digit PIN

4. Log in to C1 app and provide PIN
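The four steps above could be sketched as two functions: one that issues a PIN when an unknown device opens a session, and one that the mobile app calls once the customer has logged in. This is an assumption-laden sketch, not Capital One's implementation: the in-memory maps, the function names, and the IDs are hypothetical, and PIN expiry and collision handling are omitted.

```javascript
var crypto = require("crypto");

// In-memory stores for the sketch; a real service would use a database with expiry.
var pendingPins = new Map();   // pin -> deviceId awaiting confirmation
var pairedDevices = new Map(); // deviceId -> customerId

// Steps 1-3: a session opens from an unrecognized device, so issue a 6-digit PIN.
function openSession(deviceId) {
  if (pairedDevices.has(deviceId)) {
    return { paired: true, customerId: pairedDevices.get(deviceId) };
  }
  // crypto.randomInt is cryptographically secure; pad so leading zeros are kept.
  var pin = String(crypto.randomInt(0, 1000000)).padStart(6, "0");
  pendingPins.set(pin, deviceId);
  return { paired: false, pin: pin };
}

// Step 4: the customer, already logged in to the mobile app, submits the PIN.
function confirmPin(customerId, pin) {
  if (!pendingPins.has(pin)) {
    return false;
  }
  var deviceId = pendingPins.get(pin);
  pendingPins.delete(pin); // a PIN can be used only once
  pairedDevices.set(deviceId, customerId);
  return true;
}

var firstVisit = openSession("device-123");
var accepted = confirmPin("customer-42", firstVisit.pin);
var secondVisit = openSession("device-123");
console.log(accepted, secondVisit.paired);
```

Because the PIN is only accepted from an authenticated app session, pairing a device requires control of the customer's account, which matches the "make vulnerabilities dependent on compromised account" goal.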

Keeping things in context

Challenge:

• Context is hard with multiple accounts

• Helping a user with tasks and cross-context:

• Switching context

• Keeping context

• Recognizing context

Solution:

• Map user workflow

• When in doubt, ask the user

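Code sample: Context switching

// Note: req, res, hasContext, and getCachedAccounts come from the enclosing
// handler scope, which the slide does not show.
function getCreditCardAccount() {
  // Prefer the account already in context, if it is a credit card.
  var currentAccount = hasContext();
  if (currentAccount && currentAccount.isCreditCard()) {
    return currentAccount;
  }
  // Otherwise fall back to the cached accounts, but only when the choice is unambiguous.
  var accounts = getCachedAccounts(req, res);
  if (accounts) {
    var cached = accounts.filter(function (entry) {
      return entry.isCreditCard();
    });
    if (cached.length === 1) {
      return cached[0];
    }
  }
  // No usable context and zero or several credit cards: the caller must ask the user.
  return null;
}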
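When the function above returns null, the "when in doubt, ask the user" rule applies. The following self-contained sketch shows one way that fallback might look. Accounts are plain objects with a `type` field here, and `chooseCreditCard` and the account names are hypothetical, since the talk does not show the real account model.

```javascript
// Sketch: pick a credit card account, or ask the user when the choice is ambiguous.
function chooseCreditCard(contextAccount, accounts) {
  // Keep the current context when it already points at a credit card.
  if (contextAccount && contextAccount.type === "creditCard") {
    return { account: contextAccount };
  }
  var cards = (accounts || []).filter(function (entry) {
    return entry.type === "creditCard";
  });
  if (cards.length === 1) {
    return { account: cards[0] };
  }
  if (cards.length === 0) {
    return { prompt: "I couldn't find a credit card account. What else can I help with?" };
  }
  // Several candidates: name them and end with a question so the user knows to answer.
  var names = cards.map(function (card) {
    return card.name;
  });
  return { prompt: "Which card: " + names.join(" or ") + "?" };
}

var result = chooseCreditCard(null, [
  { name: "travel card", type: "creditCard" },
  { name: "cash back card", type: "creditCard" },
  { name: "checking", type: "bank" }
]);
console.log(result.prompt);
```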

Capital One takeaways

Wish list

• Skill discoverability

• Handle vocal interruptions better, with context

• Notification indicator

Works great

• Straightforward

• Majority of the effort is on customer experience, not implementation

• ASK is evolving quickly + adding new capabilities

Best Practices

Making it sound easy

A person can absorb and process a lot more written information than audio information.

Instructions that make sense in an average web page dialog are probably going to sound intimidating in a spoken command.

Follow these best practices for better results.

1. Make it clear the user needs to respond

Not so good

Trivia challenge: Trivia Challenge. You can choose from the following categories: 80’s Pop Songs, Potent Potables, or European History.

Better

Trivia challenge: Trivia Challenge. Here are your categories: 80’s Pop Songs, Potent Potables, or European History. Which one do you want?

Best practice

If you expect the user to say something, make sure you end your prompt with a question.
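In code, this practice could be enforced where the response is built. The sketch below assumes the ASK response envelope (`outputSpeech`, `reprompt`, `shouldEndSession`); the `ask` helper and its question-mark check are hypothetical conveniences, not part of ASK.

```javascript
// Build a response that expects an answer: keep the session open and add a reprompt.
function ask(promptText, repromptText) {
  // Guard against prompts that leave the user unsure whether to speak.
  if (!/\?\s*$/.test(promptText)) {
    throw new Error("Prompts that expect an answer should end with a question");
  }
  return {
    version: "1.0",
    response: {
      outputSpeech: { type: "PlainText", text: promptText },
      // Spoken if the user stays silent; defaults to repeating the prompt.
      reprompt: {
        outputSpeech: { type: "PlainText", text: repromptText || promptText }
      },
      shouldEndSession: false
    }
  };
}

var good = ask(
  "Here are your categories: 80's Pop Songs, Potent Potables, or European History. " +
  "Which one do you want?",
  "Which category do you want?"
);
console.log(good.response.shouldEndSession);
```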

2. Don’t assume the user knows what to do

Not so good

Car Fu: Car Fu.

Better

Car Fu: Car Fu. You can ask to get a ride or request a fare estimate. Which will it be?

User: Get a ride.

Car Fu: Sending your request. A mobile alert on your cell phone will let you know when your car arrives.

Best practice

When launching a skill or finishing an interaction, always suggest what the user can do next.

3. Present the options clearly

Not so good

Food Taxi: Would you like French fries or a salad?

User: Yes

Better

Food Taxi: Which side would you like: French fries or a salad?

User: Salad.

Best practice

Either/or questions must be stated explicitly, lest they be interpreted as yes/no questions.

4. Keep it brief

Not so good

Astrology Daily: There are 12 Zodiac signs that I can give you a horoscope for. Please tell me which one you’d like.

Better

Astrology Daily: Get the horoscope for which sign?

Best practice

Use fewer words than you might on your website.

5. Avoid verbose choices

Not so good

Dairy Shack: What flavor do you want? For chocolate, say Chocolate. For vanilla, say Vanilla. Or for strawberry, say Strawberry.

Better

Dairy Shack: Which flavor would you like? You can say Chocolate, Vanilla, or Strawberry.

Best practice

Do not present more than three choices and avoid repetitive wording.

6. Avoid crowding options

Not so good

Score Keeper: Score Keeper. You can give a player points, add a new player, ask for the score, start a new game, clear all players, or stop if you’re done. Now, what would you like?

User: What was that again?

Better

Score Keeper: Score Keeper. You can give a player points, ask for the score, or say Help. What would you like?

User: Help.

Score Keeper: Here are some things you can say: add John, give John 5 points, tell me the score, start a new game, or reset all players. You can also say stop if you’re done. So, how can I help?

Best practice

Present the 2-3 choices that users will pick 80% of the time and expose the rest through ‘Help’.
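A skill could derive both prompts from one list of options ranked by how often each is used. The sketch below is hypothetical throughout: `buildPrompts`, `joinChoices`, and the usage shares are invented for illustration, and a real skill would measure usage rather than hard-code it.

```javascript
// Join phrases as "a or b" or "a, b, or c".
function joinChoices(phrases) {
  if (phrases.length <= 1) {
    return phrases.join("");
  }
  if (phrases.length === 2) {
    return phrases[0] + " or " + phrases[1];
  }
  return phrases.slice(0, -1).join(", ") + ", or " + phrases[phrases.length - 1];
}

// Put the most-used options in the launch prompt and the rest behind Help.
function buildPrompts(skillName, options, topCount) {
  var sorted = options.slice().sort(function (a, b) {
    return b.usage - a.usage;
  });
  var phraseOf = function (option) {
    return option.phrase;
  };
  var top = sorted.slice(0, topCount).map(phraseOf);
  var rest = sorted.slice(topCount).map(phraseOf);
  return {
    launch: skillName + ". You can " + joinChoices(top.concat(["say Help"])) +
      ". What would you like?",
    help: "Here are some things you can say: " + joinChoices(rest) +
      ". So, how can I help?"
  };
}

// Usage shares are made-up numbers for the Score Keeper example.
var prompts = buildPrompts("Score Keeper", [
  { phrase: "add a new player", usage: 0.1 },
  { phrase: "give a player points", usage: 0.5 },
  { phrase: "start a new game", usage: 0.06 },
  { phrase: "ask for the score", usage: 0.3 },
  { phrase: "clear all players", usage: 0.04 }
], 2);
console.log(prompts.launch);
console.log(prompts.help);
```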

7. Get one piece of information at a time and use it

Not so good

Joke Bank: Would you like to hear a joke?

User: Yes.

Joke Bank: What’s black, white, and red all over? An embarrassed skunk.

“One, Two, Five!”

“Three, sir! Three!”

Better

Joke Bank: What’s black, white, and red all over? An embarrassed skunk.

Best practice

Make smart assumptions where possible.

Avoid asking non-essential questions.

8. Finally, make the user comfortable

Best practice

• Let users know they’re in the right place.

• Present usable chunks of information, not overload.

• Take care of technical and legal details when enabling the skill, not in the audio.

• Don’t blame the user.

Best practices

1. Make it clear the user needs to respond

2. Don’t assume the user knows what to do

3. Present the options clearly

4. Keep it brief

5. Avoid verbose choices

6. Avoid crowding options

7. Get information and use it

8. Make users comfortable

@MikeFHines

developer.amazon.com/blog

Learn more: http://developer.amazon.com/ASK

http://capitalone.com

How did we do: Remember to complete the evaluation!

Follow us:

Thank you!

http://bit.ly/appstoregiveaway