machine-learning-prague
Intelligent Personal Assistant Jan Šedivý
Petr Baudiš, Tomáš Gogár, Tomáš Tunys
April 2016
Agenda
Communication
IPA
Segments
Functionality
Use cases
Expectations
Interaction Modes
Context
Privacy
Technologies
Our Focus
Rule based IPA
YodaQA
Future
goo.gl/NiSF0I
Intelligent Personal Assistants
Intelligent Personal Assistants on the market today
Apple Siri, Google Now, Microsoft Cortana and Amazon Echo applications/appliances are the best known Intelligent, Virtual or Personal Assistants (IPA).
In this presentation I will discuss the use cases, challenges and basic architecture of the future intelligent assistants.
IPA Basic Definition
An IPA predicts users' needs, helps, alerts, answers questions, and ultimately acts autonomously on our behalf.
To achieve this goal, IPAs need to communicate and be connected to the cloud.
How can IPA help us?
IPA Segments
Mobile applications
Car interface
Wearables
Chat bots
Home automation
Robots, appliances
Where is IPA currently used?
IPA Functionality
Alerting, reminding
Command control - mobile, car, wearables
IR front end, Search UI
Simple, factoid question answering
Simple chat bots, Avatars
Advising, Robo advisers
Suggesting
How can IPA help us?
How to communicate with IPA?
People met, shook hands, and made a deal
Call centers, credit cards and parcel delivery
Internet commerce web sites
WeChat, Facebook, "Alexa, call me an Uber"
A short history of doing business
Relationships start with conversation …
Language
IPA interaction channels
Input: Text, Speech, Haptic, Gestures, Image recognition
Output: Text, Voice, Graphics, Haptic
Mic, Camera, Touch screen, Wearable sensors, Brain waves
Multimodal: 5 senses, combines voice and GUI
Use not only language!
IPA Interaction Modes
IPA initiated: Alerts, Suggestions
User initiated: Execute a command, Answer a question (voice), Carry on a dialog
Directed dialog
Who starts the interaction?
IPA Interaction Modes
Open dialog
Mixed-initiative: The initiative shifts during the dialog. Either the IPA or the user can start it.
Disambiguation: IPA clarifies the question through a dialog
Conversational
Topic-changing system
Who is leading the dialog?
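The disambiguation mode above can be illustrated with a minimal slot-filling sketch: when the user's request lacks a required detail, the IPA asks a clarifying question instead of executing. The intent names, slot names, and prompts below are hypothetical and are not taken from any system described in this talk.

```python
# Hypothetical intents and the slots each one needs before it can be executed.
REQUIRED_SLOTS = {
    "turn_on_light": ["color"],
    "navigate": ["destination"],
}

# Hypothetical clarifying questions, one per slot.
PROMPTS = {
    "color": "Which light do you mean?",
    "destination": "Where do you want to go?",
}


def next_action(intent, slots):
    """Return ('ask', prompt) while a required slot is missing,
    otherwise ('execute', slots)."""
    for slot in REQUIRED_SLOTS.get(intent, []):
        if slot not in slots:
            return "ask", PROMPTS[slot]
    return "execute", dict(slots)


if __name__ == "__main__":
    print(next_action("turn_on_light", {}))
    print(next_action("turn_on_light", {"color": "green"}))
```

In this sketch the IPA takes the initiative only until every required slot is filled, which is one simple way to model a dialog in which the initiative shifts between the two sides.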
User Model - Context
Internal, embedded
Location, Time
History - queries, commands, situations, …
Future - calendar, email, …
User’s profile, preferences, usage modes, …
Affective computing - Emotional models
People know context!
User Model - Context
External - environment
Social, family, friends
Connected == IoT
Sensors, actuators, LANs
Private - with limited access
Recorded phone calls
Credit card transactions
Utilities ...
IPA Privacy
An IPA may know almost everything about us
Super-trust relationship
How much private information do we want to share?
Users need control
Information may only be shared where there is trust (norms of human relationships)
How much private data do we need to share to let the IPA act for us?
The IPA is one of the most complicated examples of AI technology
IPA Technologies
Speech recognition, speaker and language recognition
Image recognition
Haptics, gestures, face gesture, emotion recognition
TTS, automatic speech generation
Graph, picture, haptic generation
User modeling
NLP, NLU, Information retrieval, Knowledge management, Dialog management
Internet, APIs
IoT etc.
Evaluation
Affective computing, Emotional models
To meet the user's expectations we need to combine many AI technologies:
IPA Architecture
Rule based: If the sentence pair match is high => intent, do this
Statistical ML
Question analysis
Knowledge base and Internet
Answer hypothesis
Answer scoring
Rule based or Statistical
Rule Based IPA
Spoken input - ASR - Text - Entity extraction - Intent detector - Normalization - Execution
These systems assume questions with a clear goal
If the question is beyond the system's capabilities, it replies “I can’t answer this question”
Or it falls back to a web search
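A minimal sketch of the text side of such a rule-based pipeline is shown below (ASR is assumed to have already produced the text). The rule table, intent names, and response strings are hypothetical illustrations. For brevity the sketch normalizes the text first and extracts entities with the same regular expression that detects the intent, which simplifies the stage order listed above.

```python
import re

# Hypothetical rules: each intent is a regex whose named groups are the entities.
INTENT_RULES = {
    "turn_on_light": re.compile(r"turn on the (?P<color>\w+) light"),
    "tune_station": re.compile(r"tune (?:in )?(?:to )?(?P<station>.+)"),
}


def normalize(text):
    """Lowercase and strip punctuation so rules match ASR-style text."""
    return re.sub(r"[^\w\s]", "", text.lower()).strip()


def detect_intent(text):
    """Return (intent, entities) for the first matching rule, else (None, {})."""
    for intent, pattern in INTENT_RULES.items():
        match = pattern.fullmatch(text)
        if match:
            return intent, match.groupdict()
    return None, {}


def execute(intent, entities):
    """Run the action for a recognized intent, or give the fallback reply."""
    if intent == "turn_on_light":
        return f"Turning on the {entities['color']} light."
    if intent == "tune_station":
        return f"Tuning to {entities['station']}."
    return "I can't answer this question."


def handle(utterance):
    intent, entities = detect_intent(normalize(utterance))
    return execute(intent, entities)


if __name__ == "__main__":
    for utterance in ["Turn on the green light!", "What is the population of Brazil?"]:
        print(utterance, "->", handle(utterance))
```

The fallback branch in `execute` could equally hand the text to a web search, as the slide suggests; that call is omitted here because it would depend on an external service.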
Intent Recognition
Answer Sentence Selection
Next Utterance Ranking
Semantic Textual Similarity
Paraphrase Identification
Recognizing Textual Entailment
Basic models: TF-IDF, BM25, word, sentence embeddings
Sentence Pair Scoring
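As a rough illustration of the simplest baseline listed above, the sketch below scores sentence pairs by TF-IDF cosine similarity and ranks candidate answer sentences for a question. The tokenizer, the smoothed IDF formula, and the toy candidates are assumptions made for this example; they are not the models used in YodaQA.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()


def idf_table(docs_tokens):
    """Smoothed inverse document frequency over a list of token lists."""
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector as a dict from term to weight."""
    return {t: c * idf.get(t, 0.0) for t, c in Counter(tokens).items()}


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(question, candidates):
    """Return (score, candidate) pairs sorted from best to worst match."""
    cand_tokens = [tokenize(c) for c in candidates]
    q_tokens = tokenize(question)
    idf = idf_table(cand_tokens + [q_tokens])
    q_vec = tfidf_vector(q_tokens, idf)
    scored = [(cosine(q_vec, tfidf_vector(t, idf)), c)
              for t, c in zip(cand_tokens, candidates)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    toy_candidates = [
        "Tolkien was born in Bloemfontein.",
        "Brazil has a large population.",
        "Julie Kavner voices Marge in The Simpsons.",
    ]
    for score, sentence in rank_candidates("When was Tolkien born?", toy_candidates):
        print(round(score, 3), sentence)
```

The same `rank_candidates` interface would fit the other tasks on this slide (next utterance ranking, paraphrase identification) if the scoring function were swapped for BM25 or an embedding-based model.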
The YodaQA System
● Universal end-to-end QA
● Searching databases and documents
● Open source research system
● Machine learning, no manual rules!
● Java, Apache UIMA, Apache Solr
● Proof-of-concept web+mobile interface, public live demo
Factoid Question Answering
Naturally phrased question instead of keywords
Output is not a whole document, but just the snippet of information
Voice interaction
Factoid Question Answering
We cover the basic factoid questions!
When was J. R. R. Tolkien born?
What is the population of Brazil?
Who played Marge in The Simpsons?
Where was she born? (Julie Kavner)
How do I get to Wall Street?
Turn on the green light!
Tune BBC World News!
You are the last one, do you want me to turn on the alarm?
IPA Building Steps
Intent identification
Data collection
Labeling
Feature engineering
Model building
(Active learning)
Model evaluation
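The building steps above can be sketched end to end on a toy scale: a handful of labeled utterances (data collection and labeling), bag-of-words features (feature engineering), a multinomial Naive Bayes intent classifier (model building), and accuracy on held-out utterances (model evaluation). The utterances, intent labels, and the choice of Naive Bayes are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def features(utterance):
    """Feature engineering: lowercase bag of words without ? and !."""
    return utterance.lower().replace("?", "").replace("!", "").split()


class NaiveBayesIntent:
    """Multinomial Naive Bayes with add-one smoothing."""

    def fit(self, utterances, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        self.vocab = set()
        for text, label in zip(utterances, labels):
            words = features(text)
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, utterance):
        words = features(utterance)
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in words:
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, utterances, labels):
    """Model evaluation: fraction of utterances classified correctly."""
    hits = sum(model.predict(u) == y for u, y in zip(utterances, labels))
    return hits / len(labels)


# Hypothetical labeled data (collection + labeling steps).
TRAIN = [
    ("turn on the green light", "light_on"),
    ("switch on the lamp", "light_on"),
    ("turn on the light in the kitchen", "light_on"),
    ("tune bbc world news", "tune_radio"),
    ("play the radio station", "tune_radio"),
    ("tune to the news station", "tune_radio"),
]
HELD_OUT = [
    ("turn on the red light", "light_on"),
    ("tune the radio", "tune_radio"),
]

model = NaiveBayesIntent().fit(*zip(*TRAIN))

if __name__ == "__main__":
    print("held-out accuracy:", accuracy(model, *zip(*HELD_OUT)))
```

A real system would need far more data per intent, and the optional active-learning step would pick the utterances the model is least sure about for the next labeling round.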
Future
Implement norms of human relationships: mutual value, respect, trust
What makes a better conversation?
How to carry an effective dialog, negotiation?
How to design an engine recognizing emotions?
How to learn habits?
How to make the IPA more human-like?
Future
How to make the IPA adaptable to the user?
How to make the IPA automatically configurable and able to integrate into a new environment?
How to make the IPA flexible enough?
IPA unified interface to mobile applications
Millions of mobile apps
Navigation, login chaos, and poorly unified notifications
Leverage the context of consumption
Leverage sensory and multimodal inputs
Gartner: By 2020, IPAs will facilitate 40 percent of mobile interactions and will begin to dominate the post-app era.
Thank you
Team
ČVUT FEL - Dept. of Cybernetics
Human behaviour
Content
People ask questions
What
Why
When
How
...
Mobility
People need help everywhere
Small real estate
Navigation, cross-app API, password chaos
UI has to change
iPhone introduced 2007
IPA Segments
Automotive.
Utilities.
Banking.
Health.
Retail.
Real estate.
...
What are the industries benefiting from IPAs?
IPA Development
Collect utterances
Define the answers
Label utterances
Build the model (ML)
Evaluate
Iterate to improve
Users Expectations
Must not force users to memorize commands.
It must understand natural language.
Helps solve everyday tasks.
Must be unobtrusive when giving suggestions.
Answers questions.
IPA Use Cases
Complex questions
Conversational, dialog
Complex robo advisers
Presentation commerce
Digital, enterprise, media asset management
Automatically generating documents, stats, news, tweets based on content on the web …