ILLUMINATING AI: UNDERSTANDING AI'S GOALS, REASONING & COMPROMISES
Tsvi Achler MD PhD
[Image: AI says "7" (99%)]
If Siri makes a mistake, the impact is limited
In most applications a mistake has more serious consequences:
Banking | Medicine | Self-Driving Cars

AI Adoption Requires Transparency:
Trust, Regulation, and Understanding of the AI's Compromises
But the problem is: AI is a Black Box

[Figure: Google example; image + undetectable noise = a different answer]
The Lack of Transparency Leaves You Asking: What Is the AI Really Recognizing?
DARPA commitment: "Explainable AI" Initiative
EU Legislates: Users Have a Right to an Explanation
Solution Pathways for Explainability
(1) DARPA: Trial & Error to Find What Affects the Network
    Takes Time & the AI Remains a Black Box
(2) Optimizing Mind: The Ground Truth Of The AI
    Mirrors the brain's ability to provide reasons
The Brain Relies on Feedback During Recognition; AI Does Not
[Figure: feedback vs. no-feedback circuits]
Feedback is Found Throughout the Brain:
• Tri-synaptic connections (e.g. Aroniadou-Anderjaska et al. 2000)
• More Feedback Than Feedforward
• Retrograde Signaling, e.g. nitric oxide
AI Exclusively Uses Feedforward Connections W During Recognition
[Figure: feedforward caricature, computational caricature; inputs X connect through weight matrix W to outputs Y]
Connectivity Notation: weight matrix W
Mathematical Function (during recognition): Y = W·X
Do not be misled: even when an AI is called "recurrent", it still uses W during recognition
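The recognition step Y = W·X amounts to a single fixed matrix multiply. A minimal sketch, assuming NumPy; the toy weights and input are illustrative, not from the talk:

```python
import numpy as np

# Feedforward recognition is one fixed multiplication: Y = W @ X.
# Toy weights W (2 outputs x 3 inputs) and input X are illustrative.
W = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.2, 0.8]])
X = np.array([1.0, 0.5, 0.0])  # pattern from the environment (input)

Y = W @ X                      # resulting neuron activation (output)
```

Whatever the architecture learned, the test-time pass is this fixed propagation; any explanation has to be recovered from W after the fact.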
Why Is Feedback Needed? For Optimization
• What is Optimization?
• Difference between "Feedforward" and Feedforward-Feedback methods
• Why lack of Feedback During Recognition matters
Optimization
Try a configuration, evaluate, modify, and repeat until an optimal fit is found
Example: Solving a Jigsaw Puzzle
[Figure: jigsaw-puzzle optimization (OP) loop: Try → Evaluate → Modify]
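The try/evaluate/modify loop can be sketched generically. This hill-climbing toy (fitting a single number to a target) is an illustrative assumption standing in for the jigsaw puzzle:

```python
import random

# Generic optimization (OP) loop: try a configuration, evaluate its fit,
# modify it, and repeat, keeping only changes that improve the fit.
def optimize(evaluate, modify, config, steps=500):
    best, best_cost = config, evaluate(config)
    for _ in range(steps):
        candidate = modify(best)     # try a modified configuration
        cost = evaluate(candidate)   # evaluate the fit
        if cost < best_cost:         # keep it only if it fits better
            best, best_cost = candidate, cost
    return best, best_cost

# Toy stand-in for the puzzle: fit a number to an unknown target.
target = 7.0
best, cost = optimize(
    evaluate=lambda c: (c - target) ** 2,
    modify=lambda c: c + random.uniform(-1.0, 1.0),
    config=0.0,
)
```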
"Feedforward" methods (Fodor & Pylyshyn 1988; Sun 2002)
[Figure: recognition algorithms; Learning writes Memory W, Recognition-Inference maps inputs X to outputs Y, Recall: "Bird"]
Learned feedforward weights W appear in: Deep, Convolutional, Recurrent, LSTM, Reinforcement Networks, Support Vector Machines (SVM), Perceptrons, "Neural Networks" ... everything learned via Backprop etc.
Optimization (OP) is performed during learning of W: optimize the weights so that recognition occurs using a simple multiplication Y = W·X
1) Optimized weights W are a "Black Box"
"Feedforward" methods encode uniqueness into the weights
Determining Uniqueness is essential to perform efficient recognition
Problem: Uniqueness changes with context, and one cannot learn uniqueness for all possible contexts
Besides: the relevant context is present during recognition, not learning
[Figure: Training Instance 1 (a lone X among O's) vs. Training Instance 2 (a lone O among X's); in each, the rare symbol is "Unique thus Important!"]
We suggest uniqueness is determined during recognition instead,
by Optimizing (OP) during recognition, when the context is available, while estimating uniqueness:
1) Determining activation Y (not weights)
2) Optimizing only the current test pattern (not all of the training data) → Reducing computational costs
The weights become a "Clear Box"
[Figure: recognition algorithms; OP now occurs at Recognition, not Learning]
Illuminated AI Switches When Optimization Occurs

Model-type      | During Learning (weight Δ)      | During Recognition (find Y)
"Feedforward"   | Optimization to find weights W  | Feedforward recognition
                  (requires feedback, e.g. to back-propagate error: not really feedforward!)
Illuminated AI  | Simpler Learning of M           | Optimization: switch dynamics to find neuron activation

Why would the brain only use feedback during learning?
Recognition with Illuminated AI
[Figure: neuron caricature, computational caricature; symmetrical inhibitory connections modulate inputs X using output activity Y]
Connectivity Notation: weight matrix M
For optimization, the Weights are Expectations (allows explainability & update)
Same Results But Transparent Method
Feedforward: Y = W·X, where X is the pattern from the environment (input), Y is the resulting neuron activation (output), and W are the "feedforward" weights
Illuminated Networks: the SAME X yields the SAME Y, but Y is found by optimization (OP) over the feedback weights M during recognition
Easier to Explain, Learn and Update Illuminated Weights
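A minimal sketch of recognition by optimization with symmetric feedback, assuming NumPy. This is not the author's exact implementation: the multiplicative update and the toy patterns are assumptions, chosen so that the outputs Y are adjusted until the feedback reconstruction MᵀY accounts for the input X.

```python
import numpy as np

def recognize(M, X, steps=50, eps=1e-9):
    """Sketch of optimization during recognition with feedback weights M.

    M: (outputs x inputs) nonnegative 'expectation' weights.
    Iteratively adjusts Y until the feedback reconstruction M.T @ Y
    accounts for the input X (not the author's exact algorithm).
    """
    Y = np.ones(M.shape[0])      # start all outputs equally active
    norm = M.sum(axis=1) + eps   # each output's total expectation
    for _ in range(steps):
        F = M.T @ Y + eps        # feedback: what the outputs expect
        Q = X / F                # shunt inputs by their expectation
        Y = Y * (M @ Q) / norm   # re-weigh outputs by supported input
    return Y

# Two toy 'expectation' patterns (illustrative): each row is a memory.
M = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
X = np.array([1.0, 1.0, 0.0])    # input matching the first memory
Y = recognize(M, X)              # Y[0] -> 1, Y[1] -> 0
```

Because the weights are expectations, the result is inspectable: M.T @ Y reconstructs the input, so each output's contribution to explaining X can be read off directly.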
Example: You train your AI to Learn Digits
It gets good grades: 95% (performance)
You are done ... right?
1) Can convert existing feedforward networks to xRFN & see what they are doing: Black Box -> "Clear Box"
MNIST Demonstration: SVM (Feedforward) vs. xRFN (See Inside!), Equivalent results

SVM and RFN (identical results):
Overall Accuracy: 91.65%
By Digit:         1    2    3    4    5    6    7    8    9    0
False Positives:  45   67   106  79   86   67   75   152  109  49
False Negatives:  19   129  91   71   137  50   83   125  109  21

Why are Explainable Regulatory Feedback Networks (xRFN) beneficial?
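Per-digit false positives and false negatives like those above can be read off a confusion matrix. A sketch of that bookkeeping, assuming NumPy; the 4-class confusion counts are illustrative, not the MNIST results:

```python
import numpy as np

def per_class_errors(confusion):
    """confusion[i, j] = count of true class i predicted as class j."""
    diag = np.diag(confusion)
    false_neg = confusion.sum(axis=1) - diag  # class-i items missed
    false_pos = confusion.sum(axis=0) - diag  # items wrongly called i
    accuracy = np.trace(confusion) / confusion.sum()
    return false_pos, false_neg, accuracy

# Illustrative 4-class confusion matrix (not the MNIST table above).
C = np.array([[50,  2,  1,  0],
              [ 3, 45,  0,  2],
              [ 0,  1, 48,  1],
              [ 2,  0,  1, 47]])
fp, fn, acc = per_class_errors(C)   # fp=[5,3,2,3], fn=[3,5,2,3]
```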
Does the brain perform optimization during recognition?

[Figure: visual-search displays (Rosenholtz 2001); left: a single X pops out among O's; right: a single F among E's is harder to find]

How long does it take to find the single pattern?
If the brain uses optimization: it should be faster with unique patterns (like the jigsaw puzzle)
If the brain uses feedforward Y = W·X: fixed propagation, fixed amount of time
The brain takes longer in the right box, suggesting a signal-to-noise phenomenon
(taking what is commonly considered a spatial attention phenomenon and attributing it to a recognition phenomenon)

This occurs in all modalities, including audition, vision, and tactile, and even in olfaction with its poor spatial resolution (Rinberg et al. 2006), which suggests optimization during recognition is ubiquitous.
Does It Scale To Large AI?
Does Illuminated AI Consume More Resources Than Feedforward AI?

Tests on Random Data Increasing in Size:
Nodes   Features   Matrix size
10      100        1,000
100     1,000      100,000
500     1,000      500,000
1,000   10,000     10,000,000
2,000   10,000     20,000,000
6,000   12,000     72,000,000
8,000   15,000     120,000,000
9,000   20,000     180,000,000

[Figure: Computational Costs During Learning; computing time (s) vs. matrix size (millions); series: SVM Learning (W), Fastest Feedforward Learning (W), Illuminated Learning (M); SVM runs Out of Memory before the 120-million matrix]
Can Learn > 100x Faster Without Balancing Data

[Figure: Computational Costs During Recognition; computational cost per test (s) vs. matrix size (millions); series: Best Alternate AI (KNN) Without Optimization, Illuminated AI (M), Feedforward AI (W); SVM runs Out of Memory]

Accelerated on GPUs (Torch/Lua)
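A rough way to see the per-test cost trade-off: a feedforward pass is one matrix multiply, while optimization during recognition repeats a multiply per iteration. The sizes, iteration count, and update rule below are illustrative assumptions, not the talk's benchmarks:

```python
import time
import numpy as np

rng = np.random.default_rng(0)
n_out, n_in = 500, 1000                 # illustrative sizes (500k matrix)
W = rng.random((n_out, n_in))
X = rng.random(n_in)

t0 = time.perf_counter()
Y_ff = W @ X                            # feedforward: one multiply per test
t_ff = time.perf_counter() - t0

t0 = time.perf_counter()
Y = np.ones(n_out)
norm = W.sum(axis=1)
for _ in range(25):                     # feedback: a multiply per iteration
    F = W.T @ Y + 1e-9                  # reconstruct the input
    Y = Y * (W @ (X / F)) / norm        # adjust output activations
t_fb = time.perf_counter() - t0
# Per test, feedback costs roughly (iterations x one feedforward pass),
# with no training-set scan needed at test time.
```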
Also Useful for Simpler AI Such As:
Logistic Regression & Random Forests

The Current Standard For Explainability Is Decision Trees Based On Logistic Regression
30% Loss In Accuracy Occurs When Explaining Using Decision Trees
0% Loss With Adaptive Insight Using Illuminated AI
(In FinTech, Medicine, Government, ...)
Understanding Decisions at a Glance
[Figure: per-case histograms of the Number of Factors that Hinder vs. Help the Case]

Case #1: Score 0.75, Not Approved
  Most Hindering Factor: E, with value 66.2 (26.8 below expected)
Case #2: Score 0.81, Borderline Approved
  Most Helping Factor: G, with value -8.8 (5.8 above expected)
Case #3: Score 0.98, Strongly Approved
  Most Helping Factor: M, with score 87.0 (12.2 above expected)
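The "above/below expected" readout follows directly from treating weights as expectations: each factor's contribution is its observed value minus the value the model expects. A sketch; the expected values are back-computed from the slide's stated deltas, and the help/hinder sign convention is an assumption:

```python
# Each factor helps the case when its value is above the model's
# expectation and hinders when below. Expected values here are
# back-computed from the slide's deltas (e.g. 66.2 is 26.8 below 93.0).
expected = {"E": 93.0, "G": -14.6, "M": 74.8}
observed = {"E": 66.2, "G": -8.8,  "M": 87.0}

deviation = {k: round(observed[k] - expected[k], 1) for k in expected}
helps   = {k: d for k, d in deviation.items() if d > 0}
hinders = {k: d for k, d in deviation.items() if d < 0}

most_helping   = max(helps, key=helps.get)      # "M", 12.2 above expected
most_hindering = min(hinders, key=hinders.get)  # "E", 26.8 below expected
```

Sorting factors by these signed deviations is all the histogram view needs; no second explanatory model is trained.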
Comparison:

Structure (during recognition):
"Feedforward": Inputs X → Weights W → Outputs Y
Illuminated (Feedforward-Feedback): Inputs X ↔ Weights M ↔ Outputs Y

                         | "Feedforward"    | Illuminated
Explainable?             | No               | Yes
Easy to Learn & Update?  | No               | Yes
Optimization             | During Learning  | During Recognition
Collective Benefits Enabling Wider AI Adoption
Company: Reduce development costs and time; Increase trust, adoption, and flexibility
Users: Better understanding and trust of the AI's decision process
Developers: Less guessing; easier debugging and updating
Regulators: Understanding of the AI's goals, compromises, and decision process
Offerings
1) Illuminate Your AI
• Convert & Explain Any Feedforward AI
• Boost: Internal Development & Quality Assurance
• Assist Your Regulators: FDA, AMA, DMV, FTC ...
2) Train or Update Your AI Fast Without Rehearsal*
• Faster Learning - 100x
• Less Data Cleaning
• User Personalization
• Easier Update
* for Certain Feedforward AI