Your Instructor

Greg Nelson, CPHIMS, MMCi
Vice President, Analytics & Strategy – Vidant Health
• 25+ year Analytics/Data Science Expert
• Adjunct Faculty at Duke University (Fuqua School of Business)

Education:
• B.A. in Psychology from the University of California, Santa Cruz
• Master of Management in Clinical Informatics from Duke University’s Fuqua School of Business
• Ph.D.-level work in Cognitive Social Psychology at the University of Georgia

Interests: Woodworking, riding his Harley, STEM education, learning

Author: The Analytics Lifecycle Toolkit (Wiley, 2018)

Data Visualization Best Practices - NCHICA



AI 101: Introduction

An NCHICA Workshop

Gregory S. Nelson, MMCi, CPHIMS Vice President, Analytics & Strategy Vidant Health

Artificial Intelligence

Identify & Learn

Patterns

Understand Context

Recognize Things

the science of making computers do things that require intelligence when done by humans (Copeland, 2000)

Image/ Context Tagging

Computer:
• Tiny humans
• Tiny humans running
• Tiny humans running with baskets
• Tiny humans running with baskets eggs

Person: Easter Egg Hunt

Image Detection

Facial Recognition

Source: Medium. Adam Geitgey

Machine Learning: Supervised vs. Unsupervised

• Under machine learning, there are different tasks.

• A task is a specific objective for your machine learning algorithms.

• The two most common categories of tasks are supervised learning and unsupervised learning.

• Supervised learning includes tasks for labeled data. In practice, it's often used as an advanced form of predictive modeling (make predictions).

• Unsupervised learning includes tasks for unlabeled data. In practice, it's often used either as a form of automated data analysis or automated signal extraction (extract the underlying structure).
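To make the unlabeled case concrete, here is a minimal sketch of unsupervised learning: a one-dimensional k-means that groups values without being given any labels. The data (hypothetical lengths of stay in days), the function name `kmeans_1d`, and the choice of k-means itself are illustrative assumptions, not something the slides prescribe.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Group numbers into k clusters (k >= 2); no labels are supplied."""
    ordered = sorted(values)
    # Spread the starting centers across the sorted data (simple, deterministic start).
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Assign each value to its closest center.
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its members (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical lengths of stay in days; nobody has labeled them "short" or "long".
lengths_of_stay = [1, 2, 2, 3, 1, 14, 15, 13, 16, 2]
centers, clusters = kmeans_1d(lengths_of_stay, k=2)
print(centers)
print(clusters)
```

The algorithm is never told which stays are "short" or "long"; the two groups emerge from the structure of the data, which is the sense in which unsupervised learning extracts the underlying structure.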

generalization

representation

Example – Supervised Learning

Start with a question: Which patients will visit the ER?

Analyze a historical data set that includes predictive attributes and a known answer.

[Figure: patient profile (overweight, unmarried, income < $50,000) = known outcome]

[Figure: patient profile (normal weight, married, income > $50,000) = known outcome]

Split the historical data into Training and Test sets.

End with a prediction: Which patients will visit the ER?
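The supervised workflow on these slides (historical data with a known answer, a training/test split, then a prediction) can be sketched in a few lines. Everything specific here is an assumption for illustration: the twelve patient records are made up, the three binary features loosely mirror the attributes on the slides, and the 3-nearest-neighbor vote is a stand-in for whatever algorithm a real project would choose.

```python
from collections import Counter

# Hypothetical history: (overweight, unmarried, income_under_50k) -> visited_er
history = [
    ((1, 1, 1), 1), ((1, 1, 1), 1), ((1, 1, 0), 1), ((1, 0, 1), 1),
    ((0, 0, 0), 0), ((0, 0, 0), 0), ((0, 1, 0), 0), ((0, 0, 1), 0),
    ((1, 1, 1), 1), ((0, 0, 0), 0), ((1, 0, 0), 0), ((0, 1, 1), 1),
]

# Hold out the last third as a test set that is not used for training.
split = len(history) * 2 // 3
train, test = history[:split], history[split:]


def hamming(a, b):
    """Number of features on which two patients differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(train_rows, features, k=3):
    """Majority vote among the k most similar training patients."""
    nearest = sorted(train_rows, key=lambda row: hamming(row[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Score the model only on patients it has not seen.
accuracy = sum(predict(train, f) == y for f, y in test) / len(test)
print(f"test accuracy: {accuracy:.2f}")
```

Scoring on held-out records rather than on the training rows is the point of the split: it estimates how well the learned mapping generalizes to new patients.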


Poll Question: Example: Diagnosis

• Question: What type of ML problem is this?

• Patient 21 has not been diagnosed yet but is exhibiting symptoms of stuffy nose, sneezing, and sore throat.

• Using only the data in Table 1, rank the three diagnoses (Cold, Flu, and Allergies) in order of how likely Patient 21 is to have each.

Source: http://www.puzzlor.com/2010-04_Patient21.html

Quiz: If we are trying to classify a blueberry muffin...

• What is the target variable?

• Which mapping function would we likely use?

• What input features would be relevant?

Correctly Classify a Blueberry Muffin

Target variable (y): Blueberry Muffin = Yes

Mapping function: f

Input features (x): color, “blueberries”, shape, hue, size

y = f(x) + ε

Quiz: If we are trying to predict the risk of readmission...

• What is the target variable?

• Which mapping function would we likely use?

• What input features would be relevant?

Predict Readmission

Target variable (y): Risk of Readmission

Mapping function: f

Input features (x): age, Charlson Comorbidity Index, drug/alcohol use, BMI, marital status, previous inpatient visits, previous ED visits, race, Tabak Mortality Score

y = f(x) + ε
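The "mapping function" in the two quizzes is the f that turns input features into the target. As a rough sketch for the readmission case, a logistic function is one common choice when the target is a risk between 0 and 1. The feature names, weights, and intercept below are hypothetical and hand-picked; a real model would learn them from historical data.

```python
import math

# Hypothetical weights for illustration only; a real model learns these from data.
WEIGHTS = {"age_over_65": 0.8, "charlson_index": 0.4, "prior_ed_visits": 0.3}
INTERCEPT = -3.0


def readmission_risk(patient):
    """f(x): map a patient's input features to a risk between 0 and 1."""
    score = INTERCEPT + sum(WEIGHTS[name] * patient[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-score))


low = readmission_risk({"age_over_65": 0, "charlson_index": 1, "prior_ed_visits": 0})
high = readmission_risk({"age_over_65": 1, "charlson_index": 5, "prior_ed_visits": 4})
print(round(low, 3), round(high, 3))
```

The same shape applies to the muffin quiz: only the features and the target change, while the learning algorithm's job is to find good weights.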

Signal vs. Noise

Signal:

• Larger houses are more expensive than smaller houses

• Underpaid employees tend to leave

• A self-driving car sees a red light at an intersection

Noise:

• Some smaller houses are more expensive than their larger counterparts

• Some underpaid employees stay

• The glare from the sun is also captured by the camera

One way to judge models is by their ability to separate the signal from the noise…

Signal represents our “true” relationship between the input features and the target variable.

What does a good model do for us?

Different machine learning algorithms are simply different ways to estimate the signal.
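A small simulation can illustrate the house-price example. It assumes a "true" signal (price rises by 100 per unit of size), adds random noise, and then checks how well an ordinary least-squares line recovers the signal. The slope, intercept, noise level, and sample size are all made-up values for the sketch.

```python
import random

random.seed(0)

# Assumed "true" signal for the sketch: price rises 100 per unit of size.
TRUE_SLOPE, TRUE_INTERCEPT = 100.0, 20000.0

sizes = [random.uniform(50, 250) for _ in range(200)]
# Observed price = signal + noise (location, condition, luck, ...).
prices = [TRUE_INTERCEPT + TRUE_SLOPE * s + random.gauss(0, 3000) for s in sizes]

# Ordinary least squares for a single feature.
mean_x = sum(sizes) / len(sizes)
mean_y = sum(prices) / len(prices)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices)) / sum(
    (x - mean_x) ** 2 for x in sizes
)
intercept = mean_y - slope * mean_x

# Noise shows up as exceptions: a smaller house that costs more than the next larger one.
pairs = sorted(zip(sizes, prices))
exceptions = sum(1 for a, b in zip(pairs, pairs[1:]) if a[1] > b[1])

print(f"estimated slope: {slope:.1f} (true signal: {TRUE_SLOPE})")
print(f"adjacent pairs where the smaller house costs more: {exceptions}")
```

Individual houses break the rule, yet the fitted slope should land close to the assumed signal; a model that chased every exception would be fitting the noise instead.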

Copyright © SAS Institute Inc. All rights reserved.

In Machine Learning we Create Mappings from Input to Output

Learning | Automation | Benefit
Images | Is this you? | Increased Security
Transactions | Is it Medicaid fraud? | Lowered Risk
ICU Patient | Will they readmit? | Improved quality
Medical Images | Is this cancer? | Better Outcomes
Breathing sounds | Is this sleep apnea? | Cost/Outcome
Emails | Is it spam? | Better Experience

Common AI Methods

Learning

Built for Purpose

Machine Learning

Natural Language Processing

Where we are going…

Built for Scale

The Modern AI Journey

Deep Learning

Cognitive Computing

From Data to Action….

Source: Gartner

Experience: Uses and value of data

Source: The Jurney–Warden data-value pyramid (Agile Data Science 2.0)

Actions: Drive, value, effect, alter, change, deliver

Predictions: Curate, recommend, understand, infer, learn

Reports: Structure, link, metadata, tag, explore, interact, share

Charts: Clean, aggregate, visualize, question

Records: Collect, display, plumb individual records

The data-value stack begins with the simple display of records.

Next come charts, where we extract enough structure from our data to display its properties in aggregate and start to familiarize ourselves with those properties.

Next comes identifying relationships and exploring data through interactive reports.

This enables statistical inference to generate predictions.

Finally, we use these predictions to drive user behavior in order to create and capture value.

Source

Streaming: • HIE • Sensors • Devices • Monitors • Audio • Video • IoT Logs

Clinical and Operational: • Epic EHR • PeopleSoft ERP • Epic RCM • PeopleSoft SCM • Pop Health

Other/Scientific: • Omics • Images • Biologics • Clinical Trials • Documents • Study Sets

External: • CIN • SDoH • Benchmarks • Vendor enriched • Claims • Patient Reported • Public Data

IoT Platform

APIs

Curate and Enrich

Healthcare Data Curation and Enrichment Hub

Organize

• Normalization • Standardization • Longitudinal Patient Record • De-identification • Rules management • Identity management • Security • Compliance • Semantic management

Logical Data Warehouse

Manage and Govern (Data Governance, Data Quality, MDM/Reference data, Data Virtualization, DataOps, Data Security, Privacy and Identity)

Prepare

Analyze and Consume

• Reports • Dashboards • Adhoc Query • Data Discovery • Visualization • Predictive Models • Data science • Machine Learning • Deep Learning • Cognitive • Robotics • Automation

SQL

On Premise

Cloud

Hadoop

Data Lakes

Virtualized

Maker Space, Departmental APPS

Blob

Real-time Algorithms

APIs

Clinical and Operational Workflows

Data Services • Ontologies • Data Catalog • Analytics Pipeline

Self-Service Data Preparation

• Advanced Models

• Research Studies

• Registries

• Clinical Trials

Acquire

Organize

Analyze

Govern and Protect


Data Science Landscape…

Source: The Analytics Lifecycle Toolkit – Gregory S. Nelson Wiley, 2018

Artificial Intelligence

Machine Learning

Deep Learning

Data Mining

Statistics

Data Science

Probabilistic Reasoning • Machine Learning • Predictive Modeling • Deep Learning • Bayesian nets • Decision trees • …

Computational Logic • Logic programming • Rule based systems • Heuristic techniques • Case-based reasoning • Fuzzy Logic • …

Optimization Techniques • Constraint-based • Linear programming • Genetic algorithms • Operations research • …

Natural Language Processing

Knowledge Representation, Learning, and Search

(AI + Pathologist) > Pathologist | AI

Data Sourced From: https://blogs.nvidia.com/blog/2016/09/19/deep-learning-breast-cancer-diagnosis/

[Figure: diagnostic workflows compared]

Patient → Primary Sample → Tissue Blocks → Slides → Pathologist: 96.5% Accuracy

Scanned Slides → Software: 97.1% Accuracy

Patient → Primary Sample → Tissue Blocks → Slides → Scanned Slides → Software + Pathologist: 99.5% Accuracy
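One way to read these figures is in terms of error rate rather than accuracy. The short calculation below assumes the flattened slide layout means 96.5% for the pathologist alone and 99.5% for pathologist plus software; under that reading, the combination removes most of the human error.

```python
def error_rate(accuracy_pct):
    """Convert an accuracy percentage into an error percentage."""
    return 100.0 - accuracy_pct


# Assumed reading of the slide: pathologist alone vs. pathologist + software.
pathologist, combined = 96.5, 99.5

# Relative reduction in error when the pathologist is assisted by the software.
reduction = (error_rate(pathologist) - error_rate(combined)) / error_rate(pathologist)
print(f"{reduction:.1%}")  # prints "85.7%"
```

A three-point gain in accuracy looks small, but it corresponds to cutting the error rate from 3.5% to 0.5%.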

… to demonstrate how artificial intelligence tools can be used to predict unplanned hospital and skilled nursing facility admissions and adverse events… in testing innovative payment and service delivery models

Source: CMS.gov (March 28, 2019)

$1.65M Risk of inaction …

The U.S. Food and Drug Administration (FDA)

“AI is a device or product that can imitate intelligent behavior or mimics human learning and reasoning.”

The FDA approved the first deep learning algorithm for cardiac imaging, built by Arterys, in 2017. Just a year later, the agency had cleared another 12 smart algorithms in healthcare.

The Promise of AI…

Detect sleep apnea from the sound of someone’s voice

Algorithm can detect pneumonia better than radiologists

AI detects cancer better than docs

Algorithm Predicts if Twitter Users Are Becoming Mentally Ill ...

Fear and Uncertainty

What happens if the model is biased?

How can I trust a black box?

How did you validate the model?

What happens when the model is wrong?

Analysis “Gotchas”

Methodological

Statistical Analysis

Interpretation and Communication

Results

Operationalization

Actionability

Thinking and Intelligence

Cognitive Biases

By 2022, the first U.S. medical malpractice case involving a medical decision made by an advanced AI algorithm will have been heard. It will not be because an algorithm produced an incorrect diagnosis. It will be due to the failure to use an algorithm that was proven to be more accurate and reliable than the human alone.

Source: Gartner D&A Summit, March, 2019

AI Presents New Challenges…

Data Quality and Provenance: Data needs to be prepared and reliable

Tied to Value: Identify use cases that can show an ROI (don’t boil the ocean)

Trust: Trust begins with transparency and accountability

Governance: Must include legal, regulatory, and ethical oversight

Data & Analytics Literacy: Develop competencies across the enterprise

“The black-box problem also poses issues for physicians, who lack insight into what the AI is actually doing. It’s not that they’re afraid of being replaced; it’s more that they’re afraid of basing decisions on information they can’t see”

Source: Modern Healthcare https://www.modernhealthcare.com/indepth/artificial-intelligence-in-healthcare-makes-slow-impact/

Who is Accountable? Is AI ethical?

How much can we trust the data sources we use?

Do we trust the libraries, services and APIs that deliver algorithms and models?

How can we demonstrate trustworthiness of the outcomes?

Are we clear about the meaning of the data? Do we understand data transformations within pipelines?

How does our solution conform to regulatory requirements and business constraints?

What kind of explanation does this output require if any?

Have we considered alternative data sources for a more complete picture? Are there any implications due to incomplete data?

Do we know what algorithms to use for what problem? Have we found adversarial examples to invalidate the model?

Are there any cultural differences in consuming the outputs? Did we engage relevant experts to validate outputs?

Analytics Product Validation Questions to ask ourselves throughout the process

Product: Are we building the right product?

Process: Are we building the product right?

Discussion

Fraud

Healthcare Fraud Detection

Care Pathways

Adaptive Treatment Planning

Patient Flow

Patient Flow Management

Augmented Intelligence

Computer Assisted Diagnosis

What are the (a) risks and (b) impact of getting these wrong?

Risk Based Approach to Validation

• What can go wrong?

• What is the impact of getting it wrong?
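The two questions above can be turned into a simple ranking. The sketch below uses a common risk-matrix convention (score = likelihood × impact), which is an assumption here and not a method named on the slides. The use cases come from the discussion slide; the 1 to 5 ratings are hypothetical placeholders that a governance group would replace with its own.

```python
# Hypothetical 1-5 ratings for illustration only; real ratings come from governance review.
use_cases = {
    "Healthcare Fraud Detection": {"likelihood": 3, "impact": 2},
    "Adaptive Treatment Planning": {"likelihood": 2, "impact": 5},
    "Patient Flow Management": {"likelihood": 3, "impact": 3},
    "Computer Assisted Diagnosis": {"likelihood": 3, "impact": 5},
}


def risk_score(rating):
    """What can go wrong (likelihood) times the impact of getting it wrong."""
    return rating["likelihood"] * rating["impact"]


# Highest-risk use cases first: these get the most validation effort.
ranked = sorted(use_cases, key=lambda name: risk_score(use_cases[name]), reverse=True)
for name in ranked:
    print(f"{risk_score(use_cases[name]):>2}  {name}")
```

The ranking is only as good as the ratings, so its main value is forcing an explicit conversation about likelihood and impact for each AI product.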

Extend D&A Governance to AI

Healthcare Bots

Internal

• ICD10 code bot

• IT staff automation

• EPIC help bot (3rd shift app support)

• HR data management/new hire

Patient Engagement

• Symptom Checker/Triage

• Prescription Refill

• Personalized care scenarios

• Wait time and facility locator

Assistive Robotics

Virtual Pets

Assistive ???

AI Governance & Adoption

Machine Learning versus Deep Learning

Deep Learning

• The Big Idea: A type of machine learning that trains a computer to perform human-like tasks, such as recognizing speech, identifying images, or making predictions. Instead of organizing data to run through predefined equations, deep learning sets up basic parameters about the data and trains the computer to learn on its own by recognizing patterns using many layers of processing.

• Relationships: Deep learning is one of the foundations of cognitive computing.

Source: https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/
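"Many layers of processing" can be pictured as data flowing through a stack of simple transformations. The sketch below shows only the forward pass of a tiny three-layer network; the weights are hypothetical and hand-picked, whereas a real deep learning system would have far more layers and units and would learn its weights from data.

```python
import math


def relu(values):
    """Keep positive signals, zero out negative ones."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One layer: each output is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


# Hypothetical, hand-picked weights; training would adjust them to fit data.
LAYERS = [
    ([[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]], [0.0, 0.1, 0.0]),  # 2 inputs -> 3 units
    ([[0.7, -0.5, 0.2], [0.3, 0.9, -0.4]], [0.0, 0.0]),  # 3 units -> 2 units
    ([[1.0, -1.0]], [0.0]),  # 2 units -> 1 output
]


def forward(inputs):
    """Pass the inputs through every layer in turn."""
    activations = inputs
    for weights, biases in LAYERS[:-1]:
        activations = relu(dense(activations, weights, biases))
    (logit,) = dense(activations, *LAYERS[-1])
    return 1.0 / (1.0 + math.exp(-logit))  # squash the result to a 0-1 score


output = forward([1.0, 2.0])
print(output)
```

Each layer re-represents the output of the one before it, which is how deeper networks can build up from simple patterns to more abstract ones.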

DEFINE

Stakeholder analysis

Requirements gathering & elicitation

Problem definition

Question design

Expected benefit

EXPLORE

Exploration of data (breadth & depth)

Data visualization (explore)

Identification of data relationships

Documentation of dataset culture

Generation of descriptive statistics

IDENTIFY

Data extraction

Data integration

Data transformation

ANALYZE

Statistical analysis

Hypothesis testing

Enrichment options

Modeling

PRESENT

Data visualization (inform)

Storyboarding

Results presentation

ROI calculation

Documentation

OPERATIONALIZE

Workflow impact

End-user training

Analytic product calibration

Maintenance

Retuning and improvement

Analytics Product Lifecycle Management

Source: The Analytics Lifecycle Toolkit, Nelson, G. S. Copyright material used with permission

Key Concepts: Summary

Artificial Intelligence: Systems that mimic or replicate human intelligence (and do intelligent things)

Natural Language Processing: Systems that understand and generate language

Natural Language Understanding: Systems that can understand language (voice and text)

Natural Language Generation: Systems that can generate language

Machine Learning: Systems that can learn from experience

Deep Learning: Systems that use deep neural networks on big data

AI Use Cases

Patterns or classes of AI problems...

Algorithmic Medicine: Clinical algorithms to drive medical practice

AI Healthcare Advisors: Diagnose and treat diseases

Rev-Cycle/Efficiency: NLP + ML to identify revenue opportunities

Diagnostic Interpretation: Efficient and accurate readings of imaging studies

Robotic Process Automation: Automation of repetitive tasks

Virtual Care: Real-time remote monitoring and alerting

Virtual Personal Health Assistants: Augmented reality, cognitive computing, sentiment analysis, speech recognition, NLU/NLG

Questions and Answers

@gregorysnelson

linkedin.com/in/gregorysnelson

[email protected]

919.931.4736

Contact