Machine Learning April 19, 2016 Roadmap to Constructing A Top Down Machine Learning Paradigm

2016 04-19 machine learning

Page 1: 2016 04-19 machine learning

Machine Learning
April 19, 2016

Roadmap to Constructing a Top-Down Machine Learning Paradigm

Page 2: 2016 04-19 machine learning

Introduction to Southwestern Energy

Southwestern Energy Company (NYSE: SWN) is a leading natural gas and oil company with operations predominantly in the United States, engaged in exploration, development and production activities, including related natural gas gathering and marketing.

Source: http://www.swn.com/

Page 3: 2016 04-19 machine learning

Machine Learning, Deep Learning, AI

Roadmap to Constructing a Top-Down Machine Learning Paradigm

E&P organizations are turning more attention to accumulated data to enhance operating efficiency, safety, and recovery. The computing paradigm is shifting, the O&G paradigm is shifting, and the rise of the machine learning paradigm requires careful attention to top-down integrated systems engineering. A system approach will be presented to stimulate out-of-the-box thinking to address the machine learning paradigm.

Page 4: 2016 04-19 machine learning

The Shifting O&G Paradigm

Past Paradigm Shifts
• Seismic
• Horizontal Drilling
• Off Shore
• Factory Drilling

Paradigm Shifts in Process
• Big Crew Change
• Mobility (anytime, anywhere)
• Big Data
• Machine Learning

Source: Mark Reynolds, compilation

Page 5: 2016 04-19 machine learning

Paradigms We Are Discussing Today

Changing Paradigms
• Computing Paradigm (4th Paradigm / eScience)
• O&G Paradigm (Shale 2.0)

New Paradigms
• Machine Learning Paradigm

Page 6: 2016 04-19 machine learning

The Structure of Scientific Revolutions

• Normal Science
– Equilibrium, harmony
• Model Drift
– Outliers cease to be outliers
– Ripples turn to discontinuity
• Model Crisis
– Alternate methods permitted
– Out-of-the-box reconsidered
• Model Revolution
– New model becomes the new-normal
• Paradigm Change
– (Textbooks play catch-up)

Source: Thomas Kuhn (1962), The Structure of Scientific Revolutions, University of Chicago Press; Mark Reynolds, compilation

[Figure: Kuhn Cycle: Normal Science, Model Drift (Anomaly), Model Crisis, Model Revolution, Paradigm Change]

Page 7: 2016 04-19 machine learning

The Shifting Computing Paradigm

Traditional Science
• Empirical (1000s of years): Descriptive and Formulaic
• Theoretical (100s of years): Hypothetical and Investigative
• Computational (10s of years): Expertise Driven Models and Cases

eScience
• Data Exploration (years): Multivariant Differential Modelling

Source: Mark Reynolds, compilation

Page 8: 2016 04-19 machine learning

The Shifting Computing Paradigms

• Empirical: O&G is where we found it
• Theoretical: O&G is where we expect it
• Computational: O&G is where we estimate it
• Data Exploration: O&G is where we infer it

Source: Mark Reynolds, compilation

Page 9: 2016 04-19 machine learning

The Machine Learning Paradigm

“A computer program is said to learn from experience (E) with respect to some class of tasks (T) and performance measure (P), if its performance at tasks in T, as measured by P, improves with experience E.”

~Tom Mitchell

Source: Mitchell, T. (1997). Machine Learning. McGraw Hill; Mark Reynolds, compilation

Machine Learning is the “Extraction of Wisdom by Understanding the underlying Data”
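
Mitchell's definition can be made concrete with a toy example. The sketch below is an illustration only and is not from the presentation: the task T (classify a reading against a hidden cutoff), the performance measure P (accuracy on a fixed test set), the experience E (a stream of labeled readings), and all names and values are assumptions chosen for the example.

```python
# Hypothetical task T: decide whether a reading is at or above a hidden cutoff.
HIDDEN_CUTOFF = 62  # unknown to the learner; used only to label the examples


def learn_cutoff(examples):
    """Estimate the cutoff as the midpoint between the largest 'below'
    reading and the smallest 'at or above' reading seen so far."""
    below = [x for x, is_high in examples if not is_high]
    above = [x for x, is_high in examples if is_high]
    low_edge = max(below) if below else 0
    high_edge = min(above) if above else 100
    return (low_edge + high_edge) / 2


def accuracy(cutoff_estimate, test_readings):
    """Performance measure P: fraction of test readings classified correctly."""
    correct = sum(
        (x >= cutoff_estimate) == (x >= HIDDEN_CUTOFF) for x in test_readings
    )
    return correct / len(test_readings)


# Experience E: a stream of labeled readings (made-up values).
stream = [(x, x >= HIDDEN_CUTOFF) for x in (10, 95, 30, 80, 55, 70, 60, 65)]
test_readings = range(100)

# Measure P after 2, 4, 6, and 8 examples of experience.
scores = [accuracy(learn_cutoff(stream[:n]), test_readings) for n in (2, 4, 6, 8)]
print(scores)  # prints [0.91, 0.93, 0.99, 0.99]
```

On this toy data, P never decreases as E grows, which is the sense of "learning" in the quotation. Real data gives no such guarantee at every step.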

Page 10: 2016 04-19 machine learning

Machine Learning in the 4th Paradigm

The Catalyst
• Data captured by instruments
• Data generated by simulations
• Data acquired by sensor networks

The Destination
• Solutions from data analysis
• Solutions from data mining
• Solutions from visualization
• Solutions from drill down
• Solutions for bottom line
• Solutions using eScience

“eScience is the set of tools and technologies to support data federation and collaboration”

~ Jim Gray

Source: Mark Reynolds, compilation; eScience and the Fourth Paradigm: Data-Intensive Scientific Discovery and Digital Preservation, Tony Hey, Microsoft Research, http://www.alliancepermanentaccess.org/wp-content/uploads/2011/12/apa2011/15_%28Nov11%29TonyHey-APA%20Meeting.pdf

Page 11: 2016 04-19 machine learning

Precursors to Machine Learning

Predictive Analytics
• Focuses on Prediction
– Based on Known Properties
– Learned from Training Data

Data Mining
• Focuses on Discovery
– Unknown Properties in Data
– The Analysis Phase of Knowledge Discovery

Machine Learning is the “Extraction of Wisdom by Understanding the underlying Data”

~Mark Reynolds

Source: Mark Reynolds, compilation

Page 12: 2016 04-19 machine learning

The Machine Learning Paradigm

[Figure labels: Unsupervised Learning, Supervised Learning, Semi-Supervised Learning, Reinforcement Learning; 24/7; Predictive Analytics, Data Mining, Machine Learning, AI]

Source: Mark Reynolds, compilation

Page 13: 2016 04-19 machine learning

Principal Concepts in Machine Learning

• Unsupervised Learning
– Data is unlabeled
• Supervised Learning
– Teach and train with data that is well labeled with a defined output
• Reinforcement Learning
– Validity of data alignment serves as feedback
• Semi-Supervised Learning
– Some of the data is labeled, some is unlabeled

Source: Mark Reynolds, compilation
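
The difference between the supervised and unsupervised cases above can be sketched on one small data set. This is a minimal illustration, not the presenter's method: the readings, the "low"/"high" labels, and the function names are all hypothetical. The supervised learner averages the readings that share a label; the unsupervised learner (a one-dimensional k-means with two clusters) sees only the readings.

```python
def fit_centroids(values, labels):
    """Supervised: average the values that share a label."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {label: sum(vs) / len(vs) for label, vs in groups.items()}


def predict(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def two_means(values, iterations=10):
    """Unsupervised: 1-D k-means with k=2. No labels are used.
    Assumes the values are not all identical."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        low_group = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high_group = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo = sum(low_group) / len(low_group)
        hi = sum(high_group) / len(high_group)
    return lo, hi


# Made-up readings; labels exist only for the supervised case.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["low", "low", "low", "high", "high", "high"]

centroids = fit_centroids(values, labels)   # learned from labeled data
cluster_lo, cluster_hi = two_means(values)  # discovered without labels
print(predict(centroids, 1.1), round(cluster_lo, 3), round(cluster_hi, 3))
```

Both approaches find the same two groups here, but only the supervised model can name them. Semi-supervised learning sits between the two, and reinforcement learning replaces labels with feedback on outcomes; neither is sketched here.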

Page 14: 2016 04-19 machine learning

Examples of Machine Learning

Domestic
• Nest® Thermostats
• Pandora / Amazon
• Spam Detection
• Fraud Detection
• Traffic Light Duty Cycle
• Google

Upstream O&G
• Pump-Jack Duty Cycle (circa 1986)
• Closed Loop Directional Drilling (circa 2009)

Page 15: 2016 04-19 machine learning

The Bridge Into Machine Learning

Today Tomorrow

Integrated Systems Engineering

Page 16: 2016 04-19 machine learning

Integrated Systems Engineering

Systems & Knowledge Engineer

O&G Systems

Control Systems

Remote Systems

Information Systems

Embedded Systems

Robotic Systems

Data Fusion

Real-Time Systems

Look-Back Analysis

Look-Ahead Systems

Land and Regulatory

Geology Geophysics

Drilling Engineering

Completion Engineering

Production Engineering

Reservoir Engineering

Systems Engineering

Source: Mark Reynolds, compilation

Page 17: 2016 04-19 machine learning

Integrated Engineering – Top-Down

• Engineering the Source
– Signals, content, and characterizations
• Engineering the Data
– Address errant data
– Address valid spurious data
– Address data quality
• Engineering the Store
– Repository
– Recall and Reporting
– Representations

Data Acquisition

Data Transmission

Data Retention

Data Analysis

Data Reduction

Source: Mark Reynolds, compilation
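
"Engineering the Data" separates errant data from valid spurious data. One possible reading, sketched below, is that errant readings are physically impossible (for example a sensor fault code), while valid spurious readings are plausible but far from the rest and should be kept and flagged. That interpretation, the function name `triage`, the limits, and the readings are assumptions for illustration only.

```python
from statistics import median


def triage(readings, physical_min, physical_max, spike_factor=3.0):
    """Label each reading 'errant', 'spurious', or 'ok'.

    errant:   outside the physical limits of the sensor.
    spurious: within limits, but more than spike_factor median absolute
              deviations away from the median of the plausible readings.
    """
    plausible = [r for r in readings if physical_min <= r <= physical_max]
    if not plausible:
        return ["errant"] * len(readings)
    center = median(plausible)
    spread = median([abs(r - center) for r in plausible])
    labels = []
    for r in readings:
        if not physical_min <= r <= physical_max:
            labels.append("errant")
        elif spread > 0 and abs(r - center) > spike_factor * spread:
            labels.append("spurious")
        else:
            labels.append("ok")
    return labels


# Made-up pressure readings: -9999 mimics a fault code, 3900 a real spike.
readings = [2500, 2510, 2495, -9999, 2505, 3900, 2490]
labels = triage(readings, physical_min=0, physical_max=10000)
print(labels)
# prints ['ok', 'ok', 'ok', 'errant', 'ok', 'spurious', 'ok']
```

The median-based spread is used here because a single spike barely moves it, whereas a mean and standard deviation would be pulled toward the spike being tested.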

Page 18: 2016 04-19 machine learning

Integrated Engineering – Top-Down

• Engineering the Store
– Data distribution
– Data staging
• Engineering the Recall
– Simple query
– Cube v Matrix
• Engineering the Use Case
– Destination: human
– Destination: machine

Classification

Regression

Clustering

Density Estimation

Dimensional Reduction

Page 19: 2016 04-19 machine learning

Integrated Engineering – System Flow

Acquire → Analyze → Annunciate → Archive → Analyze → Anticipate → Apply

• Data
• Information: Visualization
• Knowledge: Forensics
• Understanding: Analysis & Mining
• Wisdom: Anticipating
• Application

• Creating Informational Accessibility and Transparency
• Discovering Experiential Performance Improvements
• Segmenting Processes and Process Results
• Replacing Human Decision w/ Automated Algorithms
• Innovating New Models, Products, Services

Source: Mark Reynolds, compilation

Page 20: 2016 04-19 machine learning

Integrated Engineering – Top-Down

• Intelligence during operations (Observation and Anticipation)
• Intelligence reviewing operations (Forensic)
• Intelligence planning operations (Historical and Analytical)

[Figure labels: Data Quality, Data Integrity, Data Validation, Data Modeling, Data Security, Data Mining, Data Analytics; Proactive & Closed-Loop Systems, Mining and Analytics, Forensics, Control, Visualization and Observation, Source Capture and Utilization; Well Plan, RT Prod, RT Drill, Geo-steer, RT Frac, Daily Rpts, AFE]

Source: Mark Reynolds, compilation

Page 21: 2016 04-19 machine learning

Applied Machine Learning 101

Learning (Phase 1): Training Data → Pre-Processing → Learning → Error Analysis → Model

Prediction (Phase 2): New Data → Model → Predictable Result
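
The two phases on this slide can be sketched end to end with a very small model. The example below is an illustration under assumptions: it uses a least-squares straight line as the "model", and the data, the hold-out set, and every function name are hypothetical rather than taken from the presentation.

```python
def preprocess(rows):
    """Pre-processing: drop rows with a missing feature or target."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def learn_line(rows):
    """Learning: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Prediction: apply the learned model to a new input."""
    slope, intercept = model
    return slope * x + intercept


def mean_abs_error(model, rows):
    """Error analysis: average absolute miss on data not used for learning."""
    return sum(abs(predict(model, x) - y) for x, y in rows) / len(rows)


# Phase 1: learning (made-up training data with one incomplete row).
training = [(0, 1.0), (1, 3.0), (None, 4.0), (2, 5.0), (3, 7.0), (4, 9.0)]
holdout = [(5, 11.0), (6, 13.5)]
model = learn_line(preprocess(training))
error = mean_abs_error(model, holdout)

# Phase 2: prediction on new data.
result = predict(model, 10)
print(model, error, result)  # prints (2.0, 1.0) 0.25 21.0
```

Keeping the hold-out rows out of the learning step is what makes the error figure meaningful; an error measured on the training rows themselves would be zero here and would say nothing about new data.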

Page 22: 2016 04-19 machine learning

Representative Algorithms

• Decision Tree Learning
– Maps observation to conclusions
• Association Rule Learning
– Discovering interesting relations
• Artificial Neural Networks
– Incremental function modules
• Inductive Logic Programming
– Rule based representations for input --> output
• Support Vector Machines
– Classification and regression
• Clustering
– Assignment of observations to clusters
• Bayesian Networks
– Probabilistic models correlating variables
• Reinforcement Learning
– Finds policy to map states to desired outcome
• Representation Learning
– Principal component analysis
• Similarity & Metric Learning
– Pairs of examples train others
• Sparse Dictionary Learning
– Datum as linear combinations
• Genetic Algorithms
– Mimics natural heuristics
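
As one concrete instance from the list above, association rule learning scores a candidate rule "A implies B" by its support (how often A and B occur together) and its confidence (how often B occurs when A does). The sketch below computes both for a single rule. The event names and records are invented for illustration, and a full rule miner (for example Apriori) would also search over candidate rules, which is not shown.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    support    = fraction of transactions containing both items
    confidence = fraction of antecedent transactions that also contain
                 the consequent (0.0 if the antecedent never occurs)
    """
    total = len(transactions)
    with_antecedent = sum(1 for t in transactions if antecedent in t)
    with_both = sum(
        1 for t in transactions if antecedent in t and consequent in t
    )
    support = with_both / total
    confidence = with_both / with_antecedent if with_antecedent else 0.0
    return support, confidence


# Hypothetical daily event sets; the names are placeholders, not field data.
transactions = [
    {"high_vibration", "bit_wear"},
    {"high_vibration", "bit_wear"},
    {"high_vibration"},
    {"mud_loss"},
]

support, confidence = rule_stats(transactions, "high_vibration", "bit_wear")
print(support, round(confidence, 3))  # prints 0.5 0.667
```

A rule with high confidence but very low support rests on few observations, so both numbers are usually reported together.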

Page 23: 2016 04-19 machine learning

Machine Learning: Data Diversity

• Macro (or field-level)
– Spatial
– Temporal
• Pad (or offset)
– Spatial
– Temporal
• Well (or wellbore)
– Spatial
– Temporal
• External
– Uploads
– Political, Climate, etc
• The 3 Cs of Data Quality
– Consistency
– Correctness
– Completeness
– [#4] Currency
– [#5] Conformity

Source: Mark Reynolds, compilation

Data Diversity - Spatial, Temporal, Referential
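
The data quality dimensions listed above can each be turned into an automated check. The slide does not define them, so the definitions in this sketch are assumptions: completeness means required fields are present, correctness means a value is within a plausible range, currency means the record is recent, conformity means an identifier matches an agreed format, and consistency means a well is never reported on two different pads. The schema, limits, and identifier format are hypothetical.

```python
import re
from datetime import datetime, timedelta

REQUIRED = ("well_id", "pad", "pressure_psi", "timestamp")  # hypothetical schema
WELL_ID_FORMAT = re.compile(r"^W-\d{4}$")                    # hypothetical convention


def quality_flags(record, now, max_age=timedelta(hours=24)):
    """Per-record checks; later checks are skipped if fields are missing."""
    complete = all(record.get(field) is not None for field in REQUIRED)
    if not complete:
        return {"completeness": False}
    return {
        "completeness": True,
        "correctness": 0 <= record["pressure_psi"] <= 15000,
        "currency": now - record["timestamp"] <= max_age,
        "conformity": bool(WELL_ID_FORMAT.match(record["well_id"])),
    }


def consistent(records):
    """Cross-record check: each well_id should map to exactly one pad."""
    pads_by_well = {}
    for record in records:
        pads_by_well.setdefault(record["well_id"], set()).add(record["pad"])
    return all(len(pads) == 1 for pads in pads_by_well.values())


now = datetime(2016, 4, 19, 12, 0)
records = [
    {"well_id": "W-0001", "pad": "A", "pressure_psi": 2500,
     "timestamp": datetime(2016, 4, 19, 11, 0)},
    {"well_id": "well 2", "pad": "A", "pressure_psi": 99999,
     "timestamp": datetime(2016, 4, 1, 0, 0)},
    {"well_id": "W-0001", "pad": "B", "pressure_psi": 2510,
     "timestamp": datetime(2016, 4, 19, 11, 30)},
]

flags = [quality_flags(r, now) for r in records]
print(flags[0], flags[1], consistent(records), sep="\n")
```

In this sample the first record passes every per-record check, the second fails correctness, currency, and conformity, and the set as a whole fails consistency because W-0001 appears on pads A and B.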

Page 24: 2016 04-19 machine learning

The Fast Data ecosystem in O&G

Land

Drilling

Reservoir Completion

Water

Production

Steering Regulatory

Midstream

Source: Assorted web images

Page 25: 2016 04-19 machine learning

Algorithmic Approaches (revisited)

• Decision Tree Learning
– Maps observation to conclusions
• Association Rule Learning
– Discovering interesting relations
• Artificial Neural Networks
– Incremental function modules
• Inductive Logic Programming
– Rule based representations for input --> output
• Support Vector Machines
– Classification and regression
• Clustering
– Assignment of observations to clusters
• Bayesian Networks
– Probabilistic models correlating variables
• Reinforcement Learning
– Finds policy to map states to desired outcome
• Representation Learning
– Principal component analysis
• Similarity & Metric Learning
– Pairs of examples train others
• Sparse Dictionary Learning
– Datum as linear combinations
• Genetic Algorithms
– Mimics natural heuristics

Page 26: 2016 04-19 machine learning

Keep Your Eye on the Prize

Data

Information

Knowledge

Understanding

Wisdom

Application

The question is NOT “How can we … ?”

But instead “What is the objective?” (or “Why?”)

Page 27: 2016 04-19 machine learning

Mark Reynolds

Mark Reynolds Vitae
• Southwestern Energy
• Lone Star College
• Intent Driven Designs
• Scan Systems
• Sikorsky Aircraft
• General Dynamics

• Southwestern Energy Email
– [email protected]