Machine Learning
April 19, 2016
Roadmap to Constructing a Top-Down
Machine Learning Paradigm
2
Introduction to Southwestern Energy
Southwestern Energy Company (NYSE: SWN) is a leading natural gas and oil company with operations predominantly in the United States, engaged in exploration, development and production activities, including related natural gas gathering and marketing.
Source: http://www.swn.com/
3
Machine Learning, Deep Learning, AI
Roadmap to Constructing a Top-Down Machine Learning Paradigm
E&P organizations are turning more attention to accumulated data to
enhance operating efficiency, safety, and recovery. The computing
paradigm is shifting, the O&G paradigm is shifting, and the rise of the
machine learning paradigm requires careful attention to top-down
integrated systems engineering. A systems approach is presented to
stimulate out-of-the-box thinking about the machine learning
paradigm.
4
The Shifting O&G Paradigm
Past Paradigm Shifts
• Seismic
• Horizontal Drilling
• Offshore
• Factory Drilling
Paradigm Shifts in Process
• Big Crew Change
• Mobility (anytime, anywhere)
• Big Data
• Machine Learning
Source: Mark Reynolds, compilation
5
Paradigms We Are Discussing Today
Changing Paradigms
• Computing Paradigm (4th Paradigm / eScience)
• O&G Paradigm (Shale 2.0)
New Paradigms
• Machine Learning Paradigm
6
The Structure of Scientific Revolutions
• Normal Science
– Equilibrium, harmony
• Model Drift
– Outliers cease to be outliers
– Ripples turn to discontinuity
• Model Crisis
– Alternate methods permitted
– Out-of-the-box reconsidered
• Model Revolution
– New model becomes the new-normal
• Paradigm Change
– (Textbooks play catch-up)
Source: Thomas Kuhn (1962), The Structure of Scientific Revolutions, University of Chicago Press; Mark Reynolds, compilation
[Figure: Kuhn Cycle: Normal Science, Model Drift (Anomaly), Model Crisis, Model Revolution, Paradigm Change]
7
The Shifting Computing Paradigm
[Figure: timeline of science paradigms, spanning Traditional Science and eScience]
Paradigms: Empirical, Theoretical, Computational, Data Exploration
Time scales: 1000s of years, 100s of years, 10s of years, years
Characterizations: Descriptive and Formulaic; Hypothetical and Investigative; Expertise Driven Models and Cases; Multivariant Differential Modelling
Source: Mark Reynolds, compilation
8
The Shifting Computing Paradigms
• Empirical: O&G is where we found it
• Theoretical: O&G is where we expect it
• Computational: O&G is where we estimate it
• Data Exploration: O&G is where we infer it
Source: Mark Reynolds, compilation
9
The Machine Learning Paradigm
“A computer program is said to learn from experience (E) with respect to some class of tasks (T) and performance measure (P), if its performance at tasks in T, as measured by P, improves with experience E.”
~ Tom Mitchell
Source: Mitchell, T. (1997). Machine Learning. McGraw Hill; Mark Reynolds, compilation
Machine Learning is the “Extraction of Wisdom by Understanding the underlying Data”
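Mitchell's definition can be made concrete with a toy experiment. The sketch below is an illustration only, not part of the original deck: the task T is deciding whether a reading exceeds a hidden threshold, the experience E is a growing set of labeled examples, and the performance measure P is accuracy on a fixed test grid. The threshold value, function names, and sample sizes are all assumptions chosen for the example.

```python
import random

TRUE_THRESHOLD = 0.6  # hypothetical ground truth, chosen only for this sketch


def label(x):
    """Task T: decide whether a reading x exceeds the hidden threshold."""
    return x > TRUE_THRESHOLD


def learn_threshold(examples):
    """Estimate the threshold from labeled examples (the experience E)."""
    highest_negative = max((x for x, y in examples if not y), default=0.0)
    lowest_positive = min((x for x, y in examples if y), default=1.0)
    return (highest_negative + lowest_positive) / 2


def accuracy(threshold, test_points):
    """Performance measure P: fraction of test points classified correctly."""
    hits = sum((x > threshold) == label(x) for x in test_points)
    return hits / len(test_points)


def learning_curve(sizes, seed=0):
    """Return (E, P) pairs for growing prefixes of one random example stream."""
    rng = random.Random(seed)
    stream = [rng.random() for _ in range(max(sizes))]
    examples = [(x, label(x)) for x in stream]
    test_points = [i / 1000 for i in range(1000)]
    return [(n, accuracy(learn_threshold(examples[:n]), test_points)) for n in sizes]


if __name__ == "__main__":
    for n, p in learning_curve([1, 10, 100, 500]):
        print(f"E = {n:4d} examples -> P = {p:.3f}")
```

With a few hundred examples the estimated threshold typically sits very close to the true one, so P approaches 1.0. With only one or two examples the estimate depends heavily on which points happened to be drawn.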
10
Machine Learning in the 4th Paradigm
The Catalyst
• Data captured by instruments
• Data generated by simulations
• Data acquired by sensor networks
The Destination
• Solutions from data analysis
• Solutions from data mining
• Solutions from visualization
• Solutions from drill down
• Solutions for bottom line
• Solutions using eScience
“eScience is the set of tools and technologies to support data federation and collaboration”
~ Jim Gray
Source: Mark Reynolds, compilation; Tony Hey, Microsoft Research, eScience and the Fourth Paradigm: Data-Intensive Scientific Discovery and Digital Preservation, http://www.alliancepermanentaccess.org/wp-content/uploads/2011/12/apa2011/15_%28Nov11%29TonyHey-APA%20Meeting.pdf
11
Precursors to Machine Learning
Predictive Analytics
• Focuses on Prediction
– Based on Known Properties
– Learned from Training Data
Data Mining
• Focuses on Discovery
– Unknown Properties in Data
– The Analysis Phase of Knowledge Discovery
Machine Learning is the “Extraction of Wisdom by Understanding the underlying Data”
~Mark Reynolds
Source: Mark Reynolds, compilation
12
The Machine Learning Paradigm
[Figure: Predictive Analytics, Data Mining, Machine Learning, and AI, operating 24/7]
Learning modes: Unsupervised Learning, Supervised Learning, Semi-Supervised Learning, Reinforcement Learning
Source: Mark Reynolds, compilation
13
Principal Concepts in Machine Learning
• Unsupervised Learning
– Data is unlabeled
• Supervised Learning
– Teach and train with data that is well labeled with a defined output
• Reinforcement Learning
– The validity of the data alignment serves as feedback
• Semi-Supervised Learning
– Some of the data is labeled, some is unlabeled
Source: Mark Reynolds, compilation
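The difference between labeled and unlabeled data can be shown on the same handful of readings. This is a minimal sketch under stated assumptions: the readings, the "idle" and "pumping" labels, and both function names are hypothetical. The unsupervised half uses a plain two-cluster k-means, which is one of many possible techniques. Semi-supervised and reinforcement learning are not covered here.

```python
def centroids_from_labels(points, labels):
    """Supervised: the labels say which group each point belongs to."""
    groups = {}
    for x, y in zip(points, labels):
        groups.setdefault(y, []).append(x)
    return {y: sum(xs) / len(xs) for y, xs in groups.items()}


def two_means(points, iterations=20):
    """Unsupervised: find two groups in unlabeled 1-D data (k-means, k = 2)."""
    low, high = min(points), max(points)
    for _ in range(iterations):
        near_low = [x for x in points if abs(x - low) <= abs(x - high)]
        near_high = [x for x in points if abs(x - low) > abs(x - high)]
        if not near_high:  # all points collapsed into one group
            break
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return low, high


if __name__ == "__main__":
    readings = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]           # hypothetical data
    states = ["idle", "idle", "idle", "pumping", "pumping", "pumping"]
    print(centroids_from_labels(readings, states))  # uses the labels
    print(two_means(readings))                       # never sees the labels
```

On this data both approaches arrive at the same two group centers. The supervised version also knows what each group means, while the unsupervised version only knows that two groups exist.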
14
Examples of Machine Learning
Domestic
• Nest® Thermostats
• Pandora / Amazon
• Spam Detection
• Fraud Detection
• Traffic Light Duty Cycle
• Google
Upstream O&G
• Pump-Jack Duty Cycle (circa 1986)
• Closed Loop Directional Drilling (circa 2009)
15
The Bridge Into Machine Learning
[Figure: Integrated Systems Engineering as the bridge from Today to Tomorrow]
16
Integrated Systems Engineering
Systems & Knowledge Engineer
O&G Systems
Control Systems
Remote Systems
Information Systems
Embedded Systems
Robotic Systems
Data Fusion
Real-Time Systems
Look-Back Analysis
Look-Ahead Systems
Land and Regulatory
Geology
Geophysics
Drilling Engineering
Completion Engineering
Production Engineering
Reservoir Engineering
Systems Engineering
Source: Mark Reynolds, compilation
17
Integrated Engineering – Top-Down
• Engineering the Source
– Signals, content, and characterizations
• Engineering the Data
– Address errant data
– Address valid spurious data
– Address data quality
• Engineering the Store
– Repository
– Recall and Reporting
– Representations
Data Acquisition
Data Transmission
Data Retention
Data Analysis
Data Reduction
Source: Mark Reynolds, compilation
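One way to read the distinction between errant data and valid spurious data is sketched below. The interpretation is an assumption, not the author's stated method: a reading is treated as "errant" when it is missing or physically impossible, and as "spurious" when it is plausible but jumps sharply from the last accepted value, in which case it is flagged for review rather than discarded. The range and step limits are hypothetical placeholders.

```python
def triage(readings, valid_range=(0.0, 5000.0), max_step=200.0):
    """Flag each reading as 'ok', 'errant', or 'spurious'.

    valid_range and max_step are illustrative limits, not field values.
    """
    low, high = valid_range
    flags = []
    last_ok = None
    for value in readings:
        if value is None or not (low <= value <= high):
            flags.append("errant")      # missing or outside the possible range
            continue
        if last_ok is not None and abs(value - last_ok) > max_step:
            flags.append("spurious")    # plausible value, but an isolated jump
            continue
        flags.append("ok")
        last_ok = value
    return flags


if __name__ == "__main__":
    pressures = [1000.0, 1010.0, -999.0, 1020.0, 1900.0, 1030.0]
    for value, flag in zip(pressures, triage(pressures)):
        print(f"{value:8.1f}  {flag}")
```

Comparing against the last accepted value keeps a single spike from also flagging the reading that follows it. A sustained real shift would keep being flagged under this rule, so a production system would need a policy for accepting a new baseline.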
18
Integrated Engineering – Top-Down
• Engineering the Store
– Data distribution
– Data staging
• Engineering the Recall
– Simple query
– Cube v Matrix
• Engineering the Use Case
– Destination: human
– Destination: machine
Classification
Regression
Clustering
Density Estimation
Dimensional Reduction
19
Integrated Engineering – System Flow
Acquire --> Analyze --> Annunciate --> Archive --> Analyze --> Anticipate --> Apply
Data --> Information (Visualization) --> Knowledge (Forensics) --> Understanding (Analysis & Mining) --> Wisdom (Anticipating) --> Application
• Creating Informational Accessibility and Transparency
• Discovering Experiential Performance Improvements
• Segmenting Processes and Process Results
• Replacing Human Decision w/ Automated Algorithms
• Innovating New Models, Products, Services
Source: Mark Reynolds, compilation
20
Integrated Engineering – Top-Down
[Figure: data disciplines and system layers fed by operational data sources]
Data disciplines: Data Quality, Data Integrity, Data Validation, Data Modeling, Data Security, Data Mining, Data Analytics
System layers: Proactive & Closed-Loop Systems; Mining and Analytics; Forensics; Control; Visualization and Observation; Source Capture and Utilization
Data sources: Well Plan, RT Prod, RT Drill, Geo-steer, RT Frac, Daily Rpts, AFE
• Intelligence during operations (Observation and Anticipation)
• Intelligence reviewing operations (Forensic)
• Intelligence planning operations (Historical and Analytical)
Source: Mark Reynolds, compilation
21
Applied Machine Learning 101
Learning (Phase 1): Training Data --> Pre-Processing --> Learning --> Error Analysis --> Model
Prediction (Phase 2): New Data --> Model --> Predictable Result
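The two phases can be sketched end to end with a deliberately small model. The example below is an assumption-laden illustration: it uses an ordinary least-squares line fit as the learner, dropping incomplete rows as the only pre-processing step, and mean absolute error for error analysis. The data and function names are hypothetical.

```python
def preprocess(rows):
    """Pre-processing: drop rows with a missing input or output."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def learn(rows):
    """Learning: least-squares fit of y = slope * x + intercept.

    Needs at least two rows with distinct x values.
    """
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def error_analysis(model, rows):
    """Error analysis: mean absolute error of the model on the given rows."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in rows) / len(rows)


def predict(model, x):
    """Prediction (Phase 2): apply the learned model to new data."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    training = [(0.0, 1.0), (1.0, 3.0), (None, 4.0), (2.0, 5.0), (3.0, 7.0)]
    clean = preprocess(training)            # Phase 1 starts here
    model = learn(clean)
    print("model:", model, "error:", error_analysis(model, clean))
    print("prediction for x = 4.0:", predict(model, 4.0))  # Phase 2
```

The point of the split is that Phase 1 runs on historical data and can be repeated until the error analysis is acceptable, while Phase 2 only needs the resulting model and can run wherever new data arrives.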
22
Representative Algorithms
• Decision Tree Learning
– Maps observation to conclusions
• Association Rule Learning
– Discovering interesting relations
• Artificial Neural Networks
– Incremental function modules
• Inductive Logic Programming
– Rule based representations for input --> output
• Support Vector Machines
– Classification and regression
• Clustering
– Assignment of observations to clusters
• Bayesian Networks
– Probabilistic models correlating variables
• Reinforcement Learning
– Finds policy to map states to desired outcome
• Representation Learning
– Principal component analysis
• Similarity & Metric Learning
– Pairs of examples train others
• Sparse Dictionary Learning
– Datum as linear combinations
• Genetic Algorithms
– Mimics natural heuristics
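As an illustration of the first entry, the sketch below learns a decision stump, which is a decision tree with a single split. It is a simplified stand-in for full decision tree learning: it handles one numeric feature, only predicts True above the threshold, and picks the split with the fewest training errors. The data and names are hypothetical.

```python
def fit_stump(values, labels):
    """Return (threshold, training_errors) for the rule 'value > threshold'.

    Candidate thresholds are midpoints between neighboring distinct values.
    Returns (None, len(values) + 1) when there are fewer than two distinct values.
    """
    candidates = sorted(set(values))
    best_threshold, best_errors = None, len(values) + 1
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        errors = sum((v > threshold) != y for v, y in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors


def predict_stump(threshold, value):
    """Map an observation to a conclusion using the learned split."""
    return value > threshold


if __name__ == "__main__":
    observations = [10, 20, 30, 40, 50, 60]                 # hypothetical feature
    conclusions = [False, False, False, True, True, True]   # hypothetical labels
    threshold, errors = fit_stump(observations, conclusions)
    print("split at", threshold, "with", errors, "training errors")
    print("conclusion for 45:", predict_stump(threshold, 45))
```

A full decision tree repeats this search recursively on each side of the split and across many features.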
23
Machine Learning: Data Diversity
• Macro (or field-level)
– Spatial
– Temporal
• Pad (or offset)
– Spatial
– Temporal
• Well (or wellbore)
– Spatial
– Temporal
• External
– Uploads
– Political, Climate, etc.
• The 3 Cs of Data Quality
– Consistency
– Correctness
– Completeness
– [#4] Currency
– [#5] Conformity
Source: Mark Reynolds, compilation
Data Diversity - Spatial, Temporal, Referential
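The first three Cs can each be expressed as a simple check on a record. The sketch below is a hypothetical example: the field names and the depth limit are assumptions, and the consistency rule relies on the fact that true vertical depth cannot exceed measured depth along the wellbore. Currency and conformity checks would need timestamps and a schema, which are left out here.

```python
REQUIRED_FIELDS = ("well_id", "measured_depth_ft", "true_vertical_depth_ft")
MAX_DEPTH_FT = 40000  # illustrative upper bound, not an industry limit


def quality_issues(record):
    """Return a list of data-quality issues found in one record (a dict)."""
    issues = []

    # Completeness: every required field is present and not None.
    missing = [f for f in REQUIRED_FIELDS if record.get(f) is None]
    if missing:
        issues.append("completeness: missing " + ", ".join(missing))
        return issues  # the remaining checks need these fields

    md = record["measured_depth_ft"]
    tvd = record["true_vertical_depth_ft"]

    # Correctness: the value lies in a plausible range.
    if not (0 <= md <= MAX_DEPTH_FT):
        issues.append("correctness: measured depth out of range")

    # Consistency: related fields agree with each other.
    if tvd > md:
        issues.append("consistency: true vertical depth exceeds measured depth")

    return issues


if __name__ == "__main__":
    records = [
        {"well_id": "A-1", "measured_depth_ft": 12000, "true_vertical_depth_ft": 8000},
        {"well_id": "A-2", "measured_depth_ft": 9000},
        {"well_id": "A-3", "measured_depth_ft": 7000, "true_vertical_depth_ft": 7500},
    ]
    for record in records:
        print(record["well_id"], quality_issues(record) or "ok")
```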
24
The Fast Data ecosystem in O&G
[Figure: O&G data domains: Land, Drilling, Reservoir, Completion, Water, Production, Steering, Regulatory, Midstream]
Source: Assorted web images
25
Algorithmic Approaches (revisited)
• Decision Tree Learning
– Maps observation to conclusions
• Association Rule Learning
– Discovering interesting relations
• Artificial Neural Networks
– Incremental function modules
• Inductive Logic Programming
– Rule based representations for input --> output
• Support Vector Machines
– Classification and regression
• Clustering
– Assignment of observations to clusters
• Bayesian Networks
– Probabilistic models correlating variables
• Reinforcement Learning
– Finds policy to map states to desired outcome
• Representation Learning
– Principal component analysis
• Similarity & Metric Learning
– Pairs of examples train others
• Sparse Dictionary Learning
– Datum as linear combinations
• Genetic Algorithms
– Mimics natural heuristics
26
Keep Your Eye on the Prize
Data
Information
Knowledge
Understanding
Wisdom
Application
The question is NOT “How can we … ?”
But instead “What is the objective?” (or “Why?”)
27
Mark Reynolds
Mark Reynolds Vitae
• Southwestern Energy
• Lone Star College
• Intent Driven Designs
• Scan Systems
• Sikorsky Aircraft
• General Dynamics
• Southwestern Energy Email
– [email protected]