#OSIsoftUC #PIWorld ©2018 OSIsoft, LLC

Advanced Analytics for PI Data for Data Scientists

Ahmad Fattahi – Manager, Data Science Enablement, OSIsoft

Dallas Swift – Data Scientist, OSIsoft

Agenda

• Definitions and general concepts

• CRISP-DM Process

• Best practices and pitfalls

• Case study

Goal: Gain a better understanding of data science practices for process data and the PI System

Nomenclature

Artificial Intelligence

Machine Learning

Deep Learning

Data Science is an interdisciplinary field of scientific methods, processes, algorithms and systems to extract knowledge or insights from data in various forms.

- Wikipedia

CRISP-DM

• CRoss Industry Standard Process for Data Mining

• Among most popular methodologies

• Emphasizes cycles and iterations

Source: KDnuggets

Story: Optimize Building Energy Consumption

Inception: Management or SME

Start from a “Sharp Question”

• “Can the building wake up later?”

Business owner plays a key role

• Facilities Manager

Envision the delivery mechanism

• “Recommendation engine? Direct control?”

SME and data professionals start engaging

• Many conversations until they speak the same language

Pitfalls

Myth: The data scientist can do it all!

Targeting the wrong question

Losing sight of bottom line value to the business

Getting crushed between political gears

• Q: Why did you become data scientists?

• A: Because “Superhero” is not a job title!

Building the “Model”

Engage with data engineers, PI Admins

• Python and R libraries by OSIsoft, PI Web API, AF SDK, PI Integrators, PI SQL libraries

Build the features and the model

• Some features can be built in PI

Constantly ask for validation from the SME

• Does it make sense?

Process Data Can Be Significantly Different!

Features typically have to be engineered from raw data

It is usually not the traditional “time-series” analysis

PI System can do a lot!

• Raw, summarized, or interpolated data

• Event Frames

• Hierarchy in AF is crucial

SME plays a key role
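
To make the difference between raw and interpolated data concrete, here is a minimal Python sketch that resamples irregularly spaced readings onto a regular grid by linear interpolation. It only illustrates the idea; the PI System's own interpolation may behave differently (for example, for stepped signals), and the sample readings and function names are hypothetical.

```python
from bisect import bisect_left


def interpolate(samples, t):
    """Linearly interpolate time-sorted (time, value) samples at time t.

    Returns None when t falls outside the sampled range.
    """
    if not samples or t < samples[0][0] or t > samples[-1][0]:
        return None
    times = [s[0] for s in samples]
    i = bisect_left(times, t)
    if times[i] == t:
        return samples[i][1]
    (t0, v0), (t1, v1) = samples[i - 1], samples[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def resample(samples, start, end, step):
    """Evaluate the series on a regular grid from start to end inclusive."""
    return [(t, interpolate(samples, t)) for t in range(start, end + 1, step)]


# Hypothetical raw readings: (seconds since start, temperature in degrees C).
raw = [(0, 24.0), (40, 23.0), (130, 22.1), (300, 21.0)]
for t, v in resample(raw, 0, 300, 60):
    print(t, round(v, 2))
```

A regular grid like this is usually what a modeling library expects, which is one reason to request interpolated or summarized values rather than raw events.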

Is the goal of the project to…

… predict?

… control?

Explainability

Tradeoff

Source: ResearchGate GmbH

Pitfalls – Veering off the process

Building a model for something uncontrollable

Mixing correlation with causation

Not including data engineering concerns for deployment

Not leveraging PI capabilities in feature engineering

Evaluation – Loop back with the Business

Guarantees we answered the right question

Forces us to measure real value, often in dollars, man-hours, or other tangible resources

Not trivial!

Caution: data scientists speak a different language than process people

Deployment – Data Engineers Are Key

Productizing the model

Simpler models can be deployed in PI; some control models are built into the control network

Consult with PI Admins and Data Engineers early

Data Governance can pose challenges in production

Reproducible Work Is the Differentiator

Assume your work is going to be repeated and tweaked frequently

Over time:

• Models veer off

• Physical systems change

• Priorities evolve

• New business owners come

• You get reassigned!

Leverage tools such as Jupyter Notebooks or other commercial platforms

The Cycle Repeats

Case study: Interacting with PI System data

Optimize the startup of the Variable Air Volume Cooling (VAVCO) units to improve the building’s energy efficiency

Reduce wasted cooling energy

[Figure: timeline of a morning cooling cycle. Axis: Time. Labels: Unit Startup, Setpoint reached, Occupancy start (7:00 AM), Cooling Duration, Wasted Energy.]
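
The timeline labels above suggest a simple calculation. The following Python sketch assumes that the wasted interval is the time between reaching the setpoint and the start of occupancy, and that the latest acceptable startup is the occupancy start minus the predicted cooling duration. The function names, the date, and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def wasted_cooling_time(setpoint_reached: datetime, occupancy_start: datetime) -> timedelta:
    """Time the space sat at setpoint before occupancy (assumed definition)."""
    return max(occupancy_start - setpoint_reached, timedelta(0))


def latest_startup(occupancy_start: datetime, predicted_cooling: timedelta) -> datetime:
    """Latest unit startup that still reaches the setpoint by occupancy start."""
    return occupancy_start - predicted_cooling


# Hypothetical morning: setpoint reached at 6:10, occupancy starts at 7:00.
occupancy = datetime(2018, 4, 24, 7, 0)
reached = datetime(2018, 4, 24, 6, 10)
print(wasted_cooling_time(reached, occupancy))           # 0:50:00
print(latest_startup(occupancy, timedelta(minutes=70)))  # 2018-04-24 05:50:00
```

The second function is where a model of cooling duration would plug in: a better prediction allows a later startup.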

Setting ourselves up for success

• How are the data streams structured?

• How do the data behave?

• What information is relevant for the problem?

Understanding the Asset Framework

[Diagram labels: Data Sources; PI Server (Data Archive, Asset Framework)]

Explore hierarchy and trends

PI System Explorer (Hierarchy)

PI Vision (Values)

Leveraging data science tools

• Data science tools are great for data exploration

• R and Python libraries that use PI Web API are available via PI Developers Club

• https://github.com/osimloeff/PI-Web-API-Client-R

• https://github.com/osimloeff/PI-Web-API-Client-Python
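
For orientation, the client libraries above ultimately issue HTTP requests against PI Web API. The sketch below only builds request URLs for a stream's recorded and interpolated values; it does not call the libraries and makes no network request. The host name and WebId are placeholders, and the resource paths and parameter names (`startTime`, `endTime`, `interval`) reflect the PI Web API stream resources as commonly documented, so verify them against the reference for your installed version.

```python
from urllib.parse import urlencode

# Hypothetical values: replace with your own PI Web API host and stream WebId.
BASE_URL = "https://my-pi-server.example.com/piwebapi"
WEB_ID = "EXAMPLE_WEB_ID"

SUPPORTED = {"recorded", "interpolated", "summary"}


def stream_url(kind: str, web_id: str, **params: str) -> str:
    """Build a stream data URL for one of the resources in SUPPORTED."""
    if kind not in SUPPORTED:
        raise ValueError(f"unsupported stream resource: {kind}")
    query = urlencode(params)
    return f"{BASE_URL}/streams/{web_id}/{kind}" + (f"?{query}" if query else "")


# Last day of raw values, and the same period interpolated every 10 minutes.
raw_url = stream_url("recorded", WEB_ID, startTime="*-1d", endTime="*")
interp_url = stream_url("interpolated", WEB_ID, startTime="*-1d", endTime="*", interval="10m")
print(raw_url)
print(interp_url)
```

Authentication and certificate handling depend on how the server is configured, which is one more reason to involve the PI Admins early.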

Transforming data to information

• How should I aggregate time-series data?

• Which features are relevant for model prediction?

• How can I make the data available for modeling?

Time series data are complex!

[Figure: sensor streams for units VAVCO-1 and VAVCO-2. Labels: Temperature, Air flow, Humidity, CO2.]

Need to shape and export data

Labeling the data – easy, right?

[Figure: trend with labels Temperature, Setpoint, Cooling rate]

• Separate first cooling period of day from others

• When is a cooling period finished?

• Typical process data issues (data alignment, gaps, etc.)
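
The labeling questions above can be sketched in a few lines of Python. This is one possible rule, not the presenters' method: the period starts at the first sample with the unit running and ends at the first later sample whose temperature is within a tolerance of the setpoint. The row layout, the tolerance, and the sample data are assumptions for illustration.

```python
from datetime import datetime, timedelta


def first_cooling_period(rows, tolerance=0.5):
    """Return (start, end) of the first cooling period in one day's rows.

    rows: time-sorted (timestamp, temperature, setpoint, unit_on) tuples.
    Returns None if the unit never starts; end is None if the setpoint
    is never reached. Samples with missing values are skipped.
    """
    start = None
    for ts, temp, setpoint, unit_on in rows:
        if start is None:
            if unit_on:
                start = ts
            continue
        if temp is None or setpoint is None:
            continue  # data gap: cannot decide at this sample
        if temp <= setpoint + tolerance:
            return start, ts
    return (start, None) if start is not None else None


# Hypothetical day with a missing temperature reading at 6:00.
day = datetime(2018, 4, 24)
rows = [
    (day + timedelta(hours=4, minutes=50), 26.0, 22.0, False),
    (day + timedelta(hours=5), 26.0, 22.0, True),
    (day + timedelta(hours=5, minutes=30), 24.1, 22.0, True),
    (day + timedelta(hours=6), None, 22.0, True),
    (day + timedelta(hours=6, minutes=10), 22.4, 22.0, True),
    (day + timedelta(hours=7), 22.0, 22.0, True),
]
start, end = first_cooling_period(rows)
print(end - start)  # 1:10:00
```

Even this small rule needs decisions (tolerance, handling of gaps) that a subject matter expert should validate.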

Event Frames help aggregate data
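
An Event Frame marks the start and end of an event, which turns a continuous signal into one row per event. The Python sketch below mimics that aggregation outside the PI System: given event windows and a stream of samples, it computes a duration and summary statistics per window. The event names, time units (minutes since midnight), and values are hypothetical.

```python
from statistics import mean


def summarize_events(events, samples):
    """Aggregate (time, value) samples within each (name, start, end) window."""
    summaries = []
    for name, start, end in events:
        values = [v for t, v in samples if start <= t <= end]
        summaries.append({
            "event": name,
            "duration": end - start,
            "mean": mean(values) if values else None,
            "min": min(values) if values else None,
            "max": max(values) if values else None,
        })
    return summaries


# Hypothetical temperature samples and two cooling events.
samples = [(300, 26.0), (330, 24.0), (360, 23.0), (370, 22.0),
           (420, 22.0), (780, 25.0), (800, 23.0)]
events = [("morning cooling", 300, 370), ("afternoon cooling", 780, 800)]
for row in summarize_events(events, samples):
    print(row)
```

Each resulting row is a candidate training example: one event, a handful of engineered features.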

Data ready to go into model

• PI Integrator for Business Analytics

• PI OLEDB Enterprise

• Custom AF SDK

That looks funny…

Data gap?

Humidity?

Warning signs:

• Unexpected straight lines

• Missing data

• System digital states
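
These warning signs can be screened for automatically before modeling. The following Python sketch flags three conditions in a single stream: non-numeric values (how a system digital state may appear in exported data), spacing between samples above a threshold, and runs of identical values. The thresholds and the sample data are assumptions; appropriate values depend on the sensor and its compression settings.

```python
def quality_flags(samples, max_gap, flat_run=5):
    """Return (kind, time) flags for time-sorted (time, value) samples.

    'state': the value is not numeric (e.g. a digital state exported as text)
    'gap':   spacing to the previous sample exceeds max_gap
    'flat':  flat_run consecutive numeric values are identical
    """
    flags = []
    prev_t = None
    run_value, run_len = None, 0
    for t, value in samples:
        if prev_t is not None and t - prev_t > max_gap:
            flags.append(("gap", t))
        prev_t = t
        if not isinstance(value, (int, float)):
            flags.append(("state", t))
            run_value, run_len = None, 0
            continue
        if value == run_value:
            run_len += 1
        else:
            run_value, run_len = value, 1
        if run_len == flat_run:
            flags.append(("flat", t))
    return flags


# Hypothetical humidity stream: (minutes, value).
humidity = [(0, 45.1), (10, 45.3), (20, "I/O Timeout"), (30, 44.9), (90, 44.0),
            (100, 44.0), (110, 44.0), (120, 44.0), (130, 44.0), (140, 43.8)]
print(quality_flags(humidity, max_gap=30))
# [('state', 20), ('gap', 90), ('flat', 130)]
```

A flag is a prompt to ask the SME, not proof of bad data: some signals are legitimately constant for long periods.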

Potential energy savings discovered

• Identified important factors for predicting cooling time

• Linear regression fits the data

$t_{\mathrm{cool}} = b + m_1 x_1 + \cdots + m_k x_k$
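
As an illustration of fitting a model of this form, the sketch below estimates the intercept b and the coefficients m_1 … m_k by ordinary least squares, solving the normal equations in plain Python. It is not the presenters' procedure; in practice a library such as scikit-learn or statsmodels would normally be used. The two features (outdoor temperature and degrees above setpoint) and all numbers are hypothetical.

```python
def fit_linear(X, y):
    """Least-squares fit of y = b + m1*x1 + ... + mk*xk; returns [b, m1, ..., mk]."""
    rows = [[1.0] + [float(v) for v in x] for x in X]  # prepend intercept column
    n = len(rows[0])
    # Normal equations (A^T A) w = A^T y, stored as an augmented matrix.
    M = []
    for i in range(n):
        lhs = [sum(r[i] * r[j] for r in rows) for j in range(n)]
        rhs = sum(r[i] * yi for r, yi in zip(rows, y))
        M.append(lhs + [rhs])
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        if abs(p) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def predict(weights, x):
    """Evaluate b + m1*x1 + ... + mk*xk for one feature vector."""
    return weights[0] + sum(m * xi for m, xi in zip(weights[1:], x))


# Hypothetical training data: (outdoor temp in C, degrees above setpoint) -> minutes.
X = [(20, 2), (25, 3), (30, 1), (35, 4), (28, 2)]
y = [60, 75, 75, 100, 76]  # generated from t = 10 + 2*x1 + 5*x2
weights = fit_linear(X, y)
print([round(w, 3) for w in weights])
print(round(predict(weights, (32, 3)), 1))  # 89.0
```

A linear model has the added benefit of being easy to explain: each coefficient reads as minutes of cooling per unit of the feature.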

Putting the model to work

• How can I operationalize a model after it has been developed?

• What options are available for recording model predictions?

Data flow implemented in the lab

[Diagram labels: PI System; PI Integrator for Business Analytics Advanced; Apache Kafka; Consumer; Python-based ML Model; PI Web API]

• Microsoft Azure Event/IoT Hubs

• SAP HANA Smart Data Streaming

• Asset Analytics - MATLAB Integration
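
To give a feel for the consumer step, here is a minimal Python sketch of scoring streamed records with a fitted linear model. It deliberately avoids any Kafka client: `messages` is any iterable of JSON strings, standing in for a consumer loop. The message fields, feature names, and weights are hypothetical and do not reflect the actual output format of PI Integrator for Business Analytics; writing predictions back to the PI System is not shown.

```python
import json


def score_messages(messages, weights, feature_names):
    """Yield (timestamp, prediction) for each JSON message that has all features.

    messages: iterable of JSON strings (in production, a stream consumer loop).
    weights: [b, m1, ..., mk] of a fitted linear model.
    Messages with missing or non-numeric features are skipped.
    """
    for raw in messages:
        record = json.loads(raw)
        try:
            x = [float(record[name]) for name in feature_names]
        except (KeyError, TypeError, ValueError):
            continue
        prediction = weights[0] + sum(m * xi for m, xi in zip(weights[1:], x))
        yield record.get("timestamp"), prediction


# Hypothetical stream: the second record lacks a usable feature and is skipped.
stream = [
    '{"timestamp": "2018-04-24T04:00:00Z", "outdoor_temp": 30, "temp_above_setpoint": 3}',
    '{"timestamp": "2018-04-24T04:05:00Z", "outdoor_temp": null}',
]
features = ["outdoor_temp", "temp_above_setpoint"]
scored = list(score_messages(stream, [10.0, 2.0, 5.0], features))
print(scored)  # [('2018-04-24T04:00:00Z', 85.0)]
```

Skipping incomplete records silently is a design choice for the sketch; a production consumer would more likely log or count them.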

Different tools for different stages

• Asset Framework

• PI Vision/PI ProcessBook

• PI DataLink (MS Excel)

• Python/R libraries

Different tools for different stages

• PI Integrator for Business Analytics

• PI OLEDB Enterprise

Different tools for different stages

• PI Integrators

• Asset Analytics with MATLAB Integration

Keys to success

• Communication is king

• Process data has unique challenges

• PI System has tools to enable data science

• Your knowledge of data science is a major differentiator. Leverage it!

Keep on learning!

• Labs and online courses

• PI World presentations

• Talk to other users, partners, and us

List of talks available on PI Square

bit.ly/DSPIWorld18

Ahmad Fattahi, Manager, Data Science Enablement, OSIsoft

Dallas Swift, Data Scientist, OSIsoft

Questions

Please wait for the microphone before asking your questions

State your name & company

Please remember to…

Complete the Online Survey for this session

Thank You

Merci

Grazie
