Building Streaming Data Pipelines Using Azure Cloud Services
https://www.linkedin.com/in/rolftesmer/
https://mrfoxsql.wordpress.com/

Building Streaming Data Pipelines - WordPress.com · 2017-05-16 · Azure Stream Analytics SQL Database Storage blob Machine Learning Event JSON G -Force Prediction API All Events


Why is data so important? Because there’s just so much of it.

[Figure labels: Cloud, Mobile]

On-Prem vs IaaS vs PaaS vs SaaS: Which One?

http://azureplatform.azurewebsites.net

Preview Services

Cortana Intelligence Suite

• Data Sources: Apps; Sensors and devices; Data
• Information Management: Event Hubs; Data Catalog; Data Factory
• Big Data Stores: SQL Data Warehouse; Data Lake Store
• Machine Learning and Analytics: HDInsight (Hadoop and Spark); Stream Analytics; Data Lake Analytics; Machine Learning
• Intelligence: Personal Digital Assistant; Bot Framework; Cognitive Services
• Dashboards & Visualizations: Power BI
• Action: People; Automated Systems; Apps; Web; Mobile; Bots

https://social.technet.microsoft.com/wiki/contents/articles/33626.lambda-architecture-implementation-using-microsoft-azure.aspx

https://gallery.cortanaintelligence.com/Solution/Telemetry-Analytics

https://docs.microsoft.com/en-us/azure/machine-learning/cortana-analytics-playbook-vehicle-telemetry

[Figure labels: Speed Layer, Batch Layer, Serving Layer]

A data pipeline is the software that consolidates data from multiple sources and makes it available to be used strategically.

A pipeline is a set of data processing elements connected in series, where the output of one element is the input of the next one. The elements of a pipeline are often executed in parallel or in time-sliced fashion.
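
The "elements connected in series" definition can be illustrated with a minimal Python sketch, in which each stage is a generator that consumes the output of the previous stage. The stage names (`parse`, `keep_valid`, `add_speed_ms`), the field names, and the sample readings are hypothetical and are not taken from the deck.

```python
import json
from typing import Iterable, Iterator, List


def parse(lines: Iterable[str]) -> Iterator[dict]:
    """Stage 1: turn raw JSON text into dictionaries."""
    for line in lines:
        yield json.loads(line)


def keep_valid(events: Iterable[dict]) -> Iterator[dict]:
    """Stage 2: drop events that carry no speed reading."""
    for event in events:
        if event.get("speed_kmh") is not None:
            yield event


def add_speed_ms(events: Iterable[dict]) -> Iterator[dict]:
    """Stage 3: enrich each event with the speed in metres per second."""
    for event in events:
        yield {**event, "speed_ms": round(event["speed_kmh"] / 3.6, 2)}


def run_pipeline(lines: Iterable[str]) -> List[dict]:
    """Connect the stages in series: each stage's output feeds the next."""
    return list(add_speed_ms(keep_valid(parse(lines))))


# Hypothetical sample input, one JSON document per line.
RAW = [
    '{"device": "car-1", "speed_kmh": 72}',
    '{"device": "car-2", "speed_kmh": null}',
    '{"device": "car-3", "speed_kmh": 36}',
]

if __name__ == "__main__":
    for event in run_pipeline(RAW):
        print(event)
```

Because the stages are generators, each event flows through all three stages before the next one is read, which is one simple form of the time-sliced execution mentioned in the definition.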

1. Customers are on a multi-year transformational journey
2. Many data sources are not static or at rest
3. Solutions cannot wait for data to be landed before using it
4. Building pipelines...
• Historically: complex, costly, time consuming
• Today: fast, simple, “fit for purpose” services from the same data platform

As modern-day Data Professionals, we have to deal with it.

The data platform was:

Up until ~5-10 years ago it was a central relational platform
...and... included relational-like services (OLTP, OLAP, DW, ETL, MDM, +)
...and... was often on-prem or in a hosted DC
...and... was rarely hosted in external public cloud providers (Azure, AWS, +)

Occasionally it included special projects (i.e. Big Data, NoSQL, IoT)

https://mrfoxsql.wordpress.com/2017/04/19/what-exactly-is-the-data-platform-nowadays/

The data platform is now:

[Diagram labels]
• Services: Event Hubs; Stream Analytics; SQL Data Warehouse; Storage blob; Logic App; Machine Learning; Data Lake; Data Factory
• Data flows: Machine Learning API Calls; Selective Load (twice); Report; Cognitive API Calls; Real-Time Report; Archive; Scheduled Pull; Full Load; Intelligence Report; Analytics Report; Trend Report
• Legend: Ingestion; CEP / In-Stream Analytics; Data Movement Orchestration; Reporting Visualisation; Workflow Logic; Intelligence; Operationalised Data Science; Unstructured Storage; Structured Storage; Data Flow; General Archive

[Diagram labels: Azure]
• Components: IoT Hub; Mobile; Event Hubs; Stream Analytics; Machine Learning; Function; SQL Database; Storage blob
• Data flows: Event (JSON); G-Force Prediction (API); All Events (CSV); Event Archive (JSON); Event (JSON); Alert Events, G-Force > 3 (JSON); New Event Trigger (JSON); Alert Event (CSV)
• REAL-TIME: Event Telemetry Report (Streaming Dataset)
• ON-DEMAND: Event Trend Report (SQL Query)
• EVENT-DRIVEN: Twilio Phone Call
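
As a rough illustration of the routing in this diagram, the sketch below keeps every event for the archive and copies events with a G-Force above 3 to a separate alert list. It is plain Python, not Stream Analytics query code, and it is not the deck's implementation; the field name `g_force`, the function `route_events`, and the sample events are assumptions.

```python
import json
from typing import Iterable, List, Tuple

# Threshold taken from the diagram label "Alert Events, G-Force > 3".
G_FORCE_ALERT_THRESHOLD = 3.0


def route_events(raw_events: Iterable[str]) -> Tuple[List[dict], List[dict]]:
    """Return (all_events, alert_events) for a batch of JSON event strings."""
    all_events: List[dict] = []
    alert_events: List[dict] = []
    for raw in raw_events:
        event = json.loads(raw)
        all_events.append(event)  # every event goes to the archive path
        # Strictly greater than the threshold, matching "G-Force > 3".
        if event["g_force"] > G_FORCE_ALERT_THRESHOLD:
            alert_events.append(event)
    return all_events, alert_events


# Hypothetical sample events.
SAMPLE = [
    '{"device": "phone-1", "g_force": 1.2}',
    '{"device": "phone-1", "g_force": 3.0}',
    '{"device": "phone-2", "g_force": 4.5}',
]

if __name__ == "__main__":
    everything, alerts = route_events(SAMPLE)
    print(len(everything), "events archived;", len(alerts), "alert(s)")
```

In the architecture shown, the alert branch would then trigger the Function that places the phone call; that step is left out here.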

SH_Data_Streaming (West Europe)

[Diagram labels]
• Event Hub: SHIngress (JSON, Event Type)
• Blob Store / ADLS: SHEventStore
• Realtime Stream (200K rows moving window)
• Reporting data
• Server (x4)
• Stream Analytics: SHEgressASDB
• Telemetry; Bookings; Agents; Providers
• Stream Analytics: SHEgressPBI
• AVRO Event Archive, Batch (COLD Path)
• JSON Events Stream (HOT Path)
• Tabular Events
• JSON Events Stream (HOT Path)
• Real Time Dashboards (troyearle)
• Historical Reports (troyearle)
• search
• SQL SP
• Logic App: SHLogicApp (Post Events, JSON Event)
• SQL DB: SHEventHistory (Short Term Store)
• Service Bus Q: SHSBQEgress (JSON Report)
• Flow labels: alerts; reports; reports; ref data; Archive

• Avg 56 GB/day
• AEH Input SU12: Max 3,900/sec, Avg 2,200/sec
• SQL DB P2 (20): Max 3,900/sec, Avg 2,200/sec (5 days = 1b rows) (1 year = 72b rows)
• Alerts, Reports: ~1 hour
• Service Bus Queue: ~1 hour
• PBI Input: 3,900/sec
• Telemetry Input: 3,900/sec
• SH event -> AEH -> ASA = < 5 sec; ASA -> SQL = < 5 sec
• Hourly
• On-Demand
• 1 Min Window
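
The "1 Min Window" label suggests that events are aggregated per minute before they are reported. A minimal sketch of a one-minute tumbling (non-overlapping) window count follows. The choice of a tumbling window is an assumption, as are the function name and the use of epoch-second timestamps; in the deck this aggregation would run inside Stream Analytics rather than in Python.

```python
from collections import Counter
from typing import Dict, Iterable

WINDOW_SECONDS = 60  # one-minute window, per the "1 Min Window" label


def tumbling_window_counts(timestamps: Iterable[float]) -> Dict[int, int]:
    """Count events per non-overlapping 60-second window.

    Keys are the window start times in epoch seconds.
    """
    counts: Counter = Counter()
    for ts in timestamps:
        # Floor the timestamp to the start of its window.
        window_start = int(ts // WINDOW_SECONDS) * WINDOW_SECONDS
        counts[window_start] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Hypothetical event times, in seconds.
    print(tumbling_window_counts([0, 59.9, 60, 61, 185]))
```

Each event belongs to exactly one window, so the per-window counts always sum to the total number of events.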

Average load: 1,410,000,000 / week
= 201,000,000 / day
= 8,392,000 / hour
= 139,000 / min
= 2,330 / sec

600 increase over 9 hours
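
The average-load breakdown is plain division of the weekly total. A small sketch to reproduce it is shown below; the function name is hypothetical, and because the slide figures appear to be rounded, the computed values match them only approximately.

```python
WEEKLY_EVENTS = 1_410_000_000  # "Average Load" per week, from the slide


def average_rates(events_per_week: int) -> dict:
    """Break a weekly event count down into per-day/hour/minute/second rates."""
    per_day = events_per_week / 7
    per_hour = per_day / 24
    per_minute = per_hour / 60
    per_second = per_minute / 60
    return {
        "day": per_day,
        "hour": per_hour,
        "minute": per_minute,
        "second": per_second,
    }


if __name__ == "__main__":
    for unit, rate in average_rates(WEEKLY_EVENTS).items():
        print(f"{rate:,.0f} events per {unit}")
```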

https://gallery.cortanaintelligence.com/browse?categories=[10]&orderby=freshness desc

4. Customer “expectation”...

...This is the “Domain of the Data Professional”

• Vehicle Telemetry
https://gallery.cortanaintelligence.com/Solution/Telemetry-Analytics
https://gallery.cortanaintelligence.com/Solution/Personalized-Offers-2
https://gallery.cortanaintelligence.com/Solution/Demand-Forecasting-3

• Developing IoT Solutions with Azure IoT
https://www.edx.org/course/developing-iot-solutions-azure-iot-microsoft-dev225x

• Processing Real-Time Data Streams in Azure
https://www.edx.org/course/processing-real-time-data-streams-azure-microsoft-dat223-2x-0

• Orchestrating Big Data with Azure Data Factory
https://www.edx.org/course/orchestrating-big-data-azure-data-microsoft-dat223-3x-0

Social Media Pipeline

[Diagram labels]
• Region: Australia SE
• Function: .NET (C#)
• Azure SQL DB: Sentiment Schema
• Call: Tweet Data, Sentiment, Key Phrases
• powerbi.com
• Azure Machine Learning
• Data Connection
• New Power BI Reports
• (optional) On-demand Data Science
• Power BI Desktop, On-Prem
• Office 365 Power BI
• Executive
• Social Marketing
• C Level Dashboards
• Marketing Dashboards
• Azure Public Cloud
• Tweets, Handles, Tags
• Azure Machine Learning, Region: Southeast Asia
• Azure Cognitive Services, Region: West US
• Text Analytic API
• Sentiment, Key Phrases
• (optional) ML Models
• Twitter
• Logic App: Check Twitter every 3 min

https://powerbi.microsoft.com/en-us/solution-templates/brand-management-twitter

http://azureplatform.azurewebsites.net/en-us

https://azure.microsoft.com/en-au/blog/announcing-azure-time-series-insights

https://code.msdn.microsoft.com/windowsapps/Service-Bus-Explorer-f2abca5a

https://gallery.cortanaintelligence.com

https://docs.microsoft.com/en-us/azure/machine-learning/cortana-analytics-playbook-predictive-maintenance

https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-apps-anomaly-detection-api

https://www.edx.org/course/developing-iot-solutions-azure-iot-microsoft-dev225x

https://www.edx.org/course/processing-real-time-data-streams-azure-microsoft-dat223-2x-0

https://www.edx.org/course/orchestrating-big-data-azure-data-microsoft-dat223-3x-0

https://social.technet.microsoft.com/wiki/contents/articles/33626.lambda-architecture-implementation-using-microsoft-azure.aspx

https://azure.microsoft.com/en-au/updates/microsoft-azure-iot-reference-architecture-available

https://en.wikipedia.org/wiki/Lambda_architecture

https://msdn.microsoft.com/en-us/library/azure/dn834998.aspx

https://msdn.microsoft.com/en-us/library/azure/dn835019.aspx

https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-stream-analytics-query-patterns

http://storageexplorer.com

https://azure.microsoft.com/en-us/services/event-hubs

https://azure.microsoft.com/en-us/services/stream-analytics

https://azure.microsoft.com/en-us/solutions/data-lake
https://azure.microsoft.com/en-us/services/data-lake-analytics

https://azure.microsoft.com/en-us/services/sql-data-warehouse
https://en.wikipedia.org/wiki/Massively_parallel_(computing)

https://azure.microsoft.com/en-us/services/machine-learning

Classification
• Assign a category to each item (e.g. tweet data sentiment analysis)

Regression
• Predict a real value for each item based on features (e.g. predict house sale price)

Clustering
• Partition items into homogeneous groups (e.g. finding similar companies based on characteristics)
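
Of the three task types, regression is the easiest to sketch in a few lines. The example below fits a straight line by ordinary least squares to predict a house sale price from floor area. It is a toy illustration only, not how Azure Machine Learning is used in the deck; the function names, the single feature, and the made-up data points are all assumptions.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(area_m2: float, slope: float, intercept: float) -> float:
    """Predict a real value (price) from a feature (floor area)."""
    return slope * area_m2 + intercept


# Made-up training data: floor area in square metres and sale price.
AREAS = [50.0, 80.0, 100.0, 120.0]
PRICES = [200_000.0, 290_000.0, 350_000.0, 410_000.0]

if __name__ == "__main__":
    slope, intercept = fit_line(AREAS, PRICES)
    print(f"Predicted price for 90 m2: {predict(90.0, slope, intercept):,.0f}")
```

Classification and clustering follow the same pattern of learning from example items, but they output a category or a group assignment instead of a real value.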

Azure Cognitive Services APIs

Give your solutions a human side

https://www.microsoft.com/cognitive-services/en-us/documentation

https://azure.microsoft.com/en-us/services/data-factory

https://azure.microsoft.com/en-us/documentation/articles/data-factory-data-movement-activities

https://azure.microsoft.com/en-us/documentation/articles/data-factory-data-transformation-activities

Azure Data Catalog (ADC)

What is it?
• Fully managed cloud metadata repository service
• Discover, catalog and make searchable various business data sources
• Manage the process of locating and securely consuming those sources
• Crowdsource annotation of the data source tables/objects and columns
• Simple-to-use web interface for registering and managing data sources
• ADC keeps track of the data sources; it DOES NOT hold the data

What can you do with it? (Use Cases)
• Centrally register all relevant business data sources
• Self-Service BI, providing power users a central point to locate the data they need
• Capture tribal business data knowledge (crowdsourcing data documentation)

Azure CosmosDB (DocDB) (NoSQL) (PaaS)

NoSQL document database-as-a-service (PaaS) managed by Microsoft Azure

Native support for JavaScript, SQL and txns over schema-free JSON documents
[JSON = JavaScript Object Notation]

Built for cloud-designed apps
• Write procedures, triggers and UDFs using JavaScript
• Reliable and predictable performance, scale up on demand
• Automatic geo-redundant data copies, automated backup

Page 2: Building Streaming Data Pipelines - WordPress.com · 2017-05-16 · Azure Stream Analytics SQL Database Storage blob Machine Learning Event JSON G -Force Prediction API All Events

Why is data so importantBecause therersquos just so much of it

CLOUD

MOBILE

On-Prem vs IaaS vs PaaS vs SaaS ndash Which One

htt

p

azu

rep

latf

orm

azu

rew

eb

site

snet

Pre

view

Serv

ices

Cortana Intelligence Suite

Action

People

Automated Systems

Apps

Web

Mobile

Bots

Intelligence

Dashboards amp

Visualizations

Personal Digital

Assistant

Bot

Framework

Cognitive

Services

Power BI

Information

Management

Event Hubs

Data Catalog

Data Factory

Machine Learning

and Analytics

HDInsight

(Hadoop and

Spark)

Stream Analytics

Cortana Intelligence

Suite

Data Lake

Analytics

Machine

Learning

Big Data Stores

SQL Data

Warehouse

Data Lake Store

Data Sources

Apps

Sensors and devices

Data

httpssocialtechnetmicrosoftcomwikicontentsarticles33626lambda-architecture-implementation-using-microsoft-azureaspx

httpsgallerycortanaintelligencecomSolutionTelemetry-Analytics

httpsdocsmicrosoftcomen-usazuremachine-learningcortana-analytics-playbook-vehicle-telemetry

SP

EED

LA

YER

BA

TC

H L

AY

ER

SER

VIN

G L

AY

ER

a data pipeline is the software that consolidates data from

multiple sources and makes it available to be used strategically

a pipeline is a set of data processing elements connected in series where

the output of one element is the input of the next one The elements of a

pipeline are often executed in parallel or in time-sliced fashion

1 Customers are on a multi-year transformational journey

2 Many data sources are not static or at rest

3 Solutions cannot wait for data to be landed before using it

4 building pipelineshellip

bull Historically Complex costly time consuming

bull Today Fast simple ldquofit for purposerdquo services from same data platform

As modern day Data Professionals we have to deal with it

was

Up till ~5-10 years ago it was a central relational platform

hellipandhellip included relational-like services (OLTP OLAP DW ETL MDM +)

hellipandhellip often on-prem or in a hosted DC

hellipandhellip rarely hosted in external public cloud providers (Azure AWS +)

Occasionally included special projects (ie Big Data NoSQL IoT)

httpsmrfoxsqlwordpresscom20170419what-exactly-is-the-data-platform-nowadays

is now

Event Hubs Stream Analytics

SQL Data

Warehouse

Storage blob

Logic App

Machine Learning

Data Lake

Data Factory

Machine LearningAPI Calls Selective

Load

SelectiveLoad

Report

CognitiveAPI Calls

Real-TimeReport

Archive

ScheduledPull

Full Load

IntelligenceReport

AnalyticsReport

TrendReport

Ingestion

CEPIn-Stream Analytics

Data Movement Orchestration

Reporting Visualisation

Workflow Logic

Intelligence

Operationalised Data Science

Unstructured Storage

Structured Storage

Data Flow

General Archive

Azure

Stream Analytics SQL Database

Storage blob

Machine Learning

EventJSON

G-Force PredictionAPI

All EventsCSV

REAL-TIME Event Telemetry Report

Streaming Dataset

Event ArchiveJSON

EventJSON

ON-DEMANDEvent Trend Report

SQL Query

Event Hubs

Alert EventsG-Force gt 3

JSON

Function

New EventTriggerJSON

EVENT-DRIVENTwilio Phone Call

IoT HubMobile

Alert EventCSV

SH_Data_Streaming

(West Europe)

Event Hub

SHIngress

JSONEvent Type

Blob Store ADLS

SHEventStore

Realtime Stream(200K rows moving window)

Reportingdata

Server

Server

Server

Server

Stream Analytics

SHEgressASDB

Telemetry

Bookings

Agents

Proviers

Stream Analytics

SHEgressPBI

AVRO Event ArchiveBatch

(COLD Path)

JSON EventsStream

(HOT Path)

TabularEvents

JSON EventsStream

(HOT Path)

Real Time

Dashboards

(troyearle)

Historical

Reports

(troyearle)

search

SQL SP

Logic App

SHLogicApp

PostEvents

JSONEvent

SQL DB

SHEventHistory

(Short Term Store)

ServiceBus QSHSBQEgress

JSON Report

alerts

reports

reports

ref data

Archive

Avg 56GBday

AEH Input SU12

Max 3900sec

Avg 2200sec

SQL DB P2 (20)

Max 3900sec

Avg 2200sec

(5 days = 1b rows)

(1 year = 72b rows)

Alerts Reports

~1hour

Service Bus Queue

~1hour

PBI Input

3900Sec

Telemetry Input

3900sec

SH event AEH ASA = lt 5 sec ASA SQL = lt 5 sec

Hourly

On-Demand

1 Min Window

Average Load 1410000000 week

= 201000000 day

= 8392000 hour

= 139000 min

= 2330 sec

600 increaseover 9 hours

httpsgallerycortanaintelligencecombrowsecategories=[10]amporderby=freshness desc

4 Customer ldquoexpectationrdquohellip

hellipThis is the ldquoDomain of the Data Professionalrdquo

bull Vehicle Telemetry

httpsgallerycortanaintelligencecomSolutionTelemetry-Analytics

httpsgallerycortanaintelligencecomSolutionPersonalized-Offers-2

httpsgallerycortanaintelligencecomSolutionDemand-Forecasting-3

bull Developing IoT Solutions with Azure IoT

httpswwwedxorgcoursedeveloping-iot-solutions-azure-iot-microsoft-dev225x

bull Processing Real-Time Data Streams in Azure

httpswwwedxorgcourseprocessing-real-time-data-streams-azure-microsoft-dat223-2x-0

bull Orchestrating Big Data with Azure Data Factory

httpswwwedxorgcourseorchestrating-big-data-azure-data-microsoft-dat223-3x-0

Social Media PipelineRegion Australia SE

FunctionNet (C)

Azure SQL DBSentiment Schema

CallTweet DataSentiment

Key Phrases

powerbicom

Azure Machine Learning

DataConnection

NewPower BIReports

(optional)On demandData Science

Power BI DesktopOn-Prem

Office 365Power BI

Executive

Social Marketing

C Level Dashboards

MarketingDashboards

Azure Public Cloud

TweetsHandles

Tags

Azure Machine LearningRegion Southeast Asia

Azure Cognitive ServicesRegion West US

Text Analytic API

SentimentKey Phrases

(optional)ML Models

Twitter

Logic AppCheck TwitterEvery 3 min

httpspowerbimicrosoftcomen-ussolution-templatesbrand-management-twitter

httpazureplatformazurewebsitesneten-us

httpsazuremicrosoftcomen-aublogannouncing-azure-time-series-insights

httpscodemsdnmicrosoftcomwindowsappsService-Bus-Explorer-f2abca5a

httpsgallerycortanaintelligencecom

httpsdocsmicrosoftcomen-usazuremachine-learningcortana-analytics-playbook-predictive-maintenance

httpsdocsmicrosoftcomen-usazuremachine-learningmachine-learning-apps-anomaly-detection-api

httpswwwedxorgcoursedeveloping-iot-solutions-azure-iot-microsoft-dev225x

httpswwwedxorgcourseprocessing-real-time-data-streams-azure-microsoft-dat223-2x-0

httpswwwedxorgcourseorchestrating-big-data-azure-data-microsoft-dat223-3x-0

httpssocialtechnetmicrosoftcomwikicontentsarticles33626lambda-architecture-implementation-using-microsoft-azureaspx

httpsazuremicrosoftcomen-auupdatesmicrosoft-azure-iot-reference-architecture-available

httpsenwikipediaorgwikiLambda_architecture

httpsmsdnmicrosoftcomen-uslibraryazuredn834998aspx

httpsmsdnmicrosoftcomen-uslibraryazuredn835019aspx

httpsdocsmicrosoftcomen-usazurestream-analyticsstream-analytics-stream-analytics-query-patterns

httpstorageexplorercom

httpsazuremicrosoftcomen-usservicesevent-hubs

httpsazuremicrosoftcomen-usservicesstream-analytics

httpsazuremicrosoftcomen-ussolutionsdata-lakehttpsazuremicrosoftcomen-usservicesdata-lake-analytics

httpsazuremicrosoftcomen-usservicessql-data-warehousehttpsenwikipediaorgwikiMassively_parallel_(computing)

httpsazuremicrosoftcomen-usservicesmachine-learning

Classification

bull Assign a category to each item

(ie tweet data sentiment

analysis)

Regression

bull Predict a real value for each

item based on features

(ie predict house sale price)

Clustering

bull Partition items into

homogeneous groups

(ie finding similar companies

based on characteristics)

Azure Cognitive

Services APIrsquos

Give your solutions

a human side

httpswwwmicrosoftcomcognitive-servicesen-usdocumentation

httpsazuremicrosoftcomen-usservicesdata-factory

httpsazuremicrosoftcomen-usdocumentationarticlesdata-factory-data-movement-activities

httpsazuremicrosoftcomen-usdocumentationarticlesdata-factory-data-transformation-activities

What is itFully managed cloud metadata repository service

Discover catalog and make searchable various business data sources

Manage the process of locating and securely consuming those sources

Crowdsource annotation of the data source tablesobjects and columns

Simple to use web interface for registering and managing data sources

ADC keeps track of the data sources it DOES NOT hold the data

What can you do with it (Use Cases)Want to centrally register all relevant business data sources

Self-Service BI and providing power users a central point to locate the data they need

Capturing tribal business data knowledge (crowdsourcing data documentation)

Azure CosmosDB (DocDB) (NoSQL) (PaaS)NoSQL document database-as-a-service (PaaS) managed by Microsoft Azure

Native support for JavaScript SQL and txns over schema-free JSON documents

[JSON = JavaScript Object Notation]

Built for cloud-designed apps

bull Write procedures triggers and UDFrsquos using JavaScript

bull Reliable and predictable performance scale up on demand

bull Automatic geo-redundant data copies automated backup

Page 3: Building Streaming Data Pipelines - WordPress.com · 2017-05-16 · Azure Stream Analytics SQL Database Storage blob Machine Learning Event JSON G -Force Prediction API All Events

On-Prem vs IaaS vs PaaS vs SaaS ndash Which One

htt

p

azu

rep

latf

orm

azu

rew

eb

site

snet

Pre

view

Serv

ices

Cortana Intelligence Suite

Action

People

Automated Systems

Apps

Web

Mobile

Bots

Intelligence

Dashboards amp

Visualizations

Personal Digital

Assistant

Bot

Framework

Cognitive

Services

Power BI

Information

Management

Event Hubs

Data Catalog

Data Factory

Machine Learning

and Analytics

HDInsight

(Hadoop and

Spark)

Stream Analytics

Cortana Intelligence

Suite

Data Lake

Analytics

Machine

Learning

Big Data Stores

SQL Data

Warehouse

Data Lake Store

Data Sources

Apps

Sensors and devices

Data

httpssocialtechnetmicrosoftcomwikicontentsarticles33626lambda-architecture-implementation-using-microsoft-azureaspx

httpsgallerycortanaintelligencecomSolutionTelemetry-Analytics

httpsdocsmicrosoftcomen-usazuremachine-learningcortana-analytics-playbook-vehicle-telemetry

SP

EED

LA

YER

BA

TC

H L

AY

ER

SER

VIN

G L

AY

ER

a data pipeline is the software that consolidates data from

multiple sources and makes it available to be used strategically

a pipeline is a set of data processing elements connected in series where

the output of one element is the input of the next one The elements of a

pipeline are often executed in parallel or in time-sliced fashion

1 Customers are on a multi-year transformational journey

2 Many data sources are not static or at rest

3 Solutions cannot wait for data to be landed before using it

4 building pipelineshellip

bull Historically Complex costly time consuming

bull Today Fast simple ldquofit for purposerdquo services from same data platform

As modern day Data Professionals we have to deal with it

was

Up till ~5-10 years ago it was a central relational platform

hellipandhellip included relational-like services (OLTP OLAP DW ETL MDM +)

hellipandhellip often on-prem or in a hosted DC

hellipandhellip rarely hosted in external public cloud providers (Azure AWS +)

Occasionally included special projects (ie Big Data NoSQL IoT)

httpsmrfoxsqlwordpresscom20170419what-exactly-is-the-data-platform-nowadays

is now

Event Hubs Stream Analytics

SQL Data

Warehouse

Storage blob

Logic App

Machine Learning

Data Lake

Data Factory

Machine LearningAPI Calls Selective

Load

SelectiveLoad

Report

CognitiveAPI Calls

Real-TimeReport

Archive

ScheduledPull

Full Load

IntelligenceReport

AnalyticsReport

TrendReport

Ingestion

CEPIn-Stream Analytics

Data Movement Orchestration

Reporting Visualisation

Workflow Logic

Intelligence

Operationalised Data Science

Unstructured Storage

Structured Storage

Data Flow

General Archive

Azure

Stream Analytics SQL Database

Storage blob

Machine Learning

EventJSON

G-Force PredictionAPI

All EventsCSV

REAL-TIME Event Telemetry Report

Streaming Dataset

Event ArchiveJSON

EventJSON

ON-DEMANDEvent Trend Report

SQL Query

Event Hubs

Alert EventsG-Force gt 3

JSON

Function

New EventTriggerJSON

EVENT-DRIVENTwilio Phone Call

IoT HubMobile

Alert EventCSV

SH_Data_Streaming

(West Europe)

Event Hub

SHIngress

JSONEvent Type

Blob Store ADLS

SHEventStore

Realtime Stream(200K rows moving window)

Reportingdata

Server

Server

Server

Server

Stream Analytics

SHEgressASDB

Telemetry

Bookings

Agents

Proviers

Stream Analytics

SHEgressPBI

AVRO Event ArchiveBatch

(COLD Path)

JSON EventsStream

(HOT Path)

TabularEvents

JSON EventsStream

(HOT Path)

Real Time

Dashboards

(troyearle)

Historical

Reports

(troyearle)

search

SQL SP

Logic App

SHLogicApp

PostEvents

JSONEvent

SQL DB

SHEventHistory

(Short Term Store)

ServiceBus QSHSBQEgress

JSON Report

alerts

reports

reports

ref data

Archive

Avg 56GBday

AEH Input SU12

Max 3900sec

Avg 2200sec

SQL DB P2 (20)

Max 3900sec

Avg 2200sec

(5 days = 1b rows)

(1 year = 72b rows)

Alerts Reports

~1hour

Service Bus Queue

~1hour

PBI Input

3900Sec

Telemetry Input

3900sec

SH event AEH ASA = lt 5 sec ASA SQL = lt 5 sec

Hourly

On-Demand

1 Min Window

Average Load 1410000000 week

= 201000000 day

= 8392000 hour

= 139000 min

= 2330 sec

600 increaseover 9 hours

httpsgallerycortanaintelligencecombrowsecategories=[10]amporderby=freshness desc

4 Customer ldquoexpectationrdquohellip

hellipThis is the ldquoDomain of the Data Professionalrdquo

bull Vehicle Telemetry

httpsgallerycortanaintelligencecomSolutionTelemetry-Analytics

httpsgallerycortanaintelligencecomSolutionPersonalized-Offers-2

httpsgallerycortanaintelligencecomSolutionDemand-Forecasting-3

bull Developing IoT Solutions with Azure IoT

httpswwwedxorgcoursedeveloping-iot-solutions-azure-iot-microsoft-dev225x

bull Processing Real-Time Data Streams in Azure

httpswwwedxorgcourseprocessing-real-time-data-streams-azure-microsoft-dat223-2x-0

bull Orchestrating Big Data with Azure Data Factory

httpswwwedxorgcourseorchestrating-big-data-azure-data-microsoft-dat223-3x-0

Social Media PipelineRegion Australia SE

FunctionNet (C)

Azure SQL DBSentiment Schema

CallTweet DataSentiment

Key Phrases

powerbicom

Azure Machine Learning

DataConnection

NewPower BIReports

(optional)On demandData Science

Power BI DesktopOn-Prem

Office 365Power BI

Executive

Social Marketing

C Level Dashboards

MarketingDashboards

Azure Public Cloud

TweetsHandles

Tags

Azure Machine LearningRegion Southeast Asia

Azure Cognitive ServicesRegion West US

Text Analytic API

SentimentKey Phrases

(optional)ML Models

Twitter

Logic AppCheck TwitterEvery 3 min

httpspowerbimicrosoftcomen-ussolution-templatesbrand-management-twitter

httpazureplatformazurewebsitesneten-us

httpsazuremicrosoftcomen-aublogannouncing-azure-time-series-insights

httpscodemsdnmicrosoftcomwindowsappsService-Bus-Explorer-f2abca5a

httpsgallerycortanaintelligencecom

httpsdocsmicrosoftcomen-usazuremachine-learningcortana-analytics-playbook-predictive-maintenance

httpsdocsmicrosoftcomen-usazuremachine-learningmachine-learning-apps-anomaly-detection-api

httpswwwedxorgcoursedeveloping-iot-solutions-azure-iot-microsoft-dev225x

httpswwwedxorgcourseprocessing-real-time-data-streams-azure-microsoft-dat223-2x-0

httpswwwedxorgcourseorchestrating-big-data-azure-data-microsoft-dat223-3x-0

httpssocialtechnetmicrosoftcomwikicontentsarticles33626lambda-architecture-implementation-using-microsoft-azureaspx

httpsazuremicrosoftcomen-auupdatesmicrosoft-azure-iot-reference-architecture-available

httpsenwikipediaorgwikiLambda_architecture

httpsmsdnmicrosoftcomen-uslibraryazuredn834998aspx

httpsmsdnmicrosoftcomen-uslibraryazuredn835019aspx

httpsdocsmicrosoftcomen-usazurestream-analyticsstream-analytics-stream-analytics-query-patterns

httpstorageexplorercom

httpsazuremicrosoftcomen-usservicesevent-hubs

httpsazuremicrosoftcomen-usservicesstream-analytics

httpsazuremicrosoftcomen-ussolutionsdata-lakehttpsazuremicrosoftcomen-usservicesdata-lake-analytics

httpsazuremicrosoftcomen-usservicessql-data-warehousehttpsenwikipediaorgwikiMassively_parallel_(computing)

httpsazuremicrosoftcomen-usservicesmachine-learning

Classification

bull Assign a category to each item

(ie tweet data sentiment

analysis)

Regression

bull Predict a real value for each

item based on features

(ie predict house sale price)

Clustering

bull Partition items into

homogeneous groups

(ie finding similar companies

based on characteristics)

Azure Cognitive

Services APIrsquos

Give your solutions

a human side

httpswwwmicrosoftcomcognitive-servicesen-usdocumentation

httpsazuremicrosoftcomen-usservicesdata-factory

httpsazuremicrosoftcomen-usdocumentationarticlesdata-factory-data-movement-activities

httpsazuremicrosoftcomen-usdocumentationarticlesdata-factory-data-transformation-activities

What is itFully managed cloud metadata repository service

Discover catalog and make searchable various business data sources

Manage the process of locating and securely consuming those sources

Crowdsource annotation of the data source tablesobjects and columns

Simple to use web interface for registering and managing data sources

ADC keeps track of the data sources it DOES NOT hold the data

What can you do with it (Use Cases)Want to centrally register all relevant business data sources

Self-Service BI and providing power users a central point to locate the data they need

Capturing tribal business data knowledge (crowdsourcing data documentation)

Azure CosmosDB (DocDB) (NoSQL) (PaaS)NoSQL document database-as-a-service (PaaS) managed by Microsoft Azure

Native support for JavaScript SQL and txns over schema-free JSON documents

[JSON = JavaScript Object Notation]

Built for cloud-designed apps

bull Write procedures triggers and UDFrsquos using JavaScript

bull Reliable and predictable performance scale up on demand

bull Automatic geo-redundant data copies automated backup

Page 4: Building Streaming Data Pipelines - WordPress.com · 2017-05-16 · Azure Stream Analytics SQL Database Storage blob Machine Learning Event JSON G -Force Prediction API All Events

htt

p

azu

rep

latf

orm

azu

rew

eb

site

snet

Pre

view

Serv

ices

Cortana Intelligence Suite

Action

People

Automated Systems

Apps

Web

Mobile

Bots

Intelligence

Dashboards amp

Visualizations

Personal Digital

Assistant

Bot

Framework

Cognitive

Services

Power BI

Information

Management

Event Hubs

Data Catalog

Data Factory

Machine Learning

and Analytics

HDInsight

(Hadoop and

Spark)

Stream Analytics

Cortana Intelligence

Suite

Data Lake

Analytics

Machine

Learning

Big Data Stores

SQL Data

Warehouse

Data Lake Store

Data Sources

Apps

Sensors and devices

Data

httpssocialtechnetmicrosoftcomwikicontentsarticles33626lambda-architecture-implementation-using-microsoft-azureaspx

httpsgallerycortanaintelligencecomSolutionTelemetry-Analytics

httpsdocsmicrosoftcomen-usazuremachine-learningcortana-analytics-playbook-vehicle-telemetry

SP

EED

LA

YER

BA

TC

H L

AY

ER

SER

VIN

G L

AY

ER

a data pipeline is the software that consolidates data from

multiple sources and makes it available to be used strategically

a pipeline is a set of data processing elements connected in series where

the output of one element is the input of the next one The elements of a

pipeline are often executed in parallel or in time-sliced fashion

1 Customers are on a multi-year transformational journey

2 Many data sources are not static or at rest

3 Solutions cannot wait for data to be landed before using it

4 building pipelineshellip

bull Historically Complex costly time consuming

bull Today Fast simple ldquofit for purposerdquo services from same data platform

As modern day Data Professionals we have to deal with it

was

Up till ~5-10 years ago it was a central relational platform

hellipandhellip included relational-like services (OLTP OLAP DW ETL MDM +)

hellipandhellip often on-prem or in a hosted DC

hellipandhellip rarely hosted in external public cloud providers (Azure AWS +)

Occasionally included special projects (ie Big Data NoSQL IoT)

httpsmrfoxsqlwordpresscom20170419what-exactly-is-the-data-platform-nowadays

is now

Event Hubs Stream Analytics

SQL Data

Warehouse

Storage blob

Logic App

Machine Learning

Data Lake

Data Factory

Machine LearningAPI Calls Selective

Load

SelectiveLoad

Report

CognitiveAPI Calls

Real-TimeReport

Archive

ScheduledPull

Full Load

IntelligenceReport

AnalyticsReport

TrendReport

Ingestion

CEPIn-Stream Analytics

Data Movement Orchestration

Reporting Visualisation

Workflow Logic

Intelligence

Operationalised Data Science

Unstructured Storage

Structured Storage

Data Flow

General Archive

Azure

Stream Analytics SQL Database

Storage blob

Machine Learning

EventJSON

G-Force PredictionAPI

All EventsCSV

REAL-TIME Event Telemetry Report

Streaming Dataset

Event ArchiveJSON

EventJSON

ON-DEMANDEvent Trend Report

SQL Query

Event Hubs

Alert EventsG-Force gt 3

JSON

Function

New EventTriggerJSON

EVENT-DRIVENTwilio Phone Call

IoT HubMobile

Alert EventCSV

SH_Data_Streaming

(West Europe)

Event Hub

SHIngress

JSONEvent Type

Blob Store ADLS

SHEventStore

Realtime Stream(200K rows moving window)

Reportingdata

Server

Server

Server

Server

Stream Analytics

SHEgressASDB

Telemetry

Bookings

Agents

Proviers

Stream Analytics

SHEgressPBI

AVRO Event ArchiveBatch

(COLD Path)

JSON EventsStream

(HOT Path)

TabularEvents

JSON EventsStream

(HOT Path)

Real Time

Dashboards

(troyearle)

Historical

Reports

(troyearle)

search

SQL SP

Logic App

SHLogicApp

PostEvents

JSONEvent

SQL DB

SHEventHistory

(Short Term Store)

ServiceBus QSHSBQEgress

JSON Report

alerts

reports

reports

ref data

Archive

Avg 56GBday

AEH Input SU12

Max 3900sec

Avg 2200sec

SQL DB P2 (20)

Max 3900sec

Avg 2200sec

(5 days = 1b rows)

(1 year = 72b rows)

Alerts Reports

~1hour

Service Bus Queue

~1hour

PBI Input

3900Sec

Telemetry Input

3900sec

SH event AEH ASA = lt 5 sec ASA SQL = lt 5 sec

Hourly

On-Demand

1 Min Window

Average Load 1410000000 week

= 201000000 day

= 8392000 hour

= 139000 min

= 2330 sec

600 increaseover 9 hours

https://gallery.cortanaintelligence.com/browse?categories=[10]&orderby=freshness desc

4 Customer “expectation”…

…This is the “Domain of the Data Professional”

• Vehicle Telemetry

https://gallery.cortanaintelligence.com/Solution/Telemetry-Analytics

https://gallery.cortanaintelligence.com/Solution/Personalized-Offers-2

https://gallery.cortanaintelligence.com/Solution/Demand-Forecasting-3

• Developing IoT Solutions with Azure IoT

https://www.edx.org/course/developing-iot-solutions-azure-iot-microsoft-dev225x

• Processing Real-Time Data Streams in Azure

https://www.edx.org/course/processing-real-time-data-streams-azure-microsoft-dat223-2x-0

• Orchestrating Big Data with Azure Data Factory

https://www.edx.org/course/orchestrating-big-data-azure-data-microsoft-dat223-3x-0

Social Media Pipeline (Region: Australia SE), diagram labels

Source: Twitter (Tweets, Handles, Tags)

Azure Public Cloud: Logic App (Check Twitter every 3 min); Function, .NET (C#); Azure SQL DB (Sentiment Schema); Azure Machine Learning, Region Southeast Asia ((optional) ML Models, (optional) On demand Data Science); Azure Cognitive Services, Region West US (Text Analytic API: Sentiment, Key Phrases)

Calls and data: Call; Tweet Data; Sentiment; Key Phrases; Data Connection

Reporting: powerbi.com; Office 365 Power BI; Power BI Desktop (On-Prem); New Power BI Reports; C Level Dashboards; Marketing Dashboards

Audiences: Executive, Social Marketing
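The shape of this pipeline (poll for tweets on a schedule, score each one for sentiment, store the scored rows for reporting) can be sketched in Python. Everything here is an assumption made for illustration: `toy_sentiment` is a crude stand-in and not the Text Analytics API, the 0.0 to 1.0 score scale is assumed, and `ScoredTweet` is a hypothetical row shape rather than the deck's Sentiment Schema.

```python
from dataclasses import dataclass

# From the diagram label "Check Twitter every 3 min".
POLL_INTERVAL_SECONDS = 3 * 60


@dataclass
class ScoredTweet:
    """Hypothetical row shape for the sentiment table."""
    handle: str
    text: str
    sentiment: float  # assumed scale: 0.0 (negative) to 1.0 (positive)


def toy_sentiment(text):
    """Crude word-count scorer; a stand-in for a real sentiment service."""
    positive = {"great", "love", "good"}
    negative = {"bad", "hate", "awful"}
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    # Map the signed word count onto the assumed 0..1 scale and clamp it.
    return max(0.0, min(1.0, 0.5 + 0.25 * score))


def score_tweets(tweets, scorer=toy_sentiment):
    """Score one polled batch of (handle, text) pairs."""
    return [ScoredTweet(handle, text, scorer(text)) for handle, text in tweets]


if __name__ == "__main__":
    batch = [("@a", "I love this, great!"), ("@b", "That was awful")]
    for row in score_tweets(batch):
        print(row)
```

Passing the scorer as a parameter keeps the batch logic independent of whichever sentiment service is actually called.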

https://powerbi.microsoft.com/en-us/solution-templates/brand-management-twitter

http://azureplatform.azurewebsites.net/en-us

https://azure.microsoft.com/en-au/blog/announcing-azure-time-series-insights

https://code.msdn.microsoft.com/windowsapps/Service-Bus-Explorer-f2abca5a

https://gallery.cortanaintelligence.com

https://docs.microsoft.com/en-us/azure/machine-learning/cortana-analytics-playbook-predictive-maintenance

https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-apps-anomaly-detection-api

https://www.edx.org/course/developing-iot-solutions-azure-iot-microsoft-dev225x

https://www.edx.org/course/processing-real-time-data-streams-azure-microsoft-dat223-2x-0

https://www.edx.org/course/orchestrating-big-data-azure-data-microsoft-dat223-3x-0

https://social.technet.microsoft.com/wiki/contents/articles/33626.lambda-architecture-implementation-using-microsoft-azure.aspx

https://azure.microsoft.com/en-au/updates/microsoft-azure-iot-reference-architecture-available

https://en.wikipedia.org/wiki/Lambda_architecture

https://msdn.microsoft.com/en-us/library/azure/dn834998.aspx

https://msdn.microsoft.com/en-us/library/azure/dn835019.aspx

https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-stream-analytics-query-patterns

http://storageexplorer.com

https://azure.microsoft.com/en-us/services/event-hubs

https://azure.microsoft.com/en-us/services/stream-analytics

https://azure.microsoft.com/en-us/solutions/data-lake

https://azure.microsoft.com/en-us/services/data-lake-analytics

https://azure.microsoft.com/en-us/services/sql-data-warehouse

https://en.wikipedia.org/wiki/Massively_parallel_(computing)

https://azure.microsoft.com/en-us/services/machine-learning

Classification

• Assign a category to each item (e.g. tweet data sentiment analysis)

Regression

• Predict a real value for each item based on features (e.g. predict house sale price)

Clustering

• Partition items into homogeneous groups (e.g. finding similar companies based on characteristics)
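Of the three task types, regression is the easiest to show in a few lines. The sketch below fits a straight line by ordinary least squares in plain Python, using the slide's house-price example; the sizes and prices are made-up numbers, and this is not how Azure Machine Learning trains its models.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict a real value for a new item from its single feature."""
    return intercept + slope * x


if __name__ == "__main__":
    # Made-up training data: house size (m^2) and sale price (thousands).
    sizes = [50, 100, 150]
    prices = [200, 300, 400]
    slope, intercept = fit_line(sizes, prices)
    print(predict(slope, intercept, 120))
```

Classification and clustering follow the same pattern of learning from items and their features, but output a category or a group assignment instead of a real value.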

Azure Cognitive Services APIs

Give your solutions a human side

https://www.microsoft.com/cognitive-services/en-us/documentation

https://azure.microsoft.com/en-us/services/data-factory

https://azure.microsoft.com/en-us/documentation/articles/data-factory-data-movement-activities

https://azure.microsoft.com/en-us/documentation/articles/data-factory-data-transformation-activities

What is it?

Fully managed cloud metadata repository service

Discover, catalog, and make searchable various business data sources

Manage the process of locating and securely consuming those sources

Crowdsource annotation of the data source tables/objects and columns

Simple-to-use web interface for registering and managing data sources

Azure Data Catalog (ADC) keeps track of the data sources; it DOES NOT hold the data

What can you do with it? (Use Cases)

Centrally register all relevant business data sources

Self-Service BI: give power users a central point to locate the data they need

Capture tribal business data knowledge (crowdsourced data documentation)

Azure CosmosDB (DocDB) (NoSQL) (PaaS)

NoSQL document database-as-a-service (PaaS) managed by Microsoft Azure

Native support for JavaScript, SQL, and transactions over schema-free JSON documents

[JSON = JavaScript Object Notation]

Built for cloud-designed apps

• Write stored procedures, triggers, and UDFs using JavaScript

• Reliable and predictable performance; scale up on demand

• Automatic geo-redundant data copies; automated backup

Page 8: Building Streaming Data Pipelines - WordPress.com · 2017-05-16 · Azure Stream Analytics SQL Database Storage blob Machine Learning Event JSON G -Force Prediction API All Events

1 Customers are on a multi-year transformational journey

2 Many data sources are not static or at rest

3 Solutions cannot wait for data to be landed before using it

4 building pipelineshellip

bull Historically Complex costly time consuming

bull Today Fast simple ldquofit for purposerdquo services from same data platform

As modern day Data Professionals we have to deal with it

was

Up till ~5-10 years ago it was a central relational platform

hellipandhellip included relational-like services (OLTP OLAP DW ETL MDM +)

hellipandhellip often on-prem or in a hosted DC

hellipandhellip rarely hosted in external public cloud providers (Azure AWS +)

Occasionally included special projects (ie Big Data NoSQL IoT)

httpsmrfoxsqlwordpresscom20170419what-exactly-is-the-data-platform-nowadays

is now

Event Hubs Stream Analytics

SQL Data

Warehouse

Storage blob

Logic App

Machine Learning

Data Lake

Data Factory

Machine LearningAPI Calls Selective

Load

SelectiveLoad

Report

CognitiveAPI Calls

Real-TimeReport

Archive

ScheduledPull

Full Load

IntelligenceReport

AnalyticsReport

TrendReport

Ingestion

CEPIn-Stream Analytics

Data Movement Orchestration

Reporting Visualisation

Workflow Logic

Intelligence

Operationalised Data Science

Unstructured Storage

Structured Storage

Data Flow

General Archive

Azure

Stream Analytics SQL Database

Storage blob

Machine Learning

EventJSON

G-Force PredictionAPI

All EventsCSV

REAL-TIME Event Telemetry Report

Streaming Dataset

Event ArchiveJSON

EventJSON

ON-DEMANDEvent Trend Report

SQL Query

Event Hubs

Alert EventsG-Force gt 3

JSON

Function

New EventTriggerJSON

EVENT-DRIVENTwilio Phone Call

IoT HubMobile

Alert EventCSV

[Diagram: SH_Data_Streaming (West Europe). Sources: four servers; event types Telemetry, Bookings, Agents, Providers (JSON, by event type). Components: Event Hub SHIngress; Stream Analytics SHEgressASDB; Stream Analytics SHEgressPBI; Blob Store / ADLS SHEventStore; SQL DB SHEventHistory (Short Term Store); Logic App SHLogicApp (PostEvents); Service Bus Queue SHSBQEgress; SQL SP. Flow labels: JSON Events Stream (HOT Path), twice; AVRO Event Archive Batch (COLD Path); Tabular Events; Realtime Stream (200K rows moving window); JSON Event; JSON Report; reporting data; ref data; search; alerts; reports; Archive. Outputs: Real Time Dashboards (troyearle); Historical Reports (troyearle).]

Avg 56 GB/day

AEH Input SU12

Max 3,900/sec

Avg 2,200/sec

SQL DB P2 (20)

Max 3,900/sec

Avg 2,200/sec

(5 days = 1b rows)

(1 year = 72b rows)
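The row counts above follow from the average ingest rate. A quick check in Python, assuming the slide's average of 2,200 events per second holds around the clock and that one event becomes one row (both are assumptions, the deck does not state them):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# "Avg 2200/sec" from the slide; one row per event is assumed.
AVG_EVENTS_PER_SECOND = 2200


def rows_accumulated(events_per_second, days):
    """Rows stored after `days` of constant ingest at `events_per_second`."""
    return events_per_second * SECONDS_PER_DAY * days


five_days = rows_accumulated(AVG_EVENTS_PER_SECOND, 5)
one_year = rows_accumulated(AVG_EVENTS_PER_SECOND, 365)

print(f"5 days: {five_days:,} rows")   # 950,400,000
print(f"1 year: {one_year:,} rows")    # 69,379,200,000
```

Five days comes to roughly the slide's 1b rows. A full year at 2,200/sec comes to about 69b, slightly below the slide's 72b; the deck does not say how its figure was derived, so treat the difference as rounding or a somewhat higher assumed rate.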

Alerts / Reports

~1 hour

Service Bus Queue

~1 hour

PBI Input

3,900/sec

Telemetry Input

3,900/sec

SH event to AEH to ASA = < 5 sec; ASA to SQL = < 5 sec

Hourly

On-Demand

1 Min Window
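A one-minute window groups the stream into fixed time buckets before aggregating. The sketch below assumes a tumbling (non-overlapping) window and events carrying an epoch-seconds timestamp; the slide only says "1 Min Window", so both the window type and the event shape are assumptions, and in the deck this work is done by Stream Analytics rather than Python.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # "1 Min Window" from the slide


def tumbling_window_counts(events):
    """Count events per non-overlapping 60-second window.

    `events` is an iterable of (epoch_seconds, payload) tuples.
    Returns a dict mapping each window's start time to its event count.
    """
    counts = defaultdict(int)
    for timestamp, _payload in events:
        # Round the timestamp down to the start of its window.
        window_start = timestamp - (timestamp % WINDOW_SECONDS)
        counts[window_start] += 1
    return dict(counts)


events = [(0, "a"), (59, "b"), (60, "c"), (125, "d"), (179, "e")]
window_counts = tumbling_window_counts(events)
print(window_counts)  # {0: 2, 60: 1, 120: 2}
```

Each event falls into exactly one window, which is what distinguishes a tumbling window from a sliding or hopping one.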

Average Load: 1,410,000,000 / week

= 201,000,000 / day

= 8,392,000 / hour

= 139,000 / min

= 2,330 / sec
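The breakdown above is the weekly total divided down step by step. A short Python check, using only the weekly figure from the slide:

```python
WEEKLY_EVENTS = 1_410_000_000  # "Average Load" per week, from the slide

per_day = WEEKLY_EVENTS / 7
per_hour = per_day / 24
per_minute = per_hour / 60
per_second = per_minute / 60

for label, value in [
    ("day", per_day),
    ("hour", per_hour),
    ("min", per_minute),
    ("sec", per_second),
]:
    print(f"= {value:,.0f} / {label}")
```

The exact values are about 201.4 million per day, 8.39 million per hour, 139.9 thousand per minute and 2,331 per second, so the slide's figures appear to be rounded down.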

600 increase over 9 hours

https://gallery.cortanaintelligence.com/browse?categories=[10]&orderby=freshness desc

4. Customer "expectation"…

…This is the "Domain of the Data Professional"

• Vehicle Telemetry

https://gallery.cortanaintelligence.com/Solution/Telemetry-Analytics

https://gallery.cortanaintelligence.com/Solution/Personalized-Offers-2

https://gallery.cortanaintelligence.com/Solution/Demand-Forecasting-3

• Developing IoT Solutions with Azure IoT

https://www.edx.org/course/developing-iot-solutions-azure-iot-microsoft-dev225x

• Processing Real-Time Data Streams in Azure

https://www.edx.org/course/processing-real-time-data-streams-azure-microsoft-dat223-2x-0

• Orchestrating Big Data with Azure Data Factory

https://www.edx.org/course/orchestrating-big-data-azure-data-microsoft-dat223-3x-0

[Diagram: Social Media Pipeline on the Azure Public Cloud (Region: Australia SE). Source: Twitter (tweets, handles, tags). Components: Logic App (check Twitter every 3 min); Function, .Net (C#); Azure Cognitive Services Text Analytics API (Region: West US; sentiment, key phrases); Azure Machine Learning (Region: Southeast Asia; optional ML models, optional on-demand data science); Azure SQL DB (Sentiment schema); powerbi.com, Office 365 Power BI, Power BI Desktop (on-prem). Flow labels: Call; Tweet Data; Sentiment; Key Phrases; Data Connection; New Power BI Reports. Consumers: Executive (C Level Dashboards); Social Marketing (Marketing Dashboards).]

https://powerbi.microsoft.com/en-us/solution-templates/brand-management-twitter

http://azureplatform.azurewebsites.net/en-us

https://azure.microsoft.com/en-au/blog/announcing-azure-time-series-insights

https://code.msdn.microsoft.com/windowsapps/Service-Bus-Explorer-f2abca5a

https://gallery.cortanaintelligence.com

https://docs.microsoft.com/en-us/azure/machine-learning/cortana-analytics-playbook-predictive-maintenance

https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-apps-anomaly-detection-api

https://www.edx.org/course/developing-iot-solutions-azure-iot-microsoft-dev225x

https://www.edx.org/course/processing-real-time-data-streams-azure-microsoft-dat223-2x-0

https://www.edx.org/course/orchestrating-big-data-azure-data-microsoft-dat223-3x-0

https://social.technet.microsoft.com/wiki/contents/articles/33626.lambda-architecture-implementation-using-microsoft-azure.aspx

https://azure.microsoft.com/en-au/updates/microsoft-azure-iot-reference-architecture-available

https://en.wikipedia.org/wiki/Lambda_architecture

https://msdn.microsoft.com/en-us/library/azure/dn834998.aspx

https://msdn.microsoft.com/en-us/library/azure/dn835019.aspx

https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-stream-analytics-query-patterns

http://storageexplorer.com

https://azure.microsoft.com/en-us/services/event-hubs

https://azure.microsoft.com/en-us/services/stream-analytics

https://azure.microsoft.com/en-us/solutions/data-lake

https://azure.microsoft.com/en-us/services/data-lake-analytics

https://azure.microsoft.com/en-us/services/sql-data-warehouse

https://en.wikipedia.org/wiki/Massively_parallel_(computing)

https://azure.microsoft.com/en-us/services/machine-learning

Classification

• Assign a category to each item (e.g. tweet data sentiment analysis)

Regression

• Predict a real value for each item based on features (e.g. predict house sale price)

Clustering

• Partition items into homogeneous groups (e.g. finding similar companies based on characteristics)
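The three task types can be shown side by side on toy data. The sketch below uses plain Python and deliberately simple techniques (1-nearest-neighbour, ordinary least squares, one-dimensional k-means with k=2); these are stand-ins chosen for brevity, not what Azure Machine Learning does internally, and all data values are made up for illustration.

```python
def classify_nearest(train, x):
    """Classification: return the label of the closest training point (1-NN)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cluster_two_groups(values, iterations=10):
    """Clustering: 1-D k-means with k=2, seeded with the min and max value.

    Assumes `values` contains at least two distinct numbers.
    """
    low, high = min(values), max(values)
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return sorted(group_low), sorted(group_high)


# Classification: hypothetical sentiment scores with known labels.
scored_tweets = [(-0.8, "negative"), (-0.5, "negative"),
                 (0.6, "positive"), (0.9, "positive")]
label = classify_nearest(scored_tweets, 0.4)

# Regression: hypothetical house sizes (m2) and sale prices (thousands).
slope, intercept = fit_line([50, 100, 150], [200, 300, 400])
predicted_price = slope * 120 + intercept

# Clustering: hypothetical company sizes falling into two groups.
small, large = cluster_two_groups([1, 2, 3, 10, 11, 12])

print(label, predicted_price, small, large)
```

Classification and regression both learn from labelled examples, while clustering finds the groups without being told what they are.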

Azure Cognitive Services APIs

Give your solutions a human side

https://www.microsoft.com/cognitive-services/en-us/documentation

https://azure.microsoft.com/en-us/services/data-factory

https://azure.microsoft.com/en-us/documentation/articles/data-factory-data-movement-activities

https://azure.microsoft.com/en-us/documentation/articles/data-factory-data-transformation-activities

What is it?

Fully managed cloud metadata repository service

Discover, catalog and make searchable various business data sources

Manage the process of locating and securely consuming those sources

Crowdsource annotation of the data source tables/objects and columns

Simple-to-use web interface for registering and managing data sources

ADC (Azure Data Catalog) keeps track of the data sources; it DOES NOT hold the data

What can you do with it? (Use Cases)

Want to centrally register all relevant business data sources

Self-Service BI, providing power users a central point to locate the data they need

Capturing tribal business data knowledge (crowdsourcing data documentation)

Azure CosmosDB (DocDB) (NoSQL) (PaaS)

NoSQL document database-as-a-service (PaaS) managed by Microsoft Azure

Native support for JavaScript, SQL and txns over schema-free JSON documents

[JSON = JavaScript Object Notation]

Built for cloud-designed apps

• Write procedures, triggers and UDFs using JavaScript

• Reliable and predictable performance, scale up on demand

• Automatic geo-redundant data copies, automated backup
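"Schema-free JSON documents" means documents in the same collection need not share the same fields. The sketch below illustrates the idea with plain Python and the standard `json` module; it does not use the Cosmos DB SDK or its SQL dialect, and the document fields and the `query` helper are hypothetical.

```python
import json

# Three documents in one "collection", each with a different shape.
documents = [
    json.loads('{"id": "1", "type": "telemetry", "gForce": 2.5}'),
    json.loads('{"id": "2", "type": "booking", "agent": "A-17", "seats": 2}'),
    json.loads('{"id": "3", "type": "telemetry", "gForce": 3.6, "tags": ["alert"]}'),
]


def query(docs, **criteria):
    """Return the documents whose fields equal every given criterion.

    A document that lacks one of the fields simply does not match;
    no schema has to be declared up front.
    """
    return [
        doc for doc in docs
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


telemetry_ids = [doc["id"] for doc in query(documents, type="telemetry")]
print(telemetry_ids)  # ['1', '3']
```

Querying on a field that only some documents carry, such as `agent`, returns just those documents rather than failing.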

Page 12: Building Streaming Data Pipelines - WordPress.com · 2017-05-16 · Azure Stream Analytics SQL Database Storage blob Machine Learning Event JSON G -Force Prediction API All Events

Azure

Stream Analytics SQL Database

Storage blob

Machine Learning

EventJSON

G-Force PredictionAPI

All EventsCSV

REAL-TIME Event Telemetry Report

Streaming Dataset

Event ArchiveJSON

EventJSON

ON-DEMANDEvent Trend Report

SQL Query

Event Hubs

Alert EventsG-Force gt 3

JSON

Function

New EventTriggerJSON

EVENT-DRIVENTwilio Phone Call

IoT HubMobile

Alert EventCSV

SH_Data_Streaming

(West Europe)

Event Hub

SHIngress

JSONEvent Type

Blob Store ADLS

SHEventStore

Realtime Stream(200K rows moving window)

Reportingdata

Server

Server

Server

Server

Stream Analytics

SHEgressASDB

Telemetry

Bookings

Agents

Proviers

Stream Analytics

SHEgressPBI

AVRO Event ArchiveBatch

(COLD Path)

JSON EventsStream

(HOT Path)

TabularEvents

JSON EventsStream

(HOT Path)

Real Time

Dashboards

(troyearle)

Historical

Reports

(troyearle)

search

SQL SP

Logic App

SHLogicApp

PostEvents

JSONEvent

SQL DB

SHEventHistory

(Short Term Store)

ServiceBus QSHSBQEgress

JSON Report

alerts

reports

reports

ref data

Archive

Avg 56GBday

AEH Input SU12

Max 3900sec

Avg 2200sec

SQL DB P2 (20)

Max 3900sec

Avg 2200sec

(5 days = 1b rows)

(1 year = 72b rows)

Alerts Reports

~1hour

Service Bus Queue

~1hour

PBI Input

3900Sec

Telemetry Input

3900sec

SH event AEH ASA = lt 5 sec ASA SQL = lt 5 sec

Hourly

On-Demand

1 Min Window

Average Load 1410000000 week

= 201000000 day

= 8392000 hour

= 139000 min

= 2330 sec

600 increaseover 9 hours

httpsgallerycortanaintelligencecombrowsecategories=[10]amporderby=freshness desc

4 Customer ldquoexpectationrdquohellip

hellipThis is the ldquoDomain of the Data Professionalrdquo

bull Vehicle Telemetry

httpsgallerycortanaintelligencecomSolutionTelemetry-Analytics

httpsgallerycortanaintelligencecomSolutionPersonalized-Offers-2

httpsgallerycortanaintelligencecomSolutionDemand-Forecasting-3

bull Developing IoT Solutions with Azure IoT

httpswwwedxorgcoursedeveloping-iot-solutions-azure-iot-microsoft-dev225x

bull Processing Real-Time Data Streams in Azure

httpswwwedxorgcourseprocessing-real-time-data-streams-azure-microsoft-dat223-2x-0

bull Orchestrating Big Data with Azure Data Factory

httpswwwedxorgcourseorchestrating-big-data-azure-data-microsoft-dat223-3x-0

Social Media PipelineRegion Australia SE

FunctionNet (C)

Azure SQL DBSentiment Schema

CallTweet DataSentiment

Key Phrases

powerbicom

Azure Machine Learning

DataConnection

NewPower BIReports

(optional)On demandData Science

Power BI DesktopOn-Prem

Office 365Power BI

Executive

Social Marketing

C Level Dashboards

MarketingDashboards

Azure Public Cloud

TweetsHandles

Tags

Azure Machine LearningRegion Southeast Asia

Azure Cognitive ServicesRegion West US

Text Analytic API

SentimentKey Phrases

(optional)ML Models

Twitter

Logic AppCheck TwitterEvery 3 min

httpspowerbimicrosoftcomen-ussolution-templatesbrand-management-twitter

httpazureplatformazurewebsitesneten-us

httpsazuremicrosoftcomen-aublogannouncing-azure-time-series-insights

httpscodemsdnmicrosoftcomwindowsappsService-Bus-Explorer-f2abca5a

httpsgallerycortanaintelligencecom

httpsdocsmicrosoftcomen-usazuremachine-learningcortana-analytics-playbook-predictive-maintenance

httpsdocsmicrosoftcomen-usazuremachine-learningmachine-learning-apps-anomaly-detection-api

httpswwwedxorgcoursedeveloping-iot-solutions-azure-iot-microsoft-dev225x

httpswwwedxorgcourseprocessing-real-time-data-streams-azure-microsoft-dat223-2x-0

httpswwwedxorgcourseorchestrating-big-data-azure-data-microsoft-dat223-3x-0

httpssocialtechnetmicrosoftcomwikicontentsarticles33626lambda-architecture-implementation-using-microsoft-azureaspx

httpsazuremicrosoftcomen-auupdatesmicrosoft-azure-iot-reference-architecture-available

httpsenwikipediaorgwikiLambda_architecture

httpsmsdnmicrosoftcomen-uslibraryazuredn834998aspx

httpsmsdnmicrosoftcomen-uslibraryazuredn835019aspx

httpsdocsmicrosoftcomen-usazurestream-analyticsstream-analytics-stream-analytics-query-patterns

httpstorageexplorercom

httpsazuremicrosoftcomen-usservicesevent-hubs

httpsazuremicrosoftcomen-usservicesstream-analytics

httpsazuremicrosoftcomen-ussolutionsdata-lakehttpsazuremicrosoftcomen-usservicesdata-lake-analytics

httpsazuremicrosoftcomen-usservicessql-data-warehousehttpsenwikipediaorgwikiMassively_parallel_(computing)

httpsazuremicrosoftcomen-usservicesmachine-learning

Classification

bull Assign a category to each item

(ie tweet data sentiment

analysis)

Regression

bull Predict a real value for each

item based on features

(ie predict house sale price)

Clustering

bull Partition items into

homogeneous groups

(ie finding similar companies

based on characteristics)

Azure Cognitive

Services APIrsquos

Give your solutions

a human side

httpswwwmicrosoftcomcognitive-servicesen-usdocumentation

httpsazuremicrosoftcomen-usservicesdata-factory

httpsazuremicrosoftcomen-usdocumentationarticlesdata-factory-data-movement-activities

httpsazuremicrosoftcomen-usdocumentationarticlesdata-factory-data-transformation-activities

What is it?

Fully managed cloud metadata repository service

Discover, catalog, and make searchable various business data sources

Manage the process of locating and securely consuming those sources

Crowdsource annotation of the data source tables/objects and columns

Simple-to-use web interface for registering and managing data sources

ADC (Azure Data Catalog) keeps track of the data sources; it DOES NOT hold the data

What can you do with it? (Use Cases)

Centrally register all relevant business data sources

Self-service BI: give power users a central point to locate the data they need

Capture tribal business data knowledge (crowdsourcing data documentation)

Azure CosmosDB (DocDB) (NoSQL) (PaaS)

NoSQL document database-as-a-service (PaaS) managed by Microsoft Azure

Native support for JavaScript, SQL, and transactions over schema-free JSON documents

[JSON = JavaScript Object Notation]

Built for cloud-designed apps

• Write procedures, triggers, and UDFs using JavaScript

• Reliable and predictable performance; scale up on demand

• Automatic geo-redundant data copies; automated backup
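As a rough illustration of what "schema-free JSON documents" means in practice, the sketch below stores documents of different shapes side by side and filters them by a field. It uses only Python's standard library, not a CosmosDB client; the document fields and the `select_where` helper are hypothetical.

```python
import json

# Three documents with different shapes stored together: no shared schema is required.
documents = [
    '{"id": "1", "type": "telemetry", "vehicle": "A12", "speed_kmh": 87}',
    '{"id": "2", "type": "booking", "agent": "Kim", "nights": 3}',
    '{"id": "3", "type": "telemetry", "vehicle": "B07", "speed_kmh": 64, "g_force": 1.2}',
]


def select_where(raw_docs, **criteria):
    """Return parsed documents whose fields equal every given criterion.

    Roughly the idea behind a SQL-style document query such as
    SELECT * FROM c WHERE c.type = "telemetry"
    """
    parsed = (json.loads(d) for d in raw_docs)
    return [doc for doc in parsed
            if all(doc.get(key) == value for key, value in criteria.items())]


telemetry = select_where(documents, type="telemetry")
for doc in telemetry:
    # Fields missing from a document are simply absent, not NULL columns.
    print(doc["id"], doc["vehicle"], doc.get("g_force", "n/a"))
```

Note that the third document carries an extra `g_force` field without any change to the other documents, which is the flexibility the slide is pointing at.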

[Diagram: SH_Data_Streaming pipeline (West Europe). Labels recovered from the slide:]

Components
• Sources: 4 × Server
• Event Hub: SHIngress (JSON, by event type)
• Blob Store / ADLS: SHEventStore
• Stream Analytics: SHEgressASDB
• Stream Analytics: SHEgressPBI
• Logic App: SHLogicApp (PostEvents, JSON event)
• SQL DB: SHEventHistory (short-term store)
• Service Bus Queue: SHSBQEgress (JSON report)
• Data: Telemetry, Bookings, Agents, Providers
• Paths: AVRO event archive, batch (COLD path); JSON events stream (HOT path); tabular events
• Outputs: realtime stream (200K rows moving window), reporting data, real-time dashboards (troyearle), historical reports (troyearle), search, SQL SP, alerts, reports, ref data

Throughput and latency
• Archive: avg 56 GB/day
• AEH input (SU12): max 3,900/sec, avg 2,200/sec
• SQL DB P2 (20): max 3,900/sec, avg 2,200/sec (5 days = 1b rows; 1 year = 72b rows)
• Alerts/reports: ~1 hour
• Service Bus Queue: ~1 hour
• PBI input: 3,900/sec
• Telemetry input: 3,900/sec
• SH event → AEH → ASA: < 5 sec; ASA → SQL: < 5 sec
• Schedules: hourly, on-demand, 1 min window
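The "1 Min Window" label refers to windowed aggregation on the stream. As a minimal sketch of the idea, assuming a tumbling (non-overlapping) window, the Python below counts events per type per 60-second window. The function name and sample events are hypothetical; in Azure Stream Analytics this kind of grouping is written in its SQL-like query language (for example with `TumblingWindow(minute, 1)`) rather than in Python.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # a 1-minute tumbling window


def tumbling_counts(events, window_seconds=WINDOW_SECONDS):
    """Count events per (window start, event type).

    events: iterable of (epoch_seconds, event_type) pairs.
    Each event falls into exactly one window, so windows never overlap.
    """
    counts = defaultdict(int)
    for timestamp, event_type in events:
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, event_type)] += 1
    return dict(counts)


sample_events = [
    (0, "Telemetry"), (12, "Telemetry"), (59, "Bookings"),
    (60, "Telemetry"), (119, "Telemetry"),
]
window_totals = tumbling_counts(sample_events)
for (start, event_type), count in sorted(window_totals.items()):
    print(f"window starting at {start:>3}s  {event_type:<10} {count}")
```

A moving (sliding) window, such as the 200K-row window feeding the realtime stream, differs in that consecutive windows overlap and old rows age out as new ones arrive.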

Average load: 1,410,000,000 / week

= 201,000,000 / day

= 8,392,000 / hour

= 139,000 / min

= 2,330 / sec

600% increase over 9 hours
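The breakdown above is plain division of the weekly total. The short sketch below reproduces it; the variable names are illustrative, and the slide's figures appear to be rounded down from the exact quotients.

```python
EVENTS_PER_WEEK = 1_410_000_000  # weekly average load quoted on the slide

per_day = EVENTS_PER_WEEK / 7
per_hour = per_day / 24
per_minute = per_hour / 60
per_second = per_minute / 60

# Row counts accumulated at the same average rate.
rows_5_days = per_day * 5
rows_1_year = per_day * 365

print(f"per day:    {per_day:,.0f}")
print(f"per hour:   {per_hour:,.0f}")
print(f"per minute: {per_minute:,.0f}")
print(f"per second: {per_second:,.0f}")
print(f"5 days:     {rows_5_days:,.0f} rows")
print(f"1 year:     {rows_1_year:,.0f} rows")
```

At this rate five days of events is about 1 billion rows, and a year is in the low 70 billions, which is the scale that drives the split between a short-term SQL store and a cold archive.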

https://gallery.cortanaintelligence.com/browse?categories=[10]&orderby=freshness desc

4 Customer “expectation”…

…This is the “Domain of the Data Professional”

• Vehicle Telemetry

https://gallery.cortanaintelligence.com/Solution/Telemetry-Analytics

https://gallery.cortanaintelligence.com/Solution/Personalized-Offers-2

https://gallery.cortanaintelligence.com/Solution/Demand-Forecasting-3

• Developing IoT Solutions with Azure IoT

https://www.edx.org/course/developing-iot-solutions-azure-iot-microsoft-dev225x

• Processing Real-Time Data Streams in Azure

https://www.edx.org/course/processing-real-time-data-streams-azure-microsoft-dat223-2x-0

• Orchestrating Big Data with Azure Data Factory

https://www.edx.org/course/orchestrating-big-data-azure-data-microsoft-dat223-3x-0

[Diagram: Social Media Pipeline. Labels recovered from the slide:]

Azure Public Cloud
• Social Media Pipeline (Region: Australia SE)
• Twitter: tweets, handles, tags
• Logic App: check Twitter every 3 min
• Function, .NET (C#): call
• Azure Cognitive Services (Region: West US): Text Analytics API (sentiment, key phrases)
• Azure Machine Learning (Region: Southeast Asia): (optional) ML models
• Azure SQL DB: sentiment schema (tweet data, sentiment, key phrases)
• Azure Machine Learning: (optional) on-demand data science

Office 365 / Power BI
• powerbi.com: data connection, new Power BI reports
• Power BI Desktop (on-prem)
• Executive: C-level dashboards
• Social Marketing: marketing dashboards
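The flow in this diagram (poll for tweets, score each one, store the results for reporting) can be sketched in a few lines. This is only an illustration under loud assumptions: `score_sentiment` and `key_phrases` are hypothetical local stand-ins for the Text Analytics API calls, an in-memory SQLite table stands in for the Azure SQL DB sentiment schema, and the table and column names are invented.

```python
import sqlite3

POLL_INTERVAL_SECONDS = 3 * 60  # the Logic App on the slide checks Twitter every 3 minutes


def score_sentiment(text):
    """Hypothetical stand-in for a sentiment API call; returns a score in [0, 1]."""
    positive = {"love", "great", "fast"}
    negative = {"hate", "slow", "broken"}
    words = [w.strip(".,!?").lower() for w in text.split()]
    hits = sum(w in positive for w in words) - sum(w in negative for w in words)
    return max(0.0, min(1.0, 0.5 + 0.25 * hits))


def key_phrases(text):
    """Hypothetical stand-in for key-phrase extraction: keeps hashtags and handles only."""
    return [w.strip(".,!?") for w in text.split() if w.startswith(("#", "@"))]


def process_batch(conn, tweets):
    """Score one polled batch of tweets and store one row per tweet."""
    rows = [(t, score_sentiment(t), ", ".join(key_phrases(t))) for t in tweets]
    conn.executemany(
        "INSERT INTO tweet_sentiment (tweet, sentiment, phrases) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for the Azure SQL DB
conn.execute("CREATE TABLE tweet_sentiment (tweet TEXT, sentiment REAL, phrases TEXT)")
stored = process_batch(conn, [
    "I love #Azure, great service!",
    "@support the portal is slow and broken",
])
results = conn.execute("SELECT sentiment, phrases FROM tweet_sentiment").fetchall()
```

In the architecture on the slide the polling schedule lives in the Logic App and the scoring in the Function plus Cognitive Services, so a reporting tool such as Power BI only ever reads the already-scored rows.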

https://powerbi.microsoft.com/en-us/solution-templates/brand-management-twitter
http://azureplatform.azurewebsites.net/en-us
https://azure.microsoft.com/en-au/blog/announcing-azure-time-series-insights
https://code.msdn.microsoft.com/windowsapps/Service-Bus-Explorer-f2abca5a
https://gallery.cortanaintelligence.com
https://docs.microsoft.com/en-us/azure/machine-learning/cortana-analytics-playbook-predictive-maintenance
https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-apps-anomaly-detection-api

Page 19: Building Streaming Data Pipelines - WordPress.com · 2017-05-16 · Azure Stream Analytics SQL Database Storage blob Machine Learning Event JSON G -Force Prediction API All Events

Social Media PipelineRegion Australia SE

FunctionNet (C)

Azure SQL DBSentiment Schema

CallTweet DataSentiment

Key Phrases

powerbicom

Azure Machine Learning

DataConnection

NewPower BIReports

(optional)On demandData Science

Power BI DesktopOn-Prem

Office 365Power BI

Executive

Social Marketing

C Level Dashboards

MarketingDashboards

Azure Public Cloud

TweetsHandles

Tags

Azure Machine LearningRegion Southeast Asia

Azure Cognitive ServicesRegion West US

Text Analytic API

SentimentKey Phrases

(optional)ML Models

Twitter

Logic AppCheck TwitterEvery 3 min

httpspowerbimicrosoftcomen-ussolution-templatesbrand-management-twitter

http://azureplatform.azurewebsites.net/en-us

https://azure.microsoft.com/en-au/blog/announcing-azure-time-series-insights

https://code.msdn.microsoft.com/windowsapps/Service-Bus-Explorer-f2abca5a

https://gallery.cortanaintelligence.com

https://docs.microsoft.com/en-us/azure/machine-learning/cortana-analytics-playbook-predictive-maintenance

https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-apps-anomaly-detection-api

https://www.edx.org/course/developing-iot-solutions-azure-iot-microsoft-dev225x

https://www.edx.org/course/processing-real-time-data-streams-azure-microsoft-dat223-2x-0

https://www.edx.org/course/orchestrating-big-data-azure-data-microsoft-dat223-3x-0

https://social.technet.microsoft.com/wiki/contents/articles/33626.lambda-architecture-implementation-using-microsoft-azure.aspx

https://azure.microsoft.com/en-au/updates/microsoft-azure-iot-reference-architecture-available

https://en.wikipedia.org/wiki/Lambda_architecture

https://msdn.microsoft.com/en-us/library/azure/dn834998.aspx

https://msdn.microsoft.com/en-us/library/azure/dn835019.aspx

https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-stream-analytics-query-patterns

http://storageexplorer.com

https://azure.microsoft.com/en-us/services/event-hubs

https://azure.microsoft.com/en-us/services/stream-analytics

https://azure.microsoft.com/en-us/solutions/data-lake

https://azure.microsoft.com/en-us/services/data-lake-analytics

https://azure.microsoft.com/en-us/services/sql-data-warehouse

https://en.wikipedia.org/wiki/Massively_parallel_(computing)

https://azure.microsoft.com/en-us/services/machine-learning

Classification

• Assign a category to each item (e.g. tweet data sentiment analysis)

Regression

• Predict a real value for each item based on features (e.g. predict house sale price)

Clustering

• Partition items into homogeneous groups (e.g. finding similar companies based on characteristics)
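The three task types above can be shown side by side with a small sketch. It uses only the Python standard library and deliberately simple techniques chosen for illustration (nearest centroid, least-squares line, two-cluster k-means on one feature); none of this is what Azure Machine Learning runs internally, and all data values are invented toy numbers.

```python
from statistics import mean


def nearest_centroid(train, item):
    """Classification: assign the label whose mean feature value is closest to item."""
    centroids = {label: mean(values) for label, values in train.items()}
    return min(centroids, key=lambda label: abs(centroids[label] - item))


def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


def two_means(values, iterations=10):
    """Clustering: 1-D k-means with k=2, seeded with the smallest and largest value.

    Assumes values contains at least two distinct numbers.
    """
    low, high = min(values), max(values)
    for _ in range(iterations):
        groups = ([v for v in values if abs(v - low) <= abs(v - high)],
                  [v for v in values if abs(v - low) > abs(v - high)])
        low, high = mean(groups[0]), mean(groups[1])
    return groups


# Classification: label a new sentiment-like score from labelled examples.
label = nearest_centroid({"negative": [0.1, 0.2, 0.3], "positive": [0.7, 0.8, 0.9]}, 0.75)

# Regression: predict a price (toy units) from floor area (toy units).
slope, intercept = fit_line([50, 80, 100, 120], [150, 240, 300, 360])
predicted_price = slope * 90 + intercept

# Clustering: split unlabelled values into two homogeneous groups.
small, large = two_means([1, 2, 3, 10, 11, 12])
```

Classification and regression both learn from labelled examples and differ only in whether the output is a category or a number; clustering receives no labels at all.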

Azure Cognitive Services APIs

Give your solutions a human side

https://www.microsoft.com/cognitive-services/en-us/documentation

https://azure.microsoft.com/en-us/services/data-factory

https://azure.microsoft.com/en-us/documentation/articles/data-factory-data-movement-activities

https://azure.microsoft.com/en-us/documentation/articles/data-factory-data-transformation-activities










