SQL Server 2014 Faster Insight from Any Data Stéphane Fréchette Friday May 9, 2014

SQL Server 2014 Faster Insights from Any Data

DESCRIPTION

Presented at Ottawa SQL Server Day. SQL Server 2014's mission is to deliver mission-critical performance for the most demanding database applications, covering every mission-critical criterion from performance to security, scalability, and high availability, along with mission-critical support. For business intelligence, the mission is to deliver faster insights into any data (big data, small data, all data) and, most importantly, to deliver BI in a consumable manner to business users through familiar tools.

Page 1: SQL Server 2014 Faster Insights from Any Data

SQL Server 2014 Faster Insight from Any Data

Stéphane Fréchette

Friday May 9, 2014

Page 2: SQL Server 2014 Faster Insights from Any Data

Email: [email protected]

Twitter: @sfrechette

Blog: stephanefrechette.com

Stéphane Fréchette

Founder, CEO | Strategic consultant

Microsoft SQL Server MVP

Page 3: SQL Server 2014 Faster Insights from Any Data

Session Overview

Page 4: SQL Server 2014 Faster Insights from Any Data
Page 5: SQL Server 2014 Faster Insights from Any Data
Page 6: SQL Server 2014 Faster Insights from Any Data

Excel BI | Capabilities

Page 7: SQL Server 2014 Faster Insights from Any Data

Microsoft Power BI for Office 365

1 in 4 enterprise customers on Office 365

1 Billion Office Users

Analyze | Visualize | Share | Find | Q&A | Mobile | Discover

Scalable | Manageable | Trusted

Page 8: SQL Server 2014 Faster Insights from Any Data

Extend with Hybrid Cloud Solutions

Page 9: SQL Server 2014 Faster Insights from Any Data

Extend with Hybrid Cloud Solutions

Page 10: SQL Server 2014 Faster Insights from Any Data

Extend with Hybrid Cloud Solutions

Page 11: SQL Server 2014 Faster Insights from Any Data

Power Query, PowerPivot, Power View, and Power Map

Page 12: SQL Server 2014 Faster Insights from Any Data

Powerful Self-Service BI with Excel 2013

Page 13: SQL Server 2014 Faster Insights from Any Data

Power Query

Enable self-service data discovery, query, transformation and mashup experiences for Information Workers, via Excel and PowerPivot

Discovery and connectivity to a wide range of data sources, spanning volume as well as variety of data.

Highly interactive and intuitive experience for rapidly and iteratively building queries over any data source, any size.

Consistency of experience, and parity of query capabilities over all data sources.

Joins across different data sources; ability to create custom views over data that can then be shared with team/department.

Page 14: SQL Server 2014 Faster Insights from Any Data

Power Query

Discover, combine, and refine Big Data, small data, and any data with Data Explorer for Excel.

Page 15: SQL Server 2014 Faster Insights from Any Data

Data Sources

• Windows Azure Marketplace
• Windows Active Directory
• Azure SQL Database
• Azure HDInsight

Page 16: SQL Server 2014 Faster Insights from Any Data

Powerful Self-Service BI with Excel 2013

Page 17: SQL Server 2014 Faster Insights from Any Data

Introducing PowerPivot

Page 18: SQL Server 2014 Faster Insights from Any Data

PowerPivot for SharePoint

Page 19: SQL Server 2014 Faster Insights from Any Data

Powerful Self-Service BI with Excel 2013

Page 20: SQL Server 2014 Faster Insights from Any Data

Introducing Power View

Page 21: SQL Server 2014 Faster Insights from Any Data

Power View for Multidimensional Models

• Power View on Analysis Services via BISM

• Native support for DAX in Analysis Services

• Better flexibility: Choice of DAX on Tabular or Multidimensional (cubes)

Page 22: SQL Server 2014 Faster Insights from Any Data

Architecture

[Diagram components: Internet Explorer; SharePoint (2010 or 2013); Reporting Services; Power View; Analysis Services BI Semantic Model (Tabular); Analysis Services BI Semantic Model (Multidimensional); SQL Server Data Tools (one per model type). Numbered callouts 1 to 6 mark the connections between them.]

Page 23: SQL Server 2014 Faster Insights from Any Data

BI Semantic Model: Architecture

Clients: Third-party applications | Reporting Services (Power View) | Excel | PowerPivot | SharePoint Insights

Data sources: Databases | LOB Applications | Files | OData Feeds | Cloud Services

Page 24: SQL Server 2014 Faster Insights from Any Data

Multidimensional-Tabular Mapping

BISM-MD Object | Tabular Object
Cube | Model
Cube Dimension | Table
Attributes (Key(s), Name) | Columns
Measure Group | Table
Measure | Measure
Measure without MeasureGroup | Within Table called “Measures”
MeasureGroup-Cube Dimension relationship | Relationship
Perspective | Perspective
KPI | KPI
User/Parent-Child Hierarchies | Hierarchies

Page 25: SQL Server 2014 Faster Insights from Any Data

Powerful Self-Service BI with Excel 2013

Page 26: SQL Server 2014 Faster Insights from Any Data

What Is Power Map?

Power Map for Microsoft Excel enables information workers to discover and share new insights from geographical and temporal data through three-dimensional storytelling.

Page 27: SQL Server 2014 Faster Insights from Any Data

Power Map: Steps to 3D insights

Map Data
• Data in Excel
• Geo-Code
• 3D and 3 Visuals

Discover Insights
• Play over Time
• Annotate points
• Capture scenes

Share Stories
• Cinematic Effects
• Interactive Tours
• Share Workbook

Page 28: SQL Server 2014 Faster Insights from Any Data

Map Data

Page 29: SQL Server 2014 Faster Insights from Any Data

Discover Insights

Page 30: SQL Server 2014 Faster Insights from Any Data

Share Stories

• Export to Video for Viral!

Page 31: SQL Server 2014 Faster Insights from Any Data

Power Map

Excel Add-in to Enhance Data Visualization

Page 32: SQL Server 2014 Faster Insights from Any Data

Power BI Site

Page 33: SQL Server 2014 Faster Insights from Any Data

Power BI for Office 365 | Capabilities

Page 34: SQL Server 2014 Faster Insights from Any Data

Power BI for Office 365 | Capabilities

Page 35: SQL Server 2014 Faster Insights from Any Data

Power BI for Office 365 | Capabilities

Page 36: SQL Server 2014 Faster Insights from Any Data

Power BI for Office 365 | Capabilities

Corporate Data Sources

Page 37: SQL Server 2014 Faster Insights from Any Data

Data Management Gateway

• Enabling Corporate OData Feeds
• Enabling Excel Workbook Data Refresh using SharePoint Online
• Enabling Discovery in Power Query capabilities

Power BI Admin Center | Data Management Gateway

Page 38: SQL Server 2014 Faster Insights from Any Data

Data Management Gateway - Conceptual

Power BI Admin Center: Allows IT to configure, manage and monitor access to corporate data sources.

Data Management Gateway: Connects to corporate data sources and sends data to Microsoft cloud services through a secure channel (Service Bus).

Corporate Data Sources: The Gateway can connect to a variety of data sources.

Secure Credential Store: All credentials used by the gateway are stored on-premises.

Page 39: SQL Server 2014 Faster Insights from Any Data

Data Management Gateway Network Topology

[Diagram zones: Microsoft data center | Internet | Perimeter network | Intranet (customer network)]

Components: Data Management Gateway Cloud Services; Data Management Gateway; Power Query; Credential Management

• Outgoing connection to cloud services (Registration, Regular Heartbeat, Data Source definition requests)
• Connect to Corporate OData feed
• Data
• Per Machine: Single gateway installed
• Credential Management: Saves credentials

Page 40: SQL Server 2014 Faster Insights from Any Data

Corporate OData Feeds and Data Management Gateway

Components: Power Query, Data Management Gateway

(1) Using Power Query, Anna connects to the OData feed (URL: http://feedgwMyDB ). Example: Contoso\Anna

(2) The Data Management Gateway connects to SQL Server using either the Windows account or the database account set up by Patrick when creating the feed. Example: DB1_Reader

(3) Returns result

(4) Returns OData feed

Page 41: SQL Server 2014 Faster Insights from Any Data

Scheduled Refresh

Scenario: workbook is refreshed on schedule as configured by the author in BI Sites

• Scheduler runs in BI Azure and triggers refresh as configured in the BI Sites application
• The flow assumes the workbook has been added to Power BI, thus save back is done directly to SPO
• When refresh is called by BI Azure, SPO rehydrates the user identity and calls WAC in a back channel (i.e. redirect equivalent)

Components: BI Azure; Azure Active Directory (AAD): OrgID, MSODS, ACS; SPO; Office Web Apps Service (WAC); Excel Services; Excel Services SOAP API; Data Model; On-Prem Data Sources; Cloud Data Sources

1. Verify user existence and license in MSODS and get access token to target URL in SPO from ACS
2. Construct the user part of the access token, and trigger refresh for a workbook on behalf of the scheduled refresh user
3. Refresh workbook
4. Power BI workbook?
5. Get shadow workbook refresh
6. Get data from cloud/on-prem sources and re-process the data model
7. Save updated workbook to SPO

Page 42: SQL Server 2014 Faster Insights from Any Data

On-premises Data Access from BI Azure

Scenario: Interactive refresh from Excel Web Access where the data source is on-premises

• For interactive refresh, shared data sources are configured in advance in the Power BI Admin Center
• For scheduled refresh, personal data sources can be configured by the workbook owner

Components: Azure Active Directory (AAD): OrgID, MSODS, ACS; BI Azure; Hybrid Proxy ADO.NET Provider; Discovery API; Tenant Configuration; SQL Azure; Hybrid Data Integration Service (Hybrid Proxy, Hybrid Delivery); Windows Azure Service Bus; Data Management Gateway; Hybrid Delivery Client API; Credential Manager; Azure Storage (temporary)

1. Determine whether data source is cloud or on-prem, and retrieve registered ID
2. Authenticate & retrieve tenant information
3. Get registered data source info
4. Issue refresh query
5. Send request to Gateway (via Service Bus)
6. Read query request from Service Bus queue
7. Retrieve data source credentials
8. Run query and retrieve the data
9. Coordinate transfer job
10. Compress & stream data in multiple chunks
11. Receive & decompress data
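Steps 10 and 11 of the flow above (compress and stream in chunks, then receive and decompress) can be illustrated with a minimal Python sketch. This is not the gateway's actual implementation: the use of zlib, the chunk size, the sample rows, and the function names `compress_in_chunks` and `receive_and_decompress` are all assumptions made for illustration.

```python
import zlib
from typing import Iterable, Iterator

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size; the slides do not state one


def compress_in_chunks(rows: Iterable[bytes], chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Sender side (step 10): compress a row stream and emit bounded chunks."""
    compressor = zlib.compressobj()
    buffer = bytearray()
    for row in rows:
        buffer += compressor.compress(row)
        # Emit full chunks as soon as enough compressed bytes are available.
        while len(buffer) >= chunk_size:
            yield bytes(buffer[:chunk_size])
            del buffer[:chunk_size]
    # Flush whatever the compressor is still holding, then emit the remainder.
    buffer += compressor.flush()
    while buffer:
        yield bytes(buffer[:chunk_size])
        del buffer[:chunk_size]


def receive_and_decompress(chunks: Iterable[bytes]) -> bytes:
    """Receiver side (step 11): rebuild the original bytes from the chunk stream."""
    decompressor = zlib.decompressobj()
    parts = [decompressor.decompress(chunk) for chunk in chunks]
    parts.append(decompressor.flush())
    return b"".join(parts)


# Made-up query result: 5000 pipe-delimited rows.
rows = [f"{i}|customer-{i}|{i * 10}\n".encode() for i in range(5000)]
chunks = list(compress_in_chunks(rows, chunk_size=1024))
restored = receive_and_decompress(chunks)
print(len(chunks), "chunks;", len(restored), "bytes restored")
```

Streaming bounded chunks keeps memory use flat on both sides regardless of result size, which is presumably why the transfer job is split this way.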

Page 43: SQL Server 2014 Faster Insights from Any Data

Data Refresh in SPO: How does it work?

Components: Excel Workbook in SharePoint Online; Gateway Cloud Service; Data Management Gateway

(1) Excel workbook uploaded to SharePoint Online
(2) Click Data Refresh for Excel workbook
(3) Connects to Gateway Cloud Service
(4) Checks whether user is authorized to perform a refresh
(5) Sends command (SQL statement, connection string) to on-premises Data Management Gateway
(6) Sends SQL to SQL Server
(7) Returns results
(8) Efficiently transfers the results to the cloud service
(9) Returns data to Excel Workbook

Page 44: SQL Server 2014 Faster Insights from Any Data

Data Management Gateway - OData

Page 45: SQL Server 2014 Faster Insights from Any Data

Power BI for Office 365 | Capabilities

Page 46: SQL Server 2014 Faster Insights from Any Data

Enable Deep Business and Customer Connections Virtually Anytime, Anywhere

Engage customers with smart, contextual mobile experiences

Boost agility with real-time access to apps and data from anywhere

Page 47: SQL Server 2014 Faster Insights from Any Data

Stay Productive on the Go

Deliver Familiar, Connected Experiences to a Mobile Workforce

…while ensuring enterprise security, manageability, and compliance

Page 48: SQL Server 2014 Faster Insights from Any Data

Mobile BI Capabilities Available Today

Browser-based corporate BI solutions on iOS, Android and Windows:

• SharePoint Mobile enhancements

• PerformancePoint Services

• Excel Services

• SQL Server Reporting Services

“Ultimately, the new Microsoft mobile BI solution leads to more revenue for Recall and gives us deeper customer insight, helping us stay ahead of our competitors.”

Source: “Recall Records Management Company Gets Real-Time BI, Boosts Sales with Mobile Solution” case study.

Page 49: SQL Server 2014 Faster Insights from Any Data

Excel Web App

Page 50: SQL Server 2014 Faster Insights from Any Data

Excel Web App

Quick Explore

Page 51: SQL Server 2014 Faster Insights from Any Data

Mobile-Friendly Apps for Office

Page 52: SQL Server 2014 Faster Insights from Any Data

Power BI for Office 365 | Capabilities

Page 53: SQL Server 2014 Faster Insights from Any Data

Tabular models for Power BI

Page 54: SQL Server 2014 Faster Insights from Any Data

Datasources

Page 55: SQL Server 2014 Faster Insights from Any Data

Creating & managing models in Power BI

Page 56: SQL Server 2014 Faster Insights from Any Data

Power BI Tabular Model Architecture

[Diagram components: SSDT; Excel; Power BI Portal in O365; XMLA and REST interfaces; four AS Instances; Reliable Persistent Storage (RPS); Gateway; On Prem SQL]

External Data Sources: SQL Azure | HDInsight | Azure Tables

Page 57: SQL Server 2014 Faster Insights from Any Data

Service Health Monitoring

At a glance view of the health of IT managed gateways

Page 58: SQL Server 2014 Faster Insights from Any Data

Enabler of Self Service BI

Varying levels of control across data sources, departments

Oversight and monitoring of cloud data access

Ability to make corporate data sources easier to discover, and easier to access

Role of the IT Admin in Power BI

Page 59: SQL Server 2014 Faster Insights from Any Data

https://itadmin.clouddatahub.net/

Power BI Admin Center

Page 60: SQL Server 2014 Faster Insights from Any Data

Power BI Admin Portal & Data Management Gateway

Power BI Admin Center

Page 61: SQL Server 2014 Faster Insights from Any Data

HDInsight, Polybase, and StreamInsight

Page 62: SQL Server 2014 Faster Insights from Any Data

Key Trends

Page 63: SQL Server 2014 Faster Insights from Any Data

Big Data Analytics

Page 64: SQL Server 2014 Faster Insights from Any Data

What Is Big Data?

[Chart: Volume on the vertical axis; Velocity - Variety - Variability on the horizontal axis; nested layers labeled ERP / CRM, WEB 2.0, and Internet of things]

Volume scale: Gigabytes (10^9) | Terabytes (10^12) | Petabytes (10^15) | Exabytes (10^18)

ERP / CRM: Sales Pipeline, Payables, Payroll, Inventory, Contacts, Deal Tracking

WEB 2.0: Mobile, Advertising, Collaboration, eCommerce, Digital Marketing, Search Marketing, Web Logs, Recommendations

Internet of things: Audio / Video, Log Files, Text/Image, Social Sentiment, Data Market Feeds, eGov Feeds, Weather, Wikis / Blogs, Click Stream, Sensors / RFID / Devices, Spatial & GPS Coordinates

Storage/GB: 1980: $190,000 | 1990: $9,000 | 2000: $15 | 2010: $0.07
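To put the storage-cost figures above in perspective, the short Python sketch below converts the per-gigabyte prices quoted on the slide into the cost of one terabyte in each year. The only assumption added here is the decimal conversion of 1 TB = 1,000 GB; the prices themselves are taken from the slide as-is.

```python
# Cost per gigabyte in US dollars, as quoted on the slide.
COST_PER_GB_USD = {1980: 190_000, 1990: 9_000, 2000: 15, 2010: 0.07}

GB_PER_TB = 1_000  # assumption: decimal units (1 TB = 1,000 GB)

# Cost of storing one terabyte at each year's price.
cost_per_tb = {year: price * GB_PER_TB for year, price in COST_PER_GB_USD.items()}

for year, cost in cost_per_tb.items():
    print(f"{year}: ${cost:,.0f} per terabyte")

# How many times cheaper a gigabyte was in 2010 than in 1980.
decline_factor = COST_PER_GB_USD[1980] / COST_PER_GB_USD[2010]
print(f"1980 to 2010: roughly {decline_factor:,.0f}x cheaper per gigabyte")
```

At these prices a terabyte falls from $190 million in 1980 to about $70 in 2010, which is the economic shift that makes retaining raw, high-volume data practical.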

Page 65: SQL Server 2014 Faster Insights from Any Data

Modern Data Warehousing

Page 66: SQL Server 2014 Faster Insights from Any Data

Hadoop Distributed Architecture

Page 67: SQL Server 2014 Faster Insights from Any Data

MapReduce: Move Code to the Data

Page 68: SQL Server 2014 Faster Insights from Any Data

So How Does It Work?

Page 69: SQL Server 2014 Faster Insights from Any Data

HDInsight and Hadoop Ecosystem

[Diagram components: Distributed Storage (HDFS); Distributed Processing (MapReduce); Query (Hive); ODBC]

Legend: Red = Core Hadoop | Blue = Data processing | Gray = Microsoft integration points and value adds | Orange = Data Movement | Green = Packages

Page 70: SQL Server 2014 Faster Insights from Any Data

Record reader → Map → Combiner → Partitioner → Shuffle and sort → Reduce → Output format
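The stages named above can be traced in a small single-process Python sketch of a word count. It only imitates the data flow; it is not Hadoop code. The function names, the CRC32-based partitioner, the two-reducer setup, and the sample input are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import groupby
from zlib import crc32


def record_reader(split):
    """Record reader: turn one input split into (line_number, line) records."""
    for line_number, line in enumerate(split.splitlines()):
        yield line_number, line


def map_fn(_key, line):
    """Map: emit (word, 1) for every word in the line."""
    for word in line.lower().split():
        yield word, 1


def combine(pairs):
    """Combiner: pre-aggregate one mapper's output before it is shuffled."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return list(totals.items())


def partition(key, num_reducers):
    """Partitioner: pick the reducer for a key (CRC32 keeps it deterministic)."""
    return crc32(key.encode("utf-8")) % num_reducers


def reduce_fn(key, values):
    """Reduce: sum all counts that arrived for one key."""
    return key, sum(values)


def run_job(splits, num_reducers=2):
    buckets = [[] for _ in range(num_reducers)]
    for split in splits:  # each split plays the role of one mapper's input
        mapped = [pair for key, line in record_reader(split)
                  for pair in map_fn(key, line)]
        for word, count in combine(mapped):
            buckets[partition(word, num_reducers)].append((word, count))
    output = {}
    for bucket in buckets:  # each bucket plays the role of one reducer's input
        bucket.sort(key=lambda kv: kv[0])  # shuffle and sort: group equal keys
        for word, group in groupby(bucket, key=lambda kv: kv[0]):
            key, total = reduce_fn(word, (count for _, count in group))
            output[key] = total
    return output  # output format: a plain dict in this sketch


counts = run_job(["big data small data", "any data big insight"])
print(sorted(counts.items()))
```

The combiner is optional in real jobs; it is shown here because it reduces the volume of pairs crossing the shuffle when the reduce operation is associative, as a sum is.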

Page 71: SQL Server 2014 Faster Insights from Any Data
Page 72: SQL Server 2014 Faster Insights from Any Data

MapReduce Summary

Page 73: SQL Server 2014 Faster Insights from Any Data

Programming HDInsight

Hive, Pig, Mahout, Cascading, Scalding, Scoobi, Pegasus…

C#, F# Map/Reduce, LINQ to Hive, Microsoft .NET management clients

JavaScript Map/Reduce, browser hosted console, Node.js management clients

PowerShell, cross-platform CLI tools

Page 74: SQL Server 2014 Faster Insights from Any Data

RDBMS vs. Hadoop

Page 75: SQL Server 2014 Faster Insights from Any Data

Microsoft Hadoop Vision

Insights to all users by activating new types of data

Page 76: SQL Server 2014 Faster Insights from Any Data

Polybase

SQL Server PDW querying HDFS data, in-situ

[Diagram: DB and HDFS]

Page 77: SQL Server 2014 Faster Insights from Any Data

Polybase in PDW V2

(a) PDW query in, results out

(b) PDW query in, results stored in HDFS

[Diagram for each case: Hadoop (HDFS) and DB]

Page 78: SQL Server 2014 Faster Insights from Any Data

Native Query Across Hadoop and PDW

How to overcome the “impedance mismatch”

[Diagram: Sensor & RFID, Web Apps, Social Apps, and Mobile Apps produce unstructured data held in Hadoop; traditional schema-based DW applications hold structured data in an RDBMS]

Increasingly massive amounts of unstructured data driven by new sources

At the same time, vast amounts of corporate data and data sources, and the bulk of their data analysis

Polybase addresses this challenge for advanced data analytics by allowing native query across PDW and Hadoop, integrating structured and unstructured data

Page 79: SQL Server 2014 Faster Insights from Any Data

Native Query Across Hadoop and PDW

Polybase Features in SQL Server PDW

• Querying data in Hadoop from PDW using regular SQL queries, including:
• Full SQL query access to data stored in HDFS, represented as ‘external tables’ in PDW
• Basic statistics support for data coming from HDFS
• Querying across PDW and Hadoop tables (joining ‘on the fly’)
• Fully parallelized, high performance import of data from HDFS files into PDW tables
• Fully parallelized, high performance export of data in PDW tables into HDFS files
• Integration with various Hadoop distributions: Hadoop on Windows Server, Hortonworks, and Cloudera
• Supporting Hadoop 1.0 and 2.0

Page 80: SQL Server 2014 Faster Insights from Any Data

Native Query Across Hadoop and PDW

Creating “External Tables”

• Internal representation of data residing in Hadoop/HDFS (delimited text files only)
• High-level permissions required for creating external tables: ADMINISTER BULK OPERATIONS & ALTER SCHEMA
• Different than ‘regular SQL tables’: essentially read only (no DML support)

CREATE EXTERNAL TABLE table_name ({<column_definition>} [,...n ])
{WITH (LOCATION = '<URI>', [FORMAT_OPTIONS = (<VALUES>)])}
[;]

(1) CREATE EXTERNAL TABLE: indicates an “External” Table
(2) LOCATION: required location of Hadoop cluster and file
(3) FORMAT_OPTIONS: optional Format Options associated with data import from HDFS

Page 81: SQL Server 2014 Faster Insights from Any Data

Native Query Across Hadoop and PDW

Querying Unstructured Data

1. Querying data in HDFS and displaying results in table form (using external tables)
2. Joining data from HDFS with relational PDW data

Example: creating external table ‘ClickStream’ (text file in HDFS with | as field delimiter)

CREATE EXTERNAL TABLE ClickStream(url varchar(50), event_date date, user_IP varchar(50))
WITH (LOCATION = 'hdfs://MyHadoop:5000/tpch1GB/employee.tbl',
FORMAT_OPTIONS (FIELD_TERMINATOR = '|'));

Query Examples

(1) Filter query against data in HDFS

SELECT TOP 10 (url) FROM ClickStream WHERE user_IP = '192.168.0.1';

(2) Join data coming from files in HDFS (Url_Description is a second text file in HDFS)

SELECT url.description FROM ClickStream cs, Url_Description url
WHERE cs.url = url.name AND cs.url = 'www.cars.com';

(3) Join data from HDFS with relational PDW table (Users is a distributed PDW table)

SELECT user_name FROM ClickStream cs, Users u
WHERE cs.user_IP = u.user_IP AND cs.url = 'www.microsoft.com';
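The semantics of the third query (a delimited text file joined with a relational table) can be tried locally with a rough Python analogy built on the standard `sqlite3` and `csv` modules. This is not Polybase: SQLite has no external tables, so the sketch loads the parsed file into an ordinary table first. The sample file contents and the user names are made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a '|'-delimited text file stored in HDFS.
CLICKSTREAM_FILE = """\
www.microsoft.com|2014-05-01|192.168.0.1
www.cars.com|2014-05-02|192.168.0.2
www.microsoft.com|2014-05-03|192.168.0.3
"""

conn = sqlite3.connect(":memory:")

# Relational side: plays the role of the PDW table Users.
conn.execute("CREATE TABLE Users (user_name TEXT, user_IP TEXT)")
conn.executemany(
    "INSERT INTO Users VALUES (?, ?)",
    [("anna", "192.168.0.1"), ("patrick", "192.168.0.2")],
)

# File side: parse the delimited text using the same field terminator
# as the external table definition ('|'), then expose it as a table.
conn.execute("CREATE TABLE ClickStream (url TEXT, event_date TEXT, user_IP TEXT)")
reader = csv.reader(io.StringIO(CLICKSTREAM_FILE), delimiter="|")
conn.executemany("INSERT INTO ClickStream VALUES (?, ?, ?)", reader)

# Same join as query (3): file rows matched to relational rows by IP address.
rows = conn.execute(
    "SELECT u.user_name FROM ClickStream cs "
    "JOIN Users u ON cs.user_IP = u.user_IP "
    "WHERE cs.url = ?",
    ("www.microsoft.com",),
).fetchall()
print(rows)
```

The key difference from this analogy is that an external table leaves the data in HDFS and reads it at query time, whereas the sketch copies it before querying.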

Page 82: SQL Server 2014 Faster Insights from Any Data

Native Query Across Hadoop and PDW

Parallel Data Import from HDFS into PDW

Persistently storing data from HDFS in PDW tables

Fully parallelized via CREATE TABLE AS SELECT (CTAS) with external tables as source table and PDW tables (either distributed or replicated) as destination

CREATE TABLE ClickStream_PDW WITH (DISTRIBUTION = HASH(url))
AS SELECT url, event_date, user_IP FROM ClickStream

Retrieval of data in HDFS “on-the-fly”

[Diagram: Sensor & RFID, Web Apps, Social Apps, and Mobile Apps produce unstructured data in Hadoop. DMS Readers 1 to N perform parallel HDFS reads through the HDFS bridge and import the external table data in parallel into the enhanced PDW query engine, which produces the CTAS results. PDW holds the structured data of traditional DW applications.]
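The effect of hash-distributing the imported rows on the url column can be pictured with a small Python sketch. It is only an illustration of the idea: the CRC32 hash, the count of four distributions, and the sample rows are assumptions, and the hash function PDW actually uses is not described in these slides.

```python
from collections import defaultdict
from zlib import crc32

NUM_DISTRIBUTIONS = 4  # hypothetical; not a figure from the slides


def distribution_for(url: str, n: int = NUM_DISTRIBUTIONS) -> int:
    """Stand-in hash: CRC32 of the distribution column value, modulo n."""
    return crc32(url.encode("utf-8")) % n


def distribute(rows, n: int = NUM_DISTRIBUTIONS):
    """Place each (url, event_date, user_IP) row by the hash of its url."""
    placement = defaultdict(list)
    for row in rows:
        placement[distribution_for(row[0], n)].append(row)
    return placement


# Made-up rows in the shape of the ClickStream table.
clickstream = [
    ("www.microsoft.com", "2014-05-01", "192.168.0.1"),
    ("www.cars.com", "2014-05-02", "192.168.0.2"),
    ("www.microsoft.com", "2014-05-03", "192.168.0.3"),
    ("www.bing.com", "2014-05-04", "192.168.0.1"),
]

placement = distribute(clickstream)
for dist_id in sorted(placement):
    print(dist_id, [row[0] for row in placement[dist_id]])
```

Because placement depends only on the url value, all rows for one url land in the same distribution, so filters and joins on that column can be answered without moving rows between nodes.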

Page 83: SQL Server 2014 Faster Insights from Any Data

Native Query Across Hadoop and PDW

Parallel Data Export from PDW into HDFS

• Fully parallelized via CREATE EXTERNAL TABLE AS SELECT (CETAS) with external tables as destination table and PDW tables as source
• ‘Round-trip of data’ possible with first importing data from HDFS, joining it with relational data, and then exporting results back to HDFS

CREATE EXTERNAL TABLE ClickStream (url, event_date, user_IP)
WITH (LOCATION = 'hdfs://MyHadoop:5000/users/outputDir', FORMAT_OPTIONS
(FIELD_TERMINATOR = '|')) AS SELECT url, event_date, user_IP FROM ClickStream_PDW

[Diagram: PDW holds the structured data of traditional DW applications. The enhanced PDW query engine reads it in parallel and produces the CETAS results; DMS Writers 1 to N perform parallel HDFS writes through the HDFS bridge into the external table on the HDFS data nodes, alongside unstructured data from Sensor & RFID, Web Apps, Social Apps, and Mobile Apps.]

Page 84: SQL Server 2014 Faster Insights from Any Data

In-Memory for big data analytics

Interactive Analytics over “Big Data”

• SQL Server Analysis Services scaled out to very large data volumes
• Sourced from “Big Data” sources, e.g. Hadoop, Isotope, etc.
• Enterprise data sources (SQL Server, Oracle, SAP, etc.)
• Built upon the In-Memory Analytics engine
• In-memory, column-store, 10x compression
• Deployment vehicles: Box, Appliance, Cloud
• Customers: Skype, Klout, Halo 4, UBS, AdCenter, Windows Update

[Diagram components: Excel, PV, 3rd party apps, tools, etc.; XMLA; Web services; Mgmt (Deploy, Monitor); three AS Instances; Reliable Persistent Storage; GW; External Data Sources]

Page 85: SQL Server 2014 Faster Insights from Any Data

StreamInsight

Managing Streaming Data In-Memory

Customer benefits

[Diagram labels: Input stream; Event; Output stream]

Page 86: SQL Server 2014 Faster Insights from Any Data

Complete and Consistent Data Platform

Page 87: SQL Server 2014 Faster Insights from Any Data

What Questions Do You Have?

Page 88: SQL Server 2014 Faster Insights from Any Data

Thank You for attending this session