High Friction & Control Resources Dedicated Shared Low 100% of API, Virtualized Roll-your-own HA/DR/scale SQL Server in IaaS Virtualized Machine SQL

Tobias Ternström, Program Manager, SQL

Azure SQL database: Under the hood

3-630

Stuff to cover

• Three different ways to run SQL
• Business Continuity
• Some development best practices
• Partitioning across databases (Sharding)

Three different ways to run SQL

[Diagram: three options placed along two axes, Friction & Control (Low to High) and Resources (Shared to Dedicated)]

SQL Server, Raw iron
• Scale-up
• Full h/w control
• Roll-your-own HA/DR/scale

SQL Server in IaaS, Virtualized Machine
• 100% of API, Virtualized
• Roll-your-own HA/DR/scale

SQL Database - PaaS, Virtualized Database
• Auto HA, Fault-Tolerance
• Friction-free scale
• Self-provisioning, mgmt @ scale

Best fit for your structured relational data needs

• Premium, up to 500GB
• Web/Business, up to 150GB

Business Continuity

All editions
• High-availability – Local machine failures

Premium edition
• Oops recovery – Human errors
• Disaster recovery – Catastrophic geographic failures

High-Availability

• Reads are completed at the primary
• Writes are replicated to secondary replicas
• Single logical database
• Multiple synchronous replicas
• Transparent automatic failover

[Diagram: one logical DB backed by a primary (P) and two secondaries (S); arrows labeled Write/Ack between P and each S, and Read/Value and Write/Ack at P]

Self-Service Restore

Point-in-time Restore

Restore points available up to 35 days back

Creates a side-by-side database copy, non-disruptive

REST API, PowerShell or Azure Portal

Available in Premium edition

Programmatic “oops recovery” of data deletion or alteration

[Diagram: SQL Database backups are kept in geo-replicated Azure Storage (sabcp01bl21); restore from backup]

Self-service activation

Create up to 4 readable secondary replicas

Replicate to any Azure region

Automatic data replication, asynchronous

REST API, PowerShell or Azure Portal

Available in Premium edition

Active Geo-Replication

Mission-critical business continuity on your terms

Geo-Replication Monitoring and Control

DMV                      Values              Visibility
is_interlink_connected   Yes / No            sys.dm_database_copies
Database state           ONLINE, COPYING     sys.databases
Replication state        SEEDING, CATCH_UP   sys.dm_database_copies
Replication lag          Seconds             sys.dm_continuous_copy_status
Last_replication         Timestamp           sys.dm_continuous_copy_status

Task                     API                       Details
Start Continuous Copy    PowerShell, REST          Optional RPO setting
Stop Continuous Copy     PowerShell, REST          Forced or friendly termination
Get Status               PowerShell, REST, T-SQL   Retrieves DMV

Some Development best practices

Why is Azure SQL Database different?

[Diagram: components that sit outside an individual user database. Labels: Master, MSDB, TempDB, User DB (several), Instance Collation, Logins, Credentials, Linked Server Defs., CLR…, Agent, Replication, DB Mail…, TempDB Collation, Other Apps, Other DBs]

Identifying Top Resource Consumers

Azure SQL Database DMV Surface Area

• Use database-level DMVs to identify top resource consumers
• Snapshot current requests
• Order by elapsed time
• Execute in isolation with STATISTICS (IO/TIME) ON

sys.dm_exec_query_stats
  Details: Cumulative view of query statistics
  Use: Total and average resource consumption

sys.dm_exec_sql_text
  Details: Returns the text of the SQL batch that is identified by the specified sql_handle
  Use: Provide overall batch text for statement

sys.dm_exec_query_plan
  Details: Returns plan in XML for specified plan handle
  Use: Provide plan for tuning and analysis

sys.dm_exec_requests
  Details: Current requests executing on your DB
  Use: Check for blocking, contention-related issues, convoys, etc.
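As a rough illustration of the workflow above, the following Python sketch only builds the T-SQL text for a "top consumers" query; it does not connect to a database. The helper name `top_consumers_query` and the particular columns selected are assumptions made for this example and are not taken from the session.

```python
def top_consumers_query(top_n: int = 10) -> str:
    """Return T-SQL that lists the top_n cached statements by elapsed time.

    Hypothetical helper: it joins the cumulative statistics DMV with the
    batch text and the XML plan, ordered by total elapsed time.
    """
    if not isinstance(top_n, int) or top_n < 1:
        raise ValueError("top_n must be a positive integer")
    return f"""
SELECT TOP ({top_n})
    qs.execution_count,
    qs.total_elapsed_time,
    qs.total_elapsed_time / qs.execution_count AS avg_elapsed_time,
    qs.total_worker_time,
    qs.total_logical_reads,
    st.text AS batch_text,
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_elapsed_time DESC;
""".strip()


if __name__ == "__main__":
    print(top_consumers_query(5))
```

The generated text can be pasted into a query window; a candidate statement found this way can then be re-run in isolation with STATISTICS IO and STATISTICS TIME switched on.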

Connection Reliability

Issue types
• Exception: Message from server/app indicating failure
• Timeout (or user impatience): No message from server/app within desired response time

[Diagram: User, Client app, Web Server, DB Server; labels: Durable, Not durable, Unreliable]

Let’s look at what can happen

[Diagram: a request from the Web Server to the DB Server can fail (X) at different points]

1. Request has not yet reached the server
   • Retry for reads and writes is safe
2. The request has reached the server
   • Retry for reads is safe
   • Retry for writes is NOT safe
3. Request is executed on the server
   • Retry for reads is safe
   • Retry for writes is NOT safe

Safe retry-protocol for writes

Client
1. Assign a unique ID (ex. GUID) to request
2. Send request to server
9. Send transaction received notification

Server
3. If request ID already exists return error message
4. Begin transaction
5. Save request ID
6. Execute request
7. Commit transaction
8. Return result
10. Delete request ID

Retention clean-up

Retry considerations

• A general system-wide back-off strategy is typically a good idea
• Query-dependent back-off strategy to avoid overloading the system

[Table: retry handling by operation type (Read, Write) and cost (Low cost operation, High cost operation). Entries: Just re-execute; Requires tracking of transaction ID; Requires outer transaction]
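One possible shape for a back-off strategy is sketched below. The slide does not prescribe a particular scheme; exponential back-off with random jitter is an assumption chosen for this example, and `TransientError` and `retry_with_backoff` are hypothetical names. It is only appropriate for operations that are safe to re-execute (reads, or writes protected by a request ID).

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a timeout or dropped-connection error."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep, rng=random.random):
    """Call operation() until it succeeds or max_attempts is reached.

    The wait before retry k is a random fraction of
    min(max_delay, base_delay * 2**(k-1)), so concurrent clients
    do not all retry at the same instant.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(cap * rng())
```

The `sleep` and `rng` parameters exist only so the behaviour can be tested without real waiting; a query-dependent strategy could pass a larger `base_delay` for expensive operations.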

Reducing round-trips

Get all @ once
• Issue one query that gets the data you need
• Consider using FOR XML for simpler consumption

Batch updates
• Example - “Update the following 1000 entities”
• N updates, N round-trips vs. N updates, <N (1?) round-trips

[Diagram: three separate SELECT… calls vs. a single SELECT… call]

Updates   Round-trips   How
N         N             One client-server roundtrip per execution
N         <N            XML, Delimited list or Table Valued Parameter: all executions in one batch

Reducing round-trips cont’d

Delimited list
  Pros: Performance is good; No SQL Injection
  Cons: Requires SQLCLR; Data is not strongly typed; Cumbersome implementation (can be simplified by creating one TVF per “list type”)

XML
  Pros: Can be strongly typed; No SQL Injection; Nice option if your data is already XML!; Great flexibility
  Cons: Not strongly typed by default; Performance is ok but not the best; Less cumbersome than the delimited list but still somewhat cumbersome

Table Valued Parameter
  Pros: Strongly typed; No SQL Injection; Performance is great!; Easy to use; Allows for some level of streaming
  Cons: Less flexible than XML; Allows for streaming, but only to the server

Roundtrip per execution
  Pros: Fully streaming; Easy to use
  Cons: Poor performance; Potential for SQL Injection attacks
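To make the XML option concrete, here is a small client-side sketch that packs N updates into one XML document, which could then be sent to the server as a single parameter in a single round-trip. The element and attribute names (`updates`, `u`, `id`) and the function name are invented for this example; how the server shreds the document is not shown.

```python
import xml.etree.ElementTree as ET


def build_update_batch(updates):
    """Encode (entity_id, new_value) pairs as one XML document.

    Hypothetical format: <updates><u id="1">value</u>...</updates>
    """
    root = ET.Element("updates")
    for entity_id, new_value in updates:
        item = ET.SubElement(root, "u", id=str(entity_id))
        item.text = str(new_value)  # special characters are escaped on output
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    payload = build_update_batch([(1, "red"), (2, "green"), (3, "blue")])
    print(payload)
```

The values travel as data inside a parameter rather than being concatenated into the statement text, which is in line with the "No SQL Injection" entry for this option.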

Partitioning across databases (Sharding)

Common Scaling Patterns

• Large Azure SQL Database customers are often ISVs
• Old pattern was: Build a DB app, sell it to a customer and they run it on their hardware

[Diagram: five customers, each running its own App and DB]

Software as a Service vendors

• SaaS vendors frequently run their code as a layer
• Code is usually shared across many customers
• Sometimes databases are shared too
• When they have more customers, they often have more databases

[Diagram: one SaaS layer in front of Customer DB1, Customer DB2, Customer DB3, … Customer DB N]

Data Model Sharding

• Typical OLTP databases look something like this picture
• Everything goes into a single database, but you usually only query for individual customers at a time (example: customer places an order)
• Reports run on the same database or are moved to a secondary replica to avoid contention on locks, resources

Sharded Model

• Sharded models split the data across multiple databases that each have the same schema
• All data about one customer is located within a single database – OLTP operations work fine
• Cross-database operations do not work at all (without manual work)
• Data is automatically spread across many machines in a cluster, not just one

Central Metadata Databases

• If you have a whole bunch of databases, you need a directory to keep track of which customers are in each database
• A “Directory” database stores this data

General Login Path
1. Connect to Root Database, find tenant (Cache!)
2. Connect to Right Client Database for this Customer
3. Perform Work on per-customer data

[Diagram: Directory, Shard 1 and Shard 2, with steps 1 to 3 marked]
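The login path above, including the "Cache!" hint, might look roughly like this on the client side. This is a sketch only: `ShardDirectory` is a hypothetical class, and the `lookup` callable stands in for whatever query the application runs against its directory database.

```python
class ShardDirectory:
    """Maps a tenant to its shard, caching answers from the directory DB."""

    def __init__(self, lookup):
        # lookup: callable taking a tenant ID and returning a shard name.
        # In a real system it would query the directory (root) database.
        self._lookup = lookup
        self._cache = {}
        self.directory_queries = 0  # counts actual trips to the directory

    def shard_for(self, tenant_id):
        # Step 1: find the tenant, consulting the cache first.
        if tenant_id not in self._cache:
            self.directory_queries += 1
            self._cache[tenant_id] = self._lookup(tenant_id)
        return self._cache[tenant_id]

    def invalidate(self, tenant_id):
        # Call this after a tenant has been moved to another shard.
        self._cache.pop(tenant_id, None)


if __name__ == "__main__":
    placement = {"contoso": "shard1", "fabrikam": "shard2"}
    directory = ShardDirectory(placement.__getitem__)
    # Step 2 would open a connection to the returned shard;
    # step 3 would run the per-customer work there.
    print(directory.shard_for("contoso"), directory.shard_for("contoso"))
    print("directory queries:", directory.directory_queries)
```

Caching keeps the directory database from becoming a bottleneck on every login, at the cost of needing an invalidation path when tenants are relocated.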

Multi-Tenancy

• Single-tenancy case (1 customer = 1 database) is easy
• It is also possible to do multi-tenant cases (manually)
• Databases become containers and you build code to copy customer data from one database to another
• You can pack cold tenants to save money

[Chart: Usage Pattern, Distribution of CSV Tenants. A small but growing set of highly active users, followed by a long tail of colder databases. CSV goal: COGS reduction]

Reporting “Queries”

How do you run a monthly report over all customers?
• Iterate over each DB
• Collect intermediate results in a single database
• Finish query over intermediate results

Key Details
• Not transactionally consistent
• Intermediate results need to fit in one database
• Some operations may fail; re-run pieces that fail
• On huge systems, it can take hours to run

[Diagram: a Report Program works through Shard 1, Shard 2 and Shard 3 and writes to Temp Storage; steps are numbered 1 to 11, with 3, 6 and 9 sharing one label]
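The iterate, collect and finish pattern described above can be sketched with in-memory `sqlite3` databases standing in for the shards and for temp storage. The schema (`orders`, `partial`) and the function names are assumptions made for the example.

```python
import sqlite3


def make_shard(rows):
    """Create a demo shard holding (customer, amount) order rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    with conn:
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


def run_report(shards, temp):
    """Total order amount per customer across all shards.

    shards: dict of shard name -> connection; temp: connection used as
    temp storage. Returns (final_rows, names_of_failed_shards).
    """
    temp.execute(
        "CREATE TABLE IF NOT EXISTS partial (shard TEXT, customer TEXT, total INTEGER)"
    )
    failed = []
    for name, conn in shards.items():
        try:
            # Run the per-shard part of the query.
            rows = conn.execute(
                "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
            ).fetchall()
        except sqlite3.Error:
            failed.append(name)  # remember it so this piece can be re-run
            continue
        # Collect the intermediate results in a single database.
        with temp:
            temp.executemany(
                "INSERT INTO partial VALUES (?, ?, ?)",
                [(name, customer, total) for customer, total in rows],
            )
    # Finish the query over the intermediate results.
    final = temp.execute(
        "SELECT customer, SUM(total) FROM partial GROUP BY customer ORDER BY customer"
    ).fetchall()
    return final, failed
```

Each shard is read at a different moment, so the combined result is not transactionally consistent, and the returned list of failed shards is what a caller would use to re-run only the pieces that failed.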

Resources

Windows Azure SQL Database and SQL Server -- Performance and Scalability Compared and Contrasted
http://msdn.microsoft.com/library/azure/jj879332.aspx

Performance Considerations with Windows Azure SQL Database

Performance Guidance for SQL Server in Windows Azure Virtual Machines
http://msdn.microsoft.com/en-us/library/azure/dn248436.aspx


© 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
