Tobias Ternström, Program Manager, SQL
Azure SQL database: Under the hood
3-630
Stuff to cover
• Three different ways to run SQL
• Business Continuity
• Some development best practices
• Partitioning across databases (Sharding)
Three different ways to run SQL
Best fit for your structured relational data needs
[Diagram axes: Friction & Control (High to Low) against Resources (Dedicated to Shared)]
• SQL Server on raw iron: scale-up, full h/w control, roll-your-own HA/DR/scale
• SQL Server in IaaS (Virtualized Machine): 100% of API, virtualized, roll-your-own HA/DR/scale
• SQL Database - PaaS (Virtualized Database): auto HA, fault-tolerance; friction-free scale; self-provisioning, mgmt @ scale
  Premium, up to 500GB
  Web/Business, up to 150GB
Business Continuity
All editions
• High-availability – Local machine failures
Premium edition
• Oops recovery – Human errors
• Disaster recovery – Catastrophic geographic failures
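High-Availability
• Single logical database, multiple synchronous replicas
• Reads are completed at the primary
• Writes are replicated to secondary replicas
• Transparent automatic failover
[Diagram: the DB as one primary (P) and two secondaries (S). A read returns its value from P; a write is sent to P, written to both S replicas, and acknowledged back.]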
Self-Service Restore
Programmatic “oops recovery” of data deletion or alteration
Point-in-time Restore
• Restore points available up to 35 days back
• Creates a side-by-side database copy, non-disruptive
• REST API, PowerShell or Azure Portal
• Available in Premium edition
[Diagram: SQL Database backups (sabcp01bl21) in Azure Storage are geo-replicated; restore from backup]
Active Geo-Replication
Mission-critical business continuity on your terms
• Self-service activation
• Create up to 4 readable secondary replicas
• Replicate to any Azure region
• Automatic data replication, asynchronous
• REST API, PowerShell or Azure Portal
• Available in Premium edition
Geo-Replication Monitoring and Control
DMV value                Values               Visibility
is_interlink_connected   Yes / No             sys.dm_database_copies
Database state           ONLINE / COPYING     sys.databases
Replication state        SEEDING / CATCH_UP   sys.dm_database_copies
Replication lag          Seconds              sys.dm_continuous_copy_status
Last_replication         Timestamp            sys.dm_continuous_copy_status

Task                     API                       Details
Start Continuous Copy    PowerShell, REST          Optional RPO setting
Stop Continuous Copy     PowerShell, REST          Forced or friendly termination
Get Status               PowerShell, REST, T-SQL   Retrieves DMV
Some Development best practices
Why is Azure SQL Database different?
[Diagram labels: Master, MSDB, TempDB and User DBs; instance-level items: Instance Collation, Logins, Credentials, Linked Server Defs., CLR, Agent, Replication, DB Mail, TempDB Collation, Other Apps, Other DBs]
Identifying Top Resource Consumers
Azure SQL Database DMV Surface Area
• Use database-level DMVs to identify top resource consumers
• Snapshot current requests
• Order by elapsed time
• Execute in isolation with STATISTICS (IO/TIME) ON
DMV: sys.dm_exec_query_stats
  Details: Cumulative view of query statistics
  Use: Total and average resource consumption
DMV: sys.dm_exec_sql_text
  Details: Returns the text of the SQL batch that is identified by the specified sql_handle
  Use: Provide overall batch text for statement
DMV: sys.dm_exec_query_plan
  Details: Returns plan in XML for specified plan handle
  Use: Provide plan for tuning and analysis
DMV: sys.dm_exec_requests
  Details: Current requests executing on your DB
  Use: Check for blocking, contention related issues, convoys, etc.
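As a rough illustration of the "order by elapsed time" idea, the sketch below wraps one possible DMV query in a small Python helper. The query text, the TOP (5) limit, the column aliases and the `top_consumers` helper are assumptions made for this example, not something shown in the session; the helper accepts any DB-API style cursor (for instance one opened with a driver such as pyodbc, which is not imported here).

```python
# Hypothetical query: joins cumulative query statistics to the batch text
# and orders by total elapsed time, as the slide suggests.
TOP_CONSUMERS_SQL = """
SELECT TOP (5)
       qs.execution_count,
       qs.total_elapsed_time,
       qs.total_elapsed_time / qs.execution_count AS avg_elapsed_time,
       st.text AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_elapsed_time DESC;
"""


def top_consumers(cursor):
    """Run the query on a DB-API cursor and return one dict per row."""
    cursor.execute(TOP_CONSUMERS_SQL)
    # cursor.description holds one entry per column; item 0 is the name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

The same pattern could be pointed at sys.dm_exec_requests to snapshot the requests that are executing right now.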
Connection Reliability
Issue types
• Exception: Message from server/app indicating failure
• Timeout (or user impatience): No message from server/app within desired response time
[Diagram: User, Client app, Web Server, DB Server; labels: Durable, Not durable, Unreliable]
Let’s look at what can happen
[Diagram: a request between Web Server and DB Server can fail (X) at different points]
1. Request has not yet reached the server
   • Retry for reads and writes is safe
2. The request has reached the server
   • Retry for reads is safe
   • Retry for writes is NOT safe
3. Request is executed on the server
   • Retry for reads is safe
   • Retry for writes is NOT safe
Safe retry-protocol for writes
Client
1. Assign a unique ID (ex. GUID) to request
2. Send request to server
9. Send transaction received notification
Server
3. If request ID already exists return error message
4. Begin transaction
5. Save request ID
6. Execute request
7. Commit transaction
8. Return result
10. Delete request ID
Retention clean-up
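The protocol above can be sketched in a few lines of Python. This is a minimal illustration that uses an in-memory SQLite database as a stand-in for the server; the table names (`processed_requests`, `counters`) and the function names are hypothetical. One simplification is made on purpose: the "request ID already exists" check is folded into the insert by a primary key constraint, so a duplicate surfaces as a constraint error inside the transaction.

```python
import sqlite3
import uuid


def make_server_db():
    """Create the stand-in server database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_requests (request_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER NOT NULL)")
    conn.execute("INSERT INTO counters VALUES ('orders', 0)")
    conn.commit()
    return conn


def handle_write(conn, request_id, amount):
    """Server side: save the request ID and execute the write atomically."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            # Saving the ID fails if this request was already applied.
            conn.execute("INSERT INTO processed_requests VALUES (?)", (request_id,))
            conn.execute(
                "UPDATE counters SET value = value + ? WHERE name = 'orders'",
                (amount,),
            )
    except sqlite3.IntegrityError:
        return "duplicate"  # the earlier attempt already committed
    return "ok"


def acknowledge(conn, request_id):
    """Server side: the client confirmed receipt, so the ID can be deleted."""
    with conn:
        conn.execute("DELETE FROM processed_requests WHERE request_id = ?", (request_id,))


# Client side: one ID per logical request, reused on every retry.
server = make_server_db()
request_id = str(uuid.uuid4())
first = handle_write(server, request_id, 5)
retry = handle_write(server, request_id, 5)  # e.g. after a lost response
```

Because the ID and the write commit together, a retry after a lost response cannot apply the write a second time.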
Retry considerations
• A general system wide back-off strategy is typically a good idea
• Query dependent back-off strategy to avoid overloading the system (from low cost operation to high cost operation)
• Read: Just re-execute
• Write: Requires tracking of transaction ID; requires outer transaction
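A back-off strategy for re-executing reads might look like the sketch below. It is only an illustration: `TransientError`, the `cost_factor` parameter (used here to make the wait longer for more expensive queries) and the default values are assumptions, not something prescribed by the session.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def backoff_delays(base_delay, max_delay, attempts):
    """Exponential delays, doubling each time and capped at max_delay."""
    return [min(max_delay, base_delay * 2 ** i) for i in range(attempts)]


def retry_read(operation, cost_factor=1.0, attempts=4,
               base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Re-execute a read, waiting longer between tries for costly queries."""
    delays = backoff_delays(base_delay * cost_factor, max_delay, attempts - 1)
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the failure
            sleep(delays[attempt])
```

Only reads are retried blindly here; a write would also need the request ID tracking described for the safe retry protocol.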
Reducing round-trips
Get all @ once
• Issue one query that gets the data you need
• Consider using FOR XML for simpler consumption
Batch updates
• Example - “Update the following 1000 entities”
• N updates, N round-trips vs. N updates, <N (1?) round-trips
[Diagram: several separate SELECT… statements compared with a single SELECT…]
Executions   Round-trips   How
N            N             One client server roundtrip per execution
N            <N            XML, delimited list or Table Valued Parameter: all executions in one batch
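To make the N versus <N comparison concrete, here is a small sketch that fetches a set of rows either one statement at a time or with a single parameterized statement. It uses an in-memory SQLite database purely as a stand-in (so there are no real network round-trips), and the `entities` table and both function names are hypothetical. Each function returns the number of statements it issued as a proxy for round-trips.

```python
import sqlite3


def make_db():
    """Create a small stand-in table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO entities VALUES (?, ?)",
                     [(i, f"entity-{i}") for i in range(1, 6)])
    conn.commit()
    return conn


def fetch_one_by_one(conn, ids):
    """N executions: one statement (one round-trip) per requested id."""
    rows = []
    for entity_id in ids:
        rows.extend(conn.execute(
            "SELECT id, name FROM entities WHERE id = ?", (entity_id,)).fetchall())
    return rows, len(ids)


def fetch_batched(conn, ids):
    """One execution: all ids are passed as parameters of a single statement."""
    if not ids:
        return [], 0
    # Only the "?" placeholders are generated; the values stay parameters,
    # so this does not open the door to SQL injection.
    placeholders = ",".join("?" for _ in ids)
    sql = f"SELECT id, name FROM entities WHERE id IN ({placeholders}) ORDER BY id"
    return conn.execute(sql, list(ids)).fetchall(), 1
```

Databases limit the number of parameters per statement, so a very large id list would still need to be split into a few batches.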
Reducing round-trips cont’d
Delimited list
  Pros: Performance is good; no SQL Injection
  Cons: Requires SQLCLR; data is not strongly typed; cumbersome implementation (can be simplified by creating one TVF per “list type”)
XML
  Pros: Can be strongly typed; no SQL Injection; nice option if your data is already XML!; great flexibility
  Cons: Not strongly typed by default; performance is ok but not the best; less cumbersome than the delimited list but still somewhat cumbersome
Table Valued Parameter
  Pros: Strongly typed; no SQL Injection; performance is great!; easy to use; allows for some level of streaming
  Cons: Less flexible than XML; allows for streaming, but only to the server
Roundtrip per execution
  Pros: Fully streaming; easy to use
  Cons: Poor performance; potential for SQL Injection attacks
Partitioning across databases (Sharding)
Common Scaling Patterns
• Large Azure SQL Database customers are often ISVs
• Old pattern was: Build a DB app, sell it to a customer and they run it on their hardware
[Diagram: five customers, each running its own App and DB]
Software as a Service vendors
• SaaS vendors frequently run their code as a layer
• Code is usually shared across many customers
• Sometimes databases are shared too
• When they have more customers, they often have more databases
[Diagram: SaaS layer over Customer DB1, Customer DB2, Customer DB3, … Customer DB N]
Data Model Sharding
• Typical OLTP databases look something like this picture
• Everything goes into a single database, but you usually only query for individual customers at a time (example: customer places an order)
• Reports run on the same database or are moved to a secondary replica to avoid contention on locks, resources
Sharded Model
• Sharded models split the data across multiple databases that each have the same schema
• All data about one customer is located within a single database – OLTP operations work fine
• Cross-database operations do not work at all (without manual work)
• Data is automatically spread across many machines in a cluster, not just one
Central Metadata Databases
• If you have a whole bunch of databases, you need a directory to keep track of which customers are in each database
• A “Directory” database stores this data
General Login Path
1. Connect to Root Database, find tenant (Cache!)
2. Connect to Right Client Database for this Customer
3. Perform Work on per-customer data
[Diagram: Directory, Shard 1 and Shard 2, with steps 1 to 3]
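The login path with its "Cache!" hint can be sketched as a small lookup class. This is a minimal illustration using SQLite as a stand-in for the directory database; the `tenant_directory` table, its columns and the `ShardDirectory` class are hypothetical names chosen for the example.

```python
import sqlite3


class ShardDirectory:
    """Maps a customer to its shard, caching results of directory lookups."""

    def __init__(self, directory_conn):
        self._conn = directory_conn
        self._cache = {}
        self.lookups = 0  # how many times the directory database was queried

    def shard_for(self, customer_id):
        # Step 1: find the tenant, answering from the cache when possible.
        if customer_id in self._cache:
            return self._cache[customer_id]
        self.lookups += 1
        row = self._conn.execute(
            "SELECT shard_name FROM tenant_directory WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        if row is None:
            raise KeyError(customer_id)
        self._cache[customer_id] = row[0]
        return row[0]


def make_directory_db(mapping):
    """Create the stand-in directory database from a customer-to-shard dict."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tenant_directory (customer_id TEXT PRIMARY KEY, shard_name TEXT)")
    conn.executemany("INSERT INTO tenant_directory VALUES (?, ?)", mapping.items())
    conn.commit()
    return conn
```

Steps 2 and 3 would then open a connection to the returned shard and do the per-customer work there. A cached entry has to be invalidated if a customer is ever moved to another database.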
Multi-Tenancy
• Single-tenancy case (1 customer = 1 database) is easy
• It is also possible to do multi-tenant cases (manually)
• Databases become containers and you build code to copy customer data from one database to another
• You can pack cold tenants to save money
Usage Pattern
[Chart: Distribution of CSV Tenants]
• Long tail of colder databases (CSV goal: COGS reduction)
• Small but growing set of highly active users
Reporting “Queries”
How do you run a monthly report over all customers?
• Iterate over each DB
• Collect intermediate results in a single database
• Finish query over intermediate results
Key Details
• Not transactionally consistent
• Intermediate results need to fit in one database
• Some operations may fail; re-run pieces that fail
• On huge systems, it can take hours to run
[Diagram: Report Program, Shard 1, Shard 2, Shard 3 and Temp Storage, with numbered steps 1 to 11]
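The fan-out report described above can be sketched as follows. This is an illustration only, with in-memory SQLite databases standing in for the shards and the temp storage; the `orders` and `partial_totals` tables and the function names are hypothetical. Shards that fail are reported back so that only those pieces need to be re-run.

```python
import sqlite3


def make_shard(orders):
    """Create a stand-in shard holding (customer, amount) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    conn.commit()
    return conn


def run_report(shards, temp):
    """Collect per-shard totals in temp storage, then finish the query there."""
    temp.execute("CREATE TABLE IF NOT EXISTS partial_totals "
                 "(shard TEXT, customer TEXT, total INTEGER)")
    failed = []
    for name, conn in shards.items():
        try:
            rows = conn.execute(
                "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
            ).fetchall()
        except sqlite3.Error:
            failed.append(name)  # remember the piece that needs a re-run
            continue
        with temp:
            # Replace any earlier result for this shard so re-runs are safe.
            temp.execute("DELETE FROM partial_totals WHERE shard = ?", (name,))
            temp.executemany("INSERT INTO partial_totals VALUES (?, ?, ?)",
                             [(name, customer, total) for customer, total in rows])
    final = temp.execute(
        "SELECT customer, SUM(total) FROM partial_totals "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
    return final, failed
```

Each shard is read at a different moment, which is why the combined result is not transactionally consistent.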
Resources
• Windows Azure SQL Database and SQL Server -- Performance and Scalability Compared and Contrasted
  http://msdn.microsoft.com/library/azure/jj879332.aspx
• Performance Considerations with Windows Azure SQL Database
  http://msdn.microsoft.com/library/azure/jj156164.aspx
• Performance Guidance for SQL Server in Windows Azure Virtual Machines
  http://msdn.microsoft.com/en-us/library/azure/dn248436.aspx
© 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.