DBA421
Building Very Large Databases with SQL Server 2000 (32/64bit)
Gert E.R. Drapers
Software Architect, Customer Advisory Team
SQL Server Development
SQL Server Development: Customer Advisory Team
We invest in Large Scale SQL Server projects across the world
Technical Assistance with design, performance tuning, etc.
Product group programs
We use our customer experiences to make a better product
We share the knowledge learned from these challenging implementations so that more can be built
Session Objective
Show that large scale projects using the Microsoft database platform are not unique
Share clever design and implementation tips for dealing with large scale SQL Server applications
Make you feel comfortable that it absolutely can be done
Scaling up / out with SQL Server - It can be done!
Agenda
General comments on building large scale applications
Scaling up Large Scale OLTP
Best Practices
Case Study Example
Scaling up Large Scale Data Warehouse
Best Practices
Case Study Example
Scale out - Clarification
64-bit – Awesome, but understand the value
Summary / Q & A
What is large scale?
Any database application that requires special configuration work outside the defaults
Numbers do not mean much on their own, but I always get asked, so here are some guidelines:
TB+ size databases
1000s of database transactions / sec
1000s of connected users
Throughput of the IO subsystem exceeding 80 MB/sec
Greater than an 8-way processor
Greater than 8 GB RAM
Large scale is when you think it is challenging and your vendor says it is a piece of cake. That is when you start worrying.
Building Large Scale SQL Server applications
The general implementation rules change slightly (true for all RDBMSs)
It's not quite auto-tuning
You need to understand your entire platform: hardware, OS and database server configuration
You may have to change defaults that work great for 99% of other applications:
DOP (degree of parallelism)
Recovery Interval
Query / Index Hints
etc. (a configuration sketch follows)
Experience and know-how are what make large scale applications work
It's not a piece of cake, but it ABSOLUTELY CAN BE DONE!
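As a minimal sketch of overriding one such default, here is how DOP might be capped with sp_configure; the value is illustrative, not a recommendation:

-- Hypothetical tuning sketch: cap the degree of parallelism so one
-- large query cannot monopolize every CPU on a big SMP box.
exec sp_configure 'show advanced options', 1
reconfigure
go
exec sp_configure 'max degree of parallelism', 4   -- illustrative value
reconfigure
go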
Some interesting SQL Server implementations in production
25+ applications of a terabyte or greater in production
Currently active on at least as many as we speak
7+ TB telco billing reporting system growing to 15 TB
3 TB GIS mapping application growing to 10-20 TB
13 TB debt processing financial application
Currently working on a system that will grow to 50 TB
Many high-volume, mission-critical applications in production on SQL Server
Trading application exceeding 60,000+ database transactions / sec
Credit card processing system processing 3,000 authentications/sec
Banking application processing money transfers of greater than 40M transactions / day while reporting is active
Agenda
General comments on building large scale applications
Scaling up Large Scale OLTP
Best Practices
Case Study Example
Scaling up Large Scale Data Warehouse
Best Practices
Case Study Example
Scale out - Clarification
64-bit – Awesome, but understand the value
Summary / Q & A
OLTP Large Scale Tips and Best Practices
Eliminate Log Bottleneck
Eliminate ineffective Stored Procedure Recompiles
Eliminate Lock escalation that causes performance issues or deadlocks
Understand your scheduler and what hardware resources are needed
Eliminate Log Bottleneck
Set the log on its own drive or RAID 10 stripe set
Assign it its own controller, or its own LUN(s) if on a SAN
Set the controller or IO subsystem cache for the log as 100% write-through (unless reading the log with replication or a lot of trigger activity)
Monitor waitstats to ensure the log is not waiting:
DBCC SQLPERF(waitstats, clear)
go
DBCC SQLPERF(waitstats)
Create Waitstats Script
dbcc sqlperf(waitstats, clear)   -- clear out wait statistics
go
if object_id('waitstats') is not null
    drop table waitstats
go
create table waitstats (
    Wait_Type varchar(80),
    Requests numeric(18,1),
    Wait_Time numeric(18,1),
    Signal_Wait_Time numeric(18,1),
    timenow datetime default getdate())
go
declare @start int, @finish int
select @start = 1, @finish = 10
while (@start < @finish)
begin
    begin transaction
    insert into waitstats (Wait_Type, Requests, Wait_Time, Signal_Wait_Time)
        exec ('dbcc sqlperf(waitstats)')
    commit
    select @start = @start + 1
    waitfor delay '00:00:10'   -- sample every 10 seconds
end
go
select * from waitstats order by Wait_Time desc
Eliminate ineffective Stored Procedure Recompiles
In some cases, stored procedure recompilation behavior adversely affects performance
Multiple recompilations
The entire stored procedure query plan is recompiled (costly for very large stored procedures)
Stored procedure recompiles cause serialization; executions of the proc wait serially during recompilation
Recompiles are triggered by:
SET option changes
Mixed DML/DDL
TempDB objects
Table cardinality changes
Etc.
Customers often miss the recompilation aspect of performance
Possible Solutions / Suggestions
Whitepaper on MSDN and KB article Q243586
Execute the proc with KEEPFIXED PLAN at the statement level
Statement-level recompilation (Yukon)
Be on the lookout for recompilations using Perfmon
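A minimal sketch of the statement-level hint mentioned above; the procedure and table names are hypothetical:

-- Illustrative only: KEEPFIXED PLAN keeps this statement's plan from
-- being recompiled when statistics on the underlying table change.
create procedure dbo.usp_GetTradesForDay   -- hypothetical name
    @TradeDate datetime
as
select TradeId, Symbol, Price
from dbo.Trades                            -- hypothetical table
where TradeDate = @TradeDate
option (keepfixed plan)
go

Watch the SQL Re-Compilations/sec counter in Perfmon to confirm the effect.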
Locking / Lock Escalation
Lock Escalation
SQL Server will escalate to a table lock when a specific percentage of memory is used up for locking, or when the system allocates 2,500 locks (5,000 in SQL Server 2000 SP2)
Table lock escalation saves on memory allocation, but can cause deadlocks and other locking problems
Resolution:
Force page locks, which allocate fewer locks
Or get clever like one customer did (see the sketch below):
Insert a dummy record (9999 99 99999 999 999)
Open a thread and grab an update lock on the dummy record
The next thread cannot escalate, because SQL Server cannot escalate to a table lock while another lock is held on the same table
Worst case: there is a trace flag to shut escalation off; call Microsoft support prior to using this option
Monitor through the normal locking tools: Profiler, sp_who, sp_who2, sp_lock and sp_blockinfo
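A minimal sketch of the dummy-record trick; the table name and key value are hypothetical:

-- Session 1: hold an update lock on the dummy row for the duration of
-- the heavy work. While this lock is held, other sessions' row/page
-- locks on dbo.Accounts cannot be escalated to a table lock.
begin tran
select AccountId
from dbo.Accounts with (updlock, rowlock)   -- hypothetical table
where AccountId = 99999                     -- the dummy record
-- ...leave the transaction open while the heavy work runs...
-- commit tran when done, releasing the lock and allowing escalation again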
Locking script
Send email to [email protected] for:
sp_blockinfo script
waitstats script
Other documentation
SQL Server Scheduler – understand it!
[Diagram: 5 BCP.exe jobs scheduled across a 4-way CPU with four UMS schedulers]
1) Connections get assigned to a UMS scheduler
2) UMS schedules across processors
3) A connection stays on its UMS scheduler for the life of the thread
4) Two heavy threads will fight on the same UMS scheduler
SQL Server Scheduler – continued
[Diagram: the same 5 BCP.exe jobs on a 4-way CPU; tasks 2, 3 and 4 each get their own UMS scheduler while tasks 1 and 5 share one]
1) Each load job takes 2 minutes to run
2) Tasks 2, 3 and 4 each finished in 2 minutes
3) Tasks 1 & 5 took closer to 4 minutes
4) Using 4 threads instead of 5 would be a better solution for this example
5) Monitor using DBCC SQLPERF(UMSSTATS)
UMSSTATS example
DBCC SQLPERF (UMSSTATS,clear)
go
waitfor delay '00:00:15'
go
DBCC SQLPERF (UMSSTATS)
OLTP Case Study – 60000+ database trx/sec
Financial trading messages: Bids / Asks coming off Tandem to message queue at a rate of 12000-15000/sec
Each message goes to a SQL Server stored procedure and makes approximately 5 database transactions
Simultaneously real time queries are reading from active data
Topology
[Diagram: Tandem Non-Stop SQL feeds a message queue; an application tier reads from the queue and calls SQL Server stored procedures against 6 different databases on an 8-way Dell running Microsoft SQL Server, while brokers run real-time queries against the same databases]
OLTP Lessons learned
Challenges
Scheduling a mixed workload evenly across schedulers
Database Log to handle 60000+ database trx/sec
Real time reporting and loading data
Can be done with SQL Server 2000:
Multiple database logs
Read-only queries
Priority scheduling of connections
Agenda
General comments on building large scale applications
Scaling up Large Scale OLTP
Best Practices
Case Study Example
Scaling up Large Scale Data Warehouse
Best Practices
Case Study Example
Scale out - Clarification
64-bit – Awesome, but understand the value
Summary / Q & A
Data Warehouse Lessons Learned
Loading large quantities of data quickly
Maintaining very large databases
Optimizing TempDB
Optimizing your IO subsystem
Data Loading Strategy For VLDB
Bulk Insert without indexes, or with minimal indexes, seems to be fastest (hundreds of thousands of rows/sec), but is not applicable to larger systems with incremental data loads
Bulk Insert with a single nonclustered index scales fairly linearly
Bulk Insert with a clustered index is less linear due to fragmentation
Parallel Bulk Insert with sorted data and partitioned load files can be the fastest, though not linear
No contention when using multiple Bulk Inserts (one per table) if the physical model works for data access (e.g. horizontal partitioning); a load sketch follows
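A hedged sketch of one such load; the file path, table and column names are assumptions:

-- One BULK INSERT per table, each loading a pre-sorted, pre-partitioned
-- file. TABLOCK takes a bulk-update table lock; with no indexes on the
-- table this also enables minimally logged loading.
bulk insert dbo.CDR_Hour01
from 'D:\loads\cdr_hour01.dat'
with (
    tablock,                 -- bulk-update table lock
    batchsize = 100000,      -- commit every 100,000 rows
    order (CallTime asc)     -- file is pre-sorted on the clustered key
)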
Maintenance Tips: Reorg & Create Index
Run DBCC SHOWCONTIG and look at Scan Density (logical extent density) and Avg. Page Density (average free bytes per page); below 30-50%, a reorg is recommended
Run DBCC INDEXDEFRAG if your data is not interleaved (that is, a single table/index in its own filegroup)
Run DBCC DBREINDEX if you want all clustered and nonclustered indexes rebuilt
Run CREATE INDEX with DROP_EXISTING if you only want the clustered index rebuilt and INDEXDEFRAG is not helping
Create Index runs fastest with more memory
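Putting those tips together, a minimal maintenance sketch; the table and index names are hypothetical:

-- 1. Check fragmentation: look at Scan Density and Avg. Page Density.
dbcc showcontig ('dbo.CDR_Day01')

-- 2. Data not interleaved? Defragment in place (0 = current database,
--    index id 1 = the clustered index).
dbcc indexdefrag (0, 'dbo.CDR_Day01', 1)

-- 3. Rebuild every index on the table.
dbcc dbreindex ('dbo.CDR_Day01')

-- 4. Or rebuild only the clustered index.
create clustered index IX_CDR_Day01_CallTime
    on dbo.CDR_Day01 (CallTime)
    with drop_existing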
Index Defrag with Interleaved data
[Diagram: a single filegroup with Tab A and Tab B pages interleaved; Tab A's extents are out of logical key order (6-10, 16-20, 1-5, 11-15, 21-25). Run DBCC INDEXDEFRAG on Tab A.]
Index Defrag with Interleaved data…
[Diagram: the same interleaved filegroup part-way through the defrag; Tab A's extents now read 1-5, 16-20, 6-10, 11-15, 21-25]
1. Start with the first extent and swap with the next smallest extent
2. If pages are not at the fill factor, they will be filled up to the fill factor
3. If pages are above the fill factor, it will NOT remove records to meet the fill factor
4. It will not reorganize other objects in the filegroup, so data will not be contiguous following INDEXDEFRAG
Index Defrag with Interleaved data…
[Diagram: after INDEXDEFRAG completes, Tab A's extents are in logical key order (1-5, 6-10, 11-15, 16-20, 21-25), still interleaved with Tab B]
When completed, data is in correct logical order
Data may or may not be contiguous
Free space is used up but not given back
DBCC DBREINDEX will put data back in contiguous order, but takes longer to run and is more intrusive
TempDB slows down when busy!
Challenge
Performance degradation during peak load
Many concurrent users creating temporary tables
Performance will start to degrade
Sysprocesses shows locks on DBID #2 (TempDB)
Resolution
More files for the TempDB filegroup (see the sketch below)
Place TempDB across more spindles
You could possibly use a trace flag to disable single page allocation, but please call Microsoft support before turning this trace flag on
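A minimal sketch of adding a file to TempDB; the path and sizes are illustrative. (The trace flag referred to above is commonly 1118, but confirm with Microsoft support before enabling it.)

-- Add a second data file so allocation contention is spread across
-- files and spindles; a common rule of thumb is one file per CPU.
alter database tempdb
add file (
    name = 'tempdev2',
    filename = 'E:\tempdb\tempdev2.ndf',
    size = 1024MB,
    filegrowth = 256MB)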
Challenge: New to SAN technology
Working with SAN technology for the first time can be challenging
Do the data layout rules change?
What does the cache really buy me?
How do I monitor?
How do I determine if the problem is the SAN or the database?
Some thoughts:
Test your SAN throughput prior to any database activity; use IOMETER or just copy large files
You should be getting approximately 120 MB/sec throughput per channel
If the system is update-heavy, weight the cache toward write versus read; this will significantly help log writes (you should get about 1 ms per write: Avg. Disk sec/Write)
Still make sure data is spread across as many drives as possible; individual disks are still the slowest component for database activity
Case Study: Data Warehouse
Large telco processing call detail records (CDRs)
Loads 350 million call detail records / day into SQL Server, bulk loaded
Must keep a history
Reporting runs all day long, based on daily, weekly, monthly, quarterly and yearly intervals
Very partitionable by period of time
5-7 TB database
This was based on a prototype; the project is now being developed.
Case Study – Data Warehouse design
3 types of databases:
Type 1 contains 24 one-hour tables
Type 2 contains 7 one-day tables
Type 3 contains 7 one-week tables
Online consolidation was much more difficult, so all 3 are built in parallel; this is a good design if the hardware has headroom
Data is loaded without indexes and then indexed
Data is made available to users once indexed
Yesterday, Last Week and the last 7 weeks are available while Today and This Week are being loaded (a partitioned-view sketch follows)
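A hedged sketch of how such time-sliced tables can be stitched together with a local partitioned view; the table, column and date values are hypothetical:

-- Each member table carries a CHECK constraint on the time column so
-- the optimizer can prune partitions when the view is queried.
create table dbo.CDR_Day1 (
    CallTime datetime not null
        check (CallTime >= '20030707' and CallTime < '20030708'),
    CallerId int not null,
    DurationSec int not null)
create table dbo.CDR_Day2 (
    CallTime datetime not null
        check (CallTime >= '20030708' and CallTime < '20030709'),
    CallerId int not null,
    DurationSec int not null)
go
create view dbo.CDR_ThisWeek
as
select CallTime, CallerId, DurationSec from dbo.CDR_Day1
union all
select CallTime, CallerId, DurationSec from dbo.CDR_Day2
-- union all ...one branch per remaining day of the week
go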
Challenges for data warehouse case study
Real time reporting
Partition management – History
Maintenance of very large databases
Auto-updating statistics on new high-volume data coming in
Design using 1 server
[Diagram: one SQL Server hosting the Today, Yesterday, This Week, Last Week ... Week n-7 tables, unified through partitioned views (PView)]
Design using 4 servers
[Diagram: the same Today / Yesterday / This Week / Last Week ... Week n-7 tables spread across 4 servers]
Design using 7 servers
[Diagram: the same tables spread across 7 servers, one per time slice]
Case Study Data Warehouse – Summary
Large database with a very large number of rows loaded per day
Loaded in parallel to different tables, with hardware being the determining bottleneck
Partitioned on a time value
Could be partitioned using partitioned views or partitioned within the application
Data partitioning allowed easy scalability to multiple databases, instances or servers with minimal changes to code
Maintenance decisions were not made during the prototype phase of the project; they are being made now
Agenda
General comments on building large scale applications
Scaling up Large Scale OLTP
Best Practices
Case Study Example
Scaling up Large Scale Data Warehouse
Best Practices
Case Study Example
Scale out - Clarification
64-bit – Awesome, but understand the value
Summary / Q & A
What is Scale Out?
Multiple nodes
Evenly distributed data, as in data partitioning
The data warehouse telco example is data partitioning
Distributed partitioned views are data partitioning
Evenly distributed workload/data, as in function partitioning
Online shopping is function partitioning
Break up unrelated functions across resources, versus breaking up related data
Scale Out Case Study: 13TB Data Analysis System
Provides bad loan/credit collection services and reporting (online and batch)
Designed to create an environment where credit grantors, debt buyers, and collection agencies can meet their business objectives
Large volume of data inserted every night from several sources
Heavy analytical reporting/data mining performed all day long to predict and quantify the likelihood of collecting each delinquent account in a client's portfolio
< 24 hour turnaround on customer requests
Scale Out Case Study: Challenges
Manage 13TBs of data
Manage large files from several external customers / day
Load millions of rows from multiple files and run batch reporting at the same time
How do we manage different SLAs from different customers on the same system?
Scale Out Case Study: Function Partitioning
[Diagram: daily customer source files feed individual account databases and an account function tracking database (Function X, Y and Z tables), alongside a common database holding common function procedures and data]
Function Partitioning
Partitioning is a VERY good idea
Partitioning by business function is very logical
The application must be developed to access the partitioned data, versus using database technology to access partitioned data. NOT DIFFICULT
Putting all the data in one single database doesn't make you a hero
No matter how you look at it, 13 TB of data is 13 TB of data used to solve a business problem
Agenda
General comments on building large scale applications
Scaling up Large Scale OLTP
Best Practices
Case Study Example
Scaling up Large Scale Data Warehouse
Best Practices
Case Study Example
Scale out - Clarification
64-bit – Awesome, but understand the value
Summary / Q & A
64-bit – Is it an upgrade lock?
What is the hype about 64-bit?
It produced world-record benchmarks
New advancements in CPU – Itanium
This raises the questions:
Should you automatically upgrade?
Is it a performance guarantee?
Is it an easy upgrade?
64-Bit Clear Winners
OLAP applications with large dimensions
Breaks through the 3 GB limit
Essential for dimensions with millions of members
Can also leverage a large file system cache
Relational applications requiring extreme memory (> 32GB)
VLDBs with random access patterns that can be 'cached' in DB buffers for performance
Any DB requiring > 64 GB of buffers
An alternative to 32-bit apps using AWE:
No pressure on 'real' memory
No overhead for mapping AWE pages
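For contrast, here is the 32-bit AWE configuration whose overhead 64-bit removes; the memory value is illustrative:

-- AWE memory can hold data pages only; it is unavailable to the plan
-- cache, sorts and hashes, and mapping the pages costs CPU.
exec sp_configure 'show advanced options', 1
reconfigure
go
exec sp_configure 'awe enabled', 1             -- takes effect after restart
exec sp_configure 'max server memory', 32768   -- MB; caps AWE usage
reconfigure
go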
64-bit winners…
If you need to keep the procedure/statement cache in the virtual address space
If you need to keep the page headers in the virtual address space: the more memory we use via AWE, the more page headers we have to keep, and the less space remains for real data in the virtual address space
If you need to keep memory associated with a server-side cursor in the virtual address space
Memory for hash joins and sort operations can only be budgeted against the virtual address space; we cannot use AWE memory for such operations (example: spills to TempDB)
Going beyond 16 GB of AWE memory, we have to reduce the virtual address space to 2 GB; this is counterproductive, since there is then more data (page headers) to hold in a smaller virtual address space
Concurrency issues with AWE under Windows 2000 were improved in Windows Server 2003; the problem showed up on 32-way hardware
Some realities!
Documentation states that Itanium has twice the CPU power/speed of Xeon, despite a slower clock speed
You pay premium dollars for an Itanium with a slower clock than a Xeon
CPU-bound SQL relational workloads will see a performance improvement on the same number of CPUs; BUT this does NOT mean a 4-way Itanium equals an 8-way Xeon
Very large memory can make shutdown take a long time, because the system checkpoint has a lot of data to flush; recommendation: keep the system checkpoint at a small interval (example: 1 minute; see the sketch below)
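A minimal sketch of that checkpoint recommendation; the one-minute value comes from the slide above:

-- Keep the recovery interval short so each checkpoint flushes a small
-- amount of dirty memory, avoiding a long flush at shutdown.
exec sp_configure 'show advanced options', 1
reconfigure
go
exec sp_configure 'recovery interval', 1   -- minutes
reconfigure
go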
Takeaways
For relational DBMS workloads, 64-bit is not a slam-dunk
Often better to have more CPUs
Often better to have faster CPUs
Qualify before recommending. Look for:
Memory-intensive workloads
Beyond what can be gracefully handled with AWE today, e.g. >> 32 GB
Large-Scale OLAP workloads
Agenda
General comments on building large scale applications
Scaling up Large Scale OLTP
Best Practices
Case Study Example
Scaling up Large Scale Data Warehouse
Best Practices
Case Study Example
Scale out - Clarification
64-bit – Awesome, but understand the value
Summary / Q & A
Summary
We can build VLDB systems with SQL Server today, and we are doing it
The Design Wins Program may be able to assist; send email to your local Microsoft representative or [email protected]
It is not auto-tuning or auto-configuration; it will take effort from you, and your Microsoft MCS can help
Having this knowledge will help you be more successful while building large scalable systems
Community Resources
Community Resources
http://www.microsoft.com/communities/default.mspx
Most Valuable Professional (MVP)
http://www.mvp.support.microsoft.com/
Newsgroups
Converse online with Microsoft Newsgroups, including Worldwide
http://www.microsoft.com/communities/newsgroups/default.mspx
User Groups
Meet and learn with your peers
http://www.microsoft.com/communities/usergroups/default.mspx
Suggested Reading And Resources
The tools you need to put technology to work!
TITLE / Available
Microsoft® SQL Server™ 2000 Administrator's Companion (0-7356-1051-7) / Today
Microsoft® SQL Server™ 2000 High Availability (0-7356-1920-4) / 7/9/03
Microsoft Press books are 20% off at the TechEd Bookstore
Also buy any TWO Microsoft Press books and get a FREE T-Shirt
© 2003 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.