Module 2
Planning Data Warehouse Infrastructure
Module Overview
• Considerations for Data Warehouse Infrastructure
• Planning Data Warehouse Hardware
Lesson 1: Considerations for Data Warehouse Infrastructure
• System Sizing Considerations
• Data Warehouse Workloads
• Typical Server Topologies for a BI Solution
• Scaling-out a BI Solution
• Planning for High Availability
System Sizing Considerations
• Data Volume
• Analysis/Report Complexity
• Number of Users
• Availability Requirements
Data Warehouse Workloads
ETL
• Control flow tasks
• Data query and insert
• Network data transfer
• In-memory data pipeline
• SSIS Catalog or MSDB I/O
Data Models
• Processing
• Aggregation storage
  • Multidimensional on disk
  • Tabular in memory
• Query execution
Reporting
• Client requests
• Data source queries
• Report rendering
• Caching
• Snapshot execution
• Subscription processing
• Report Server Catalog I/O
Operations and Maintenance
• OS activity
• Logging
• SQL Server Agent Jobs
• SSIS packages
• Indexes
• Backups
Typical Server Topologies for a BI Solution
• Single Server Architecture: few servers
• Distributed Architecture: many servers
Moving from few servers to many servers increases:
• Hardware costs
• Software license costs
• Configuration complexity
• Scalability & Performance
• Flexibility
Scaling-out a BI Solution
• Data Warehouse
• Analysis Services
• Integration Services
• Reporting Services
Planning for High Availability
Data Warehouse
• AlwaysOn Failover Cluster
• RAID Storage
Analysis Services
• AlwaysOn Failover Cluster
Integration Services
• AlwaysOn Availability Group
Reporting Services
• NLB Report Servers
• AlwaysOn Availability Group or AlwaysOn Failover Cluster
Lesson 2: Planning Data Warehouse Hardware
• SQL Server Fast Track Reference Architectures
• Core-Balanced System Architecture
• Demonstration: Calculating Maximum Consumption Rate
• Determining Processor and Memory Requirements
• Determining Storage Requirements
• Considerations for Storage Hardware
• SQL Server Data Warehouse Appliances
• SQL Server Parallel Data Warehouse
SQL Server Fast Track Reference Architectures
• Pre-tested and approved hardware specifications and guidance
• Available from multiple hardware vendors in partnership with Microsoft
• Support for a range of data warehouse sizes
• Tools provided to calculate required specification
Core-Balanced System Architecture
[Figure: core-balanced system architecture. A server running Windows Server and SQL Server, with two quad-core CPUs and three dual-port FC HBAs, connects through a fiber switch to three storage enclosures, each with storage processors and 4-spindle RAID 10 disk groups.]
Figure labels:
• Per-Core MCR = 200 MB/s
• Total MCR = 1600 MB/s
• 2 x FC Port per processor
• Max I/O Rate = 2000 MB/s (shown twice)
• Max I/O Rate = 1800 MB/s
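The balance check implied by the figure labels can be sketched in a few lines of Python. This is only an illustration: the rates come from the slide, but the stage names (`io_stage_1` and so on) are placeholders, because the text does not show which component each "Max I/O Rate" label belongs to.

```python
# Figures taken from the slide labels; stage names are hypothetical placeholders.
PER_CORE_MCR_MB_S = 200
CORES = 2 * 4  # two quad-core CPUs


def total_mcr(per_core_mcr_mb_s, cores):
    """Total maximum consumption rate of the server, in MB/s."""
    return per_core_mcr_mb_s * cores


def bottleneck(rates):
    """Return the (name, rate) pair with the lowest throughput."""
    return min(rates.items(), key=lambda item: item[1])


rates = {
    "cpu_consumption": total_mcr(PER_CORE_MCR_MB_S, CORES),
    "io_stage_1": 2000,
    "io_stage_2": 2000,
    "io_stage_3": 1800,
}

name, rate = bottleneck(rates)
print(f"Total MCR: {rates['cpu_consumption']} MB/s")
print(f"Lowest-throughput component: {name} at {rate} MB/s")
```

With these numbers every I/O stage can deliver at least as much data as the cores can consume, which is the sense in which the system is balanced around its cores.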
Demonstration: Calculating Maximum Consumption Rate
In this demonstration, you will see how to:
• Create tables for benchmark queries
• Execute a query to retrieve I/O statistics
• Calculate MCR from the I/O statistics
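The last step can be sketched in Python. The demonstration's exact script is not reproduced here, so this assumes the calculation commonly used with Fast Track: run a benchmark query on a single core against cached data, take the logical reads reported by `SET STATISTICS IO` and the CPU time reported by `SET STATISTICS TIME`, and convert pages per CPU second to MB/s. The input values below are hypothetical.

```python
PAGE_SIZE_KB = 8  # SQL Server data pages are 8 KB


def mcr_mb_per_sec(logical_reads, cpu_time_ms):
    """Estimate per-core MCR in MB/s from benchmark query statistics.

    logical_reads: pages read, as reported by SET STATISTICS IO
    cpu_time_ms:   CPU time in milliseconds, as reported by SET STATISTICS TIME
    """
    cpu_seconds = cpu_time_ms / 1000
    return (logical_reads / cpu_seconds) * PAGE_SIZE_KB / 1024


# Hypothetical benchmark result: 256,000 logical reads in 10 s of CPU time.
example = mcr_mb_per_sec(logical_reads=256_000, cpu_time_ms=10_000)
print(f"Per-core MCR: {example:.0f} MB/s")
```

In practice the query is run several times and the results averaged, since a single run can be skewed by other activity on the server.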
Determining Processor and Memory Requirements
Estimating CPU Requirements:
• Determine core MCR
• Apply formula to estimate required number of cores:
  ((Average query size in MB / MCR) x Concurrent users) / Target response time
• Spread cores across CPUs based on the number of storage arrays
Estimating RAM Requirements:
• Use a minimum of 4 GB per core (or 64-128 GB per socket)
• Target 20% of data volume
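As a worked illustration of the core-count formula and the RAM guidelines, here is a minimal Python sketch. The workload figures (18,000 MB average query, 10 concurrent users, 60-second target, 1,000 GB of data) are hypothetical; the formula, the 4 GB per core minimum, and the 20% target come from the slide. The response time is assumed to be in seconds, matching an MCR in MB/s.

```python
import math


def required_cores(avg_query_mb, mcr_mb_s, concurrent_users, target_response_s):
    """((Average query size in MB / MCR) x Concurrent users) / Target response time."""
    return ((avg_query_mb / mcr_mb_s) * concurrent_users) / target_response_s


def ram_guidelines_gb(cores, data_volume_gb, gb_per_core=4, target_percent=20):
    """Minimum RAM from the per-core rule and target RAM from the data-volume rule."""
    return {
        "minimum_gb": cores * gb_per_core,
        "target_gb": data_volume_gb * target_percent / 100,
    }


# Hypothetical workload figures, for illustration only.
cores = math.ceil(required_cores(18_000, 200, 10, 60))
ram = ram_guidelines_gb(cores, data_volume_gb=1_000)
print(f"Estimated cores: {cores}")
print(f"RAM: at least {ram['minimum_gb']} GB, target {ram['target_gb']:.0f} GB")
```

The result is rounded up because cores come in whole units; in practice it would then be rounded again to a purchasable CPU configuration.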
Determining Storage Requirements
Data Warehouse
• Determine initial data volume
  • Number of fact table rows x row size
  • Use 100 bytes per row as an estimate if unknown
  • Add 30-40% for dimensions and indexes
• Project data growth
  • Number of new fact rows per month
• Factor in compression
  • Typically 3:1
Other storage requirements
• Configuration databases
• Log files
• TempDB
• Staging tables
• Backups
• Analysis Services models
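The data warehouse volume estimate above can be sketched as a small Python calculation. The defaults (100 bytes per row, 3:1 compression) follow the slide, and the 35% overhead is one value picked from the 30-40% range. Two things are assumptions made here: the dimension and index overhead is also applied to newly added rows, and the "other storage requirements" are not sized at all. The row counts in the usage lines are hypothetical.

```python
BYTES_PER_GB = 1024 ** 3


def initial_volume_gb(fact_rows, row_bytes=100, overhead=0.35):
    """Uncompressed volume: fact rows x row size, plus overhead for dimensions and indexes."""
    return fact_rows * row_bytes * (1 + overhead) / BYTES_PER_GB


def projected_volume_gb(fact_rows, new_rows_per_month, months,
                        row_bytes=100, overhead=0.35, compression_ratio=3.0):
    """Compressed volume after the given number of months of growth."""
    total_rows = fact_rows + new_rows_per_month * months
    return initial_volume_gb(total_rows, row_bytes, overhead) / compression_ratio


# Hypothetical inputs: 1 billion initial fact rows, 50 million new rows per month.
now_gb = initial_volume_gb(1_000_000_000)
in_three_years_gb = projected_volume_gb(1_000_000_000, 50_000_000, months=36)
print(f"Initial (uncompressed): {now_gb:.1f} GB")
print(f"After 36 months (compressed): {in_three_years_gb:.1f} GB")
```

Log files, TempDB, staging tables, backups, and Analysis Services models would need to be sized separately and added to this figure.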
Considerations for Storage Hardware
• Use the fastest disks you can afford
• Consider solid state disks, especially for random I/O
• Use RAID 10, or at minimum RAID 5
• Consider a dedicated storage area network for manageability and extensibility
• Balance I/O across enclosures, storage processors, and disk groups
• Use more, smaller disks instead of fewer, larger disks
SQL Server Data Warehouse Appliances
• Pre-built hardware and software solutions based on tested configurations
• Part of a range of SQL Server-based appliances
• Available from multiple hardware vendors
SQL Server Parallel Data Warehouse
• Special SQL Server edition only available in hardware appliances
• Massively parallel processing
• Shared-nothing architecture
• Dedicated control nodes, compute nodes, and storage nodes
[Figure: Parallel Data Warehouse appliance architecture. Labels: Control Node, Cluster Management Servers, Landing Zone (ETL Interface), Backup Nodes, Database servers (compute nodes), Storage Arrays, InfiniBand, Dual Fiber Channel.]
Lab: Planning Data Warehouse Infrastructure
• Exercise 1: Planning Data Warehouse Hardware
Logon Information
Virtual Machine: 20463C-MIA-SQL
User Name: ADVENTUREWORKS\Student
Password: Pa$$w0rd
Estimated Time: 30 Minutes
Lab Scenario
You are planning a data warehouse solution for
Adventure Works Cycles, and have been asked to
specify the hardware required. You must design a
SQL Server-based solution that provides the right
balance of functionality, performance, and cost.
Lab Review
Review DW Hardware Spec.xlsx in the
D:\Labfiles\Lab02\Solution folder. How does the
hardware specification in this workbook compare to
the one you created in the lab?
Module Review and Takeaways
• Review Question(s)