Deploying MongoDB can be a challenge if you don't understand how resources are used or how to plan for the capacity of your systems. If you need to deploy or grow a single MongoDB instance, a replica set, or tens of sharded clusters, you probably face the same challenges in sizing that deployment. This talk covers what resources MongoDB uses and how to plan for their use in your deployment. Topics include how to model and plan capacity needs for a new deployment, how to grow an existing one, and where the steps toward scalability lie along your path to the top. The goal of this presentation is to give you the tools you need to manage your MongoDB capacity planning tasks successfully.
Server Engineer
Shaun Verch
Capacity Planning:
Deploying MongoDB
Capacity Planning
• Why is it important?
• What is it?
• When is it important?
• How is it actually done?
Why?
• What are the consequences of not planning?
Why does it matter?
What?
What is Capacity Planning?
Requirements
Resources
• Availability
• Throughput
• Responsiveness
Requirements to Hardware
Resource Usage
• Storage
– IOPS
– Size
– Data & Loading Patterns
• Memory
– Working Set
• CPU
– Speed
– Cores
• Network
– Latency
– Throughput
Storage
• Active
• Archival
• Loading Patterns
• Integration (BI/DW)
Example IOPS
7,200 rpm SATA ~ 75-100 IOPS
15,000 rpm SAS ~ 175-210 IOPS
Amazon EBS/Provisioned ~ 100 IOPS, "up to" 2,000 IOPS
Amazon SSD 9,000 – 120,000 IOPS
Storage Capability
Intel X25-E (SLC) ~ 5,000 IOPS
Fusion IO ~ 135,000 IOPS
Violin Memory 6000 ~ 1,000,000 IOPS
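To make the IOPS figures above concrete, here is a minimal sizing sketch. It uses the low end of three of the ranges listed on the slide and assumes IOPS add up linearly across devices, which real arrays only approximate. The dictionary keys and the function name `devices_needed` are hypothetical, chosen for illustration.

```python
import math

# Approximate per-device IOPS: the low end of the ranges quoted on the slide.
DEVICE_IOPS = {
    "7200rpm_sata": 75,
    "15000rpm_sas": 175,
    "intel_x25e_slc": 5000,
}


def devices_needed(required_iops: int, device: str) -> int:
    """Return how many devices of one type are needed to reach required_iops.

    Assumption: IOPS scale linearly with device count (ideal striping).
    """
    if required_iops <= 0:
        return 0
    return math.ceil(required_iops / DEVICE_IOPS[device])


if __name__ == "__main__":
    for name in DEVICE_IOPS:
        print(name, devices_needed(2000, name))
```

Under these assumptions, a 2,000 IOPS requirement needs dozens of 7,200 rpm SATA drives but a single SLC SSD, which is why the storage choice dominates the rest of the sizing.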
Cost of IOPS
Storage Costs
Memory
• Working Set
– Active Data in Memory
– Measured Over Periods
Memory
• Work:
– Sorting
– Aggregation
– Connections
Memory & Storage
Working Set
Number of distinct pages accessed per unit of time (for example, per second)
Working Set
4 distinct pages per second
Worst case 4 disk accesses
Working Set
6 distinct pages per second
Worst case disk access on every op
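The two working-set cases above can be reproduced with a small simulation. This is a sketch under stated assumptions, not how MongoDB or the operating system manages memory: it assumes memory holds exactly 4 pages (the slide's diagram did not survive, so this number is hypothetical), least-recently-used eviction, and a worst-case cyclic access pattern. The function names are illustrative.

```python
from collections import OrderedDict
from itertools import cycle, islice


def count_page_faults(accesses, memory_pages):
    """Count disk accesses (page faults) for a sequence of page ids,
    using an LRU cache that holds at most memory_pages pages."""
    cache = OrderedDict()
    faults = 0
    for page in accesses:
        if page in cache:
            cache.move_to_end(page)        # hit: mark as most recently used
        else:
            faults += 1                    # miss: page must come from disk
            cache[page] = True
            if len(cache) > memory_pages:
                cache.popitem(last=False)  # evict the least recently used page
    return faults


def cyclic_accesses(distinct_pages, total_ops):
    """Worst-case pattern: touch pages 0..distinct_pages-1 in a loop."""
    return islice(cycle(range(distinct_pages)), total_ops)


if __name__ == "__main__":
    MEMORY_PAGES = 4  # assumption, not stated on the slides
    print(count_page_faults(cyclic_accesses(4, 24), MEMORY_PAGES))
    print(count_page_faults(cyclic_accesses(6, 24), MEMORY_PAGES))
```

When the working set fits in memory, only the first touch of each page goes to disk. Once it exceeds memory, this access pattern evicts every page just before it is needed again, so every operation hits disk.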
Memory & Storage
MOPs
PFs
CPU
• Non-indexed Data
• Sorting
• Aggregation
– Map/Reduce
– Framework
• Data
– Fields
– Nesting
– Arrays/Embedded-Docs
Network
• Latency
– WriteConcern
– ReadPreference
– Batching
• Throughput
– Update/Write Patterns
– Reads/Queries
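The interaction between latency and batching listed above can be estimated with simple arithmetic. The sketch below assumes a single connection that waits for an acknowledgement after each round trip, and it ignores server processing time and bandwidth limits, so the result is only an upper bound. The function name is hypothetical.

```python
def max_sequential_ops_per_second(round_trip_ms: float, batch_size: int = 1) -> float:
    """Upper bound on acknowledged operations per second for one connection.

    Assumptions: each batch costs exactly one network round trip, and the
    client waits for the acknowledgement before sending the next batch.
    """
    if round_trip_ms <= 0 or batch_size < 1:
        raise ValueError("round_trip_ms must be > 0 and batch_size >= 1")
    round_trips_per_second = 1000.0 / round_trip_ms
    return round_trips_per_second * batch_size


if __name__ == "__main__":
    print(max_sequential_ops_per_second(2.0))       # one op per round trip
    print(max_sequential_ops_per_second(2.0, 100))  # 100 ops per round trip
```

A stricter WriteConcern or a more distant ReadPreference target tends to raise the effective round-trip time, while batching amortizes that cost over more operations.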
What is failure?
• We have failed at Capacity Planning when our resources don’t meet our requirements
• Because our requirements can have many dimensions, we may exceed our requirements in one characteristic but not meet them in another
• This means that we can spend many $$$ and still fail!
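The multi-dimensional nature of failure described above can be expressed as a per-dimension check. This is a minimal sketch: the dimension names and numbers are invented for illustration, and it only handles "higher is better" dimensions (a latency requirement would need the comparison reversed).

```python
def unmet_requirements(required: dict, provided: dict) -> list:
    """Return the sorted names of dimensions where provided < required.

    Assumption: every dimension is "higher is better"; a dimension missing
    from provided counts as zero.
    """
    return sorted(
        name for name, need in required.items()
        if provided.get(name, 0) < need
    )


if __name__ == "__main__":
    # Hypothetical numbers: storage is heavily over-provisioned, memory is short.
    required = {"iops": 2000, "memory_gb": 64, "cpu_cores": 8}
    provided = {"iops": 135000, "memory_gb": 32, "cpu_cores": 16}
    print(unmet_requirements(required, provided))
```

In this example the deployment exceeds its IOPS requirement many times over and still fails, because memory is the dimension that falls short.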
What about Legacy Hardware?
• Let’s hope whatever worked for this legacy technology also works for MongoDB
• Same principles of Capacity Planning still apply
When?
• Before it's too late!
• When?
Capacity Planning: When
Start → Launch → Version 2
Capacity Planning is Measurement
Measuring early gives you a comparison point for when you need to do it again
Velocity of Change
• Limitations -> takes time
– Data Movement
– Allocation/Provisioning (servers/mem/disk)
• Improvement
– Limit Size of Change (if you can)
– Increase Frequency
– MEASURE its effect
– Practice
Repeat (continuously)
• Repeat Testing
• Repeat Evaluations
• Repeat Deployment
How?
Monitoring
• Storage
• Memory
• CPU
• Network
• Application Metrics
Tools
• MMS (MongoDB Monitoring Service)
• MongoDB: mongotop, mongostat
• Linux: iostat, vmstat, sar, etc.
• Windows: Perfmon
Measure realistic loads (generated by Load testing)
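One way to turn monitoring output into numbers you can compare over time is to diff two snapshots of the `serverStatus` command. The sketch below assumes the snapshots are plain dictionaries containing the `opcounters` and `extra_info.page_faults` fields that `serverStatus` reports; how the snapshots are collected (for example with a driver or the shell) is left out, and the function name is hypothetical.

```python
# Operation types counted in the serverStatus "opcounters" document.
OP_TYPES = ("insert", "query", "update", "delete", "getmore", "command")


def rates_between(first: dict, second: dict, elapsed_seconds: float) -> dict:
    """Compute ops/s and page faults/s between two serverStatus-like snapshots.

    Assumption: counters only increase between the snapshots (no restart).
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")

    def total_ops(snapshot: dict) -> int:
        return sum(snapshot["opcounters"].get(op, 0) for op in OP_TYPES)

    ops_delta = total_ops(second) - total_ops(first)
    fault_delta = (second["extra_info"]["page_faults"]
                   - first["extra_info"]["page_faults"])
    return {
        "ops_per_second": ops_delta / elapsed_seconds,
        "page_faults_per_second": fault_delta / elapsed_seconds,
    }
```

Recording these rates during a load test gives the comparison point that later measurements can be checked against.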
Models
• Load/Users
– Response Time/TTFB
• System Performance
– Peak Usage
– Min Usage
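A model of this kind usually starts from a summary of measured samples. The sketch below reduces a list of measurements (response times or resource usage) to the minimum, the peak, and a 95th percentile using the nearest-rank method. The choice of the 95th percentile is an assumption for illustration; the slides name only peak and min usage.

```python
def summarize(samples):
    """Return min, peak, and nearest-rank 95th percentile of the samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank percentile: rank = ceil(0.95 * n), in integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "min": ordered[0],
        "peak": ordered[-1],
        "p95": ordered[rank - 1],
    }


if __name__ == "__main__":
    response_times_ms = [12, 15, 11, 240, 14, 13, 18, 16, 12, 17]
    print(summarize(response_times_ms))
```

Comparing the peak with the percentile shows whether the highs are rare outliers or a regular part of the load.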
Starter Questions
• What is the working set?
– How does that equate to memory?
– How much disk access will that require?
• How efficient are the queries?
• What is the rate of data change?
• How big are the highs and lows?
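The rate-of-change question above lends itself to a quick projection. This is a rough sketch that assumes linear growth at a measured daily rate; real data growth is rarely that steady, so treat the result as a first estimate. The function name and the numbers are hypothetical.

```python
def days_until_full(capacity_gb: float, used_gb: float,
                    growth_gb_per_day: float) -> float:
    """Estimate days until storage is full, assuming linear growth."""
    if growth_gb_per_day <= 0:
        return float("inf")  # not growing: never fills under this model
    remaining_gb = max(capacity_gb - used_gb, 0)
    return remaining_gb / growth_gb_per_day


if __name__ == "__main__":
    # Hypothetical deployment: 1 TB volume, 400 GB used, growing 5 GB per day.
    print(days_until_full(1000, 400, 5))
```

Because provisioning and data movement take time, the useful comparison is between this estimate and how long it takes to add capacity.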
Deployment Types
All of these use the same resources:
• Single Instance
• Multiple Instances (Replica Set)
• Cluster (Sharding)
• Data Centers
Questions?
Server Engineer, MongoDB
Shaun Verch
Thank You