MongoUK - Approaching 1 billion documents with MongoDB

Presentation given at the MongoUK conference on 18th June 2010 by David Mytton on approaching 1 billion documents in MongoDB.

Approaching 1 Billion Documents in MongoDB

David Mytton

david@boxedice.com / @davidmytton

1/30

Server Density Monitoring

Processing Database UI

www.serverdensity.com

2/30

Cache / Data Store

Postback

Latest checks

Historical checks

3/30

db.stats()

Documents 937,393,315

Collections 27,566

Indexes 45,277

Stored data 638GB

Inserts 5000-8000/s

As of 17th Jun 2010.

4/30

13 months ago

Why we moved: http://bit.ly/mysqltomongo

5/30

Initial Setup

Master (DC1)

8GB RAM

Slave (DC2)

8GB RAM

Replication

6/30

Vertical Scaling

Master (DC1)

72GB RAM

Slave (DC2)

8GB RAM

Replication

7/30

Tip #1

Keep your indexes in memory at all times.

db.stats()

8/30
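One way to act on this tip is to compare the index size reported by `db.stats()` with the RAM available on the server. The sketch below is plain JavaScript, not part of MongoDB: the helper name `indexesFitInMemory`, the `headroom` parameter and the sample figures are assumptions for illustration, and it assumes `indexSize` is reported in bytes.

```javascript
// Hypothetical helper: does the total index size fit in RAM with some headroom?
// `stats` is an object shaped like the output of db.stats(), with
// indexSize given in bytes (an assumption about the reporting scale).
function indexesFitInMemory(stats, ramBytes, headroom = 0.8) {
  // Reserve part of RAM for the OS and for data pages (assumed 20% here).
  return stats.indexSize <= ramBytes * headroom;
}

const GiB = 1024 ** 3;

// Sample figures invented for illustration only.
const sampleStats = { indexSize: 50 * GiB };

console.log(indexesFitInMemory(sampleStats, 72 * GiB)); // true
console.log(indexesFitInMemory(sampleStats, 8 * GiB));  // false
```

In the mongo shell the same check could be fed with the real `db.stats()` result instead of `sampleStats`.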

i/o not an issue

9/30

Tip #2

Data is flushed to disk every 60s.

db.runCommand({fsync:1});

--syncdelay [60]

10/30
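To put the 60 second flush interval in perspective, the arithmetic below estimates how many inserts can accumulate between two flushes at the insert rates quoted earlier (5000-8000/s). It is a rough sketch that assumes a constant insert rate and no other durability mechanism; `docsPerFlushWindow` is an illustrative name, not a MongoDB function.

```javascript
// Upper bound on documents written since the last flush, assuming a
// constant insert rate and a full syncdelay interval between flushes.
function docsPerFlushWindow(insertsPerSecond, syncDelaySeconds = 60) {
  return insertsPerSecond * syncDelaySeconds;
}

console.log(docsPerFlushWindow(5000)); // 300000
console.log(docsPerFlushWindow(8000)); // 480000
```

Lowering `--syncdelay` or issuing an explicit `fsync` shrinks this window at the cost of more frequent disk writes.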

Sharding solves everything

11/30

Manual Partitioning

Master A (DC1)

16GB RAM

Slave A (DC2)

16GB RAM

Replication

Master B (DC1)

16GB RAM

Slave B (DC2)

16GB RAM

Replication

12/30
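With manual partitioning, the application has to decide which master holds a given piece of data. The slides do not say how that decision was made, so the sketch below is purely an assumption: it places accounts on partitions by a modulo of a numeric account ID. The host names, `PARTITIONS` and `partitionFor` are hypothetical.

```javascript
// Hypothetical routing table: two independent master/slave pairs (A and B).
const PARTITIONS = ["master-a.example.com", "master-b.example.com"];

// Map a numeric account ID to one partition. Modulo placement is an
// assumption for illustration; the talk does not describe the real scheme.
function partitionFor(accountId) {
  if (!Number.isInteger(accountId) || accountId < 0) {
    throw new RangeError("accountId must be a non-negative integer");
  }
  return PARTITIONS[accountId % PARTITIONS.length];
}

console.log(partitionFor(42)); // master-a.example.com
console.log(partitionFor(7));  // master-b.example.com
```

A fixed modulo keeps routing trivial, but adding a third pair would move existing accounts, which is one reason a lookup table per account is a common alternative.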

Sustained Traffic

Avg out: 2.4Mbit/s

Avg in: 3.8Mbit/s

Master

Avg out: 4.0Mbit/s

Avg in: 111.2Kbit/s

Slave

13/30

Database vs collections

• Many databases = many data files (small but quickly get large).

• Many collections = watch namespace limit.

14/30

Namespaces = Number of collections + number of indexes

15/30

Tip #3

Monitor the 24,000 namespace limit.

16/30
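The namespace formula above lends itself to a simple headroom check. The following sketch is plain JavaScript with invented sample counts; `namespaceUsage`, `shouldWarn` and the 80% threshold are assumptions for illustration, while the 24,000 limit is the figure quoted in the talk.

```javascript
const NAMESPACE_LIMIT = 24000; // limit quoted in the talk

// Namespaces = number of collections + number of indexes.
function namespaceUsage(collections, indexes, limit = NAMESPACE_LIMIT) {
  const used = collections + indexes;
  return { used, remaining: limit - used, fractionUsed: used / limit };
}

// Hypothetical alert rule: warn once 80% of the limit is consumed.
function shouldWarn(usage, threshold = 0.8) {
  return usage.fractionUsed >= threshold;
}

const usage = namespaceUsage(8000, 12000); // invented sample counts
console.log(usage.used);        // 20000
console.log(usage.remaining);   // 4000
console.log(shouldWarn(usage)); // true
```

In practice the `used` figure could be read directly from `db.system.namespaces.count()` rather than computed from separate counts.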

Using Server Density

17/30

Console

db.system.namespaces.count()

18/30

Replica Pairs = Failover

Master A (DC1)

16GB RAM

Slave A (DC2)

16GB RAM

Replica Pair

Master B (DC1)

16GB RAM

Slave B (DC2)

16GB RAM

Replica Pair

19/30

Tip #4

Pre-provision your oplog files.

20/30

for i in {0..40}
do
    echo $i
    head -c 2146435072 /dev/zero > local.$i
done

A shell script to generate 75GB oplog files

21/30
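As a sanity check on the script, the arithmetic below works out how much disk the loop allocates when taken literally: files `local.0` through `local.40`, each 2146435072 bytes. The result comes out somewhat above the 75GB in the caption, so the caption may be approximate or the intended file count may differ; the slide does not say which.

```javascript
// Figures taken from the loop: files local.0 .. local.40,
// each created with head -c 2146435072.
const fileCount = 40 - 0 + 1;    // 41 files
const bytesPerFile = 2146435072; // 2047 MiB per file
const totalBytes = fileCount * bytesPerFile;

console.log(totalBytes);                          // 88003837952
console.log((totalBytes / 1024 ** 3).toFixed(2)); // 81.96 (GiB)
console.log((totalBytes / 1e9).toFixed(2));       // 88.00 (GB)
```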

Tip #5

Expect slower performance during initial replica sync.

22/30

Tip #6

You can rotate your log files from the console.

23/30

Rotating your log files

db.runCommand("logRotate")

24/30

Tip #7

Index creation blocks by default. Use background indexing if necessary.

MongoDB Manual: http://bit.ly/mongobgindex

25/30
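A background build is requested by passing the `background` option when creating the index. The sketch below wraps the shell's `ensureIndex` helper (the name used by shells of that era; later versions call it `createIndex`). The wrapper name `ensureIndexInBackground` and the collection and field names in the usage comment are hypothetical.

```javascript
// Thin wrapper around a collection's ensureIndex helper that always
// requests a background build instead of the default blocking one.
function ensureIndexInBackground(collection, keys) {
  return collection.ensureIndex(keys, { background: true });
}

// Possible use in the mongo shell (hypothetical collection and field):
//   ensureIndexInBackground(db.checks, { accountId: 1 });
```

Background builds take longer than blocking ones, so they suit indexes added to a live system rather than to an empty collection.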

Tip #8

Increase your OS file descriptor limit + use persistent connections.

26/30

Too many open files!

/etc/security/limits.conf (user, type, limit):

mongo hard nofile 10000
mongo soft nofile 10000

/etc/ssh/sshd_config:

UsePAM yes

27/30
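Each entry in `/etc/security/limits.conf` has four whitespace-separated fields: domain (here the user), type, item and value. The sketch below splits one such entry into named fields to make the layout explicit; `parseLimitsLine` is an illustrative helper, not an existing tool.

```javascript
// Split one limits.conf entry into its four fields:
//   <domain> <type> <item> <value>
function parseLimitsLine(line) {
  const [domain, type, item, value] = line.trim().split(/\s+/);
  // Numeric values become numbers; keywords such as "unlimited" stay strings.
  return { domain, type, item, value: /^\d+$/.test(value) ? Number(value) : value };
}

const entry = parseLimitsLine("mongo hard nofile 10000");
console.log(entry.domain, entry.type, entry.item, entry.value); // mongo hard nofile 10000
```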

Space is not reused

Data + indexes 551GB

Actual disk usage 638GB

Fixed in

1.1.4 1.3.x 1.5.0 1.5.1 1.5.2 1.5.3 1.5.4?

JIRA: SERVER-366

28/30
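The two figures on this slide give the amount of disk that is allocated but not accounted for by data and indexes. The arithmetic below uses only the numbers quoted above; the variable names are illustrative.

```javascript
// Figures from the slide, in GB.
const dataPlusIndexesGB = 551;
const diskUsageGB = 638;

const unaccountedGB = diskUsageGB - dataPlusIndexesGB; // 87
const unaccountedShare = unaccountedGB / diskUsageGB;  // about 0.136

console.log(unaccountedGB);                                // 87
console.log((unaccountedShare * 100).toFixed(1) + "%");    // 13.6%
```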

Summary

1. Keep indexes in memory.

2. Data is flushed to disk every 60s.

3. Monitor the 24k namespace limit.

4. Pre-provision oplog files.

5. Expect slower performance on replica sync.

6. Rotate logs from the console.

7. Index creation blocks by default.

8. OS file descriptor limit + persistent connections.

29/30

David Mytton

david@boxedice.com / @davidmytton

Slides

blog.boxedice.com/mongodb

30/30
