Mythbusting: Understanding How We Measure the Performance of MongoDB
- 1. Mythbusting: Understanding How We Measure the Performance of
MongoDB. Alvin Richards, Senior Director of Performance Engineering,
MongoDB. #MongoDBWorld
- 2. Before we start: we are going to look at a lot of C++ kernel
code, Java benchmarks, JavaScript tests, and lots of charts.
- 3. Measuring "Performance"
https://www.youtube.com/watch?v=7wm-pZp_mi0
- 4. Benchmarking: some common traps; performance measurement &
diagnosis; what's next.
- 5. Part One: Some Common Traps
- 6. The Milk Train Doesn't Stop Here Anymore Tennessee Williams
"We all live in a house on fire, no fire department to call; no way
out, just the upstairs window to look out of while the fire burns
the house down with us trapped, locked in it."
- 7. #1 Time taken to insert x documents: long startTime =
System.currentTimeMillis(); for (int roundNum = 0; roundNum <
numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert;
i++) { id++; BasicDBObject doc = new BasicDBObject();
doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1);
String cVal = ""; doc.put("c",cVal); String padVal = "";
doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs);
numInserts += documentsPerInsert;
globalInserts.addAndGet(documentsPerInsert); } long endTime =
System.currentTimeMillis();
- 12. (Same code as slide 7.) So that looks OK, right?
- 17. (Same code as slide 7.) What else are you measuring? Object
creation and GC management? Thread contention on nextInt()? Time to
synthesize data? Thread contention on addAndGet()? Clock
resolution?
- 18. Solution: pre-create the objects. Pre-create non-varying data
outside the timing loop. Alternative: pre-create the data in a file
and load it from the file. /* Pre-create the objects outside the
loop */ BasicDBObject[] aDocs = new
BasicDBObject[documentsPerInsert]; for (int i=0; i <
documentsPerInsert; i++) { BasicDBObject doc = new BasicDBObject();
String cVal = ""; doc.put("c",cVal); String padVal = "";
doc.put("pad",padVal); aDocs[i] = doc; }
- 20. Solution: remove contention. Remove contention on nextInt() by
making the generator thread-local; remove contention on addAndGet()
by maintaining the count outside the loop. /* Use ThreadLocalRandom
or an instance of java.util.Random per thread */
java.util.concurrent.ThreadLocalRandom rand =
java.util.concurrent.ThreadLocalRandom.current(); for (long roundNum
= 0; roundNum < numRounds; roundNum++) { for (int i = 0; i <
documentsPerInsert; i++) { id++; doc = aDocs[i]; doc.put("_id",id);
doc.put("k", rand.nextInt(numMaxInserts)+1); } coll.insert(aDocs);
numInserts += documentsPerInsert; } /* Maintain count outside the
loop; roundNum is out of scope here, so use numRounds */
globalInserts.addAndGet(documentsPerInsert * numRounds);
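The three fixes (pre-created documents, a thread-local random generator, and one counter update per worker) can be combined in a single runnable sketch. This is an illustration under assumptions, not the benchmark from the talk: documents are modeled as plain maps, and the `sink` parameter is a hypothetical stand-in for `coll.insert(aDocs)` so the code runs without a MongoDB driver.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

final class InsertLoopSketch {
    // Shared counter, updated once per worker rather than once per batch.
    static final AtomicLong globalInserts = new AtomicLong();

    /**
     * Sends numRounds batches to the sink (hypothetical stand-in for
     * coll.insert) and returns the elapsed time in nanoseconds.
     */
    static long runWorker(int numRounds, int documentsPerInsert, int numMaxInserts,
                          Consumer<List<Map<String, Object>>> sink) {
        // Pre-create the non-varying fields before the timed region starts.
        List<Map<String, Object>> docs = new ArrayList<>(documentsPerInsert);
        for (int i = 0; i < documentsPerInsert; i++) {
            Map<String, Object> doc = new HashMap<>();
            doc.put("c", "");
            doc.put("pad", "");
            docs.add(doc);
        }

        long id = 0;
        long numInserts = 0;
        long start = System.nanoTime();
        for (int round = 0; round < numRounds; round++) {
            for (Map<String, Object> doc : docs) {
                // Only the varying fields are touched inside the timed loop.
                doc.put("_id", ++id);
                doc.put("k", ThreadLocalRandom.current().nextInt(numMaxInserts) + 1);
            }
            sink.accept(docs);
            numInserts += documentsPerInsert;
        }
        long elapsedNanos = System.nanoTime() - start;

        // One contended update per worker, outside the timed region.
        globalInserts.addAndGet(numInserts);
        return elapsedNanos;
    }

    public static void main(String[] args) {
        long nanos = runWorker(100, 10, 1000, batch -> { });
        System.out.println("inserted=" + globalInserts.get() + " nanos=" + nanos);
    }
}
```

With several workers, each one would also need its own `_id` range; the sketch leaves that out.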
- 21. Solution: timer resolution. Before: long startTime =
System.currentTimeMillis(); long endTime =
System.currentTimeMillis(); After: long startTime =
System.nanoTime(); long elapsedNanos = System.nanoTime() -
startTime; On nanoTime(): "resolution is at least as good as that of
currentTimeMillis()". On currentTimeMillis(): "granularity of the
value depends on the underlying operating system and may be larger".
Source:
http://docs.oracle.com/javase/7/docs/api/java/lang/System.html
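To see the granularity point on a given machine, a small probe like the following may help. It is a sketch only: the class and method names are made up for illustration, and the result depends on the OS and JVM.

```java
final class TimerSketch {
    /**
     * Spins until currentTimeMillis() changes and returns the size of that
     * step in milliseconds. The step is a rough view of the clock's
     * granularity; a wall-clock adjustment during the probe would distort it.
     */
    static long observedMillisStep() {
        long first = System.currentTimeMillis();
        long next;
        do {
            next = System.currentTimeMillis();
        } while (next == first);
        return next - first;
    }

    /** Times a task with nanoTime(), which is meant for measuring elapsed time. */
    static long elapsedNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        System.out.println("currentTimeMillis step (ms): " + observedMillisStep());
        System.out.println("empty task (ns): " + elapsedNanos(() -> { }));
    }
}
```

If the reported step is larger than 1 ms, any operation shorter than that step can appear to take zero time when measured with `currentTimeMillis()`.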
- 22. General Principle #1: Know what you are measuring
- 23. #2 Response time to return all results: BasicDBObject doc =
new BasicDBObject(); doc.put("v", str); /* str is a 2k string */
for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); }
BasicDBObject predicate = new BasicDBObject(); long startTime =
System.currentTimeMillis(); DBCursor cur = coll.find(predicate);
DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); }
long endTime = System.currentTimeMillis();
- 27. (Same code as slide 23.) So that looks OK, right?
- 30. (Same code as slide 23.) What else are you measuring?
Unrestricted predicate? Each doc is 4080 bytes on disk with
powerOf2Sizes. Measuring: time to parse & execute the query, and
time to retrieve all documents. But also: the cost of shipping ~4MB
of data through the network stack.
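The ~4MB figure follows from the numbers on the slide: 1000 documents at 4080 bytes each is 4,080,000 bytes. The sketch below just spells out that arithmetic. It takes the slide's per-document size at face value, and the class and method names are illustrative.

```java
final class TransferEstimate {
    /** Total payload for a full scan: document count times per-document size. */
    static long bytesShipped(long documents, long bytesPerDocument) {
        return Math.multiplyExact(documents, bytesPerDocument);
    }

    public static void main(String[] args) {
        // Figures from the slide: 1000 documents, 4080 bytes each.
        long total = bytesShipped(1000, 4080);
        System.out.println(total + " bytes, roughly " + (total / 1_000_000) + " MB");
    }
}
```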
- 33. Solution: return a fixed range and limit the projection. Only
project _id. Only 46k transferred through the network stack.
BasicDBObject predicate = new BasicDBObject(); predicate.put("_id",
new BasicDBObject("$gte", 10).append("$lte", 20)); BasicDBObject
projection = new BasicDBObject(); projection.put("_id", 1); long
startTime = System.currentTimeMillis(); DBCursor cur =
coll.find(predicate, projection); DBObject foundObj; while
(cur.hasNext()) { foundObj = cur.next(); } long endTime =
System.currentTimeMillis();
- 34. General Principle #2: Measure only what you need to measure
- 35. Part Two: Performance Measurement & Diagnosis
- 36. The Physical Principles of the Quantum Theory (1930) Werner
Heisenberg "Every experiment destroys some of the knowledge of the
system which was obtained by previous experiments."
- 37. Broad categories: micro benchmarks; workloads
- 38. Micro benchmarks: mongo-perf
- 39. mongo-perf goals. Measure: commands. Configure: single mongod,
ReplSet size (1 -> n), sharding; single vs. multiple DB; O/S.
Characterize: throughput by thread count. Compare.
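"Throughput by thread count" can be illustrated with a small stdlib-only harness. This is not how mongo-perf is implemented; it is a sketch under assumptions in which the measured operation is a hypothetical stand-in (a counter increment) for a database command.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.LongAdder;

final class ThroughputByThreads {
    /** Runs op opsPerThread times on each of the given threads; returns ops/sec. */
    static double measure(int threads, int opsPerThread, Runnable op) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch startGate = new CountDownLatch(1);
        List<Future<?>> workers = new ArrayList<>();
        try {
            for (int t = 0; t < threads; t++) {
                workers.add(pool.submit(() -> {
                    startGate.await();   // all workers start together
                    for (int i = 0; i < opsPerThread; i++) {
                        op.run();
                    }
                    return null;
                }));
            }
            long start = System.nanoTime();
            startGate.countDown();
            for (Future<?> worker : workers) {
                worker.get();            // wait for every worker to finish
            }
            long elapsedNanos = Math.max(1, System.nanoTime() - start);
            return (double) threads * opsPerThread * 1e9 / elapsedNanos;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        LongAdder counter = new LongAdder();   // stand-in for a database operation
        for (int threads : new int[] {1, 2, 4, 8}) {
            double opsPerSec = measure(threads, 100_000, counter::increment);
            System.out.println(threads + " threads: " + Math.round(opsPerSec) + " ops/sec");
        }
    }
}
```

Thread creation happens before the clock starts, so the timed region covers only the operations themselves, in line with principle #1.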
- 41. What do you get? Measured improvement between rc0 and rc2.
[Chart, labeled "Better"]
- 42. Benchmark source code: tests.push( { name:
"Commands.CountsIntIDRange", pre: function( collection ) {
collection.drop(); for ( var i = 0; i < 1000; i++ ) {
collection.insert( { _id : i } ); }
collection.getDB().getLastError(); }, ops: [ { op: "command", ns :
"testdb", command : { count : "mycollection", query : { _id : {
"$gt" : 10, "$lt" : 100 } } } } ] } );
- 46. Code Change
- 47. Workloads. "Public" workloads: YCSB, Sysbench. "Real world"
simulations: inbox fan in/out, message stores, content management.
- 48. Example: bulk load performance, 16m documents. 55% degradation,
2.6.0-rc1 vs 2.4.10. [Chart, labeled "Better"]
- 49. Ouch, where's the tree in the woods? 2.4.10 -> 2.6.0: 4495
git commits
- 50. git-bisect: bisect between good/bad hashes; git-bisect
nominates a new githash; build against the githash; re-run the test;
confirm whether this githash is good/bad; rinse and repeat.
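The loop on the slide is a binary search over the commit history, which is why 4495 commits need only about a dozen build-and-test cycles. The sketch below shows the idea on an in-memory list. It is not git's implementation: it assumes a linear history in which every commit after the first bad one is also bad, and `isGood` is a hypothetical stand-in for "build and re-run the test".

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class BisectSketch {
    /**
     * commits is ordered oldest to newest; the first entry is known good and
     * the last is known bad. Returns the index of the first bad commit.
     */
    static <T> int firstBad(List<T> commits, Predicate<T> isGood) {
        int good = 0;
        int bad = commits.size() - 1;
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2;   // the nominated commit
            if (isGood.test(commits.get(mid))) {
                good = mid;
            } else {
                bad = mid;
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Index 0 is the known-good release; 4495 commits follow it.
        List<Integer> commits =
                IntStream.rangeClosed(0, 4495).boxed().collect(Collectors.toList());
        int hypotheticalCulprit = 3000;          // made-up position of the regression
        int[] testRuns = {0};
        int found = firstBad(commits, c -> {
            testRuns[0]++;                       // each call models one build + test
            return c < hypotheticalCulprit;
        });
        System.out.println("first bad index: " + found + ", test runs: " + testRuns[0]);
    }
}
```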
- 51. Code Change - Bad Githash
- 52. Code Change - Fix
- 53. Bulk load performance, fix: 11% improvement, 2.6.1 vs 2.4.10.
[Chart, labeled "Better"]
- 54. The problem with measurement. Observability: what can you
observe on the system? Effect: what effects can an observation
cause?
- 55. mtools
- 56. mtools: MongoDB log file analysis. Filter logs for operations
and events; response time, lock durations; plot.
https://github.com/rueckstiess/mtools
- 57. Response times > 100ms, bulk insert, 2.6.0-rc0. [Plot: Ops/Sec
over Time]
- 58. Response times > 100ms, bulk insert, 2.6.0-rc0 vs. 2.6.0-rc2.
Floor raised.
- 59. Code Change: Yielding Policy
- 60. Code Change
- 61. Response times, bulk insert, 2.6.0 vs 2.6.1. Ceiling similar;
lower floor, resulting in a 40% improvement in throughput.
- 62. Secondary effects of the yield policy change: write lock time
reduced. Order-of-magnitude reduction in write lock duration.
- 63. Unexpected side effects of measurement? > db.serverStatus():
yes, will cause a read lock to be acquired. >
db.serverStatus({recordStats:0}): no, a lock is not acquired. >
mongostat: yes; until SERVER-14008 is resolved, it uses
db.serverStatus().
- 64. CPU sampling: get an impression of call graphs and of the CPU
time spent on a node and its called nodes.
- 65. Setup & building with google-profiler: > sudo apt-get install
google-perftools > sudo apt-get install libunwind7-dev > scons
--use-cpu-profiler mongod
- 66. Start the profiling: > mongod --dbpath <dbpath> (Note: do not
use fork) > mongo > use admin > db.runCommand({_cpuProfilerStart:
{profileFilename: 'foo.prof'}}) Execute some commands that you want
to profile. > db.runCommand({_cpuProfilerStop: 1})
- 67. Sample start vs. end of workload
- 68. Sample start vs. end of workload
- 69. Code change
- 70. Public benchmarks: not all forks are the same. YCSB:
https://github.com/achille/YCSB sysbench-mongodb:
https://github.com/mdcallag/sysbench-mongodb
- 71. Part Three: And Next?
- 72. Beavis & Butthead "The future sucks. Change it." "I'm
way cool Beavis, but I cannot change the future."
- 73. What we are working on. mongo-perf: UI refactor; adding more
micro benchmarks. Workloads: adding external benchmarks; creating
benchmarks for common use cases (inbox fan in/out, analytical
dashboards, stream / feeds). Customers, partners & community.
- 74. Here's how you can help change the future! Got a great
workload? Great benchmark? Want to donate it?
alvin@mongodb.com
- 75. Don't be that benchmark. #1 Know what you are measuring. #2
Measure only what you need to measure.
- 76. Thank you. Alvin Richards, Senior Director of Performance
Engineering, MongoDB. alvin@mongodb.com #MongoDBWorld