Upload
morgan-tocker
View
444
Download
3
Embed Size (px)
DESCRIPTION
ZendCon 2014 Presentation
Citation preview
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
MySQL Performance Metrics That Matter
Morgan Tocker MySQL Community Manager October, 2014
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
3
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |4
Today’s Agenda
Introduction
From the Database
From the Operating System
From the Application
1
2
3
4
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Today’s Question Start With
• Will my database server be able to scale another 2x? • How about 10x?
• Is it under load? • When I run this benchmark on my macbook…
5
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
These are actually hard questions
• Number of active user sessions by 10x? • Database size by 10x? • Working set by 10x?
• Some query patterns will scale better than others. • Unfortunate answer: it depends
6
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
What questions can we answer?
• What can we measure that is useful to gauge performance?
• Let’s include scope as: • Inside MySQL • From Operating System • From your application
7
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Performance Phrasebook
• Response -‐ “seconds per query”. • Throughput -‐ “queries per second”. • Scalability -‐ how big can I make throughput?
Most benchmarks measure throughput. Users care about response time.
8
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Performance Consistency
• Response Time should be measured in a percentile-‐based score. • i.e. 99th percentile.
• Measuring throughput only does not show sudden stalls that users may experience. • e.g. an expensive item in cache expiring.
9
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 10
Today’s Agenda
Introduction
From the Database
From the Operating System
From the Application
1
2
3
4
2
1
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |11
Transactio
nSystem
Storage
Caching
SYS_TABLES
ibda
ta1
spac
e 0
Page Cache
A.ibd
B.ibd
C.ibd
IBUF_HEADERIBUF_TREETRX_SYS
FIRST_RSEGDICT_HDR
Data Dict.
SYS_COLUMNSSYS_INDEXESSYS_FIELDS
Block 1 (64 pages)Block 2 (64 pages)
iblogfile0 iblogfile1 iblogfile2
Tables withfile_per_tableDoublewrite Buffer
Buffe
r Poo
l Data Dictionary Cache
Adaptive Hash Indexes
Buffer Pool LRU
Additional Mem Pool
Log Buffer
Log
Gro
up
Buffer Pool Flush List
In Memory On Diskhttps://github.com/jeremycole/innodb_diagrams
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |12
InnoDBTransa
ctio
nSystem
Storage
Caching
SYS_TABLES
ibda
ta1
spac
e 0
Page Cache
A.ibd
B.ibd
C.ibd
IBUF_HEADERIBUF_TREETRX_SYS
FIRST_RSEGDICT_HDR
Data Dict.
SYS_COLUMNSSYS_INDEXESSYS_FIELDS
Block 1 (64 pages)Block 2 (64 pages)
iblogfile0 iblogfile1 iblogfile2
Tables withfile_per_tableDoublewrite Buffer
Buffe
r Poo
l Data Dictionary Cache
Adaptive Hash Indexes
Buffer Pool LRU
Additional Mem Pool
Log BufferLo
g G
roup
Buffer Pool Flush List
SELECT * FROM a WHERE id = 10;
mysqld
Not Found
Query -‐ Non Cached
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |13
InnoDBTransa
ctio
nSystem
Storage
Caching
SYS_TABLES
ibda
ta1
spac
e 0
Page Cache
A.ibd
B.ibd
C.ibd
IBUF_HEADERIBUF_TREETRX_SYS
FIRST_RSEGDICT_HDR
Data Dict.
SYS_COLUMNSSYS_INDEXESSYS_FIELDS
Block 1 (64 pages)Block 2 (64 pages)
iblogfile0 iblogfile1 iblogfile2
Tables withfile_per_tableDoublewrite Buffer
Buffe
r Poo
l Data Dictionary Cache
Adaptive Hash Indexes
Buffer Pool LRU
Additional Mem Pool
Log BufferLo
g G
roup
Buffer Pool Flush List
SELECT * FROM a WHERE id = 10;
mysqld
Query -‐ Cached
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |14
Update Query in a Transaction (simplified)
InnoDBTransa
ctio
nSystem
Storage
Caching
SYS_TABLES
ibda
ta1
spac
e 0
Page Cache
A.ibd
B.ibd
C.ibd
IBUF_HEADERIBUF_TREETRX_SYS
FIRST_RSEGDICT_HDR
Data Dict.
SYS_COLUMNSSYS_INDEXESSYS_FIELDS
Block 1 (64 pages)Block 2 (64 pages)
iblogfile0 iblogfile1 iblogfile2
Tables withfile_per_tableDoublewrite Buffer
Buffe
r Poo
l Data Dictionary Cache
Adaptive Hash Indexes
Buffer Pool LRU
Additional Mem Pool
Log Buffer
Log
Gro
up
Buffer Pool Flush List
UPDATE a SET col1 = ‘new’ WHERE id = 10;
mysqld
commit;
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Metrics to Measure
• Size of the buffer pool: innodb_buffer_pool_size • Amount of the buffer pool currently being used:
15
----------------------BUFFER POOL AND MEMORY----------------------Total large memory allocated 137494528Dictionary memory allocated 3129906Buffer pool size 8192Free buffers 1037Database pages 7153..5.59 reads/s, 341.58 creates/s, 378.21 writes/s
Quick Math:
8192 as 16K pages= 128M (default)
16MB free.
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Metrics to Measure
• If there are free buffers, and still reads per second, the caches are probably still warming up:
16
----------------------BUFFER POOL AND MEMORY----------------------Total large memory allocated 137494528Dictionary memory allocated 3129906Buffer pool size 8192Free buffers 1037Database pages 7153..5.59 reads/s, 341.58 creates/s, 378.21 writes/s
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Metrics to Measure
• If there are few free buffers, and a large number of reads, your working set is not fitting in memory:
17
----------------------BUFFER POOL AND MEMORY----------------------Total large memory allocated 137494528Dictionary memory allocated 3129906Buffer pool size 8192Free buffers 1037Database pages 7153..5.59 reads/s, 341.58 creates/s, 378.21 writes/s
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Metrics to Measure
• Modified (dirty) pages shows work that has been delayed to be completed later. innodb_max_dirty_pages_pct (default: 75) helps ensures this number does not get too high:
18
----------------------BUFFER POOL AND MEMORY----------------------Total large memory allocated 137494528Dictionary memory allocated 3129906Buffer pool size 8192..Modified db pages 32175.59 reads/s, 341.58 creates/s, 378.21 writes/s
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
InnoDB Page Churn
• 5.1 and below -‐ LRU. • 5.5 and above -‐ midpoint insertion.
• 5.6 and above -‐ midpoint insertion with default 1s wait before hotlist promotion.
19
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Reducing Buffer Pool Churn
• Optimize Queries
• Add Indexes / Remove Indexes
• Increase buffer pool size / add more RAM
• Transparent Page Compression
• Table Partitioning
• Normalize Table Schema
• Denormalize Table Schema
• Change Page Size
• Optimize Data Types
• Optimize Tables
20
.. By Improving Working Set Efficiency
Danger: Not all of these techniques work all of the time!
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
How much churn is too much?
• Rule of thumb is that if you need to access a page, you should be able to keep it in memory for 5 minutes.
• Thus, having to churn out pages any more frequently means too little memory.
21
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
You can get close to this number from MySQL…
• It is possible to track distinct pages accessed over 5 minutes.
22
http://www.tocker.ca/2013/05/31/estimating-‐mysqls-‐working-‐set-‐with-‐information_schema.html
mysql> call test.estimate_working_set(10, 30);..+----------------------+| pages_in_working_set |+----------------------+| 100679 |+----------------------+1 row in set (5 min 55.61 sec)
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Twice as Much Ram != Twice as Fast
• Production • 10G of RAM
• 100 IOPS Storage Device (single hard disk)
• Workload is 100% reads
• 10K queries/second
• QA • 9G of RAM
• 100 IOPS Storage Device (single hard disk)
• Same workload (100% reads)
• Can only do 1K queries/second!
23
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
InnoDB Log System
• Like a tank tread. • Concatenated log files cyclic write.
• Initial memory buffer, once per second + once per commit flush.
24
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Log Metrics that Matter
• Need to retain sufficient space • Head can not overwrite tail.
25
---LOG---Log sequence number 478969738Log flushed up to 478969738Pages flushed up to 418299472Last checkpoint at 4165427590 pending log flushes, 0 pending chkp writes5784 log i/o's done, 5.95 log i/o's/second
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Log Metrics that Matter
• These are byte counts: • 57.8MB bytes log space used here
• Compare to innodb_log_file_size*innodb_log_files_in_group (default:2) and innodb_adaptive_flushing_lwm (default: 10)
26
Log sequence number 478969738Log flushed up to 478969738Pages flushed up to 418299472..
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Increasing Background Work
• Specifying innodb_io_capacity, innodb_lru_scan_depth (mysql 5.6).
• Rule of thumb: • 7200 RPM Hard Drive = 100 IOPS • Faster Hard drive = 200 IOPS • SSD = 3000+ IOPS (consult manufacturer details)
27
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
InnoDB Throughput Number
• Pages read/second -‐ my favourite number to read, since it is most “raw” cost.
• Queries/second doesn’t show how many rows/pages need to be touched.
28
--------------ROW OPERATIONS--------------0 queries inside InnoDB, 0 queries in queue0 read views open inside InnoDBProcess ID=16185, Main thread ID=5137293312, state: sleepingNumber of rows inserted 2093751, updated 0, deleted 0, read 208873547462.57 inserts/s, 0.00 updates/s, 0.00 deletes/s, 47462.57 reads/s
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Additional Useful Metrics to Measure
• I like to look at working set size as percentage of data. • Table growth rate:
• http://www.percona.com/blog/2010/12/08/getting-‐history-‐of-‐table-‐sizes-‐in-‐mysql/
29
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
How slow is a slow query?
• When optimizing for response: • My first preference is to find slow queries from the application.
• Accounting for network for latency, any query can be slow (in aggregate).
30
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
How slow is a slow query? (cont.)
• When optimizing for throughput: • Performance Schema (5.6) SYS statement digest can show query performance for right now.
• Alternative: • The slow query log w/a zero second long query time • Capturing “a typical workload” (i.e. 20 minutes) • Aggregate and analyze.
31
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Surrogate Measures
• Temp tables on disk • Sort merge passes • etc.
32
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 33
Today’s Agenda
Introduction
From the Database
From the Operating System
From the Application
1
2
3
4
1
3
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
CPU Usage
• Is 50% CPU usage a good thing? How about 100%? • New versions use more CPUs. • It can be a good thing!
34
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Reason is Mutexes
• “Internal Locking” that every multi-‐threaded program requires to protect access to resources.
• Many mutexes refactored and impact reduced.
35
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Query Cache -‐ An Example
• Uses a global lock to protect cache • Only one person can insert into the cache at a time. • Now disabled by default. • Enabling it may reduce CPU usage, but look at throughput numbers • May perform worse.
36
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Mpstat
37
mpstat -P ALLLinux 2.6.32-100.28.5.el6.x86_64 (dev-db) 07/09/2011 _x86_64_ (4 CPU)
10:28:04 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle10:28:04 PM all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 99.9910:28:04 PM 0 0.01 0.00 0.01 0.01 0.00 0.00 0.00 0.00 99.9810:28:04 PM 1 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 99.9810:28:04 PM 2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.0010:28:04 PM 3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Memory Usage
• Understand difference between virtual and resident size:
38
[root@ip-10-183-136-169 mysql]# top -bn1top - 18:24:52 up 1:27, 1 user, load average: 0.73, 0.47, 0.22Tasks: 92 total, 1 running, 91 sleeping, 0 stopped, 0 zombieCpu(s): 0.0%us, 0.1%sy, 0.0%ni, 98.0%id, 0.6%wa, 0.0%hi, 0.0%si, 1.2%stMem: 70201328k total, 7684128k used, 62517200k free, 16356k buffersSwap: 0k total, 0k used, 0k free, 4233840k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND.. 18842 mysql 20 0 43.2g 2.9g 5832 S 0.0 4.3 0:09.86 mysqld..
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Memory Usage (cont.)
• After 40G buffer pool fills:
39
[root@ip-10-183-136-169 ~]# top -bn1 | grep 'PID\|mysqld' PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND18842 mysql 20 0 43.3g 42g 6336 S 65.4 64.1 20:53.31 mysqld18611 root 20 0 105m 1464 1172 S 0.0 0.0 0:00.03 mysqld_safe
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
IO Usage
• iostat is the canonical tool. • Not easy to read -‐ 100% utilized may be very healthy.
40
$ iostat -xLinux 2.6.32-100.28.5.el6.x86_64 (dev-db) 07/09/2011
avg-cpu: %user %nice %system %iowait %steal %idle 5.68 0.00 0.52 2.03 0.00 91.76
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %utilsda 27.86 63.53 61.77 132.91 1096.46 1598.40 13.84 0.21 1.06 2.28 44.45sda1 0.69 33.22 48.54 129.63 773.30 1328.84 11.80 1.39 7.82 2.28 40.57sda2 27.16 30.32 13.23 3.28 323.13 269.56 35.90 0.55 32.96 3.44 5.68sdb 39.15 215.16 202.20 169.04 945.80 1073.13 5.44 1.05 2.78 1.64 60.91sdb1 39.15 215.16 202.20 169.04 945.77 1073.13 5.44 1.05 2.78 1.64 60.91sdc 8.90 3.63 356.56 51.40 207.01 972.24 2.89 1.04 2.56 1.55 63.30sdc1 8.90 3.63 356.55 51.40 206.99 972.24 2.89 1.04 2.56 1.55 63.30
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 41
Today’s Agenda
Introduction
From the Database
From the Operating System
From the Application
1
2
3
4
1
4
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
My favourite statistics
• Goal for the page. • Something you set. i.e. 50ms.
• Number of queries • Total time spend in MySQL • Break down of time per query with query text.
42
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Why?
• Much easier to optimize around a task that is too slow, than a complete system.i.e. • “logins are slow”: Look at the breakdown of time, fix the part that takes the most time for the least value.
• “the database is slow”: Look at slow queries, and potentially optimize some queries part of background tasks.
43
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Rule of Thumb
• Per Query Network Latency between 0.1ms and 1ms. • Higher (maybe 3ms) if using multiple AZs etc.
44
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Query Math On Example Table
• Retrieving all 239 rows in one query: 1.14 ms • With network latency: 2.14ms
• Retrieving each row at a time: 0.239ms • With network latency: 296ms
45
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Back to Today’s QuestionIt started with “it depends”
46
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Will my database server be able to scale another 10x?• Working set:
• If you can project it to stay in memory, you have the most predictable growth curve.
• Database size: • Not always an issue in itself, but certain operations become harder.
• i.e. mysqldump, time it takes to perform DDL.
47
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Will my database server be able to scale another 10x? (cont.)• Active Sessions:
• You may start experiencing deadlocks and other soft-‐errors.
• Your application is expected to handle these. • Query Volume:
• MySQL 5.7 is up to 600K QPS :) • The server itself scales very well.
48
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Hidden Growth Cliffs
• Watch out for query patterns that are non scalable over time. e.g. • A time series database that recalculates statistics from the beginning of time.
• On day 1 when it is built, it will be fast. • On day 500, it’s ~500x more complex.
49
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |50