MYSQL PERFORMANCE TUNING - files.meetup.com Performance Tuning Chicago Meetup...• MariaDB –...

MYSQL PERFORMANCE TUNING

Arnaud Adant 18th of July 2016

WHO AM I

• aadant@jumptrading.com
• 3.5 years @ Oracle MySQL as support engineer
  – Filed hundreds of bugs and feature requests
    • InnoDB
    • Optimizer
    • High availability
    • Performance
• Joined Jump Trading in 2015

AGENDA

• Introduction
• Mostly performance tuning tips
  • This presentation is an update of the "50 tips" presentation given while at Oracle, for MySQL 5.6
• Questions

HOW TO PRONOUNCE MYSQL

• MySQL is officially pronounced /maɪˌɛskjuːˈɛl/ ("My S-Q-L")

RTFM: The official way to pronounce “MySQL” is “My Ess Que Ell” (not “my sequel”), but we do not mind if you pronounce it as “my sequel” or in some other localized way.

A BIT OF HISTORY

• MySQL was created by Michael “Monty” Widenius, a Swede living in Finland
• Monty has 3 children:
  – Maria => MariaDB
  – My => MySQL
  – Max => MaxDB
• MySQL started in 1995
• Now one of the most successful open source databases

AT THE HEART OF INTERNET

MYSQL FLAVORS

• Oracle MySQL
• Percona
• MariaDB
• Webscale, Facebook, Taobao, Twitter, Alibaba
• Amazon Aurora
• Google Cloud SQL
• … Drizzle (dead)

MYSQL FLAVORS

• Oracle MySQL
  – Oracle has proved to be a good steward since 2010
  – Introduced best practices in development
  – Good support
  – Invested in R&D, QA, InnoDB …
  – Performance_schema
  – Crash safe GTID replication
  – JSON support
  – Worked hard on scalability

MYSQL FLAVORS

• MariaDB
  – Founded in 2009 by … Monty
  – A truly disruptive MySQL fork
  – GPL was not enough!
  – “Drop in” compatibility
  – Provides support and consulting
  – Different implementation of GTID
  – MaxScale, integrates Galera, Spider, column store
  – More features, will integrate anything GPL
  – Fast paced, QA is more challenging

MYSQL FLAVORS

• Percona
  – A branch of the main trunk
  – Founded in 2006, major player in consulting and support
  – Excellent support
  – Focus on performance and stability (additional QA)
  – Percona tools, Percona Server, xtrabackup
  – Works closely with Oracle and adds performance features on top
• Helps innovation

MYSQL FLAVORS

• Google, Facebook, Twitter, LinkedIn, Taobao, Alibaba
  – All major contributors
  – Give back to the community
  – WebscaleSQL
  – Facebook is pushing RocksDB
    • LSM tree based engine
    • Linkbench
    • Online schema changes
    • Blogging, influencing

MARIADB / ORACLE VERSIONS

Version  Original release date  Latest version  Release date  Status
5.1      2009-10-29             5.1.67          2013-01-30    Stable (GA)
5.2      2010-04-10             5.2.14          2013-01-30    Stable (GA)
5.3      2011-07-26             5.3.12          2013-01-30    Stable (GA)
5.5      2012-02-25             5.5.49          2016-04-22    Stable (GA)
10.0     2012-11-12             10.0.25         2016-04-30    Stable (GA)
10.1     2014-06-30             10.1.14         2016-05-10    Stable (GA)
10.2     2016-04-18             10.2.0          2016-04-18    Alpha

MariaDB 5.5 maps to MySQL 5.5; 10.0 and 10.1 map to 5.6; 10.2 maps to 5.7. MySQL 5.8 is to become MySQL 8.0.

Source: Wikipedia

TOP PERFORMANCE ISSUES

• Bad SQL queries: 90 % of the time
• Long running idle transactions
  • high history_list_length
  • MDL
• Replication lag
  • no primary key and row based replication
• Wrong / bad configuration
  • small buffer pool, query cache, small redo logs
• OS, hardware

PERFORMANCE CHECK LIST

• #1 Monitor the queries …
  • tune the apps
• #2 Monitor replication
  • tune replication
• #3 Monitor the status variables
  • tune the config
• #4 Monitor the OS
  • tune the OS
• #5 When in doubt, benchmark

THE APP (DEFINITION)

• Everything that is not shipped with the database
  • schema
  • queries
  • user connections
  • the app data in data files
  • …

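TUNING PROCESS

[Diagram: a cycle of Config / design, Monitor / measure, and Tune. Config / design covers the OS, config variables, app design, data files, and benchmark; Monitor / measure covers the app, status variables, replication, mysqld, backup, and the OS; Tune covers the app, config variables, and the OS.]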

CHANGE PROCESS

• One change at a time, test on dev = prod, then deploy

[Diagram: Development (Change, Test, Monitor, OK), then deploy to Production (Monitor)]

TUNING PROCESS

1. Config
   • OS
   • Config variables
   • App design
   • Benchmark
2. Monitor
   • App / status variables / replication / mysqld / OS
3. Tune using the change process
   • App
   • Config variables
   • Table spaces
   • OS

CONFIG / OS 1/2

• RAM
• CPU
• Storage
• OS
• OS limits
• Battery backed disk cache
• Memory allocator
• CPU affinity

CONFIG / OS 2/2

• I/O scheduler
• File system
• Mount options
• Disk configuration
• NUMA

RAM

• The active data should fit in the buffer pool
• MySQL connections and caches take memory
• ECC RAM recommended
• Extra RAM for
  • FS cache
  • Monitoring
  • RAM disk (tmpfs)
• RAM is split over NUMA nodes

CPU

• Fast CPU for single threaded performance
• Recent servers have 32 to 80 cores
• Enable hyper-threading
• MySQL 5.7 scales to 64 cores

CPU

http://dimitrik.free.fr/blog/archives/2016/02/mysql-performance-scalability-on-oltp_rw-benchmark-with-mysql-57.html

STORAGE

• Good for IO bound loads
• HDD for sequential reads and writes
• Bus-attached SSD for random reads and writes
  • NVMe: Non-Volatile Memory Express
• Big SATA or other disk for log files
• Several disks!
• Lifetime of low-end SSDs is a concern

OPERATING SYSTEM

• Linux!
  • pick a modern distribution
• also works well on Windows, FreeBSD, macOS, Solaris

OS LIMITS (LINUX)

• Max open files per process
  • ulimit -n
  • limits the number of file handles (connections, open tables, …)
• Max threads per user
  • ulimit -u
  • limits the number of threads (connections, event scheduler, shutdown)

BATTERY BACKED DISK CACHE

• Usually faster fsyncs
  • InnoDB redo logs
  • binary logs
  • data files
• Crash safety
• Durability (ACID)
• Applies to SSD

MEMORY ALLOCATOR

• jemalloc is a good malloc replacement
    [mysqld_safe]
    malloc-lib=/usr/lib64/libjemalloc.so.1
  • Default with MariaDB
• tcmalloc is shipped on Linux with MySQL
    [mysqld_safe]
    malloc-lib=tcmalloc

CPU USAGE TUNING

• taskset command on Linux for core assignment (-p targets an already running process)
  • taskset -cp 1-4 `pidof mysqld`
  • taskset -cp 1,2,3,4 `pidof mysqld`
• niceness
    [mysqld_safe]
    nice=-20
• CPU governor
  • set to performance

FILE SYSTEM

• ext4 is the best choice for speed and ease of use
  • fsyncs a bit slower than ext3
  • more reliable
• xfs is excellent (for experts only)
  • With innodb_flush_method = O_DIRECT
  • less stable recently
• ext3 is also a good choice

MOUNT OPTIONS

• ext4 (rw,noatime,nodiratime,nobarrier,data=ordered)
• xfs (rw,noatime,nodiratime,nobarrier,logbufs=8,logbsize=32k)
• SSD specific
  • trim
  • innodb_page_size = 4K
  • innodb_flush_neighbors = 0

I/O SCHEDULER

• deadline is generally the best I/O scheduler
  • echo deadline > /sys/block/{DEVICE-NAME}/queue/scheduler
• the best value is hardware and workload specific
  • noop on high end controller (SSD, good RAID card …)
  • deadline otherwise

DISK CONFIGURATION

• everything on one disk kills performance
• several disks (RAID)
• or distribute the data files on different disks
  • data files (ibd files)
  • main InnoDB table space ibdata
  • redo logs
  • undo logs (if separate)
  • binary logs

NUMA

• NUMA architecture is the norm nowadays for MySQL servers
• Problem: RAM is allocated to CPU sockets
• The buffer pool should be distributed across the RAM of all NUMA nodes
• Percona, MariaDB and MySQL >= 5.6 support NUMA
• The usual trick is using mysqld_safe
  • flush the FS cache: --flush-caches
  • use NUMA interleave: --numa-interleave

CONFIG / VARIABLES 1/2

• Buffer pool size
• Query cache off
• Use a thread pool
• Cache the tables
• Cache threads
• Per thread memory usage
• Default storage engine
• Buffer pool contention

CONFIG / VARIABLES 2/2

• Large redo logs
• IO capacity
• InnoDB flushing
• Thread concurrency
• InnoDB table spaces
• Transaction isolation
• Replication: sync_binlog
• Replication: parallel threads
• Connector configuration

INNODB BUFFER POOL SIZE

• innodb_buffer_pool_size
• Not too large for the data
• Do not swap!
• Beware of out-of-memory crashes if swapping is disabled
• Active data <= innodb_buffer_pool_size <= 0.8 * RAM
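The sizing rule on this slide (active data <= innodb_buffer_pool_size <= 0.8 * RAM) can be checked with a few lines of arithmetic. This is a minimal sketch: the function name and the sample sizes (40 GiB of active data, 64 GiB of RAM) are hypothetical, and the 0.8 factor is simply the rule of thumb from the slide.

```python
GIB = 1024 ** 3


def buffer_pool_bounds(active_data_bytes, ram_bytes, ram_fraction=0.8):
    """Return (low, high) bounds in bytes for innodb_buffer_pool_size.

    low  = size of the active data set (it should fit in the buffer pool)
    high = ram_fraction * RAM (leave room for connections, caches, the OS)
    """
    low = active_data_bytes
    high = int(ram_bytes * ram_fraction)
    if low > high:
        raise ValueError("active data does not fit within the RAM budget")
    return low, high


# Hypothetical server: 40 GiB of active data on a 64 GiB machine.
low, high = buffer_pool_bounds(40 * GIB, 64 * GIB)
print(f"innodb_buffer_pool_size between {low / GIB:.1f}G and {high / GIB:.1f}G")
```

If the function raises, the working set is larger than the memory budget, which points at adding RAM or reducing the active data rather than at a configuration change.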

DISABLE THE QUERY CACHE

• Single threaded cache
• Only useful if threads_running <= 4
• Becomes fragmented
• Cache should be in the app!
• Off by default from 5.6
• query_cache_type = 0
• query_cache_size = 0
• Problem if qcache_free_blocks > 50k

ENABLE THE THREAD POOL

• https://www.percona.com/blog/2014/01/23/percona-server-improve-scalability-percona-thread-pool/
• Stabilize TPS for high concurrency
• Useful if threads_running > hardware threads
• Decrease context switches
• Several connections for one execution thread
• Acts as a speed limiter
• MySQL commercial, Percona, MariaDB

TABLE CACHING

• table_open_cache
  • not too small, not too big
  • opened_tables / sec
• table_definition_cache
  • do not forget to increase
  • opened_table_definitions / sec
• table_open_cache_instances = 8 or 16 (MySQL and Percona only)
• innodb_open_files
• metadata_locks_hash_instances = 256 (in 5.7, no longer an issue)

THREAD CACHING

• Thread creation is expensive, so caching is important
• thread_cache_size
  • decreases the threads_created rate
  • capped by max user processes (see OS limits)
  • 5.7.2 refactors this code
• Default value is calculated in 5.7
• http://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_thread_cache_size

INNODB STORAGE ENGINE

• Should be the default storage engine
• Do not use MyISAM unless you know what you are doing
• The most advanced storage engine is InnoDB
• Scalable
• Temporary tables use InnoDB in 5.7
  • the MEMORY engine is still used for temporary tables smaller than tmp_table_size

BUFFER POOL CONTENTION

• innodb_buffer_pool_instances >= 8
• Reduce rows_examined / sec (see Bug #68079)
• 8 is the default value in 5.6!
• innodb_spin_wait_delay = 96 on high concurrency
• Use read only transactions when possible

LARGE REDO LOGS

• Redo logs defer the expensive changes to the data files
• Recovery time is no longer an issue
• innodb_log_file_size = 2047M before 5.6
• innodb_log_file_size >= 2047M from 5.6
• Bigger is better for write QPS stability
• You want to avoid “furious flushing”
• innodb_log_files_in_group = 2 is usually fine

IO CAPACITY

• IO capacity should mirror device IO capacity in IOPS
• innodb_io_capacity should be larger for SSD
• Impacts flushing
• In 5.6, innodb_lru_scan_depth is per buffer pool instance
  • so innodb_lru_scan_depth = innodb_io_capacity / innodb_buffer_pool_instances
• Default innodb_io_capacity_max = max(2000, 2 * innodb_io_capacity)
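The per-instance rule for innodb_lru_scan_depth on this slide is a simple division. The sketch below assumes a hypothetical helper name and sample values; the floor of 100 reflects the documented minimum of innodb_lru_scan_depth and is applied here as a safeguard, not as part of the slide's rule.

```python
def lru_scan_depth(io_capacity, buffer_pool_instances, floor=100):
    """Spread innodb_io_capacity over the buffer pool instances.

    innodb_lru_scan_depth applies to each buffer pool instance, so the
    slide's rule divides the device budget by the number of instances.
    """
    if buffer_pool_instances < 1:
        raise ValueError("buffer_pool_instances must be at least 1")
    return max(floor, io_capacity // buffer_pool_instances)


# Hypothetical SSD-backed server: 8000 IOPS budget, 8 buffer pool instances.
print(lru_scan_depth(8000, 8))
```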

PER THREAD MEMORY USAGE

• Each thread allocates memory
• estimate = max_used_connections * (
    read_buffer_size + read_rnd_buffer_size + join_buffer_size + sort_buffer_size +
    binlog_cache_size + thread_stack + 2 * net_buffer_length … )

For a more accurate measure, check the performance_schema memory metrics or, in MariaDB, select * from information_schema.processlist.
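The estimate above is easy to script once the variables have been read from the server. This is a rough sketch: the byte values below are illustrative sample settings rather than measurements, the helper name is hypothetical, and the result is a worst-case upper bound because not every connection allocates every buffer.

```python
def per_thread_bytes(v):
    """Sum the per-connection buffers listed on the slide (in bytes)."""
    return (
        v["read_buffer_size"]
        + v["read_rnd_buffer_size"]
        + v["join_buffer_size"]
        + v["sort_buffer_size"]
        + v["binlog_cache_size"]
        + v["thread_stack"]
        + 2 * v["net_buffer_length"]
    )


# Illustrative settings, as they might be copied from SHOW GLOBAL VARIABLES.
variables = {
    "read_buffer_size": 131072,
    "read_rnd_buffer_size": 262144,
    "join_buffer_size": 262144,
    "sort_buffer_size": 262144,
    "binlog_cache_size": 32768,
    "thread_stack": 262144,
    "net_buffer_length": 16384,
}
max_used_connections = 500  # hypothetical value of the status variable

per_thread = per_thread_bytes(variables)
estimate = max_used_connections * per_thread
print(f"{per_thread} bytes per thread, about {estimate / 1024 ** 2:.0f} MiB in total")
```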

THREAD CONCURRENCY

• No thread pool:
  • innodb_thread_concurrency = 16 - 32 in 5.5
  • innodb_thread_concurrency = 36 in 5.6
  • align to HW threads if less than 32 cores
• Thread pool:
  • innodb_thread_concurrency = 0 (unlimited) is fine
• innodb_concurrency_tickets: higher for OLAP, lower for OLTP

INNODB FLUSHING

• Redo logs:
  • innodb_flush_log_at_trx_commit = 1 // best durability
  • innodb_flush_log_at_trx_commit = 2 // better performance
  • innodb_flush_log_at_trx_commit = 0 // best performance
• Data files only:
  • innodb_flush_method = O_DIRECT // Linux, skips the FS cache
• Increase innodb_adaptive_flushing_lwm (fast disk)

INNODB_FILE_PER_TABLE

• Default value: 1 table = 1 table space = 1 ibd file
• Not so good for small tables.
• Good for large tables.
• Default value from 5.6; before that, all tables lived in the system table space.
• From 5.7, a user defined table space can host several tables.
• http://dev.mysql.com/doc/refman/5.7/en/create-tablespace.html

TRANSACTION ISOLATION

• Default = REPEATABLE-READ
• Oracle database, Sybase ASE default is READ-COMMITTED
• Less locking, less overhead
• transaction-isolation = REPEATABLE-READ
• If you enable READ-COMMITTED, make sure binlog_format=ROW.

REPLICATION DURABILITY

• Replication is crash safe from 5.6
• Replication state was stored in files
• Now stored in InnoDB tables in the mysql schema
• However the binary logs are stored on disk:
  • sync_binlog = 1
  • no reason not to use it from 5.6

MULTI-THREADED REPLICATION

• MariaDB: slave_parallel_threads
• MySQL / Percona: slave_parallel_workers
• > 1 when lag is a concern
• For a recent comparison, see https://www.percona.com/live/data-performance-conference-2016/sites/default/files/slides/2016-04-21_plmce_mysql_parallel_replication-inventory_use-case_and_limitations.pdf

CONNECTOR TUNING

• Connectors can also be tuned.
• JDBC property for maximum performance:
  • useConfigs=maxPerformance
  • Use if the server configuration is stable
  • Removes frequent
    • SHOW COLLATION
    • SHOW GLOBAL VARIABLES
• Fast validation query: /* ping */

APPLICATION DESIGN 1/2

• Schema design
• Indexes at the right place
• Remove redundant indexes
• Reduce rows examined
• Reduce sent rows
• Minimize locking
• Minimize temporary tables (on disk)
• Minimize sorting on disk

APPLICATION DESIGN 2/2

• Avoid long running transactions
• Close prepared statements
• Close idle connections
• Do not use the information_schema in your app
• Views may not be good for performance
  – temporary tables (on disk)
• Replace truncate with drop table / create table
• Tune the replication thread
• Cache data in the app
• Scale out, shard

SCHEMA DESIGN

• create a PK for each table!
• integer primary keys
• avoid varchar, composite for PK
• latin1 vs. utf8 vs. utf8mb4
• the smallest varchar for a column
• smallest data type for a column
• keep the number of partitions low (< 100, optimal performance < 10) in 5.6. No longer an issue in 5.7.
• use compression for blob / text data types

INDEXES

• Fast path to data
• B-TREE, R-TREE (spatial), full text
• for sorting / grouping
  • without temporary table
• covering indexes
  • contain all the selected data
  • save access to full record
  • reduce random reads

REDUNDANT INDEXES

• Not good for write performance
  • duplicated data
  • resources to update
  • confuse the optimizer
• Use SYS schema views
  • schema_unused_indexes

REDUCE ROWS_EXAMINED

• Rows read from the storage engines
• Rows_examined
  • slow query log
  • P_S statement digests
  • SYS schema
• Handler%
  • show session status where variable_name like 'Handler%' or variable_name like '%tmp%';
• optimize if rows_examined > 10 * rows_sent
• usually due to missing indexes
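The "rows_examined > 10 * rows_sent" rule of thumb can be applied to any list of statement statistics, for example rows exported from the slow query log or from the statement digests. A minimal sketch follows; the helper name and the sample rows, including the query texts, are hypothetical.

```python
def needs_tuning(rows_examined, rows_sent, factor=10):
    """Apply the slide's rule: flag statements reading far more rows than they return."""
    return rows_examined > factor * rows_sent


# Hypothetical digest rows; real ones would come from the slow query log
# or the performance_schema statement digests.
digests = [
    {"query": "SELECT ... FROM orders WHERE customer_id = ?",
     "rows_examined": 1200000, "rows_sent": 40},
    {"query": "SELECT ... FROM users WHERE id = ?",
     "rows_examined": 1, "rows_sent": 1},
]

suspects = [d["query"] for d in digests
            if needs_tuning(d["rows_examined"], d["rows_sent"])]
print(suspects)
```

Statements that send no rows at all (UPDATE, DELETE) are flagged as soon as they examine any row, so in practice the list is best reviewed together with the statement type.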

REDUCE ROWS_SENT

• Found in the slow query log, the SYS schema
• Number of rows that are returned by queries to the clients
• rows_sent <= rows_examined
• Network / CPU expensive
• Client compression can help.
• Usually bad design.
• Use caching or LIMIT for UI
• No human can parse 10k rows

REDUCE LOCKING

• Locking has a performance impact because locks are kept in memory
• Can be seen in show engine innodb status
• UPDATE, SELECT FOR UPDATE, DELETE, INSERT SELECT
• Use a PK ref, UK ref to lock
• Avoid large index range and table scans
• Reduce rows_examined for locking SQL
• Commit when possible

TEMPORARY TABLES (ON DISK)

• Large temporary tables on disk
  • handler_write (handler_tmp_write, handler_tmp_update in MariaDB)
  • created_tmp_disk_tables
  • monitor tmpdir usage
• Frequent temporary tables on disk
  • High created_tmp_disk_tables / uptime
  • show global status like '%tmp%';
• Available in the SYS schema: select * from sys.statement_analysis
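The "created_tmp_disk_tables / uptime" ratio mentioned above is a per-second rate since server start. The sketch below assumes the two counters were copied from the output of show global status into a dictionary; the helper name and the sample numbers are hypothetical.

```python
def tmp_disk_tables_per_second(status):
    """Average number of on-disk temporary tables created per second since startup."""
    uptime = int(status["Uptime"])
    if uptime <= 0:
        raise ValueError("Uptime must be positive")
    return int(status["Created_tmp_disk_tables"]) / uptime


# Hypothetical counters; status values arrive as strings from the server.
status = {"Created_tmp_disk_tables": "7200", "Uptime": "3600"}
rate = tmp_disk_tables_per_second(status)
print(f"{rate:.2f} on-disk temporary tables per second")
```

Because the counters are cumulative, comparing two samples taken a few minutes apart gives a better picture of the current rate than a single average since startup.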

MIND THE SORT (ON DISK)

• Key status variable is:
  • sort_merge_passes: it is a session variable
  • if it occurs often, you can try to increase sort_buffer_size
  • find the query and fix it with an index if possible
• http://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html

LONG RUNNING (IDLE) TRANSACTIONS

• Usually the oldest transactions in show engine innodb status:
  • High history_list_length (a status variable in MariaDB)
  • Prevent the purge
  • Decrease performance
  • Can also prevent schema changes (due to MDL locks)

CLOSE PREPARED STATEMENTS

• com_stmt_prepare - com_stmt_close ~= 0
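The check on this slide compares two cumulative counters. Below is a small sketch, assuming the counters were read from show global status into a dictionary; the helper name and the sample values are hypothetical, and what counts as "close to zero" depends on the application.

```python
def unclosed_prepared_statements(status):
    """Difference between prepared and closed statements since server start."""
    return int(status["Com_stmt_prepare"]) - int(status["Com_stmt_close"])


# Hypothetical counters; a difference that keeps growing over time suggests
# that the application prepares statements without closing them.
status = {"Com_stmt_prepare": "15000", "Com_stmt_close": "14990"}
print(unclosed_prepared_statements(status))
```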

CLOSE IDLE CONNECTIONS

• Idle connections consume resources
• Either refresh or disconnect them

MISCELLANEOUS

• Do not use the information_schema in your app
• Replace truncate with drop table / create table due to
  • Bug #68184 Truncate table causes innodb stalls, fixed in MySQL 8.0
• Views may not be good for performance
  • temporary tables (on disk)
  • joining other views
• Scale out, shard

CACHE DATA IN THE APP

• Good for CPU / IO
• Cache the immutable or the expiring!
  • referential data
  • memcached / redis
• Query cache can be disabled
• Identify frequent statements
  • perl mysqldumpslow.pl -s c slow60s.log
  • pt-query-digest

MONITOR THE REPLICATION THREADS

• Slow query log with
  • log-slow-slave-statements is now dynamic (Bug #59860) from 5.6.11
  • MySQL and Percona only
• Performance_schema >= 5.6.14
• binlog_format = ROW
• show global status like 'Handler%'
• SYS (on table scans), detect tables not having PK
• In case of issue with a particular table, check the MDL locks.
  • MariaDB and 5.7 have a way to see them

MONITOR / MAINTAIN

• Monitor the database and OS
• Mine the slow query log
• Use the performance_schema and SYS schema
• Backup the database
• Upgrade regularly

MONITOR THE DATABASE

• essential for any professional DBA
• part of the tuning and change processes
• alerts
• graphs
• availability and SLAs
• the effect of tuning
• query analysis

MINE THE SLOW QUERY LOG

• Dynamic collection
• The right interval
• Top queries
  • pt-query-digest
• Sort by query time desc
  • perl mysqldumpslow.pl -s t slow.log
• Sort by rows_examined desc
• Top queries at the 60s range

USE THE SYS SCHEMA

• the SYS schema:
  • good entry point
  • ready to use views
  • IO / latency / waits / statement digests
  • ideal for dev and staging
  • https://github.com/mysql/mysql-sys
  • sys 5.6 works with MariaDB 10.1
  • overhead is acceptable in most cases (5% for P_S)

BACKUP THE DATABASE

• Backup is always required!
• Method depends on your business: logical vs. physical
• Verify the backup
• Decrease the overhead on prod
  • LVM has an overhead
  • mysqldump eats MySQL resources
  • mysqlbackup / xtrabackup copy the data files and verify them (in parallel)

OPTIMIZE THE DATABASE

• Fragmentation has an impact on performance
  • internal fragmentation (inside table spaces)
  • external fragmentation (on the file system)
• OPTIMIZE TABLE fixes it
  • can be done online
• MariaDB has the iterative defragmentation patch from Facebook / Kakao.
  • https://mariadb.com/kb/en/defragmenting-innodb-tablespaces/

MONITORING FRAGMENTATION

• There is no general formula
  • except for fixed length records
• create table t_defrag like t; insert into t_defrag select * from t limit 20000;
• Fragmentation if Avg_row_length(t) > Avg_row_length(t_defrag)
• Avg_row_length from show table status
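The comparison described above can be expressed as a ratio between the two Avg_row_length values reported by show table status, one for the original table and one for the freshly loaded sample copy. This is a sketch with hypothetical helper names and sample numbers; the ratio is only a hint, since the sample of 20000 rows may not be representative.

```python
def fragmentation_ratio(avg_row_length_t, avg_row_length_defrag):
    """Ratio of the original table's Avg_row_length to the compact copy's."""
    if avg_row_length_defrag <= 0:
        raise ValueError("Avg_row_length of the copy must be positive")
    return avg_row_length_t / avg_row_length_defrag


def is_fragmented(avg_row_length_t, avg_row_length_defrag):
    """Slide's rule: fragmented if the original rows take more space on average."""
    return fragmentation_ratio(avg_row_length_t, avg_row_length_defrag) > 1.0


# Hypothetical values read from show table status for t and t_defrag.
print(fragmentation_ratio(180, 120), is_fragmented(180, 120))
```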

UPGRADE POLICY

• Security vulnerability fixes
• Bug fixes
• Performance improvements
• Ready for the next GA
• Never upgrade without testing (see change process)
  • can be automated
  • the DBA is also a QA engineer!

QUESTIONS?
