Optimizing InnoDB buffer pool usage


Optimizing your InnoDB buffer pool usage

Steve Hardy

It decreases I/O load

Why optimize buffer pool usage ?

More RAM

Page compression

Less (smaller) data

Rearrange data

How ?

OVERSIMPLIFICATION WARNING

Do not complain about inaccuracies, I'm just trying to make a general point

Let's say our buffer pool is 20% of our total DB size

Let's say accesses to cached pages take no time at all

Let's say all data is equally accessed

Let's look at the data access over a certain period of time, say 1 hr

Let's take a look at why RAM cache works

Total gain from cache: 20% of the accesses are cached, so we're 20% faster than when we have no cache at all

Page accesses are more concentrated in some places than others

We want to cache the most-accessed places

Let's say accesses follow a 1/x curve

Fortunately, data access isn't really like that

Total gain from cache: 60% of accesses are in cached pages, 60% reduction in I/O, 2.5 times faster than without a cache.

The trick to optimizing your buffer pool usage is to make the histogram look more like a slope than a flat line

So now what?

A slope following 1/1 gives a 20% hit rate with a 20% cache size (1.25x faster)
A slope following 1/x gives a 60% hit rate with a 20% cache size (2.5x faster)
A slope following 1/x^2 gives a 92% hit rate with a 20% cache size (12.5x faster)
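The arithmetic behind these factors can be sketched in a few lines of Python — a toy model, assuming the workload is entirely I/O-bound and the hit rates are the ones quoted above:

```python
# Toy model: if a fraction `hit_rate` of page accesses is served from the
# buffer pool, I/O drops by that fraction, so an I/O-bound workload takes
# (1 - hit_rate) of the original time.
def speedup(hit_rate):
    return 1 / (1 - hit_rate)

print(round(speedup(0.20), 2))  # flat (1/1) pattern, 20% cache -> 1.25
print(round(speedup(0.60), 2))  # 1/x pattern   -> 2.5
print(round(speedup(0.92), 2))  # 1/x^2 pattern -> 12.5
```

The steeper the access histogram, the larger the cached fraction, and the denominator shrinks quickly.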

Get hold of MariaDB or Percona Server (stock MySQL is incapable of displaying these stats, maybe 5.6 ?)

See what pages you're storing in the buffer pool:

select * from information_schema.INNODB_BUFFER_POOL_PAGES_INDEX;

+----------+----------+---------+--------+-----------+--------+-------------+----------+-------+-----+--------------+-----------+------------+
| index_id | space_id | page_no | n_recs | data_size | hashed | access_time | modified | dirty | old | lru_position | fix_count | flush_type |
+----------+----------+---------+--------+-----------+--------+-------------+----------+-------+-----+--------------+-----------+------------+
|      945 |        0 |  115129 |    290 |     15080 |      0 |  1213040390 |        0 |     0 |   0 |            0 |         0 |          0 |
|      945 |        0 |   32820 |    549 |     15965 |      0 |  1213040370 |        0 |     0 |   0 |            0 |         0 |          0 |
|      945 |        0 |  112322 |    366 |     15006 |      0 |  1213040370 |        0 |     0 |   0 |            0 |         0 |          0 |
|      945 |        0 |   32831 |    506 |     15961 |      0 |  1213040349 |        0 |     0 |   0 |            0 |         0 |          0 |
|      945 |        0 |  111817 |    350 |     15050 |      0 |  1213040332 |        0 |     0 |   0 |            0 |         0 |          0 |
|      945 |        0 |   49176 |    535 |     15959 |      0 |  1213040307 |        0 |     0 |   0 |            0 |         0 |          0 |
|      945 |        0 |  113198 |    318 |     15030 |      0 |  1213040299 |        0 |     0 |   0 |            0 |         0 |          0 |
|      945 |        0 |   32828 |    533 |     15966 |      0 |  1213040284 |        0 |     0 |   0 |            0 |         0 |          0 |

Inspecting your buffer pool

index_id: reference to information_schema.INNODB_SYS_INDEXES

space_id: tablespace (ibdata file, or file number with file_per_table)

page_no: page number (unique)

n_recs: number of records on the page

data_size: size of the data on the InnoDB page

access_time: epoch timestamp of last access

modified (1/0): something was modified in the page since load

dirty (1/0): not flushed to disk yet

Inspecting your buffer pool

See what pages you're storing in the buffer pool per table/index:

SELECT count(*) AS pages, ta.SCHEMA AS db, ta.NAME AS tab, ind.NAME AS ind
FROM innodb_buffer_pool_pages_index AS bp
JOIN innodb_sys_indexes AS ind ON bp.index_id = ind.id
JOIN innodb_sys_tables AS ta ON ind.table_id = ta.id
GROUP BY bp.index_id;

Inspecting your buffer pool

+-------+----------------+-------------+-----------+
| pages | db             | tab         | ind       |
+-------+----------------+-------------+-----------+
|     1 |                | SYS_FOREIGN | FOR_IND   |
|     1 |                | SYS_FOREIGN | REF_IND   |
|     1 | zarafa_db9     | tproperties | PRIMARY   |
|     1 | zarafa_db9     | tproperties | hi        |
|     4 | zarafa_db9     | properties  | PRIMARY   |
|     1 | zarafa_db9     | syncs       | sync_time |
|     1 | zarafa_indexer | stores      | PRIMARY   |
|     1 | zarafa_indexer | stores      | id        |
|    14 | zarafa_indexer | docwords_2  | PRIMARY   |
|    17 | zarafa_indexer | docwords    | PRIMARY   |
|    17 | zarafa_indexer | words       | PRIMARY   |
|     1 | zarafa_indexer | updates     | PRIMARY   |
|     1 | zarafa_indexer | updates     | doc       |
|     1 | zarafa_indexer | sourcekeys  | PRIMARY   |
+-------+----------------+-------------+-----------+

More of an art than an exact science, here are some clues:

50% of your buffer pool is used by a table that contains 20% of your data

50% of your buffer pool is used by a table that you thought wouldn't really impact performance

How do I know something is wrong ?

Strategies:

Increase record density: make your records smaller. This gives you more records per page, and therefore fewer pages for the same data

Remove indexes if you can use other indexes almost as efficiently

Increase record locality: make sure that records that are accessed around the same time have a higher chance of being on the same page
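As a back-of-envelope illustration of the record-density point (all numbers are invented: a 16 KB InnoDB page with roughly 15 KB usable for row data):

```python
import math

# Hypothetical numbers: ~15 KB of a 16 KB InnoDB page usable for row data.
USABLE_BYTES = 15 * 1024
ROWS = 1_000_000

for rec_size in (300, 150):  # bytes per record, before and after slimming
    recs_per_page = USABLE_BYTES // rec_size
    pages = math.ceil(ROWS / recs_per_page)
    print(f"{rec_size} B/record -> {recs_per_page} records/page, {pages} pages")
```

Halving the record size roughly halves the page count, and with it the buffer pool footprint of the table.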

Other, non-application strategies:

Use page compression (it may double or quadruple the number of pages you can hold in RAM), but it isn't a sure win, since pages need to be decompressed in RAM as well.

Buy more RAM ;)

Try BKA (MariaDB 5.3) if you have this problem even within one query

Okay, so how do I fix it ?

If you're reading a range in an index, locality is high by definition

Things get less efficient when you join to some other table: each record may generate a single page read for the linked record

There are some enemies: random IDs (like GUIDs) and auto_increments that are accessed in some other order than the original insertion order

E.g.

CREATE TABLE messages (id INTEGER AUTO_INCREMENT, userid INTEGER, data VARCHAR(255), PRIMARY KEY (id));

This table is now read most optimally if you read the messages in their original insertion order. In practice, that is almost never needed (maybe during backup)

Locality?

Probably more interesting case, reading all the emails of a user:

SELECT data FROM messages WHERE userid = 10;

The naïve approach would be to use an index:

CREATE TABLE messages (id INTEGER AUTO_INCREMENT, userid INTEGER, data VARCHAR(255), PRIMARY KEY (id), KEY user (userid));

Increasing locality example

[Diagram: index range pointing at scattered records]

Fast range read of the index (1 page)

Slow record lookups (e.g. 100 pages)

101 pages in the buffer pool

A better idea is to pack records that are accessed together more closely, using InnoDB's clustered index:

CREATE TABLE messages (id INTEGER AUTO_INCREMENT, userid INTEGER, data VARCHAR(255), PRIMARY KEY (userid, id), UNIQUE KEY user (id));

Increasing locality example

[Diagram: one user's records stored contiguously]

Fast range read (~5 pages)

5 pages in the buffer pool
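The two layouts can be compared with a toy page-count model (assumed numbers: 100 messages for the user, ~20 records per page, worst-case random placement under the secondary index):

```python
import math

MESSAGES = 100       # messages owned by one user (assumption)
RECS_PER_PAGE = 20   # records per 16 KB page (assumption)

# Secondary index on (userid): the index range itself is compact (~1 page),
# but in the worst case each entry points at a record on its own random page.
secondary_pages = 1 + MESSAGES                         # -> 101 pages

# Clustered PRIMARY KEY (userid, id): the same records sit physically
# together, so the range read only touches adjacent leaf pages.
clustered_pages = math.ceil(MESSAGES / RECS_PER_PAGE)  # -> 5 pages

print(secondary_pages, clustered_pages)
```

Same query, same data: the clustered layout pulls about 20x fewer pages into the buffer pool.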

In Zarafa, we have summary information on emails (subject, from, to, x-mailer, etc)

(we have databases up to 1 TB in a single table for 1000s of users)

More locality: introducing redundant information

Hierarchy(id, parentid)

Properties(id, type, value) PRIMARY KEY (id, type)

4, 1
5, 1
6, 1
7, 1

4, subject, hello
4, from, [email protected]
4, x-mailer, zarafa
5, subject, hello again

etc

E.g. sort by subject (get all subjects of folder X):

SELECT value FROM properties JOIN hierarchy ON properties.id = hierarchy.id WHERE hierarchy.parent = X AND properties.type = Y;

+----+-------------+------------+--------+----------------+---------+---------+------------------------+------+-------+
| id | select_type | table      | type   | possible_keys  | key     | key_len | ref                    | rows | Extra |
+----+-------------+------------+--------+----------------+---------+---------+------------------------+------+-------+
|  1 | SIMPLE      | hierarchy  | ref    | PRIMARY,parent | parent  | 4       | const                  |    3 |       |
|  1 | SIMPLE      | properties | eq_ref | PRIMARY        | PRIMARY | 8       | bla.hierarchy.id,const |    1 |       |
+----+-------------+------------+--------+----------------+---------+---------+------------------------+------+-------+

Result: one random access per email; for 10000 emails, worst case 50 seconds
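That worst case is plain seek arithmetic (assuming ~5 ms per random read on a spinning disk and zero cache hits):

```python
emails = 10_000
seek_ms = 5  # assumed cost of one random disk read, in milliseconds
worst_case_s = emails * seek_ms / 1000
print(worst_case_s)  # -> 50.0 seconds
```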

Why this is slow

Data is redundant, if you lost it, you could regenerate it

Normally regarded as a bad idea

I don't care.

Add parentid into properties

Introduce a little garbage for locality

Hierarchy(id, parentid)

Properties(id, garbage, type, value) PRIMARY KEY (id, garbage, type)

4, 1
5, 1
6, 1
7, 1

4, 1, subject, hello
4, 1, from, [email protected]
4, 1, x-mailer, zarafa
5, 1, subject, hello again

etc

Writes are reads

If you are write-heavy, this will not work, since increasing read locality normally decreases write locality

What's the catch?

[Diagram: records scattered across pages]

Random writes cause random reads into the buffer pool

You know, like {0999C37B-9F73-42bb-BA57-B88940FDD686}

Random inserts

Almost certainly random lookups

Each lookup or write will cause a single page to be loaded into the buffer pool for the sole purpose of reading or writing a single record

Better idea: use a fixed GUID + counter, for example (at least inserts will go to the same place)
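A sketch of why that helps, counting the distinct leaf pages touched by a batch of inserts (page numbers stand in for B-tree leaves; all constants are invented):

```python
import random

LEAF_PAGES = 1000    # leaf pages in the index (assumption)
INSERTS = 500
RECS_PER_PAGE = 100  # records per page (assumption)

random.seed(1)
# Random GUID-like keys scatter inserts over arbitrary leaf pages.
guid_pages = {random.randrange(LEAF_PAGES) for _ in range(INSERTS)}

# Fixed GUID + counter: keys are sequential, so inserts fill one page,
# then the next, and so on.
counter_pages = {i // RECS_PER_PAGE for i in range(INSERTS)}

print(len(guid_pages), len(counter_pages))  # hundreds of pages vs 5
```

With random keys, nearly every insert drags a different page into the buffer pool; with sequential keys, hundreds of inserts share a handful of pages.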

What's wrong with GUIDs?

Questions ?