Big Tables and You: Keeping DDL operations fast





So You Want To Add a New Column

class AddFoo < ActiveRecord::Migration
  def change
    add_column :foo, :bar, :string
  end
end

ALTER TABLE foo ADD COLUMN bar varchar(256);

What is this doing?“ALTER TABLE makes a temporary copy of the original table...waits for other operations that are modifying the table...incorporates the alteration into the copy, deletes the original table, and renames the new one. While ALTER TABLE is executing, the original table is readable…

...writes to the table that begin after the ALTER TABLE operation begins are stalled until the new table is ready…”

Where Did Production Go?!

What's wrong with this approach?

Write operations are stalled, and you've just crashed production

Multiple ALTER statements are applied separately, making the time to execute T(n × rows), where n is the number of statements
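One mitigation (not shown in the slides, added here as a sketch): MySQL lets you bundle several ADD COLUMN clauses into one ALTER TABLE, so the table is copied once rather than once per statement. The helper and the table/column names below are illustrative.

```ruby
# Sketch: combine several column additions into a single ALTER TABLE
# statement so MySQL copies the table only once. The table and column
# names (:foo, :bar, :baz) are illustrative, not from the plugin.
def combined_alter_sql(table, columns)
  clauses = columns.map { |name, type| "ADD COLUMN #{name} #{type}" }
  "ALTER TABLE #{table} #{clauses.join(', ')};"
end

combined_alter_sql(:foo, bar: "varchar(256)", baz: "varchar(256)")
# => "ALTER TABLE foo ADD COLUMN bar varchar(256), ADD COLUMN baz varchar(256);"
```

This only reduces the number of copies; the copy itself still blocks writes, which is the problem the rest of the deck addresses.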

Worse with indexes

Demo!

ruby> File.open('/tmp/foo', 'w') { |f| (1..10_000_000).each { |r| f.puts(r) } } # 10 million rows

mysql> CREATE DATABASE temp_table_demo;
mysql> USE temp_table_demo;
mysql> CREATE TABLE foo (id int PRIMARY KEY AUTO_INCREMENT, bar VARCHAR(256));
mysql> LOAD DATA INFILE "/tmp/foo" INTO TABLE foo;

Demo! (Continued)

mysql> ALTER TABLE foo ADD COLUMN baz varchar(256);

Query OK, 10000000 rows affected (42.97 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> SHOW PROCESSLIST;

“State” => “copy to tmp table” for ~90% of the execution time

Rethinking DDL Changes

“ALTER TABLE makes a temporary copy of the original table...waits for other operations that are modifying the table...incorporates the alteration into the copy, deletes the original table, and renames the new one. While ALTER TABLE is executing, the original table is readable…

We can:
1) make a temporary copy
2) incorporate changes
3) sync
4) delete the original
5) rename

DDL Plan of Attack

CREATE TABLE foo_temp LIKE foo;
ALTER TABLE foo_temp ADD COLUMN baz varchar(256);
INSERT INTO foo_temp (id,bar) SELECT * FROM foo;
# Syncing checks here for records modified during change
DROP TABLE foo;
RENAME TABLE foo_temp TO foo;
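As a sketch, the plan can be expressed as a small Ruby helper that emits those SQL steps in order. The helper name and the `_temp` suffix are illustrative (the sync step is left as a comment because its checks depend on the table's columns):

```ruby
# Sketch: emit the temp-table DDL plan as a list of SQL statements.
# `tmp_table_plan` is a hypothetical helper, not part of the plugin.
def tmp_table_plan(table, columns, ddl)
  tmp  = "#{table}_temp"
  cols = columns.join(',')
  [
    "CREATE TABLE #{tmp} LIKE #{table};",
    "ALTER TABLE #{tmp} #{ddl};",
    "INSERT INTO #{tmp} (#{cols}) SELECT #{cols} FROM #{table};",
    # syncing checks for rows modified during the copy would go here
    "DROP TABLE #{table};",
    "RENAME TABLE #{tmp} TO #{table};",
  ]
end

tmp_table_plan("foo", %w[id bar], "ADD COLUMN baz varchar(256)")
```

Note the column list on the INSERT: the temp table has an extra column, so a bare `INSERT INTO foo_temp SELECT * FROM foo` would fail on column count.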

What Changes?

90% of the time in “copy to tmp table”
becomes
90% of our time in “Sending data” (non-blocking)

This means records can be inserted, updated, and deleted without waiting on the table metadata lock

Enter MySQL Big Table Migration

A Rails plugin that adds methods to ActiveRecord::Migration to allow columns and indexes to be added to and removed from large tables with millions of rows in MySQL, without leaving processes seemingly stalled in the state "copy to tmp table".

Example

class AddBazToFoo < ActiveRecord::Migration
  def self.up
    add_column_using_tmp_table :foo, :baz, :string
  end
end

Additional Methods

● add_column_using_tmp_table
● remove_column_using_tmp_table
● rename_column_using_tmp_table
● change_column_using_tmp_table
● add_index_using_tmp_table
● remove_index_using_tmp_table

When Should This Be Used?

A good rule of thumb is any table already in production

Another rule of thumb is any table with more than 1 million rows

Not necessary for small or new tables

The “Meat”

def with_tmp_table(table_name)
  new_table_name = "#{table_name}_temp" # temp table name, per the plan above
  say "Creating temporary table #{new_table_name} like #{table_name}..."
  # DDL operations performed on temp table
  say "Inserting into temporary table in batches of #{batch_size}..."
  say "Replacing source table with temporary table..."
  say "Cleaning up, checking for rows created/updated during migration, dropping old table..."
end
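The batched insert step above can be sketched as follows. This is a minimal illustration assuming an auto-increment integer primary key; the helper name is made up, and the real plugin runs each statement against the database rather than returning strings:

```ruby
# Sketch: copy rows into the temp table in primary-key batches, so each
# INSERT holds locks briefly instead of one long blocking copy.
# `batched_copy_statements` is a hypothetical helper for illustration.
def batched_copy_statements(table, tmp, columns, max_id, batch_size)
  cols = columns.join(',')
  (1..max_id).step(batch_size).map do |start|
    stop = [start + batch_size - 1, max_id].min
    "INSERT INTO #{tmp} (#{cols}) SELECT #{cols} FROM #{table} " \
      "WHERE id BETWEEN #{start} AND #{stop};"
  end
end

batched_copy_statements("foo", "foo_temp", %w[id bar], 25, 10)
# three statements covering ids 1-10, 11-20, 21-25
```

Because each batch is a separate short statement, concurrent writes to the source table proceed between batches; that is why the sync step afterwards must pick up rows created or updated during the copy.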

Demo!

rails new temp_table_demo

# Gemfile
gem 'mysql_big_table_migration', git: '[email protected]:thickpaddy/mysql_big_table_migration.git'

Run DDLs with and without temp table pattern

Questions from the Audience

Q&A