© SkySQL Ab. Commercial in Confidence

CCM Escape Case Study - SkySQL Paris Meetup 17.12.2013

CCM Escape Case Study - Damien Mangin, DSI et Stephane Varoqui, SkySQL at the SkySQL Paris Meetup 17.12.2013

CCM Escape Case Study - Elastic Statistics Cluster

Damien Mangin, Nicolas Payart, Stéphane Varoqui

ESCape Challenges

❏ Scale writes
❏ Should not hurt API performance
❏ Reduce the number of servers

ESCape candidates

❏ Redis Cluster: is this it? (Work in progress.) Is it really that fast for writes?

❏ Riak (LevelDB, InnoDB): do I really need Erlang for this? To store billions of entries, use Innostore; InnoDB is a robust and well-known storage engine. Performance aside, it appears that LevelDB may become the preferred choice for Riak.

❏ Cassandra, HBase: do I really need Java for this? 10K inserts per node group is the price to pay for the high-level design. We don't need eventual consistency? Read performance will suffer from such a design.

❏ syslog-ng and UDP: we do real-time processing; what happens if the database has stopped?

MariaDB 10: my toolkit

❏ Storage engines: InnoDB, Cassandra, HBase, LevelDB, TokuDB, OQGRAPH

❏ Sharding, clustering, and federation: Spider, multi-source and parallel replication, Galera, CONNECT, FederatedX, Sphinx, Mroonga

ESCape Architecture

ESCape TokuDB

• TokuDB and MyISAM performance: 3 times faster with 32 GB of memory on the most demanding query

ESCape Architecture

ESCape Proxy Node Howto

CREATE TABLE ccmkw (
  ip INT UNSIGNED NOT NULL,
  date DATETIME NOT NULL,
  firstseenon VARCHAR(255) NOT NULL,
  keyword VARCHAR(128) NOT NULL,
  c1 VARCHAR(128) NOT NULL,
  c2 VARCHAR(255) NOT NULL,
  -- Shard key: first 16 hex digits of the MD5, converted from base 16 to base 10
  crc64 BIGINT UNSIGNED AS (CONV(LEFT(MD5(IF(keyword = '', CONCAT(firstseenon, c1), keyword)), 16), 16, 10)) PERSISTENT
) ENGINE=spider COMMENT='user "skysql",password "skyvodka"'
PARTITION BY LIST (MOD(crc64, 24)) (
  PARTITION pt01 VALUES IN (0) COMMENT = ' tbl "ccmkw", host "127.0.0.1", port "3306", database "ccmstats_shard01"' ENGINE = SPIDER,
  PARTITION pt02 VALUES IN (1) COMMENT = ' tbl "ccmkw", host "127.0.0.1", port "3306", database "ccmstats_shard02"' ENGINE = SPIDER,
  ...;

CREATE TABLE shard01.kw (
  ip INT UNSIGNED NOT NULL,
  date DATETIME NOT NULL,
  firstseenon BINARY(255) NOT NULL,
  keyword BINARY(128) NOT NULL,
  domaine BINARY(128) NOT NULL,
  referer BINARY(255) NOT NULL,
  crc64 BIGINT UNSIGNED PRIMARY KEY
) ENGINE=blackhole;

CREATE TABLE shard02.kw (
  ip INT UNSIGNED NOT NULL,
  date DATETIME NOT NULL,
  firstseenon BINARY(255) NOT NULL,
  keyword BINARY(128) NOT NULL,
  c1 BINARY(128) NOT NULL,
  c2 BINARY(255) NOT NULL,
  crc64 BIGINT UNSIGNED PRIMARY KEY
) ENGINE=blackhole;
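
To make the routing rule easier to follow, here is a minimal Python sketch of what the crc64 column and the MOD(crc64, 24) list partitioning appear intended to compute: the first 16 hex digits of an MD5 read as an unsigned 64-bit integer, then reduced modulo the shard count. The function names are hypothetical, UTF-8 encoding of the hashed string is an assumption, and the pt01..pt24 naming assumes the partition pattern on the slide continues unchanged.

```python
import hashlib

NUM_SHARDS = 24  # matches MOD(crc64, 24) in the partition clause


def shard_key(keyword: str, firstseenon: str, c1: str) -> int:
    """Mirror the crc64 column: first 16 hex digits of an MD5, read as base 16."""
    # Same fallback as the IF() in the column definition: hash the keyword,
    # or firstseenon + c1 when the keyword is empty.
    source = keyword if keyword != "" else firstseenon + c1
    digest = hashlib.md5(source.encode("utf-8")).hexdigest()  # UTF-8 is an assumption
    return int(digest[:16], 16)


def partition_for(keyword: str, firstseenon: str, c1: str) -> str:
    """Return the partition name, assuming pt01 holds value 0, pt02 value 1, and so on."""
    return f"pt{shard_key(keyword, firstseenon, c1) % NUM_SHARDS + 1:02d}"
```

Because the key depends only on the row content, every proxy node routes a given keyword to the same shard without any shared state.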

ESCape Proxy Node Howto

default-storage-engine=MyISAM
skip-innodb
skip_name_resolve
back_log=1024
max_connections = 1024
table_open_cache = 4096
table_definition_cache = 2048
max_allowed_packet = 32K
binlog_cache_size = 32K
max_heap_table_size = 64M
thread_cache_size = 1024
query_cache_size = 0
expire_logs_days=4
progress_report_time=0
binlog_ignore_db=ccmstats

spider_use_handler=1
spider_sts_sync=0
spider_remote_sql_log_off=1
spider_remote_autocommit=0
spider_direct_dup_insert=1
spider_local_lock_table=0
spider_support_xa=0
spider_sync_autocommit=0
spider_sync_trx_isolation=0
spider_crd_sync=0
spider_conn_recycle_mode=1
spider_reset_sql_alloc=0

ESCape Data Node Multi source Howto

MariaDB [(none)]> pager egrep 'Connection_name|Slave_IO_Running|Slave_SQL_Running|Gtid_Slave_Pos'

MariaDB [(none)]> show all slaves status\G
Connection_name:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Gtid_Slave_Pos: 0-31091-138522330
Connection_name: ccmstats_gertrude
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Gtid_Slave_Pos: 0-31091-138522330
Connection_name: ccmstats_lucifer
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Gtid_Slave_Pos: 0-31091-138522330
Connection_name: ccmstats_mysql1
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Gtid_Slave_Pos: 0-31091-138522330

Subset of shards per data node:

SET GLOBAL ESCapeProxy1.replicate_do_db='ccmstats_shard01,ccmstats_shard02,ccmstats_shard03,ccmstats_shard04,ccmstats_shard05,ccmstats_shard22,ccmstats_shard23,ccmstats_shard24';

SET GLOBAL ESCapeProxy1.replicate_do_db='ccmstats_shard01,ccmstats_shard02,ccmstats_shard03,ccmstats_shard04,ccmstats_shard05,ccmstats_shard06,ccmstats_shard07,ccmstats_shard08';

SET GLOBAL ESCapeProxy1.replicate_do_db='ccmstats_shard09,ccmstats_shard10,ccmstats_shard11,ccmstats_shard12,ccmstats_shard13,ccmstats_shard14,ccmstats_shard15,ccmstats_shard16';
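
Typing long shard lists by hand is error-prone, so a small generator can help. The sketch below builds the per-connection filter statement from a range of shard numbers; the helper names are hypothetical, and it assumes the ccmstats_shardNN naming and the ESCapeProxy1 connection name seen on the slide.

```python
def shard_db(n: int) -> str:
    """Database name for shard n, assuming the ccmstats_shardNN convention."""
    return f"ccmstats_shard{n:02d}"


def replicate_do_db_statement(connection: str, shards) -> str:
    """Build the SET GLOBAL statement that restricts one replication connection."""
    dbs = ",".join(shard_db(n) for n in shards)
    return f"SET GLOBAL {connection}.replicate_do_db='{dbs}';"


# Shards 9 to 16 for one data node
statement = replicate_do_db_statement("ESCapeProxy1", range(9, 17))
print(statement)
```

The statement still has to be run on the data node itself; the sketch only produces the text.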

ESCape Data Node: I’m a dummy

DECLARE l_idUrl INT UNSIGNED DEFAULT 0;
DECLARE i_idDom TINYINT UNSIGNED DEFAULT 0;
DECLARE c_kw VARCHAR(128);

-- Count a hit only the first time this (keyword, ip) pair is seen
IF NOT EXISTS (SELECT 1 FROM ccmreferers WHERE keyword = NEW.keyword AND ip = NEW.ip) THEN
  SET l_idUrl = ccmstats.GetIdUrl(NEW.firstseenon);
  SET i_idDom = ccmstats.GetIdDomaine(NEW.domaine);

  INSERT INTO stats_url_cur
  SET keyword_crc64 = NEW.keyword_crc64, DATE = NEW.date, idUrl = l_idUrl, idDomaine = i_idDom, nb = 1
  ON DUPLICATE KEY UPDATE nb = nb + 1;

  IF LENGTH(NEW.keyword) > 0 THEN
    -- Trim the keyword and collapse double spaces into single spaces
    SET c_kw = REPLACE(TRIM(NEW.keyword), '  ', ' ');
    INSERT INTO stats_url_kw_cur
    SET keyword_crc64 = NEW.keyword_crc64, DATE = NEW.date, idUrl = l_idUrl, keyword = c_kw, idDomaine = i_idDom, nb = 1
    ON DUPLICATE KEY UPDATE nb = nb + 1;
  END IF;
END IF;

Goal: do not count multiple hits from a single IP.

Result: massive DEADLOCKING.

Trigger execution should never re-enter the same table.

The solution is a blacklist table of (IP, keyword) pairs.
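
The blacklist idea can be modelled outside the database. The sketch below is only an in-memory illustration, not the trigger used in production: the class and method names are hypothetical, a Python set stands in for the blacklist table, and a Counter stands in for the nb counters. The point it shows is that the duplicate check reads a separate structure instead of the table that is being written.

```python
from collections import Counter


class StatsNode:
    """Toy model of a data node that counts each (ip, keyword) pair once."""

    def __init__(self):
        self.seen = set()        # stands in for the blacklist table of (ip, keyword)
        self.counts = Counter()  # stands in for the nb counters per (url, keyword)

    def record(self, ip: int, keyword: str, url_id: int) -> bool:
        """Count the hit unless this IP was already seen for this keyword."""
        if (ip, keyword) in self.seen:
            return False  # duplicate: the counters are left untouched
        self.seen.add((ip, keyword))
        self.counts[(url_id, keyword)] += 1
        return True


node = StatsNode()
node.record(ip=1, keyword="mariadb", url_id=7)
node.record(ip=1, keyword="mariadb", url_id=7)  # ignored, same IP and keyword
node.record(ip=2, keyword="mariadb", url_id=7)
```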

ESCape Benchmarks Writes

Write performance should never be an issue: full scalability with group commit.

ESCape Spider, OpenResty and libdrizzle

http {
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;

    upstream backend {
        drizzle_server 192.168.0.30:5059 dbname=stat_shard01 password=skyvodka user=skysql protocol=mysql;
        drizzle_keepalive max=100 mode=single overflow=reject;
    }

    server {
        listen 81;
        server_name localhost;

        location / {
            root html;
            index index.html index.htm;
        }

        # Decode each query argument, quote the string values, then forward one INSERT
        location /referers {
            set_unescape_uri $ip $arg_ip;
            set_unescape_uri $date $arg_date;
            set_unescape_uri $firstseenon $arg_firstseenon;
            set_quote_sql_str $quoted_firstseenon $firstseenon;
            set_unescape_uri $keyword $arg_keyword;
            set_quote_sql_str $quoted_keyword $keyword;
            set_unescape_uri $domaine $arg_domaine;
            set_quote_sql_str $quoted_domaine $domaine;
            set_unescape_uri $ref $arg_ref;
            set_quote_sql_str $quoted_ref $ref;
            set_unescape_uri $crc64 $arg_crc64;
            echo_location /mysql "INSERT INTO ref VALUES($ip,$date,$quoted_firstseenon,$quoted_keyword,$quoted_domaine,$quoted_ref,$crc64)";
        }

        # Run the query string against the upstream and return JSON
        location /mysql {
            drizzle_pass backend;
            drizzle_module_header off;
            drizzle_query $query_string;
            rds_json on;
        }
    }
}

❏ Non-blocking all the way

❏ As fast as mysqlslap in C

❏ A gazillion times faster than Apache with PHP

❏ Limited usage, but scriptable via Lua cosockets

❏ The non-blocking MariaDB client is not in OpenResty, but it should be faster
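
For readers who want to exercise the /referers endpoint, here is a hedged sketch of how a client might build the request URL. It only assembles the query string from the argument names read by the location block (ip, date, firstseenon, keyword, domaine, ref, crc64); the function name, the base URL, and the sample values are assumptions, and nothing is sent over the network.

```python
from urllib.parse import quote, urlencode


def referers_url(base: str, ip: int, date: str, firstseenon: str,
                 keyword: str, domaine: str, ref: str, crc64: int) -> str:
    """Build a GET URL for the /referers location, percent-encoding every value."""
    params = {
        "ip": ip,
        "date": date,
        "firstseenon": firstseenon,
        "keyword": keyword,
        "domaine": domaine,
        "ref": ref,
        "crc64": crc64,
    }
    # quote_via=quote encodes spaces as %20 rather than '+'
    return f"{base}/referers?{urlencode(params, quote_via=quote)}"


# Hypothetical sample values, for illustration only
url = referers_url("http://localhost:81", ip=3232235521, date="20131217100000",
                   firstseenon="/forum/page", keyword="hello world",
                   domaine="example.org", ref="http://example.org/", crc64=42)
```

Note that the configuration inserts $ip, $date and $crc64 without SQL quoting, so a client should send plain numeric values for those three arguments.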

ESCape Benchmarks

Latency for 1000 Key Point Access

ESCape Spider Direct SQL voodoo

HOW TO MAP REDUCE

CREATE TEMPORARY TABLE `res` (
  `keyword_crc64` bigint(20) unsigned NOT NULL,
  `date` date NOT NULL DEFAULT '0000-00-00',
  `idUrl` int(10) unsigned NOT NULL,
  `keyword` varchar(128) NOT NULL DEFAULT '',
  `idDomaine` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `nb` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `id` bigint(20) unsigned NOT NULL DEFAULT '0'
) ENGINE=MEMORY DEFAULT CHARSET=latin1;

SELECT spider_bg_direct_sql(
         'SELECT * FROM stats_url_kw_cur s WHERE s.id IN(241448386253908686)',
         'res',
         CONCAT('host "', host, '", port "', port, '", user "', username, '", password "', password, '", database "', tgt_db_name, '"')
       ) a
FROM mysql.spider_tables
WHERE db_name = 'commentcamarche'
  AND table_name LIKE 'stats_url_kw_cur#P#pt%';
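
The pattern above is a scatter-gather: the same query runs on every shard in the background, the rows land in one temporary table, and a final step aggregates them. The Python sketch below models that flow in memory only. It is not the Spider API: shards are plain lists of dicts, and scatter_gather, query and the sample rows are hypothetical names and data chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter_gather(shards, query, reduce_fn):
    """Run query on every shard in parallel (map), then combine all rows (reduce)."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(query, shards))
    rows = [row for part in partials for row in part]  # plays the role of the `res` table
    return reduce_fn(rows)


# Three toy shards; only some of them hold rows for id 1
shards = [
    [{"id": 1, "nb": 2}],
    [{"id": 1, "nb": 3}, {"id": 2, "nb": 1}],
    [],
]


def query(shard):
    """Per-shard filter, standing in for the SELECT sent to each backend."""
    return [row for row in shard if row["id"] == 1]


total = scatter_gather(shards, query, lambda rows: sum(r["nb"] for r in rows))
```

In the SQL version the map step is the spider_bg_direct_sql call per partition, and the reduce step is whatever query is then run against the temporary table.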

ESCape Running non RC

Be prepared for beta testing

❏ Introduce yourself and the feature you would like to beta test on IRC, freenode #mariadb

❏ Have a compilation machine

❏ Learn about reproducible test cases

❏ Learn about Valgrind and debug mode

❏ Don't get frustrated if there is no answer: it is a dev channel

ESCape Running non RC

Replication gotcha in 10.0.6

Spider debug:
mysqld --debug=S:T:t:r:p:n:L:i:F:f:D:d,info,error,query,qcache,my,exit,general,where:O,/tmp/mysqld.trace

Having a compilation server:
-DBUILD_CONFIG=mysql_release -DCMAKE_BUILD_TYPE=Debug -DWITH_VALGRIND=ON

10.0.4: oops, my InnoDB DATETIME columns are broken after ALTER TABLE. Fixed in 10.0.6.

In MySQL 5.5, it is NOT marked as DATA_UNSIGNED despite it being treated by the handler as HA_KEYTYPE_ULONGLONG. This should perhaps be considered a bug in InnoDB, but it is ancient history.

Thanks
