47
Retaining globally distributed high availability Art van Scheppingen Head of Database Engineering

Retaining globally distributed high availability

Embed Size (px)

DESCRIPTION

Example of a solution for retaining globally distributed high availability with MySQL

Citation preview

Page 1: Retaining globally distributed high availability

Retaining globally distributed high availability Art van Scheppingen Head of Database Engineering

Page 2: Retaining globally distributed high availability

2  

1.  Who  is  Spil  Games?  2.  Theory  3.  Spil  Storage  Pla9orm  4.  Ques=ons?  

Overview

Page 3: Retaining globally distributed high availability

Who are we? Who  is  Spil  Games?    

Page 4: Retaining globally distributed high availability

4  

•  Company  founded  in  2001  •  350+  employees  world  wide  •  180M+  unique  visitors  per  month  •  45  portals  in  19  languages  •  Casual  games  •  Social  games  •  Real  =me  mul=player  games  •  Mobile  games  

•  35+  MySQL  clusters  •  60k  queries  per  second  (3.5  billion  qpd)  

Facts

Page 5: Retaining globally distributed high availability

5  

Geographic Reach 180  Million  Monthly  Ac=ve  Users(*)  

Source:  (*)  Google  Analy3cs,  August  2012  

 

•  Over  45  localized  portals  in  19  languages  •  Mul=  pla9orm:  web,  mobile,  tablet  •  Focus  on  casual  and  social  games  •  180M  MAU  per  month  (30M  YoY  growth)  •  Over  50M  registered  users  

Page 6: Retaining globally distributed high availability

6  

Girls,  Teens  and  Family    

spielen.com  juegos.com  gamesgames.com  games.co.uk  

Brands

Page 7: Retaining globally distributed high availability

Foundations The  exci2ng  theory    

Page 8: Retaining globally distributed high availability

8  

•  What  does  it  exactly  mean?  

Retaining globally distributed HA

Page 9: Retaining globally distributed high availability

9  

Wikipedia:  High  availability  is  a  system  design  approach  and  associated  service  implementa=on  that  ensures  a  prearranged  level  of  opera=onal  performance  will  be  met  during  a  contractual  measurement  period.    Oracle:  •  Availability  of  resources  in  a  computer  system    

What is high availability?

Page 10: Retaining globally distributed high availability

10  

•  Master  with  (many)  slave(s)  

How do we reach HA with MySQL?

Master

Slave Slave Slave

Page 11: Retaining globally distributed high availability

11  

•  Master  with  (many)  slave(s)  •  Mul=  Master  

How do we reach HA with MySQL?

Master

Slave

Master

Slave

Page 12: Retaining globally distributed high availability

12  

•  Master  with  (many)  slave(s)  •  Mul=  Master  •  Clustering  

How do we reach HA with MySQL?

MysqldMysqld

ndbd

ndbd ndbd

ndbd ndbd

mgmt

Page 13: Retaining globally distributed high availability

13  

•  Master  with  (many)  slave(s)  •  Mul=  Master  •  Clustering  •  Geographical  redundancy    

How do we reach HA with MySQL?

Master local DC

Slave local DC

Slave Asia Slave US

Page 14: Retaining globally distributed high availability

14  

•  Scale  up  •  Ver=cal  •  Faster  CPU/Memory/disks  •  Expensive  •  Costs  mul=ply  in  same  rate  as  #  of  nodes  

•  Scale  out  •  Horizontal  •  More  (small)  machines  •  Inexpensive  •  Par==oning/federa=ng  (sharding)  

What if we keep growing?

Page 15: Retaining globally distributed high availability

15  

•  Func=onal  •  Shard  your  database  func=onally  

•  Reads  •  Add  more  slaves  (keep  them  coming!)  

•  Writes  •  More  disks  •  Horizontal  par==oning  •  Federated  par==ons  

Scale out

Page 16: Retaining globally distributed high availability

16  

•  Breaking  up  tables  in  small  parts  on  the  same  host  •  Par==oned  on  a  column  •  Infinite  growth  (as  long  as  you  add  diskspace)  •  Less  used  data  to  slower  (cheaper)  disks  •  No  stored  procedures,  func=ons,  etc  •  Uneven  usage  of  par==ons  (hash  par==on  may  help)  •  Once  wrihen,  data  remains  on  the  par==on  

Horizontal partitioning

Page 17: Retaining globally distributed high availability

17  

•  Breaking  up  your  table  in  parts  on  mul=ple  hosts  •  Par==oned  on  a  column  •  Infinite  growth  (as  long  as  you  add  hosts)  •  Less  used  data  on  slower  hosts  •  Not  supported  in  (standard)  MySQL  •  Par==oning  on  applica=on  level  (or  proxy)  •  Alterna=vely:  NDB  

•  Uneven  usage  of  par==ons  •  Once  wrihen  data  (mostly)  remains  on  the  par==on  •  Parallel  queries  to  retrieve  data  from  all  shards  

Federated partitions (sharding)

Page 18: Retaining globally distributed high availability

18  

•  Parallel  execu=on  of  sequen=al  jobs  •  Limited  by  the  weakest  link  •  As  fast  as  the  slowest  node  •  Fix:  nonsequen=al  (asynchronous)  execu=on  

Amdahl's law

Page 19: Retaining globally distributed high availability

19  

Typical LAMP stack

Client  

Webserver  

PHP  

MySQL  

Memcache  

Webserver  

PHP  

Loadbalancer  

Page 20: Retaining globally distributed high availability

20  

A-typical LAMP stack

Client  

Webserver  

PHP  

MySQL  

Memcache  

Webserver  

PHP  

Loadbalancer  

MQ  

Jobs  

Page 21: Retaining globally distributed high availability

Spil Storage Platform Abstrac2ng  the  storage  layer    

Page 22: Retaining globally distributed high availability

22  

•  Dependent  on  one  storage  pla9orm  •  No  more  pla9orm-­‐specific  query  language  

•  Differen=ate  writes    •  Op=mis=c  (asynchronous)  •  Pessimis=c  (synchronous)  

•  Shard  data  beher  •  Par==on  on  user  and  func=on  •  Cluster  informa=on  by  users,  not  by  func=on  

•  Global  expansion  •  Par==on  on  geographic  loca=on  

•  Solve  uneven  usage  of  data  storage  •  Move  data  from  shard  to  shard  

•  Anything  may/could/will  fail  eventually  •  Not  designed  for  the  “happy”  flow  

What was our wishlist?

Page 23: Retaining globally distributed high availability

23  

Old architecture overview

Page 24: Retaining globally distributed high availability

24  

New architecture overview

Page 25: Retaining globally distributed high availability

25  

New architecture overview

Server API

Application Model

Storage platform

Client-side API

Presentation layer

Physical storage

Page 26: Retaining globally distributed high availability

26  

•  Everything  wrihen  in  Erlang  •  Piqi  as  protocol  •  binary  •  JSON  •  XML  

•  SSP  u=lizes  local  caching  (memcache)  •  Flexible  (persistent)  storage  layer  •  MySQL  (various  flavors)  •  Membase/Couchbase  •  Could  be  any  other  storage  product  

•  MQs  (DWH  updates)  

Our building blocks

Page 27: Retaining globally distributed high availability

27  

•  Predictable  •  Reliable  •  Decent  performance  •  Easy  to  comprehend  •  Excellent  eco  system  •  Libraries  •  Monitoring  tools  •  Knowledge  

Why choose MySQL?

Page 28: Retaining globally distributed high availability

28  

•  Func=onal  language  •  High  availability:  designed  for  telecom  solu=ons  •  Excels  at  concurrency,  distribu=on,  fault  tolerance  •  Do  more  with  less!  •  Other  companies  using  Erlang:  

Why Erlang?

Page 29: Retaining globally distributed high availability

29  

•  What  is  the  bucket  model?  •  Each  record  has  one  unique  owner  ahribute  (GID)  •  GID  (Global  IDen=fier)  iden=fying  different  types  •  Bucket(s)  per  func=onality  •  Bucket  is  structured  data  •  Ahributes  contain  data  of  records  •  Ahributes  do  not  have  to  correspond  to  schema  

How do we shard?

Page 30: Retaining globally distributed high availability

30  

$  curl  -­‐X  POST  -­‐H  'Accept:  applica=on/json'  -­‐H  \  'Content-­‐Type:  applica=on/json'  -­‐-­‐data-­‐binary  "{\"gid\":  \  288511851128422401}"  hhp://127.0.0.1:8777/demobucket/get  {      "records":  [          {              "gid":  288511851128422401,              "given_name":  "g",              "registered_on":  1,              "email":  "mail1",              "gender":  "m",              "birthdate":  {  "year":  1963,  "month":  6,  "day":  21  }          }      ],      "meta_info":  {  "total_ct":  1  }  }  

Example bucket

Page 31: Retaining globally distributed high availability

31  

CREATE  TABLE  demobucket  (      gid  bigint(20)  unsigned  not  null,      given_name  varchar(64)  not  null,      registered_on  =nyint(3)  unsigned  default  0,      email  varchar(255)  not  null,      gender  enum(‘m’,  ‘f’,  ‘u’)  not  null  default  ‘m’,      birthdate  date  not  null,      PRIMARY  KEY(gid)  );  

Example bucket MySQL 1

Page 32: Retaining globally distributed high availability

32  

CREATE  TABLE  demobucket  (      gid  bigint(20)  unsigned  not  null,      user_name  varchar(64)  not  null,      user_register  =mestamp  on  update  CURRENT_TIMESTAMP(),      user_emailaddress  varchar(255)  not  null,      user_gender  char(1)  not  null  default  ‘m’,      user_dob  varchar(10)  not  null,      PRIMARY  KEY(gid)  );  

Example bucket MySQL 2

Page 33: Retaining globally distributed high availability

33  

CREATE  COLUMNFAMILY  demobucket  (      gid  int  PRIMARY  KEY,      given_name  varchar,      registered_on  =mestamp,      email  varchar,      gender  varchar,      birth_date  varchar  );  

Example bucket Cassandra

Page 34: Retaining globally distributed high availability

34  

demobucket:get(  #demobucket_get_input{  gid=12345,  filters=  [                            #filter{  ahr=  <<"gender">>        ,  op=  <<"=">>        ,  parms=  {string,  <<"f">>}},                            #filter{  ahr=  <<"registered_on">>,  op=  <<"sort">>,  parms=asc  },                            #filter{  ahr=  <<"gid">>,  op=  <<"limit">>,    parms={int,  10  }}                    ]}  )  

Example Erlang filters

Page 35: Retaining globally distributed high availability

35  

Pipeline flow of a bucket

Page 36: Retaining globally distributed high availability

36  

•  Nearest  datacenter  (DC)  to  the  end  user  •  Satellite  DC  •  Processing  and  caching  •  Do  not  own/store  data  

•  Storage  DC    •  Processing,  caching  and  persistent  storage  •  Store  all  same  user  data  in  same  DC  

•  Par==on  on  user  globally  •  Global  IDen=fier  per  user  

Global distribution

Page 37: Retaining globally distributed high availability

37  

•  Contains  GIDs  and  their  master  DC  •  GIDs  master  DC  predefined  •  Migrated  GIDs  get  updated  

The lookup server

Page 38: Retaining globally distributed high availability

38  

•  Globally  sharded  on  GID  •  (local)  GID  Lookup  

How does this work?

GID lookup

Shard 1 Shard 2

Persistent storage

Page 39: Retaining globally distributed high availability

39  

Master/Satellite DC example

Page 40: Retaining globally distributed high availability

40  

•  Spread  data  even  on  shards  •  Migra=on  of  buckets  between  shards  

•  GID  migra=on  between  DCs  •  Crea=ng  a  new  storage  DC  needs  data  migra=on  •  Users  will  automa=cally  be  migrated  a�er  visi=ng  another  DC  many  =mes  

Why do we need data migration?

Page 41: Retaining globally distributed high availability

41  

•  Versioning  on  bucket  defini=ons    •  GIDs  are  assigned  to  a  bucket  version  •  Data  in  old  bucket  versions  remain  (read  only)  •  New  data  only  gets  wrihen  to  new  bucket  version  •  Updates  migrate  data  to  new  bucket  version  •  Migrates  can  be  triggered  

Seamless schema upgrades

Page 42: Retaining globally distributed high availability

42  

Seamless schema upgrades

Demobucket  v1  

GID  1234  1235  1236  1237  1238  1239  

name  Roy  Moss  Jen  

Douglas  Denholm  Richmond  

Demobucket  v2  

GID              

name              

gender              

GID  1241            

name  Patricia  

         

gender  f            

GID  1241  1235          

name  Patricia  Moss          

gender  f  m          

GID  1234    

1236  1237  1238  1239  

name  Roy    Jen  

Douglas  Denholm  Richmond  

GID  1234      

1237  1238  1239  

name  Roy      

Douglas  Denholm  Richmond  

GID  1241  1235  1236        

name  Patricia  Moss  Jen        

gender  f  m  f        

Page 43: Retaining globally distributed high availability

43  

•  Every  cluster  (two  masters)  will  contain  two  shards  •  Data  wrihen  interleaved  •  HA  for  both  shards  •  No  warmup  needed  

•  Both  masters  ac=ve  and  “warmed  up”  •  Slaves  added  (other  DC)  for  HA  and  backup  

Multi Master writes

SSP  

Shard  1                                      

Shard  2                                      

Page 44: Retaining globally distributed high availability

44  

•  SPAPI  is  in  place  •  SSP  is  (mostly)  running  in  shadow  mode  •  GID  buckets  running  in  produc=on  •  Ac=vity  feed  system  first  to  produc=on  •  Satellite  DC  in  early  2013!  

Where do we stand now?

Page 45: Retaining globally distributed high availability

45  

Page 46: Retaining globally distributed high availability

Questions?

Page 47: Retaining globally distributed high availability

47  

•  Presenta=on  can  be  found  at:  hhp://spil.com/perconalondon2012  •  If  you  wish  to  contact  me:  [email protected]  •  Don’t  forget  to  rate  my  talk!  

Thank you!