Cookpad’s Migration Path to AWS Cookpad Inc. Genki Sugawara

Cookpad AWS Seminar


Page 1: Cookpad AWS Seminar

Cookpad’s Migration Path to AWS

Cookpad Inc. Genki Sugawara

Page 2: Cookpad AWS Seminar

About Me

•  My work at Cookpad
    o  Head of Infrastructure
    o  Mission: Building and implementing Cookpad’s infrastructure, always working to improve speed, scalability, availability, backup, and security.
•  Open source work
    o  Development of AWS tools
        •  elasticfox-ec2tag, IAM Fox, R53 Fox
    o  Ruby library development
        •  Zipruby, libarchive, rua, etc.

Page 3: Cookpad AWS Seminar

Contents

•  About Cookpad
•  Why AWS?
•  AWS server and network configuration
•  Migration of service

Page 4: Cookpad AWS Seminar

About Cookpad

Page 5: Cookpad AWS Seminar

About Cookpad

•  Recipe website used by over 15 million people
•  Over 1 million recipes
•  490 million monthly PVs
•  Ruby on Rails + MySQL

Page 6: Cookpad AWS Seminar

About Cookpad

•  PC site
    o  cookpad.com

Page 7: Cookpad AWS Seminar

About Cookpad

•  Mobile site
    o  m.cookpad.com

Page 8: Cookpad AWS Seminar

About Cookpad

•  iPhone
•  Android

Page 9: Cookpad AWS Seminar

About Cookpad

[Figure: PV variation during a single day. X-axis: time of day (0:00 to 23:00); Y-axis: PV]

Page 10: Cookpad AWS Seminar

About Cookpad

[Figure: Variation in PVs across the year. X-axis: month (April through March); Y-axis: PV]

Page 11: Cookpad AWS Seminar

Why move to AWS?

Page 12: Cookpad AWS Seminar

Why AWS?

1.  Speed
2.  Distribution of Work
3.  Cost

Page 13: Cookpad AWS Seminar

Why AWS?

o Development speed

Speed

Distribution

of Work

Cost

Page 14: Cookpad AWS Seminar

Why AWS?

o  New servers currently require several weeks or more to prepare
o  We lack some of the know-how to build our own servers

Page 15: Cookpad AWS Seminar

Why AWS?

o Getting caught up in infrastructure issues causes large delays in releases


Page 16: Cookpad AWS Seminar

Why AWS?

o With AWS, it takes less than 10 minutes to start up an instance.


Page 17: Cookpad AWS Seminar

Why AWS?

o  Ability to distribute work


Page 18: Cookpad AWS Seminar

Why AWS?

[Diagram: Before AWS. App Engineer → Request → Infra Engineer → Prep]

Page 19: Cookpad AWS Seminar

Why AWS?

[Diagram: After AWS. App Engineer → Prep]

Page 20: Cookpad AWS Seminar

Why AWS?

o  Without AWS, distributing work is difficult:
    •  Need infrastructure skills/knowledge
    •  Problems with security & stability
o  With AWS, distribution of work is made possible
    •  Very little specialized skill needed
    •  Security/stability issues can be solved by giving authority where needed

Page 21: Cookpad AWS Seminar

Why AWS?

o  EC2 seems a little too costly


Page 22: Cookpad AWS Seminar

Why AWS?

For example, here’s an unexpected “surprise” in my EC2 monthly statement…

Page 23: Cookpad AWS Seminar

Why AWS?

iDC: Charged according to peak bandwidth

Page 24: Cookpad AWS Seminar

Why AWS?

AWS: Charged by data transmitted (lower cost for sites like Cookpad, which have peak and non-peak times)

Page 25: Cookpad AWS Seminar

Why AWS?

o  Charged by amount of data transmitted
    •  Less costly when the difference between peak & non-peak times is especially large.
o  Does away with excess investment in servers
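The difference between the two billing models can be illustrated with a small calculation. This is only a sketch: the hourly traffic figures and the unit price are hypothetical, and both pricing functions are deliberately simplified stand-ins rather than any provider's actual tariff.

```python
# Hypothetical traffic for one day: GB transferred in each hour,
# with a pronounced evening peak. Illustrative numbers only.
hourly_gb = [2, 1, 1, 1, 1, 2, 4, 6, 6, 8, 10, 12,
             10, 8, 8, 10, 16, 24, 20, 12, 8, 6, 4, 3]


def peak_based_cost(hourly, unit_price):
    """iDC-style billing (simplified): pay for the peak rate in every hour."""
    return max(hourly) * len(hourly) * unit_price


def volume_based_cost(hourly, unit_price):
    """AWS-style billing (simplified): pay only for data actually sent."""
    return sum(hourly) * unit_price


UNIT_PRICE = 1.0  # hypothetical; the same price is used for both models

peak = peak_based_cost(hourly_gb, UNIT_PRICE)
volume = volume_based_cost(hourly_gb, UNIT_PRICE)
print(f"peak-based cost:   {peak:.0f}")
print(f"volume-based cost: {volume:.0f}")
print(f"volume/peak ratio: {volume / peak:.2f}")
```

The larger the gap between peak and off-peak hours, the smaller the ratio becomes, which is the point the slide makes.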

Page 26: Cookpad AWS Seminar

Server & Network Configuration

Page 27: Cookpad AWS Seminar

Server & Network Configuration

Current Network

Network

Security

DNS

AMI

Monitoring

Redundancy

MySQL

Page 28: Cookpad AWS Seminar

Server & Network Configuration

o  Simple 3-layer structure
o  Networks are partitioned at each layer

Page 29: Cookpad AWS Seminar

Server & Network Configuration

EC2’s Network

Page 30: Cookpad AWS Seminar

Server & Network Configuration

o  All servers located in same segment
o  Instead of partitioned networks, security groups are used

Page 31: Cookpad AWS Seminar

Server & Network Configuration

o  Two types of security groups set for instances
    •  Basic
    •  Security groups for each role

Page 32: Cookpad AWS Seminar

Server & Network Configuration

Security group organization/structure

Page 33: Cookpad AWS Seminar

Server & Network Configuration

o  Basic allows for mutual communication between basic ports
    •  ping (icmp)
    •  http
o  Allows access from specific security groups
    •  Health monitoring tools (Nagios, etc.)
    •  Performance monitoring tools (Munin, etc.)

Page 34: Cookpad AWS Seminar

Server & Network Configuration

Security group organization/structure

Page 35: Cookpad AWS Seminar

Server & Network Configuration

o  Security groups for each role
    •  Enables communication between roles themselves
    •  Enables communication between each role and basic.

Page 36: Cookpad AWS Seminar

Server & Network Configuration

o  Enable access from App groups to DB groups

Page 37: Cookpad AWS Seminar

Server & Network Configuration

o  Allows queries from Basic to DNS
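The group-to-group rules described on the preceding slides can be modelled in a few lines. This is a minimal sketch, not the actual EC2 configuration: the group names (`basic`, `app`, `db`, `dns`, `monitoring`), the service labels, and the `nrpe` rule for the monitoring group are assumptions made for illustration.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    src_group: str  # group the traffic comes from
    dst_group: str  # group the traffic is going to
    service: str    # label such as "icmp", "http", "mysql", or "any"


# Rules paraphrased from the slides; all names here are hypothetical.
RULES = {
    Rule("basic", "basic", "icmp"),       # Basic: mutual ping
    Rule("basic", "basic", "http"),       # Basic: mutual http
    Rule("monitoring", "basic", "nrpe"),  # monitoring tools -> every server
    Rule("app", "app", "any"),            # a role may talk to itself
    Rule("db", "db", "any"),
    Rule("app", "db", "mysql"),           # App groups -> DB groups
    Rule("basic", "dns", "dns"),          # Basic -> DNS queries
}


def is_allowed(src_groups, dst_groups, service):
    """True if any (source group, destination group) pair permits the service."""
    for src in src_groups:
        for dst in dst_groups:
            if Rule(src, dst, service) in RULES or Rule(src, dst, "any") in RULES:
                return True
    return False


# Every instance carries "basic" plus the group for its role.
app = {"basic", "app"}
db = {"basic", "db"}
dns = {"basic", "dns"}

print(is_allowed(app, db, "mysql"))
print(is_allowed(db, app, "mysql"))
print(is_allowed(db, dns, "dns"))
```

Because access is expressed between groups rather than IP addresses, a new instance only needs the right group memberships to receive the right access.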

Page 38: Cookpad AWS Seminar

Server & Network Configuration

o  IP addresses are not specified for general access.
o  The one exception is roles accessed from Elastic Load Balancing, for which access from 10.0.0.0/8 is allowed
    •  Cannot specify source IP
    •  Cannot specify security group
o  Start iptables on all servers
    •  Helps eliminate human error

Page 39: Cookpad AWS Seminar

Server & Network Configuration

o  With EC2, internal IP addresses cannot be fixed
    •  Internal IP addresses change when an instance is stopped and restarted
o  Use internal DNS to hide IP addresses behind host names

Page 40: Cookpad AWS Seminar

Server & Network Configuration

o  DNS runs on two servers in an Active-Active configuration
    •  Each is assigned an Elastic IP
o  Each server references DNS with resolv.conf

Page 41: Cookpad AWS Seminar

Server & Network Configuration

o  DNS obtains name tag information and configures domain information
    Ex.) Name:dev → dev.ap-northeast-1.compute.internal
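The tag-to-hostname mapping in the example above can be sketched as follows. The function names and the shape of the `instances` list are hypothetical, and lower-casing the tag is an assumption of this sketch. How the tags are actually fetched from EC2 is not shown in the slides and is left out here.

```python
def internal_hostname(name_tag, region="ap-northeast-1"):
    """Build the internal DNS name for an instance from its Name tag."""
    label = name_tag.strip().lower()  # lower-casing is an assumption
    if not label:
        raise ValueError("Name tag is empty")
    return f"{label}.{region}.compute.internal"


def build_records(instances):
    """Map each instance's internal DNS name to its current internal IP."""
    return {internal_hostname(i["name"]): i["private_ip"] for i in instances}


# Hypothetical tag data, as it might look after being read from EC2.
instances = [
    {"name": "dev", "private_ip": "10.0.1.15"},
    {"name": "app01", "private_ip": "10.0.2.31"},
]

for host, ip in sorted(build_records(instances).items()):
    print(host, ip)
```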

Page 42: Cookpad AWS Seminar

Server & Network Configuration

o  resolv.conf is periodically reset by cron
    •  When internal IP address changes, resolv.conf is reset
    •  If one DNS server stops, it is removed from resolv.conf

Page 43: Cookpad AWS Seminar

Server & Network Configuration

o  Cron looks up each DNS server’s Public DNS Name (the Public DNS Name is fixed by the Elastic IP assignment)

[Diagram: Request Public DNS Name]

Page 44: Cookpad AWS Seminar

Server & Network Configuration

o  The DNS server’s internal IP is obtained as the IP address associated with the Public DNS Name

[Diagram: Acquire Public DNS Name]

Page 45: Cookpad AWS Seminar

Server & Network Configuration

o  The acquired internal IP is written into resolv.conf
o  If the lookup returns no answer, that DNS server is removed from resolv.conf

[Diagram: Write internal IP]
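The cron job described on the last few slides could look roughly like this. It is a sketch under stated assumptions: the host names are placeholders, the resolver is injectable so the logic can be exercised without network access, and actually writing `/etc/resolv.conf` is left to the caller.

```python
import socket


def resolve_or_none(public_dns_name, resolver=socket.gethostbyname):
    """Return the IP for a Public DNS Name, or None if the lookup fails."""
    try:
        return resolver(public_dns_name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def render_resolv_conf(dns_public_names, resolver=socket.gethostbyname):
    """Build resolv.conf content from the DNS servers that still resolve.

    Returns None when no server resolves, so the caller can keep the
    existing file instead of writing an empty one.
    """
    lines = []
    for name in dns_public_names:
        ip = resolve_or_none(name, resolver)
        if ip is not None:
            lines.append(f"nameserver {ip}")
    return "\n".join(lines) + "\n" if lines else None


# Stand-in resolver for demonstration: only the first server answers.
def fake_resolver(name):
    table = {"dns1.example.invalid": "10.0.0.11"}
    if name not in table:
        raise socket.gaierror("lookup failed")
    return table[name]


content = render_resolv_conf(
    ["dns1.example.invalid", "dns2.example.invalid"], fake_resolver
)
print(content, end="")
```

Run from cron with the real resolver, a script like this would pick up a changed internal IP on the next run and drop a DNS server that no longer answers.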

Page 46: Cookpad AWS Seminar

Server & Network Configuration

o  Clean installation of CentOS 5.5
o  Root Device = EBS
o  Currently a mix of 32-bit and 64-bit, but will move to 64-bit only in the future.

Page 47: Cookpad AWS Seminar

Server & Network Configuration

o  An AMI for each role is created from the base AMI
o  Each AMI is given its own version
o  Also implement system management tools such as Chef

Page 48: Cookpad AWS Seminar

Server & Network Configuration

o  System and network health monitoring
    •  Nagios + nrpe
o  Performance monitoring
    •  Munin

Page 49: Cookpad AWS Seminar

Server & Network Configuration

o  Nagios monitors server health status
o  Munin monitors and records server performance data (e.g. CPU usage, load average, etc.)

Page 50: Cookpad AWS Seminar

Server & Network Configuration

o  Started instances are automatically monitored by Nagios and Munin
o  Each instance is given a tag so the appropriate type of monitoring can be identified.

Page 51: Cookpad AWS Seminar

Server & Network Configuration

o  Increasing availability
    •  Mutual monitoring using Elastic IP
    •  Restoration from AMI using Nagios

Page 52: Cookpad AWS Seminar

Server & Network Configuration

Mutual monitoring using Elastic IP

o  Used for Nagios & LDAP redundancy

Page 53: Cookpad AWS Seminar

Server & Network Configuration

o  Monitor the Public DNS Name of each Elastic IP

[Diagram: Monitors Public DNS Name]

Page 54: Cookpad AWS Seminar

Server & Network Configuration

o  No health check is performed if the internal IP address returned is the server’s own.
o  If the returned address differs from the server’s own, the health check is carried out
o  → The backup always performs the health check on the master

[Diagram: Backup performs master health check]

Page 55: Cookpad AWS Seminar

Server & Network Configuration

o  If the master health check fails, the backup assigns the Elastic IP to itself
o  The Elastic IP moves from the master to the backup, which completes the failover

[Diagram: Elastic IP moved to backup]
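The mutual-monitoring logic from the last three slides can be sketched as one decision step that every participating server runs periodically. The three callables are hypothetical placeholders: in practice `resolve_eip` would look up the Elastic IP's Public DNS Name, `is_healthy` would run the actual check, and `take_over_eip` would call the EC2 API to re-associate the address.

```python
def failover_step(own_ip, resolve_eip, is_healthy, take_over_eip):
    """Run one monitoring cycle and report what this server did.

    own_ip        -- this server's internal IP
    resolve_eip   -- callable returning the internal IP behind the Elastic IP
    is_healthy    -- callable(ip) returning True if that server is healthy
    take_over_eip -- callable that moves the Elastic IP to this server
    """
    holder_ip = resolve_eip()
    if holder_ip == own_ip:
        # The address is our own: we are the master, so skip the check.
        return "master"
    if is_healthy(holder_ip):
        return "standby"
    # The master failed its health check: claim the Elastic IP.
    take_over_eip()
    return "took_over"


# Demonstration with stand-in callables.
events = []
state = failover_step(
    own_ip="10.0.0.22",
    resolve_eip=lambda: "10.0.0.21",
    is_healthy=lambda ip: False,
    take_over_eip=lambda: events.append("eip-moved"),
)
print(state, events)
```

Since both servers run the same step, whichever one currently holds the Elastic IP acts as master and the other monitors it, with no separate role configuration.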

Page 56: Cookpad AWS Seminar

Server & Network Configuration

Restoration from AMI using Nagios

o  When a server fails its Nagios health check, it is restored from an AMI
o  Used for Munin, etc.

[Diagram: Monitor → Starts instance → Server (new instance)]

Page 57: Cookpad AWS Seminar

Server & Network Configuration

o  Mutual monitoring using Elastic IP
    •  Applied to the servers whose downtime we most want to minimize
o  Restoration from AMI using Nagios
    •  Applied to servers that can tolerate 5 to 10 minutes of downtime

Page 58: Cookpad AWS Seminar

Server & Network Configuration

o Downtime is longer compared to keepalived, etc.

o Currently looking into redundancy using Heartbeat


Page 59: Cookpad AWS Seminar

Server & Network Configuration

[Diagram: MySQL. Data → Data (Daily)]

Page 60: Cookpad AWS Seminar

Server & Network Configuration

o  EC2 used only for slaves
o  Data in EBS
o  Snapshots of data taken daily

[Diagram: MySQL. Data → Data (Daily)]

Page 61: Cookpad AWS Seminar

Server & Network Configuration

o  New slave created from snapshots

[Diagram: A new DB is started up and its data is restored from the daily snapshot]

Page 62: Cookpad AWS Seminar

Server & Network Configuration

o  Data created from a snapshot has the same replication position
o  Simplifies slave failover

[Diagram: New data (EBS) is created from the snapshot and restored into the new DB]

Page 63: Cookpad AWS Seminar

Service Migration

Page 64: Cookpad AWS Seminar

Service Migration

iDC & EC2 Hybrid

Internet

Page 65: Cookpad AWS Seminar

Service Migration

o  Service access is divided between EC2 & iDC using round robin
o  Reads from the DB are served from EC2
o  Writes to the DB take place in the iDC
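The hybrid phase can be pictured with a short sketch: requests alternate between the two sites, reads go to EC2, and writes go to the iDC. The target names and the verb-based classification are assumptions made for illustration; the slides do not say how the round robin or the read/write split was implemented.

```python
import itertools

# SQL verbs treated as writes in this sketch (an assumption).
WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE")


def db_target(sql):
    """Pick a database for a statement: writes -> iDC, reads -> EC2."""
    parts = sql.split()
    verb = parts[0].upper() if parts else ""
    return "idc-master" if verb in WRITE_VERBS else "ec2-slave"


# Round robin between the two sites for incoming requests.
sites = itertools.cycle(["ec2", "idc"])
first_three = [next(sites) for _ in range(3)]

print(first_three)
print(db_target("SELECT * FROM recipes WHERE id = 1"))
print(db_target("INSERT INTO recipes (title) VALUES ('curry')"))
```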

Page 66: Cookpad AWS Seminar

Service Migration

Moving the master DB to EC2

Internet

Page 67: Cookpad AWS Seminar

Service Migration

o  The master DB is moved to EC2
o  Before the move, iDC access is gradually stopped
o  Finally, the iDC is completely removed.

Page 68: Cookpad AWS Seminar

Thank you!