Cassandra & Puppet: Scaling data at $15/month
Constant Contact, March 2011
Dave Connors – VP Operations
Jim Ancona – Systems Architect
Mark Schena – Manager Systems Automation
Copyright © 2011 Constant Contact Inc.

Cassandra & puppet, scaling data at $15 per month

DESCRIPTION

Constant Contact shares lessons learned from a DevOps approach to implementing Cassandra to manage social media data for over 400k small business customers. Puppet is critical in our tool chain. The single most important factor was the willingness of Development and Operations to stretch beyond traditional roles and responsibilities.

Page 2: Cassandra & puppet, scaling data at $15 per month

Constant Contact

2000 – 2010

Market leader for Small Businesses
• Email, Event & Survey
• Over 400k paying customers
• No. 134 on the Deloitte Technology Fast 500 listing

Business model
• Many customers pay as little as $15 a month
• ~2 million database transactions per minute

Page 3: Cassandra & puppet, scaling data at $15 per month

Constant Contact

The business problem

Page 4: Cassandra & puppet, scaling data at $15 per month

Constant Contact

Small Businesses are looking to us for help with Social Media marketing

• Social Media = 10-100 times more data

• Challenge with our business model

Page 5: Cassandra & puppet, scaling data at $15 per month

The Key Challenge

Integrate social media data

• Solution = NoSQL

• Cost = Low

• Time to market = ?

Page 6: Cassandra & puppet, scaling data at $15 per month

Implementing NoSQL

Ops and Dev both face issues

• Data model
• Monitoring
• Authentication
• Logging
• Risk profile
• Roles & Responsibilities

Page 7: Cassandra & puppet, scaling data at $15 per month

Dev

Ops

Page 8: Cassandra & puppet, scaling data at $15 per month

Apache Cassandra

• Developed at Facebook
• Open sourced in 2008
• Incubated at Apache
• Became an Apache top-level project in 2010
• http://cassandra.apache.org
• In use at Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick, Cisco, …
• Largest production cluster has over 100 TB of data in over 150 machines

Page 9: Cassandra & puppet, scaling data at $15 per month

What is Cassandra?

• Implemented in Java
• Fault Tolerant
• Elastic
• Durable
• Rich data model
• Replicated data
• Consistency options

Page 10: Cassandra & puppet, scaling data at $15 per month

Replication

How many copies of each piece of data do we want?

N=3

[Figure: cluster diagram showing three copies of item X]
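The N=3 choice on this slide can be made concrete with a small sketch. It assumes the simplest placement rule, where the node that owns a row's token holds the first copy and the next nodes clockwise on the ring hold the others. The function name `replicas_for`, the node names, and the small integer tokens are hypothetical and chosen only for readability; this is an illustration, not Cassandra's actual placement code.

```python
from bisect import bisect_left


def replicas_for(token, ring, n=3):
    """Return the n distinct nodes that hold a copy of the row with this token.

    ring is a list of (node_token, node_name) pairs sorted by token.
    Placement rule (an assumption for this sketch): the first node whose
    token is >= the row token owns the row, and the following nodes
    clockwise hold the remaining copies.
    """
    if n > len(ring):
        raise ValueError("replication factor exceeds cluster size")
    tokens = [t for t, _ in ring]
    # Wrap around to the first node when the row token is past the last node.
    start = bisect_left(tokens, token) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(n)]


# Hypothetical four-node ring with small tokens.
ring = [(0, "node-a"), (25, "node-b"), (50, "node-c"), (75, "node-d")]
print(replicas_for(30, ring))  # ['node-c', 'node-d', 'node-a']
print(replicas_for(80, ring))  # ['node-a', 'node-b', 'node-c']
```

With N=3 on a four-node ring, any single node can fail and every row still has two live copies.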

Page 11: Cassandra & puppet, scaling data at $15 per month

Consistency Level ONE

[Figure: a Writer and a Reader accessing replicas that hold values X and Y]

Page 12: Cassandra & puppet, scaling data at $15 per month

Consistency Level Quorum

[Figure: a Writer and a Reader accessing replicas that hold values X and Y]
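The difference between consistency levels ONE and QUORUM comes down to simple arithmetic. A read is guaranteed to touch at least one replica that received the latest write only when the number of replicas read (R) plus the number written (W) exceeds the replication factor (N). The helper names below are hypothetical; the sketch only illustrates that rule.

```python
def quorum(n):
    """Number of replicas that make up a majority of n."""
    return n // 2 + 1


def reads_see_latest_write(n, write_acks, read_acks):
    """True when every read set must overlap every write set (R + W > N)."""
    return read_acks + write_acks > n


N = 3
print(quorum(N))                                        # 2
print(reads_see_latest_write(N, 1, 1))                  # False: ONE write, ONE read
print(reads_see_latest_write(N, quorum(N), quorum(N)))  # True: QUORUM write, QUORUM read
```

With N=3, a QUORUM operation needs two replicas, so one replica can be down and both reads and writes still succeed.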

Page 13: Cassandra & puppet, scaling data at $15 per month

Risks and Mitigation

Risks
• Moving target
• Developer unfamiliarity
• Operational procedures
• Reliability concerns

Mitigation
• Deployment automation
• Community involvement
• Training/Consulting
• Application selection
• Lots of monitoring
• Phased rollout

Page 14: Cassandra & puppet, scaling data at $15 per month

Development Challenges

• Understanding the data model
• Choosing a client
  ■ Clients available for Java, Python, .NET, Ruby, PHP
  ■ Don’t use Thrift
• Moving target

Page 15: Cassandra & puppet, scaling data at $15 per month

Open Source

• Not “one neck to wring”
• Paid support and training are available: http://datastax.com
• Community
  ■ Mailing lists
  ■ IRC #cassandra at freenode
• Contribute

Page 16: Cassandra & puppet, scaling data at $15 per month

Phased Rollout

• Switchable modes
• Mirroring
• Dial-able traffic
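One way to read these three bullets is as a small routing layer in front of the old and new data stores. The sketch below is an assumption about how such a layer could look, not a description of Constant Contact's implementation. The class name `RolloutRouter`, the mode names, and the store labels are all hypothetical.

```python
import zlib


class RolloutRouter:
    """Hypothetical router that decides which store serves a request."""

    MODES = ("legacy", "mirror", "new")

    def __init__(self, mode="legacy", dial_percent=0):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        if not 0 <= dial_percent <= 100:
            raise ValueError("dial_percent must be between 0 and 100")
        self.mode = mode                  # switchable mode
        self.dial_percent = dial_percent  # share of traffic that is mirrored

    def _in_dial(self, customer_id):
        # Stable bucket in 0..99, so a customer stays on the same side of the dial.
        bucket = zlib.crc32(str(customer_id).encode("utf-8")) % 100
        return bucket < self.dial_percent

    def write_targets(self, customer_id):
        """Stores that should receive a write for this customer."""
        if self.mode == "legacy":
            return ["legacy"]
        if self.mode == "new":
            return ["new"]
        # Mirror mode: always write the legacy store, copy dialed traffic to the new one.
        return ["legacy", "new"] if self._in_dial(customer_id) else ["legacy"]

    def read_source(self, customer_id):
        """Reads stay on the legacy store until the mode is switched to 'new'."""
        return "new" if self.mode == "new" else "legacy"


router = RolloutRouter(mode="mirror", dial_percent=100)
print(router.write_targets(42))  # ['legacy', 'new']
print(router.read_source(42))    # legacy
```

Turning the dial up gradually exposes the new store to production write load while the legacy store remains the source of truth, and switching the mode back is a configuration change rather than a redeploy.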

Page 17: Cassandra & puppet, scaling data at $15 per month

Collaboration

• Big, complex project
• Close collaboration
• Flexible roles
• Ability to iterate

Page 18: Cassandra & puppet, scaling data at $15 per month

Dev

Ops

Page 19: Cassandra & puppet, scaling data at $15 per month

“Are you sure you really want that?”

• 3 × 500 GB disks
• 1 × 250 GB disk
• No swap
• RAID Zero Root Partition and Data Storage
• 32 GB Memory

Page 20: Cassandra & puppet, scaling data at $15 per month

We will need how many servers?

Page 21: Cassandra & puppet, scaling data at $15 per month

How many nodes?

• Quorum = 3
• Multiple Datacenters = 2
• Use only half the available disk = 2
• 12 Servers = ~1 TB of Data Storage
• ~6 TB of Data Storage

3 x 2 = 6
6 x 2 = 12
12 x 6 = 72
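The sizing arithmetic on this slide can be written as a short calculation. The sketch reads "Quorum = 3" as three replicas per datacenter and assumes roughly 1 TB of raw capacity per node, which is the figure that reproduces the slide's "12 Servers = ~1 TB". The function name `nodes_needed` and its parameter names are hypothetical.

```python
import math


def nodes_needed(target_tb, replicas=3, datacenters=2, disk_headroom=2,
                 raw_tb_per_node=1.0):
    """Estimate node count for target_tb of unique data.

    Each unique terabyte is multiplied by the replica count, by the number
    of datacenters, and by a headroom factor (2 means only half of each
    disk is filled). raw_tb_per_node is an assumed value for this sketch.
    """
    multiplier = replicas * datacenters * disk_headroom  # 3 x 2 x 2 = 12
    return math.ceil(target_tb * multiplier / raw_tb_per_node)


print(nodes_needed(1))  # 12
print(nodes_needed(6))  # 72
```

Changing any one factor scales the cluster linearly. For example, dropping the second datacenter would halve the count under the same assumptions.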

Page 22: Cassandra & puppet, scaling data at $15 per month

Random Partitioner
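As a rough illustration of what a random partitioner does, the sketch below hashes a row key with MD5 and treats the result as a position on a ring of size 2^127. Because hashed keys spread evenly, nodes can be given evenly spaced tokens. The function names are hypothetical, and the code is a simplified model rather than Cassandra's own implementation.

```python
import hashlib

RING_SIZE = 2 ** 127


def token_for(key: bytes) -> int:
    """Map a row key to a ring position in the range 0..2**127."""
    digest = hashlib.md5(key).digest()
    # Interpret the 16-byte digest as a signed integer and take its absolute value.
    return abs(int.from_bytes(digest, "big", signed=True))


def evenly_spaced_tokens(node_count):
    """Initial tokens that split the ring into equal ranges, one per node."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


print(token_for(b"customer:1001") <= RING_SIZE)       # True
print(evenly_spaced_tokens(4)[1] == RING_SIZE // 4)   # True
```

The trade-off is that rows are no longer stored in key order, so range scans over raw keys are not meaningful, while load is balanced without knowing the key distribution in advance.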

Page 23: Cassandra & puppet, scaling data at $15 per month

Tool Chain

Page 24: Cassandra & puppet, scaling data at $15 per month

DevOps with Puppet

• Puppet is the shared framework between Operations and Development
• Versioning of Puppet code allows for adoption of development best practices
• Leverage domain-specific knowledge and skill

Page 25: Cassandra & puppet, scaling data at $15 per month

Always Move Forward

Page 26: Cassandra & puppet, scaling data at $15 per month

Operational Efficiencies

• Remote logging is a requirement
• Cassandra uses log4j natively
• Resources not available for remote log4j development
• Scribed with Puppet provides the solution

Page 27: Cassandra & puppet, scaling data at $15 per month

Development takes the Operational Lead

• Munin
• JMX trending
• Identify critical data points
• Rapid development of graphs
• Puppet Definitions are used for rapid deployment

Page 28: Cassandra & puppet, scaling data at $15 per month

Sample Munin Graph

Page 29: Cassandra & puppet, scaling data at $15 per month

Example: Munin Puppet Code

define munin::cassandracolumnfamily ( ) {
  include cassandravirtual
  # Realize the virtual File resource titled "jmxbin"
  File <| title == "jmxbin" |>

  $confdir   = "/opt/cassandra-munin-plugins"
  $plugindir = "/etc/munin/plugins"
  $target    = "/opt/cassandra-munin-plugins/jmx_"

  # Match 3 strings separated by periods
  $pattern      = '^([^.]*)[.]([^.]*)[.]([^.]*)$'
  $keyspace     = regsubst($name, $pattern, '\1')
  $columnfamily = regsubst($name, $pattern, '\2')
  $file         = regsubst($name, $pattern, '\3')

  # Plugin configuration file, rendered from a per-attribute template
  file { "${keyspace}_${columnfamily}_${file}.conf":
    owner   => 'root',
    ensure  => 'file',
    group   => 'root',
    type    => 'file',
    path    => "${confdir}/${keyspace}_${columnfamily}_${file}.conf",
    mode    => '644',
    content => template("munin/attribute_${file}.conf.erb"),
    require => [
      Package['munin-node'],
      File['/opt/cassandra-munin-plugins'],
      File['jmxquery'],
    ],
  }

  # Symlink in the Munin plugin directory that points at the shared jmx_ plugin
  file { "$plugindir/${keyspace}_${columnfamily}_${file}":
    ensure  => 'link',
    owner   => 'root',
    group   => 'root',
    mode    => '511',
    type    => 'link',
    target  => "$target",
    require => [
      File['/opt/cassandra-munin-plugins'],
      File["${keyspace}_${columnfamily}_${file}.conf"],
      File['jmxquery'],
      Package['munin-node'],
    ],
  }
}

Page 30: Cassandra & puppet, scaling data at $15 per month

Conclusion

• Cassandra as an appliance
• Development Best Practices with Life Cycle Management
• Traditional vs. Today
  ■ Infrastructure: 4 weeks vs. 4 hours to build 72 nodes
  ■ Development to Deployment: 9 months vs. 3 months
  ■ Cost: Millions vs. 150k

Page 31: Cassandra & puppet, scaling data at $15 per month

Q&A

Thank You!