Cassandra & Puppet: scaling data at $15 per month


DESCRIPTION

Constant Contact shares lessons learned from a DevOps approach to implementing Cassandra to manage social media data for over 400k small business customers. Puppet is the critical piece of our tool chain. The single most important factor was the willingness of Development and Operations to stretch beyond traditional roles and responsibilities.


Copyright © 2011 Constant Contact Inc. 1

Constant Contact
March 2011

Dave Connors – VP Operations
Jim Ancona – Systems Architect
Mark Schena – Manager Systems Automation

Cassandra & Puppet: Scaling data at $15/month


Constant Contact

2000 – 2010

Market leader for Small Businesses
• Email, Event & Survey
• Over 400k paying customers
• No. 134 on the Deloitte Technology Fast 500 listing

Business model
• Many customers pay as little as $15 a month
• ~2 million database transactions per minute


The business problem


Small Businesses are looking to us for help with Social Media marketing

• Social Media 10-100 times more data

• Challenge with our business model


The Key Challenge

Integrate social media data

• Solution = NoSQL

• Cost = Low

• Time to market = ?


Implementation

Ops and Dev both face issues

• Data model
• Monitoring
• Authentication
• Logging
• Risk profile
• Roles & Responsibilities

Implementing NoSQL


Dev

Ops


Apache Cassandra

• Developed at Facebook
• Open sourced in 2008
• Incubated at Apache
• Became an Apache top-level project in 2010

• http://cassandra.apache.org

• In use at Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick, Cisco, …

• Largest production cluster has over 100 TB of data in over 150 machines


What is Cassandra?

• Implemented in Java

• Fault Tolerant
• Elastic
• Durable
• Rich data model
• Replicated data
• Consistency options


Replication

[Diagram: three copies of data item X]

How many copies of each piece of data do we want?

N=3


Consistency Level ONE

[Diagram: Writer and Reader clients with replicas holding values X and Y]


Consistency Level Quorum

[Diagram: Writer and Reader clients with replicas holding values X and Y]
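The trade-off between the two consistency levels can be sketched with a little arithmetic. The snippet below is an illustration added here, not code from the talk. It assumes the usual definitions: a quorum is a majority of the N replicas, and a read is certain to touch at least one replica that saw the latest write when R + W > N. The function names `quorum` and `read_sees_latest_write` are made up for this example.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority of N copies."""
    return replication_factor // 2 + 1


def read_sees_latest_write(n: int, w: int, r: int) -> bool:
    """True when any set of r replicas must overlap any set of w replicas."""
    return r + w > n


N = 3          # replication factor from the Replication slide
Q = quorum(N)  # majority of three replicas

# Consistency level ONE for both writer and reader: the reader may contact
# a replica that the writer has not reached yet.
print("ONE/ONE overlap guaranteed:", read_sees_latest_write(N, w=1, r=1))

# QUORUM for both writer and reader: the two replica sets always intersect.
print("QUORUM size:", Q)
print("QUORUM/QUORUM overlap guaranteed:", read_sees_latest_write(N, w=Q, r=Q))
```

With N=3 this suggests why QUORUM costs an extra replica round trip per operation but removes the stale-read case that ONE allows.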


Risks and Mitigation

• Moving target
• Developer unfamiliarity
• Operational procedures
• Reliability concerns
• Deployment automation
• Community involvement
• Training/Consulting
• Application selection
• Lots of monitoring
• Phased rollout


Development Challenges

Understanding the data model

Choosing a client

■ Clients available for Java, Python, .NET, Ruby, PHP

■ Don’t use Thrift

Moving target


Open Source

• Not “one neck to wring”
• Paid support and training are available: http://datastax.com
• Community
■ Mailing lists
■ IRC #cassandra at freenode
• Contribute


Phased Rollout

• Switchable modes
• Mirroring
• Dial-able traffic
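One possible reading of these three bullets is a thin routing layer in front of the existing store and the new Cassandra-backed one. The sketch below is hypothetical and does not reproduce Constant Contact's implementation. The class name `RolloutRouter`, the mode names, and the `read_percent_new` dial are all assumptions made for illustration.

```python
import zlib


class RolloutRouter:
    """Hypothetical router between a legacy store and a new store."""

    MODES = ("legacy_only", "mirror", "new_only")  # assumed switchable modes

    def __init__(self, mode: str = "legacy_only", read_percent_new: int = 0):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        if not 0 <= read_percent_new <= 100:
            raise ValueError("read_percent_new must be between 0 and 100")
        self.mode = mode
        self.read_percent_new = read_percent_new

    def write_targets(self) -> list:
        """Mirroring: in 'mirror' mode every write goes to both stores."""
        if self.mode == "legacy_only":
            return ["legacy"]
        if self.mode == "mirror":
            return ["legacy", "new"]
        return ["new"]

    def read_target(self, key: str) -> str:
        """Dial-able traffic: send a fixed share of keys to the new store."""
        if self.mode == "legacy_only":
            return "legacy"
        if self.mode == "new_only":
            return "new"
        # Stable bucket in 0..99 so the same key always takes the same path.
        bucket = zlib.crc32(key.encode("utf-8")) % 100
        return "new" if bucket < self.read_percent_new else "legacy"


router = RolloutRouter(mode="mirror", read_percent_new=10)
print(router.write_targets())
print(router.read_target("customer-42"))
```

Hashing the key rather than drawing a random number keeps each key on one path, which makes it easier to compare results between the two stores while the dial is being turned up.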


Collaboration

• Big, complex project
• Close collaboration
• Flexible roles
• Ability to iterate


Dev

Ops


“Are you sure you really want that?”

• 3 500G disks
• 1 250G disk
• No SWAP
• RAID Zero Root Partition and Data Storage
• 32G Memory


We will need how many servers?


How many nodes?

• Quorum = 3
• Multiple Datacenters = 2
• Use only half the available disk = 2
• 12 Servers = ~1 TB of Data Storage
• ~6 TB of Data Storage

3 x 2 = 6
6 x 2 = 12
12 x 6 = 72
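The sizing arithmetic on this slide can be written out as a short calculation. This is a sketch using only the figures shown on the slide. The function name `servers_needed` and the parameter names are assumptions, and the slide's "Quorum = 3" is treated as three copies of each item, matching the earlier N=3.

```python
def servers_needed(replicas: int, datacenters: int, disk_headroom: int,
                   target_tb: int, tb_per_group: int = 1) -> tuple:
    """Return (servers per ~1 TB of data, total servers for the target)."""
    # Each multiplier doubles or triples the hardware behind one unit of data.
    servers_per_group = replicas * datacenters * disk_headroom
    total = servers_per_group * (target_tb // tb_per_group)
    return servers_per_group, total


per_group, total = servers_needed(
    replicas=3,        # "Quorum = 3" on the slide
    datacenters=2,     # multiple datacenters
    disk_headroom=2,   # use only half the available disk
    target_tb=6,       # ~6 TB of data storage
)
print(per_group, total)  # 12 72
```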


Random Partitioner


Tool Chain


DevOps with Puppet

• Puppet is the shared framework between Operations and Development
• Versioning of Puppet code allows for adoption of development best practices
• Leverage domain-specific knowledge and skill

Always Move Forward


Operational Efficiencies

• Remote logging is a requirement
• Cassandra uses log4j natively
• Resources not available for remote log4j development
• Scribed with Puppet provides the solution


Development takes the Operational Lead

• Munin
• JMX trending
• Identify critical data points
• Rapid development of graphs
• Puppet Definitions are used for rapid deployment


Sample Munin Graph


Example: Munin Puppet Code

# Defines one Munin plugin per resource title.
# $name is expected to be three strings separated by periods:
# keyspace.columnfamily.file
define munin::cassandracolumnfamily ( ) {
  include cassandravirtual
  File <| title == "jmxbin" |>

  $confdir   = "/opt/cassandra-munin-plugins"
  $plugindir = "/etc/munin/plugins"
  $target    = "/opt/cassandra-munin-plugins/jmx_"

  # Match 3 strings separated by periods
  $pattern      = '^([^.]*)[.]([^.]*)[.]([^.]*)$'
  $keyspace     = regsubst($name, $pattern, '\1')
  $columnfamily = regsubst($name, $pattern, '\2')
  $file         = regsubst($name, $pattern, '\3')

  # Plugin configuration rendered from a per-attribute template
  file { "${keyspace}_${columnfamily}_${file}.conf":
    owner   => 'root',
    ensure  => 'file',
    group   => 'root',
    type    => 'file',
    path    => "${confdir}/${keyspace}_${columnfamily}_${file}.conf",
    mode    => '644',
    content => template("munin/attribute_${file}.conf.erb"),
    require => [
      Package['munin-node'],
      File['/opt/cassandra-munin-plugins'],
      File['jmxquery'],
    ],
  }

  # Symlink in the Munin plugin directory pointing at the shared jmx_ script
  file { "$plugindir/${keyspace}_${columnfamily}_${file}":
    ensure  => 'link',
    owner   => 'root',
    group   => 'root',
    mode    => '511',
    type    => 'link',
    target  => "$target",
    require => [
      File['/opt/cassandra-munin-plugins'],
      File["${keyspace}_${columnfamily}_${file}.conf"],
      File['jmxquery'],
      Package['munin-node'],
    ],
  }
}
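The three `regsubst` calls in the define split the resource title into its parts. The Python stand-in below is added for illustration only and is not part of the original deck. It applies the same pattern to a made-up title, `Keyspace1.Standard1.livesize`, which is an assumption and not a name taken from the talk. One difference to keep in mind: Puppet's `regsubst` returns the input unchanged when the pattern does not match, whereas this sketch raises an error.

```python
import re

# Same pattern as in the Puppet define: three strings separated by periods.
PATTERN = r'^([^.]*)[.]([^.]*)[.]([^.]*)$'


def split_resource_name(name: str) -> tuple:
    """Split 'keyspace.columnfamily.file' into its three components."""
    match = re.match(PATTERN, name)
    if match is None:
        raise ValueError(f"expected three period-separated parts, got {name!r}")
    return match.groups()


# Hypothetical resource title, chosen only to show the mechanics.
keyspace, columnfamily, file_part = split_resource_name(
    "Keyspace1.Standard1.livesize"
)

# Mirrors the file names the define builds from the three captures.
conf_name = f"{keyspace}_{columnfamily}_{file_part}.conf"
template_name = f"munin/attribute_{file_part}.conf.erb"
print(conf_name)
print(template_name)
```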


Conclusion

• Cassandra as an appliance
• Development Best Practices with Life Cycle Management
• Traditional vs. Today
■ Infrastructure: 4 weeks vs. 4 hours to build 72 nodes
■ Development to Deployment: 9 months vs. 3 months
■ Cost: Millions vs. 150k


Q&A

Thank You!
