C* Summit 2013: The Perils and Triumphs of using Cassandra at a .NET/Microsoft Shop by Derek Bromenshenkel and Jeff Smoley

DESCRIPTION

NativeX (formerly W3i) recently transitioned a large portion of their backend infrastructure from Microsoft SQL Server to Apache Cassandra. Today, its Cassandra cluster backs its mobile advertising network supporting over 10 million daily active users that produce over 10,000 transactions per second with an average database request latency of under 2 milliseconds. Come hear our story about how we were successful at getting our .NET web apps to reliably connect to Cassandra. Come learn about FluentCassandra, Snowflake, Hector, and IKVM. It's a story of struggle and perseverance, where everyone lives happily ever after.

reimagining the business of apps

#Cassandra13    

©2013 NativeX Holdings, LLC

The Perils and Triumphs of using Cassandra at a .NET/Microsoft Shop

About the Presenters
Jeff Smoley – Infrastructure Architect
Derek Bromenshenkel – Infrastructure Architect

Agenda
• About NativeX
• Why Cassandra?
• Challenges
• Auto Id Generation
• FluentCassandra
• Hector
• IKVM.NET
• HectorNet
• Reporting Integration

About NativeX
• Formerly W3i
• Home Office in Sartell, MN
    • 75 miles NW of Minneapolis
• Remote Offices in MSP and SF
• 150 Employees

What NativeX Does
• Marketing technology platform that enables developers to build successful businesses around their apps.
• We provide Publishers with a way to monetize and Advertisers with a way to gain distribution.

Mobile Vanity Metrics
• Over 700M unique devices
• 1000s of Apps
• > 100M Monthly Active Users
• > 200GB of data ingest per week

Backstory
• From 100M sessions/quarter to 5B.
• Anticipate 7B sessions in Q2.
• Growth was anticipated.
• Realized infrastructure needed to change to support this.
[Chart: API Requests in billions per quarter, 2011 Q4 through 2013 Q1; vertical axis 0 to 6]

Original OLTP Architecture
• Microsoft SQL Server
    • 2 Node Cluster (Failover)
    • 12 cores / node
    • 192 GB mem / node
• Compellent SAN
    • 172 Tiered Disk
    • SSD, FC, SATA

Objectives
• Scale
    • Horizontal
    • Incremental cost structure
• Resiliency
    • No single point of failure
    • Geographically distributed

What is NoSQL
• Stands for Not Only SQL.
• The NoSQL movement is about understanding problems and focusing on solutions.
• It’s not about silver bullets and black boxes.
• It is about using the right tool for the right problem.

Researched Products
• Compared features like:
    • Distributed / Shared Nothing
    • Multi-Cluster Support
    • Maturity & Popularity
    • Documentation
    • .NET Support

Why Cassandra?
• Multi-node
• Multi-cluster
• Highly Available
• Durable
• Shared Nothing
• Tunable Consistency

Cassandra at NativeX
• C* was not a replacement DB system.
• We continue to use MS SQL Server alongside C*.
    • SQL Server used for storing configuration data.
• C* solves a very specific problem for us.
    • Writing large volumes of data quickly.
    • Reading very specific data out of a large record set.

Challenges
• C* does not have Auto Id generation.
• How to connect to C* with C#?
• Finding a connector with good Failure Tolerance.
• How to integrate our reporting system?

Auto ID Generation
• Pre-existing requirements
    • Unique, 64-bit positive integers
    • Increasing (sortable) a plus
    • Previously SQL Server Identity column
• A Time-based UUID is sortable and unique
    • Changed everything we could
    • The future for us
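
As a rough illustration of why a time-based (version 1) UUID is both unique and sortable, the sketch below uses Python's standard `uuid` module. The helper names are hypothetical. Note that the raw bytes of a version 1 UUID do not sort chronologically, because the low bits of the timestamp come first; the sort key here orders by the embedded timestamp, which is also how Cassandra's TimeUUID comparator behaves.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 UUID timestamps count 100-nanosecond intervals since this date.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def timeuuid_sort_key(u):
    """Order by the embedded 60-bit timestamp, then by raw bytes to break ties."""
    return (u.time, u.bytes)


def timeuuid_to_datetime(u):
    """Recover the creation time stored inside a version 1 UUID."""
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


ids = [uuid.uuid1() for _ in range(5)]
ordered = sorted(ids, key=timeuuid_sort_key)
for u in ordered:
    print(u, timeuuid_to_datetime(u).isoformat())
```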

Auto ID – What are the options?
• SQL dummy table
    • Easy & familiar, but limited
• Pre-generated range
    • Proposed by @mdennis
    • Distributed, but more complicated to implement
• Sharding [Instagram]
    • Discovered too late
    • Unfamiliar with Postgres

We chose Snowflake
• Built by Twitter, Apache 2.0 license
• https://github.com/twitter/snowflake
• “… network service for generating unique ID numbers at high scale …”
• Same motivation; MySQL -> C*
• A few tweaks for our Windows environment

Technical reasons for Snowflake
• Meets all requirements
• Tested in high transaction system
• Java based [Scala] implementation
    • Thrift server
    • Run as a Windows service with Apache Commons Daemon
• Con: Requires Apache ZooKeeper
    • Coordinate the worker id
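
To make the requirements concrete (unique, positive, 64-bit, roughly time-sortable), here is a minimal sketch of a Snowflake-style generator. The bit layout (41-bit millisecond timestamp, 10 bits of worker identity, 12-bit sequence) and the epoch follow Twitter's public description of Snowflake; the class itself is a hypothetical illustration, not the Scala service the slides describe. The worker id is passed in by hand here, whereas the real service coordinates it through ZooKeeper.

```python
import threading
import time

EPOCH_MS = 1288834974657   # custom epoch from Twitter's Snowflake
WORKER_BITS = 10           # datacenter and worker bits merged for brevity (assumption)
SEQUENCE_BITS = 12
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1


class SnowflakeIdGenerator:
    """Hypothetical single-process generator of Snowflake-style IDs."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError("worker_id out of range")
        self._worker_id = worker_id
        self._clock = clock
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards; refusing to generate")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & SEQUENCE_MASK
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                    | (self._worker_id << SEQUENCE_BITS)
                    | self._sequence)
```

Because the timestamp occupies the high bits, IDs from one worker always increase, and IDs from different workers sort by creation time to within clock skew.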

Connecting to Snowflake
• Built our own .NET Snowflake Client
• Snowflake server on each web node
• Local instance is primary
• Round robin failover to other nodes
• Auto failover AND recovery
• “Circuit Breaker” pattern
[Diagram: Servers 1 through 4, each hosting a Web App and a Snowflake (SF) instance]
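
The client behavior listed above (local instance first, round robin failover, automatic recovery via a circuit breaker) could be sketched roughly as follows. This is an illustrative Python sketch, not the .NET client from the talk; the class names, thresholds, and the idea of passing each node as a plain callable are all assumptions.

```python
import time


class CircuitBreaker:
    """Opens after consecutive failures; allows a trial call after a cooldown."""

    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def allows_call(self):
        if self._opened_at is None:
            return True
        # Half-open: once the cooldown has elapsed, let a call probe the node.
        return self._clock() - self._opened_at >= self.reset_after_s

    def record_success(self):
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


class FailoverIdClient:
    """Prefers the local node, then walks the remote nodes round robin."""

    def __init__(self, local, remotes, **breaker_args):
        # `local` and `remotes` are callables returning an ID: hypothetical
        # stand-ins for Thrift calls to a Snowflake server.
        self._nodes = [local] + list(remotes)
        self._breakers = [CircuitBreaker(**breaker_args) for _ in self._nodes]
        self._next_remote = 1

    def _candidates(self):
        yield 0
        n_remote = len(self._nodes) - 1
        for step in range(n_remote):
            yield 1 + (self._next_remote - 1 + step) % n_remote

    def get_id(self):
        for index in self._candidates():
            breaker = self._breakers[index]
            if not breaker.allows_call():
                continue
            try:
                result = self._nodes[index]()
            except Exception:
                breaker.record_failure()
                continue
            breaker.record_success()
            if index != 0:
                # Start the next failover at the following remote node.
                self._next_remote = 1 + index % (len(self._nodes) - 1)
            return result
        raise RuntimeError("no Snowflake node available")
```

The breaker keeps a failed node out of rotation only for the cooldown period, so a node that comes back is picked up again without a restart.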

Connecting to Cassandra with C#
• Thrift alone too low level
• Needs
    • CQL support
    • Active development / support
• Wants
    • ADO.NET / LINQ feel
    • ????
• FluentCassandra is where we started

Vetting FluentCassandra
• Pros
    • Open source - https://github.com/fluentcassandra/fluentcassandra
    • Nick Berardi, project owner, is excellent
    • Designed for CQL
    • Familiar feel
    • Were able to start project development with it

Vetting FluentCassandra
• Cons
    • Immaturity
    • Few users with high transaction system
    • Permanent node blacklisting
    • Lacked auto retry
• Couldn’t live with these limitations
• Tried hiring an independent contractor to help us mature it

Hector: Yes, please
• Popular C* connector
• Use cases matching ours
• Good maturity
• Auto node discovery
• Auto retry
• Auto failure recovery
• Written in Java – major roadblock

Help!
• We knew we still needed help.
• We found a company named Concord.
    • Based out of the Twin Cities.
    • Specialize in System, Process, and Data Integration.
    • http://concordusa.com/

Concord’s Recommendation
• Concord recommended that we use IKVM.NET to port Hector to a .NET assembly.
• They had previous success using IKVM for other Java to .NET ports.
• They felt that maturing FluentCassandra was going to take longer than our timeline allowed.

About the IKVM.NET Project
• http://www.ikvm.net/
• Open Source Project.
• Main contributor is Jeroen Frijters.
    • He is actively contributing to the project.
• License allows for use in commercial applications.

What is IKVM.NET?
• IKVM.NET includes the following components:
    • A Java Virtual Machine implemented in .NET.
    • A .NET implementation of the Java class libraries.
    • Set of tools that enable Java and .NET interoperability.

Uses for IKVM
• Drop-in JVM
    • Included is a distribution of a .NET implementation of a Java Virtual Machine.
    • Allows you to run jar files using the .NET stack.
    • Example: ikvm -jar myapp.jar

Uses for IKVM
• Use Java libraries in your .NET applications
    • Using ikvmc you can compile Java bytecode to .NET IL.
    • Example: ikvmc -target:library mylib.jar

Uses for IKVM
• Develop .NET applications in Java
    • Write code in Java.
    • Compile to JVM bytecode.
    • Use ikvmc to produce a .NET Executable.
    • Can also use .NET APIs in Java code using the ikvmstub application to generate a Java jar file.
    • Example: ikvmstub MyDotNetAssemblyName

Hector Converted to .NET
• Per Concord’s recommendation we chose to compile the Hector jar into a .NET Assembly.
• Hector and all of its dependencies are pulled into one .NET dll that can be referenced by any .NET assembly.
• In addition you will have to reference some core IKVM assemblies.
• Each Java dependency is given its own namespace within the .NET dll.

HectorNet
• Concord also created a dll called HectorNet that wraps some of the Hector behaviors and makes it feel more like .NET.
    • Such as supporting connection strings.
    • Mapping Thrift byte arrays to .NET data types.
    • Mapping to native .NET collections instead of using Java collections.
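
The kind of convenience layer described above can be pictured with a short sketch. It is written in Python rather than C#, and nothing in it reflects HectorNet's actual API: the connection string keys, function names, and schema format are assumptions made for illustration. The decoders rely only on the fact that Cassandra serializes these value types in big-endian form.

```python
import struct
import uuid


def parse_connection_string(text):
    """Split 'Key=Value;Key=Value' pairs into a dict with lowercase keys."""
    settings = {}
    for part in text.split(";"):
        if part.strip():
            key, _, value = part.partition("=")
            settings[key.strip().lower()] = value.strip()
    return settings


# Decoders for a few Cassandra value encodings (all big-endian on the wire).
DECODERS = {
    "bigint": lambda raw: struct.unpack(">q", raw)[0],
    "int": lambda raw: struct.unpack(">i", raw)[0],
    "double": lambda raw: struct.unpack(">d", raw)[0],
    "boolean": lambda raw: raw != b"\x00",
    "text": lambda raw: raw.decode("utf-8"),
    "uuid": lambda raw: uuid.UUID(bytes=raw),
}


def decode_row(columns, schema):
    """Turn {name: bytes} into {name: native value} using a {name: type} schema."""
    return {name: DECODERS[schema[name]](raw) for name, raw in columns.items()}


# Hypothetical connection string; 9160 is Cassandra's Thrift port.
settings = parse_connection_string("Hosts=10.0.0.1:9160,10.0.0.2:9160;Keyspace=ads")
row = decode_row(
    {"id": struct.pack(">q", 42), "name": b"example"},
    {"id": "bigint", "name": "text"},
)
print(settings["hosts"].split(","), row)
```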

Integrating Reporting
[Diagram: data flows from OLTP (C*) through an SSIS ETL process (Extract, Transform, Load) into OLAP (MS SQL) and the SSAS cube]

Integrating Reporting
• The SSIS Extract process uses C# Script Tasks.
• Script Task needs references to HectorNet and all of its dependencies.
• SSIS can only reference assemblies that are in the GAC.
• Assemblies in the GAC have to be Signed.

Why Not DataStax C# Driver?
• We built everything using CQL 2.0.
• Wasn’t ready in time for our launch date.

DSE for the Win!
• We use DataStax Enterprise.
• Mainly for support, which continues to be a life saver.

Thank you!
• We are hiring
    • http://nativex.com/careers/
• Join the MSP C* Meetup
    • http://www.meetup.com/Minneapolis-St-Paul-Cassandra-Meetup/
• Contact us
    • [email protected]
    • [email protected] @breakingtrail
• Slide Deck
    • http://www.slideshare.net/jjsmoley/the-perils-and-triumphs-of-using-cassandra-at-a-netmicrosoft-shop