NativeX (formerly W3i) recently transitioned a large portion of their backend infrastructure from Microsoft SQL Server to Apache Cassandra. Today, its Cassandra cluster backs its mobile advertising network supporting over 10 million daily active users that produce over 10,000 transactions per second with an average database request latency of under 2 milliseconds. Come hear our story about how we were successful at getting our .NET web apps to reliably connect to Cassandra. Come learn about FluentCassandra, Snowflake, Hector, and IKVM. It's a story of struggle and perseverance, where everyone lives happily ever after.
reimagining the business of apps
#Cassandra13
©2013 NativeX Holdings, LLC
The Perils and Triumphs of using Cassandra at a .NET/Microsoft Shop

About the Presenters
• Jeff Smoley – Infrastructure Architect
• Derek Bromenshenkel – Infrastructure Architect

Agenda
• About NativeX
• Why Cassandra?
• Challenges
• Auto Id Generation
• FluentCassandra
• Hector
• IKVM.NET
• HectorNet
• Reporting Integration

About NativeX
• Formerly W3i
• Home Office in Sartell, MN
• 75 miles NW of Minneapolis
• Remote Offices in MSP and SF
• 150 Employees

What NativeX Does
• Marketing technology platform that enables developers to build successful businesses around their apps.
• We provide Publishers with a way to monetize and Advertisers with a way to gain distribution.

Mobile Vanity Metrics
• Over 700M unique devices
• 1000s of Apps
• > 100M Monthly Active Users
• > 200GB of data ingest per week

Backstory
• From 100M sessions/quarter to 5B.
• Anticipate 7B sessions in Q2.
• Growth was anticipated.
• Realized infrastructure needed to change to support this.
[Chart: API Requests (billions) per quarter, 2011 Q4 through 2013 Q1; y-axis 0 to 6]

Original OLTP Architecture
• Microsoft SQL Server
  • 2 Node Cluster (Failover)
  • 12 cores / node
  • 192 GB mem / node
• Compellent SAN
  • 172 Tiered Disk
  • SSD, FC, SATA

Objectives
Scale
• Horizontal
• Incremental cost structure
Resiliency
• No single point of failure
• Geographically distributed

What is NoSQL?
• Stands for Not Only SQL.
• The NoSQL movement is about understanding problems and focusing on solutions.
• It’s not about silver bullets and black boxes.
• It is about using the right tool for the right problem.

Researched Products
• Compared features like:
  • Distributed / Shared Nothing
  • Multi-Cluster Support
  • Maturity & Popularity
  • Documentation
  • .NET Support

Why Cassandra?
• Multi-node
• Multi-cluster
• Highly Available
• Durable
• Shared Nothing
• Tunable Consistency

Cassandra at NativeX
• C* was not a replacement DB system.
  • We continue to use MS SQL Server alongside C*.
  • SQL Server used for storing configuration data.
• C* solves a very specific problem for us.
  • Writing large volumes of data quickly.
  • Reading very specific data out of a large record set.

Challenges
• C* does not have Auto Id generation.
• How to connect to C* with C#?
• Finding a connector with good Failure Tolerance.
• How to integrate our reporting system?

Auto ID Generation
• Pre-existing requirements
  • Unique, 64-bit positive integers
  • Increasing (sortable) a plus
  • Previously SQL Server Identity column
• A Time-based UUID is sortable and unique
  • Changed everything we could
  • The future for us

Auto ID – What are the options?
• SQL dummy table
  • Easy & familiar, but limited
• Pre-generated range
  • Proposed by @mdennis
  • Distributed, but more complicated to implement
• Sharding [Instagram]
  • Discovered too late
  • Unfamiliar with Postgres

We chose Snowflake
• Built by Twitter, Apache 2.0 license
• https://github.com/twitter/snowflake
• “… network service for generating unique ID numbers at high scale..”
• Same motivation; MySQL -> C*
• A few tweaks for our Windows environment

Technical reasons for Snowflake
• Meets all requirements
• Tested in high transaction system
• Java based [Scala] implementation
• Thrift server
• Run as a Windows service with Apache Daemon
• Con: Requires Apache Zookeeper
  • Coordinate the worker id
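
To make the requirements above concrete (unique, positive, 64-bit, roughly sortable), here is a minimal sketch of a Snowflake-style ID generator. It is not the Twitter server code and not what NativeX runs. It assumes the commonly described layout of a 41-bit millisecond timestamp, a 10-bit worker id, and a 12-bit per-millisecond sequence. The class name `SnowflakeIdSketch`, the custom epoch, and the helper methods are illustrative assumptions, and the worker id is simply passed in rather than coordinated through ZooKeeper.

```java
// Illustrative sketch of a Snowflake-style 64-bit ID; all names here are hypothetical.
final class SnowflakeIdSketch {
    // Assumed custom epoch: 2013-01-01T00:00:00Z in milliseconds.
    private static final long EPOCH_MS = 1356998400000L;
    private static final int WORKER_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER_ID = (1L << WORKER_BITS) - 1;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeIdSketch(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("worker id out of range: " + workerId);
        }
        this.workerId = workerId;
    }

    /** Returns a positive ID that is larger than every ID this instance returned before. */
    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // Refuse to issue IDs that could collide with or sort before earlier ones.
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while (now <= lastTimestamp) {
                    now = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        // Timestamp in the high bits keeps IDs sortable; the sign bit stays 0.
        return ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    /** Extracts the worker id field from an ID produced by this sketch. */
    static long workerIdOf(long id) {
        return (id >> SEQUENCE_BITS) & MAX_WORKER_ID;
    }

    public static void main(String[] args) {
        SnowflakeIdSketch generator = new SnowflakeIdSketch(3);
        long id = generator.nextId();
        System.out.println(id + " issued by worker " + workerIdOf(id));
    }
}
```

Because the worker id is part of every ID, two generators only stay collision-free if their worker ids differ, which is presumably why the real service needs something like ZooKeeper to hand them out.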

Connecting to Snowflake
• Built our own .NET Snowflake Client
• Snowflake server on each web node
• Local instance is primary
• Round robin failover to other nodes
• Auto failover AND recovery
• “Circuit Breaker” pattern
[Diagram: four servers (Server 1 to Server 4), each hosting a Web App and a Snowflake (SF) instance]
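
The failover behavior listed above can be sketched roughly as follows. This is not the NativeX .NET client, whose code is not shown in the deck. It is a minimal Java illustration under assumptions: `IdSource` is a hypothetical stand-in for a connection to one Snowflake server, the first source in the list is treated as the local primary, and a failed source is skipped for a fixed cool-down before being tried again (a simplified circuit breaker).

```java
import java.util.List;
import java.util.function.LongSupplier;

// Illustrative sketch: primary-first lookup with round-robin failover and a simple circuit breaker.
final class FailoverIdClientSketch {
    /** Hypothetical stand-in for one Snowflake server connection. */
    interface IdSource {
        long nextId() throws Exception;
    }

    private final List<IdSource> sources; // index 0 is the local (primary) instance
    private final long[] openUntilMs;     // breaker state: skip a source until this time
    private final long coolDownMs;
    private final LongSupplier clock;     // injectable clock, e.g. System::currentTimeMillis
    private int nextRemote = 1;           // round-robin cursor over the remote sources

    FailoverIdClientSketch(List<IdSource> sources, long coolDownMs, LongSupplier clock) {
        if (sources.isEmpty()) {
            throw new IllegalArgumentException("at least one source is required");
        }
        this.sources = sources;
        this.openUntilMs = new long[sources.size()];
        this.coolDownMs = coolDownMs;
        this.clock = clock;
    }

    synchronized long nextId() {
        int n = sources.size();
        // Candidate order: primary first, then remotes starting at the round-robin cursor.
        int[] order = new int[n];
        for (int i = 1; i < n; i++) {
            order[i] = 1 + (nextRemote - 1 + (i - 1)) % (n - 1);
        }
        for (int idx : order) {
            if (clock.getAsLong() < openUntilMs[idx]) {
                continue; // breaker is open: do not hit a source that just failed
            }
            try {
                long id = sources.get(idx).nextId();
                if (idx != 0) {
                    nextRemote = 1 + idx % (n - 1); // spread later failovers across remotes
                }
                return id;
            } catch (Exception e) {
                // Open the breaker; the source is retried automatically after the cool-down.
                openUntilMs[idx] = clock.getAsLong() + coolDownMs;
            }
        }
        throw new IllegalStateException("no Snowflake source available");
    }
}
```

The cool-down is what separates this from permanent blacklisting: a node that failed once comes back into rotation on its own, without a restart or manual intervention.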

Challenges
• C* does not have Auto Id generation.
• How to connect to C* with C#?
• Finding a connector with good Failure Tolerance.
• How to integrate our reporting system?

Connecting to Cassandra with C#
• Thrift alone too low level
• Needs
  • CQL support
  • Active development / support
• Wants
  • ADO.NET / LINQ feel
  • ????
• FluentCassandra is where we started

Vetting FluentCassandra
• Pros
  • Open source - https://github.com/fluentcassandra/fluentcassandra
  • Nick Berardi, project owner, is excellent
  • Designed for CQL
  • Familiar feel
  • Were able to start project development with it

Vetting FluentCassandra
• Cons
  • Immaturity
  • Few users with high transaction systems
  • Permanent node blacklisting
  • Lacked auto retry
  • Couldn’t live with these limitations
  • Tried hiring an independent contractor to help us mature it

Challenges
• C* does not have Auto Id generation.
• How to connect to C* with C#?
• Finding a connector with good Failure Tolerance.
• How to integrate our reporting system?

Hector: Yes, please
• Popular C* connector
• Use cases matching ours
• Good maturity
• Auto node discovery
• Auto retry
• Auto failure recovery
• Written in Java – major roadblock

Help!
• We knew we still needed help.
• We found a company named Concord.
• Based out of the Twin Cities.
• Specialize in System, Process, and Data Integration.
• http://concordusa.com/

Concord’s Recommendation
• Concord recommended that we use IKVM.NET to port Hector to a .NET assembly.
• They had previous success using IKVM for other Java to .NET ports.
• They felt that maturing FluentCassandra was going to take longer than our timeline allowed.

About the IKVM.NET Project
• http://www.ikvm.net/
• Open Source Project.
• Main contributor is Jeroen Frijters.
• He is actively contributing to the project.
• License allows for use in commercial applications.

What is IKVM.NET?
• IKVM.NET includes the following components:
  • A Java Virtual Machine implemented in .NET.
  • A .NET implementation of the Java class libraries.
  • Set of tools that enable Java and .NET interoperability.

Uses for IKVM
• Drop-in JVM
  • Included is a distribution of a .NET implementation of a Java Virtual Machine.
  • Allows you to run jar files using the .NET stack.
  • Example: ikvm -jar myapp.jar

Uses for IKVM
• Use Java libraries in your .NET applications
  • Using ikvmc you can compile Java bytecode to .NET IL.
  • Example: ikvmc -target:library mylib.jar

Uses for IKVM
• Develop .NET applications in Java
  • Write code in Java.
  • Compile to JVM bytecode.
  • Use ikvmc to produce a .NET executable.
  • Can also use .NET APIs in Java code by using the ikvmstub application to generate a Java jar file.
  • Example: ikvmstub MyDotNetAssemblyName

Hector Converted to .NET
• Per Concord’s recommendation we chose to compile the Hector jar into a .NET Assembly.
• Hector and all of its dependencies are pulled into one .NET dll that can be referenced by any .NET assembly.
• In addition you will have to reference some core IKVM assemblies.
• Each Java dependency is given its own namespace within the .NET dll.

HectorNet
• Concord also created a dll called HectorNet that wraps some of the Hector behaviors and makes it feel more like .NET.
  • Such as supporting connection strings.
  • Mapping Thrift byte arrays to .NET data types.
  • Mapping to native .NET collections instead of using Java collections.
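
As a rough illustration of what mapping Thrift byte arrays to typed values involves, the sketch below decodes raw column bytes into a long, a string, and a UUID. HectorNet itself is a .NET library and its code is not shown in the deck, so everything here is an assumption for illustration: the class name `ThriftBytesSketch` and its methods are hypothetical, and the encodings assumed are the usual Cassandra ones (8-byte big-endian for long values, UTF-8 for text, 16 bytes for UUIDs).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Illustrative sketch: turning raw column bytes into typed values and back.
final class ThriftBytesSketch {
    /** Decodes an 8-byte big-endian value; ByteBuffer is big-endian by default. */
    static long toLong(byte[] raw) {
        if (raw.length != 8) {
            throw new IllegalArgumentException("expected 8 bytes, got " + raw.length);
        }
        return ByteBuffer.wrap(raw).getLong();
    }

    /** Decodes UTF-8 text. */
    static String toText(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** Decodes a 16-byte UUID: most significant 8 bytes first, then least significant. */
    static UUID toUuid(byte[] raw) {
        if (raw.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes, got " + raw.length);
        }
        ByteBuffer buffer = ByteBuffer.wrap(raw);
        return new UUID(buffer.getLong(), buffer.getLong());
    }

    /** Encodes a long in the same 8-byte big-endian form. */
    static byte[] fromLong(long value) {
        return ByteBuffer.allocate(8).putLong(value).array();
    }

    /** Encodes a UUID in the same 16-byte form that toUuid reads. */
    static byte[] fromUuid(UUID value) {
        return ByteBuffer.allocate(16)
                .putLong(value.getMostSignificantBits())
                .putLong(value.getLeastSignificantBits())
                .array();
    }

    public static void main(String[] args) {
        System.out.println(toLong(fromLong(123456789L)));
        System.out.println(toText("device-id".getBytes(StandardCharsets.UTF_8)));
    }
}
```

A wrapper of this kind keeps byte-level encoding details out of application code, which is the sort of convenience the HectorNet slide describes.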

Challenges
• C* does not have Auto Id generation.
• How to connect to C* with C#?
• Finding a connector with good Failure Tolerance.
• How to integrate our reporting system?

Integrating Reporting
[Diagram: ETL with SSIS. Extract from OLTP (C*), Transform, Load into OLAP (MS SQL); SSAS cube]

Integrating Reporting
• The SSIS Extract process uses C# Script Tasks.
• Script Task needs references to HectorNet and all of its dependencies.
• SSIS can only reference assemblies that are in the GAC.
• Assemblies in the GAC have to be signed.

Why Not DataStax C# Driver?
• We built everything using CQL 2.0.
• Wasn’t ready in time for our launch date.

DSE for the Win!
• We use DataStax Enterprise.
• Mainly for support, which continues to be a life saver.

Thank you!
• We are hiring
  • http://nativex.com/careers/
• Join the MSP C* Meetup
  • http://www.meetup.com/Minneapolis-St-Paul-Cassandra-Meetup/
• Contact us
  • [email protected]
  • [email protected] @breakingtrail
• Slide Deck
  • http://www.slideshare.net/jjsmoley/the-perils-and-triumphs-of-using-cassandra-at-a-netmicrosoft-shop