ScyllaDB in Samsung SDS
Dror Gadot, Kuyul Noh
Samsung SDS
From PoC to contribution, and beyond
Agenda
• Challenges & Solution
• Use Cases
• Technical Validation
• Future Plan
3 / 23
Samsung SDS
• IT Services: Consulting / SI, Infrastructure Outsourcing, Application Outsourcing
• Business Solutions: Enterprise Applications, Enterprise Analytics, Enterprise Mobility
• Logistics BPO: Supply Chain & Logistics
(SI: Systems Integration, BPO: Business Process Outsourcing)
• As an ‘IT Solution & Service Provider’, Samsung SDS plays a pivotal role in improving IT competitiveness across the Samsung Group, helping its companies become top tier in multiple industries.
4 / 23
Samsung SDS (2/2)
Global Presence: 51 global offices in 30 countries
• Global HQ: Seoul, Korea
• SDS China: Beijing, China
• SDS Latin America: Sao Paulo, Brazil
• SDS Asia Pacific: Singapore
• SDS America: New Jersey, USA
• SDS India: New Delhi, India
• SDS Europe: Weybridge, UK
• SDS Middle East: Dubai, UAE
Global Footprint
• 7 Subsidiaries
• 4 SW Centers
• 29 Logistics Offices
• 11 Data Centers
5 / 23
Samsung SDS – Scylla
• Deep evaluation of ScyllaDB solution
• Prepare adoption of ScyllaDB
(improve performance & reduce cost of internal systems)
• Contribution to ScyllaDB code base
• Define additional collaboration schemes
6 / 23
Challenges & Solution
• Challenges of a NoSQL
§ Performance dilemma
- To get higher performance, more servers must be added to a cluster.
- 100~200 nodes in a cluster? Performance/management issues
§ JVM limitation
- A JVM-based application has excellent portability.
- DBMS on JVM? Garbage collection and memory management issues
• Solution
§ No longer JVM-based
§ NUMA-friendly new architecture
§ High-performance network processing
Agenda
• Challenges & Solution
• Use Cases
▸ IoT Platform
▸ Messenger Service
▸ Requirements
• Technical Validation
• Future Plan
8 / 23
IoT Platform (1/2)
• An enterprise IoT platform that manages the entire lifecycle of data to provide analytical insights for business operations.
[Diagram: Enterprise IoT Platform (Brightics™) with E2E Security]
Stages: Edge, Connect, Process, Analyze, Utilize
Data sources: Sensor, Device, PLC, Work Station
Consumers: Operations Manager, Data Scientist, Enterprise System
9 / 23
IoT Platform (2/2)
[Diagram: IoT data processing across the Edge, Connect, Process, Analyze, and Utilize stages]
Data sources: Sensor, Device, Work Station, Video / Smart device
Components shown: Edge Gateway, IoT Connectivity, Data Connectivity, Batch Processing, Real-time Processing, Micro Service Execution, CEP, Analytics Model, Hadoop Eco., In-Memory, Predictive Analytics, Anomaly Detection, Visualization Tools, Enterprise System Interface
10 / 23
Messenger Service (1/2)
• Square Messenger provides a business-optimized communication service to 400,000 users.
• Seamless Communication
§ Real-time conversation on mobile and desktop
§ Always-on messenger service
• Collaboration for Group Chat
§ Collaboration with up to 600 people
§ Follow an existing conversation through the chat history
• Advanced Security
§ Message recall, private conversation
§ Check the message read status
§ Screen lock based on password & fingerprint
§ Screen capture prevention
11 / 23
Messenger Service (2/2)
[Diagram: message data processing in the messenger service]
Clients: Desktop, Android
Components shown: Messaging Interface, Messaging Service, Push, Contact, Presence, Message Processing, Agent, External Service, Analytics, User Management, Authentication
12 / 23
Requirements
• Higher throughput and Lower latency
• Elastic Scalability
• Stability for 24 x 7 services
• Reduce # of Physical Servers
• Minimal code changes to existing applications
Agenda
• Challenges & Solution
• Use Cases
• Technical Validation
▸ Testing Environment
▸ Functional Test
▸ Non-Functional Test
• Future Plan
14 / 23
Testing Environment
[Diagram: test cluster and load agents]
• Nodes #1 ~ #6: each hosts Other and ScyllaDB (base nodes plus additional nodes for scale-out)
• Agents #1 ~ #3: Cassandra-stress
* Software
• OS: CentOS 7.2
• ScyllaDB: 1.0
• Other: 2.1.8
• Cassandra-stress: 2.1.8
※ Replication Factor: 3
* Hardware
• Model: Supermicro 6048R
• CPU: 16 cores
• Main Memory: 64GB
• NIC: 10GB x 4
• Disk: SSD 300GB (RAID 0)
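A load run from one of the agents could be scripted along the following lines. This is only a sketch, not the command the team actually ran: the node addresses, the operation count, and the QUORUM consistency level are hypothetical placeholders, the thread count and read/write split are borrowed from the performance slides, and the option syntax assumes the cassandra-stress tool from the 2.1 line.

```python
def stress_command(nodes, threads, read_pct, write_pct,
                   ops=1_000_000, replication_factor=3, consistency="QUORUM"):
    """Build an argument list for a cassandra-stress mixed read/write run.

    ops and consistency are illustrative defaults, not values from the slides.
    """
    if read_pct + write_pct != 100:
        raise ValueError("read_pct and write_pct must add up to 100")
    return [
        "cassandra-stress",
        "mixed",
        f"ratio(write={write_pct},read={read_pct})",  # relative weights of the operations
        f"n={ops}",                                   # total number of operations
        f"cl={consistency}",                          # consistency level for requests
        "-rate", f"threads={threads}",                # client concurrency
        "-schema", f"replication(factor={replication_factor})",
        "-node", ",".join(nodes),                     # contact points
    ]


# Hypothetical node addresses, for illustration only.
nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
cmd = stress_command(nodes, threads=3000, read_pct=70, write_pct=30)
print(cmd)
```

Passing the list to `subprocess.run(cmd)` avoids having to shell-quote the parentheses in the ratio and schema arguments.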
15 / 23
Testing Scenario
[Data Schema] (Case 1, Case 2)
§ One schema has only 1 column, but the data size varies.
§ The other schema always has fixed columns.
[Testing Items]
• Functional
§ Monitoring tool
§ Data export/import
§ Backup/restore (snapshot)
§ Cassandra compatibility
§ Client connection (cqlsh, thrift)
§ Repair, compaction, etc.
• Non-Functional
§ Performance: by scale-out (3 → 4 → 5 → 6 nodes), by consistency level, by workload, etc.
§ Availability: recovery after seed node down, recovery after 2 nodes down, etc.
§ Scalability: add 1 node after seed node down, add 1~2 nodes under heavy stress, etc.
§ Stability: aging test for 5 days under heavy stress
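With the replication factor of 3 used in this test, the consistency level determines how many replicas of a given partition may be down while a request still succeeds, which is relevant to both the "by consistency level" and the "nodes down" items. A minimal sketch of that arithmetic follows; the function names are illustrative, and only the ONE, QUORUM, and ALL levels are covered.

```python
def required_replicas(level, replication_factor):
    """Number of replicas that must respond for the given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1  # strict majority of the replicas
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


def tolerated_failures(level, replication_factor):
    """Replicas of one partition that can be down while requests still succeed."""
    return replication_factor - required_replicas(level, replication_factor)


for level in ("ONE", "QUORUM", "ALL"):
    print(level, required_replicas(level, 3), tolerated_failures(level, 3))
```

In a cluster with more nodes than the replication factor, two nodes being down does not necessarily mean that two replicas of every partition are down, so this is a per-partition bound.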
16 / 23
Functional Test
Test items and results (Other / ScyllaDB), with remarks
• Monitoring (nodetool, UI etc.): O / O
- Supports Tessera, Riemann-dash UI (Docker container)
• Data migration (data file compatibility): - / O
- Fully compatible with other NoSQL (ver. 2.1.8)
• Client connection (cqlsh, thrift): O / O
- Thrift is supported as of Ver. 1.3
• Repair command: X / O
- Other NoSQL: during manual repair, many time-outs occurred under heavy writing
• cqlsh features: - / △
- Not supported features: Counter type, Secondary Index, Trigger (will be supported at 1.4)
• Compaction features: O / O
- Supports SizeTiered, Leveled, DateTiered types
Item: Other / ScyllaDB
• CQL data types: 8 / 7
• Functions: 5 / 5
• cqlsh commands: 11 / 11
• CQL commands: 28 / 24
※ Updated based on ScyllaDB Ver. 1.3 RC3 (‘16.8.18) (O: fully meets, △: partially meets, X: does not meet)
• Most features work well; a few are under development
17 / 23
Non-Functional Test - Performance(1/2)
(load: 3,000 threads)
[Charts: TPS and Latency (unit: ms), Other vs. ScyllaDB]
• Case1-Read: TPS bars 194,144 and 776,283; latency bars 5.4 and 15.5
• Case1-Write: TPS bars 84,999 and 349,722; latency bars 7.7 and 35.3
• ScyllaDB has 2~8 times higher performance
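The "2~8 times" figure can be checked against the bar values on this slide. The sketch below divides the larger value of each pair by the smaller one. It makes no assumption about which bar belongs to which system, and the grouping of values into the read and write cases follows the order in which they appear on the slide, which is itself an assumption.

```python
# Bar values as they appear on the slide, grouped in order of appearance.
pairs = {
    "Case1-Read TPS": (194_144, 776_283),
    "Case1-Read latency (ms)": (5.4, 15.5),
    "Case1-Write TPS": (84_999, 349_722),
    "Case1-Write latency (ms)": (7.7, 35.3),
}


def ratio(a, b):
    """Ratio of the larger value to the smaller one."""
    return max(a, b) / min(a, b)


ratios = {name: ratio(*values) for name, values in pairs.items()}
for name, r in ratios.items():
    print(f"{name}: {r:.1f}x")
```

All four ratios fall inside the 2~8 range quoted on the slide.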
18 / 23
Non-Functional Test - Performance(2/2)
[Charts: TPS and Latency (unit: ms), Other vs. ScyllaDB]
• Case1 (Read 70%, Write 30%): TPS bars 96,610 and 518,482; latency bars 5.1 and 30.6
• Case2 (Read 50%, Write 50%): TPS bars 69,038 and 407,883; latency bars 1.5 and 43.4
19 / 23
Non-Functional Test – Availability/Scalability
• No issues on availability and scalability
[Availability] Down seed node & rejoin it to cluster (charts: Other, ScyllaDB; events: kill, restart)
→ The TPS decreased for 40~60 ms, and then recovered to the previous TPS over 50~70 ms when the node rejoined.
[Scalability] Add 2 nodes into cluster simultaneously (charts: Other, ScyllaDB; events: add, add)
→ The TPS decreased when the 2 nodes were added, and then increased to the expected TPS after 100 ms in both cases.
20 / 23
Non-Functional Test – Stability
• Remains stable under heavy stress for 5 days
[Charts: TPS, Latency (unit: ms), CPU Usage]
ScyllaDB
• Average TPS: 113,879
• Average Latency: 1.37 ms
Other
• Average TPS: 16,249
• Average Latency: 25.2 ms
※ Data Schema: Case2, Transaction Type: Read 50%, Write 50%, Work Load: 400 threads
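One way to relate the averages above to the stated workload is Little's law, which says that the mean number of requests in flight equals throughput multiplied by mean latency. The sketch below applies it to the slide's numbers. It assumes the reported latency is the mean per-request latency seen by the clients and that each stress thread issues one request at a time; neither is stated on the slide.

```python
def requests_in_flight(tps, latency_ms):
    """Little's law: mean concurrency = throughput (1/s) * mean latency (s)."""
    return tps * latency_ms / 1000.0


scylla_in_flight = requests_in_flight(113_879, 1.37)
other_in_flight = requests_in_flight(16_249, 25.2)

print(f"ScyllaDB: about {scylla_in_flight:.0f} requests in flight")
print(f"Other:    about {other_in_flight:.0f} requests in flight")
```

Under these assumptions the implied concurrency can be compared with the 400 configured threads as a rough plausibility check on the averages.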
Agenda
• Challenges & Solution
• Use Cases
• Technical Validation
• Future Plan
▸ Continuous engagement
▸ Contribution
22 / 23
Continuous engagement
• Ver. 0.10 (Oct. 2015)
§ Feasibility test, requirements discussion
• Ver. 0.17 (Feb. 2016)
§ Functional/performance test
§ Reported bugs (performance drop when two new nodes were added, manual repair/compaction time-out, etc.)
• Ver. 1.0 (Apr. 2016)
§ Use-case-based PoC
§ Reported bugs (large partition data insertion error, major compaction error, etc.)
• Ver. 1.3 (Aug. 2016)
§ New feature test
23 / 23
Future Plan
• Applying to business cases
§ IoT data gathering, message processing, etc.
§ Many new use cases
• Planning to develop additional enterprise features
§ Large cluster management, ScyllaDB as a Service, etc.
• Will contribute to community
§ Monitoring tool
§ Management tool
Wrap Up
▶ Proven Solution, ScyllaDB
▶ Make it Happen Together!
Thank You!
Contact: [email protected]