The Three Dimensions of NoSQL: Evaluating NoSQL Databases for Enterprise Readiness
William McKnight, Founder, McKnight Consulting Group & GigaOm Analyst
Anil Madan, Sr. Director of Engineering, PayPal
Shane Johnson, Product Marketing Manager, Couchbase
Copyright © 2014 McKnight Consulting Group, LLC All Rights Reserved Slide 2
Unlock Potential
William McKnight President McKnight Consulting Group
The Three Dimensions of NoSQL
About your Speaker, William McKnight
▪ President, McKnight Consulting Group
▪ Frequent keynote speaker and trainer internationally
▪ Consulted to Pfizer, Scotiabank, Teva Pharmaceuticals, Verizon, Fidelity, AIG, Commerzbank, Amerisource Bergen, Dong Energy, TeliaSonera and many other Global 1000 companies
▪ A prolific writer with hundreds of articles, blogs, 25 white papers and 2 books in publication
▪ Focused on delivering business value and solving business problems utilizing proven, streamlined approaches to information management
▪ Leading analyst in Information Management
▪ Award-winning entrepreneur, consultant, IT leader
▪ Consistently ranked top 20 in big data rankings
▪ 20+ years in Information Management
▪ Mentor, Leader, Information Architect
▪ Former Wellpoint Information Technology executive
[Book cover: William McKnight, "Information Management: Strategies for Gaining a Competitive Advantage with Data" (The Savvy Manager's Guide)]
McKnight Consulting Group Client Portfolio
▪ Telecommunications
▪ Finance & Insurance
▪ Healthcare & Pharmaceuticals
▪ Manufacturing, High Tech & Energy
▪ Retail & Consumer Goods
A Quick Summary: Where to do big data analytics?
Data Model
▪ Parallel DB Systems: structured data with known schema
▪ NoSQL: any data will fit in any format; (un)(semi)structured
Hardware Configuration
▪ Parallel DB Systems: purchased as an appliance
▪ NoSQL: assembled from commodity machines
Fault Tolerance
▪ Parallel DB Systems: failures assumed to be rare; no query-level fault tolerance
▪ NoSQL: failures assumed to be common; simple, yet efficient, fault tolerance
Why NoSQL
▪ More data model flexibility
  • JSON, XML as a data model
  • No "schema first" requirement; load first
▪ Faster time to insight from data acquisition
▪ Relaxed ACID
  • Eventual consistency
  • Willing to trade consistency for availability
  • ACID would crush things like storing all clicks, sensor reads, etc.
▪ Low upfront software costs
▪ Programmers love the freedoms
File System Summary
▪ Highly scalable
  • 1000s of nodes and massive (100s of TB) files
  • Large block sizes to maximize sequential I/O performance
  • No use of mirroring or RAID
▪ Reduce cost
  • Use one mechanism (triply replicated blocks) to deal with a wide variety of failure types rather than multiple different mechanisms
▪ Negatives
  • Lack of control over record placement
  • Makes it impossible to employ many optimizations successfully employed by parallel DB systems
Operational Big Data Platform Selection
[Chart: SQL and NoSQL positioned by Data Size and Workload Complexity]
Key-Value Stores
▪ NoSQL OLTP
▪ A record may look like:
  • Book: "Of Mice and Men": Author: "Steinbeck"
▪ Great for unstructured data centered on a single object.
▪ Typically used as a cache for data frequently requested by web applications such as online shopping carts or social-media sites.
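As a minimal sketch of the access pattern on this slide, the following uses a plain Python dict as a stand-in for a key-value store. The `TinyKVStore` class, the key naming scheme, and the shopping-cart entry are hypothetical and chosen only for illustration; a real store would add networking, expiry, and persistence.

```python
from typing import Any, Optional


class TinyKVStore:
    """In-memory stand-in for a key-value store (illustrative only)."""

    def __init__(self) -> None:
        self._data: dict[str, Any] = {}

    def set(self, key: str, value: Any) -> None:
        # The value is opaque to the store; it is addressable only by its key.
        self._data[key] = value

    def get(self, key: str) -> Optional[Any]:
        # A miss returns None, as a cache lookup typically would.
        return self._data.get(key)


store = TinyKVStore()
store.set("book:Of Mice and Men", {"author": "Steinbeck"})
store.set("cart:user-42", ["sku-1", "sku-2"])  # hypothetical shopping-cart entry

book = store.get("book:Of Mice and Men")
missing = store.get("cart:user-99")
```

Every read and write goes through a single key, which is why this model suits caches and other data centered on one object.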
Document Stores
▪ A record may look like:
  • "id" => 12345,
  • "name" => "Jane",
  • "age" => 22,
  • "address" => number => 123, street => Main
▪ Often deployed for web-traffic analysis, social gaming, content stores, user-behavior/action analysis, or log-file analysis in real time.
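The record on this slide can be written as a nested document. The sketch below is an assumption-laden illustration, not any product's API: the `collection` dict and the `find_by_path` helper are hypothetical, and show only that a document store can look inside the value (here, a nested address field), unlike a pure key-value store.

```python
import json

# The record from the slide, written as a JSON-style document.
doc = {
    "id": 12345,
    "name": "Jane",
    "age": 22,
    "address": {"number": 123, "street": "Main"},
}

# Hypothetical in-memory "collection": document ID -> document.
collection = {doc["id"]: doc}


def find_by_path(docs, path, expected):
    """Return documents whose nested field at dotted `path` equals `expected`."""
    matches = []
    for d in docs.values():
        value = d
        for part in path.split("."):
            # Walk one level down; a missing or non-dict level yields None.
            value = value.get(part) if isinstance(value, dict) else None
        if value == expected:
            matches.append(d)
    return matches


on_main = find_by_path(collection, "address.street", "Main")
serialized = json.dumps(doc)  # documents round-trip through JSON text
```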
Wide Column Stores
▪ Key-Value Plus…
▪ Column/Groups Stored Independently
▪ Column Family Values Required
▪ Ideal for Application State Information, Blogging Data
Graph Stores: Emphasizing Relationships as Primary Data
▪ Based on Graph Theory
  • Vertices (nodes), edges (relations) and properties
▪ Navigating social networks, configurations and recommendations
  • e.g., Get the cheapest flights from DFW to SYD leaving on 7/12/13 with a minimum number of stops and each stop less than 2 hours.
  • e.g., Social Networks
▪ Churn and Offer Management
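The flight query on this slide is a constrained shortest-path search over vertices and edges with properties. The following is a rough sketch under stated assumptions: the routes and prices in `FLIGHTS` are made up, `cheapest_route` is a hypothetical helper, and the departure-date and layover-duration constraints are left out for brevity. It runs Dijkstra's algorithm over (airport, flights taken) states to respect a stop limit.

```python
import heapq

# Hypothetical flight graph: vertices are airports, edges carry a price property.
# Routes and prices are invented for illustration.
FLIGHTS = {
    "DFW": [("LAX", 200), ("SYD", 1500), ("HNL", 450)],
    "LAX": [("SYD", 900), ("HNL", 250)],
    "HNL": [("SYD", 500)],
    "SYD": [],
}


def cheapest_route(graph, source, target, max_stops):
    """Return (total_price, path) for the cheapest route with at most max_stops stops."""
    # Heap entries: (price so far, flights taken, airport, path).
    heap = [(0, 0, source, [source])]
    best = {}  # (airport, flights taken) -> cheapest price seen so far
    while heap:
        price, legs, airport, path = heapq.heappop(heap)
        if airport == target:
            return price, path
        if legs > max_stops:
            # One more flight would exceed the allowed number of stops.
            continue
        for nxt, cost in graph[airport]:
            state = (nxt, legs + 1)
            new_price = price + cost
            if new_price < best.get(state, float("inf")):
                best[state] = new_price
                heapq.heappush(heap, (new_price, legs + 1, nxt, path + [nxt]))
    return None


direct = cheapest_route(FLIGHTS, "DFW", "SYD", max_stops=0)
one_stop = cheapest_route(FLIGHTS, "DFW", "SYD", max_stops=1)
```

With this sample data, allowing one stop finds a cheaper route than the direct flight, which is the trade-off the slide's query describes.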
Unlock Potential
Evaluation Criteria for NoSQL Databases
Common Features of NoSQL Databases
▪ High Availability
▪ Cloud Readiness
▪ Parallel Query with MapReduce
▪ Commodity Class Nodes
▪ Eventual Consistency
▪ Distribution with Sharding
▪ Scalability with Scale Out
▪ Schemaless
▪ Open Source
The Ability to Perform
▪ Throughput latency for all operations
▪ Read-intensive workloads
▪ Write-intensive workloads
▪ Balanced workloads
Enabling Performance
▪ Ability to Scale Up
▪ Concurrency Control (Locks, Multiversion Concurrency Control, Coarse Grained versus Fine Grained)
▪ Object Caching (Integrated versus External)
▪ Ability to Use Memory Efficiently
▪ Support for Sharding (Reads and Writes)
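To make the sharding criterion concrete, here is a minimal sketch of generic hash-based sharding, in which both reads and writes for a key are routed to the same shard. It is not a description of any particular product's partitioning scheme; `NUM_SHARDS`, `shard_for`, `put`, and `get` are hypothetical names, and the in-memory dicts stand in for separate nodes.

```python
import zlib

NUM_SHARDS = 4  # hypothetical cluster size

# Each dict stands in for the data held by one node.
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard index using a stable CRC32 hash."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def put(key: str, value) -> None:
    # Writes go to the shard that owns the key.
    shards[shard_for(key)][key] = value


def get(key: str):
    # Reads consult the same shard, so no other node is involved.
    return shards[shard_for(key)].get(key)


for i in range(100):
    put(f"user:{i}", {"n": i})
```

Plain modulo placement like this moves most keys when the shard count changes, which is one reason the rebalancing and node-addition criteria later in the deck matter.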
The Ability to be Agile to the Needs of the Organization
▪ Large Established Community
▪ Ease of Installation and Configuration (hours versus days)
▪ Support for Integration with Apache Hadoop
▪ Support for Deployment to Containers, Virtual Machines, and the Cloud
▪ Continuous Availability with no single point of failure
▪ Read and Write Anywhere
The Ability to Scale
▪ Mirroring Process (Synchronous versus Asynchronous)
▪ Administration Console (GUI, Integrated versus Separate)
▪ Rebalancing (Controllable versus Not)
▪ Node Addition Process (Disruption, Ease)
▪ Topology (One Node Type versus Multiple Node Types)
Enablers for NoSQL
▪ Data Integration
▪ Data Virtualization
▪ Cloud Infrastructure
▪ Data Quality/Governance
▪ Information Architecture
[Diagram: Enterprise Data Virtualization over UnBig (RDBMS) and Big (NoSQL) data; labels: Data Warehouses, Marts & Cubes, Operational Data Stores, Transactional Sources, File Systems, Big Data]
The Three Dimensions of NoSQL
A consulting firm with a 100 percent focus on Information Management.
▪ Strategy: Trusted Advisor, Assessments, Roadmaps, Tool Selections, Program Management
▪ Training: Classes, Workshops
▪ Implementation: Big Data, Data Warehousing/Business Intelligence/Analytics, Master Data Management, Governance/Quality
[email protected] 214-514-1444
© 2014 PayPal Inc. All rights reserved.
Couchbase@PayPal Anil Madan Sr. Director of Engineering [email protected]
26 CURRENCIES SUPPORTED
148M ACTIVE REGISTERED ACCOUNTS
200 MARKETS OFFER PAYPAL
EUROPEAN UNION EURO
AUSTRALIAN DOLLAR
CANADIAN DOLLAR
NEW ZEALAND DOLLAR
HUNGARIAN FORINT
MALAYSIAN RINGGIT
UNITED KINGDOM POUNDS STERLING
HONG KONG DOLLAR
UNITED STATES DOLLAR
TAIWAN NEW DOLLAR
CHINESE RMB
SWEDISH KRONA
SINGAPORE DOLLAR
PHILIPPINE PESO
BRAZILIAN REAL
RUSSIAN RUBLE
NORWEGIAN KRONE
JAPANESE YEN
MEXICAN PESO
TURKISH LIRA
SWISS FRANC
CZECH KORUNA
ISRAELI NEW SHEKEL
DANISH KRONE
THAI BAHT
POLISH ZLOTY
$1 in every $6 spent on e-commerce is spent through PayPal.*
*Source: Morgan Stanley, “eCommerce Disruption: A Global Theme,” January 6, 2013, p.21.
Why Couchbase?
▪ Data volume
  • Online system: 300M – 1B documents @ 10k value size
  • 3-10TB total storage
▪ Data Access
  • Distributed caching
  • Persistence
▪ Data Structure
  • Flexible & Schemaless
▪ Read/Write
  • 50% read/50% write
  • Low latency < 5 msec
▪ Availability and scalability
  • Resilient
  • Multi data center – DR/BCP
  • Linearly Scalable
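The storage range on this slide follows from the document counts and value size. A quick back-of-the-envelope check, assuming "10k" means roughly 10 KB per value in decimal units and ignoring replicas, metadata, and indexes (the `raw_storage_tb` helper is illustrative only):

```python
KB = 1000  # decimal kilobyte; assumes "10k" on the slide means about 10 KB


def raw_storage_tb(documents: int, value_size_bytes: int) -> float:
    """Raw value storage in terabytes, before replication or overhead."""
    return documents * value_size_bytes / 1e12


low = raw_storage_tb(300_000_000, 10 * KB)     # 300M documents
high = raw_storage_tb(1_000_000_000, 10 * KB)  # 1B documents
print(f"{low:.0f} TB to {high:.0f} TB")
```

Under these assumptions the result is 3 TB to 10 TB, which matches the "3-10TB total storage" figure.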
Personalization
[Diagram labels: Data (PayPal, Merchant Inc, 3rd Party); Science (Segments, Models, Hyperlocal, Customer Journeys); Personalized Experiences (Merchants, PayPal, 3rd Parties); Touch points (Online, Mobile, Beacon, Retail)]
Core Pillars
▪ Reach: Connect with 90MM US Shoppers
  • Connect with over 100MM US shoppers via mobile and online
▪ Relevancy: Connect the right offer to the right consumer
  • Connect the right message to the right consumer
▪ Redemption: Drive customers to purchase.
  • Drive shoppers to purchase and close the loop
Real Time Personalization Service (RPS)
[Diagram: identifiers associated with a Profile ID: Social Platform ID, PayPal User ID, 3rd Party User ID, eBay User ID, Email Address, Home Address, Phone Number, IDFA]
Identity & Profile Schema
Identity Record
  Match Key | Value
  Master ID | 123ABC456DEF
  Email     | [email protected], [email protected]
  Ebay ID   | 120AS09812DNE0983
  PayPal ID | 03824AD814912NMD1
Profile Record
  ID           | Gender | HHI   | Age     | PayPal Status | Credit Standing | Account Locked | PayPal Balance
  123ABC456DEF | Male   | $75K+ | 25 – 35 | Active        | Approved        | No             | 10
Identity & Profile Documents
Document id=550e8400-e29b-41d4-a716-446655440000
{
  "matchKeyData": [{"pguid_1234": "45564757"}, {"eguid_5678": "45657556"}],
  "segmentProviders": [
    {
      "name": "paypal",
      "attributes": {"created": 698465466, "updated": 698465466},
      "segments": {
        "pp.signup.recency": "6579696",
        "pp.bml.standing": "Approved",
        "pp.account.locked": "4",
        "pp.account.balance": "10"
      }
    },
    {
      "name": "ebay",
      "attributes": {"created": 698465466, "updated": 759669696},
      "segments": {
        "ebay.gender": "1",
        "ebay.married": "0",
        "ebay.age_range": "2",
        "ebay_hhi": "75"
      }
    }
  ]
}
Lookup entries:
key: "eguid_1234" value: "550e8400-e29b-41d4-a716-446655440000"
key: "pguid_5678" value: "550e8400-e29b-41d4-a716-446655440000"
key: "idfa_90" value: "550e8400-e29b-41d4-a716-446655440000"
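One reading of this slide is a two-step lookup: a match key resolves to a document ID, and that ID fetches the profile document. The sketch below illustrates that reading with plain dicts standing in for the key-value bucket. It uses no Couchbase SDK calls, the `segments_for` helper is hypothetical, and the profile is abbreviated from the slide's example.

```python
MASTER_ID = "550e8400-e29b-41d4-a716-446655440000"

# Lookup entries from the slide: match key -> profile document ID.
lookup = {
    "eguid_1234": MASTER_ID,
    "pguid_5678": MASTER_ID,
    "idfa_90": MASTER_ID,
}

# Profile documents keyed by document ID (abbreviated from the slide).
profiles = {
    MASTER_ID: {
        "segmentProviders": [
            {"name": "paypal",
             "segments": {"pp.bml.standing": "Approved", "pp.account.balance": "10"}},
            {"name": "ebay",
             "segments": {"ebay.gender": "1", "ebay.age_range": "2"}},
        ]
    }
}


def segments_for(match_key: str) -> dict:
    """Resolve a match key to its profile and merge all provider segments."""
    master_id = lookup.get(match_key)  # first get: match key -> document ID
    if master_id is None:
        return {}
    merged = {}
    for provider in profiles[master_id]["segmentProviders"]:  # second get: the profile
        merged.update(provider["segments"])
    return merged


by_idfa = segments_for("idfa_90")
```

Because every identifier points at the same document ID, any touch point that knows one identifier reaches the same profile with two key lookups.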
Where Couchbase Fits In
[Diagram: the Personalization diagram (Data; Science: Segments, Models, Hyperlocal, Customer Journeys; Personalized Experiences; Touch points) annotated with RPS, Write and Read paths, and unidirectional XDCR]
Cookie Functional View
[Diagram: Front Tier (User Interactions, Application Cookie Libraries); Mid Tier (CookieService, Couchbase Client); Data Tier (Couchbase DC A and Couchbase DC B, with XDCR)]
Deployment Model
[Diagram: three Cookie Service instances with Write and Read paths to clusters connected by XDCR; labels: Bidirectional, Unidirectional, Active, Passive]