Introduction to NoSQL (Not Only SQL) By Antonio Castellón :: Multi-Disciplinary Engineer – Computer Science May, 2015 - for Philip Morris International R&D

NoSQL

Page 1: NoSQL

Introduction to NoSQL

(Not Only SQL)

By Antonio Castellón :: Multi-Disciplinary Engineer – Computer Science

May, 2015 - for Philip Morris International R&D

Page 2: NoSQL

Problem: Complex Data

Page 3: NoSQL

Problem : Data Complex to Model

Page 4: NoSQL

Problem : Dynamic Data ( Uncertainty )

End-user requirements and the data itself can each introduce different kinds of uncertainty.

Page 5: NoSQL

The NoSQL Jungle

Page 6: NoSQL

Data – NoSQL – Different implementations

Currently 150+ implementations

Page 7: NoSQL

Data - NoSQL – Comparing data structure

Image from: http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/

Page 8: NoSQL

Data - NoSQL – Compare

98% of the business requirements

There are still billions of nodes and relationships

Page 9: NoSQL

Data - NoSQL – Keys to fit

               Key-value store  Column Store  Document Store   Graph Database
Performance    High             High          High             Variable
Scalability    High             High          Variable (High)  Variable
Flexibility    High             Moderate      High             High
Complexity     None             Low           Low              High
Functionality  Variable (None)  Minimal       Variable (Low)   Graph Theory

Page 10: NoSQL

Data – Our selection

Graph Databases

Page 11: NoSQL

Data – Graph Databases – Why?

Flexible data structure: it doesn’t matter if the relationships change in the future.

Closer match to business logic

Page 12: NoSQL

Data – Graph Databases – Why?

Natural query system: you say what you want, not how to get it.

with recursive cluster (party, path, depth) as (
    select cast(@userId as character varying), cast(@userId as character varying), 1
    union
    (
        select
            (case
                when this.party = amc.userA then amc.userB
                when this.party = amc.userB then amc.userA
            end),
            (this.path || '.' || (case
                when this.party = amc.userA then amc.userB
                when this.party = amc.userB then amc.userA
            end)),
            this.depth + 1
        from cluster this, chat amc
        where ((this.party = amc.userA and position(amc.userB in this.path) = 0)
            or (this.party = amc.userB and position(amc.userA in this.path) = 0))
            AND this.depth < @depth + 1
    )
)
select party, path
from cluster
where not exists (
    select *
    from cluster c2
    where cluster.party = c2.party
    and (
        char_length(cluster.path) > char_length(c2.path)
        or (char_length(cluster.path) = char_length(c2.path)) and (cluster.path > c2.path)
    )
)
order by party, path;

SQL = several hours to execute

VS

START b = node:User(UserId='Manolo')
MATCH (b)--(friend)--(friendoffriend)
RETURN count(friendoffriend)

Cypher Language = 635ms
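
The Cypher query above walks two hops out from one user. As a rough illustration of why that is a natural operation on a graph, the sketch below does the same two-hop count over an in-memory adjacency map in Python. The sample chat pairs and the names `build_graph` and `count_friends_of_friends` are hypothetical and are not taken from the benchmark. The function counts two-hop paths rather than distinct people and skips paths that step straight back to the starting user, which is an assumption about how the pattern match behaves.

```python
from collections import defaultdict


def build_graph(pairs):
    """Build an undirected adjacency map from (user_a, user_b) pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def count_friends_of_friends(graph, start):
    """Count two-hop paths start -- friend -- friend_of_friend.

    Paths that return directly to `start` are not counted.
    """
    count = 0
    for friend in graph.get(start, ()):
        for friend_of_friend in graph.get(friend, ()):
            if friend_of_friend != start:
                count += 1
    return count


# Hypothetical sample data: who has chatted with whom.
chats = [
    ("Manolo", "Ana"),
    ("Manolo", "Luis"),
    ("Ana", "Eva"),
    ("Luis", "Eva"),
    ("Luis", "Ana"),
]
graph = build_graph(chats)
result = count_friends_of_friends(graph, "Manolo")
print(result)
```

Each hop is a direct lookup of a node's neighbours, so the work grows with the size of the neighbourhood visited rather than with the size of the whole table, which is the intuition behind the timing difference reported on this slide.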

Page 13: NoSQL

Data - Graph Databases – Why?

Fits very well with complex data

Page 14: NoSQL

Data - Graph Databases – Why?

Fits very well with Bio-Informatics

0.9 billion relationships

Page 15: NoSQL

Data – Graph Databases – Why?

Fast prototyping and development: we don’t need to spend much time defining a fine-grained schema up front.

Page 16: NoSQL

Data - Graph Databases – What is it?

Properties

Labels

Relationships
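
The three building blocks listed on this slide are usually described together as a property graph: nodes carry labels and key-value properties, and relationships connect two nodes and may carry properties of their own. A minimal sketch of that model in Python follows. The class names, field names and sample values are hypothetical and do not correspond to any database's API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A graph node with a set of labels and free-form properties."""
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed link between two nodes, with its own properties."""
    start: Node
    end: Node
    type: str
    properties: dict = field(default_factory=dict)


# Hypothetical sample data.
manolo = Node(labels={"User"}, properties={"UserId": "Manolo"})
ana = Node(labels={"User"}, properties={"UserId": "Ana"})
friendship = Relationship(manolo, ana, "FRIEND", {"since": 2015})

# No fixed schema: a label or property can be added to one node
# without touching any other node.
ana.labels.add("Admin")
ana.properties["team"] = "R&D"
```

Because labels and properties live on each individual node and relationship, adding a new kind of information later does not require changing the records that already exist.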

Page 17: NoSQL

Data - Graph Databases - Implemented by …

Page 18: NoSQL

Data - Graph Databases – Top 3

Name: OrientDB
API: Java
Query Methods: Traverser API, Blueprints, Rexster, own SQL-like query language, Gremlin
Consistency: ACID, MVCC
Staff (people) / Community: 3 / Low

Name: Neo4j
API: Java, Python, JPython, Ruby, JRuby, JavaScript (Node.js), PHP, .NET, Django, Clojure, Spring, Scala, or REST (any language)
Query Methods: Cypher (native/preferred), Native Java APIs (special cases), Traverser API, REST, Blueprints, Gremlin
Consistency: ACID
Staff (people) / Community: 42 / Very High

Name: DEX
API: Java, C++, .NET
Query Methods: Native Java, C# and C++ APIs, Blueprints, Gremlin
Consistency: Consistency, durability, and partial isolation and atomicity
Staff (people) / Community: 5 / ?

Page 19: NoSQL

Data - Graph Databases - Neo4j customers

Page 20: NoSQL

Data - Graph Database - Neo4j - Partners

Page 21: NoSQL

End

Thank you for your attention.