Graph Databases and Neo4j

Page 1: Graph Database and Neo4j

Graph Databases and Neo4j

Page 2: Graph Database and Neo4j

Data is getting bigger:

“Every 2 days we create as much information as we did up to 2003”

– Eric Schmidt, Google

Page 3: Graph Database and Neo4j

NOSQL

Page 4: Graph Database and Neo4j

Key Value Stores

Most are based on Dynamo: Amazon's Highly Available Key-Value Store

Data Model:

Global key-value mapping

Big scalable Hash Map

Highly fault tolerant (typically)

Examples:

Redis, Riak, Voldemort

Page 5: Graph Database and Neo4j

Pros & Cons

Pros:

Simple data model

Scalable

Cons:

Create your own “foreign keys”

Poor for complex data
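
The "create your own foreign keys" point can be sketched with a plain Python dict standing in for a key-value store. The key naming scheme (`user:1`, `order:100`) and the `orders_for` helper are hypothetical, chosen only for illustration; real stores such as Redis or Riak have their own client APIs.

```python
# A dict stands in for the global key-value mapping (an assumption for illustration).
store = {
    "user:1": {"name": "Alice", "order_keys": ["order:100", "order:101"]},
    "order:100": {"item": "book", "user_key": "user:1"},
    "order:101": {"item": "lamp", "user_key": "user:1"},
}


def orders_for(user_key):
    """Follow hand-maintained 'foreign keys' from a user to its orders."""
    user = store[user_key]
    # The store knows nothing about these references; the application
    # resolves each one with a separate lookup and must keep them consistent.
    return [store[key] for key in user["order_keys"]]


items = [order["item"] for order in orders_for("user:1")]
print(items)
```

Each hop between related records costs one more lookup written by hand, which is why this model becomes awkward for complex data.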

Page 6: Graph Database and Neo4j

Column Family

Most are based on Bigtable: Google's Distributed Storage System for Structured Data

Data Model:

A big table, with column families

Map Reduce for querying/processing

Examples:

HBase, Hypertable, Cassandra

Page 7: Graph Database and Neo4j

Pros & Cons

Pros:

Supports semi-structured data

Naturally Indexed (columns)

Scalable

Cons:

Poor for interconnected data

Page 8: Graph Database and Neo4j

Document Databases

Data Model:

A collection of documents

A document is a key-value collection

Index-centric, lots of map-reduce

Examples:

CouchDB, MongoDB
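
A minimal sketch of the document model and a map-reduce style query, using plain Python dicts as documents. The collection contents and the `map_doc` and `reduce_counts` names are hypothetical; CouchDB and MongoDB each expose their own query interfaces, which are not shown here.

```python
from collections import defaultdict

# Each document is a key-value collection; documents need not share fields.
collection = [
    {"_id": 1, "type": "post", "author": "alice", "tags": ["graph", "nosql"]},
    {"_id": 2, "type": "post", "author": "bob", "tags": ["nosql"]},
    {"_id": 3, "type": "post", "author": "alice"},  # no "tags" field at all
]


def map_doc(doc):
    """Map step: emit a (tag, 1) pair for every tag in one document."""
    for tag in doc.get("tags", []):
        yield tag, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values per key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


pairs = [pair for doc in collection for pair in map_doc(doc)]
tag_counts = reduce_counts(pairs)
print(tag_counts)
```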

Page 9: Graph Database and Neo4j

Pros & Cons

Pros:

Simple, powerful data model

Scalable

Cons:

Poor for interconnected data

Query model limited to keys and indexes

Map reduce for larger queries

Page 10: Graph Database and Neo4j

Graph Databases

Data Model:

Nodes and Relationships

Examples:

Neo4j, OrientDB, InfiniteGraph, AllegroGraph

Page 11: Graph Database and Neo4j

Pros & Cons

Pros:

Powerful data model, as general as RDBMS

Connected data locally indexed

Easy to query

Cons:

Requires rewiring your brain

Page 12: Graph Database and Neo4j

[Figure: data size versus data complexity for key-value stores, Bigtable clones, document databases, graph databases, and relational databases, with a marker labeled "90% of use cases".]

Page 13: Graph Database and Neo4j

Graph Database definition

A graph database uses graph structures with nodes, edges, and properties to represent and store data.

By definition, a graph database is any storage system that provides index-free adjacency. This means that every element contains a direct pointer to its adjacent elements, so no index lookups are necessary.

Graph databases focus on the interconnections between entities.
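
Index-free adjacency can be illustrated with a small Python sketch in which each node object holds direct references to its neighbours. The `Node` class and the `edge_table` contrast are assumptions made for illustration and say nothing about how any particular graph database lays out its storage.

```python
class Node:
    """A node that holds direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []  # direct pointers, no index involved

    def connect(self, other):
        # Record the connection on both endpoints.
        self.neighbours.append(other)
        other.neighbours.append(self)


alice, bob, carol = Node("Alice"), Node("Bob"), Node("Carol")
alice.connect(bob)
bob.connect(carol)

# Index-free adjacency: move from node to node by following references.
adjacent_to_bob = [n.name for n in bob.neighbours]

# Contrast: an edge table that has to be scanned (or indexed) for every hop.
edge_table = [("Alice", "Bob"), ("Bob", "Carol")]
via_lookup = [b if a == "Bob" else a for a, b in edge_table if "Bob" in (a, b)]

print(adjacent_to_bob, via_lookup)
```

Both approaches give the same neighbours; the difference is that the first reads them straight off the node, while the second has to search a separate structure.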

Page 14: Graph Database and Neo4j

Compared with RDBMS

Graph databases are often faster for associative data sets.

They map more directly to the structure of object-oriented applications.

They scale more naturally to large data sets, as they do not typically require expensive join operations.

As they depend less on a rigid schema, they are better suited to managing ad hoc and changing data with evolving schemas.

Page 15: Graph Database and Neo4j

Finding Extended Friends
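
As a sketch of what an extended-friends query involves, the following Python code walks a small friendship graph breadth-first and collects everyone within a given number of hops. The `friends` data and the `extended_friends` function are hypothetical and stand in for whatever the database would do internally.

```python
from collections import deque

# Hypothetical friendship graph stored as adjacency lists.
friends = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Alice", "Dave"],
    "Carol": ["Alice", "Dave"],
    "Dave": ["Bob", "Carol", "Erin"],
    "Erin": ["Dave"],
}


def extended_friends(start, max_depth):
    """Return {person: depth} for everyone reachable within max_depth hops."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if depths[person] == max_depth:
            continue  # do not expand beyond the requested depth
        for friend in friends[person]:
            if friend not in depths:
                depths[friend] = depths[person] + 1
                queue.append(friend)
    del depths[start]  # the starting person is not their own friend
    return depths


result = extended_friends("Alice", 2)
print(result)
```

Each extra hop follows adjacency lists directly, whereas a relational design would typically add one more self-join on a friendship table per level.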

Page 16: Graph Database and Neo4j

Nodes

Nodes represent entities such as people, businesses, accounts, or any other item you might want to keep track of.

Page 17: Graph Database and Neo4j

Properties

Properties are pertinent information that relates to nodes.

Page 18: Graph Database and Neo4j

Edges

Edges are the lines that connect nodes to nodes or nodes to properties, and they represent the relationship between the two.

Most of the important information is really stored in the edges.

Meaningful patterns emerge when one examines the connections and interconnections of nodes, properties and edges.
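
The three building blocks can be sketched as small Python data classes. This assumes the property-graph convention in which properties are key-value pairs attached to nodes and edges; the class names, the `WORKS_AT` relationship type, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    start: Node
    end: Node
    rel_type: str
    properties: dict = field(default_factory=dict)


alice = Node("Person", {"name": "Alice"})
acme = Node("Business", {"name": "Acme"})
edges = [Edge(alice, acme, "WORKS_AT", {"since": 2010})]

# Much of the useful information sits on the edge: who relates to whom, how, and since when.
facts = [
    f"{e.start.properties['name']} {e.rel_type} "
    f"{e.end.properties['name']} since {e.properties['since']}"
    for e in edges
]
print(facts)
```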

Page 21: Graph Database and Neo4j

What is Neo4j?

• A Graph Database

• Property Graph

• Full ACID (atomicity, consistency, isolation, durability)

• High Availability (with Enterprise Edition)

• 32 Billion Nodes, 32 Billion Relationships, 64 Billion Properties

• Embedded Server

• REST API

Page 22: Graph Database and Neo4j

Key Features

• Runs on major platforms: Mac | Windows | Unix

• Extensive documentation

• Active community

• Open Source

Page 23: Graph Database and Neo4j

CYPHER

Cypher is a declarative graph query language that allows for expressive and efficient querying and updating of the graph store, without having to write traversals through the graph structure in code.

Page 24: Graph Database and Neo4j

CYPHER

START: Starting points in the graph, obtained via index lookups or by element IDs.

MATCH: The graph pattern to match, bound to the starting points in START.

WHERE: Filtering criteria.

RETURN: What to return.

CREATE: Creates nodes and relationships.

DELETE: Removes nodes, relationships and properties.

SET: Set values to properties.

FOREACH: Performs updating actions once per element in a list.

WITH: Divides a query into multiple, distinct parts.
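
To make the roles of MATCH, WHERE, and RETURN concrete, the sketch below pairs a small Cypher query (held as a string and not executed) with plain Python that mimics each clause on in-memory data. The labels, the `KNOWS` relationship type, the property names, and the sample data are hypothetical. The Cypher text uses node labels, which assumes a Neo4j version that supports them, and the START clause is omitted from this sketch.

```python
# Cypher text shown for reference only; nothing here talks to a database.
CYPHER_QUERY = """
MATCH (a:Person)-[:KNOWS]->(b:Person)
WHERE a.name = 'Alice' AND b.age > 30
RETURN b.name
"""

# Hypothetical in-memory data standing in for the graph store.
people = {
    "Alice": {"age": 28},
    "Bob": {"age": 35},
    "Carol": {"age": 25},
}
knows = [("Alice", "Bob"), ("Alice", "Carol"), ("Bob", "Carol")]

# MATCH: find every (a)-[:KNOWS]->(b) pair in the graph.
matched = [(a, b) for a, b in knows]

# WHERE: keep only the pairs that satisfy the filtering criteria.
filtered = [(a, b) for a, b in matched if a == "Alice" and people[b]["age"] > 30]

# RETURN: project the requested value from each remaining match.
returned = [b for _, b in filtered]
print(returned)
```

The declarative query states the pattern and the filter; the Python version has to spell out each step, which is the traversal code Cypher is meant to spare the author.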