Introducing VenmoPlus.com Explore your Venmo network Qingpeng “Q.P.” Zhang Insight Data Engineering Fellow

Introducing VenmoPlus.com 6/27 version


Page 1: Introducing VenmoPlus.com 6/27 version

Introducing VenmoPlus.com

Explore your Venmo network

Qingpeng “Q.P.” Zhang

Insight Data Engineering Fellow

Page 2: Introducing VenmoPlus.com 6/27 version

Venmo ≈ Facebook + PayPal

Page 3: Introducing VenmoPlus.com 6/27 version

Demo VenmoPlus.com

http://venmoplus.com:8999/#/

Page 4: Introducing VenmoPlus.com 6/27 version

Pipeline

Historical transactions

Page 5: Introducing VenmoPlus.com 6/27 version

Pipeline

Page 6: Introducing VenmoPlus.com 6/27 version

Historical transactions

Real time transactions

Pipeline

Page 7: Introducing VenmoPlus.com 6/27 version

2013

Biggest Challenge:

● Calculate/Query graph distance in real time

Page 8: Introducing VenmoPlus.com 6/27 version

● Cache of 2nd degree friends list
● Partitioned GraphDB
● Good for LinkedIn (hundreds of millions of users, with higher degree)

● 5 million vertices (users)
● 32 million distinct edges (transactions)
● 88 million total edges (transactions)

Page 9: Introducing VenmoPlus.com 6/27 version

● Cache of 2nd degree friends list
● Partitioned GraphDB
● Good for LinkedIn (hundreds of millions of users, with higher degree)

● 5 million vertices (users)
● 32 million distinct edges (transactions)
● 88 million total edges (transactions)

No cache (precalculation)?
No GraphDB?

Page 10: Introducing VenmoPlus.com 6/27 version

Historical transactions

Real time transactions

Two Databases

Page 11: Introducing VenmoPlus.com 6/27 version

Two Databases

420890 Graham Hadley

1630476 Leon Tang

810029 Harminder Toor

1371353 Ephraim Park

562884 Paul Min

420890 set(14935158, 562884)

1630476 set(1371353)

810029 set(190230,14935158)

1371353 set(810029,971156)

562884 set(196371,1371353)
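The two key-value layouts on this slide (userID to name, userID to set of userIDs) can be mimicked with plain Python dictionaries. This is only a minimal sketch under stated assumptions, not the project's code: `names`, `friends`, and `add_transaction` are hypothetical names, the IDs and user names are placeholders, and storing each edge in both directions is an assumption the slides do not confirm.

```python
from collections import defaultdict

# Vertices: userID -> userName (in-memory stand-in for the key-value store)
names = {}
# Edges: userID -> set of userIDs that the user has transacted with
friends = defaultdict(set)

def add_transaction(payer_id, payer_name, payee_id, payee_name):
    """Record one transaction; a repeated pair adds no new distinct edge."""
    names[payer_id] = payer_name
    names[payee_id] = payee_name
    # Assumption: the graph is undirected, so the edge is stored both ways.
    friends[payer_id].add(payee_id)
    friends[payee_id].add(payer_id)

add_transaction(1, "User A", 2, "User B")
add_transaction(1, "User A", 2, "User B")  # same pair again: still one edge
add_transaction(2, "User B", 3, "User C")
```

Because the edge store is a set per user, repeated transactions between the same pair collapse into one distinct edge, which is the difference between "distinct" and "total" edges counted earlier in the deck.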

Page 12: Introducing VenmoPlus.com 6/27 version

Two Databases

Page 13: Introducing VenmoPlus.com 6/27 version

This, or that? - to build graph

Page 14: Introducing VenmoPlus.com 6/27 version

This, or that? - for fast searching

Page 15: Introducing VenmoPlus.com 6/27 version

Historical transactions

Real time transactions

Two Databases

Page 16: Introducing VenmoPlus.com 6/27 version

Lesson learned

Page 17: Introducing VenmoPlus.com 6/27 version
Page 18: Introducing VenmoPlus.com 6/27 version
Page 19: Introducing VenmoPlus.com 6/27 version
Page 20: Introducing VenmoPlus.com 6/27 version

VenmoPlus.com

m4.xlarge

m4.large

m4.xlarge

m4.large

t2.micro

$29.11/day

Page 21: Introducing VenmoPlus.com 6/27 version

About Me

● PhD in Computer Science
● BS in Physics

Volunteers:

● Software Carpentry
● Data Carpentry
● American Red Cross

Christmas Eve 2014, ice storm, Michigan

Page 22: Introducing VenmoPlus.com 6/27 version

Algorithm Optimization

Shortest distance -> intersection of sets (friend lists)

● 1st degree friends of A ∩ 1st degree friends of B == [] ?
● 2nd degree friends of A ∩ 1st degree friends of B == [] ?
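The set-intersection idea above can be sketched in a few lines of Python. This is an illustrative sketch, not the project's implementation: `friend_distance` is a hypothetical name, and `friends` is assumed to be a dictionary mapping each userID to the set of userIDs it is directly connected to, with every edge stored in both directions.

```python
def friend_distance(friends, a, b):
    """Return 0, 1, 2, or 3 as the degree of connection, or None if farther apart."""
    if a == b:
        return 0
    fa = friends.get(a, set())
    fb = friends.get(b, set())
    if b in fa:
        return 1
    # 1st degree friends of A ∩ 1st degree friends of B is non-empty -> distance 2
    if fa & fb:
        return 2
    # Build the 2nd degree friends of A from the friend lists of A's friends.
    second_a = set()
    for f in fa:
        second_a |= friends.get(f, set())
    second_a -= fa | {a}
    # 2nd degree friends of A ∩ 1st degree friends of B is non-empty -> distance 3
    if second_a & fb:
        return 3
    return None

# A small chain 1 - 2 - 3 - 4 - 5, stored in both directions.
chain = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
print(friend_distance(chain, 1, 4))
```

Only the friend lists of A, B, and A's direct friends are read, so no precomputed distance table or cached 2nd degree list is needed.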

Page 23: Introducing VenmoPlus.com 6/27 version

Algorithm Design - 2

Query the distance between two vertices at a historic moment in a constantly changing graph (because we don’t pre-calculate the distance…)

● A recent transaction for a user is already history and has changed the graph
● Query the distance between the two users at that moment

○ (not considering that specific transaction)
○ Remove the influence of that specific transaction temporarily, then restore it

■ Test if that transaction is the first between the pair of users
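One way to read the steps above: if the transaction in question is the first one between the pair, its edge is removed temporarily, the distance is queried, and the edge is restored. The sketch below works under assumptions that the slides do not state: `pair_counts` (a per-pair transaction counter) and all function names are hypothetical, the graph is undirected, and the query runs right after the transaction has been recorded.

```python
from collections import defaultdict

friends = defaultdict(set)      # userID -> set of userIDs (assumed undirected)
pair_counts = defaultdict(int)  # hypothetical: transactions seen per unordered pair

def add_transaction(a, b):
    friends[a].add(b)
    friends[b].add(a)
    pair_counts[frozenset((a, b))] += 1

def distance(a, b):
    """Degree of connection up to 3 via set intersections; None if farther."""
    fa = friends.get(a, set())
    fb = friends.get(b, set())
    if b in fa:
        return 1
    if fa & fb:
        return 2
    second = set().union(*(friends.get(f, set()) for f in fa))
    return 3 if second & fb else None

def distance_before_last_transaction(a, b):
    """Distance between a and b as it was just before their latest transaction."""
    count = pair_counts.get(frozenset((a, b)), 0)
    if count == 0:
        return distance(a, b)   # no transaction between the pair to discount
    if count > 1:
        return 1                # not their first transaction: already connected
    # First transaction between the pair: remove its edge, query, then restore.
    friends[a].discard(b)
    friends[b].discard(a)
    try:
        return distance(a, b)
    finally:
        friends[a].add(b)
        friends[b].add(a)

add_transaction(1, 2)
add_transaction(2, 3)
add_transaction(1, 3)  # first transaction between users 1 and 3
print(distance_before_last_transaction(1, 3), distance(1, 3))
```

The `try`/`finally` restores the edge even if the query raises, so the live graph is never left without a real transaction's edge.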

Page 24: Introducing VenmoPlus.com 6/27 version

#  Service          Instance type  $/hour  $/day
1  Spark            m4.large       0.12    2.88
2  Spark            m4.large       0.12    2.88
3  Redis            m4.xlarge      0.24    5.76
4  Elasticsearch    m4.xlarge      0.24    5.76
5  Elasticsearch    m4.xlarge      0.24    5.76
6  Kafka, producer  m4.large       0.12    2.88
7  Kafka            m4.large       0.12    2.88
8  Webserver        t2.micro       0.013   0.312

https://github.com/qingpeng/VenmoPlus for more details!

Total: $29.11 per 24 hours
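The daily total follows from the per-node figures above: each hourly rate times 24, summed over the eight nodes. A short check, using only the numbers listed on this slide (the variable names are illustrative):

```python
# (service, instance type, USD per hour), copied from the per-node figures above
nodes = [
    ("Spark", "m4.large", 0.12),
    ("Spark", "m4.large", 0.12),
    ("Redis", "m4.xlarge", 0.24),
    ("Elasticsearch", "m4.xlarge", 0.24),
    ("Elasticsearch", "m4.xlarge", 0.24),
    ("Kafka, producer", "m4.large", 0.12),
    ("Kafka", "m4.large", 0.12),
    ("Webserver", "t2.micro", 0.013),
]

# Daily cost per node is the hourly rate times 24 hours.
daily_costs = [rate * 24 for _, _, rate in nodes]
total = sum(daily_costs)
print(f"${total:.2f} per 24 hours")
```

The exact sum is 29.112, which rounds to the $29.11 shown on the slide.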

Page 25: Introducing VenmoPlus.com 6/27 version

Algorithms

Distance detection between vertices in the graph (1st, 2nd, 3rd degree friends?)

● 1st degree friends of A ∩ 1st degree friends of B == [] ?
● 2nd degree friends of A ∩ 1st degree friends of B == [] ?

Page 26: Introducing VenmoPlus.com 6/27 version

[Diagram: Producer [10]; Backend/API [9]; Frontend [9]; remaining component labels not recoverable, numbered [7,8], [1-6], [1-6], [4,5,6], [1], [2,3]]

Page 27: Introducing VenmoPlus.com 6/27 version

Redis:

● Graph Edges: userID -> userID
● Graph Vertices: userID -> userName

In memory DB -> Fast graph updating, graph traversal, in real time

Elasticsearch:

● Everything about the transactions

Distributed -> Data storage and full text search, in real time

Big Challenge:

● Graph distance + Common connections in real time
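With edges kept as one set of userIDs per user, common connections reduce to a single set intersection. The sketch below uses plain dictionaries as a stand-in for the key-value store. `common_connections` is a hypothetical helper, and the IDs and names are placeholders rather than data from the project.

```python
def common_connections(friends, names, a, b):
    """Names of users directly connected to both a and b, sorted for stable output."""
    shared = friends.get(a, set()) & friends.get(b, set())
    # Fall back to the raw ID when a name is missing from the vertex store.
    return sorted(names.get(uid, str(uid)) for uid in shared)

# Placeholder data: edges stored in both directions.
friends = {1: {2, 3, 4}, 2: {1, 5}, 3: {1, 5}, 4: {1}, 5: {2, 3}}
names = {1: "User A", 2: "User B", 3: "User C", 4: "User D", 5: "User E"}

print(common_connections(friends, names, 1, 5))
```

If the friend lists are stored as Redis sets, the Redis `SINTER` command computes the same intersection on the server. Whether the project does this is not stated in the slides.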