CellIQ: Real-Time Cellular Network Analytics at Scale
Anand Iyer (UC Berkeley), Li Erran Li (Bell Labs), Ion Stoica (UC Berkeley)
Cellular networks have seen exponential growth and have become part of our lives
Image courtesy: Alcatel-Lucent
What is needed to solve these issues?
Are some regions in the network hotspots? - Better load balancing
How is user traffic moving in the network? - Better resource provisioning
What are the popular handoff sequences? - Troubleshoot handoff-related problems
Cellular Network Analytics Today
Problem
Existing cellular network analytics systems do not support advanced analytics tasks efficiently.
Challenges
• High Velocity Data
• Continuous Monitoring
• Advanced Tasks
• Timely Spatio-Temporal Analysis
CellIQ is a cellular network analytics system that supports rich analysis
tasks efficiently by leveraging domain-specific optimizations
Cellular Data as Time-Evolving Graphs
Tasks easily expressed in graphs:
• Hotspot computation → Connected components
• Handoff sequences & User traffic → Pregel model
[Figure: example graph with base stations BS1, BS2 and user devices UE1 to UE5; labels: Edge Property, Vertex Property]
Why Not Use a Graph Parallel Framework?
Fails to produce results!
Domain-specific optimizations are key for efficient analysis
CellIQ Implementation
Implemented as a layer on GraphX*
Incorporates several domain-specific optimizations
*Gonzalez et al., “GraphX: Graph Processing in a Distributed Dataflow Framework”, OSDI 2014
[Figure: software stack showing Spark, GraphX, the Pregel API, the GraphX algorithms (PageRank, Connected Comp., K-core, Triangle Count, LDA, SVD++) and CellIQ]
Computational Model
[Figure: successive snapshots of the graph of base stations BS1, BS2 and user devices UE1 to UE5]
Computational Model: GStreams
Domain-specific graph partitioning
Spatial operations
Window operations
Graph Partitioning
Graph computation frameworks rely on partitioning to minimize communication and balance computation
[Figure: example graph with vertices A to F split across Machine 1 and Machine 2]
CellIQ Graph Partitioning
Partition geographically close-by entities
[Figure: vertices A to H on a map, spread over Machines 1 to 4; open question of how to map 2D locations to a 1D key]
Graph Partitioning
Random (hashed) partitioning results in poor spatial locality
[Figure: vertices A to H hash-partitioned across Machines 1 to 4]
CellIQ Graph Partitioning
• Uses Hilbert space-filling curves
• Use curve’s distance as the 1-dimensional key
• Range partition the key space
[Figure: map divided into Hilbert-curve cells, numbered 0 to 3 at the coarse level and 0 to 15 at the finer level; vertices A to H assigned to Machines 1 to 4]
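The three partitioning steps can be sketched in a few lines. This is a minimal illustration rather than CellIQ's implementation: it assumes base station locations have already been snapped to a 2^order x 2^order grid, the names `hilbert_index` and `geo_partition` are hypothetical, and the equal-width range split is an assumed scheme. The curve orientation follows the common textbook construction and may differ from the cell numbering in the figure.

```python
def hilbert_index(order, x, y):
    """Distance of grid cell (x, y) along a Hilbert curve over a 2**order x 2**order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def geo_partition(cells, order, num_partitions):
    """Range-partition the 1D Hilbert key space into equal-width ranges (assumed scheme)."""
    total = (1 << order) ** 2
    width = -(-total // num_partitions)  # ceiling division
    return {name: hilbert_index(order, x, y) // width for name, (x, y) in cells.items()}


# Hypothetical base stations on a 4 x 4 grid, split over 4 machines.
stations = {"A": (0, 0), "B": (1, 0), "C": (0, 3), "D": (1, 3), "E": (3, 3), "F": (3, 0)}
assignment = geo_partition(stations, order=2, num_partitions=4)
```

Because consecutive Hilbert keys always belong to neighboring grid cells, each key range covers a compact region, so nearby base stations tend to land on the same machine.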
GeoGraph API
class GeoGraph[V, E] {
  // Broadcast a message to all
  // vertices within a radius
  def sendMsg(radius)

  // Create a spatially aggregated
  // graph by combining vertices
  // and edges
  def spatialAG(reduceV: (V, V) => V,
                reduceE: (E, E) => E)
}
Tracking user traffic gradients
Goal: Detect and track direction of movement of user groups
[Figure: example graph with vertices A to F, labeled as base stations]
Hop-by-hop propagation is inefficient
Instead, CellIQ enables radius-based broadcast
Routing Table in GraphX enables Multicast
[Figure: GraphX vertex table (RDD), edge table (RDD) and routing table (RDD) for vertices A to F, split into Part. 1 and Part. 2 on Machine 1 and Machine 2; edges: A-B, A-C, C-D, B-C, A-E, A-F, E-F, E-D]
Slide courtesy: Joey Gonzalez
Can compute destination partitions easily due to the use of geo-partitioner
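A small sketch of why a geographic partitioner makes the destination set cheap to compute. It assumes a simplification that is not from the talk: each partition is one cell of a uniform grid (CellIQ range-partitions Hilbert keys instead), so the partitions a broadcast can reach are just the cells overlapping the circle's bounding box. The function names and the `cell_size` parameter are hypothetical.

```python
import math
from collections import defaultdict


def cell_of(point, cell_size):
    """Grid cell (used here as the partition id) containing a 2D point."""
    return (math.floor(point[0] / cell_size), math.floor(point[1] / cell_size))


def destination_cells(center, radius, cell_size):
    """All grid cells overlapping the bounding box of the broadcast circle."""
    lo = cell_of((center[0] - radius, center[1] - radius), cell_size)
    hi = cell_of((center[0] + radius, center[1] + radius), cell_size)
    return {(i, j) for i in range(lo[0], hi[0] + 1) for j in range(lo[1], hi[1] + 1)}


def radius_broadcast(positions, source, radius, cell_size):
    """Group the vertices within `radius` of `source` by destination partition."""
    targets = destination_cells(positions[source], radius, cell_size)
    sx, sy = positions[source]
    recipients = defaultdict(list)
    for vertex, (x, y) in positions.items():
        cell = cell_of((x, y), cell_size)
        if vertex != source and cell in targets and math.hypot(x - sx, y - sy) <= radius:
            recipients[cell].append(vertex)
    return dict(recipients)


# Hypothetical vertex positions; A broadcasts to everything within distance 2.
positions = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (5.0, 5.0), "D": (0.0, 2.5), "E": (-1.5, 0.0)}
plan = radius_broadcast(positions, "A", radius=2.0, cell_size=2.0)
```

The message is delivered in one step to each destination partition, instead of being relayed neighbor by neighbor until it has covered the radius.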
Spatial Clustering
[Figure: example graph with vertices A to F and its spatially aggregated version]
Goal: Combine spatially close-by vertices
Two ways to enable spatial aggregation:
- Using a (supplied) field in properties
- Leveraging the geo-partitioner
[Figure: Hilbert-curve cells labeled 00 to 33, grouped into coarser cells 0 to 3; the first digit of each fine label is its coarse cell]
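The aggregation can be illustrated with plain dictionaries. This is a sketch of the idea behind `spatialAG`, not its implementation: `spatial_aggregate` and the `cell_of` mapping are hypothetical names, the cluster key could come either from a supplied property or from a coarse Hilbert cell, and dropping edges that fall inside a single cluster is an assumed design choice.

```python
def spatial_aggregate(vertices, edges, cell_of, reduce_v, reduce_e):
    """Collapse vertices that share a cell key and merge the edges between clusters.

    vertices: {vertex_id: property}
    edges:    {(src, dst): property}
    cell_of:  {vertex_id: cell key}
    """
    agg_vertices = {}
    for v, prop in vertices.items():
        c = cell_of[v]
        agg_vertices[c] = reduce_v(agg_vertices[c], prop) if c in agg_vertices else prop

    agg_edges = {}
    for (src, dst), prop in edges.items():
        key = (cell_of[src], cell_of[dst])
        if key[0] == key[1]:
            continue  # assumed: edges inside one cluster disappear
        agg_edges[key] = reduce_e(agg_edges[key], prop) if key in agg_edges else prop
    return agg_vertices, agg_edges


# Hypothetical example: per-station load on vertices, handoff counts on edges.
vertices = {"A": 10, "B": 5, "C": 7, "D": 1}
edges = {("A", "B"): 1, ("A", "C"): 2, ("B", "D"): 3, ("C", "D"): 4}
cell_of = {"A": 0, "B": 0, "C": 1, "D": 1}
agg_v, agg_e = spatial_aggregate(vertices, edges, cell_of,
                                 reduce_v=lambda a, b: a + b,
                                 reduce_e=lambda a, b: a + b)
```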
Tracking Persistent Hotspots
Goal: Detect and track groups of base stations with high traffic volume
Equivalent to finding connected components
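As a rough illustration of that equivalence: keep only the base stations whose load exceeds a threshold, then take connected components over the neighbor relation restricted to those stations. The threshold, the neighbor list and the function name are assumptions for this sketch; it uses a single-machine union-find rather than a distributed graph algorithm.

```python
def hotspot_groups(load, neighbors, threshold):
    """Groups of adjacent base stations whose load is at least `threshold`."""
    hot = {bs for bs, value in load.items() if value >= threshold}
    parent = {bs: bs for bs in hot}

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in neighbors:
        if a in hot and b in hot:
            parent[find(a)] = find(b)

    groups = {}
    for bs in hot:
        groups.setdefault(find(bs), set()).add(bs)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical loads and neighbor relations.
load = {"BS1": 90, "BS2": 80, "BS3": 10, "BS4": 95, "BS5": 85}
neighbors = [("BS1", "BS2"), ("BS2", "BS3"), ("BS3", "BS4"), ("BS4", "BS5")]
hotspots = hotspot_groups(load, neighbors, threshold=50)
```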
[Figure: graph snapshots over BS1, BS2 and BS3 at times t1, t2, t3 within window W]
Combining graphs at the end of the window results in many join operations (inefficient)
[Figure: cumulative graph over BS1, BS2 and BS3 whose edge counts are updated as each snapshot arrives]
Apply incremental updates to a cumulative graph
Apply differential updates to a cumulative graph
[Figure: cumulative graph over BS1, BS2 and BS3 with edge counts updated as the window slides from t1 to t3 to include t4]
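One way to picture the two kinds of update is a counter per edge: each arriving snapshot is added to the cumulative counts (incremental), and once the window is full the oldest snapshot is subtracted again (differential). The class below is a hypothetical single-machine sketch of that bookkeeping, not CellIQ code; edges are plain tuples and the window is measured in snapshots.

```python
from collections import Counter, deque


class WindowedEdgeCounts:
    """Cumulative edge counts over the last `window` graph snapshots."""

    def __init__(self, window):
        self.window = window
        self.snapshots = deque()
        self.counts = Counter()

    def push(self, edges):
        snapshot = frozenset(edges)
        self.snapshots.append(snapshot)
        self.counts.update(snapshot)        # incremental update: add the new snapshot
        if len(self.snapshots) > self.window:
            expired = self.snapshots.popleft()
            self.counts.subtract(expired)   # differential update: remove the expired one
            self.counts += Counter()        # drop edges whose count fell to zero
        return dict(self.counts)

    def persistent_edges(self):
        """Edges present in every snapshot of a full window."""
        return {edge for edge, count in self.counts.items() if count == self.window}
```

No snapshot is ever joined against the others; each one touches the cumulative graph once on arrival and once on expiry.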
GStream API
class GStream[V, E] {
  def graphReduceByWindow(
    reduceFunc(Graph[V, E], Graph[V, E],
               fv: (V, V) => V,
               fe: (E, E) => E): Graph[V, E],
    invReduceFunc(Graph[V, E], Graph[V, E],
                  fv: (V, V) => V,
                  fe: (E, E) => E): Graph[V, E],
    windowDuration,
    slideDuration)
}
graphReduceByWindow
• Implemented using Spark’s CoGroupedRDD
• Two default reduce functions: graph intersection and union
• Further optimizations:
– Co-partition graphs from multiple batches
– Reuse indices and routing tables for graphs in the same window
More details in the paper!
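The two default reduce functions can be sketched on graphs represented as a pair of dictionaries (vertex properties, edge properties). The representation and the helper names are assumptions made for illustration; `fv` and `fe` play the role of the vertex and edge merge functions in the `graphReduceByWindow` signature.

```python
from functools import reduce


def _merge(a, b, f, keys):
    """Combine two property maps over `keys`, applying `f` where both define a key."""
    out = {}
    for k in keys:
        if k in a and k in b:
            out[k] = f(a[k], b[k])
        else:
            out[k] = a[k] if k in a else b[k]
    return out


def graph_union(g1, g2, fv, fe):
    """Keep every vertex and edge that appears in either graph."""
    (v1, e1), (v2, e2) = g1, g2
    return _merge(v1, v2, fv, v1.keys() | v2.keys()), _merge(e1, e2, fe, e1.keys() | e2.keys())


def graph_intersection(g1, g2, fv, fe):
    """Keep only the vertices and edges that appear in both graphs."""
    (v1, e1), (v2, e2) = g1, g2
    return _merge(v1, v2, fv, v1.keys() & v2.keys()), _merge(e1, e2, fe, e1.keys() & e2.keys())


def reduce_window(graphs, op, fv, fe):
    """Fold one window of graph snapshots with the chosen reduce function."""
    return reduce(lambda a, b: op(a, b, fv, fe), graphs)
```

With addition as the merge function, the intersection over a window keeps exactly the vertices and edges that were present in every snapshot, which is the persistent-hotspot case.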
How does CellIQ perform?
Evaluation Setup
• LTE control plane data from a major cellular network operator
• 1 million+ subscribers, live network
• 2 TB data from 1 week
– 1 file per minute, 750k records, 100s of fields/line
– 10 collection points, 10 hours per day
• Implemented several analysis tasks
Benefits of Geo-partitioning
Small amount of data, movement not noticeable
Default partitioner fails to produce results
Benefits of Incremental Updates
2x to 5x improvements
Window size affects performance
Benefits of Differential Updates
Larger windows see bigger benefits
Graceful degradation in performance
Benefits of Radius-based Broadcast
Larger datasets result in more message exchanges per hop
CellIQ is a cellular network analytics system that uses domain-specific optimizations to achieve 2x to 5x improvements
Ongoing Work:
• Using techniques in CellIQ to perform root-cause analysis on operational LTE networks
• Generalized streaming graph analysis techniques