Big Data CDR Analyzer “The Next Generation Mobile Promotions” 080201N – M.K.P.R. Jayawardhana 080254D – P.K.A.M. Kumara 080331L – W.D.A.I. Paranawithana 080357V – T.D.K. Perera Project Supervisors- Mr. Thilina Anjitha – hSenid Dr.Shahani Markus Weerawarana © 2012 University of Moratuwa

Big Data CDR Analyzer - Kanthaka


DESCRIPTION

This is the presentation given at the successful completion of 'Kanthaka', a Big Data CDR (Call Detail Record) analyzer that supports near-real-time complex promotions at telecom operators. It covers technology selection, system architecture, and final test results on a dual-core machine with 3 GB RAM and on a cluster of two such nodes.


Page 1: Big Data CDR Analyzer - Kanthaka

© 2012 University of Moratuwa

Big Data CDR Analyzer

“The Next Generation Mobile Promotions”

• 080201N – M.K.P.R. Jayawardhana

• 080254D – P.K.A.M. Kumara

• 080331L – W.D.A.I. Paranawithana

• 080357V – T.D.K. Perera

Project Supervisors:

Mr. Thilina Anjitha – hSenid

Dr. Shahani Markus Weerawarana

Page 2: Big Data CDR Analyzer - Kanthaka


OVERVIEW

• Background

• Current Situation

• Scope and Assumptions

• Kanthaka – Big Data CDR Analyzer System

• Technology Comparison

- Map Reduce

- NoSQL Databases

• Architecture

• Risks and Possible Remedies

• References

Page 3: Big Data CDR Analyzer - Kanthaka


BACKGROUND

Mobile Promotions

Page 4: Big Data CDR Analyzer - Kanthaka


CURRENT SITUATION

• Promotions are based only on subscribers' network usage

• Only the active call switch is used to trigger promotions

• No way of analyzing and processing high volumes of CDRs

• No efficient CDR analysis method

• No access to historical data

• Complex rules are not supported


Page 5: Big Data CDR Analyzer - Kanthaka


TO RESCUE

• Selects eligible users both for promotions tied to commercial organizations and for promotions based on network usage.

E.g. a 20% discount for pizza lovers aged 16-40 who have called Pizza Hut more than 5 times in a month.

• High-volume CDR analysis.

• Near-real-time selection of eligible users for promotions.
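The pizza example can be read as a two-part rule: a usage condition (more than 5 calls to one number in a month) and a profile condition (age between 16 and 40). The sketch below is one minimal way to evaluate such a rule over a batch of records. The `CallRecord` fields, the `eligible_subscribers` function, and the phone numbers are all hypothetical names chosen for illustration; the slides do not describe the actual rule representation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    caller: str        # subscriber number (hypothetical field name)
    destination: str   # dialled number
    month: str         # billing month, e.g. "2012-05"


def eligible_subscribers(records, ages, destination, month,
                         min_calls=5, min_age=16, max_age=40):
    """Return subscribers who called `destination` more than `min_calls`
    times in `month` and whose age lies in [min_age, max_age]."""
    calls = Counter(
        r.caller for r in records
        if r.destination == destination and r.month == month
    )
    return sorted(
        caller for caller, n in calls.items()
        if n > min_calls and min_age <= ages.get(caller, -1) <= max_age
    )


PIZZA_HUT = "0112729729"  # placeholder number, not taken from the slides
records = (
    [CallRecord("0771000001", PIZZA_HUT, "2012-05")] * 6    # 6 calls, age 25
    + [CallRecord("0771000002", PIZZA_HUT, "2012-05")] * 5  # only 5 calls
    + [CallRecord("0771000003", PIZZA_HUT, "2012-05")] * 7  # 7 calls, age 52
)
ages = {"0771000001": 25, "0771000002": 30, "0771000003": 52}

winners = eligible_subscribers(records, ages, PIZZA_HUT, "2012-05")
print(winners)
```

Only the first subscriber satisfies both conditions: the second has exactly 5 calls (not more than 5) and the third is outside the age range.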

Page 6: Big Data CDR Analyzer - Kanthaka


A CDR analyzer system which:

• can process 30 million records per day

• can produce results within 30 seconds

• provides a GUI to define dynamic rules

• can be used to offer real-time sales promotions for mobile subscribers
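To put the two targets side by side, the arithmetic below converts 30 million records per day into an average arrival rate and into the number of records that arrive during one 30-second result window. It assumes records arrive uniformly over 24 hours, which is an illustrative simplification; real traffic has peaks, so the sustained rate a deployment must handle would be higher.

```python
RECORDS_PER_DAY = 30_000_000
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
RESULT_WINDOW_S = 30             # target latency from the slide

# Average arrival rate under the uniform-arrival assumption.
avg_rate = RECORDS_PER_DAY / SECONDS_PER_DAY

# Records that arrive while one result is being produced.
records_per_window = avg_rate * RESULT_WINDOW_S

print(f"{avg_rate:.1f} records/s on average")
print(f"{records_per_window:.0f} records per {RESULT_WINDOW_S}-second window")
```

This works out to roughly 347 records per second, or a little over 10,000 records per 30-second window.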

Page 7: Big Data CDR Analyzer - Kanthaka


Retrieving location information from the Location Based System (LBS) can be replaced by retrieving any other information, such as subscriber age from the Customer Relationship Management (CRM) system, to support attractive promotions.

Page 8: Big Data CDR Analyzer - Kanthaka


SCOPE AND ASSUMPTIONS

SCOPE

Real system operation: 30 M records → Multiple rules → Offer promotion

Operation expected of Kanthaka: 30 M records → Multiple rules → Select eligible subscribers for promotion only

Page 9: Big Data CDR Analyzer - Kanthaka


ASSUMPTIONS

CDRs are supplied only in CSV format.

Events can be of several types, such as SMS, voice call, MMS, USSD, top-up, GPRS and LBS.

CDRs can arrive at the system asynchronously, in batches.

Only 6 of the many CDR attributes are considered during processing.
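Taken together, these assumptions describe a pre-processing step that reads a CSV batch, keeps a fixed subset of columns, and discards rows with unknown event types. The following is a minimal sketch of that step using Python's standard `csv` module. The six column names, the event-type codes, and the sample rows are hypothetical, since the slides state that six attributes are used but do not list them.

```python
import csv
import io

# Hypothetical column names: the slides do not name the six attributes.
SELECTED = ("caller", "callee", "event_type", "timestamp", "duration", "cell_id")
# Hypothetical codes for the event types listed on the slide.
EVENT_TYPES = {"SMS", "VOICE", "MMS", "USSD", "TOPUP", "GPRS", "LBS"}


def parse_batch(csv_text):
    """Yield one dict per valid CDR row, keeping only the SELECTED columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        if row.get("event_type") not in EVENT_TYPES:
            continue  # skip rows with an unknown event type
        yield {name: row[name] for name in SELECTED}


# A made-up batch with eight columns, two of which are ignored.
batch = """caller,callee,event_type,timestamp,duration,cell_id,imei,charge
0771000001,0112729729,VOICE,2012-05-01T10:15:00,62,C101,3520990012,4.50
0771000002,0771000001,SMS,2012-05-01T10:15:02,0,C102,3520990013,0.25
0771000003,0771000001,FAX,2012-05-01T10:15:05,0,C103,3520990014,0.00
"""

rows = list(parse_batch(batch))
print(len(rows), sorted(rows[0]))
```

Because `parse_batch` is a generator, each batch can be streamed row by row instead of being loaded into memory as a whole, which matters at tens of millions of records per day.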

Page 10: Big Data CDR Analyzer - Kanthaka


TECHNOLOGY COMPARISON

Page 11: Big Data CDR Analyzer - Kanthaka


Page 12: Big Data CDR Analyzer - Kanthaka


YCSB BENCHMARKS

With more large-scale users, active mailing lists, and the most promising features (secondary indexes, counters), Cassandra is the best candidate to try out.
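Counters suit this workload because a rule such as "more than 5 calls in a month" can be checked by incrementing a per-subscriber count as each CDR arrives, rather than rescanning historical records. The sketch below imitates that access pattern with a plain in-memory dictionary. It does not use Cassandra or any Cassandra driver, and the `CounterTable` class, the row-key and column layout, and the sample numbers are assumptions made for illustration only.

```python
from collections import defaultdict


class CounterTable:
    """In-memory stand-in for a table of counter columns."""

    def __init__(self):
        self._counts = defaultdict(int)

    def increment(self, row_key, column, delta=1):
        """Add `delta` to one counter and return its new value."""
        self._counts[(row_key, column)] += delta
        return self._counts[(row_key, column)]

    def get(self, row_key, column):
        """Read a counter without creating it; missing counters read as 0."""
        return self._counts.get((row_key, column), 0)


def record_call(table, caller, destination, month, threshold):
    """Count one call; return True only when the caller first exceeds
    `threshold` calls to `destination` in `month`."""
    count = table.increment(caller, f"{destination}:{month}")
    return count == threshold + 1


table = CounterTable()
newly_eligible = [
    record_call(table, "0771000001", "0112729729", "2012-05", threshold=5)
    for _ in range(7)
]
print(newly_eligible)
```

The subscriber becomes eligible exactly once, on the sixth call, so a promotion would be triggered a single time rather than on every later call.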

Page 13: Big Data CDR Analyzer - Kanthaka


Page 14: Big Data CDR Analyzer - Kanthaka


TECHNOLOGY SELECTION

TECHNOLOGIES LEFT BEHIND

• Complex Event Processing (CEP) engines: no persistency

• Rules engine: more layers, more latency

• Hadoop: latency

• NoSQL DB: HBase, MongoDB, Hive

TECHNOLOGIES SELECTED

• NoSQL DB: Cassandra

Page 15: Big Data CDR Analyzer - Kanthaka


BRIEF ARCHITECTURE OF ‘KANTHAKA’

[Architecture diagram. Labelled components: Pre-processing unit, Promotion definition, Cassandra Cluster]

Page 16: Big Data CDR Analyzer - Kanthaka


TEST RESULTS IN SINGLE NODE

Page 17: Big Data CDR Analyzer - Kanthaka


TEST RESULTS IN TWO-NODE CLUSTER

Page 18: Big Data CDR Analyzer - Kanthaka


CLUSTER PERFORMS BETTER UNDER HIGH LOADS

Page 19: Big Data CDR Analyzer - Kanthaka


RISKS AND POSSIBLE REMEDIES

• NoSQL databases: high performance needs more memory. Remedy: use an external cluster with decent memory.

• Concurrency issues handling: low speed when locking the database. Remedy: use a shadow copy.

• Handling sudden peaks. Remedy: have an auto-balancing mechanism ready.
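One possible reading of the shadow-copy remedy is that readers keep querying a stable snapshot while updates are applied to a separate copy, which is then swapped in, so reads never wait on a write lock. The sketch below illustrates that idea in memory. The `ShadowCopyStore` class and its method names are hypothetical, and the slides do not say how the project actually implemented this.

```python
import threading


class ShadowCopyStore:
    """Readers see a stable snapshot; writers build a copy and swap it in."""

    def __init__(self, initial=None):
        self._live = dict(initial or {})
        self._write_lock = threading.Lock()  # serialises writers only

    def read(self, key, default=None):
        # No lock needed: a published snapshot is never mutated afterwards.
        return self._live.get(key, default)

    def apply_batch(self, updates):
        with self._write_lock:
            shadow = dict(self._live)   # copy the current state
            shadow.update(updates)      # apply changes to the shadow copy
            self._live = shadow         # publish with one reference swap


store = ShadowCopyStore({"0771000001": 4})
before = store.read("0771000001")
store.apply_batch({"0771000001": 6, "0771000002": 1})
print(before, store.read("0771000001"), store.read("0771000002"))
```

The trade-off is that each batch copies the whole dictionary, so this pattern favours infrequent, batched writes over many small ones, and it uses extra memory while both copies exist.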

Page 20: Big Data CDR Analyzer - Kanthaka


FINAL DELIVERABLES

Big Data CDR Analyzer system

Research Paper

Final Report

Page 21: Big Data CDR Analyzer - Kanthaka


REFERENCES

B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears, “Benchmarking cloud serving systems with YCSB,” in Proceedings of the 1st ACM Symposium on Cloud Computing (SoCC ’10), 2010, pp. 143–154.

Visit us at Kanthaka

Page 22: Big Data CDR Analyzer - Kanthaka


Thank you

Pushpalanka

Amila

Dhanika

Manoj