Phone Fraudsters in a Haystack

Sri Kanajan, Prasad Telekuntla, Mijail Gomez

3rd place in Tata Telecommunications Global Hackathon

Phone Fraud Detection

Page 2: Phone Fraud Detection

Example of Phone Fraud

• Fraudster leaves an international missed call

• Victim unknowingly calls back a premium number or a manipulative advertisement

$2 BILLION OF LOST REVENUE FOR TELECOM PROVIDERS

Page 3: Phone Fraud Detection

Motivations

• Current statistical solutions have low specificity and sensitivity

• Human fraud analysts have to continually update their heuristic-based rules and thresholds

• Need an adaptive solution that works in real time with minimal false positives
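
As a reminder of what the first bullet measures, here is a minimal sketch of how sensitivity and specificity could be computed from labeled predictions. The function name and the encoding (1 = fraud, 0 = legitimate) are assumptions made for illustration and do not come from the slides.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding: 1 = fraud, 0 = legitimate.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm

    # Sensitivity: share of real fraud that is detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of legitimate numbers that are not flagged.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity
```

A detector with low specificity floods analysts with false alarms, which is why the last bullet asks for minimal false positives.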

Page 4: Phone Fraud Detection

Hybrid Statistical and Machine Learning Solution

• Live Streaming Phone Data

• Statistical Analysis Anomaly Detection: Number of Callers/Callees/Cumulative Call Duration

• Machine Learning (Random Forests): Evaluation of other features in the call log such as answer indicator, area code, pricing…

Used the hackathon's de-identified phone log dataset (16 GB)

Page 5: Phone Fraud Detection

Anomaly Detection Through Statistical Analysis

# of Unique Callers per Phone Number

# of Unique Callees per Phone Number

Cumulative Duration of Calls to Specific Phone Numbers

ANOMALOUS Phone Numbers!!
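
The three statistics above can be sketched in a few lines. This is an illustration under stated assumptions, not the team's implementation: the record layout `(caller, callee, duration_seconds)`, the function names, and the outlier rule (a modified z-score based on the median, with a cutoff of 3.5) are all assumed here, since the slides do not say how the anomalous numbers were selected.

```python
from collections import defaultdict
from statistics import median


def call_statistics(calls):
    """Aggregate call records, assumed to be (caller, callee, duration_seconds)."""
    callers_of = defaultdict(set)     # number -> distinct numbers that called it
    callees_of = defaultdict(set)     # number -> distinct numbers it called
    duration_to = defaultdict(float)  # number -> total seconds of calls received
    for caller, callee, duration in calls:
        callers_of[callee].add(caller)
        callees_of[caller].add(callee)
        duration_to[callee] += duration
    unique_callers = {n: len(s) for n, s in callers_of.items()}
    unique_callees = {n: len(s) for n, s in callees_of.items()}
    return unique_callers, unique_callees, dict(duration_to)


def flag_anomalies(metric, threshold=3.5):
    """Flag numbers whose value sits far above the median.

    Uses the modified z-score 0.6745 * (x - median) / MAD.
    The rule and the threshold are assumptions for this sketch.
    """
    values = list(metric.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half the values are identical, so this score is
        # undefined; a real system would need a different spread estimate.
        return set()
    return {n for n, v in metric.items() if 0.6745 * (v - med) / mad > threshold}
```

A median-based score is used in this sketch because a handful of extreme numbers would inflate a mean and standard deviation and partly hide themselves.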

Page 6: Phone Fraud Detection

Hybrid Statistical and Machine Learning Solution

• Live Streaming Phone Data

• Statistical Analysis Anomaly Detection

• Graph Analysis Anomaly Detection

• Machine Learning (Random Forests)

• Predicted Anomalies

Page 7: Phone Fraud Detection

Fraud Detection Using Graph Metrics

• Triangle Counting

• PageRank

• Others…

Note: Goal is to uncover the callers that are very different from the large majority
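
To make the two named metrics concrete, here is a small standard-library sketch that computes them on a call graph. The slides do not say which graph library or parameters were used, so everything here is an assumption: calls are given as `(caller, callee)` pairs, triangles ignore edge direction, and PageRank uses a damping factor of 0.85 with a fixed number of iterations.

```python
from collections import defaultdict


def build_graph(calls):
    """Turn (caller, callee) pairs into a node set and directed adjacency sets."""
    out = defaultdict(set)
    nodes = set()
    for caller, callee in calls:
        nodes.update((caller, callee))
        if caller != callee:
            out[caller].add(callee)
    return nodes, out


def triangle_counts(nodes, out):
    """Number of triangles each node belongs to, ignoring edge direction."""
    nbrs = {n: set() for n in nodes}
    for u, targets in out.items():
        for v in targets:
            nbrs[u].add(v)
            nbrs[v].add(u)
    counts = {}
    for n in nodes:
        # Each triangle through n is found twice, once from each other corner.
        shared = sum(len(nbrs[n] & nbrs[m]) for m in nbrs[n])
        counts[n] = shared // 2
    return counts


def pagerank(nodes, out, damping=0.85, iterations=50):
    """Plain power-iteration PageRank on the directed call graph."""
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by numbers that never place a call is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out.get(v))
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for u in nodes:
            targets = out.get(u)
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new[v] += share
        rank = new
    return rank
```

A number that many others call will tend to have a high PageRank while sitting in few triangles; whether that pattern indicates fraud in a given dataset is for the downstream outlier analysis to decide.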

Page 8: Phone Fraud Detection

Fraud Detection Using Graph Metrics

Using Principal Component Analysis to uncover the outliers in the graph metrics

Possible Fraudsters!
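
One way to read this slide is as a projection of each number's graph metrics onto the leading principal components, with points far from the bulk treated as candidates. The sketch below follows that reading using only the standard library. The scoring rule (distance from the origin in the space of the first two components), the power-iteration eigen-solver, and all function names are assumptions for illustration; the slides do not describe the actual procedure.

```python
import math


def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]


def covariance(rows):
    """Covariance matrix of rows that are already centred."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
            for i in range(d)]


def top_components(cov, k=2, iterations=200):
    """Leading k eigenvectors via power iteration with deflation."""
    d = len(cov)
    cov = [row[:] for row in cov]
    components = []
    for _ in range(k):
        vec = [1.0 / (i + 1) for i in range(d)]
        for _ in range(iterations):
            nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0:
                break
            vec = [x / norm for x in nxt]
        eigenvalue = sum(vec[i] * cov[i][j] * vec[j]
                         for i in range(d) for j in range(d))
        components.append(vec)
        # Remove the direction just found so the next pass finds a new one.
        cov = [[cov[i][j] - eigenvalue * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    return components


def pca_outlier_scores(rows, k=2):
    """Distance of each row from the centre in the top-k component space."""
    z = standardize(rows)
    comps = top_components(covariance(z), k)
    scores = []
    for r in z:
        proj = [sum(a * b for a, b in zip(r, c)) for c in comps]
        scores.append(math.sqrt(sum(p * p for p in proj)))
    return scores
```

Rows with the largest scores would be the "Possible Fraudsters!" candidates passed on for review; where to draw the cutoff is left open here.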

Page 9: Phone Fraud Detection

Hybrid Statistical and Machine Learning Solution

• Live Streaming Phone Data

• Statistical Analysis Anomaly Detection

• Graph Analysis Anomaly Detection

• Machine Learning (Random Forests)

• Predicted Anomalies

• Possible Fraud

• Human Observed

• Fraud Analyst

Page 10: Phone Fraud Detection

Fraud Detection Using Graph Metrics

Human Fraud Analyst Confirmation of Fraudster

www.fraud-detector.net

Page 11: Phone Fraud Detection

Hybrid Statistical and Machine Learning Solution

• Live Streaming Phone Data

• Statistical Analysis Anomaly Detection

• Graph Analysis Anomaly Detection

• Machine Learning (Random Forests)

• Predicted Anomalies

• Possible Fraud

• Human Observed

• Fraud Analyst

• Confirmed Fraudsters

Page 12: Phone Fraud Detection

Ensemble Model – Machine Learning and Statistical

• With labeled data, the classifier can progressively identify patterns beyond the graph metrics (uses all other features in the raw call log)

– E.g., patterns in area codes or specific pricing plans used by fraudsters

• Active learning is done online while the system is active, i.e., the longer the system is in use, the better it gets
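
The feedback loop in the last bullet can be sketched as follows. Note the substitution: the deck uses Random Forests, while this sketch swaps in a tiny online logistic regression so that it runs with the standard library alone and can be updated one label at a time. `OnlineLogistic`, `active_learning_step`, the `ask_analyst` callback, and the `budget` parameter are hypothetical names introduced for illustration only.

```python
import math


class OnlineLogistic:
    """Stand-in classifier (not a Random Forest), updated one example at a time."""

    def __init__(self, n_features, learning_rate=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = learning_rate

    def predict_proba(self, x):
        """Estimated probability that feature vector x is fraud."""
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, label):
        """One stochastic-gradient step on the logistic loss."""
        error = self.predict_proba(x) - label
        self.w = [wi - self.lr * error * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * error


def active_learning_step(model, candidates, ask_analyst, budget=5):
    """Show the analyst the candidates the model is least sure about.

    ask_analyst is a hypothetical callback returning 1 (confirmed fraud)
    or 0 (false positive); each answer immediately updates the model.
    """
    ranked = sorted(candidates, key=lambda x: abs(model.predict_proba(x) - 0.5))
    labelled = []
    for x in ranked[:budget]:
        label = ask_analyst(x)
        model.update(x, label)
        labelled.append((x, label))
    return labelled
```

Calling `active_learning_step` on each new batch of flagged numbers is what would make the system improve with use: every analyst decision becomes a training example.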

Page 13: Phone Fraud Detection

Conclusion

Possible False Positive

Possible Fraudster

Page 14: Phone Fraud Detection

Acknowledgements

Zipfian Academy

Technologies Used

D3, Python