A model for profiling Mobile Telecom Subscribers based on their credit behavior
Dr. Asoka Korale, C.Eng. MIET
Profiled customers are important for
• Credit management and determining credit actions
• Managing revenue through monitoring receipts/payments
Other uses of segmenting / profiling
• Identifying groups for Promotions / Special Offers
• Cross selling / Up selling
• Targeted Advertising
Many ways to profile
• Elements of a profile
The selected attributes
• must reflect the particular task at hand
• depend on the nature of the profiling
Attributes for Credit (Receipts and Payments) Profile
• Network Stay: number of years with network
• Pay delay: number of days between payment date and due date
• Pay gap percentage: proportion of bill that is outstanding
• Revenue: bill value
Attribute Range                     Points
0 <= x < 4.3 (lowest quartile)      0.25
4.3 <= x < 5 (2nd quartile)         0.5
5 <= x < 5.7 (3rd quartile)         0.75
5.7 <= x < 10 (top quartile)        1.0
[Figure: Cumulative Probability Distribution Function of Attribute; x-axis: Attribute Value (x), 0 to 10; y-axis: cumulative probability, 0 to 1]
Ex: Allocate a total of 1 point across 4 quartiles
The percentiles of each variable are considered in allocating a score (points) for each attribute value.
This scheme can be extended to as many levels as desired to meet any accuracy requirement.
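The percentile-to-points mapping described above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name `percentile_points`, the `levels` parameter and the sample values are assumptions introduced here, and a value that falls exactly on a cut point is placed in the upper band, following the "lower <= x < upper" convention of the table.

```python
import numpy as np

def percentile_points(values, levels=4):
    """Map each value to points in (0, 1] according to its percentile band."""
    values = np.asarray(values, dtype=float)
    # Interior percentile cut points, e.g. the 25th, 50th and 75th for four levels.
    cuts = np.percentile(values, np.linspace(0, 100, levels + 1)[1:-1])
    # Band index 0..levels-1; a value equal to a cut point goes to the upper band.
    band = np.searchsorted(cuts, values, side="right")
    return (band + 1) / levels

# Illustrative attribute values only, not data from the study.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
points = percentile_points(x)            # four quartile bands
fine_points = percentile_points(x, 8)    # same scheme with eight levels
```

Raising `levels` gives the finer-grained version of the scheme mentioned above; the band edges are always read from the data's own percentiles.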
Max Network Stay points = 0.35
Max Pay Delay points = 0.3
Max Pay Gap points = 0.7
Max Revenue points = 0.65
Total Credit Risk points = Payment Delay points + Payment Gap points
Rating = 1 – (Total Credit Risk)
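As a worked example of the rating formula, the sketch below combines the two risk attributes. It assumes that an attribute's points are its percentile fraction (0.25, 0.5, 0.75 or 1.0) scaled by that attribute's maximum; the slides state the maxima but not the scaling rule, so that step and the function name `credit_rating` are assumptions.

```python
MAX_PAY_DELAY_POINTS = 0.3   # from the slide
MAX_PAY_GAP_POINTS = 0.7     # from the slide

def credit_rating(delay_fraction, gap_fraction):
    """Rating = 1 - (Pay Delay points + Pay Gap points).

    Both arguments are percentile fractions in [0, 1]; scaling them by the
    attribute maximum is an assumption made for this illustration.
    """
    delay_points = MAX_PAY_DELAY_POINTS * delay_fraction
    gap_points = MAX_PAY_GAP_POINTS * gap_fraction
    total_credit_risk = delay_points + gap_points
    return 1.0 - total_credit_risk

# A subscriber in the lowest pay-delay quartile and the 2nd pay-gap quartile.
example_rating = credit_rating(0.25, 0.5)
```

Because the two maxima sum to 1, the total credit risk never exceeds 1 and the rating stays within 0 to 1.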
Model flow (PC: Percentile Calculation and mapping of variables to points)
• Payment Delay (days): PC, then allocate Pay Delay points based on percentile
• Payment Gap (%): PC, then allocate Pay Gap Risk points based on percentile
• Network Stay (years): PC, then allocate Network Stay points based on percentile
• Monthly Revenue (Rs): PC, then allocate Revenue points based on percentile
• 1 – (Pay Delay points + Pay Gap points) = Credit Risk points (the Rating)
• Cluster: Credit Risk points, Network Stay points, Revenue points
• Allocate a Grade / Segment based on coordinates of Cluster Centroid
A Cluster
• A group of objects more similar to one another than to members of other clusters
• Represents a “segment” in the business perspective
Fuzzy C-Means Clustering Algorithm
• Originally derived from computer science, widely used in data mining
• An unsupervised learning algorithm
  - profiles data without respect to a target variable
  - has no recourse to a training sequence
• Robust in processing large amounts of data
• Particularly useful when data patterns are not self-evident, or when manual processing is not practical
• Clusters arise naturally from patterns in the data
• Fuzziness implies that each data point may belong to one or more clusters to a certain degree, depending on the membership function
• In this modeling, each subscriber is allocated to the cluster with the highest membership
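The cost function that will be minimized to arrive at the clusters around the centroids

$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \lVert x_i - c_j \rVert^{2}$$

1. Initialize the membership function $U = [u_{ij}]$ and the centroids $c_j$
2. Update the membership function

$$u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \dfrac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}}$$

3. Update the centroids

$$c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}}$$

4. Check the convergence criteria, at kth iteration

$$\lVert U^{(k+1)} - U^{(k)} \rVert < \varepsilon$$

for a chosen tolerance $\varepsilon$
5. Stop if step 4 is satisfied, else return to step 2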
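The iteration above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions rather than the code used in the study: the function name `fuzzy_c_means`, the defaults for `m`, `tol`, `max_iter` and `seed`, and the synthetic demo data are all introduced here. Only the memberships are initialized at random; the first centroids are then computed from them, so the centroid update appears before the membership update inside the loop.

```python
import numpy as np

def fuzzy_c_means(x, n_clusters, m=2.0, tol=1e-5, max_iter=300, seed=0):
    """Minimal fuzzy c-means; returns (centroids (C, d), memberships (N, C))."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize memberships at random; each row sums to 1.
    u = rng.random((x.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        # Centroid update: mean of the points weighted by u_ij ** m.
        um = u ** m
        centroids = (um.T @ x) / um.sum(axis=0)[:, None]
        # Membership update from the distances to every centroid.
        dist = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        # Convergence check on the largest change in any membership.
        converged = np.abs(u_new - u).max() < tol
        u = u_new
        if converged:
            break
    return centroids, u

# Synthetic demo data (two well-separated groups), not subscriber data.
demo_rng = np.random.default_rng(1)
demo = np.vstack([demo_rng.normal(0.2, 0.05, (20, 3)),
                  demo_rng.normal(0.8, 0.05, (20, 3))])
centroids, u = fuzzy_c_means(demo, n_clusters=2)
# Hard assignment: each point goes to the cluster with the highest membership.
labels = u.argmax(axis=1)
```

The final `argmax` mirrors the rule stated earlier that each subscriber is allocated to the cluster with the highest membership. In the model, the three columns would be the Rating, Network Stay and Revenue points.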
Table 1: Cluster Centroids
Cluster   Rating points   Network Stay points   Revenue points
1         0.2935          0.2115                0.0969
2         0.3832          0.2289                0.3615
3         0.5535          0.2770                0.5770
4         0.9224          0.2202                0.3454
5         0.8929          0.2527                0.5780
6         0.5966          0.2129                0.3664

Table 2: Centroids relative to max attribute values
Cluster   Rating points   Network Stay points   Revenue points
1         LOW             MED                   LOW
2         LOW             MED                   MED
3         MED             HIGH                  HIGH
4         HIGH            MED                   MED
5         HIGH            HIGH                  HIGH
6         MED             MED                   MED

• Cluster 5 has subscribers with low credit risk and high revenue contribution. Valuable Subs: Keep
• Cluster 1 has subscribers with high credit risk and low revenue contribution. Let Churn
• Cluster 3 has subscribers with medium credit risk and high revenue contribution. Positively Influence
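The step from Table 1 to Table 2 can be illustrated by dividing each centroid coordinate by the maximum points of its attribute and banding the ratio. The slides do not give the cut-offs, so the values 0.4 and 0.7 below are hypothetical; they were chosen because they happen to reproduce Table 2. Taking 1.0 as the maximum for the Rating is likewise an assumption, while 0.35 and 0.65 are the stated maxima for Network Stay and Revenue.

```python
# Cluster centroids from Table 1: (rating, network stay, revenue) points.
CENTROIDS = {
    1: (0.2935, 0.2115, 0.0969),
    2: (0.3832, 0.2289, 0.3615),
    3: (0.5535, 0.2770, 0.5770),
    4: (0.9224, 0.2202, 0.3454),
    5: (0.8929, 0.2527, 0.5780),
    6: (0.5966, 0.2129, 0.3664),
}
# Maximum points per attribute; the rating maximum of 1.0 is an assumption.
MAX_POINTS = (1.0, 0.35, 0.65)
# Hypothetical cut-offs on the ratio to the maximum (not given in the slides).
LOW_BELOW, HIGH_FROM = 0.4, 0.7

def grade(value, max_value):
    """Band a centroid coordinate as LOW, MED or HIGH relative to its maximum."""
    ratio = value / max_value
    if ratio < LOW_BELOW:
        return "LOW"
    if ratio >= HIGH_FROM:
        return "HIGH"
    return "MED"

grades = {
    cluster: tuple(grade(v, mx) for v, mx in zip(coords, MAX_POINTS))
    for cluster, coords in CENTROIDS.items()
}
```

Other cut-offs in roughly the same range would give the same table, so these two numbers should be read as one consistent choice rather than the thresholds actually used.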
[Figure: Data as clustered; 3-D scatter with axes Rating points, Network Stay points and Revenue points; legend C1 to C6]
[Figure: Users in cluster; x-axis: cluster no, 1 to 6; y-axis: no of users, 0 to 1400]
[Figure: Average revenues in each cluster, c1 to c6: 307, 809, 2201, 670, 2325, 812]
• Cluster 5 has subscribers with the highest average revenue contribution (combined with high network stay and low credit risk: “valuable segment”)
• Cluster 1 has subscribers with the lowest average revenue contribution (combined with high credit risk and medium network stay: “low value segment”)
• Cluster 3 has subscribers with high revenue contribution (combined with medium credit risk: “opportunity to influence”, i.e. to increase their rating)
Thank You
• Determine histograms of the attributes
• Normalize each to obtain an approximation to the probability density function of the selected attribute
• Take the cumulative sum of the probability density function to determine the cumulative probability distribution function
• Determine percentiles for allocating points to the attributes
• Allocate a grade to customers based on the ranking of the function derived from the three attributes
Note: It is also possible to compute the percentiles directly by simply sorting the samples and reading off the corresponding sample values at each point of interest. Both give the same results, with the above method providing a little more insight.
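The histogram route and the direct sorting route can be compared with a short sketch. The function name `percentiles_from_histogram`, the bin count and the synthetic sample are assumptions made for illustration; the histogram estimate agrees with the sorted-sample estimate only to within one bin width.

```python
import numpy as np

def percentiles_from_histogram(samples, probs=(0.25, 0.5, 0.75), bins=200):
    """Estimate percentile cut points from a normalized histogram."""
    counts, edges = np.histogram(samples, bins=bins)
    # Normalized histogram: probability mass per bin, approximating the PDF.
    pdf = counts / counts.sum()
    # Cumulative sum gives the cumulative probability distribution function.
    cdf = np.cumsum(pdf)
    # First bin whose cumulative probability reaches each requested level.
    idx = np.searchsorted(cdf, probs, side="left")
    idx = np.minimum(idx, len(counts) - 1)  # stay inside the last bin
    # Report the upper edge of that bin as the cut point.
    return edges[idx + 1]

# Synthetic attribute sample for illustration, not data from the study.
rng = np.random.default_rng(0)
samples = rng.normal(5.0, 1.0, 100_000)

hist_cuts = percentiles_from_histogram(samples)
# Direct alternative: sort the samples and read off the percentile values.
direct_cuts = np.percentile(samples, [25, 50, 75])
```

Using more bins narrows the gap between the two estimates, at the cost of a noisier histogram.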
[Figure: Histogram of attribute (counts up to 4 x 10^4) and Probability Density Function (0 to 0.4); x-axis: Attribute Value, 0 to 10]
For the sake of this example the attribute is assumed to be Normally distributed. In practice, however, the distributions of the attributes will take different forms, but the procedure for calculating the percentiles will remain the same.
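For a Normal attribute the quartile cut points can also be computed in closed form. The sketch below assumes a mean of 5 and a standard deviation of 1; these parameters are not stated in the slides and are inferred here from the plots (a density peak near 0.4 centred on 5).

```python
from statistics import NormalDist

# Assumed parameters, inferred from the plots rather than stated in the source.
attribute = NormalDist(mu=5.0, sigma=1.0)

# Values below which 25%, 50% and 75% of the distribution falls.
quartile_cuts = [attribute.inv_cdf(p) for p in (0.25, 0.50, 0.75)]
rounded_cuts = [round(q, 1) for q in quartile_cuts]
```

Under that assumption the rounded cut points are 4.3, 5.0 and 5.7, which matches the attribute ranges used in the points table.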
[Figure: Probability Density Function (0 to 0.4) and Cumulative Probability Distribution Function (0 to 1); x-axis: Attribute Value, 0 to 10]
Attribute Range                     Points
0 <= x < 4.3 (lowest quartile)      0.25
4.3 <= x < 5 (2nd quartile)         0.5
5 <= x < 5.7 (3rd quartile)         0.75
5.7 <= x < 10 (top quartile)        1.0
[Figure: Cumulative Probability Distribution Function of Attribute; x-axis: Attribute Value (x), 0 to 10; y-axis: cumulative probability, 0 to 1]
Mapping attribute level to points based on the percentile that the customer achieves in relation to that attribute