customer_profiling_based_on_fuzzy_principals_linkedin

A model for profiling Mobile Telecom Subscribers based on their credit behavior

Dr. Asoka Korale, C.Eng. MIET

Profiled customers are important for

• Credit management and determining credit actions

• Managing revenue through monitoring receipts/payments

Other uses of segmenting / profiling

• Identifying groups for Promotions / Special Offers

• Cross selling / Up selling
• Targeted Advertising

Many ways to profile
• Elements of a profile

The selected attributes
• must reflect the particular task at hand
• depend on the nature of the profiling

Attributes for Credit (Receipts and Payments) Profile
• Network Stay: number of years with network
• Pay delay: number of days between payment date and due date
• Pay gap percentage: proportion of bill that is outstanding
• Revenue: bill value

Attribute Range                      Points
0 <= x < 4.3 (lowest quartile)       0.25
4.3 <= x < 5 (2nd quartile)          0.5
5 <= x < 5.7 (3rd quartile)          0.75
5.7 <= x < 10 (top quartile)         1.0

[Figure: Cumulative Probability Distribution Function of Attribute, plotted from 0 to 1 against Attribute Value (x) from 0 to 10]

Ex: Allocate a total of 1 point across 4 quartiles

The percentiles of each of the variables are considered in allocating a score (points) for each attribute value.

This scheme can be extended to as many levels as desired to meet any accuracy requirement
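The quartile scheme above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `percentile_points`, the `levels` parameter and the synthetic Normal sample are not from the slides, and the cut-offs are estimated from the data rather than taken from the example table.

```python
import numpy as np

def percentile_points(values, levels=4):
    """Map each value to points k/levels, where k is its quantile band (1..levels)."""
    values = np.asarray(values, dtype=float)
    # Interior cut-offs: for levels=4 these are the 25th, 50th and 75th percentiles.
    probs = np.arange(1, levels) / levels
    cuts = np.quantile(values, probs)
    # side="right" puts a value equal to a cut-off into the higher band,
    # matching ranges of the form lower <= x < upper.
    band = np.searchsorted(cuts, values, side="right") + 1
    return band / levels

# Synthetic attribute (assumption): Normal with mean 5 and standard deviation 1.
rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=1.0, size=10_000)
points = percentile_points(sample)
```

Raising `levels` (for example to 10 for deciles) gives the finer-grained scheme mentioned above without changing the rest of the code.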

Max Network Stay points = 0.35
Max Pay Delay points = 0.3
Max Pay Gap points = 0.7
Max Revenue points = 0.65

Total Credit Risk points = Payment Delay points + Payment Gap points
Rating = 1 – (Total Credit Risk)
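A minimal sketch of the rating arithmetic follows. It assumes, which the slides do not state explicitly, that each attribute's points are its percentile level (0.25, 0.5, 0.75 or 1.0 in the quartile example) scaled by that attribute's maximum. The function name `credit_rating` and its parameters are hypothetical.

```python
MAX_PAY_DELAY_POINTS = 0.3  # maximum from the slide
MAX_PAY_GAP_POINTS = 0.7    # maximum from the slide

def credit_rating(pay_delay_level, pay_gap_level):
    """Rating = 1 - (Pay Delay points + Pay Gap points).

    Each level is a percentile-based score in [0, 1].
    Assumption: attribute points = level * attribute maximum.
    """
    for level in (pay_delay_level, pay_gap_level):
        if not 0.0 <= level <= 1.0:
            raise ValueError("levels must lie in [0, 1]")
    delay_points = pay_delay_level * MAX_PAY_DELAY_POINTS
    gap_points = pay_gap_level * MAX_PAY_GAP_POINTS
    total_credit_risk = delay_points + gap_points
    return 1.0 - total_credit_risk

# A subscriber in the lowest quartile for both delay and gap.
best_case = credit_rating(0.25, 0.25)
# A subscriber in the top quartile for both delay and gap.
worst_case = credit_rating(1.0, 1.0)
```

Because the two maxima sum to 1, the total credit risk never exceeds 1 under this assumption, so the rating stays between 0 and 1.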

PC: Percentile Calculation and mapping variables to points

• Payment Delay (days) → PC → Allocate Pay Delay points based on percentile
• Payment Gap (%) → PC → Allocate Pay Gap Risk points based on percentile
• Network Stay (years) → PC → Allocate Network Stay points based on percentile
• Monthly Revenue (Rs) → PC → Allocate Revenue points based on percentile
• 1 – (Pay Delay Points + Pay Gap Points) = Credit Risk Points
• Cluster: Credit Risk points, Network Stay points, Revenue points
• Allocate a Grade / Segment based on coordinates of Cluster Centroid

A Cluster
• A group of objects more similar to one another than to members of other clusters
• Represents a “segment” in the business perspective

Fuzzy C-Means Clustering Algorithm
• Originally derived from computer science, widely used in data mining
• An unsupervised learning algorithm: it profiles data without respect to a target variable and has no recourse to a training sequence
• Robust in processing large amounts of data
• Particularly useful when data patterns are not self-evident, or when manual processing is not practical
• Clusters arise naturally from patterns in the data
• Fuzziness implies that each data point may belong to one or more clusters to a certain degree, depending on the membership function
• In this modeling, each subscriber is allocated to the cluster with the highest membership

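The cost function that is minimized to arrive at the clusters around the centroids:

$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \lVert x_i - c_j \rVert^{2}$$

1. Initialize the membership function $U = [u_{ij}]$ and the centroids

2. Update the membership function

$$u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \dfrac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}}$$

3. Update the centroids

$$c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}}$$

4. Check the convergence criterion at the kth iteration (ε is a small tolerance)

$$\lVert U^{(k+1)} - U^{(k)} \rVert < \varepsilon$$

5. Stop if step 4 is satisfied, else return to step 2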
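The five steps can be sketched in NumPy as below. This is an illustrative implementation, not the one used to produce the results in this deck: the function name `fuzzy_c_means`, the defaults `m=2.0` and `tol=1e-5`, the use of the largest absolute membership change as the norm in step 4, and the synthetic two-blob data are all assumptions. Here `U[i, j]` holds the membership u_ij of point x_i in cluster j.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, tol=1e-5, max_iter=300, seed=0):
    """Minimal fuzzy C-means sketch: returns (centroids, U)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: random memberships, each row normalised to sum to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        # Step 3: centroids are means of the points weighted by u_ij^m.
        # (Done first here so that only U needs a random initialisation.)
        W = U ** m
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]
        # Step 2: memberships from the distance of each point to each centroid.
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        # Step 4: converged when the largest membership change is below tol.
        converged = np.max(np.abs(U_new - U)) < tol
        U = U_new
        if converged:
            break  # Step 5
    return centroids, U

# Synthetic demo (assumption): two well-separated groups in 3 dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.05, size=(50, 3)),
               rng.normal(1.0, 0.05, size=(50, 3))])
centroids, U = fuzzy_c_means(X, n_clusters=2)
# Hard assignment: each point goes to the cluster with the highest membership.
labels = U.argmax(axis=1)
```

The final `argmax` corresponds to the rule used in this model, where a subscriber is allocated to the cluster with the highest membership even though the memberships themselves are fractional.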

Table 1: Cluster Centroids

Cluster   Rating points   Network Stay points   Revenue points
1         0.2935          0.2115                0.0969
2         0.3832          0.2289                0.3615
3         0.5535          0.2770                0.5770
4         0.9224          0.2202                0.3454
5         0.8929          0.2527                0.5780
6         0.5966          0.2129                0.3664

Table 2: Centroids relative to max attribute values

Cluster   Rating points   Network Stay points   Revenue points
1         LOW             MED                   LOW
2         LOW             MED                   MED
3         MED             HIGH                  HIGH
4         HIGH            MED                   MED
5         HIGH            HIGH                  HIGH
6         MED             MED                   MED

• Cluster 5 has subscribers with low credit risk and high revenue contribution. Valuable Subs: Keep
• Cluster 1 has subscribers with high credit risk and low revenue contribution. Let Churn
• Cluster 3 has subscribers with medium credit risk and high revenue contribution. Positively Influence
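The step from Table 1 to Table 2 can be illustrated as follows. The slides do not give the cut-offs used for LOW / MED / HIGH, so the thresholds 0.4 and 0.7 below are hypothetical values that happen to reproduce Table 2. The maximum rating of 1.0 is also an assumption; the Network Stay and Revenue maxima (0.35 and 0.65) are the ones stated earlier.

```python
# Table 1: cluster -> (rating, network stay, revenue) centroid points.
CENTROIDS = {
    1: (0.2935, 0.2115, 0.0969),
    2: (0.3832, 0.2289, 0.3615),
    3: (0.5535, 0.2770, 0.5770),
    4: (0.9224, 0.2202, 0.3454),
    5: (0.8929, 0.2527, 0.5780),
    6: (0.5966, 0.2129, 0.3664),
}
# Max rating assumed to be 1.0; the other two maxima are from the slides.
MAX_POINTS = (1.0, 0.35, 0.65)
# Hypothetical cut-offs on the ratio value / max (not given in the source).
LOW_CUT, HIGH_CUT = 0.4, 0.7

def grade(value, max_value):
    """Label a centroid coordinate relative to the attribute's maximum."""
    ratio = value / max_value
    if ratio < LOW_CUT:
        return "LOW"
    if ratio < HIGH_CUT:
        return "MED"
    return "HIGH"

grades = {
    cluster: tuple(grade(v, mx) for v, mx in zip(coords, MAX_POINTS))
    for cluster, coords in CENTROIDS.items()
}

for cluster, labels in grades.items():
    print(cluster, *labels)
```

Other threshold pairs would give the same labels for these six centroids, so the values here should be read as one consistent choice rather than the cut-offs actually used.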

[Figure: Data as clustered. 3-D scatter plot with axes Rating points, Network Stay points and Revenue points; legend C1 to C6]

[Figure: Users in cluster. Bar chart of no. of users (axis 0 to 1400) against cluster no. (1 to 6)]

[Figure: Average revenues in each cluster. Bar labels: c1 = 307, c2 = 809, c3 = 2201, c4 = 670, c5 = 2325, c6 = 812]

Cluster 5 has subscribers with the highest average revenue contribution (combined with high network stay and low credit risk – “valuable segment”)

Cluster 1 has subscribers with the lowest average revenue contribution (combined with high credit risk and medium network stay – “low value segment”)

Cluster 3 has subscribers with high revenue contribution (combined with medium credit risk – “opportunity to influence”, i.e. increase their rating)

Thank You

• Determine histograms of the attributes
• Normalize each to obtain an approximation to the probability density function of the selected attribute
• Take the cumulative sum of the probability density function to determine the cumulative probability distribution function
• Determine percentiles for allocating points to the attributes
• Allocate a grade to customers based on the ranking of the function derived from the three attributes
• Note: It is also possible to directly compute the percentiles by simply sorting the samples and reading off the corresponding sample values at each point of interest. Both give the same results, with the above method providing a little more insight
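The histogram route and the direct sorting route can be compared with a short NumPy sketch. The sample below is synthetic (Normal with mean 5 and standard deviation 1, an assumption made only for illustration), and the bin count of 200 and the helper name `percentile_from_cdf` are likewise hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(loc=5.0, scale=1.0, size=100_000)  # synthetic attribute

# Histogram of the attribute, then normalise the counts so they sum to 1:
# a discrete approximation to the probability density function.
counts, edges = np.histogram(samples, bins=200)
pdf = counts / counts.sum()
# The cumulative sum approximates the CDF at each bin's right edge.
cdf = np.cumsum(pdf)

def percentile_from_cdf(p):
    """Right edge of the first bin whose cumulative probability reaches p."""
    idx = np.searchsorted(cdf, p, side="left")
    return edges[idx + 1]

quartiles_hist = [percentile_from_cdf(p) for p in (0.25, 0.5, 0.75)]
# Direct alternative: NumPy sorts the samples and reads off the values.
quartiles_sorted = np.quantile(samples, [0.25, 0.5, 0.75])

print(quartiles_hist)
print(quartiles_sorted)
```

In this sketch the two sets of quartiles agree to within one histogram bin width, so a finer histogram brings the first method closer to the sorted-sample values.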

[Figure: Histogram of attribute (counts, scale x 10^4) against Attribute Value from 0 to 10]

[Figure: Probability Density Function against Attribute Value from 0 to 10]

For the sake of this example, the attribute is assumed to be Normally distributed. In practice, however, the distributions of the attributes will take different forms, but the procedure for calculating the percentiles will remain the same.

[Figure: Probability Density Function against Attribute Value from 0 to 10]

[Figure: Cumulative Probability Distribution Function against Attribute Value from 0 to 10]

Attribute Range                      Points
0 <= x < 4.3 (lowest quartile)       0.25
4.3 <= x < 5 (2nd quartile)          0.5
5 <= x < 5.7 (3rd quartile)          0.75
5.7 <= x < 10 (top quartile)         1.0

[Figure: Cumulative Probability Distribution Function of Attribute, plotted from 0 to 1 against Attribute Value (x) from 0 to 10]

Mapping attribute level to points based on the percentile that the customer achieves in relation to that attribute
