Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry Chettapong Punyachonkool Dr. Arnond Sakworawich The First NIDA Business Analytics and Data Sciences Contest/Conference September 2, 2016


Page 1: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Subscriber Churn Prediction Model using Social Network Analysis

In Telecommunication Industry

Chettapong Punyachonkool

Dr. Arnond Sakworawich

The First NIDA Business Analytics and Data Sciences Contest/Conference

September 2, 2016

Page 2: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Chettapong Punyachonkool

Data Engineer, Business Intelligence Strategy

The Siam Commercial Bank

[email protected]/in/chettapong-punyachonkool

Business Analytic and Research

Applied Statistics, NIDA

Page 3: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Topics

• Social Network Analysis basic concepts
• Social Network Analysis with R
• Visualizing Social Network
• Using SNA to predict Subscriber Churn in Telco

Page 4: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Social Network Analysis Basic Concepts

Page 5: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Social Network

Social Network: A social structure composed of individuals (or organizations) interconnected by one or more specific types of interdependencies such as friendship, kinship, financial exchanges, communication exchanges, etc.

Source: Wael Elrifai (2013), Social Network Analysis: Practical Uses and Implementation. Peak Consulting, http://www.peakconsulting.eu

Page 6: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Social Network Analysis

Social Network Analysis: The application of graph theory to understand, categorize and quantify relationships in a social network.

Source: Wael Elrifai (2013), Social Network Analysis: Practical Uses and Implementation. Peak Consulting, http://www.peakconsulting.eu

Page 7: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Why should you care about SNA?

Traditional marketing practices are becoming obsolete.

• Test and control group methodologies no longer work as intended.
• Information exchange between individuals within an online social network is extremely high.
• Difficult to keep control group “pure”.
• Need to understand behaviour across and within communities rather than focusing just on individuals.
• Leverage (and protect against) high velocity of information exchange within on-line social networks.

Source: Wael Elrifai (2013), Social Network Analysis: Practical Uses and Implementation. Peak Consulting, http://www.peakconsulting.eu

Page 8: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Why should you care about SNA?

Customers are sceptical: if you want to sell your products to your customers, convince their friends.

Use social network analysis to understand more about your customers and their communities.

Source: Wael Elrifai (2013), Social Network Analysis: Practical Uses and Implementation. Peak Consulting, http://www.peakconsulting.eu

Page 9: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Customer with the Role of an Influencer

• Influential user adopts a product or behaviour.
• Influential user tells (and influences) his or her immediate contacts within the community.
• These immediate contacts tell their contacts.

It is important...

• To identify these people.
• To influence these people.
• To monitor the behaviour of these people.

Source: Wael Elrifai (2013), Social Network Analysis: Practical Uses and Implementation. Peak Consulting, http://www.peakconsulting.eu

Page 10: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Social Network Analysis Application

Source: http://www.martingrandjean.ch/connected-world-air-traffic-network/

Page 11: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Social Network Analysis Application

Source: http://www.martingrandjean.ch/connected-world-air-traffic-network/

Page 12: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Social Network Analysis Application

https://linkurious.icij.org/widget/4c040978

Page 13: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Social Network Analysis Application

Source: Apichart Wisitkitchakarn (2013), Risk Analysis of East Asian Stock Markets, The Capital Market Research Institute, The Stock Exchange of Thailand.

Page 14: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Social Network Analysis Application

Source: Valdis Krebs (2001). Connecting the Dots. Tracking Two Identified Terrorists. http://orgnet.com/tnet.html

Page 15: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

Social Network Analysis (SNA) including a tutorial on concepts and methods

Social Media – Dr. Giorgos Cheliotis ([email protected])

Communications and New Media, National University of Singapore

Page 16: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Practical applications

Newman et al, 2006

A very early example of network analysis comes from the city of Königsberg (now Kaliningrad). The famous mathematician Leonhard Euler used a graph to prove that there is no path that crosses each of the city’s bridges exactly once (Newman et al, 2006).

SNA has its origins in both social science and in the broader fields of network analysis and graph theory.

Network analysis concerns itself with the formulation and solution of problems that have a network structure; such structure is usually captured in a graph (see the circled structure to the right).

Graph theory provides a set of abstract concepts and methods for the analysis of graphs. These, in combination with other analytical tools and with methods developed specifically for the visualization and analysis of social (and other) networks, form the basis of what we call SNA methods.

But SNA is not just a methodology; it is a unique perspective on how society functions. Instead of focusing on individuals and their attributes, or on macroscopic social structures, it centers on relations between individuals, groups, or social institutions.

Page 17: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Basic Concepts

• Networks: how to represent various social networks
• Tie Strength: how to identify strong/weak ties in the network
• Key Players: how to identify key/central nodes in network
• Cohesion: measures of overall network structure

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

Page 18: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Representing relations as networks

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

Can we study their interactions as a network?

[Figure: graph with four vertices, numbered 1 to 4 for Anne, Jim, Mary and John]

Communication

Anne: Jim, tell the Murrays they’re invited

Jim: Mary, you and your dad should come for dinner!

Jim: Mr. Murray, you should both come for dinner

Anne: Mary, did Jim tell you about the dinner? You must come.

John: Mary, are you hungry?

Page 19: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Network terminology

[Figure: graph on vertices 1 to 4, with one vertex and one edge labelled]

Vertex      Edge               Field
points      lines
vertices    edges, arcs        math
nodes       links              computer science
sites       bonds              physics
actors      ties, relations    sociology

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

Page 20: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Entering data on a directed graph

[Figure: directed graph on vertices 1, 2, 3 and 4]

Edge list

Vertex  Vertex
1       2
1       3
2       3
2       4
3       4

Adjacency matrix

Vertex  1  2  3  4
1       -  1  1  0
2       0  -  1  1
3       0  0  -  0
4       0  0  1  -

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
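As a rough illustration of the two encodings on this slide, the adjacency matrix can be loaded into a graph object and read back as an edge list. This is a minimal sketch in R, assuming the igraph package is installed; the variable names `adj`, `g`, `el` and `adj_back` are illustrative and do not come from the slides.

```r
library(igraph)

# Adjacency matrix of the directed example (row = source, column = target).
# The diagonal, shown as "-" on the slide, is entered as 0.
adj <- matrix(c(0, 1, 1, 0,
                0, 0, 1, 1,
                0, 0, 0, 0,
                0, 0, 1, 0),
              nrow = 4, byrow = TRUE)

g <- graph_from_adjacency_matrix(adj, mode = "directed")

# Edge list: one row per edge, columns are source and target.
el <- as_edgelist(g)
print(el)

# Converting back to a dense matrix should reproduce the input.
adj_back <- as_adjacency_matrix(g, sparse = FALSE)
```

In the matrix, row 4 holds the edge from vertex 4 to vertex 3, so in a directed edge list the order of the two vertex columns carries meaning.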

Page 22: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Representing an undirected graph

[Figure: undirected graph on vertices 1, 2, 3 and 4]

Edge list remains the same

Vertex  Vertex
1       2
1       3
2       3
2       4
3       4

Adjacency matrix becomes symmetric

Vertex  1  2  3  4
1       -  1  1  0
2       1  -  1  1
3       1  1  -  1
4       0  1  1  -

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

Page 23: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Basic Concepts

• Networks: how to represent various social networks
• Tie Strength (current topic): how to identify strong/weak ties in the network
• Key Players: how to identify key/central nodes in network
• Cohesion: measures of overall network structure

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

Page 24: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Adding weights to edges

Edge list: add column of weights

Vertex  Vertex  Weight
1       2       30
1       3       5
2       3       22
2       4       2
3       4       37

Adjacency matrix: add weights instead of 1

Vertex  1   2   3   4
1       -   30  5   0
2       30  -   22  2
3       5   22  -   37
4       0   2   37  -

[Figure: undirected graph on vertices 1 to 4 with the edge weights listed above]

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
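The weighted edge list on this slide maps directly onto a data frame with a weight column. The sketch below, assuming the igraph package, builds the undirected graph from that data frame and recovers the weighted adjacency matrix; the names `edges`, `g` and `w` are illustrative.

```r
library(igraph)

# Weighted edge list from the slide: two vertex columns plus a weight column.
edges <- data.frame(from   = c(1, 1, 2, 2, 3),
                    to     = c(2, 3, 3, 4, 4),
                    weight = c(30, 5, 22, 2, 37))

# Extra columns of the data frame become edge attributes.
g <- graph_from_data_frame(edges, directed = FALSE)

# Weighted adjacency matrix: cells hold edge weights instead of 1.
w <- as_adjacency_matrix(g, attr = "weight", sparse = FALSE)
print(w)
```

Because the graph is undirected, the resulting matrix is symmetric, with 0 where no edge exists.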

Page 25: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Adding weights to edges

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

[Figure: the messages exchanged between Anne, Jim, Mary and John in the earlier communication example, redrawn as an undirected graph on vertices 1 to 4 with weights added to the edges]

Page 26: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Edge weights as relationship strength

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

• Edges can represent interactions, flows of information or goods, similarities/affiliations, or social relations
• Specifically for social relations, a ‘proxy’ for the strength of a tie can be:
    (a) the frequency of interaction (communication) or the amount of flow (exchange)
    (b) reciprocity in interaction or flow
    (c) the type of interaction or flow between the two parties (e.g., intimate or not)
    (d) other attributes of the nodes or ties (e.g., kin relationships)
    (e) the structure of the nodes’ neighborhood (e.g., many mutual ‘friends’)
• Surveys and interviews allow us to establish the existence of mutual or one-sided strength/affection with greater certainty, but the proxies above are also useful

Page 27: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Basic Concepts

• Networks: how to represent various social networks
• Tie Strength: how to identify strong/weak ties in the network
• Key Players (current topic): how to identify key/central nodes in network
• Cohesion: measures of overall network structure

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

Page 28: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Interpretation of measures

Centrality measure    Interpretation in social networks
Degree                How many people can this person reach directly?
Betweenness           How likely is this person to be the most direct route between two people in the network?
Closeness             How fast can this person reach everyone in the network?
Eigenvector           How well is this person connected to other well-connected people?

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

Page 29: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Degree centrality

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

• A node’s (in-) or (out-)degree is the number of links that lead into or out of the node
• In an undirected graph they are of course identical
• Often used as a measure of a node’s degree of connectedness and hence also influence and/or popularity
• Useful in assessing which nodes are central with respect to spreading information and influencing others in their immediate ‘neighborhood’

[Figure: hypothetical graph with seven nodes; degree per node: node 1 = 2, node 2 = 3, node 3 = 4, node 4 = 1, node 5 = 4, node 6 = 1, node 7 = 1]

Nodes 3 and 5 have the highest degree (4)

Page 30: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Betweenness centrality

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

• For a given node v, calculate the number of shortest paths between nodes i and j that pass through v, and divide by the number of all shortest paths between nodes i and j
• Sum the above values for all node pairs i,j
• Sometimes normalized such that the highest value is 1 or that the sum of all betweenness centralities in the network is 1
• Shows which nodes are more likely to be in communication paths between other nodes
• Also useful in determining points where the network would break apart (think who would be cut off if node 3 or node 5 disappeared)

[Figure: hypothetical graph with seven nodes; betweenness per node: node 1 = 0, node 2 = 1.5, node 3 = 6.5, node 4 = 0, node 5 = 9, node 6 = 0, node 7 = 0]

Node 5 has higher betweenness centrality than 3

Page 31: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Closeness centrality

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

• Calculate the mean length of all shortest paths from a node to all other nodes in the network (i.e. how many hops on average it takes to reach every other node)
• Take the reciprocal of the above value so that higher values are ‘better’ (indicate higher closeness) like in other measures of centrality
• It is a measure of reach, i.e. the speed with which information can reach other nodes from a given starting node

[Figure: hypothetical graph with seven nodes; closeness per node: node 1 = 0.5, node 2 = 0.67, node 3 = 0.75, node 4 = 0.46, node 5 = 0.75, node 6 = 0.46, node 7 = 0.46]

Nodes 3 and 5 have the highest (i.e. best) closeness, while node 2 fares almost as well

Note: Sometimes closeness is calculated without taking the reciprocal of the mean shortest path length. Then lower values are ‘better’.

Page 32: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Eigenvector centrality

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

• A node’s eigenvector centrality is proportional to the sum of the eigenvector centralities of all nodes directly connected to it
• In other words, a node with a high eigenvector centrality is connected to other nodes with high eigenvector centrality
• This is similar to how Google ranks web pages: links from highly linked-to pages count more
• Useful in determining who is connected to the most connected nodes

[Figure: hypothetical graph with seven nodes; eigenvector centrality per node: node 1 = 0.36, node 2 = 0.49, node 3 = 0.54, node 4 = 0.19, node 5 = 0.49, node 6 = 0.17, node 7 = 0.17]

Node 3 has the highest eigenvector centrality, closely followed by 2 and 5

Note: The term ‘eigenvector’ comes from mathematics (matrix algebra), but it is not necessary for understanding how to interpret this measure

Page 33: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Network centralities

Node  Degree  Betweenness  Closeness  Eigenvector
1     2       0            0.5        0.36
2     3       1.5          0.67       0.49
3     4       6.5          0.75       0.54
4     1       0            0.46       0.19
5     4       9            0.75       0.49
6     1       0            0.46       0.17
7     1       0            0.46       0.17

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
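The four centrality measures can be computed with igraph in a few lines. The sketch below is only an illustration: the figure itself is not reproduced here, so the edge list is an assumption, chosen so that the node degrees match the values quoted on the slides (2, 3, 4, 1, 4, 1, 1). Variable names are illustrative.

```r
library(igraph)

# ASSUMPTION: edges reconstructed to be consistent with the degrees on the slide;
# the original figure may differ.
g <- make_graph(c(1,2, 1,3, 2,3, 2,5, 3,5, 3,4, 5,6, 5,7), directed = FALSE)

deg <- degree(g)                         # number of direct neighbours
btw <- betweenness(g)                    # shortest paths passing through each node
cls <- closeness(g, normalized = TRUE)   # reciprocal of the mean shortest-path length
eig <- eigen_centrality(g)$vector        # scaled by igraph so the maximum is 1
eig <- eig / sqrt(sum(eig^2))            # rescale to unit length for comparison

print(round(data.frame(node = 1:7, degree = deg, betweenness = btw,
                       closeness = cls, eigenvector = eig), 2))
```

Under that assumed edge list, degree, betweenness and closeness agree with the values quoted on the slides (for example, betweenness 9 for node 5 and closeness 0.75 for nodes 3 and 5), and node 3 has the largest eigenvector centrality.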

Page 34: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Closeness centrality

Source: www.activenetwork.net

Page 35: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Betweenness centrality

Source: www.activenetwork.net

Page 36: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Eigenvector centrality

Source: www.activenetwork.net

Page 37: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Interpretation of measures

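Centrality: Degree
  Description: indicates how much information flows through the node.
  Value meaning: low value, little information flows through; high value, much information flows through.

Centrality: Closeness
  Description: indicates how central the node is within the network; used to measure how fast information spreads from the node through the network.
  Value meaning: low value, very important and fast spread; high value, less important and slow spread (closeness read here as mean distance, without taking the reciprocal).

Centrality: Betweenness
  Description: indicates the level of importance or the role of the node within the network.
  Value meaning: low value, less important; high value, more important.

Centrality: Eigenvector
  Description: measures the level of influence of the node within the network.
  Value meaning: low value, little influence; high value, much influence.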

Page 38: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Basic Concepts

• Networks: how to represent various social networks
• Tie Strength: how to identify strong/weak ties in the network
• Key Players: how to identify key/central nodes in network
• Cohesion (current topic): how to characterize a network’s structure

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

Page 39: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Reciprocity (degree of)

• The ratio of the number of relations which are reciprocated (i.e. there is an edge in both directions) over the total number of relations in the network
• …where two vertices are said to be related if there is at least one edge between them
• In the example to the right this would be 2/5=0.4 (whether this is considered high or low depends on the context)
• A useful indicator of the degree of mutuality and reciprocal exchange in a network, which relate to social cohesion
• Only makes sense in directed graphs

[Figure: directed graph on vertices 1 to 4; reciprocity for network = 0.4]

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
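To make the ratio concrete, the sketch below builds a hypothetical directed graph with five related pairs of vertices, two of which are reciprocated, and asks igraph for its reciprocity. The edge list is an assumption made up for illustration, not the graph from the slide. Note that igraph offers two conventions: `mode = "ratio"` counts pairs, as in the definition above, while the default counts edges.

```r
library(igraph)

# ASSUMPTION: hypothetical graph with 5 related pairs, 2 of them mutual.
g <- make_graph(c(1,2, 2,1,          # pair 1-2, reciprocated
                  3,4, 4,3,          # pair 3-4, reciprocated
                  1,3, 2,3, 2,4),    # three one-way pairs
                directed = TRUE)

r_ratio   <- reciprocity(g, mode = "ratio")  # mutual pairs / related pairs = 2/5
r_default <- reciprocity(g)                  # reciprocated edges / all edges = 4/7

print(c(ratio = r_ratio, default = r_default))
```

The two numbers differ because a mutual pair contributes two edges, so it is worth stating which convention a reported reciprocity value uses.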

Page 40: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Density

• A network’s density is the ratio of the number of edges in the network over the total number of possible edges between all pairs of nodes (which is n(n-1)/2, where n is the number of vertices, for an undirected graph)
• In the example network to the right density=5/6=0.83 (i.e. it is a fairly dense network; opposite would be a sparse network)
• It is a common measure of how well connected a network is (in other words, how closely knit it is) – a perfectly connected network is called a clique and has density=1
• A directed graph will have half the density of its undirected equivalent, because there are twice as many possible edges, i.e. n(n-1)
• Density is useful in comparing networks against each other, or in doing the same for different regions within a single network

[Figure: the four-vertex network drawn twice, with a legend distinguishing edges present in the network from edges that are possible but not present; undirected: density = 5/6 = 0.83; directed: density = 5/12 = 0.42]

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
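The two density values can be checked with igraph's `edge_density`. The sketch assumes the example network is the five-edge, four-vertex graph used on the earlier edge-list slides; that correspondence is an assumption, as are the variable names.

```r
library(igraph)

# ASSUMPTION: the same 5 edges as the earlier four-vertex example.
edges <- c(1,2, 1,3, 2,3, 2,4, 3,4)

g_undirected <- make_graph(edges, directed = FALSE)
g_directed   <- make_graph(edges, directed = TRUE)

d_undirected <- edge_density(g_undirected)  # 5 / (4 * 3 / 2)
d_directed   <- edge_density(g_directed)    # 5 / (4 * 3)

print(c(undirected = d_undirected, directed = d_directed))
```

The same five edges give half the density once direction is taken into account, because the number of possible edges doubles.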

Page 41: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Clustering

• A node’s clustering coefficient is the number of closed triplets in the node’s neighborhood over the total number of triplets in the neighborhood. It is also known as transitivity.

$$\text{clustering coefficient} = \frac{CT}{TT}$$

where CT is the number of closed triplets and TT is the total number of triplets.

[Figure: seven-node graph split into Cluster A and Cluster B; clustering coefficient per node: node 1 = 1, node 2 = 0.67, node 3 = 0.33, node 4 = N/a, node 5 = 0.17, node 6 = N/a, node 7 = N/a]

Network clustering coefficient = 0.375 (3 nodes in each triangle x 2 triangles = 6 closed triplets divided by 16 total)

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

Page 42: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Clustering

• E.g., node 1 to the right has a value of 1 because it is only connected to 2 and 3, and these nodes are also connected to one another (i.e. the only triplet in the neighborhood of 1 is closed). We say that nodes 1, 2, and 3 form a clique.

clustering coefficient (node 1) = 1

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

Page 43: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Clustering

K_n = degree of node n

N_n = number of links between neighbors of node n

CC_n = clustering coefficient of node n

$$CC_n = \frac{2 N_n}{K_n (K_n - 1)}$$

$$CC(\text{node 2}) = \frac{2(2)}{3(3 - 1)} = \frac{4}{6} = 0.67$$

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
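The node-level formula can be checked numerically. The sketch below computes the clustering coefficient from node degree and the number of links among each node's neighbours, then compares it with igraph's `transitivity`. The edge list is an assumption, reconstructed to be consistent with the per-node values quoted on the clustering slides; variable names are illustrative.

```r
library(igraph)

# ASSUMPTION: edges reconstructed from the node values on the slides.
g <- make_graph(c(1,2, 1,3, 2,3, 2,5, 3,5, 3,4, 5,6, 5,7), directed = FALSE)

k <- degree(g)   # degree of each node

# Number of links among the neighbours of each node.
nn <- sapply(1:vcount(g), function(v) {
  ecount(induced_subgraph(g, neighbors(g, v)))
})

# Twice the neighbour links over k * (k - 1); NaN for nodes with fewer than
# two neighbours, which the slide shows as N/a.
cc_manual <- 2 * nn / (k * (k - 1))

cc_local  <- transitivity(g, type = "local")   # per-node coefficient from igraph
cc_global <- transitivity(g, type = "global")  # network-level coefficient

print(round(cc_manual, 2))
print(cc_global)
```

For node 2 this gives 2(2) / (3 x 2) = 0.67, and under the assumed edge list the network-level value is 0.375.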

Page 44: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Clustering

• Clustering algorithms identify clusters or ‘communities’ within networks based on network structure and specific clustering criteria
    • Hierarchical clustering
    • Similarity based clustering
    • Betweenness clustering (example shown to the right with two clusters is based on edge betweenness, an equivalent for edges of the betweenness centrality presented earlier for nodes)

[Figure: seven-node graph labelled with node betweenness values (node 1 = 0, node 2 = 1.5, node 3 = 6.5, node 4 = 0, node 5 = 9, node 6 = 0, node 7 = 0), split into Cluster A and Cluster B]

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore

Page 45: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Average and longest distance

• The longest shortest path (distance) between any two nodes in a network is called the network’s diameter
• The diameter of the network on the right is 3; it is a useful measure of the reach of the network (as opposed to looking only at the total number of vertices or edges)
• It also indicates how long it will take at most to reach any node in the network (sparser networks will generally have greater diameters)
• The average of all shortest paths in a network is also interesting because it indicates how far apart any two nodes will be on average (average distance)

[Figure: seven-node graph with the diameter highlighted]

Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
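Both distance measures are single calls in igraph. The sketch below again uses an assumed edge list for the seven-node graph (reconstructed from the node values quoted on the slides, not taken from the figure itself), and cross-checks the built-in functions against the full matrix of shortest-path lengths.

```r
library(igraph)

# ASSUMPTION: edges reconstructed from the node values on the slides.
g <- make_graph(c(1,2, 1,3, 2,3, 2,5, 3,5, 3,4, 5,6, 5,7), directed = FALSE)

diam     <- diameter(g)        # longest shortest path between any two nodes
avg_dist <- mean_distance(g)   # mean shortest-path length over all node pairs

# Cross-check against the matrix of pairwise shortest-path lengths.
d <- distances(g)
diam_check <- max(d)
avg_check  <- mean(d[upper.tri(d)])

print(c(diameter = diam, average_distance = avg_dist))
```

With this edge list the diameter is 3, matching the slide, and the average distance is 38/21, roughly 1.81 hops.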

Page 46: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Social Network Analysis with R & Visualizing Social Network

Page 47: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Visualizing Social Network

Page 48: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Network Analysis and Visualization with R and igraph

Katherine Ognyanova, www.kateto.net

Assistant Professor at the School of Communication and Information at Rutgers University.

NetSciX 2016 School of Code Workshop, Wroclaw, Poland

Page 49: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Basic SNA with R and igraph

• Networks
• Tie Strength
• Key Players
• Cohesion

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)

Page 50: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Basic SNA with R and igraph

• Networks
    • Create network
    • Edge, Vertex and Attributes
    • Read network data from files
    • Turning networks into igraph objects
• Tie Strength
• Key Players
• Cohesion

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)

Page 51: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

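Create Network

› library(igraph)  # load igraph before running the examples
› g1 <- graph( edges=c(1,2, 2,3, 3,1), n=3, directed=FALSE )  # undirected triangle on 3 vertices
› plot(g1)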

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)

Page 52: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Create Network

› g2 <- graph( edges=c(1,2, 2,3, 3,1), n=10 )  # 10 vertices, directed by default; vertices 4 to 10 are isolated
› plot(g2)

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)

Page 53: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Create Network

› g3 <- graph( c("John", "Jim", "Jim", "Jill", "Jill", "John") )  # named vertices; edges are consecutive pairs
› plot(g3)

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)

Page 54: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Create Network

› # Edge list with named vertices; isolates adds vertices that have no edges
› g4 <- graph( c("John", "Jim", "Jim", "Jack", "Jim", "Jack", "John", "John"), isolates=c("Jesse", "Janis", "Jennifer", "Justin") )
› plot(g4, edge.arrow.size=.5, vertex.color="gold", vertex.size=15, vertex.frame.color="gray", vertex.label.color="black", vertex.label.cex=0.8, vertex.label.dist=2, edge.curved=0.2)

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)

Page 55: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Create Network

› plot(graph_from_literal(a---b, b---c))  # undirected ties
› plot(graph_from_literal(a--+b, b+--c))  # directed ties; + marks the arrow head
› plot(graph_from_literal(a+-+b, b+-+c))  # mutual (two-way) ties

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)

Page 56: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Edge, Vertex and Attributes

› # The edges of the object
› E(g4)
› # The vertices of the object
› V(g4)
› # The network matrix
› g4[]

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)

Page 57: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Edge, Vertex and Attributes

g4
› vertex_attr: name, gender
› edge_attr: type, weight

Source: Asst. Prof. Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorial, the NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)

Page 58: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Read Network from files

3.1 DATASET 1: edgelist
› Dataset1-Media-Example-NODES.csv
› Dataset1-Media-Example-EDGES.csv

3.2 DATASET 2: matrix
› Dataset2-Media-User-Example-NODES.csv
› Dataset2-Media-User-Example-EDGES.csv

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)

Page 59: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Turning networks into igraph objects

We start by converting the raw data to an igraph network object. Here we use igraph’s graph.data.frame function, which takes two data frames: d and vertices.

d describes the edges of the network. Its first two columns are the IDs of the source and the target node for each edge. The following columns are edge attributes (weight, type, label, or anything else).

vertices starts with a column of node IDs. Any following columns are interpreted as node attributes.

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)
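The conversion described above can be sketched as follows. This is a minimal illustration, not the tutorial's own code: the `links` and `nodes` tables and their values are made up, and it calls `graph_from_data_frame`, the current igraph name for `graph.data.frame`.

```r
library(igraph)

# Hypothetical edge table: the first two columns are source and target IDs,
# any further columns become edge attributes.
links <- data.frame(
  from   = c("s01", "s01", "s02", "s03"),
  to     = c("s02", "s03", "s03", "s04"),
  type   = c("hyperlink", "mention", "hyperlink", "mention"),
  weight = c(10, 3, 5, 1),
  stringsAsFactors = FALSE
)

# Hypothetical node table: the first column is the node ID,
# any further columns become vertex attributes.
nodes <- data.frame(
  id    = c("s01", "s02", "s03", "s04"),
  media = c("Outlet 1", "Outlet 2", "Outlet 3", "Outlet 4"),
  stringsAsFactors = FALSE
)

net <- graph_from_data_frame(d = links, vertices = nodes, directed = TRUE)

vcount(net)              # number of vertices
ecount(net)              # number of edges
edge_attr_names(net)     # edge attributes taken from the extra columns of d
vertex_attr_names(net)   # "name" plus the extra columns of vertices
```

The node IDs end up in the vertex attribute `name`, which is why later lookups such as `V(net)$name` work without extra bookkeeping.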

Page 60: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์


Page 61: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์


Page 62: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์


Page 63: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์


Page 64: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Basic SNA with R and igraph

› Networks
› Tie Strength
› Key Players
› Cohesion

– Plotting networks with igraph
– Assign weight
– Assign type of tie
– Delete tie

Source: Asst. Prof. Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorial, the NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)

Page 65: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Plotting networks with igraph

We can set the node & edge options in two ways:

› Specify them in the plot() call
› Set attributes and add them to the igraph object

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)
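The two options can be compared side by side. The sketch below reuses the three-person graph from the earlier slides; the specific colours and sizes are arbitrary choices for illustration.

```r
library(igraph)

g <- make_graph(c("John", "Jim", "Jim", "Jill", "Jill", "John"))

# Way 1: pass the options directly to plot(); they apply to this call only
plot(g, vertex.color = "gold", vertex.size = 15, edge.arrow.size = 0.5)

# Way 2: store the options as vertex/edge attributes on the graph object;
# every later plot(g) picks them up automatically
V(g)$color      <- "gold"
V(g)$size       <- 15
E(g)$arrow.size <- 0.5
plot(g)
```

Options passed to `plot()` take precedence over stored attributes, so the second approach sets defaults that a single call can still override.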

Page 66: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Basic SNA with R and igraph

› Networks
› Tie Strength
› Key Players
› Cohesion

– Network centrality with igraph
– Degree
– Closeness
– Betweenness
– Eigenvector

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)

Page 67: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Network centrality with igraph

› Degree centrality
› Closeness centrality
› Betweenness centrality
› Eigenvector centrality

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)
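As a small sketch of the four measures, the code below computes them with igraph on a made-up five-node undirected network (the node names A to E are placeholders, not data from the study).

```r
library(igraph)

# Hypothetical undirected toy network
g <- graph_from_literal(A-B, A-C, A-D, B-D, D-E)

deg <- degree(g)                          # number of ties per node
clo <- closeness(g, normalized = TRUE)    # (n - 1) / sum of distances to the others
btw <- betweenness(g, directed = FALSE)   # shortest paths that pass through a node
eig <- eigen_centrality(g)$vector         # influence score, scaled so the maximum is 1

data.frame(degree = deg, closeness = clo, betweenness = btw, eigenvector = eig)
```

In this toy network A and D both have degree 3 and betweenness 3, since each sits on three shortest paths between other pairs of nodes.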

Page 68: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Network centrality with igraph

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)

Page 69: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Basic SNA with R and igraph

› Networks
› Tie Strength
› Key Players
› Cohesion

– Density and Reciprocity
– Clustering
– Averages & longest distance

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)

Page 70: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Transitivity with igraph

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)
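A minimal sketch of the cohesion measures named in this part of the deck (density, reciprocity, transitivity), using two made-up toy networks: a directed one for density and reciprocity, and an undirected one for transitivity.

```r
library(igraph)

# Hypothetical directed network: A and B call each other, the other ties are one-way
g <- graph_from_literal(A+-+B, A-+C, C-+B, C-+D)

dens  <- edge_density(g, loops = FALSE)   # observed ties / possible ties = 5 / 12
recip <- reciprocity(g)                   # share of ties that are mutual = 2 / 5

# Hypothetical undirected network: triangle A-B-C plus a pendant node D
u <- graph_from_literal(A-B, A-C, B-C, C-D)

trans <- transitivity(u, type = "global") # 3 * triangles / connected triples = 3 / 5
```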

Page 71: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Community Clustering based on edge betweenness

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)
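Community detection based on edge betweenness can be tried with igraph's `cluster_edge_betweenness`. The network below is a made-up example (two triangles joined by one bridge), chosen so that the expected split is obvious.

```r
library(igraph)

# Hypothetical network: two triangles joined by the single bridge C-D
g <- graph_from_literal(A-B, B-C, A-C, D-E, E-F, D-F, C-D)

ceb <- cluster_edge_betweenness(g)

n_comm <- length(ceb)                                      # number of communities
memb   <- setNames(as.integer(membership(ceb)), V(g)$name) # community label per node
memb
```

The bridge C-D carries every shortest path between the two triangles, so it has the highest edge betweenness and is removed first, which separates {A, B, C} from {D, E, F}.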

Page 72: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Averages & longest distance

Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorialthe NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)
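Average and longest shortest-path distances can be read off with `mean_distance` and `diameter`. The chain network below is a made-up example that keeps the arithmetic easy to check by hand.

```r
library(igraph)

# Hypothetical undirected chain A - B - C - D - E
g <- graph_from_literal(A-B, B-C, C-D, D-E)

avg_dist <- mean_distance(g, directed = FALSE)  # mean shortest-path length over all pairs
longest  <- diameter(g, directed = FALSE)       # longest shortest path in the network
a_to_e   <- distances(g)["A", "E"]              # distance between two specific nodes
```

For this chain the ten node pairs have distances summing to 20, so the average distance is 2 and the diameter (A to E) is 4.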

Page 73: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Using SNA to predict Subscriber churn in Telecommunication industry

Page 74: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Customer Churn

Page 75: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

What is Customer Churn?

“Churn represents the loss of an existing customer to a competitor”

A prevalent problem in:
› Telecommunication services
› Home mortgage refinance
› Credit card

Churn is especially important to mobile phone service providers:
› Easy for a subscriber to switch services.
› Mobile Number Portability (MNP) will remove the last important obstacle.

Page 76: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Core CRM

The CRM core objectives include:

– Customer Acquisition
– Customer Retention
  › Churn Prediction
– Cross-sell/Up-sell
– Maximizing Lifetime Customer Value

Page 77: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Core CRM in Telecommunication

§ Segmentation
§ Predictive Analytics
§ Customer Acquisition
  § Cost of Customer Acquisition (COCA)
§ Servicing
  § QoS (Network)
  § Call Center, Service Hall/Center (Net Promoter Score)
§ Customer Retention
  § Churn Prediction
§ Customer Lifetime Value (CLV)

Page 78: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Segmentation

Most telcos define their customer segments using some of the following:

› Payment type (prepaid vs. postpaid)
› ARPU (Average Revenue Per User, i.e. revenue generated)
› Tenure (age of user: AOU)
› Demographics (location, income, job, gender, age, etc.)
› Usage: voice, data, other non-voice, roaming
› Handsets/Devices
  – 2G, 3G, 4G device, smartphone vs. feature phone
› Package
  – Package, price plans

Page 79: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Predictive Analytics

Process current and historical data in order to make predictions about future events.

› Making customer decisions
  › Next Best Offer
  › Package & Price plan
› Cross-sell & Up-sell opportunities
› Credit scoring for setting dynamic limits (risk management)
› Fraud detection (postpaid only)
› Revenue Allocation
› Customer Lifetime Value

Page 80: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Servicing

› Quality of Service
  – Network Utilization
  – Dropped calls
› Call Center, Service Hall
  – Number of times the customer contacts (complains to) the Call Center/Service Hall
  – Service Scoring (Net Promoter Score)

Page 81: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Customer Retention

› Type of Churn
  – Voluntary Churn
  – Involuntary Churn
› Type of Customer Retention
  – Reactive
  – Proactive

Page 82: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Source: Mohammed Akram Ayyubi Define - The Churn Reasons Taxonomy

The Churn Taxonomy

Page 83: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Data Source

Customer Demographic:
• Zip code
• Income
• Occupation
• Age
• Gender
• Living Address
• Occupation Address

Order:
• Customer Type (Corp/SME/Individual)
• Payment Type (Pre/Post)
• Current Package
• Package Plan
• ARPU
• Additional product/service

Customer Relation:
• Number of questions about the services (e.g. from IVR)
• Number of visits to retail shops or the online website
• Number of complaints solved
• Number of total complaints

Service Usage (CDR):
• Number of calls
• Volume of data usage
• Number of outgoing calls
• Number of incoming calls
• Number of roaming calls
• Number of international calls
• Total minutes of usage (MOU)/Volume
• Number of dropped calls

Page 84: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Data Source

Billing Data:
• Total amount of bill
• Total number of barred (one-way barred)
• Total number of full barred (two-way barred)

Network:
• Cell Site Location
• Network Type (4G/3G/2G)
• Network Utilization
• QoS

Page 85: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Data Mining Functionalities, Techniques & CRM Applications

Page 86: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Churn Modeling Techniques

Page 87: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Subscribers 25M
Prepaid Subs 4M
Postpaid Subs 21M
Prepaid ARPU 550 THB
Postpaid ARPU 160 THB

Source: NBTC (http://ttid.nbtc.go.th/mobile_db.html)

Page 88: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Postpaid (monthly-billed) individual subscribers of one mobile network operator whose status became service-cancelled (churner) between July and August 2015 (B.E. 2558).

July – August
Subscribers: 1.5 M
Churners: 8.2 K

Page 89: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

1. Subscriber Centrality
– X1: Subscriber Centrality Degree
– X2: Subscriber Centrality Closeness
– X3: Subscriber Centrality Betweenness
– X4: Subscriber Centrality Eigenvector

2. Number of Neighbors
– X5: Number of Churn Neighbors
– X6: Number of Non-Churn Neighbors

3. Churn Neighbors Centrality
– X7: Churn Neighbors Centrality Degree
– X8: Churn Neighbors Centrality Closeness
– X9: Churn Neighbors Centrality Betweenness
– X10: Churn Neighbors Centrality Eigenvector

4. Non-Churn Neighbors Centrality
– X11: Non-Churn Neighbors Centrality Degree
– X12: Non-Churn Neighbors Centrality Closeness
– X13: Non-Churn Neighbors Centrality Betweenness
– X14: Non-Churn Neighbors Centrality Eigenvector

Page 90: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

[Figure: conceptual framework. Subscriber Centrality, Number of Neighbors, Churn Neighbors Centrality and Non-Churn Neighbors Centrality → Subscriber Churn Status]

Page 91: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

[Figure: analysis workflow]

1. Churner CDR
2. K-means clustering (Cluster 1, Cluster 2, Cluster 3)
3. Cluster selection
4. Adjacency matrix
5. Centrality calculation
6. Churn prediction

Page 92: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

[Figure: data timeline covering June, July and August]

Subscribers: 1.5 M
Churners: 8.2 K

Page 93: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Summarize the call minutes (Call Duration) and the number of calls (Number of Calls) in a per subscriber per call format, i.e. one row per MO–MT pair.

This gives 1,747,835 relationships (links).

MO   MT   Call Duration   Number of Calls
A    B    3               1
A    C    2               4
A    D    16              2
B    D    23              9
E    D    1               1
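The summarization step can be sketched in base R as below. The raw call records, the column name `duration` and the output column names are assumptions for illustration; the deck does not show the raw CDR layout.

```r
# Hypothetical raw call detail records: one row per call
# MO = originating subscriber, MT = terminating subscriber, duration in minutes
cdr <- data.frame(
  MO       = c("A", "A", "A", "A", "A", "B", "B"),
  MT       = c("B", "C", "C", "D", "D", "D", "D"),
  duration = c(3, 1, 1, 10, 6, 20, 3)
)

# Total minutes and number of calls for each (MO, MT) pair
total <- aggregate(duration ~ MO + MT, data = cdr, FUN = sum)
count <- aggregate(duration ~ MO + MT, data = cdr, FUN = length)

links <- merge(total, count, by = c("MO", "MT"), suffixes = c("_total", "_n"))
names(links) <- c("MO", "MT", "CallDuration", "NumberOfCall")
links
```

Each row of `links` is one edge of the call network, with two weights that the clustering step can use.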

Page 94: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Group the links using K-means clustering, with the call minutes (Call Duration) and the number of calls (Number of Calls) as the clustering variables and k = 3. The resulting clusters are shown in the figure.

Blue: Cluster 1
Yellow: Cluster 2
Red: Cluster 3
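A minimal sketch of this clustering step with base R's `kmeans`. The data are simulated and the group sizes are arbitrary; whether the original analysis scaled the two variables or used several random starts is not stated, so `nstart = 25` and the unscaled inputs are assumptions.

```r
set.seed(42)

# Hypothetical link table with three rough calling patterns
links <- data.frame(
  CallDuration = c(rexp(60, 1 / 3), rnorm(40, 200, 30), rnorm(30, 2000, 100)),
  NumberOfCall = c(rpois(60, 2) + 1, rpois(40, 100), rpois(30, 350))
)

km <- kmeans(links[, c("CallDuration", "NumberOfCall")], centers = 3, nstart = 25)

table(km$cluster)   # number of links in each cluster
km$centers          # cluster centres on the original scale
```

Because Call Duration has a much larger range than Number of Calls, it dominates the Euclidean distance here; standardizing both columns first (for example with `scale()`) is a common alternative.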

Page 95: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Why choose K = 3?

1. Consider the number of members in each cluster as the number of clusters increases.

Number of transactions in each cluster group, by number of clusters:

Cluster group   K=2         K=3         K=4         K=5         K=6
1               1,737,166   1,725,216   1,705,381   1,675,572   1,638,302
2               10,669      21,801      38,463      61,628      88,167
3               -           818         3,831       9,388       16,924
4               -           -           160         1,160       3,701
5               -           -           -           87          664
6               -           -           -           -           77

2. Consider the ratio of inter-cluster distances to intra-cluster distances as the number of clusters increases.
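Both criteria can be tabulated for a range of k values. The sketch below uses simulated data and, as a stand-in for the inter/intra-cluster distance ratio (whose exact definition the deck does not give), the ratio of between-cluster to within-cluster sum of squares reported by `kmeans`.

```r
set.seed(1)

# Hypothetical two-dimensional data with three loose groups
x <- rbind(
  matrix(rnorm(200, mean = 0),  ncol = 2),
  matrix(rnorm(200, mean = 6),  ncol = 2),
  matrix(rnorm(200, mean = 12), ncol = 2)
)

summary_by_k <- do.call(rbind, lapply(2:6, function(k) {
  km <- kmeans(x, centers = k, nstart = 20)
  data.frame(
    k                    = k,
    smallest_cluster     = min(km$size),                   # criterion 1: cluster sizes
    between_within_ratio = km$betweenss / km$tot.withinss  # criterion 2: separation
  )
}))
summary_by_k
```

A k is attractive when the ratio has already risen sharply and further splits only peel off very small clusters, which matches the pattern in the table above, where clusters beyond the third hold only a few hundred to a few thousand links.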

Page 96: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Blue: Cluster 1
Yellow: Cluster 2
Red: Cluster 3

Cluster     Median of Call Duration   Median of Number of Calls   Group Description
Cluster#1   2.13                      2.00                        Few calls, short calls
Cluster#2   1,998.03                  348.50                      Few calls, long calls
Cluster#3   215.97                    101.00                      Frequent calls, short calls

Page 97: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Merging Cluster#2 and Cluster#3 gives the cluster shown in the figure.

Yellow: Cluster 2
Red: Cluster 3

Page 98: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Prepaid Customer Heavy Voice Call on Net

Page 99: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Edge list

MO   MT
A    B
A    C
A    D
B    D
E    D

Adjacency matrix

    A  B  C  D  E
A   0  1  1  1  0
B   1  0  0  1  0
C   1  0  0  0  0
D   1  1  0  0  1
E   0  0  0  1  0

[Figure: Social Network Graph]
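The edge list above can be turned into the adjacency matrix with igraph, as in this sketch. Treating the calls as undirected ties is an assumption inferred from the symmetric matrix on the slide.

```r
library(igraph)

# Edge list from the slide
edges <- data.frame(MO = c("A", "A", "A", "B", "E"),
                    MT = c("B", "C", "D", "D", "D"))

# Fix the vertex order so rows and columns come out as A..E
g <- graph_from_data_frame(edges, directed = FALSE,
                           vertices = data.frame(name = c("A", "B", "C", "D", "E")))

adj <- as_adjacency_matrix(g, sparse = FALSE)
adj
```

With `sparse = FALSE` the result is an ordinary R matrix; for a network with about 1.7 million links the default sparse representation is the more practical choice.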

Page 100: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

[Figure legend: Churner, Non-churner]

Page 101: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

[Figure legend: Churner, Non-churner]

Page 102: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

[Figure legend: Churner, Non-churner]

Page 103: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

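Using the adjacency matrix from step 4, compute four centrality measures for each subscriber, based on graph theory:

Degree
– Description: the level of information flow through the node.
– Value meaning: low value, little information flows through; high value, much information flows through.

Closeness
– Description: how central the node is within the network; used to measure how fast information spreads from the node through the network.
– Value meaning: low value, more important and faster spread; high value, less important and slower spread. (This reading treats closeness as the total distance to the other nodes; igraph's closeness() returns the reciprocal, so there a larger value means a more central node.)

Betweenness
– Description: the node's level of importance, or its role, within the network.
– Value meaning: low value, less important; high value, more important.

Eigenvector
– Description: the node's level of influence within the network.
– Value meaning: low value, little influence; high value, much influence.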

Page 104: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

5.1 Compute the centrality of every subscriber (Subscriber Centrality); this gives part (1).
5.2 Then compute the neighbors' centrality for every subscriber; this gives parts 2, 3, 4 and 5, respectively.

Adjacency matrix

    A  B  C  D  E
A   0  1  1  1  0
B   1  0  0  1  0
C   1  0  0  0  0
D   1  1  0  0  1
E   0  0  0  1  0

Step 5.1: Subscriber Centrality
Step 5.2: Neighbors Centrality

Parts labelled in the figure:
(1) Subscriber Centrality
(2) Number of Neighbors
(4) Churn Neighbors Centrality
(5) Non-Churn Neighbors Centrality

Page 105: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Compute the centrality values to create 14 independent variables in total.

1. Subscriber Centrality
– X1: Subscriber Centrality Degree
– X2: Subscriber Centrality Closeness
– X3: Subscriber Centrality Betweenness
– X4: Subscriber Centrality Eigenvector

2. Number of Neighbors
– X5: Number of Churn Neighbors
– X6: Number of Non-Churn Neighbors

Page 106: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Compute the centrality values to create 14 independent variables in total.

3. Churn Neighbors Centrality
– X7: Churn Neighbors Centrality Degree
– X8: Churn Neighbors Centrality Closeness
– X9: Churn Neighbors Centrality Betweenness
– X10: Churn Neighbors Centrality Eigenvector

4. Non-Churn Neighbors Centrality
– X11: Non-Churn Neighbors Centrality Degree
– X12: Non-Churn Neighbors Centrality Closeness
– X13: Non-Churn Neighbors Centrality Betweenness
– X14: Non-Churn Neighbors Centrality Eigenvector
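One way to build the 14 variables is sketched below on the five-node example network. Two things are assumptions, since the deck does not state them: the churn labels are made up, and a subscriber's neighbor centrality is taken as the mean over the relevant neighbors (0 when there are none).

```r
library(igraph)

g <- graph_from_literal(A-B, A-C, A-D, B-D, D-E)

# Hypothetical churn labels
churned <- c(A = FALSE, B = TRUE, C = FALSE, D = FALSE, E = TRUE)

# Four centrality measures per subscriber; rows are named A..E
cent <- cbind(degree      = degree(g),
              closeness   = closeness(g),
              betweenness = betweenness(g),
              eigenvector = eigen_centrality(g)$vector)

# Mean centrality over a set of neighbors (assumption: 0 if the set is empty)
agg <- function(ids) {
  if (length(ids) == 0) return(rep(0, ncol(cent)))
  colMeans(cent[ids, , drop = FALSE])
}

rows <- lapply(V(g)$name, function(v) {
  nb       <- neighbors(g, v)$name
  churn_nb <- nb[churned[nb]]
  stay_nb  <- nb[!churned[nb]]
  c(cent[v, ],                          # X1-X4: subscriber centrality
    length(churn_nb), length(stay_nb),  # X5-X6: number of neighbors
    agg(churn_nb),                      # X7-X10: churn neighbors centrality
    agg(stay_nb))                       # X11-X14: non-churn neighbors centrality
})

features <- as.data.frame(do.call(rbind, rows))
names(features) <- paste0("X", 1:14)
rownames(features) <- V(g)$name
round(features, 3)
```

Subscriber A, for example, has one churned neighbor (B, degree 2) and two non-churned neighbors (C and D, degrees 1 and 3), so X5 = 1, X6 = 2, X7 = 2 and X11 = 2.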

Page 107: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์
Page 108: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Principal Component Analysis

Independent variables X3–X14 are entered into a Principal Component Analysis. Three factors are retained so that the meaning of each factor can be defined clearly.
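The dimension-reduction step can be illustrated with base R's `prcomp`. This is a stand-in under stated assumptions: the data are simulated, and the deck's "PA" factor labels and loadings above 1 suggest the authors may have used a rotated factor extraction, which plain PCA does not reproduce.

```r
set.seed(7)
n <- 200

# Hypothetical data: three latent traits, each driving a block of observed variables
f <- matrix(rnorm(n * 3), ncol = 3)
X <- cbind(
  f[, 1] + matrix(rnorm(n * 4, sd = 0.3), ncol = 4),
  f[, 2] + matrix(rnorm(n * 4, sd = 0.3), ncol = 4),
  f[, 3] + matrix(rnorm(n * 3, sd = 0.3), ncol = 3)
)
colnames(X) <- paste0("V", 1:11)

pca <- prcomp(X, center = TRUE, scale. = TRUE)

importance <- summary(pca)$importance[, 1:3]  # variance explained by 3 components
scores     <- pca$x[, 1:3]                    # component scores, usable as predictors
round(pca$rotation[, 1:3], 2)                 # loadings of each variable
```

The three score columns play the role of the three factors in the later models, replacing eleven correlated variables with three summaries.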

Page 109: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Principal Component Analysis

From the PCA, the factors can be extracted (factor extraction) as follows:

Variable                                          Factor 1   Factor 2   Factor 3
X8   Churn Neighbors Centrality Closeness         1
X5   Number of Churn Neighbors                    0.998
X9   Churn Neighbors Centrality Betweenness       0.701
X7   Churn Neighbors Centrality Degree            0.555
X6   Number of Non-Churn Neighbors                           1.013
X12  Non-Churn Neighbors Centrality Closeness                1.013
X11  Non-Churn Neighbors Centrality Degree                   0.596
X13  Non-Churn Neighbors Centrality Betweenness              0.315
X10  Churn Neighbors Centrality Eigenvector                             1.048
X4   Subscriber Centrality Eigenvector                                  0.866
X14  Non-Churn Neighbors Centrality Eigenvector                         0.407

Factor 1: Churner Neighbors Centrality
Factor 2: Non-Churner Neighbors Centrality
Factor 3: Social Network Eigenvalue

Page 110: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Principal Component Analysis

X1: Subscriber Centrality Degree
X2: Subscriber Centrality Closeness
X3: Subscriber Centrality Betweenness

PA1: Churn Neighbors Centrality
– X8: Churn Neighbors Centrality Closeness
– X5: Number of Churn Neighbors
– X9: Churn Neighbors Centrality Betweenness
– X7: Churn Neighbors Centrality Degree

PA2: Non-Churn Neighbors Centrality
– X6: Number of Non-Churn Neighbors
– X12: Non-Churn Neighbors Centrality Closeness
– X11: Non-Churn Neighbors Centrality Degree
– X13: Non-Churn Neighbors Centrality Betweenness

PA3: Social Network Eigenvalue
– X10: Churn Neighbors Centrality Eigenvector
– X4: Subscriber Centrality Eigenvector
– X14: Non-Churn Neighbors Centrality Eigenvector

Y: Subscriber Churn Status

Page 111: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์
Page 112: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

In this step we build the churn prediction model and evaluate it.

Predictors:
– X1: Subscriber Centrality Degree
– X2: Subscriber Centrality Closeness
– X3: Subscriber Centrality Betweenness
– PC1: Churn Neighbors Centrality
– PC2: Non-Churn Neighbors Centrality
– PC3: Social Network Eigenvalue

[Figure: churn model evaluation. Train → Churn Models → Test → Churn Prediction → ROC]
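The train, test and ROC flow can be sketched as follows. Everything here is an assumption for illustration: the data are simulated, the 70/30 split is arbitrary, and logistic regression is used because the reported log-likelihoods and pseudo R-squared values are typical of it, although the deck does not name the model.

```r
set.seed(123)
n <- 1000

# Hypothetical predictors standing in for X1, X2 and PC3
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), pc3 = rnorm(n))
d$churn <- rbinom(n, 1, plogis(-0.5 + 1.2 * d$x1 - 0.8 * d$pc3))

# Train/test split (assumed 70/30)
train_idx <- sample(n, size = 0.7 * n)
train <- d[train_idx, ]
test  <- d[-train_idx, ]

fit <- glm(churn ~ x1 + x2 + pc3, data = train, family = binomial)
p   <- predict(fit, newdata = test, type = "response")

# Area under the ROC curve via the rank (Mann-Whitney) formulation
auc <- function(score, label) {
  r  <- rank(score)
  n1 <- sum(label == 1)
  n0 <- sum(label == 0)
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

test_auc <- auc(p, test$churn)
test_auc
```

The AUC is the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner, so it does not depend on the classification threshold.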

Page 113: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Dependent Variable (DV), all three models: Y: Subscriber Churn Status

Independent Variables (IV):
– Model#1: X1: Subscriber Centrality Degree; PC1: Churn Neighbors Centrality; PC2: Non-Churn Neighbors Centrality
– Model#2: X1: Subscriber Centrality Degree; X2: Subscriber Centrality Closeness; X3: Subscriber Centrality Betweenness
– Model#3: X1: Subscriber Centrality Degree; X2: Subscriber Centrality Closeness; PC3: Social Network Eigenvalue

                                      Model#1            Model#2            Model#3
Number of independent variables       3                  3                  3
Akaike information criterion (AIC)    22710              23546              19540

Pseudo R-squared
llh                                   -11351.1129        -11769.0146        -9765.7565
llhNull                               -14840.6155        -14840.6155        -14840.6155
McFadden's pseudo R-squared           0.2351             0.2070             0.3420
Cragg and Uhler's pseudo R-squared    0.3687             0.3305             0.5009

Model evaluation
Accuracy                              0.7247             0.7201             0.7270
95% CI                                (0.7155, 0.7338)   (0.7108, 0.7292)   (0.7178, 0.736)
Kappa                                 0.4072             0.3969             0.4115
Sensitivity                           0.9891             0.9864             0.9944
Specificity                           0.3938             0.3867             0.3923
Pos Pred Value                        0.6713             0.6681             0.6719
Neg Pred Value                        0.9666             0.9579             0.9824
Prevalence                            0.5559             0.5559             0.5559
Detection Rate                        0.5498             0.5483             0.5527
Detection Prevalence                  0.8191             0.8207             0.8226
Balanced Accuracy                     0.6915             0.6866             0.6934
AUC                                   0.8556             0.8511             0.8581
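The pseudo R-squared and AIC rows can be cross-checked from the reported log-likelihoods. The only assumption in this sketch is the parameter count k = 4 (three predictors plus an intercept), which is what makes the AIC values line up.

```r
# Log-likelihoods reported on the slide
llh      <- c(Model1 = -11351.1129, Model2 = -11769.0146, Model3 = -9765.7565)
llh_null <- -14840.6155

k <- 4   # assumption: 3 predictors + intercept

mcfadden <- 1 - llh / llh_null   # McFadden's pseudo R-squared
aic      <- -2 * llh + 2 * k     # Akaike information criterion

round(mcfadden, 4)
round(aic)
```

Both rows reproduce the table, and both point to Model#3 as the best fit (highest McFadden value, lowest AIC), even though its test-set accuracy and AUC are only slightly higher than those of the other two models.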

Page 114: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Confusion Matrix (rows: predicted, columns: reference)

               Model#1            Model#2            Model#3
               Ref 1    Ref 0     Ref 1    Ref 0     Ref 1    Ref 0
Predicted 1    1,622    56        1,593    70        1,616    29
Predicted 0    2,497    5,099     2,526    5,085     2,503    5,126

AUC            0.8556             0.8511             0.8581

[Figures: ROC Model#1, ROC Model#2, ROC Model#3]
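The evaluation metrics follow directly from the confusion matrix. The sketch below recomputes them for Model#1; the reported values are reproduced when class "0" is treated as the positive class, which is an inference from the numbers rather than something the deck states.

```r
# Model#1 confusion matrix from the slide (rows: predicted, columns: reference)
cm <- matrix(c(1622,   56,
               2497, 5099),
             nrow = 2, byrow = TRUE,
             dimnames = list(predicted = c("1", "0"), reference = c("1", "0")))

# Positive class assumed to be "0"
tp <- cm["0", "0"]; fn <- cm["1", "0"]
fp <- cm["0", "1"]; tn <- cm["1", "1"]
n  <- sum(cm)

accuracy    <- (tp + tn) / n
sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)
ppv         <- tp / (tp + fp)
npv         <- tn / (tn + fn)
balanced    <- (sensitivity + specificity) / 2

# Cohen's kappa: agreement beyond what the marginals would give by chance
p_expected  <- ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n^2
cohen_kappa <- (accuracy - p_expected) / (1 - p_expected)

round(c(accuracy = accuracy, kappa = cohen_kappa, sensitivity = sensitivity,
        specificity = specificity, ppv = ppv, npv = npv, balanced = balanced), 4)
```

Read with "1" as the churn class, the specificity of about 0.39 means that only about 39% of actual churners (1,622 of 4,119) are flagged at this threshold, while almost all non-churners are classified correctly.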

Page 115: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

Churner Non-churner

Page 116: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

[Figure legend]
– Churner / Predicted Churner
– Non-churner / Predicted Non-churner
– Non-churner / Predicted Churner
– Churner / Predicted Non-churner

Page 117: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

[Figure legend]
– Churner / Predicted Churner
– Non-churner / Predicted Non-churner
– Non-churner / Predicted Churner
– Churner / Predicted Non-churner

Page 118: Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry โดย เชษฐพงศ์ ปัญญาชนกุล อาจารย์

[Figure legend]
– Churner / Predicted Churner
– Non-churner / Predicted Non-churner
– Non-churner / Predicted Churner
– Churner / Predicted Non-churner
