Segmentation of stock trading customers according to potential value
H.W. Shin a,*, S.Y. Sohn b
a Samsung Economy Research Institute, Kukje Center Building, 191, Hangangro 2-Ga, Seoul, South Korea
b Department of Computer Science and Industrial Systems Engineering, Yonsei University, Seoul, South Korea
Abstract
In this article, we use three clustering methods (K-means, self-organizing map, and fuzzy K-means) to find properly graded stock market
brokerage commission rates based on the 3-month long total trades of two different transaction modes (representative assisted and online
trading system). Stock traders for both modes are classified in terms of the amount of the total trade as well as the amount of trade of each
transaction mode, respectively. Results of our empirical analysis indicate that fuzzy K-means cluster analysis is the most robust approach for
segmentation of customers of both transaction modes. We then propose a decision tree based rule to classify three groups of customers
(Normal, Best, and VIP) and suggest brokerage commission rates of 0.5, 0.45, and 0.4% for the representative assisted mode and
0.18, 0.1, and 0.06% for the online trading system, respectively.
© 2003 Elsevier Ltd. All rights reserved.
Keywords: Customer relationship management; Customer segmentation; K-means clustering; Self-organizing map; Fuzzy K-means
1. Introduction
The Korean stock market grew rapidly in the 1990s. In spite of the financial crisis that hit Korea in 1997, there were more than 30 domestic security corporations, and the daily average stock transaction reached 4800 billion won in 2000, compared to 4100 billion won a year earlier. This indicates that transaction-based commission increased considerably as well. Commission is one of the main sources of profit for security corporations, and each corporation sets its own commission rate to increase profit. The rate is typically based on the amount of each individual trade. However, this kind of system does not consider the potential value of a customer over time. Customers who have traded more, cumulatively and continuously over a longer period, need to be treated better (Hartfeil, 1996). In commercial banking, Zeithaml, Rust, and Lemon (2001) reported that the top 20% of customers produced 82% of the bank’s retail profit. Hunt (1999) showed that the fee system of an insurance corporation should not be uniform but should vary according to each customer’s potential value. These arguments support the value of better treatment of loyal customers.
In this article, we propose a robust clustering algorithm to classify stock traders into several groups in terms of their 3-month transactions, in order to suggest a graded commission policy for each group.
Variables used for clustering criteria are transactions
made on both representative assisted trading and online
Home Trading System (HTS). Clustering methods used are
K-means clustering, self-organizing map (SOM), and fuzzy
K-means method. The cut-off value of each customer group
is set based on classification and regression tree (CART).
The rest of this article is organized as follows. In Section 2 we describe three clustering methods along with the performance measure used for comparison. In Section 3 we apply these algorithms to field data and obtain three groups of customers. In Section 4 we present a new brokerage commission rate and compare it with the existing commission rate in terms of profit. Finally, in Section 5, we discuss the implications of our results and suggest areas for further study.
2. Three clustering algorithms
Cluster analysis groups objects (observations) on the basis of their variables. We use three kinds of clustering methods for customer segmentation: K-means, SOM, and fuzzy K-means. For a brief description of each method, let us assume that we are interested in clustering $N$ samples with respect to $P$ variables into $K$ clusters. For sample $i$, $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip}, \ldots, x_{iP})$ represents a vector of $P$ characteristic variables. Typically $K$ is unknown, but for stock customer segmentation we use $K = 3$.
0957-4174/$ - see front matter © 2003 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2003.12.002
Expert Systems with Applications 27 (2004) 27–33
www.elsevier.com/locate/eswa
* Corresponding author. Tel.: +82-2-3780-8022; fax: +8-22-3780-8152.
E-mail addresses: [email protected] (H.W. Shin); [email protected] (S.Y. Sohn).
2.1. K-means clustering algorithm
The K-means method is widely used because it processes large data sets quickly. K-means clustering proceeds in the following order. First, $K$ observations are randomly selected from the $N$ observations, one for each cluster; they become the centers of the initial clusters. Second, for each of the remaining $N - K$ observations, the nearest cluster is found in terms of the Euclidean distance with respect to $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip}, \ldots, x_{iP})$. After each observation is assigned to the nearest cluster, the center of that cluster is recomputed. Last, after all observations have been allocated, the Euclidean distance between each observation and each cluster’s center point is calculated to confirm whether the observation is allocated to the nearest cluster.
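The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: it uses the batch variant that recomputes all centers after a full assignment pass (rather than after each single assignment), and the function name `kmeans` and its parameters `max_iter` and `seed` are assumptions made for this example.

```python
import math
import random


def kmeans(points, k=3, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # Step 1: pick k distinct observations as the initial centers.
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Step 2: assign each observation to the nearest center (Euclidean).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # every observation already sits in its nearest cluster
        labels = new_labels
        # Step 3: recompute each center as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers
```

Because the initial centers are drawn at random, different seeds can lead to different partitions; running the sketch with several seeds and keeping the most compact result is a common precaution.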
2.2. Self-organizing map
The SOM is an unsupervised neural network model
devised by Kohonen (1982). As with other neural networks
the analysis is based on the solution of a large number of
simple operations that can be performed in parallel. The
SOM network typically has two layers of nodes: an input
layer and an output layer. The neurons in the output layer
are arranged in a grid and are influenced by their neighbors
in this grid. The goal is to automatically cluster the input
samples in such a way that similar samples are represented
by the same output neuron (Kim & Han, 2001; Mangiameli,
Chen, & West, 1996).
Since each of the characteristic variables is linked to every output neuron by a weighted connection, each output neuron $j$ $(j = 1, \ldots, K)$ has a weight vector $w_j$ with as many weights as there are input variables. Starting from randomly initialized weights, the network learns to adapt its weights according to the input samples as follows.
When an input sample $x_i$ is presented to the SOM network, the neurons compute the distance between the weight vectors $w_j = (w_{j1}, w_{j2}, \ldots, w_{jp}, \ldots, w_{jP})$ and the input $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip}, \ldots, x_{iP})$. The neuron with the minimum distance, called the winner, is then determined based on
$$\min_j D_j = \sqrt{\sum_{p=1}^{P} (x_{ip} - w_{jp})^2} \qquad (1)$$
where $w_{jp}$ is the weight of the $j$th neuron linked to the $p$th variable.
The weights of the winner and of the neurons in its neighborhood are then updated using the following equation:
$$w_j^{\mathrm{new}} = w_j^{\mathrm{old}} + \alpha \, (x_i - w_j^{\mathrm{old}}) \qquad (2)$$
where $w_j^{\mathrm{new}}$ is the new weight vector and $w_j^{\mathrm{old}}$ is the old weight vector of the $j$th neuron, and $\alpha$ is the learning rate $(0 < \alpha < 1)$. This procedure ends when the difference in the error (e.g. the average of the Euclidean distances between each input sample and its best matching weight vector) between the current and the previous iteration is smaller than a given value $\varepsilon$. After the stop criterion is satisfied, each neuron in the network represents a cluster.
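As a rough illustration of the training loop just described, the sketch below trains a one-dimensional grid of $K$ output neurons. Several choices are assumptions made for this example and are not stated in the text: the fixed neighborhood `radius` of one grid position, the constant learning rate, the initialization inside the data range, and the names `train_som` and `assign`.

```python
import math
import random


def train_som(samples, k=3, alpha=0.1, radius=1, eps=1e-6, max_epochs=200, seed=0):
    """Train a 1-D SOM with k output neurons; return the weight vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    lo = [min(s[p] for s in samples) for p in range(dim)]
    hi = [max(s[p] for s in samples) for p in range(dim)]
    # Random initial weights inside the range of the data.
    weights = [[rng.uniform(lo[p], hi[p]) for p in range(dim)] for _ in range(k)]
    prev_error = math.inf
    for _ in range(max_epochs):
        for x in samples:
            # Eq. (1): the winner has the smallest Euclidean distance to x.
            winner = min(range(k), key=lambda j: math.dist(x, weights[j]))
            # Eq. (2): move the winner and its grid neighbors toward x.
            for j in range(max(0, winner - radius), min(k, winner + radius + 1)):
                weights[j] = [w + alpha * (xp - w) for w, xp in zip(weights[j], x)]
        # Stop when the average matching error changes by less than eps.
        error = sum(min(math.dist(x, w) for w in weights) for x in samples) / len(samples)
        if abs(prev_error - error) < eps:
            break
        prev_error = error
    return weights


def assign(sample, weights):
    """Index of the best-matching neuron, used as the cluster label."""
    return min(range(len(weights)), key=lambda j: math.dist(sample, weights[j]))
```

After training, `assign(x, weights)` gives the cluster of a sample as the index of its best-matching neuron.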
2.3. Fuzzy K-means clustering analysis
Fuzzy set theory was introduced in the 1960s as a
way of explaining uncertainty in data structure (Zadeh,
1965). Fuzzy K-means (also known as fuzzy c-means)
clustering has been investigated by Bezdek (1981) and
was compared to the non-fuzzy clustering method.
Hruschka (1986) and Weber (1996) showed in their
empirical study that fuzzy clustering provided more
insight than non-fuzzy clustering in terms of market
segment information.
Fuzzy clustering segments the samples into $K$ clusters $(1 < K < N)$, estimates sample cluster membership and simultaneously estimates the cluster centers. The membership of $x_i$ in cluster $s$, $u_{si}$, is between 0 and 1 and is defined as follows (Ozer, 2001):
$$u_{si} = \frac{1}{\sum_{j=1}^{K} \left( \dfrac{\|x_i - v_s\|^{2/(m-1)}}{\|x_i - v_j\|^{2/(m-1)}} \right)}, \quad \text{for } x_i \neq v_j,\ \forall s, i, \text{ and } m > 1 \qquad (3)$$
where $m$ is the smoothing parameter which controls the fuzziness of the clusters, and $v_s = (v_{s1}, v_{s2}, \ldots, v_{sp}, \ldots, v_{sP})$ is the center vector of cluster $s$, defined as
$$v_s = \frac{\sum_{i=1}^{N} (u_{si})^m x_i}{\sum_{i=1}^{N} (u_{si})^m}, \quad \forall s. \qquad (4)$$
The optimal value of $u$ is obtained so as to minimize the following objective function:
$$\min \sum_{i=1}^{N} \sum_{s=1}^{K} (u_{si})^m \, \|x_i - v_s\|^2 \qquad (5)$$
The constraints used are as follows:
$$0 \le u_{si} \le 1, \quad \forall s, i \qquad (6)$$
$$\sum_{s=1}^{K} u_{si} = 1, \quad \forall i. \qquad (7)$$
Condition (6) ensures that the degrees of membership are between 0 and 1, and condition (7) means that, for a given sample, the degrees of membership across the clusters sum to one. Once the optimal values of $u$ are found, each sample is assigned to the cluster for which its membership $u$ is highest.
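One common way to approach the minimization in Eq. (5) is to alternate the membership update of Eq. (3) and the center update of Eq. (4) until the centers stop moving. The sketch below follows that scheme; the text does not spell out the iteration used in the study, so the stopping rule (`tol`), the random initialization, the handling of a sample that coincides with a center, and the name `fuzzy_kmeans` are all assumptions made for illustration.

```python
import math
import random


def fuzzy_kmeans(points, k=3, m=1.2, max_iter=100, tol=1e-6, seed=0):
    """Alternate Eqs. (3) and (4); return (u[s][i], centers, hard labels)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    centers = [list(p) for p in rng.sample(points, k)]
    u = [[0.0] * n for _ in range(k)]
    power = 2.0 / (m - 1.0)
    for _ in range(max_iter):
        # Eq. (3): membership of sample i in cluster s.
        for i, x in enumerate(points):
            d = [math.dist(x, v) for v in centers]
            if min(d) < 1e-12:
                # x (almost) coincides with a center, the case excluded in
                # Eq. (3): give it full membership in that cluster.
                hit = d.index(min(d))
                for s in range(k):
                    u[s][i] = 1.0 if s == hit else 0.0
            else:
                for s in range(k):
                    u[s][i] = 1.0 / sum((d[s] / d[j]) ** power for j in range(k))
        # Eq. (4): each center is the membership-weighted mean of the samples.
        new_centers = []
        for s in range(k):
            w = [u[s][i] ** m for i in range(n)]
            total = sum(w)
            new_centers.append(
                [sum(w[i] * points[i][p] for i in range(n)) / total for p in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # Hard assignment: the cluster with the highest membership.
    labels = [max(range(k), key=lambda s: u[s][i]) for i in range(n)]
    return u, centers, labels
```

With $m$ close to 1, as in the value 1.2 used later in the case study, the exponent $2/(m-1)$ is large, so memberships are pushed toward 0 or 1 and the result behaves much like hard K-means. Scaling the variables first keeps the distance ratios in a numerically safe range.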
2.4. Performance comparison of the three clustering methods
We compare the performance of these clustering methods using the intraclass inertia measure presented in Michaud (1997). Intraclass inertia measures how compact each cluster (class) is when the number of clusters is fixed. Usually the variables are scaled to be in the same range (Nair & Narendran, 1997). The mean of the $j$th cluster $C_j$, which has $n_j$ samples, is defined as
$$\bar{x}_j = (\bar{x}_{j1}, \bar{x}_{j2}, \ldots, \bar{x}_{jp}, \ldots, \bar{x}_{jP}), \quad \text{where } \bar{x}_{jp} = \frac{1}{n_j} \sum_{i \in C_j} x_{ip} \qquad (8)$$
The intraclass inertia $I_j$ of cluster $j$ is defined as
$$I_j = \frac{1}{n_j} \sum_{i \in C_j} \sum_{p=1}^{P} (x_{ip} - \bar{x}_{jp})^2 \qquad (9)$$
Finally, the intraclass inertia $F(K)$ for a given $K$ clusters is defined as
$$F(K) = \frac{1}{N} \sum_{j=1}^{K} n_j I_j = \frac{1}{N} \sum_{j=1}^{K} \sum_{i \in C_j} \sum_{p=1}^{P} (x_{ip} - \bar{x}_{jp})^2 \qquad (10)$$
One can see that $F(K)$ is the average squared Euclidean distance between each observation and its cluster mean.
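Since $F(K)$ is just the average squared distance of each observation to the mean of its cluster, it can be computed directly from a list of points and their cluster labels. The following is a small sketch; the function name `intraclass_inertia` and the list-of-tuples input format are assumptions for this example.

```python
def intraclass_inertia(points, labels):
    """F(K): mean squared Euclidean distance of each point to its cluster mean."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for members in clusters.values():
        # Eq. (8): per-variable mean of the cluster.
        mean = [sum(c) / len(members) for c in zip(*members)]
        # Sum of squared deviations of the cluster's members from that mean.
        total += sum((x - mu) ** 2 for p in members for x, mu in zip(p, mean))
    # Eq. (10): divide by the total number of observations.
    return total / len(points)
```

For example, the points (0, 0), (2, 0), (10, 0), (14, 0) with labels 0, 0, 1, 1 have cluster means (1, 0) and (12, 0), squared deviations 1, 1, 4, 4, and therefore $F(2) = 10 / 4 = 2.5$. A lower value means more compact clusters.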
3. A case study
We randomly selected 3000 customers of stock corporation ‘A’ who had transaction records from the middle of July to the middle of October 1999 and applied the three clustering methods.
The stock transaction mode is either representative assisted or online HTS. HTS customers buy and sell their stocks directly, without the advice of the corporation’s representatives. Descriptive statistics of the sample data are as follows.
About 78% of the total trade amount was made through online HTS.
In terms of gender, 68% of the customers are male. However, female customers accounted for 51% and 52% of the average trade amount in the representative assisted and online HTS modes, respectively. This suggests the importance of a marketing strategy for HTS and for female customers. In terms of age, customers older than 60 mostly used the representative assisted mode, and their trade amount is the highest among all age groups in both transaction modes. The average transaction frequency is 1.8 times per month in the representative assisted mode and six times per month in online HTS.
We also estimated the correlation between the trade amount in each transaction mode and their sum. The correlation between the two modes is relatively low (0.38), while the correlations between each single mode and the total transactions are 0.76 and 0.89 for representative assisted and online HTS, respectively.
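The text does not say which correlation coefficient was used; assuming the usual Pearson coefficient, the three values can be obtained as in the sketch below. The trade amounts in the example are hypothetical numbers invented purely for illustration and are not taken from the study's data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical trade amounts (million won) for five customers.
assisted = [5.0, 0.0, 12.0, 3.0, 40.0]
hts = [20.0, 35.0, 1.0, 60.0, 90.0]
total = [a + h for a, h in zip(assisted, hts)]

r_modes = pearson(assisted, hts)             # between the two modes
r_assisted_total = pearson(assisted, total)  # assisted mode vs. total
r_hts_total = pearson(hts, total)            # HTS mode vs. total
```

Because the total is the sum of the two modes, each mode is expected to correlate more strongly with the total than with the other mode, which matches the pattern reported above.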
3.1. Cluster analysis of customers
Clustering methods are used to segment the customers of each mode separately, using two variables per mode. The variables used for the representative assisted mode are ‘total trade amount’ and ‘representative assisted trade amount’ over the 3-month period. For the HTS mode, we use total trade amount and ‘trade amount in HTS’ over the 3-month period.
Customers are segmented into three clusters (Normal, Best, and VIP customers). After some experimentation with the parameters of the clustering methods, we set the SOM learning rate $\alpha$ to 0.1 and the fuzzy K-means smoothing parameter $m$ to 1.2.
For comparison, the resulting compactness of the clusters from the three clustering methods (K-means, SOM, fuzzy K-means) is summarized in Table 1.
In the segmentation of customers in the representative assisted mode, the K-means clustering method turns out to be the best, while in the segmentation of customers in HTS, the fuzzy K-means method is the winner. Table 2 and Fig. 1 present the segmentation of customers in the representative assisted mode using K-means clustering, while Table 3 and Fig. 2 present the segmentation of HTS customers using fuzzy K-means.
Table 1
Intraclass inertia of each clustering method
Clustering method   Mode                           Intraclass inertia
K-means             Representative assisted mode   7.2685 × 10^16 *
                    HTS                            1.09 × 10^17
SOM                 Representative assisted mode   1.04394 × 10^17
                    HTS                            1.12 × 10^18
Fuzzy K-means       Representative assisted mode   7.29 × 10^16
                    HTS                            9.55 × 10^16 *
* Lowest intraclass inertia for the mode.
Table 2
The segmentation of customers in representative assisted mode using K-means
Class    Number of    Cluster center: total trade    Cluster center: trade amount in representative
         customers    amount for 3 months (won)      assisted mode for 3 months (won)
Normal   2969         31.0 million                   6.4 million
Best     30           84.0 billion                   25.6 billion
VIP      1            558.0 billion                  297.3 billion
The results indicate that the numbers of Best and VIP customers are smaller in the representative assisted mode than in HTS.
As shown in Figs. 1 and 2, one VIP customer has a very large total trade amount (558 billion won for 3 months). This customer may be considered an outlier. Therefore, we also compare the clustering results without this particular customer. The results are given in Table 4.
In this case, fuzzy K-means has the best performance in
representative assisted mode. SOM is the most suitable in
HTS, but fuzzy K-means produces fairly good performance
as well. Generally, we can conclude that fuzzy K-means
provides relatively robust results in terms of intraclass
inertia for both modes.
3.2. Classification of three groups of customers
In practice, we need threshold values to classify the three different groups. We use a decision tree to find the threshold values for customer segmentation in both transaction modes. The outcome class (Normal, Best, VIP) is the one assigned by fuzzy K-means after deleting the outlier. Seventy percent of the data of the 2999 customers (all except the outlier) are assigned to training and 30% to validation, using a segment-based stratified sampling approach. We then use the CART algorithm to find the threshold values for the three groups.
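The core step of a CART-style tree is the search for the cut-off on one variable that best separates the classes. The sketch below shows that step for a single variable, using Gini impurity as the split criterion; the criterion is an assumption here (it is the usual choice for CART classification, but the text does not name it), and the function names are hypothetical. A full tree repeats this search over every variable and then recursively within each resulting branch.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) of the best split `value < threshold`."""
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))  # no split: impurity of the whole node
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point between two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best
```

For instance, `best_threshold([1, 2, 3, 10, 11, 12], ["Normal"] * 3 + ["Best"] * 3)` returns `(6.5, 0.0)`: the midpoint between 3 and 10 separates the two classes perfectly.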
The trees in Figs. 3 and 4 show the threshold values for customer segmentation. From Fig. 3, if the total trade amount of both modes for 3 months is less than about 19.3 billion won, customers are defined as Normal customers. If the total trade amount of both modes for 3 months is more than 19.3 billion won and the trade amount in the representative assisted mode for 3 months is less than 125 billion won, they are defined as Best customers. The other customers are VIP customers.
From Fig. 4, if the trade amount in HTS for 3 months is less than about 13.6 billion won and the total trade amount of both modes is less than 23.3 billion won, customers are defined as Normal customers. If the trade amount in HTS for 3 months is more than 13.6 billion won and the total trade amount of both modes is more than 75.9 billion won, they are defined as VIP customers. The rest are considered Best customers.
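The two sets of rules can be written as small classification functions. This is only a transcription of the thresholds quoted above; the function names are hypothetical, and the treatment of amounts that fall exactly on a threshold (sent to the higher branch here) is an assumption, since the text gives the cut-offs only approximately.

```python
BILLION = 1_000_000_000


def classify_assisted(total_amount, assisted_amount):
    """Tier in the representative assisted mode (thresholds quoted from Fig. 3)."""
    if total_amount < 19.3 * BILLION:
        return "Normal"
    if assisted_amount < 125 * BILLION:
        return "Best"
    return "VIP"


def classify_hts(total_amount, hts_amount):
    """Tier in the HTS mode (thresholds quoted from Fig. 4)."""
    if hts_amount < 13.6 * BILLION and total_amount < 23.3 * BILLION:
        return "Normal"
    if hts_amount >= 13.6 * BILLION and total_amount >= 75.9 * BILLION:
        return "VIP"
    return "Best"
```

As a quick check, the cluster centers reported in Tables 2 and 3 fall into the expected tiers: for example, `classify_hts(41.8 * BILLION, 30.9 * BILLION)` gives `"Best"`.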
Fig. 1. Transaction distribution in a representative assisted mode for 3 months.
Table 3
The segmentation of customers of HTS using fuzzy K-means
Class    Number of    Cluster center: total trade    Cluster center: trade amount in
         customers    amount for 3 months (won)      HTS mode for 3 months (won)
Normal   2915         26.1 million                   19.9 million
Best     80           41.8 billion                   30.9 billion
VIP      5            261.8 billion                  199.1 billion
4. New brokerage commission policy
In this section, we suggest a graded brokerage commission policy based on the three clusters of customers. The new policy must be attractive enough to keep existing customers from churning and at the same time should yield sufficient profit for the security corporation. As shown in Table 5, we suggest that the commission rates for Normal, Best, and VIP customers be 0.5, 0.45, and 0.4% in the representative assisted mode and 0.18, 0.1, and 0.06% in HTS, respectively. This policy is then compared to the existing commission system of stock corporation ‘A’ (see Table 6).
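Under the proposed policy, the commission on a trade depends only on the transaction mode and the customer's tier. A minimal sketch of that lookup follows; the dictionary layout and the names `PROPOSED_RATE_PCT` and `proposed_commission` are assumptions for illustration, while the rates themselves are the ones proposed above.

```python
# Proposed rates, in percent of the traded amount.
PROPOSED_RATE_PCT = {
    "assisted": {"Normal": 0.5, "Best": 0.45, "VIP": 0.4},
    "hts": {"Normal": 0.18, "Best": 0.1, "VIP": 0.06},
}


def proposed_commission(mode, tier, trade_amount):
    """Commission in won for a trade amount under the proposed graded policy."""
    return trade_amount * PROPOSED_RATE_PCT[mode][tier] / 100.0
```

For example, a Normal customer trading 1,000,000 won in the representative assisted mode would pay 5,000 won, whereas a VIP customer trading the same amount through HTS would pay 600 won.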
Next, we compare the profit under the existing commission policy with the profit under the proposed commission policy in Table 7. The proposed commission policy is based on the threshold values obtained by the decision tree using the fuzzy K-means classes.
As shown in Table 7, the new policy would provide an expected profit similar to that of the existing policy: about 2.78 billion won compared with 2.82 billion won, or roughly 1.4% lower. However, it should be noted that the proposed commission policy has additional positive effects on customer relationship management (CRM) by recognizing the value of different levels of customers. Therefore, we expect that, in the long run, the new policy would bring higher profit than the existing commission policy.
5. Conclusion
In this article, we found fuzzy K-means clustering to be the most stable method for grouping stock trading customers and used it to classify customers into three tiers (Normal, Best, and VIP) based on the total trade amount over a 3-month period. For these groups, brokerage commission rates of 0.5, 0.45, and 0.4% are assigned for the representative assisted mode and 0.18, 0.1, and 0.06% for HTS, respectively.
This approach differs from the existing graded commission policy in that the proposed policy grades the commission on the historically accumulated transaction amount of each customer. This new approach is expected to bring more profit by treating loyal customers better and subsequently retaining them for a longer term.
The data used in this article for clustering contain a relatively short history of customers’ transactions. After the data warehousing project is completed and a larger amount of information has accumulated, the clustering may need to be redone for tuning.
Our new policy depends mainly on the cumulative transaction amount. Other factors, such as the frequency of transactions, may need to be included in the policy.
Fig. 2. Transaction distribution in HTS for 3 months.
Table 4
Intraclass inertia by clustering method (without a particular customer)
Cluster analysis method   Mode                           Intraclass inertia
K-means                   Representative assisted mode   6.19 × 10^16
                          HTS                            6.82 × 10^16
SOM                       Representative assisted mode   7.45 × 10^16
                          HTS                            5.43 × 10^16 *
Fuzzy K-means             Representative assisted mode   4.12 × 10^16 *
                          HTS                            5.48 × 10^16
* Lowest intraclass inertia for the mode.
Fig. 3. Classifying the customers of the representative assisted mode (unit: won; the number in parentheses is the count per class).
Fig. 4. Classifying the customers of the HTS mode (unit: won; the number in parentheses is the count per class).
Further variations of the approach based on longer time-series data are left for future study.
Acknowledgement
This work was supported by grant No. R04-2002-000-
20003-0 from Korea Science & Engineering Foundation.
References
Bezdek, J. C. (1981). Pattern recognition with fuzzy objective function
algorithms. New York: Plenum Press.
Hartfeil, G. (1996). Bank one measures profitability of customers, not just
products. Journal of Retail Banking Services, 18(2), 23–29.
Hruschka, H. (1986). Market definition and segmentation using fuzzy
clustering methods. International Journal of Research in Marketing, 3,
117–134.
Hunt, P. (1999). The pricing is right. Canadian Insurance Statistics,
26–28.
Kim, K. S., & Han, I. (2001). The cluster-indexing method for case-based
reasoning using self-organizing maps and learning vector quantization
for bond rating cases. Expert Systems with Applications, 21(3),
147–156.
Kohonen, T. (1982). Self-organized formation of topologically correct
feature maps. Biological Cybernetics, 43(1), 59–69.
Mangiameli, P., Chen, S. K., & West, D. A. (1996). Comparison of SOM
neural network and hierarchical clustering methods. European Journal
of Operational Research, 93(2), 402–417.
Michaud, P. (1997). Clustering techniques. Future Generation Computer
Systems, 13(2), 135–147.
Nair, G. J., & Narendran, T. T. (1997). Cluster goodness: a new measure of
performance for cluster formation in the design of cellular manufactur-
ing systems. International Journal of Production Economics, 48(1),
49–61.
Ozer, M. (2001). User segmentation of online music services using fuzzy
clustering. Omega, 29(2), 193–206.
Weber, R. (1996). Customer segmentation for banks and insurance groups
with fuzzy clustering techniques. In J. F. Baldwin (Ed.), Fuzzy logic.
New York: Wiley.
Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8, 338–353.
Zeithaml, V. A., Rust, R. T., & Lemon, K. N. (2001). The customer
pyramid: creating and serving profitable customers. California
Management Review, 43(4), 118–142.
Table 5
Newly proposed commission rate
Class    Brokerage commission in representative    Brokerage commission
         assisted mode (%)                         in HTS (%)
Normal   0.5                                       0.18
Best     0.45                                      0.1
VIP      0.4                                       0.06
Table 6
Currently used commission rates of ‘A’ stock corporation
Mode                           Amount of transaction        Brokerage commission
Representative assisted mode   Under 200 million            0.5%
                               From 200 to 500 million      0.45% + 1000
                               Over 500 million             0.4% + 500
HTS                            Under 250 million            0.23%
                               From 250 to 500 million      0.19% + 1000
                               From 500 to 1000 million     0.17% + 500
                               From 1000 to 3000 million    0.15%
                               3000 million                 0.09%
Table 7
Comparison of the two commission policies in ‘A’ stock corporation (unit: won)
Class                          Profit by the existing    Profit by the proposed
                               commission policy         commission policy
Representative assisted mode   1,473,640,285             1,428,058,209
HTS                            1,349,532,283             1,356,896,165
Total commission               2,823,172,568             2,784,954,374