
Segmentation of stock trading customers according to potential value

H.W. Shin a,*, S.Y. Sohn b

a Samsung Economy Research Institute, Kukje Center Building, 191, Hangangro 2-Ga, Seoul, South Korea

b Department of Computer Science and Industrial Systems Engineering, Yonsei University, Seoul, South Korea

Abstract

In this article, we use three clustering methods (K-means, self-organizing map, and fuzzy K-means) to find properly graded stock market brokerage commission rates based on the 3-month total trades in two transaction modes (representative assisted and online trading system). Stock traders in each mode are segmented by their total trade amount and by their trade amount in that mode. Results of our empirical analysis indicate that fuzzy K-means cluster analysis is the most robust approach for segmenting customers of both transaction modes. We then propose a decision tree based rule to classify customers into three groups and suggest brokerage commission rates of 0.4, 0.45, and 0.5% (VIP, Best, and Normal customers, respectively) for the representative assisted mode and 0.06, 0.1, and 0.18% for the online trading system.

© 2003 Elsevier Ltd. All rights reserved.

Keywords: Customer relationship management; Customer segmentation; K-means clustering; Self-organizing map; Fuzzy K-means

1. Introduction

The Korean stock market grew rapidly in the 1990s. In spite of the financial crisis that occurred in Korea in 1997, there were more than 30 domestic security corporations, and the daily average stock transaction reached 4800 billion won in 2000, compared to 4100 billion won a year earlier. This indicates that the commission based on transactions increased considerably as well. This commission is one of the main sources of profit for security corporations, and each security corporation sets its own commission rate to increase profit. The rate is typically based on the amount of each individual trade. However, this kind of system does not consider the potential customer value over time. Those who have traded more, cumulatively and continuously over a longer time period, need to be treated in a better manner (Hartfeil, 1996). In commercial banking, Zeithaml, Rust, and Lemon (2001) reported that the top 20% of customers produced 82% of the bank’s retail profit. Hunt (1999) showed that the fee system of an insurance corporation should not be uniform but should differ according to each customer’s potential value. These findings support the value of better treatment of loyal customers.

In this article, we propose a robust clustering algorithm to classify stock traders into several groups in terms of their 3-month transactions, in order to suggest a graded commission policy for each group.

The variables used as clustering criteria are the transactions made through representative assisted trading and through the online Home Trading System (HTS). The clustering methods used are K-means clustering, self-organizing map (SOM), and fuzzy K-means. The cut-off value of each customer group is set based on a classification and regression tree (CART).

The rest of this article is organized as follows. In Section 2 we describe the three clustering methods along with the performance measure used for comparison. In Section 3 we apply the algorithms to field data and obtain three groups of customers. In Section 4 we present a new brokerage commission rate and compare it to the existing commission rate in terms of profit. Finally, in Section 5, we discuss the implications of our results and suggest areas for further study.

2. Three clustering algorithms

Cluster analysis groups objects (observations) on the basis of their variables. We use three kinds of clustering methods for customer segmentation: K-means, SOM, and fuzzy K-means. For a brief description of each method, assume that we are interested in clustering $N$ samples with respect to $P$ variables into $K$ clusters. For sample $i$, $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip}, \ldots, x_{iP})$ represents a vector of $P$ characteristic variables. Typically $K$ is unknown, but for stock customer segmentation we use $K = 3$.

Expert Systems with Applications 27 (2004) 27–33

doi:10.1016/j.eswa.2003.12.002

0957-4174/$ - see front matter © 2003 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +82-2-3780-8022; fax: +8-22-3780-8152.

www.elsevier.com/locate/eswa

2.1. K-means clustering algorithm

The K-means method is widely used because it can process large data sets quickly. K-means clustering proceeds as follows. First, $K$ observations are randomly selected from the $N$ observations; they become the centers of the initial clusters. Second, each of the remaining $N - K$ observations is assigned to the nearest cluster in terms of the Euclidean distance with respect to $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip}, \ldots, x_{iP})$. After each observation is assigned to its nearest cluster, the center of that cluster is recomputed. Last, after all observations have been allocated, the Euclidean distance between each observation and each cluster center is calculated to confirm whether the observation is allocated to its nearest cluster.
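The three steps above can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used in the study: the function name `kmeans`, the `seed` and `max_passes` parameters, and the choice to repeat the final check until no assignment changes are all assumptions made for the example.

```python
import math
import random


def kmeans(samples, k, seed=0, max_passes=100):
    """Sequential K-means sketch following the three steps in the text."""
    rng = random.Random(seed)
    # Step 1: K randomly selected observations become the initial centers.
    seed_idx = rng.sample(range(len(samples)), k)
    centers = [list(samples[i]) for i in seed_idx]
    counts = [1] * k
    labels = [None] * len(samples)
    for j, i in enumerate(seed_idx):
        labels[i] = j

    def nearest(x):
        # Index of the center with the smallest Euclidean distance to x.
        return min(range(k), key=lambda j: math.dist(x, centers[j]))

    # Step 2: assign each remaining observation, then recompute that center
    # as the running mean of its members.
    for i, x in enumerate(samples):
        if labels[i] is None:
            j = nearest(x)
            labels[i] = j
            counts[j] += 1
            centers[j] = [c + (v - c) / counts[j] for c, v in zip(centers[j], x)]

    # Step 3: confirm every observation sits in its nearest cluster.
    # Repeating until nothing changes is an assumption of this sketch.
    for _ in range(max_passes):
        new_labels = [nearest(x) for x in samples]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [x for x, lab in zip(samples, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

Calling `kmeans(points, 3)` on a list of (total trade amount, mode trade amount) pairs returns one cluster index per customer together with the three cluster centers.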

2.2. Self-organizing map

The SOM is an unsupervised neural network model devised by Kohonen (1982). As with other neural networks, the analysis is based on a large number of simple operations that can be performed in parallel. The SOM network typically has two layers of nodes: an input layer and an output layer. The neurons in the output layer are arranged in a grid and are influenced by their neighbors in this grid. The goal is to automatically cluster the input samples in such a way that similar samples are represented by the same output neuron (Kim & Han, 2001; Mangiameli, Chen, & West, 1996).

Since each of the characteristic variables is linked to every output neuron by a weighted connection, each output neuron $j$ $(j = 1, \ldots, K)$ has a weight vector $w_j$ with as many weights as there are input variables. Starting from randomly initialized weights, the network learns to adapt its weights to the input samples as follows.

When an input sample $x_i$ is presented to the SOM network, the neurons compute the distance between their weight vectors $w_j = (w_{j1}, w_{j2}, \ldots, w_{jp}, \ldots, w_{jP})$ and the input $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip}, \ldots, x_{iP})$. The neuron with the minimum distance, called the winner, is then determined based on

$$ \min_{j} D_j, \qquad D_j = \sqrt{\sum_{p=1}^{P} (x_{ip} - w_{jp})^2} \tag{1} $$

where $w_{jp}$ is the weight of the $j$th neuron linked to the $p$th variable.

The weights of the winner, as well as those of the neurons in its neighborhood, are then updated using the following equation:

$$ w_j^{\text{new}} = w_j^{\text{old}} + \alpha \left( x_i - w_j^{\text{old}} \right) \tag{2} $$

where $w_j^{\text{new}}$ is the new weight vector and $w_j^{\text{old}}$ is the old weight vector of the $j$th neuron, and $\alpha$ is the learning rate $(0 < \alpha < 1)$. This procedure ends when the difference in the error (e.g. the average of the Euclidean distances between each input sample and its best matching weight vector) between the current and the previous iteration is smaller than a given value $\varepsilon$. After the stopping criterion is satisfied, each neuron in the network represents a cluster.
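The winner search, weight update, and stopping rule described above can be sketched as follows. The sketch makes several assumptions that the text does not specify: the K output neurons are arranged on a line rather than a two-dimensional grid, the neighborhood is every neuron within `radius` grid positions of the winner, and random initial weights are drawn inside the range of each input variable unless `init` is supplied. The function name `train_som` and its parameters are illustrative.

```python
import math
import random


def train_som(samples, k, alpha=0.1, radius=0, eps=1e-9, max_epochs=1000,
              init=None, seed=0):
    """One-dimensional SOM sketch with K output neurons arranged on a line."""
    rng = random.Random(seed)
    dims = len(samples[0])
    if init is None:
        # Random start inside the observed range of each input variable.
        lo = [min(x[p] for x in samples) for p in range(dims)]
        hi = [max(x[p] for x in samples) for p in range(dims)]
        weights = [[rng.uniform(lo[p], hi[p]) for p in range(dims)]
                   for _ in range(k)]
    else:
        weights = [list(w) for w in init]

    def winner(x):
        # Eq. (1): the neuron whose weight vector is closest to the input.
        return min(range(k), key=lambda j: math.dist(x, weights[j]))

    def mean_error():
        # Average distance between each sample and its best matching weights.
        return sum(math.dist(x, weights[winner(x)]) for x in samples) / len(samples)

    previous = mean_error()
    for _ in range(max_epochs):
        for x in samples:
            best = winner(x)
            # Eq. (2), applied to the winner and its neighbors on the line.
            for j in range(max(0, best - radius), min(k, best + radius + 1)):
                weights[j] = [w + alpha * (v - w) for w, v in zip(weights[j], x)]
        current = mean_error()
        if abs(previous - current) < eps:  # stopping rule with threshold eps
            break
        previous = current
    return weights, [winner(x) for x in samples]
```

With `radius=0` only the winner is updated, which reduces the procedure to a competitive-learning variant of K-means; a positive radius lets neighboring neurons follow the winner.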

2.3. Fuzzy K-means clustering analysis

Fuzzy set theory was introduced in the 1960s as a way of explaining uncertainty in data structure (Zadeh, 1965). Fuzzy K-means (also known as fuzzy c-means) clustering was investigated by Bezdek (1981) and compared to non-fuzzy clustering methods. Hruschka (1986) and Weber (1996) showed in their empirical studies that fuzzy clustering provided more insight than non-fuzzy clustering in terms of market segment information.

Fuzzy clustering segments the samples into $K$ clusters $(1 < K < N)$, estimating the cluster membership of each sample and the cluster centers simultaneously. The membership of $x_i$ in cluster $s$, $u_{si}$, is between 0 and 1 and is defined as follows (Ozer, 2001):

$$ u_{si} = \frac{1}{\displaystyle\sum_{j=1}^{K} \left( \frac{\|x_i - v_s\|^{2/(m-1)}}{\|x_i - v_j\|^{2/(m-1)}} \right)}, \quad \text{for } x_i \neq v_j,\ \forall s, i, \text{ and } m > 1 \tag{3} $$

where $m$ is the smoothing parameter that controls the fuzziness of the clusters, and $v_s = (v_{s1}, v_{s2}, \ldots, v_{sp}, \ldots, v_{sP})$ is the center vector of cluster $s$, defined as

$$ v_s = \frac{\sum_{i=1}^{N} (u_{si})^m x_i}{\sum_{i=1}^{N} (u_{si})^m}, \quad \forall s. \tag{4} $$

The optimal values of $u$ are obtained by minimizing the following objective function:

$$ \min \sum_{i=1}^{N} \sum_{s=1}^{K} (u_{si})^m \|x_i - v_s\|^2 \tag{5} $$

The constraints used are as follows:

$$ 0 \le u_{si} \le 1, \quad \forall s, i \tag{6} $$

$$ \sum_{s=1}^{K} u_{si} = 1, \quad \forall i. \tag{7} $$

Condition (6) ensures that the degrees of membership are between 0 and 1, and condition (7) means that, for a given sample, the degrees of membership across the clusters sum to one. Once the optimal values of $u$ are found, each sample is assigned to the cluster for which its membership $u$ is highest.
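A common way to minimize the objective in Eq. (5) is to alternate between the membership formula of Eq. (3) and the center formula of Eq. (4) until the centers stop moving. The sketch below follows that scheme under stated assumptions: the caller supplies the initial centers (the text does not say how they are chosen), a sample that coincides with a center receives full membership in that cluster (Eq. (3) excludes this case), and the names `fuzzy_kmeans`, `tol`, and `max_iter` are illustrative.

```python
import math


def fuzzy_kmeans(samples, centers, m=1.2, tol=1e-9, max_iter=200):
    """Fuzzy K-means sketch: alternate Eq. (3) memberships and Eq. (4) centers."""
    centers = [list(v) for v in centers]
    k, dims = len(centers), len(samples[0])
    power = 2.0 / (m - 1.0)

    def memberships():
        table = []
        for x in samples:
            dist = [math.dist(x, v) for v in centers]
            if 0.0 in dist:
                # Eq. (3) is undefined when x_i equals a center; assume
                # full membership in that cluster.
                hit = dist.index(0.0)
                table.append([1.0 if s == hit else 0.0 for s in range(k)])
            else:
                table.append([
                    1.0 / sum((dist[s] / dist[j]) ** power for j in range(k))
                    for s in range(k)
                ])
        return table  # table[i][s] is u_si

    for _ in range(max_iter):
        u = memberships()
        new_centers = []
        for s in range(k):
            weights = [u[i][s] ** m for i in range(len(samples))]
            total = sum(weights)
            # Eq. (4): membership-weighted mean of the samples.
            new_centers.append([
                sum(w * x[p] for w, x in zip(weights, samples)) / total
                for p in range(dims)
            ])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break

    u = memberships()
    # Each sample goes to the cluster with its highest membership.
    labels = [row.index(max(row)) for row in u]
    return u, centers, labels
```

With the smoothing parameter close to 1 (for example `m=1.2`), the exponent 2/(m-1) is large, so memberships are nearly crisp and the result is close to ordinary K-means.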

2.4. Performance comparison of the three clustering methods

We compare the performance of these clustering methods using the intraclass inertia presented in Michaud (1997). Intraclass inertia is a measure of how compact each cluster (class) is when the number of clusters is fixed. Usually the variables are scaled to the same range (Nair & Narendran, 1997). The mean of the $j$th cluster $C_j$, which has $n_j$ samples, is defined as

$$ \bar{x}_j = (\bar{x}_{j1}, \bar{x}_{j2}, \ldots, \bar{x}_{jp}, \ldots, \bar{x}_{jP}), \quad \text{where } \bar{x}_{jp} = \frac{1}{n_j} \sum_{i \in C_j} x_{ip} \tag{8} $$

The intraclass inertia $I_j$ of cluster $j$ is defined as

$$ I_j = \frac{1}{n_j} \sum_{i \in C_j} \sum_{p=1}^{P} (x_{ip} - \bar{x}_{jp})^2 \tag{9} $$

Finally, the intraclass inertia $F(K)$ for a given $K$ clusters is defined as

$$ F(K) = \frac{1}{N} \sum_{j=1}^{K} n_j I_j = \frac{1}{N} \sum_{j=1}^{K} \sum_{i \in C_j} \sum_{p=1}^{P} (x_{ip} - \bar{x}_{jp})^2 \tag{10} $$

Thus $F(K)$ is the average squared Euclidean distance between each observation and its cluster mean.
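As a worked illustration of this measure, the following sketch computes F(K) directly as the average squared Euclidean distance between each observation and the mean of its cluster. The function name `intraclass_inertia` and the toy data in the usage note are assumptions made for the example.

```python
def intraclass_inertia(samples, labels):
    """F(K): mean squared distance of each sample to its cluster mean."""
    clusters = {}
    for x, label in zip(samples, labels):
        clusters.setdefault(label, []).append(x)

    total = 0.0
    for members in clusters.values():
        # Cluster mean, one coordinate per variable (Eq. (8)).
        mean = [sum(col) / len(members) for col in zip(*members)]
        # Sum of squared deviations over samples and variables.
        total += sum((v - c) ** 2 for x in members for v, c in zip(x, mean))
    return total / len(samples)
```

For the four points (0, 0), (2, 0), (10, 0), (14, 0) split into two clusters of two, the squared deviations are 1, 1, 4, and 4, so F(K) = 10 / 4 = 2.5; placing all four points in one cluster gives 32.75, which shows why a lower value means more compact clusters.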

3. A case study

We randomly select 3000 customers of stock corporation ‘A’ who had transaction records from the middle of July to the middle of October 1999 and apply the three clustering methods.

The stock transaction modes used are representative assisted trading and online HTS. HTS customers buy and sell their stocks directly, without the advice of the corporation’s representatives. Descriptive statistics of the sample data are as follows.

About 78% of the total trade amount was made through online HTS.

In terms of gender, 68% of the customers are male. However, female customers accounted for 51 and 52% of the trade amount in the representative assisted and online HTS modes, respectively. This suggests the importance of a marketing strategy for HTS and for female customers. In terms of age, those older than 60 mostly used the representative assisted mode. Their trade amount is also the highest among all age groups in both transaction modes. The average transaction frequency is 1.8 times per month in the representative assisted mode and six times per month in online HTS.

We also estimate the correlations among the trade amount made in each transaction mode and their sum. The correlation between the two modes is relatively low (0.38), while the correlations between each single mode and the total transactions are 0.76 and 0.89 for representative assisted and online HTS, respectively.
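The correlations reported above are presumably Pearson coefficients; the text does not name the estimator, so that is an assumption. A minimal sketch of the computation, with the hypothetical function name `pearson`:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Given per-customer lists `rep` and `hts` of 3-month trade amounts, the three coefficients would be `pearson(rep, hts)`, `pearson(rep, total)`, and `pearson(hts, total)`, where `total = [r + h for r, h in zip(rep, hts)]`.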

3.1. Cluster analysis of customers

Clustering methods are used to segment the customers of each mode separately, using two variables per mode. The variables used for cluster analysis of the representative assisted mode are ‘total trade amount’ and ‘representative assisted trade amount’ over the 3-month period. For the HTS mode, we use total trade amount and ‘trade amount in HTS’ over the 3-month period.

Customers are segmented into three clusters (Normal, Best, and VIP customers). After some experimentation with the parameters of the clustering methods, we set the SOM learning rate ($\alpha$) to 0.1 and the fuzzy K-means smoothing parameter ($m$) to 1.2.

For comparison, the resulting compactness of the clusters produced by the three clustering methods (K-means, SOM, fuzzy K-means) is summarized in Table 1.

For customer segmentation in the representative assisted mode, the K-means clustering method turns out to be the best, while for the segmentation of customers in HTS, fuzzy K-means is the winner. Table 2 and Fig. 1 present the segmentation of customers in the representative assisted mode using K-means clustering, while Table 3 and Fig. 2 present the segmentation of HTS customers using fuzzy K-means.

Table 1
Intraclass inertia of each clustering method

Clustering method   Mode                           Intraclass inertia
K-means             Representative assisted mode   7.2685 × 10^16 *
K-means             HTS                            1.09 × 10^17
SOM                 Representative assisted mode   1.04394 × 10^17
SOM                 HTS                            1.12 × 10^18
Fuzzy K-means       Representative assisted mode   7.29 × 10^16
Fuzzy K-means       HTS                            9.55 × 10^16 *

* Lowest intraclass inertia for the mode.

Table 2
The segmentation of customers in representative assisted mode using K-means

Class    Number of customers   Cluster center: total trade amount for 3 months (units: won)   Cluster center: trade amount in representative assisted mode for 3 months (units: won)
Normal   2969                  31.0 million                                                   6.4 million
Best     30                    84.0 billion                                                   25.6 billion
VIP      1                     558.0 billion                                                  297.3 billion

The results indicate that the numbers of Best and VIP customers are smaller in the representative assisted mode than in HTS.

As shown in Figs. 1 and 2, one VIP customer has a very large total trade amount (558 billion won for 3 months). This customer may be considered an outlier. Therefore, we repeat the comparison of clustering results without this particular customer. The results are given in Table 4.

In this case, fuzzy K-means has the best performance in the representative assisted mode. SOM is the most suitable in HTS, but fuzzy K-means performs nearly as well (5.48 × 10^16 versus 5.43 × 10^16). Overall, we conclude that fuzzy K-means provides relatively robust results in terms of intraclass inertia for both modes.

3.2. Classification of three groups of customers

In practice, we need threshold values to classify the three different groups. We use a decision tree to find the threshold values for customer segmentation in both transaction modes. The outcome class (Normal, Best, VIP) is the one assigned by fuzzy K-means after deleting the outlier. Seventy percent of the data on the remaining 2999 customers are assigned for training and 30% for validation, using a segment-based stratified sampling approach. We then use the CART algorithm to find the threshold values for the three groups.
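The two ingredients of this step, a stratified 70/30 split and a CART-style search for a cut-off value, can be sketched as follows. This is only an illustration under stated assumptions: the split criterion is Gini impurity (the usual CART default for classification, not confirmed by the text), the search covers a single variable and a single split rather than a full tree, and the names `stratified_split`, `gini`, and `best_threshold` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_share=0.7, seed=0):
    """Split sample indices so each class keeps roughly the same share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, valid = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(train_share * len(indices))
        train.extend(indices[:cut])
        valid.extend(indices[cut:])
    return sorted(train), sorted(valid)


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_threshold(values, labels):
    """Best single split 'value < threshold' by weighted Gini impurity."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for (left_v, _), (right_v, _) in zip(pairs, pairs[1:]):
        if left_v == right_v:
            continue
        # Candidate cut-off halfway between two neighboring values.
        threshold = (left_v + right_v) / 2
        left = [lab for v, lab in pairs if v < threshold]
        right = [lab for v, lab in pairs if v >= threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best
```

A full CART tree applies `best_threshold` recursively to every variable within each resulting branch; the validation indices are then used to check the rules on data not seen during training.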

The trees in Figs. 3 and 4 show the threshold values for customer segmentation. From Fig. 3, customers whose total trade amount in both modes for 3 months is less than about 19.3 billion won are defined as Normal customers. Customers whose total trade amount in both modes for 3 months is more than 19.3 billion won and whose trade amount in the representative assisted mode for 3 months is less than 125 billion won are defined as Best customers. The other customers are VIP customers.

From Fig. 4, customers whose trade amount in HTS for 3 months is less than about 13.6 billion won and whose total trade amount in both modes is less than 23.3 billion won are defined as Normal customers. Customers whose trade amount in HTS for 3 months is more than 13.6 billion won and whose total trade amount in both modes is more than 75.9 billion won are defined as VIP customers. The rest are considered Best customers.
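The rules read from Figs. 3 and 4 translate directly into two small functions. The thresholds are the approximate values quoted in the text; the function names and the treatment of amounts that fall exactly on a threshold are assumptions of this sketch.

```python
BILLION = 1_000_000_000


def classify_representative(total_amount, rep_amount):
    """Fig. 3 rule for the representative assisted mode (amounts in won)."""
    if total_amount < 19.3 * BILLION:
        return "Normal"
    if rep_amount < 125 * BILLION:
        return "Best"
    return "VIP"


def classify_hts(total_amount, hts_amount):
    """Fig. 4 rule for the HTS mode (amounts in won)."""
    if hts_amount < 13.6 * BILLION and total_amount < 23.3 * BILLION:
        return "Normal"
    if hts_amount > 13.6 * BILLION and total_amount > 75.9 * BILLION:
        return "VIP"
    return "Best"
```

Applied to the cluster centers in Tables 2 and 3, these rules return the class of the corresponding cluster; for example, `classify_hts(41.8 * BILLION, 30.9 * BILLION)` returns `"Best"`.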

Fig. 1. Transaction distribution in a representative assisted mode for 3 months.

Table 3
The segmentation of customers of HTS using fuzzy K-means

Class    Number of customers   Cluster center: total trade amount for 3 months (units: won)   Cluster center: trade amount in HTS mode for 3 months (units: won)
Normal   2915                  26.1 million                                                   19.9 million
Best     80                    41.8 billion                                                   30.9 billion
VIP      5                     261.8 billion                                                  199.1 billion


4. New brokerage commission policy

In this section, we suggest a graded brokerage commission policy based on the three clusters of customers. The new policy must be attractive enough to prevent churn among existing customers while still yielding sufficient profit for the security corporation. As shown in Table 5, we propose commission rates for Normal, Best, and VIP customers of 0.5, 0.45, and 0.4% in the representative assisted mode and 0.18, 0.1, and 0.06% in HTS, respectively. This policy is then compared to the existing commission system of stock corporation ‘A’ (see Table 6).
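As a small worked example, the proposed rates of Table 5 can be applied to a trade amount as below. The dictionary keys, the function name `proposed_commission`, and the assumption that the commission is simply the rate times the trade amount (with no fixed fee) are illustrative choices, not details given in the text.

```python
# Proposed rates from Table 5, in percent of the trade amount.
PROPOSED_RATE_PERCENT = {
    "representative": {"Normal": 0.5, "Best": 0.45, "VIP": 0.4},
    "hts": {"Normal": 0.18, "Best": 0.1, "VIP": 0.06},
}


def proposed_commission(trade_amount, mode, customer_class):
    """Commission in won for one trade under the proposed policy."""
    rate = PROPOSED_RATE_PERCENT[mode][customer_class]
    return trade_amount * rate / 100
```

Under these assumptions a 1,000,000 won trade costs a Normal customer about 5000 won in the representative assisted mode and a VIP customer about 600 won in HTS.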

Next, we compare the profit under the existing commission policy with the profit under the proposed commission policy in Table 7. The proposed commission policy is based on the threshold values obtained by the decision tree using the fuzzy K-means results.

As shown in Table 7, the new policy would provide an expected profit similar to that of the existing policy: 2,784,954,374 won versus 2,823,172,568 won in total, about 1.4% lower. However, the proposed commission policy has additional positive effects on customer relationship management (CRM) because it recognizes the value of different levels of customers. Therefore, we can conclude that, in the long run, the new policy would bring higher profit than the existing commission policy.

5. Conclusion

In this article, we found fuzzy K-means clustering to be the most stable method for grouping stock trading customers and used it to classify customers into three tiers (Normal, Best, and VIP) based on the total trade amount over a 3-month period. Each group is assigned a different brokerage commission rate: 0.5, 0.45, and 0.4%, respectively, for the representative assisted mode and 0.18, 0.1, and 0.06% for HTS.

This approach differs from the existing graded commission policy in that the commission grade is based on the historically accumulated transaction amount of each customer. This new approach is expected to bring more profit by treating loyal customers in a better manner and thereby retaining them over a longer term.

The data used in this article for clustering contain a relatively short history of customers’ transactions. After the data warehousing project is completed and a larger amount of information has accumulated, the clustering may need to be redone for tuning.

Our new policy depends mainly on the cumulative transaction amount. Other factors, such as transaction frequency, may need to be included in the policy.

Fig. 2. Transaction distribution in HTS for 3 months.

Table 4
Intraclass inertia by clustering method (without a particular customer)

Clustering method   Mode                           Intraclass inertia
K-means             Representative assisted mode   6.19 × 10^16
K-means             HTS                            6.82 × 10^16
SOM                 Representative assisted mode   7.45 × 10^16
SOM                 HTS                            5.43 × 10^16 *
Fuzzy K-means       Representative assisted mode   4.12 × 10^16 *
Fuzzy K-means       HTS                            5.48 × 10^16

* Lowest intraclass inertia for the mode.


Fig. 3. Classifying the customers of the representative assisted mode (unit: won; the number in parentheses is the count per class).

Fig. 4. Classifying the customers of the HTS mode (unit: won; the number in parentheses is the count per class).


More variations of the approach based on longer time-series data sets are left for further study.

Acknowledgement

This work was supported by grant No. R04-2002-000-

20003-0 from Korea Science & Engineering Foundation.

References

Bezdek, J. C. (1981). Pattern recognition with fuzzy objective function algorithms. New York: Plenum Press.

Hartfeil, G. (1996). Bank One measures profitability of customers, not just products. Journal of Retail Banking Services, 18(2), 23–29.

Hruschka, H. (1986). Market definition and segmentation using fuzzy clustering methods. International Journal of Research in Marketing, 3, 117–134.

Hunt, P. (1999). The pricing is right. Canadian Insurance Statistics, 26–28.

Kim, K. S., & Han, I. (2001). The cluster-indexing method for case-based reasoning using self-organizing maps and learning vector quantization for bond rating cases. Expert Systems with Applications, 21(3), 147–156.

Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43(1), 59–69.

Mangiameli, P., Chen, S. K., & West, D. A. (1996). Comparison of SOM neural network and hierarchical clustering methods. European Journal of Operational Research, 93(2), 402–417.

Michaud, P. (1997). Clustering techniques. Future Generation Computer Systems, 13(2), 135–147.

Nair, G. J., & Narendran, T. T. (1997). Cluster goodness: a new measure of performance for cluster formation in the design of cellular manufacturing systems. International Journal of Production Economics, 48(1), 49–61.

Ozer, M. (2001). User segmentation of online music services using fuzzy clustering. Omega, 29(2), 193–206.

Weber, R. (1996). Customer segmentation for banks and insurance groups with fuzzy clustering techniques. In J. F. Baldwin (Ed.), Fuzzy logic. New York: Wiley.

Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8, 338–353.

Zeithaml, V. A., Rust, R. T., & Lemon, K. N. (2001). The customer pyramid: creating and serving profitable customers. California Management Review, 43(4), 118–142.

Table 5
Newly proposed commission rate

Class    Brokerage commission in representative assisted mode (%)   Brokerage commission in HTS (%)
Normal   0.5                                                         0.18
Best     0.45                                                        0.1
VIP      0.4                                                         0.06

Table 6
Currently used commission rates of ‘A’ stock corporation

Mode                           Amount of transaction       Brokerage commission
Representative assisted mode   Under 200 million           0.5%
Representative assisted mode   From 200 to 500 million     0.45% + 1000
Representative assisted mode   Over 500 million            0.4% + 500
HTS                            Under 250 million           0.23%
HTS                            From 250 to 500 million     0.19% + 1000
HTS                            From 500 to 1000 million    0.17% + 500
HTS                            From 1000 to 3000 million   0.15%
HTS                            Over 3000 million           0.09%

Table 7
Comparison of the two commission policies in ‘A’ stock corporation (unit: won)

Class                          Profit by the existing commission policy   Profit by the proposed commission policy
Representative assisted mode   1,473,640,285                              1,428,058,209
HTS                            1,349,532,283                              1,356,896,165
Total commission               2,823,172,568                              2,784,954,374

