7/29/2019 Sales Using ANN n Fuzzy
1/15
Computers in Industry 37 (1998) 1–15
An intelligent sales forecasting system through integration of artificial neural network and fuzzy neural network
R.J. Kuo a,*, K.C. Xue b
a Department of Marketing and Distribution Management, National Institute of Technology at Kaohsiung, Kaohsiung County, 824, Taiwan
b Graduate School of Management Science, I-Shou University, Kaohsiung County, 80048, Taiwan
Received 3 May 1997; accepted 16 February 1998
Abstract
This paper attempts to develop an intelligent sales forecasting system which can consider the quantitative factors as well as the non-quantitative factors. The proposed forecasting system consists of four parts: (1) data collection, (2) general pattern model, (3) special pattern model, and (4) decision integration. In the general pattern model, a feedforward neural network with error-backpropagation (EBP) learning algorithm is employed to learn the time series data, or quantitative factors. However, unique circumstances, e.g., promotion, may cause a sudden change in the sales pattern. To this end, this paper utilizes fuzzy logic which is capable of learning (fuzzy neural network, FNN) to learn the experts' knowledge regarding the effect of promotion on the sales. Finally, the outputs from the above two models are integrated with the time effect through a feedforward neural network with EBP learning algorithm. Evaluation of the model results indicates that the proposed system performs more accurately than the conventional statistical method and a single artificial neural network (ANN).
© 1998 Elsevier Science B.V. All rights reserved.
Keywords: Sales forecasting; Artificial neural networks; Fuzzy neural networks
1. Introduction
To enhance the commercial competitive advantage in a constantly fluctuating environment, an organization's management must make the right decision in time depending on the information at hand. The decision lead time ranges from several years to several hours based on the type of business. Thus, making an accurate decision plays a prominent role.
* Corresponding author. Tel.: +886-7-6011000 ext. 4213; e-mail: [email protected]
Intuitively, forecasting models can provide reasonable estimates by using historical data. Therefore, if the marketing department can estimate the sales quantity for the next period, the materials department can then effectively control the inventory to achieve just-in-time (JIT). In addition, the production department can schedule production and arrange facility utilization, which may decrease the production cost. Therefore, obtaining an accurate forecast appears to be critical. Statistical methods, such as regression models and ARMA, have long been candidates for decision makers. However, these methods are only efficient for data which are seasonal or cyclical. If the data are influenced by
0166-3615/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0166-3615(98)00066-9
special cases, like promotion, they are inaccurate. Artificial neural networks (ANNs), which are better than the conventional statistical methods [1,2], have been recently employed, but even then the problem still arises. Thus, this paper first attempts to develop a forecasting system capable of handling such a circumstance, i.e., promotion.
Fuzzy logic has been applied in the area of control and has shown highly promising results. Fuzzy logic attempts to capture vague experts' knowledge. However, the setup of experts' knowledge resulting from the membership function setup of fuzzy sets is quite subjective. Therefore, this paper also aims to develop a learning algorithm for fuzzy logic, called fuzzy neural network (FNN).
Thus, this paper develops a sales forecasting system which consists of four parts: (1) data collection, (2) general pattern model (ANN), (3) special pattern model (FNN), and (4) decision integration (ANN). To evaluate the proposed system, the actual data provided by a chain supermarket company are used. According to these results, the proposed system performs more accurately than the conventional statistical method and a single ANN, particularly when a promotion is conducted.
The rest of this paper is organized as follows. Section 2 provides some necessary background information while the proposed system is discussed in Section 3. Section 4 summarizes the evaluation results. Discussion and concluding remarks are finally made in Sections 5 and 6, respectively.
2. Background
This section briefly reviews the sales forecasting
methods and the applications of artificial neural net-
works in this area. In addition, fuzzy modeling and
fuzzy neural networks are also discussed.
2.1. Sales forecasting
Sales forecasting always plays a prominent role in a decision support system. Obtaining effective sales forecasts in advance can help the decision maker calculate production and materials costs, and even determine the sale price [3]. This will result in a lower inventory level and achieve the objective of just-in-time. Regarding conventional sales forecasting methods [4–6], most of them used either factors or time series data to determine the forecast. However, the relationship between the factors or the past time series data (independent variables) and the sales data (dependent variable) is always quite complicated. Obtaining results through the above mentioned approaches is quite difficult. Therefore, various decision makers prefer using their own intuition instead of model-based approaches (i.e., time series or regression models). However, there is a model-free approach, ANN, that can be applied in the area of forecasting owing to its adequate performance in control and pattern recognition. Thus, ANN is reviewed in Section 2.2.
2.2. Artificial neural networks in sales forecasting
An artificial neural network (ANN) is a system derived through models of neurophysiology. In general, it consists of a collection of simple nonlinear computing elements whose inputs and outputs are tied together to form a network. The learning algorithms of ANNs can generally be divided into three different types: supervised, unsupervised, and hybrid learning.
Many studies have attempted to apply ANN to time-series forecasting. However, their conclusions are often contradictory. Some studies conclude that ANNs are better than conventional methods [7], while others reach an opposite conclusion [8]. Weigend et al. [7] introduced the weight-elimination backpropagation learning procedure to effectively deal with the overfitting problem, and applied it to sunspots and an exchange rate time series. Tang et al. [8] compared the ANN and Box-Jenkins models, using international airline passenger traffic, domestic car sales and foreign car sales in the USA. They concluded that the Box-Jenkins models outperformed the ANN models in short-term forecasting. On the other hand, the ANN models outperformed the Box-Jenkins models in long-term forecasting.
Chakraborty et al. [1] presented an ANN approach to multivariate time-series analysis. They accurately predicted the flour prices in three cities in the USA. According to their results, the ANN approach is a leading contender among statistical modeling approaches.
Lachtermacher and Fuller [2] developed a calibrated ANN model. The model used Box-Jenkins methods to identify the lag components of the data that should be used as input variables. Also, it employed a heuristic to suggest the number of hidden units needed in structuring the model. In examining the stationary series, they observed that the calibrated ANN models have only a slightly better overall performance than the conventional time-series methods used in the benchmark. In the case of non-stationary series, the calibrated ANN models outperformed the ARMA model for three of the four series, and performed almost as well as the ARMA model in the fourth series. The above survey indicates that ANN is more appropriate for time series data. However, only considering the time series data may result in a bad forecast. Including both the time series data and factors in the forecasting model seems to be preferable. Fuzzy logic has not only been successful in engineering applications, but is also capable of representing these factors, as introduced in Section 2.3.
2.3. Fuzzy modeling
The theory of fuzzy sets was first developed by Lotfi Zadeh [9], primarily in the context of his interest in the analysis of complex systems. However, some of the key ideas of the theory were envisioned by Max Black, a philosopher, almost 30 years prior to Zadeh's seminal paper [10]. Basically, the concept of the fuzzy set is a generalization of the classical or crisp set.
The crisp set is defined in such a way as to dichotomize the individuals in some given universe of discourse into two groups: members (those that certainly belong in the set) and nonmembers (those that certainly do not). A sharp, unambiguous distinction exists between the members and nonmembers of the class or category represented by the crisp set. However, many collections and categories do not have this kind of characteristic. Instead, their boundaries seem vague, and the transition from member to nonmember appears gradual rather than abrupt. Thus, the fuzzy set introduces vagueness by eliminating the sharp boundary dividing members of the class from nonmembers. The logic based on fuzzy set theory is called fuzzy logic. A simple example is used to explain the general concept of fuzzy modeling. For instance, two arbitrary rules are illustrated as follows.
Rule 1: IF X is Small ($A_1$) and Y is Small ($B_1$) THEN Z is $f_1$
Rule 2: IF X is Large ($A_2$) and Y is Large ($B_2$) THEN Z is $f_2$
The membership functions for X and Y are shown in Figs. 2 and 3. Both X and Y are divided into four linguistic variables, say very small (VS), small (S), large (L), and very large (VL). Each rule has a premise, or IF part, which contains several preconditions, and a consequent, or THEN part, which describes the value of one or more output actions. Now suppose there are two preconditions, x and y, for fuzzy variables X and Y, respectively. Then, for rule 1, the membership function values are represented as $\mu_1(x)$ and $\mu_1(y)$ for $A_1$ and $B_1$, respectively. Similarly, for rule 2, we have $\mu_2(x)$ and $\mu_2(y)$ as the membership function values. Hence, the firing strength of rule 1 is obtained as $w_1 = \mu_1(x)\,\mu_1(y)$, while $w_2 = \mu_2(x)\,\mu_2(y)$ is the firing strength of rule 2.
The overall output O is determined by using centroid defuzzification, where
\[ O = \frac{\sum_i w_i f_i}{\sum_i w_i} = \frac{x_1 \times y_2 \times f_1 + x_2 \times y_1 \times f_2}{x_1 \times y_2 + x_2 \times y_1} \tag{1} \]
and $f_i$ is the consequence, or control action, value of rule i [11].
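The two-rule example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the membership values and the consequents `f1` and `f2` are hypothetical numbers chosen for the sketch, and the firing strength is taken as the product of the two precondition memberships, as in the text above.

```python
def firing_strength(mu_x, mu_y):
    """Firing strength w_i of one rule: product of its precondition memberships."""
    return mu_x * mu_y


def defuzzify(strengths, consequents):
    """Weighted average O = sum(w_i * f_i) / sum(w_i)."""
    total = sum(strengths)
    if total == 0:
        raise ValueError("no rule fired, so the output is undefined")
    return sum(w * f for w, f in zip(strengths, consequents)) / total


# Hypothetical membership values for the preconditions x and y.
w1 = firing_strength(0.8, 0.5)   # rule 1: X is Small and Y is Small
w2 = firing_strength(0.2, 0.5)   # rule 2: X is Large and Y is Large

# Hypothetical consequents f1 and f2.
f1, f2 = 10.0, 30.0
output = defuzzify([w1, w2], [f1, f2])
```

Because rule 1 fires four times as strongly as rule 2 here, the output lies much closer to `f1` than to `f2`.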
During the past several years, fuzzy modeling has
been applied in many practical areas, ranging from
finance to engineering. In particular, fuzzy control
has emerged as one of the most promising areas for
research in the application of fuzzy modeling. In
many applications, such as control, fuzzy-based sys-
tems have proved to be superior in performance to
conventional systems.
2.4. Fuzzy neural network
An ANN, which is employed for recognition purposes, generally lacks the ability to be developed for a given task within a reasonable time. On the other hand, fuzzy modeling [9,12], which is applied to fuse
the decisions from the different variables, requires an approach to learn from experience (i.e., data collected in advance). Therefore, how to successfully combine these two approaches, ANNs and fuzzy modeling, has become a relevant concern of further studies.
Recently, the ANN learning algorithm has been applied to enhance the performance of a fuzzy system and has been demonstrated to be an innovative approach. Also, fuzzy if-then rules were generated and adjusted by learning methods using numerical data. Lin and Lee [13,14] proposed the so-called Neural-Network-Based Fuzzy Logic Control System (NN-FLCS). They introduced the low-level learning power of neural networks in the fuzzy logic system and provided high-level human-understandable meaning to the normal connectionist architecture. Also, Kuo [15] introduced a feedforward ANN into fuzzy inference represented by Takagi's fuzzy modeling.
The above mentioned FNNs are only appropriate for numerical data. However, the experts' knowledge is always of a fuzzy nature. Thus, some researchers have attempted to address this problem. Ishibuchi et al. [16,17] proposed learning methods of neural networks to utilize not only numerical data but also expert knowledge represented by fuzzy if-then rules. Lin and Lee [18] also presented an FNN capable of handling both fuzzy inputs and outputs. Meanwhile, Buckley and Hayashi [19] surveyed recent results on learning algorithms and applications for FNNs.
3. Methodology
Section 2 has emphasized the relevance of sales
forecasting as well as some necessary background
information. The proposed forecasting system is dis-
cussed in more detail in this section.
The proposed intelligent forecasting system consists of (1) data collection, (2) general pattern model (ANN), (3) special pattern model (FNN), and (4) decision integration (ANN). Fig. 1 shows the proposed system architecture. Basically, the system first determines the product to be forecasted and the factors affecting the sales. Thereafter, the general pattern of the sales is forecasted by an ANN while the FNN considers the effect on the sales if a promotion is conducted. Finally, the decisions from these two networks and the time effect are integrated through another ANN. Each part is thoroughly discussed in the following subsections.
3.1. Data collection
The first part of the proposed forecasting system, data collection, concerns two data resources: (1) quantitative factors and (2) non-quantitative factors. The quantitative factors can be accumulated from the supermarket company while the questionnaire survey can provide the non-quantitative data.
3.2. ANN architecture
In this part (i.e., ANN), a feedforward ANN with error backpropagation (EBP) learning algorithm is employed. Fig. 2 shows the general structure of the one-hidden-layer ANN. The ANN's input layer, whose neurons represent the previous sales data, say from period t-p to period t-1, is connected to the hidden layer. The hidden layer is in turn connected to the output layer, whose single neuron represents the sales for period t. Usually, the input neurons use $y = f(x) = x$ (no change in input) and all the other neurons have the sigmoidal function $y = f(x) = (1 + \exp(-x))^{-1}$. The objective is to minimize a cost function defined as:
\[ E = \frac{1}{2} \sum_p (T_p - O_p)^2, \tag{2} \]
where $T_p$ and $O_p$ are the desired and actual outputs, respectively, and p is the sample number. The derivation is based on the gradient descent technique to converge toward improved weights and biases (thresholds). The updating rule for all the weights and biases is
\[ \Delta W_{kh}(t) = -\eta \frac{\partial E}{\partial W_{kh}} + \alpha \, \Delta W_{kh}(t-1), \tag{3} \]
where $\eta$ is the learning rate and $\alpha$ is the momentum. Therefore, from the preceding setup and required data, the ANN can learn the relationship between the sales at period t and previous sales from periods t-1
Fig. 1. The architecture of the proposed forecasting system.
to t-p. Thus, to forecast the sales at period m, simply input the sales from periods m-1 to m-p.
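A minimal sketch of this general pattern model is given below, assuming NumPy is available. The series is synthetic and already scaled into [0, 1] (it is not the supermarket data), the function names are hypothetical, and the gradient is averaged over the samples for numerical stability, which is an assumption of this sketch rather than something stated in the paper.

```python
import numpy as np


def make_windows(series, p):
    """Pair each window of p past values (t-p .. t-1) with the value at t."""
    X = np.array([series[i:i + p] for i in range(len(series) - p)])
    y = np.array(series[p:], dtype=float).reshape(-1, 1)
    return X, y


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_ebp(X, y, n_hidden, eta=0.3, alpha=0.8, epochs=2000, seed=0):
    """One-hidden-layer network trained by batch gradient descent with momentum."""
    rng = np.random.default_rng(seed)
    params = [rng.normal(0.0, 0.5, (X.shape[1], n_hidden)), np.zeros(n_hidden),
              rng.normal(0.0, 0.5, (n_hidden, 1)), np.zeros(1)]
    prev_steps = [np.zeros_like(q) for q in params]
    losses = []
    for _ in range(epochs):
        W1, b1, W2, b2 = params
        hidden = sigmoid(X @ W1 + b1)
        out = sigmoid(hidden @ W2 + b2)
        losses.append(0.5 * float(np.sum((y - out) ** 2)))  # cost of Eq. (2)
        delta_out = (out - y) * out * (1.0 - out)
        delta_hid = (delta_out @ W2.T) * hidden * (1.0 - hidden)
        grads = [X.T @ delta_hid, delta_hid.sum(axis=0),
                 hidden.T @ delta_out, delta_out.sum(axis=0)]
        for q, g, step in zip(params, grads, prev_steps):
            # Update of Eq. (3); averaging the gradient is an assumption here.
            step[...] = -eta * g / len(X) + alpha * step
            q += step
    return params, losses


def predict(params, X):
    W1, b1, W2, b2 = params
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)


# Synthetic, pre-normalized series used only for illustration.
series = [0.5 + 0.4 * np.sin(2 * np.pi * t / 12) for t in range(120)]
X, y = make_windows(series, p=3)
params, losses = train_ebp(X, y, n_hidden=3)
forecast = predict(params, X[-1:])  # output for the most recent input window
```

The window length `p` and the number of hidden nodes play the role of the network structure alternatives compared later in the paper.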
3.3. FNN architecture
In the first part, the ANN can provide us with sales without promotion effects. However, if a promotion is running, then the ANN's forecast will be inaccurate. Thus, this section discusses how to effectively handle the circumstance of promotion by means of an FNN. Since the FNN architecture is based on fuzzy logic, which possesses both the precondition and the consequence, the precondition variables represent the effective factors while the sales represent the consequence variable. First, the data and if-then rules are obtained through the fuzzy Delphi method. After this procedure, the collected data can be applied to train the proposed FNN. The structure of the FNN presented in this study is similar to that of Ref. [17]. The main difference is that the network employs asymmetric bell-shaped instead of triangular fuzzy weights. In the following, the two components, FNN and fuzzy Delphi, are discussed in more detail.
Fig. 2. The ANN structure.
3.3.1. FNN with asymmetric bell-shaped fuzzy
weights
Most FNNs can only handle real-valued inputs and outputs, except those in Refs. [17,18]. Thus, this component intends to modify Ishibuchi's work [17]. In Ishibuchi's work, the input, weight, and output fuzzy numbers are symmetric triangular. However, this assumption is not similar to a human being's thinking. Thus, this paper replaces the triangular fuzzy numbers with asymmetric Gaussian functions. The input-output relation of the proposed FNN is discussed in the following. However, the operations of fuzzy numbers are presented first.
3.3.1.1. Operations of fuzzy numbers. Before describing the FNN architecture, fuzzy numbers and fuzzy number operations are defined by the extension principle. In the proposed algorithm, real numbers and fuzzy numbers are denoted by lowercase letters (e.g., a, b, ...) and by a bar placed over uppercase letters (e.g., $\bar{A}$, $\bar{B}$, ...), respectively.
Fig. 3. The FNN architecture.
Since input vectors, connection weights and out-
put vectors of multi-layer feedforward neural net-
works are fuzzified in the proposed FNN, the addi-
tion, multiplication and nonlinear mapping of fuzzy
numbers are necessary for defining the proposed
FNN. Thus, they are defined as follows:
Z z sX x qY y . . .
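For hidden layer,
\[ [O_{ph}]_\alpha = \left[ [O_{ph}]_\alpha^L, [O_{ph}]_\alpha^U \right] = \left[ f\left([Net_{ph}]_\alpha^L\right), f\left([Net_{ph}]_\alpha^U\right) \right], \quad h = 1, 2, \ldots, n_H, \tag{8} \]
\[ [Net_{ph}]_\alpha^L = \sum_{\substack{i=1 \\ [W_{hi}]_\alpha^L \ge 0}}^{n_I} [W_{hi}]_\alpha^L \cdot [O_{pi}]_\alpha^L + \sum_{\substack{i=1 \\ [W_{hi}]_\alpha^L < 0}}^{n_I} [W_{hi}]_\alpha^L \cdot [O_{pi}]_\alpha^U + [\Theta_h]_\alpha^L, \tag{9} \]
\[ [Net_{ph}]_\alpha^U = \sum_{\substack{i=1 \\ [W_{hi}]_\alpha^U \ge 0}}^{n_I} [W_{hi}]_\alpha^U \cdot [O_{pi}]_\alpha^U + \sum_{\substack{i=1 \\ [W_{hi}]_\alpha^U < 0}}^{n_I} [W_{hi}]_\alpha^U \cdot [O_{pi}]_\alpha^L + [\Theta_h]_\alpha^U, \tag{10} \]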
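The interval arithmetic behind the hidden-layer net input can be sketched as follows, assuming NumPy. The helper names and the numeric values are hypothetical, and the sketch assumes non-negative inputs with lower bound no greater than upper bound at the given α level.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def interval_net(W_lo, W_up, O_lo, O_up, theta_lo, theta_up):
    """Lower and upper net inputs of one layer at a single alpha level.

    W_lo, W_up: (n_out, n_in) bounds of the fuzzy weights' alpha-cuts.
    O_lo, O_up: (n_in,) bounds of the previous layer's outputs (assumed >= 0).
    A non-negative weight bound pairs with the same-side input bound;
    a negative weight bound pairs with the opposite-side input bound.
    """
    net_lo = np.where(W_lo >= 0, W_lo * O_lo, W_lo * O_up).sum(axis=1) + theta_lo
    net_up = np.where(W_up >= 0, W_up * O_up, W_up * O_lo).sum(axis=1) + theta_up
    return net_lo, net_up


def interval_output(net_lo, net_up):
    """The sigmoid is increasing, so it maps bounds to bounds."""
    return sigmoid(net_lo), sigmoid(net_up)


# Hypothetical alpha-cut bounds for one hidden node with two inputs.
W_lo = np.array([[0.5, -1.0]])
W_up = np.array([[1.0, -0.5]])
O_lo = np.array([0.2, 0.4])
O_up = np.array([0.3, 0.6])
net_lo, net_up = interval_net(W_lo, W_up, O_lo, O_up, 0.0, 0.0)
out_lo, out_up = interval_output(net_lo, net_up)
```

The same routine would be called once per α level and once per layer, feeding the hidden-layer bounds into the output layer.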
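For output layer,
\[ [O_{pk}]_\alpha = \left[ [O_{pk}]_\alpha^L, [O_{pk}]_\alpha^U \right] = \left[ f\left([Net_{pk}]_\alpha^L\right), f\left([Net_{pk}]_\alpha^U\right) \right], \quad k = 1, 2, \ldots, n_O, \tag{11} \]
\[ [Net_{pk}]_\alpha^L = \sum_{\substack{h=1 \\ [W_{kh}]_\alpha^L \ge 0}}^{n_H} [W_{kh}]_\alpha^L \cdot [O_{ph}]_\alpha^L + \sum_{\substack{h=1 \\ [W_{kh}]_\alpha^L < 0}}^{n_H} [W_{kh}]_\alpha^L \cdot [O_{ph}]_\alpha^U + [\Theta_k]_\alpha^L, \tag{12} \]
\[ [Net_{pk}]_\alpha^U = \sum_{\substack{h=1 \\ [W_{kh}]_\alpha^U \ge 0}}^{n_H} [W_{kh}]_\alpha^U \cdot [O_{ph}]_\alpha^U + \sum_{\substack{h=1 \\ [W_{kh}]_\alpha^U < 0}}^{n_H} [W_{kh}]_\alpha^U \cdot [O_{ph}]_\alpha^L + [\Theta_k]_\alpha^U. \tag{13} \]
The objective is to minimize the cost function defined as: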
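\[ E_p = \sum_\alpha \alpha \sum_{k=1}^{n_O} \left( E_{k(\alpha)}^L + E_{k(\alpha)}^U \right) = \sum_\alpha E_{p(\alpha)}, \tag{14} \]
where
\[ E_{p(\alpha)} = \alpha \sum_{k=1}^{n_O} \left( E_{k(\alpha)}^L + E_{k(\alpha)}^U \right), \tag{15} \]
\[ E_{k(\alpha)}^L = \frac{1}{2} \left( [T_{pk}]_\alpha^L - [O_{pk}]_\alpha^L \right)^2, \tag{16} \]
\[ E_{k(\alpha)}^U = \frac{1}{2} \left( [T_{pk}]_\alpha^U - [O_{pk}]_\alpha^U \right)^2. \tag{17} \]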
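As a small illustration of how the α-weighted cost accumulates, the following Python sketch evaluates it for hypothetical target and output intervals. The data layout (a dictionary keyed by α level) and the function names are assumptions of the sketch.

```python
def alpha_cost(alpha, targets, outputs):
    """E_p(alpha): alpha times the summed lower- and upper-bound squared errors.

    targets, outputs: lists of (lower, upper) pairs, one pair per output node.
    """
    total = 0.0
    for (t_lo, t_up), (o_lo, o_up) in zip(targets, outputs):
        total += 0.5 * (t_lo - o_lo) ** 2 + 0.5 * (t_up - o_up) ** 2
    return alpha * total


def total_cost(per_level):
    """E_p: sum of E_p(alpha) over all alpha levels.

    per_level maps each alpha level to a (targets, outputs) pair.
    """
    return sum(alpha_cost(a, t, o) for a, (t, o) in per_level.items())


# Hypothetical alpha-cuts of one fuzzy target and one fuzzy network output.
per_level = {
    0.5: ([(0.4, 0.6)], [(0.2, 0.8)]),
    1.0: ([(0.5, 0.5)], [(0.5, 0.7)]),
}
cost = total_cost(per_level)
```

Errors at higher α levels, i.e., near the peak of the membership function, are weighted more heavily than errors near its tails.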
In Eqs. (16) and (17), $E_{k(\alpha)}^L$ and $E_{k(\alpha)}^U$ can be viewed as the squared errors for the lower boundaries and the upper boundaries of the α-cut sets of a fuzzy weight. Other α-cut sets of a fuzzy weight are independently modified to reduce $E_{p(\alpha)}$. Otherwise, the fuzzy numbers after modifications are distorted. Therefore, each fuzzy weight is updated in a manner similar to, but different from, the approach of Ishibuchi [17]. That is, in the proposed FNN, the membership functions are asymmetric Gaussian functions (i.e., a general shape), which are represented as:
\[ A(x) = \begin{cases} \exp\left( -\dfrac{1}{2} \left( \dfrac{x - \mu}{\sigma^L} \right)^2 \right), & x < \mu \\ 1, & x = \mu \\ \exp\left( -\dfrac{1}{2} \left( \dfrac{x - \mu}{\sigma^U} \right)^2 \right), & \text{otherwise} \end{cases} \tag{18} \]
Thus, the asymmetric Gaussian fuzzy weights are specified by their three parameters (i.e., center, right width and left width). The gradient search method is derived for each parameter similar to the error-backpropagation learning algorithm.
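The asymmetric Gaussian membership function and its α-cut can be sketched as below. The parameter names `sigma_l` and `sigma_u` mirror the two width symbols of the membership function, and the closed-form α-cut is obtained by solving the membership equation for x; the numeric values are hypothetical.

```python
import math


def asym_gauss(x, mu, sigma_l, sigma_u):
    """Membership value of x for an asymmetric Gaussian fuzzy number."""
    if x == mu:
        return 1.0
    sigma = sigma_l if x < mu else sigma_u
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def alpha_cut(alpha, mu, sigma_l, sigma_u):
    """Interval [lower, upper] on which membership >= alpha, for 0 < alpha <= 1."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    reach = math.sqrt(-2.0 * math.log(alpha))
    return mu - sigma_l * reach, mu + sigma_u * reach


# Hypothetical fuzzy weight: center 0.5, left width 0.1, right width 0.2.
lower, upper = alpha_cut(0.6, 0.5, 0.1, 0.2)
```

Evaluating `asym_gauss` at either returned bound gives back the chosen α, and the interval collapses to the center as α approaches 1.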
3.3.2. Fuzzy delphi
Section 3.3.1 has provided a novel FNN to learn the relationship between the fuzzy inputs and fuzzy outputs. Thus, this subsection proposes a pilot study, the fuzzy Delphi method, for extracting the fuzzy inputs and outputs.
The Delphi method was first developed by Dalkey at the Rand Corporation. This approach has been widely applied in many management areas, e.g., forecasting, public policy analysis, or project planning. However, the conventional Delphi method does not converge very well. Besides, high survey frequencies always result in high costs. Thus, Ishikawa et al. [20] utilized fuzzy set theory in the Delphi method to resolve the above shortcomings. This study modifies the method proposed by Ishikawa et al. [20] in that the Gaussian function is used instead of the triangular function. In addition, this study puts three indexes (pessimistic, optimistic and average indexes) in the questionnaire in order to accelerate the convergence. Therefore, the procedures of the modified fuzzy Delphi method for this research are as follows.
a. Collect all the possible factors which may affect the sales, then sort and group them in order to formulate the first questionnaire. In this questionnaire, all the factors are separate and the survey results are the FNN inputs.
b. Select one event from each group (or dimension) to formulate an if-then rule in order to form the second questionnaire, which is a set of if-then rules.
c. Fuzzify the returned second questionnaires from the senior managers and determine the pessimistic index, optimistic index and average index. The formulations are as follows:
(1) Pessimistic (minimum) index
\[ l = \frac{l_1 + l_2 + \cdots + l_n}{n} \tag{19} \]
where $l_i$ is the pessimistic index of the ith expert and n is the number of the experts.
(2) Optimistic (maximum) index
\[ u = \frac{u_1 + u_2 + \cdots + u_n}{n} \tag{20} \]
where $u_i$ is the optimistic index of the ith expert.
(3) Average (most appropriate) index. For each interval $(l_i, u_i)$, calculate the midpoint, $m_i = (l_i + u_i)/2$, and then find
\[ \mu = (m_1 \times m_2 \times \cdots \times m_n)^{1/n} \tag{21} \]
Thereafter, the fuzzy number $A = (\mu, \sigma^R, \sigma^L)$, which represents the mean, right width, and left width, respectively, for an asymmetrically bell-shaped function, can be determined through the above indexes:
\[ \sigma^R = \frac{l - \mu}{3} \tag{22} \]
\[ \sigma^L = \frac{u - \mu}{3} \tag{23} \]
d. Formulate the third questionnaire with the above indexes and make the survey.
e. Repeat the procedure c.
f. Employ the dissemblance index rule [21] to examine the second and third questionnaire fuzzy numbers. Restated, the dissemblance index rule determines whether the membership functions have converged or not (Fig. 4). If not, continue with the next questionnaire until convergence. Otherwise, the results of the third questionnaire are the FNN output. The distance between two fuzzy numbers (A and B) is
\[ d(A, B) = \int_{\alpha=0}^{1} d\left([A]_\alpha, [B]_\alpha\right) d\alpha = \frac{1}{2(\beta_2 - \beta_1)} \times \int_{\alpha=0}^{1} \left( \left| [A]_\alpha^L - [B]_\alpha^L \right| + \left| [A]_\alpha^U - [B]_\alpha^U \right| \right) d\alpha \tag{24} \]
where $\beta_1$ and $\beta_2$ are given any convenient values in order to surround both $[A]_{\alpha=0}$ and $[B]_{\alpha=0}$ (Fig. 3). Basically, d(A,B) is in the interval [0,1].
Fig. 4. The concept of dissemblance index of two fuzzy numbers.
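Steps c and f can be sketched in Python as follows. This is an illustration under stated assumptions: the expert intervals are hypothetical, the widths are returned as magnitudes (absolute values) of the width formulas in step c, and the dissemblance integral is approximated with a midpoint rule, which avoids α = 0 where a Gaussian α-cut is unbounded. Fuzzy numbers are passed to `dissemblance` as hypothetical `(center, left width, right width)` tuples.

```python
import math


def delphi_indexes(intervals):
    """Aggregate expert intervals (l_i, u_i) into the indexes of step c."""
    n = len(intervals)
    l = sum(lo for lo, _ in intervals) / n            # pessimistic index
    u = sum(up for _, up in intervals) / n            # optimistic index
    mids = [(lo + up) / 2.0 for lo, up in intervals]
    mu = math.prod(mids) ** (1.0 / n)                 # geometric mean of midpoints
    # Widths taken as magnitudes of (l - mu)/3 and (u - mu)/3 (an assumption).
    return {"l": l, "u": u, "mu": mu,
            "width_from_l": abs(l - mu) / 3.0,
            "width_from_u": abs(u - mu) / 3.0}


def alpha_cut(alpha, center, s_left, s_right):
    """Alpha-cut of an asymmetric Gaussian fuzzy number, 0 < alpha <= 1."""
    reach = math.sqrt(-2.0 * math.log(alpha))
    return center - s_left * reach, center + s_right * reach


def dissemblance(a, b, beta1, beta2, steps=1000):
    """Midpoint-rule approximation of the dissemblance index d(A, B)."""
    total = 0.0
    for i in range(steps):
        alpha = (i + 0.5) / steps
        a_lo, a_up = alpha_cut(alpha, *a)
        b_lo, b_up = alpha_cut(alpha, *b)
        total += abs(a_lo - b_lo) + abs(a_up - b_up)
    return total / steps / (2.0 * (beta2 - beta1))


# Hypothetical answers of two experts.
indexes = delphi_indexes([(2.0, 8.0), (4.0, 16.0)])

# Two hypothetical fuzzy numbers with equal widths, shifted by 0.1.
d_same = dissemblance((0.4, 0.05, 0.05), (0.4, 0.05, 0.05), 0.0, 1.0)
d_shift = dissemblance((0.4, 0.05, 0.05), (0.5, 0.05, 0.05), 0.0, 1.0)
converged = d_shift <= 0.08   # threshold value taken from the evaluation section
```

With identical fuzzy numbers the index is zero, and a pure shift of 0.1 on a unit-length interval yields an index of 0.1, which would exceed a threshold of 0.08 and call for another survey round.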
Fig. 5. The time series data.
3.4. Decision integration
From the above two parts, the ANN provides the sales without any special promotion while the promotion effect is forecasted by the FNN. To yield the final result, i.e., the sales which consider both the general pattern and the promotional effect, only integrating the results from the ANN and FNN is inadequate in that the promotional effect during the promotion interval may differ. Thus, this part employs another ANN to combine the ANN result, the FNN result, and the time effect.
4. Model evaluation results
Section 3 has presented the proposed forecasting system. To verify the proposed system's feasibility, a set of real data is collected. In addition, the proposed system is also compared with the other methods,
such as single ANN and ARMA. Both the proce-
Table 1
The third questionnaire (related products without promotion, 15-day promotion length)
Columns: promotion method; advertising medium; reference index (pessimistic, average, optimistic); affective interval
$10 discount 1. From 2000 to 2100 h on TV 7.2 7.8 8.6 _to _
2. At noon on TV 2 2.1 3.8 _to _
3. In the evening on TV 3.6 3.9 5.2 _to _
4. Radio 4 4.6 5.4 _to _
5. Newspaper 4.8 5.5 6.4 _to _
6. POP notice 6.4 7 8 _to _
7. Poster 6.2 6.5 7.4 _to _
$5 discount 1. From 2000 to 2100 h on TV 5.4 6 7.2 _to _
2. At noon on TV 1 1.1 2 _to _
3. In the evening on TV 3 3.3 4.4 _to _
4. Radio 3 3.4 4.4 _to _
5. Newspaper 3.6 4.2 5.2 _to _
6. POP notice 4.6 5 6.2 _to _
7. Poster 5 5.3 6.2 _to _
$3 discount 1. From 2000 to 2100 h on TV 1 1.3 2.3 _to _
2. At noon on TV 0.8 1 1.8 _to _
3. In the evening on TV 0.8 1 1.8 _to _
4. Radio 1.8 2 3.2 _to _
5. Newspaper 1 1.3 2.3 _to _
6. POP notice 1.8 1.9 3.3 _to _
7. Poster 2.8 2.9 4.2 _to _
Buy two get one free 1. From 2000 to 2100 h on TV 6.6 7.3 8.2 _to _
2. At noon on TV 3.2 3.2 4.6 _to _
3. In the evening on TV 3.6 3.7 5 _to _
4. Radio 4.6 5.2 6.2 _to _
5. Newspaper 4.8 5.4 6.4 _to _
6. POP notice 5.8 6.5 7.4 _to _
7. Poster 6 6.6 7.4 _to _
dures and results are sequentially shown in the fol-
lowing subsections.
4.1. Data collection
The data are provided by a locally well-known
chain supermarket. Since the forecasting pattern is
divided into two categories, general pattern and spe-
cial pattern, the data collection is also comprised of
two parts, as follows.
4.1.1. Time series data
The company provides the daily sales of 500 cc papaya milk. The total number of data points is 379, as shown in Fig. 5. Each sudden peak indicates that a promotion was conducted. In total, there are five promotions. The data span from January 1, 1995 to January 14, 1996. For the purpose of testing, these 379 data points are further divided into training samples and testing samples. The former has 334 data points while the latter has 45 data points.
4.1.2. Expert questionnaire
To survey all the possible factors of promotion and their effects on the sales, this paper employs the fuzzy Delphi method. The questionnaire's setup is based on related references, such as professional journals, and senior managers' knowledge. The procedures based on the modified fuzzy Delphi method presented in the above section are as follows.
Fig. 6. Four fuzzy numbers.
4.1.2.1. Factors determination. A large number of factors can generally affect the sales. However, different products have different characteristics. After discussing with the senior managers, all the factors are divided into three dimensions. The first dimension represents the methods of promotion, while the types of advertising media are presented in the second dimension. The third dimension represents the competitors' actions.
4.1.2.2. First questionnaire formulation. After the
events have been determined in the above procedure,
Table 2
The fuzzy number of each event for the third questionnaire
Factors | Events | Average (μ) | σ^R = (l − μ)/3 | σ^L = (u − μ)/3
Promotion methods | $10 discount | 0.85 | 0.0310 | 0.0338
Promotion methods | $5 discount | 0.48 | 0.0257 | 0.0378
Promotion methods | $3 discount | 0.21 | 0.0206 | 0.0454
Promotion methods | Buy two get one free | 0.70 | 0.0266 | 0.0369
Advertising media | At night on TV | 0.77 | 0.0345 | 0.0387
Advertising media | At noon on TV | 0.32 | 0.0159 | 0.0523
Advertising media | In the evening on TV | 0.36 | 0.0223 | 0.0524
Advertising media | Radio | 0.44 | 0.0126 | 0.0534
Advertising media | Newspaper | 0.43 | 0.0213 | 0.0492
Advertising media | POP notice | 0.69 | 0.0255 | 0.0325
Advertising media | Poster | 0.65 | 0.0290 | 0.0365
Promotion length | 15 days | 0.54 | 0.0221 | 0.0432
Promotion length | 20 days | 0.58 | 0.0343 | 0.0477
Promotion length | 30 days | 0.68 | 0.0227 | 0.0413
Promotion length | 45 days | 0.62 | 0.0245 | 0.0442
Others | Related products without promotion | 0.73 | 0.0259 | 0.0348
Others | Related products with promotion | 0.48 | 0.0177 | 0.0343
Table 3
The combinatorial fuzzy number for each linguistic term
Levels | Mean (μ) | Right standard deviation (σ^R) | Left standard deviation (σ^L)
Poor | 0.234188 | 0.039563 | 0.066271
Medium | 0.446196 | 0.046899 | 0.059435
Good | 0.637987 | 0.047829 | 0.049004
Very good | 0.778053 | 0.023351 | 0.029316
these events can be used to formulate the first questionnaire. The questionnaires are provided to the senior managers for survey, and the membership function for each event is calculated. Results obtained from the first questionnaire can provide the degree of importance for each event and are the FNN inputs.
4.1.2.3. Second questionnaire formulation. Since the questionnaire attempts to provide training data for the FNN, the events of all three dimensions can be used to formulate the second questionnaire, which is a set of if-then rules (Table 1). In total, there are 56 (4×7×2) if-then rules. For example, the first rule of the questionnaire is
if
discount is $10 and advertising time is from
2000 to 2100 h on TV and no related product is
under promotion
then
the effect on sales ranges from _ to _.
After the survey of this questionnaire, all the
required information, i.e., pessimistic, optimistic,
and average indexes, is calculated on the basis of
Section 3.
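The rule set of the second questionnaire is the Cartesian product of the events in the three dimensions, which the following sketch enumerates. The event labels are taken from Tables 1 and 2; the wording of the generated rule text and the variable names are assumptions made for illustration.

```python
from itertools import product

methods = ["$10 discount", "$5 discount", "$3 discount", "Buy two get one free"]
media = ["From 2000 to 2100 h on TV", "At noon on TV", "In the evening on TV",
         "Radio", "Newspaper", "POP notice", "Poster"]
related = ["no related product is under promotion",
           "a related product is under promotion"]

# One if-then rule per combination of events; the consequent is left blank
# because the experts fill in the affective interval.
rules = [
    f"if promotion method is {m} and advertising medium is {a} and {r} "
    "then the effect on sales ranges from _ to _"
    for m, a, r in product(methods, media, related)
]
rule_count = len(rules)
```

Merging two similar advertising media into one event, as done later for the special pattern model, would reduce the count to 4×6×2 = 48.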
4.1.2.4. Third questionnaire survey. Formulate the third questionnaire, which includes the pessimistic, optimistic and average indexes. Conduct the next (third) survey and calculate all the information again. Table 2 presents the fuzzy number of each event. However, directly using these fuzzy numbers is time-consuming and complicated. Therefore, all the similar events should be combined together. Finally, only four linguistic variables are used: poor, medium, good and very good. Fig. 6 and Table 3 present the fuzzy numbers of these four linguistic terms.
Table 4
Dissemblance index rule testing results for separate events
Factors | Events | d(A,B) | d = 0.06 | d = 0.08
Promotion methods | $10 discount | 0.010842 | Accept | Accept
Promotion methods | $5 discount | 0.012683 | Accept | Accept
Promotion methods | $3 discount | 0.055033 | Accept | Accept
Promotion methods | Buy two get one free | 0.021496 | Accept | Accept
Advertising media | At night on TV | 0.008304 | Accept | Accept
Advertising media | At noon on TV | 0.072394 | Reject | Accept
Advertising media | In the evening on TV | 0.036122 | Accept | Accept
Advertising media | Radio | 0.020922 | Accept | Accept
Advertising media | Newspaper | 0.039173 | Accept | Accept
Advertising media | POP notice | 0.017868 | Accept | Accept
Advertising media | Poster | 0.011316 | Accept | Accept
Promotion length | 15 days | 0.032976 | Accept | Accept
Promotion length | 20 days | 0.015148 | Accept | Accept
Promotion length | 30 days | 0.006339 | Accept | Accept
Promotion length | 45 days | 0.013148 | Accept | Accept
Others | Related products without promotion | 0.031801 | Accept | Accept
Others | Related products with promotion | 0.02144 | Accept | Accept
Table 5
The dissemblance index rule test results for combinatorial events
Sales linguistic terms | d(A,B) | d = 0.06 | d = 0.08
Poor | 0.018939 | Accept | Accept
Medium | 0.004207 | Accept | Accept
Good | 0.013379 | Accept | Accept
Very good | 0.005568 | Accept | Accept
4.1.2.5. Similarity testing. To determine the neces-
sity of the next survey, the dissemblance index rule
should be utilized. The single event results and
combinatorial events results are summarized in Ta-
bles 4 and 5, respectively. This finding suggests that
the membership function of each linguistic term has
converged. The next survey does not need to be
performed. Therefore, this knowledge base will be
applied to train the FNN and represents the FNN
outputs.
4.2. General pattern model (ANN)
Only 379 data points are used for training. However, in the general pattern model (ANN), the data points under promotion are not included in the training samples. Thus, after subtracting the number of promotion data points from 379, the number of training samples is 288. This study tests different alternatives, i.e., network structures. For instance, pattern 1 in Table 6 suggests that the input nodes are the (t-1)th and (t-2)th sales, which are connected to two hidden nodes, which are in turn connected to one output node, the (t)th sales. All of these data points should be normalized in [0,1]. The training rate and momentum are set to be 0.3 and 0.8, respectively. The network will not stop learning until 50,000
Table 7 y3 .MSEs =10 for different FNN setup
Momentum Training rate
0.1 0.3 0.5
0.1 8.31 9.10 8.46
0.5 8.15 8.52 8.63
0.8 13.85 8.96 11.30
epochs. In addition, statistical method, ARMA, is .also used to model the above time series data. AR 3
has the MSE value 2.21=10y3, while the network
whose structure is 7=7=1 has the lowest MSE
value 2.20=10y3.
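The input preparation described in Section 4.2 (normalizing the sales data to [0,1] and feeding the (t−1)th, (t−2)th, ... sales to predict the (t)th sales) can be sketched as follows. This is only an illustration under stated assumptions: the paper does not say which normalization is used, so min-max scaling is assumed here; the function names and the sample sales figures are hypothetical; and the exclusion of promotion days is not modeled.

```python
def min_max_normalize(series):
    """Scale a sequence of sales values linearly into [0, 1].

    Assumption: min-max scaling; the paper only states that the data
    are normalized to [0, 1].
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series carries no scale information.
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]


def make_lagged_samples(series, n_lags):
    """Build (inputs, target) pairs; inputs are ordered (t-1), (t-2), ..."""
    samples = []
    for t in range(n_lags, len(series)):
        inputs = [series[t - k] for k in range(1, n_lags + 1)]
        samples.append((inputs, series[t]))
    return samples


# Hypothetical daily sales figures, used only to exercise the functions.
sales = [120.0, 135.0, 150.0, 110.0, 160.0]
normalized = min_max_normalize(sales)

# n_lags=2 corresponds to the two input nodes of pattern 1 in Table 6.
samples = make_lagged_samples(normalized, n_lags=2)
for inputs, target in samples:
    print(inputs, "->", target)
```

Changing `n_lags` to 5 or 7 gives the input windows of the longer-memory structures listed in Table 6.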
4.3. Special pattern model (FNN)
Data collection demonstrated that the questionnaire should have fifty-six if-then rules. After using the dissemblance (unsimilarity) index rule to test the similarity, the result indicates that the event newspaper (A) and the event radio (B) are quite similar, in that d(A,B) is equal to 0.017966, i.e., smaller than the threshold d = 0.08. Thus, combining the two events reduces the total number of rules to 48 (4×6×2).
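The rule counts can be checked by enumerating the factor combinations. The sketch below assumes that each if-then rule combines one promotion method, one advertising medium, and one of the two "others" events from Table 4; this mapping is an interpretation of the factorization 4×6×2 and is not stated explicitly in the text. The merged label "Newspaper/Radio" is hypothetical.

```python
from itertools import product

# Event names copied from Table 4.
promotion_methods = [
    "$10 discount", "$5 discount", "$3 discount", "Buy two get one free",
]
advertising_media = [
    "at night on TV", "at noon on TV", "In the evening on TV",
    "Radio", "Newspaper", "POP notice", "Poster",
]
others = [
    "Related products without promotion",
    "Related products with promotion",
]

# Assumption: one rule per combination of the three factors.
rules_before = list(product(promotion_methods, advertising_media, others))

# Merge the two events judged similar (newspaper and radio) into one medium.
merged_media = [m for m in advertising_media if m not in ("Radio", "Newspaper")]
merged_media.append("Newspaper/Radio")  # hypothetical label
rules_after = list(product(promotion_methods, merged_media, others))

print(len(rules_before), len(rules_after))
```

Under this assumption the enumeration yields 56 rules before merging and 48 afterwards, matching the counts reported in the text.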
The setup of the proposed FNN with asymmetric bell-shaped fuzzy weights is similar to that of the general ANN. The network structure consists of three input nodes, i.e., dimensions, which are connected to six hidden nodes, i.e., numbers, which are connected to one output node. The network does not stop learning until 30,000 epochs have elapsed. The α-level sets are 0.2, 0.4, 0.6, 0.8, and 1.0. Different training rates and momentum terms may yield different results. The testing results (Table 7) indicate that the lowest MSE value is 8.15 × 10⁻³, obtained when the training rate and momentum are 0.1 and 0.5, respectively. Finally, this network is integrated with the general pattern model (ANN) in the next part, decision integration.

Table 6
The testing alternatives for general pattern model

Pattern number   Network structure (I×H×O)   MSE (×10⁻³)   Percentage increase (%)
0                AR(3)                       2.21          -
1                2×2×1                       2.36          6.79
2                2×2×2×1                     2.36          6.78
3                5×5×1                       2.22          0.45
4                5×5×5×1                     2.23          0.76
5                7×7×1                       2.20          −0.45
6                7×7×7×1                     2.20          −0.52

Table 8
MSE values of training results for different models

Forecasting model     MSE (×10⁻³)   Percentage increase of MSE
Integration ANN       3.37          -
ARMA(2,5)             4.88          46.13
ANN1 (5×5×1)          4.95          48.35
ANN2 (5×5×5×1)        5.50          49.80
ANN3 (10×10×1)        4.76          44.77
ANN4 (10×10×10×1)     4.77          43.10
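The "Percentage increase" column of Table 6 is not defined in the text. A plausible reading, assumed here, is the relative change of each MSE with respect to the AR(3) baseline. The sketch below applies that assumed formula to three rows of Table 6; the function name `percentage_increase` is hypothetical.

```python
def percentage_increase(mse, baseline_mse):
    """Relative change of an MSE with respect to a baseline, in percent.

    Assumption: this is how the 'Percentage increase' column is computed;
    the paper does not give the formula.
    """
    return (mse - baseline_mse) / baseline_mse * 100.0


baseline = 2.21  # AR(3) in Table 6, in units of 10^-3

# MSE values (x 10^-3) copied from Table 6, patterns 1, 3 and 5.
table6_mse = {"2x2x1": 2.36, "5x5x1": 2.22, "7x7x1": 2.20}

increases = {
    name: round(percentage_increase(mse, baseline), 2)
    for name, mse in table6_mse.items()
}
print(increases)  # {'2x2x1': 6.79, '5x5x1': 0.45, '7x7x1': -0.45}
```

These three values agree with the corresponding entries of Table 6. The small differences in the other rows of Table 6 would be consistent with the percentages having been computed from unrounded MSE values, although the paper does not confirm this.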
4.4. Decision integration model (ANN)
The two preceding parts provide the results of the general pattern model and the special pattern (promotion) model. This part attempts to integrate the two models. The inputs originate from the following three sources:
a. The general pattern model provides one input node, which is the general pattern trend;
b. The special pattern model provides five input nodes, which are taken from the FNN output fuzzy number at α equal to 0.6, 0.8 and 1.0; and
c. The promotion length provides one input node.
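The three input sources listed above can be assembled into a single input vector. The sketch below is an illustration only. It assumes that the five special-pattern inputs are the lower and upper bounds of the FNN output at α = 0.6 and α = 0.8 plus the single peak value at α = 1.0; the paper states only that five nodes come from these three α levels, so this split is an assumption. The function name and all numerical values are hypothetical.

```python
def integration_inputs(general_forecast, fnn_alpha_cuts, promotion_length):
    """Concatenate the three input sources into one 7-element vector.

    fnn_alpha_cuts maps an alpha level to the (lower, upper) bounds of the
    FNN output fuzzy number at that level.
    """
    lo06, hi06 = fnn_alpha_cuts[0.6]
    lo08, hi08 = fnn_alpha_cuts[0.8]
    # Assumption: at alpha = 1.0 the cut collapses to a single peak value.
    peak, _ = fnn_alpha_cuts[1.0]
    special = [lo06, hi06, lo08, hi08, peak]
    return [general_forecast] + special + [promotion_length]


# Hypothetical normalized values, used only to show the vector layout.
cuts = {0.6: (0.42, 0.71), 0.8: (0.48, 0.63), 1.0: (0.55, 0.55)}
x = integration_inputs(general_forecast=0.37, fnn_alpha_cuts=cuts,
                       promotion_length=0.5)
print(len(x), x)
```

The resulting vector has 1 + 5 + 1 elements, one per input node of the integration network.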
Therefore, the total number of integration ANN input nodes is seven. Consequently, the network structure is 7×7×1. The training rate and momentum are 0.3 and 0.8, respectively. The training limit is 50,000 epochs. Table 8 presents the resulting MSE value, which is equal to 3.37 × 10⁻³. For the purpose of comparison, this paper also uses the collected data to formulate the ARMA model and the ANN model. Table 8 summarizes those results.

So far, the paper has presented the training results. In the following, three models, i.e., the integration ANN, the short-term memory ANN (5×5×1), and the long-term memory ANN (10×10×1), are used to forecast 45 data points that include one promotion. These are the last forty-five data points shown in Fig. 5. The related results are listed in Table 9 and Figs. 7–9.

Table 9
The forecasting results

Forecasting model        MSE (×10⁻³)   Percentage increase of MSE
Integration ANN          3.10          -
Short-term memory ANN    6.61          113.63
Long-term memory ANN     6.78          118.99

Fig. 7. The integration ANN forecasting result.
Fig. 8. The short-term memory ANN forecasting result.
Fig. 9. The long-term memory ANN forecasting result.
5. Discussion
Section 4 has presented evaluation results based on data accumulated from a chain supermarket company. The data collection appears to be subjective, since the data are provided by either senior managers or experts. However, the number of experts is twenty, implying that the subjective factors are kept to a minimum. In particular, because the fuzzy Delphi method is employed, this concern is further reduced. Among all types of promotion methods, the event $10 discount is the most effective, while the least effective is $3 discount. Regarding the types of advertising media, the event advertising from 2000 to 2100 h on TV affects sales the most. This is expected, since 2000 to 2100 h is the so-called golden interval. The promotion of the related product yields a negative effect.

Notably, the pessimistic, optimistic, and average indexes are included in the third questionnaire. The reason is that providing the other experts' opinions may accelerate the convergence of the fuzzy numbers. In the third questionnaire, 13 fuzzy numbers are combined into four fuzzy numbers. The main reason is to reduce the computational complexity; the forecast's precision may not differ significantly. Moreover, the third questionnaire also used the unsimilarity index rule to determine the necessity for the next survey. Table 5 indicates that, whether d is equal to 0.06 or 0.08, all the paired fuzzy numbers are similar. Even in the separate events results, only the event advertising at noon on TV is not similar when d is 0.06. Therefore, using these data (or fuzzy numbers) to train the FNN is obviously subjective. Basically, this is a pilot study to indicate the sources of the fuzzy inputs and fuzzy outputs. Both Refs. [17,18], i.e., two representative papers considering fuzzy inputs and outputs, did not provide a means to find the fuzzy inputs and outputs for training.
In the second part of the proposed system, six alternatives, from short-term to long-term memories, are tested. The MSE values indicate that the network with long-term memory (seven) outperforms the networks with short-term memory (two or five). However, continuously increasing the length of memory does not yield a better result. The reason is that unnecessary information may mislead the network's memory. Regarding the number of hidden layers, the results indicate that either one or two hidden layers can provide a similar forecast. Owing to computational considerations, one hidden layer is more feasible. This is also the reason for selecting the 7×7×1 network instead of the 7×7×7×1 network to integrate with the FNN.
Table 8 indicates that the integration ANN outperforms all other forecasting methods, e.g., ARMA(2,5) and the single ANN. The reason is that the integration ANN prioritizes the promotion effect on the sales pattern. In terms of the forecasting results, the integration ANN is also second to none. Interestingly, the short-term and long-term ANNs always have a one-day delay. This is because this kind of ANN depends only on the previous data, whereas the integration ANN incorporates the promotion effect in the network.
6. Conclusions
This paper has developed a forecasting system based on FNN to solve the sales forecasting problem under promotion. Though directly using a single ANN to model the sales pattern has been shown to be better than the conventional statistical methods, it still needs further improvement. Integrating the ANN and the FNN can provide a more reliable forecast. In addition, the fuzzy Delphi method was applied to collect the fuzzy inputs and outputs for the FNN. To our knowledge, this is the first paper to consider this existing but unresolved problem. In the future, the authors would like to further improve the FNN, for example by pruning the fuzzy weights and improving the training speed of the network. From a marketing perspective, more factors, which may yield a more precise result, can be included.
Acknowledgements
The authors would like to thank the National Science Council, Republic of China, for partially supporting this manuscript under Contract No. NSC 86-2416-H-327-003-E10. The authors also thank Mr. L.C. Shie for providing the daily sales data and for his valuable discussion regarding chain store promotion. In addition, the authors thank the anonymous referees for reading the paper and offering many helpful comments.
References
[1] K. Chakraborty, K. Mehrotra, C.K. Mohan, Forecasting the behavior of multivariate time series using neural networks, Neural Networks 5 (1992) 961–970.
[2] G. Lachtermacher, J.D. Fuller, Backpropagation in time-series forecasting, J. Forecasting 14 (1995) 381–393.
[3] G.S. LeVee, The key to understanding the forecasting process, J. Business Forecasting 11 (4) (1992) 12–16.
[4] G.G. Meyer, Marketing research and sales forecasting at Schlegel Corporation, J. Business Forecasting 12 (2) (1993) 22–23.
[5] C.W. Chase, Ways to improve sales forecasts, J. Business Forecasting 12 (3) (1993) 15–17.
[6] M.M. Florance, M.S. Sawicz, Positioning sales forecasting for better results, J. Business Forecasting 12 (4) (1993) 27–28.
[7] A.S. Weigen, D.E. Rumelhart, B.A. Huberman, Generalization by weight-elimination with application to forecasting, Adv. Neural Inf. Processing Syst. 3 (1991) 875–882.
[8] Z. Tang, C. Almeida, P.A. Fishwick, Times series forecasting using neural networks vs. Box–Jenkins methodology, Simulations, Simulations Councils, Nov., 1991, pp. 303–310.
[9] L.A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning: Parts 1–3, Inf. Sci. 8 (1965).
[10] M. Black, Vagueness: An Exercise in Logical Analysis, Philosophy of Science 4, pp. 427–455, 1937.
[11] T. Takagi, M. Sugeno, Derivation of Fuzzy Control Rules from Human Operator's Control Actions, Proc. of the IFAC Symposium on Fuzzy Information, Knowledge Representation and Decision Analysis, Marseilles, France, July 1983, pp. 55–60.
[12] C.C. Lee, Fuzzy logic in control systems: fuzzy logic controller, Parts I and II, IEEE Trans. Syst. Man, Cybernetics SMC 20 (2) (1990) 404–435.
[13] C.T. Lin, C.S.G. Lee, Neural-network-based fuzzy logic control and decision system, IEEE Trans. Comput. C 40 (12) (1991) 1320–1336.
[14] C.T. Lin, C.S.G. Lee, Reinforcement structure/parameter learning for neural network based fuzzy logic control systems, IEEE Trans. Fuzzy Syst. 2 (1) (1994) 46–63.
[15] R.J. Kuo, P.H. Cohen, Manufacturing process control through integration of artificial neural networks and fuzzy model, Fuzzy Sets and Systems 98 (1) (1998) 15–31.
[16] H. Ishibuchi, H. Okada, R. Fujioka, H. Tanaka, Neural networks that learn from fuzzy if-then rules, IEEE Trans. Fuzzy Syst. FS 1 (2) (1993) 85–97.
[17] H. Ishibuchi, K. Kwon, H. Tanaka, A learning algorithm of fuzzy neural networks with triangular fuzzy weights, Fuzzy Sets Syst. 71 (1995) 277–293.
[18] C.T. Lin, C.S.G. Lee, Fuzzy adaptive learning control network with on-line neural learning, Fuzzy Sets Syst. 71 (1995) 25–45.
[19] J.J. Buckley, Y. Hayashi, Fuzzy neural networks: a survey, Fuzzy Sets Syst. 66 (1994) 1–13.
[20] A. Ishikawa, M. Amagasa, G. Tomiqawa, R. Tatsuta, H. Mieno, The mix-min Delphi method and fuzzy Delphi method via fuzzy integration, Fuzzy Sets Syst. 55 (1993) 241–253.
[21] A. Kaufmann, M.M. Gupta, Introduction to Fuzzy Arithmetic, North-Holland, Amsterdam, 1985.
R.J. Kuo received an MS degree in Industrial and Manufacturing Systems Engineering from the Iowa State University, Ames, IA, in 1990 and Ph.D. degree in Industrial and Management Systems Engineering from the Pennsylvania State University, University Park, PA, in 1994. Currently, he is an Associate Professor in the Department of Marketing and Distribution Management, National Institute of Technology at Kaohsiung, Taiwan. His research interests include architecture issues of neural networks, fuzzy logic, and genetic algorithm, and their applications in process control, forecasting and marketing.

K.C. Xue earned his MS degree in Management Science from the I-Shou University, Taiwan, in 1996. His research interests include neural networks, fuzzy logic, and their applications in marketing. Currently, he is in the military service.