Universidad de Guadalajara
Centro Universitario de Ciencias Económicas Administrativas
Neuro-Fuzzy Data Mining Mexico’s Economic Data
Thesis submitted to obtain the degree of Maestro de Tecnologías de Información
presented by
Gustavo Becerra Gavino
Director: Dra. Liliana Ibeth Barbosa Santillán
Co-director: Dr. Jerome Leboeuf Pasquier
Assessor: Dr. Alberto Ramírez Ruiz
December 2013
UNIVERSIDAD DE GUADALAJARA
Abstract
MAESTRÍA EN TECNOLOGÍAS DE INFORMACIÓN
Centro Universitario de Ciencias Económicas Administrativas
Maestro en Tecnologías de Información
Neuro-Fuzzy Data Mining Mexico’s Economic Data
by Gustavo Becerra Gavino
Given the growing volume of data being collected, there is a need for tools that
automate the recognition and extraction of patterns within targeted data. The
present work explores the use of a neuro-fuzzy classifier for the multi-factor productivity
of the manufacturing sector in the Mexican economy. The chosen data set contains the
time series for the variables Sale Value of Products, Wages, Work Force, Days Worked,
and Hours Worked. The data is taken from the Banco de Información Económica at the
Instituto Nacional de Estadística y Geografía.
Acknowledgements
Thanks to Dr. Liliana Ibeth Barbosa Santillán, Dr. Jerome Leboeuf Pasquier, Dr. Alberto
Ramírez Ruiz, and MS Leonel Perez Pelayo, who helped me get through developing
and writing this document. Thanks to all the instructors I’ve had throughout my academic
life. All of you have contributed to my development as a person and as a computer
scientist.
Contents
Abstract
Acknowledgements
List of Figures
List of Tables
Abbreviations
1 Introduction
1.1 Motivation
1.2 Analysis Tools Readily Available
1.3 Analytics tools at INEGI
1.4 Justification
2 Giants’ Work
2.1 Introduction
2.2 Neural Networks
2.3 Fuzzy Logic
2.4 Neuro-Fuzzy Systems
2.4.1 Neuro-Fuzzy Perceptron
2.4.2 ANFIS
2.4.3 NEFCLASS
2.4.4 Neuro-Fuzzy Reasoner
2.5 Data Mining
2.6 Neuro-Fuzzy Systems in Data Mining
3 Methodology
3.1 Idea
3.2 System Design
3.3 Concrete Implementation
3.4 Experimental Evaluation
4 Neuro-Fuzzy Classifying Productivity
4.1 Introduction
4.2 Variables
4.3 Membership Functions
4.4 IF-THEN Constructs
4.5 Neuro-Fuzzy System Design
5 MXMiner Neuro-Fuzzy Classifier
5.1 Introduction
5.2 Neuro-Fuzzy Classifier
5.2.1 productivityNFC.properties A.1
5.2.2 NeuroFuzzyClassifier.java A.2
5.3 Execution
5.3.1 mxminer.properties A.3
5.3.2 MXMiner.java A.4
5.4 Training Set
5.5 Results
6 Conclusions
6.1 Achievements
6.2 Further Work
6.2.1 Neuro-Fuzzy rules
6.2.2 Training set
6.2.3 Output
6.2.4 Automated Data Extraction
6.3 Closing Remarks
A Implementation
A.1 productivityNFC.properties
A.2 NeuroFuzzyClassifier.java
A.3 mxminer.properties
A.4 MXMiner.java
B Data Sets
B.1 BIE c20131110114800.txt
B.2 BIE Manufacturing.csv
B.3 Training Set.csv
B.4 Sample Run Results
C Generated Web Services Client
C.1 Package mx.org.inegi.sistemas.bie
Bibliography
List of Figures
1.1 Data: Valor de ventas de los productos elaborados
1.2 Data: Remuneraciones totales
1.3 Data: Personal ocupado total
1.4 Data: Días trabajados
1.5 Data: Total de horas trabajadas
2.1 History of neuro-fuzzy data mining.
2.2 Timeline - Artificial Neuron, Perceptron.
2.3 Artificial Neuron
2.4 Multilayer Perceptron.
2.5 Timeline - Fuzzy Sets.
2.6 MF: Value
2.7 Fuzzy Rules Example
2.8 Timeline - Neuro-Fuzzy Systems.
3.1 Experimentation in System Design
3.2 Variables
4.1 Variables used as Inputs and Outputs for productivity
4.2 Fuzzy Sets
4.3 MF: Value
4.4 MF: Wages
4.5 MF: Workforce
4.6 MF: Days Worked
4.7 MF: Hours Worked
4.8 Fuzzy Rules Antecedents
4.9 Productivity Neuro-Fuzzy Classifier Design
5.1 Timeline - Neuro-Fuzzy Classifier.
5.2 Layer Labels
5.3 Variable Names
5.4 Fuzzy Sets
5.5 MF Delimiters
5.6 Fuzzy Rule Antecedents
5.7 Input Layer
5.8 Fuzzification Layer
5.9 Rules Layer
5.10 Output Layer
5.11 Training and Data sets
5.12 Execution
5.13 Training Set.csv
5.14 Training Set
5.15 Sample Run Results
List of Tables
4.1 Membership Functions
C.1 Interface Summary
C.2 Class Summary
Abbreviations
ANFIS      Adaptive-Network-based Fuzzy Inference System
BIE        Banco de Información Económica
INEGI      Instituto Nacional de Estadística y Geografía
KDD        Knowledge Discovery in Databases
NEFCLASS   NEuro-Fuzzy CLASSification
Chapter 1
Introduction
1.1 Motivation
The advent of the Internet brought the opportunity to share information and
make it more readily accessible. In Mexico, the Instituto Nacional de Estadística y
Geografía (INEGI) is in charge of gathering information about the country. The INEGI
maintains the Banco de Información Económica (BIE), which is accessible through
an interface available at the INEGI website [1][2]. Because this information is readily
available, there is an opportunity to contribute to its understanding by providing
tools to analyze it.
1.2 Analysis Tools Readily Available
The INEGI BIE web interface provides a graphing utility as a simple analytical tool
to visualize the data. Figures 1.1 through 1.5 show charts generated with this utility
for the selected time series.
1.3 Analytics tools at INEGI
In addition to the graphing utility provided in the BIE web interface, other
analytic tools are provided in the Analysis Lab [3]. The INEGI provides access to analytic
tools such as Excel, STATA, and SPSS [4]. IBM SPSS offers the package SPSS Neural
Networks [5] as an extension to the main SPSS statistics software package. However, the
documentation for these products does not mention any implementation of a neuro-fuzzy
system geared toward analytics. This leaves an opportunity to explore the usefulness
of neuro-fuzzy systems on the economic information provided at the BIE [2].
1.4 Justification
The public information about the tools used to analyze data at the INEGI suggests
that neuro-fuzzy systems are not being used. Therefore, the present work
shows how a neuro-fuzzy system facilitates the classification of productivity in the
manufacturing sector for a given month. Measuring productivity is important because
it is used for tracing technological change, identifying the efficiency of a
given production system, and indexing the standard of living, among other uses [6]. The
productivity discussed in the present work refers to Mexico’s manufacturing sector.
However, the productivity measurement for smaller economies, for example a company, is
similarly used for strategic planning in operations management [7]. The data for the
variables that will be used to determine productivity is the data graphed in figures 1.1
through 1.5. Section 3.1 presents further details about how the data will be used.
Chapter 2
Giants’ Work
2.1 Introduction
Figure 2.1: History of neuro-fuzzy data mining. (Timeline, 1940 to 2020: 1943 Artificial Neuron; 1958 The Perceptron; 1964 MLP; 1965 Fuzzy Sets; 1992 MLP, FS, Classification; 1993 ANFIS; 1996 NEFCLASS; 2006 NFR; 2013 MXMiner NFC.)
Even in Greek mythology, intelligent mechanical beings captured the human
imagination, to the point of describing a mythical bronze being, Talos. In the current age, the
efforts to understand how intelligence arises have provided us with useful tools that help
us make better sense of the phenomena around us. The introduction of a mathematical
model for the biological neuron gave way to artificial neural networks capable
of learning from the data being processed. Fuzzy logic expresses the values of
variables in linguistic terms. The combination of neural networks and fuzzy logic provides
tools that learn and express values in a more human-like language. The use of
neuro-fuzzy systems in data mining automates the analysis of data.
2.2 Neural Networks
Figure 2.2: Timeline - Artificial Neuron, Perceptron. (Timeline, 1940 to 2020: 1943 Artificial Neuron; 1958 The Perceptron; 1964 MLP; 1965 Fuzzy Sets; 1985 FMFs Perceptron; 1992 MLP, FS, Classification; 1993 ANFIS; 1996 NEFCLASS; 2006 NFR; 2013 MXMiner NFC.)
In 1943 Warren S. McCulloch and Walter H. Pitts[8] introduced the mathematical model
for the artificial neuron.
Figure 2.3: Artificial Neuron
This is a highly simplified model: a set of weighted inputs is aggregated, the
aggregated value is passed through an activation function, and an output is produced.
On its own the model is of limited use. However, it is the building block for more
complex systems like the multilayer perceptron.
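The aggregation and activation steps described above can be sketched in a few lines of Java. This is a minimal illustration and not code from the present work: the class name ArtificialNeuron, the bias term, the step activation, and the weights are assumptions chosen only for the example.

```java
import java.util.function.DoubleUnaryOperator;

/** Illustrative artificial neuron: weighted sum of the inputs, then an activation function. */
final class ArtificialNeuron {
    private final double[] weights;
    private final double bias;
    private final DoubleUnaryOperator activation;

    ArtificialNeuron(double[] weights, double bias, DoubleUnaryOperator activation) {
        this.weights = weights.clone();
        this.bias = bias;
        this.activation = activation;
    }

    /** Aggregates the weighted inputs and passes the sum through the activation function. */
    double fire(double... inputs) {
        if (inputs.length != weights.length) {
            throw new IllegalArgumentException("expected " + weights.length + " inputs");
        }
        double sum = bias;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * inputs[i];
        }
        return activation.applyAsDouble(sum);
    }

    /** Step activation (an assumption): 1 when the aggregated value is at least 0, else 0. */
    static double step(double x) {
        return x >= 0.0 ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        // Hypothetical weights and bias that make the neuron behave like a logical AND.
        ArtificialNeuron and =
                new ArtificialNeuron(new double[] {1.0, 1.0}, -1.5, ArtificialNeuron::step);
        System.out.println(and.fire(1.0, 1.0)); // prints 1.0
        System.out.println(and.fire(1.0, 0.0)); // prints 0.0
    }
}
```

A single neuron of this kind can only separate its inputs with one straight boundary, which is why it serves mainly as a building block.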
Figure 2.4: Multilayer Perceptron. (Diagram: Input 1 through Input n enter an input layer, followed by a hidden layer and an output layer that produces the output.)
In 1958 F. Rosenblatt published the mathematical model for the Perceptron[9, 10].
He took the idea of an artificial neuron a bit further by including excitatory inputs,
inhibitory inputs, and feedback signals[10]. One of the ways a neural network stores
information (learns) is by adjusting the input weights based on the feedback signals.
Later, in 1969, Minsky and Papert[11] proved that the single-layer Perceptron could not
learn the XOR function[12], because XOR is not linearly separable. This problem slowed
down the advancement in artificial intelligence until the multilayer perceptron was
used to find a solution.
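To make the XOR limitation concrete, the following sketch wires a two-layer network by hand. It is an illustration under stated assumptions, not the historical solution and not code from the present work: the weights are picked manually rather than learned, and the class and method names are hypothetical. The hidden layer computes OR and NAND, and the output neuron combines them with AND, which yields XOR.

```java
/** Hand-wired two-layer network of step neurons that computes XOR. */
final class XorMultilayerPerceptron {

    /** Step activation: 1 when the aggregated value is at least 0, else 0. */
    static int step(double x) {
        return x >= 0.0 ? 1 : 0;
    }

    /** One step neuron with two weighted inputs and a bias. */
    static int neuron(double w1, double w2, double bias, int x1, int x2) {
        return step(w1 * x1 + w2 * x2 + bias);
    }

    /** XOR built from a hidden layer (OR, NAND) and an output neuron (AND). */
    static int xor(int x1, int x2) {
        int or = neuron(1.0, 1.0, -0.5, x1, x2);     // hidden neuron 1
        int nand = neuron(-1.0, -1.0, 1.5, x1, x2);  // hidden neuron 2
        return neuron(1.0, 1.0, -1.5, or, nand);     // output neuron
    }

    public static void main(String[] args) {
        for (int a = 0; a <= 1; a++) {
            for (int b = 0; b <= 1; b++) {
                System.out.println(a + " XOR " + b + " = " + xor(a, b));
            }
        }
    }
}
```

No choice of weights for a single step neuron reproduces this truth table, whereas one hidden layer is enough.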
2.3 Fuzzy Logic
Figure 2.5: Timeline - Fuzzy Sets. (Timeline, 1940 to 2020: 1943 Artificial Neuron; 1958 The Perceptron; 1964 MLP; 1965 Fuzzy Sets; 1985 FMFs Perceptron; 1992 MLP, FS, Classification; 1993 ANFIS; 1996 NEFCLASS; 2006 NFR; 2013 MXMiner NFC.)
The notion of fuzzy sets was introduced by Zadeh[13] in 1965. In fuzzy sets the values are
expressed as a degree of membership to the elements of the set. Consider the following
membership function for the variable Value:
Figure 2.6: MF: Value
In this membership function, also called characteristic function, the fuzzy set is
{Low, Medium, High}. The fuzzy value Medium has a degree of membership, or
truth, of 0 at 305, increasing to 1 at 327; from 327 to 375 it has a degree of 1; then
it decreases from 1 at 375 to 0 at 397. The fuzzy value High has a degree of 0 at
375, increasing to 1 at 397 and staying at 1 from there on. This example illustrates
that a given crisp value of a variable can be a member of two fuzzy sets at once when the
variable goes through the membership functions and gets fuzzified.
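The degrees described above can be reproduced with a small trapezoidal membership function. The sketch below is illustrative only: the class and method names are hypothetical, and representing the open right side of High with positive infinity is an assumption made for the example. A crisp Value of 386 lies on the falling edge of Medium and on the rising edge of High, so it belongs to both sets.

```java
/** Trapezoidal membership function defined by four delimiters a <= b <= c <= d. */
final class TrapezoidMembership {

    /**
     * Degree of membership of x: 0 outside [a, d], rising from a to b,
     * 1 between b and c, and falling from c to d.
     */
    static double degree(double x, double a, double b, double c, double d) {
        if (x < a || x > d) {
            return 0.0;
        }
        if (x < b) {
            return (x - a) / (b - a);
        }
        if (x <= c) {
            return 1.0;
        }
        return (d - x) / (d - c);
    }

    public static void main(String[] args) {
        double inf = Double.POSITIVE_INFINITY; // assumption: High stays at 1 indefinitely
        double value = 386.0;
        double medium = degree(value, 305.0, 327.0, 375.0, 397.0);
        double high = degree(value, 375.0, 397.0, inf, inf);
        System.out.println("medium = " + medium); // prints medium = 0.5
        System.out.println("high = " + high);     // prints high = 0.5
    }
}
```

Because the comparisons are ordered this way, a degenerate trapezoid with a equal to b, or c equal to d, never divides by zero and simply behaves as a rectangle on that side.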
Although a variable loses information through the process of fuzzification,
it gains flexibility, tolerance, and expressiveness[14]. Fuzzy logic uses IF-THEN
constructs to express the relations between fuzzy variables. For example:
IF value IS high AND wages IS high THEN productivity IS low;
IF value IS high AND wages IS medium THEN productivity IS medium;
IF value IS high AND wages IS low THEN productivity IS high;
Figure 2.7: Fuzzy Rules Example
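As an illustration of how the three rules in Figure 2.7 could be evaluated, the sketch below takes the fuzzy AND to be the minimum of the two membership degrees. That choice is a common convention and an assumption here, since the text does not state which operator is used. The class name, method names, and the membership degrees passed in are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Evaluates the three example rules of Figure 2.7 for given membership degrees. */
final class FuzzyRuleFiring {

    /** Fuzzy AND taken as the minimum of both degrees (an assumption of this sketch). */
    static double and(double a, double b) {
        return Math.min(a, b);
    }

    /** Fires each rule and returns the activation of every productivity label. */
    static Map<String, Double> fire(double valueHigh, double wagesHigh,
                                    double wagesMedium, double wagesLow) {
        Map<String, Double> productivity = new LinkedHashMap<>();
        productivity.put("low", and(valueHigh, wagesHigh));       // value high, wages high
        productivity.put("medium", and(valueHigh, wagesMedium));  // value high, wages medium
        productivity.put("high", and(valueHigh, wagesLow));       // value high, wages low
        return productivity;
    }

    /** Returns the label with the largest activation. */
    static String strongest(Map<String, Double> activations) {
        String best = null;
        double bestDegree = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> entry : activations.entrySet()) {
            if (entry.getValue() > bestDegree) {
                best = entry.getKey();
                bestDegree = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical degrees: value is high to 0.5; wages is mostly medium.
        Map<String, Double> result = fire(0.5, 0.0, 0.8, 0.2);
        System.out.println(result);            // prints {low=0.0, medium=0.5, high=0.2}
        System.out.println(strongest(result)); // prints medium
    }
}
```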
2.4 Neuro-Fuzzy Systems
Figure 2.8: Timeline - Neuro-Fuzzy Systems. (Timeline, 1940 to 2020: 1943 Artificial Neuron; 1958 The Perceptron; 1964 MLP; 1965 Fuzzy Sets; 1985 FMFs Perceptron; 1992 MLP, FS, Classification; 1993 ANFIS; 1996 NEFCLASS; 2006 NFR; 2013 MXMiner NFC.)
2.4.1 Neuro-Fuzzy Perceptron
As early as 1985, Keller and Hunt[15] researched the idea of combining fuzzy
logic and the perceptron. Their efforts aimed at alleviating the convergence problem that
crisp perceptrons had when the classes were not
linearly separable. Later, Pal and Mitra[16] introduced the use of fuzzy
membership functions along with a supervised learning perceptron for classification. The
fuzzy perceptron is a 3-layered network intended to include knowledge defined in the
rules it is implementing. In machine learning, supervised learning refers to training
a system with examples whose desired output is already known, as opposed to unsupervised
learning, where the system discovers hidden structure as it processes data.
2.4.2 ANFIS
The ANFIS[17], Adaptive-Network-based Fuzzy Inference System, is a multilayer fuzzy
perceptron that uses predefined human knowledge provided in the fuzzy IF-THEN rules.
It also incorporates adaptive neurons, that is, neurons whose parameters are
updated to achieve a desired input-output mapping as the training set is processed.
2.4.3 NEFCLASS
NEFCLASS[18–20] is a 3-layered feedforward fuzzy perceptron. It is intended to determine
the correct class for a given set of values from the input variables. The output neurons
represent the fuzzy set for the variable being classified. NEFCLASS was used as
the inspiration for the Neuro-Fuzzy Reasoner.
2.4.4 Neuro-Fuzzy Reasoner
The neuro-fuzzy reasoner[21] is based on the NEFCLASS. It is a 4-layered feedforward
fuzzy perceptron. However, it differs from NEFCLASS in that the membership functions
are not modified throughout its execution. It was originally designed to classify how good
a class was based on the score a student had on an exam and how quickly the student
could answer the given exam. The present work takes on the main ideas from this model
and applies them for the classification of Productivity based on five variables: Value,
Wages, Workforce, Days Worked, and Hours Worked.
2.5 Data Mining
We live in an age when information about our activities is being collected constantly.
The amount of data is so vast and varied that the traditional tools and ways of analyzing
such data are rapidly being surpassed in their capacity. It has simply become unfeasible
to meticulously look for patterns hidden within the mountains of data using traditional
statistics and specialized personnel [22]. Hence the need to devise artifacts and
systems that automate the extraction of hidden patterns within all the information
available to us.
With all the tools available for data mining, the task sometimes amounts to simply
connecting the output of one model to another using graphical tools [23]. Even so, the idea
remains the same. Data mining is about discovering new information hidden within the
data. It is the analysis phase within the process of Knowledge Discovery in Databases
(KDD). To that end, computer scientists have devised, and the computer industry has
implemented, various tools geared to ease the effort of analyzing data [22, 24]. In the
present work the focus is placed on neuro-fuzzy systems applied to data mining.
2.6 Neuro-Fuzzy Systems in Data Mining
The main uses of neuro-fuzzy systems in analytics are clustering, regression, and
classification[25, 26]. In clustering, the data is arranged in groups of similar items as
it is being processed. Clustering is therefore used to discover and learn unsuspected
associations within the data, and it mainly uses unsupervised learning. The goal of
a regression is to approximate a relation between two sets X and Y by mapping items
between X and Y. Regression generally uses supervised learning. Classification intends
to place the items of a data set within predefined classes based on an assessment of their
features. Classification uses supervised learning[25, 27]. The present work implements
a neuro-fuzzy classifier for the variable “Productivity”.
Chapter 3
Methodology
The present work requires an experiment in system design. In experimental computer
science [28] the process of experimentation in system design has four phases: Idea,
system design, concrete implementation, and experimental evaluation.
Figure 3.1: Experimentation in System Design
The following sections explain how the first cycle in the experimentation in system design
process will be accomplished for the present work.
3.1 Idea
The idea is to explore the feasibility of using a neuro-fuzzy system to classify the
productivity of the manufacturing sector in Mexico’s economy in order to facilitate its
interpretation. The INEGI BIE provides a convenient web-based interface to access the
available data. As of November 2013, up to 310,659 time series [2] on various topics of
Mexico’s economic information can be obtained through it. Given the diversity of the
available data, it is necessary to study it and focus on what is of interest for a given project.
For the present work, the target is a set of five time series belonging to the monthly
survey for the manufacturing sector. In the terms used in the BIE portal, all
of them belong to the theme route “Manufacturas > Encuesta mensual de la industria
manufacturera (EMIM)”. Once at that level, the series are found by following the given
routes:
1. Valor de ventas de los productos elaborados
2. Remuneraciones totales pagadas > Remuneraciones totales
3. Total de personal ocupado > Personal ocupado total
4. Días trabajados
5. Total de horas trabajadas > Total de horas trabajadas
For ease of reference, the following corresponding variables will be used for the rest of
this writing.
1. Value
2. Wages
3. Workforce
4. Days Worked
5. Hours Worked
The data for these variables is available at the BIE[2] by executing a query using the file
provided in Appendix B.1. The downloaded series range from January 2007 to June
2013 (Appendix B.2). Figures 1.1 through 1.5 display the corresponding graphs for the variables.
Superimposing the graphs yields the system visualization in figure 3.2. This system is
oversimplified. Even so, it already illustrates how complex the analysis of productivity
is.
3.2 System Design
In data mining, neuro-fuzzy systems are used for clustering, regression, and
classification[25, 26]. The present work uses a neuro-fuzzy system to classify whether
productivity is low, medium, or high for a given month based on the variables Value,
Wages, Workforce, Days Worked, and Hours Worked.
3.3 Concrete Implementation
The neuro-fuzzy system will be implemented in the Java programming language on top
of the Neuroph framework [29]. The implementation will consist of three classes:
NeuroFuzzyClassifier, NFCFactory, and MXMiner. The NeuroFuzzyClassifier class will
encapsulate a flexible implementation of a neuro-fuzzy classifier. The NFCFactory class
will be used to produce an instance of the NeuroFuzzyClassifier based on a
productivityNFC.properties file holding the features defined for the neuro-fuzzy classifier. The
MXMiner class will load the training set, train the system, and
input the data into the system.
3.4 Experimental Evaluation
The implemented system will be executed using the data in Appendix B.2. The results
obtained will tell us how good productivity was for a given month. Since the
implementation of the neuro-fuzzy system in the present work is a prototype, the results can
be improved. The accuracy of the results depends on the information provided to the
system during the learning process, which includes the fuzzy rule constructs and the
training set. The more accurate the information fed into the system, the more
accurate the results will be. The execution of the neuro-fuzzy system will conclude the first
cycle in experimentation in system design.
Chapter 4
Neuro-Fuzzy Classifying Productivity
4.1 Introduction
The most basic definition of Productivity[30] in a production system is:
$$\text{Productivity} = \frac{\text{Output}}{\text{Input}} \tag{4.1}$$
Definition of Productivity used in the present work
This is a simplified definition. Multifactor productivity involves one output and many
inputs[6, 7]. Still, for the present work, what needs to be understood
about productivity is that, by definition, it is directly proportional to the
output and inversely proportional to the inputs of a production system. Therefore a high
output tends to improve productivity and a high input tends to decrease productivity.
4.2 Variables
Chapters 1 and 3 discuss the information available at the INEGI BIE. The
chosen time series contain the data for the variables Value, Wages, Workforce, Days
Worked, and Hours Worked. In manufacturing, these variables can be classified as
follows:
Figure 4.1: Variables used as Inputs and Outputs for productivity
Figure 3.2 presents the visualization of the inputs and output for the manufacturing
sector in the Mexican economy. The question about productivity in such a system is then:
When is productivity low, medium, or high? The values low, medium, and high in this
question are the values of the fuzzy set for Productivity. The fuzzy sets for the other
variables are defined similarly. Thus the fuzzy sets for the system are:
Figure 4.2: Fuzzy Sets
The delimiters of the membership functions are defined from the
available human knowledge and are roughly based on the statistical quartiles of the
time series. Section 4.3 presents the membership functions for the system. The trapezoid
function is used for ease of implementation, since only four points are needed as the
delimiters of each function. The data for the variable Wages varies only slightly from
month to month, with the exception of December. Thus the delimiters for the membership
functions for Wages in figure 4.4 are closer to one another than those of the remaining
variables, with the exception of the membership functions for Days Worked. As is evident, the membership
functions for the number of Days Worked in figure 4.6 take a rectangular shape.
This is because the number of Days Worked in a given month is an integer. The
range for Days Worked in the data set is between 23 and 28, with most of the months
having 26 days worked.
4.3 Membership Functions
Table 4.1: Membership Functions
Figure 4.3: MF: Value
Figure 4.4: MF: Wages
Figure 4.5: MF: Workforce
Figure 4.6: MF: Days Worked
Figure 4.7: MF: Hours Worked
4.4 IF-THEN Constructs
As already established in equation 4.1, productivity is directly proportional to the
outputs and inversely proportional to the inputs in a production system. Based
on that knowledge, the antecedent parts of the fuzzy rules are constructed as shown in
figure 4.8. The consequent part of each rule will be learned by the neuro-fuzzy classifier
based on the desired output in the training set. See figure 5.14.
IF value IS high AND wages IS medium;\
IF value IS high AND wages IS low;\
IF value IS medium AND wages IS high;\
IF value IS medium AND wages IS medium;\
IF value IS medium AND wages IS low;\
IF value IS low AND wages IS high;\
IF value IS low AND wages IS medium;\
IF value IS low AND wages IS low;\
\
IF value IS high AND work_force IS high;\
IF value IS high AND work_force IS medium;\
IF value IS high AND work_force IS low;\
IF value IS medium AND work_force IS high;\
IF value IS medium AND work_force IS medium;\
IF value IS medium AND work_force IS low;\
IF value IS low AND work_force IS high;\
IF value IS low AND work_force IS medium;\
IF value IS low AND work_force IS low;\
\
IF value IS high AND days_worked IS high;\
IF value IS high AND days_worked IS medium;\
IF value IS high AND days_worked IS low;\
IF value IS medium AND days_worked IS high;\
IF value IS medium AND days_worked IS medium;\
IF value IS medium AND days_worked IS low;\
IF value IS low AND days_worked IS high;\
IF value IS low AND days_worked IS medium;\
IF value IS low AND days_worked IS low;\
\
IF value IS high AND hours_worked IS high;\
IF value IS high AND hours_worked IS medium;\
IF value IS high AND hours_worked IS low;\
IF value IS medium AND hours_worked IS high;\
IF value IS medium AND hours_worked IS medium;\
IF value IS medium AND hours_worked IS low;\
IF value IS low AND hours_worked IS high;\
IF value IS low AND hours_worked IS medium;\
IF value IS low AND hours_worked IS low;
Figure 4.8: Fuzzy Rules Antecedents
4.5 Neuro-Fuzzy System Design
Figure 4.9 is a pictorial representation of the neuro-fuzzy system topology.
The design implements the neuro-fuzzy classifier for productivity of the Mexican
manufacturing production system. In the production system, the inputs are Wages, Work
Force, Days Worked, and Hours Worked; the output is Value. The fuzzy antecedents are
implemented through the neuron connections between layers 2 and 3. The neural network
output is the classification of productivity based on the neural network input variables.
Figure 4.9: Productivity Neuro-Fuzzy Classifier Design
Chapter 5
MXMiner Neuro-Fuzzy Classifier
Figure 5.1: Timeline - Neuro-Fuzzy Classifier. (Timeline, 1940 to 2020: 1943 Artificial Neuron; 1958 The Perceptron; 1964 MLP; 1965 Fuzzy Sets; 1992 MLP, FS, Classification; 1993 ANFIS; 1996 NEFCLASS; 2006 NFR; 2013 MXMiner NFC.)
5.1 Introduction
The implementation of the system is based on the idea presented in the Neuro-fuzzy
Reasoner[21]. However, since the implementation for that work is particular to the
problem described in [21], it was necessary to implement a more flexible neuro-fuzzy classifier
using the artificial intelligence Java library Neuroph[29]. The system involves four main
files: productivityNFC.properties, NeuroFuzzyClassifier.java, mxminer.properties, and
MXMiner.java. The entire content of these files is included in Appendix A. What follows
is a brief explanation of the purpose of portions of those files.
5.2 Neuro-Fuzzy Classifier
5.2.1 productivityNFC.properties A.1
This file is a standard Java properties file. It provides the specifications for the neuro-
fuzzy system. The first 5 lines define the labels for the layers.
1 #Layer labels 4 needed.
2 nfc.layer.label.1=Input
3 nfc.layer.label.2=Fuzzification
4 nfc.layer.label.3=Rules
5 nfc.layer.label.4=Output
Figure 5.2: Layer Labels
Lines 8-9 define the names for the input and output variables. These names are
used throughout the rest of the property keys to define the fuzzy sets, membership function
delimiters, and fuzzy rule antecedents.
8 nfc.input_variables=value,wages,work_force,days_worked,hours_worked
9 nfc.output_variable=productivity
Figure 5.3: Variable Names
Lines 12-17 define the fuzzy sets for the variables. The variable names previously defined
are used within the key exactly as they were spelled earlier.
12 nfc.fz.value=low,medium,high
13 nfc.fz.wages=low,medium,high
14 nfc.fz.work_force=low,medium,high
15 nfc.fz.days_worked=low,medium,high
16 nfc.fz.hours_worked=low,medium,high
17 nfc.fz.productivity=low,medium,high
Figure 5.4: Fuzzy Sets
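The way these keys reference one another can be illustrated with a minimal sketch. It assumes only the key naming shown above (nfc.input_variables and nfc.fz.<variable>); the class name FuzzySetConfigSketch and its helper methods are hypothetical, and a plain String.split stands in for the CSV parser used in the actual implementation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper: resolves each input variable to its fuzzy set.
final class FuzzySetConfigSketch {

    // Splits a comma-separated property value into trimmed tokens.
    static List<String> csv(String value) {
        return Arrays.asList(value.trim().split("\\s*,\\s*"));
    }

    // Maps every variable listed in nfc.input_variables to the labels
    // found under the key nfc.fz.<variable>.
    static Map<String, List<String>> fuzzySets(Properties props) {
        Map<String, List<String>> sets = new LinkedHashMap<>();
        for (String variable : csv(props.getProperty("nfc.input_variables"))) {
            String fuzzySet = props.getProperty("nfc.fz." + variable);
            if (fuzzySet == null) {
                throw new IllegalArgumentException("No fuzzy set for " + variable);
            }
            sets.put(variable, csv(fuzzySet));
        }
        return sets;
    }

    // A two-variable excerpt of the properties file, loaded from a string.
    static Properties sample() {
        Properties props = new Properties();
        try {
            props.load(new StringReader(
                    "nfc.input_variables=value,wages\n"
                  + "nfc.fz.value=low,medium,high\n"
                  + "nfc.fz.wages=low,medium,high\n"));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        System.out.println(fuzzySets(sample()));
    }
}
```

A misspelled variable name in a key fails fast in this sketch, which reflects why the names must be spelled exactly as they were defined.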
Lines 20-38 define the membership function delimiters.
20 nfc.mf.delims.value.low=14275406,14697580,305349556,327562504
21 nfc.mf.delims.value.medium=305349556,327562504,375477803,397690751
22 nfc.mf.delims.value.high=375477803,397690751,47016501,47438675
23
24 nfc.mf.delims.wages.low=13690175,15282811,26213287,27805923
25 nfc.mf.delims.wages.medium=26213287,27805923,27720643,29313279
26 nfc.mf.delims.wages.high=27720643,29313279,26431270,28023906
27
28 nfc.mf.delims.work_force.low=1896401,1945143,3097305,3146047
29 nfc.mf.delims.work_force.medium=3097305,3146047,3224960,3273702
30 nfc.mf.delims.work_force.high=3224960,3273702,2286344,2335086
31
32 nfc.mf.delims.days_worked.low=13,13,26,26
33 nfc.mf.delims.days_worked.medium=26,26,27,27
34 nfc.mf.delims.days_worked.high=27,27,39,39
35
36 nfc.mf.delims.hours_worked.low=448819,465301,611156,627638
37 nfc.mf.delims.hours_worked.medium=611156,627638,641690,658172
38 nfc.mf.delims.hours_worked.high=641690,658172,780679,797161
Figure 5.5: MF Delimiters
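How four delimiters can describe a membership function is sketched below. This is a conventional trapezoid with corners a <= b <= c <= d, written under the assumption that the four values of a key are read in that order; it is not taken from the Neuroph Trapezoid class, whose parameter semantics may differ. The class and method names are hypothetical.

```java
// Hypothetical stand-alone trapezoidal membership function.
final class TrapezoidSketch {

    // Returns the membership degree of x for a trapezoid that rises
    // from a to b, stays at 1 from b to c, and falls from c to d.
    static double membership(double x, double a, double b, double c, double d) {
        if (x >= b && x <= c) {
            return 1.0;                 // plateau
        }
        if (x > a && x < b) {
            return (x - a) / (b - a);   // rising edge
        }
        if (x > c && x < d) {
            return (d - x) / (d - c);   // falling edge
        }
        return 0.0;                     // outside the trapezoid
    }

    public static void main(String[] args) {
        // Delimiters of nfc.mf.delims.hours_worked.low
        double a = 448819, b = 465301, c = 611156, d = 627638;
        System.out.println(membership(500000, a, b, c, d)); // on the plateau
        System.out.println(membership(619420, a, b, c, d)); // on the falling edge
        System.out.println(membership(700000, a, b, c, d)); // outside
    }
}
```

Because the edge branches use strict inequalities, degenerate edges such as the days_worked delimiters (13,13,26,26) never divide by zero in this sketch.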
Lines 42-81 define the fuzzy antecedents of the rules used in the system; Figure 5.6
shows lines 44-81. The consequent parts of the rules are learned by the system from the
training set.
44 IF value IS high AND wages IS medium;\
45 IF value IS high AND wages IS low;\
46 IF value IS medium AND wages IS high;\
47 IF value IS medium AND wages IS medium;\
48 IF value IS medium AND wages IS low;\
49 IF value IS low AND wages IS high;\
50 IF value IS low AND wages IS medium;\
51 IF value IS low AND wages IS low;\
52 \
53 IF value IS high AND work_force IS high;\
54 IF value IS high AND work_force IS medium;\
55 IF value IS high AND work_force IS low;\
56 IF value IS medium AND work_force IS high;\
57 IF value IS medium AND work_force IS medium;\
58 IF value IS medium AND work_force IS low;\
59 IF value IS low AND work_force IS high;\
60 IF value IS low AND work_force IS medium;\
61 IF value IS low AND work_force IS low;\
62 \
63 IF value IS high AND days_worked IS high;\
64 IF value IS high AND days_worked IS medium;\
65 IF value IS high AND days_worked IS low;\
66 IF value IS medium AND days_worked IS high;\
67 IF value IS medium AND days_worked IS medium;\
68 IF value IS medium AND days_worked IS low;\
69 IF value IS low AND days_worked IS high;\
70 IF value IS low AND days_worked IS medium;\
71 IF value IS low AND days_worked IS low;\
72 \
73 IF value IS high AND hours_worked IS high;\
74 IF value IS high AND hours_worked IS medium;\
75 IF value IS high AND hours_worked IS low;\
76 IF value IS medium AND hours_worked IS high;\
77 IF value IS medium AND hours_worked IS medium;\
78 IF value IS medium AND hours_worked IS low;\
79 IF value IS low AND hours_worked IS high;\
80 IF value IS low AND hours_worked IS medium;\
81 IF value IS low AND hours_worked IS low;
Figure 5.6: Fuzzy Rule Antecedents
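The antecedents follow a regular pattern: the variable value is paired with each of the other four inputs, and every combination of fuzzy values is listed. As an illustration only, the sketch below regenerates that list; the class RuleAntecedentSketch and its method are hypothetical and are not part of MXMiner, where the rules are written by hand in the properties file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generator for antecedents that pair one pivot variable
// with each of the remaining input variables.
final class RuleAntecedentSketch {

    static List<String> antecedents(String pivot, String[] others, String[] fuzzyValues) {
        List<String> rules = new ArrayList<>();
        for (String other : others) {
            for (String pivotValue : fuzzyValues) {
                for (String otherValue : fuzzyValues) {
                    rules.add("IF " + pivot + " IS " + pivotValue
                            + " AND " + other + " IS " + otherValue);
                }
            }
        }
        return rules;
    }

    public static void main(String[] args) {
        String[] others = {"wages", "work_force", "days_worked", "hours_worked"};
        String[] fuzzyValues = {"high", "medium", "low"};
        List<String> rules = antecedents("value", others, fuzzyValues);
        System.out.println(rules.size() + " antecedents, first: " + rules.get(0));
    }
}
```

With four paired variables and three fuzzy values each, the sketch yields 4 x 3 x 3 = 36 antecedents, matching the number of rules listed in Appendix A.1.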
5.2.2 NeuroFuzzyClassifier.java A.2
This file contains the class that implements a neuro-fuzzy classifier based on the
properties provided in the productivityNFC.properties file.
Lines 54-66 construct the input (first) layer.
 54  // Create Input Layer
 55  NeuronProperties neuronProperties = new NeuronProperties();
 56
 57  String[] inputVariables = (new CSVParser(new StringReader(
 58      props.getProperty("nfc.input_variables")),csvs)).getLine();
 59
 60  Layer inputLayer = LayerFactory.createLayer(inputVariables.length, neuronProperties);
 61  inputLayer.setLabel(props.getProperty("nfc.layer.label.1"));
 62
 63  for(i = 0 ; i < inputVariables.length; i++){
 64      inputLayer.getNeuronAt(i).setLabel(inputVariables[i]);
 65  }
 66  this.addLayer(inputLayer);
Figure 5.7: Input Layer
Lines 68-102 construct the fuzzification layer. In this layer each neuron represents a
value in the fuzzy sets for the input variables. Each input neuron is connected to as
many neurons as there are values in the corresponding variable’s fuzzy set. If a variable
has three fuzzy values in its fuzzy set, then the neuron representing the input variable
will be connected to three fuzzification neurons. The transfer function for the neuron
implements the membership function for the fuzzy value.
 68  //Create Fuzzification Layer
 69  neuronProperties.setProperty("transferFunction",TransferFunctionType.TRAPEZOID);
 70  Layer fuzzyLayer = LayerFactory.createLayer(0,neuronProperties);
 71  fuzzyLayer.setLabel(props.getProperty("nfc.layer.label.2"));
 72
 73  String[] fuzzySet, mfDelims;
 74  double[] delims = new double[4];
 75  Neuron[] inputNeurons = inputLayer.getNeurons();
 76  Neuron fuzzyNeuron;
 77  Trapezoid mf;
 78
 79  for(i = 0; i < inputNeurons.length; i++){
 80      fuzzySet = (new CSVParser(new StringReader(
 81          props.getProperty("nfc.fz."+inputNeurons[i].getLabel())),csvs)).getLine();
 82      for(String fuzzyValue : fuzzySet){
 83          mfDelims = (new CSVParser(new StringReader(
 84              props.getProperty("nfc.mf.delims."+inputVariables[i]+"."+fuzzyValue))
 85              ,csvs)).getLine();
 86          for(j = 0; j < delims.length; j++){
 87              delims[j] = Double.parseDouble(mfDelims[j]);
 88          }
 89          fuzzyNeuron = NeuronFactory.createNeuron(neuronProperties);
 90          fuzzyNeuron.setLabel(inputNeurons[i].getLabel()+"."+fuzzyValue);
 91
 92          mf = (Trapezoid) fuzzyNeuron.getTransferFunction();
 93          mf.setLeftLow(delims[0]);
 94          mf.setLeftHigh(delims[1]);
 95          mf.setRightLow(delims[2]);
 96          mf.setRightHigh(delims[3]);
 97          fuzzyLayer.addNeuron(fuzzyNeuron);
 98
 99          ConnectionFactory.createConnection(inputNeurons[i], fuzzyNeuron, 1);
100      }
101  }
102  this.addLayer(fuzzyLayer);
Figure 5.8: Fuzzification Layer
Lines 104-156 build the rules layer. The value of the nfc.fuzzy_rules key in the
productivityNFC.properties file is a string that uses the character “;” as the delimiter
between rule antecedents. The rule antecedents are implemented through the connections
from the fuzzification neurons to the neurons in this layer.
104  //Create Rules Layer
105  NeuronProperties ruleNeuronProperties = new NeuronProperties(Neuron.class,
106      WeightedSum.class, Linear.class);
107  Layer rulesLayer = LayerFactory.createLayer(0,ruleNeuronProperties);
108  rulesLayer.setLabel(props.getProperty("nfc.layer.label.3"));
109
110  CSVStrategy rcsvs = NeuroFuzzyClassifier.CSVS_RULES;
111  CSVStrategy rpcsvs = NeuroFuzzyClassifier.CSVS_RULE_PARSER;
112
113  //Rule tokens length must be 4x of the form:
114  //IF <input variable> IS <fuzzy value> {AND <input variable> IS <fuzzy value> }*;
115
116  String [] rules = (new CSVParser(new StringReader(
117      props.getProperty("nfc.fuzzy_rules")),rcsvs)).getLine();
118  String [] ruleTokens;
119  String antecedentLabel;
120  Neuron antecedent, ruleNeuron;
121  Neuron fuzzyNeurons[] = fuzzyLayer.getNeurons();
122  boolean found;
123
124  for(String rule : rules){
125      if(rule.length() == 0) continue;
126
127      ruleTokens =(new CSVParser(new StringReader(rule),rpcsvs)).getLine();
128      if(ruleTokens.length % 4 != 0){//The rule does not follow the required syntax
129          continue;
130      }
131
132      ruleNeuron = NeuronFactory.createNeuron(ruleNeuronProperties);
133      ruleNeuron.setLabel(rule);
134
135      for(i = 0; i < (ruleTokens.length); i = i + 4 ){
136          //Each neuron has a label <input variable>.<fuzzy value>
137          antecedentLabel = ruleTokens[i+1]+"."+ruleTokens[i+3];
138          j = 0;
139          found = false;
140          antecedent = null;
141          while(j < fuzzyNeurons.length && !found){
142              if(antecedentLabel.equals(fuzzyNeurons[j].getLabel())){
143                  found = true;
144                  antecedent = fuzzyNeurons[j];
145              }
146              j++;
147          }
148          if(found){
149              ConnectionFactory.createConnection(antecedent, ruleNeuron, 1);
150          }else{
151              System.out.println("Orphan rule neuron. Check rules.");
152          }
153      }
154      rulesLayer.addNeuron(ruleNeuron);
155  }
156  this.addLayer(rulesLayer);
Figure 5.9: Rules Layer
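The token arithmetic in lines 128-137 can be isolated in a small sketch. It assumes the rule syntax stated in the listing comment, where every group of four tokens has the form IF/AND, variable, IS, fuzzy value. The class RuleParserSketch is hypothetical and splits on whitespace instead of using the CSV parser of the listing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser that turns one rule antecedent into the labels of
// the fuzzification neurons it refers to.
final class RuleParserSketch {

    // Returns labels of the form <input variable>.<fuzzy value>, or an
    // empty list when the token count is not a multiple of four.
    static List<String> antecedentLabels(String rule) {
        String[] tokens = rule.trim().split("\\s+");
        List<String> labels = new ArrayList<>();
        if (tokens.length % 4 != 0) {
            return labels; // the rule does not follow the required syntax
        }
        for (int i = 0; i < tokens.length; i += 4) {
            // tokens[i] is IF or AND, tokens[i + 2] is IS
            labels.add(tokens[i + 1] + "." + tokens[i + 3]);
        }
        return labels;
    }

    public static void main(String[] args) {
        System.out.println(antecedentLabels("IF value IS high AND wages IS medium"));
    }
}
```

Each returned label corresponds to one connection from a fuzzification neuron to the rule neuron, so a two-part antecedent produces two incoming connections.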
Lines 158-173 create the output layer. The neurons in this layer represent the fuzzy set
for the output variable.
158  // create the output layer
159  neuronProperties = new NeuronProperties();
160  neuronProperties.setProperty("transferFunction", TransferFunctionType.LINEAR);
161  Layer outputLayer = LayerFactory.createLayer(0, neuronProperties);
162  outputLayer.setLabel(props.getProperty("nfc.layer.label.4"));
163
164  fuzzySet = (new CSVParser(new StringReader(
165      props.getProperty("nfc.fz."+props.getProperty("nfc.output_variable"))),
166      csvs)).getLine();
167  for(String fuzzyValue : fuzzySet){
168      fuzzyNeuron = NeuronFactory.createNeuron(neuronProperties);
169      fuzzyNeuron.setLabel(props.getProperty("nfc.output_variable")+"."+fuzzyValue);
170      outputLayer.addNeuron(fuzzyNeuron);
171  }
172  this.addLayer(outputLayer);
Figure 5.10: Output Layer
5.3 Execution
The actual execution of analysis is encoded within the files mxminer.properties and
MXMiner.java.
5.3.1 mxminer.properties A.3
The two most meaningful keys in this properties file are
mxminer.data.src.csv.training_set.filename and mxminer.data.src.csv.filename. The first
names the file used to train the neuro-fuzzy classifier and the second names the file
containing the data to be analyzed. The contents of the training set and the data are
listed in appendices B.3 and B.2.
#Training Data Set
mxminer.data.src.csv.training_set.filename=Training_Set.csv
#Data Source
mxminer.data.src.csv.filename=BIE_Manufacturing.csv

#Messages
mxminer.msg.info.start=Begin Execution.
mxminer.msg.info.finished=Finished execution.
mxminer.msg.info.training=Training classifier.
mxminer.msg.info.done_training=Done Training.
mxminer.msg.info.classifying=Classifying data set.
mxminer.msg.info.done_classifying=Done classifying data set.
Figure 5.11: Training and Data sets
5.3.2 MXMiner.java A.4
This Java program executes the neuro-fuzzy system training and the analysis of the
data. Lines 103-107 specify the parameters for the learning process within the neural
network. The network is set to use backpropagation [31] to correct the output error
as it processes the learning set in line 107. The learning function SigmoidDeltaRule [32]
is used in the system.
 75  NFCFactory factory = new NFCFactory();
 76  NeuroFuzzyClassifier pnfc = factory.createProductivityNFC();
 77
 78  double[] input = new double[pnfc.getInputNeurons().length];
 79  double[] output = new double[pnfc.getOutputNeurons().length];
 80
 81  CSVParser csvp = new CSVParser(new FileReader(
 82      MXMiner.PROPERTIES.getProperty("mxminer.data.src.csv.training_set.filename")));
 83  DataSet trainingSet = new DataSet(input.length, output.length);
 84
 85  System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.training"));
 86
 87  String[] pLine = csvp.getLine();
 88  while(pLine != null){
 89      date = pLine[0];
 90      for(i = 1; i < pLine.length - output.length; i++){
 91          input[i - 1] = Double.parseDouble(pLine[i]);
 92      }
 93
 94      for(i = input.length + 1; i < pLine.length; i++){
 95          output[i-(input.length + 1)] = Double.parseDouble(pLine[i]);
 96      }
 97
 98      MXMiner.printIO(date, input, output);
 99
100      trainingSet.addRow(new DataSetRow(input,output));
101      pLine = csvp.getLine();
102  }
103  SigmoidDeltaRule lr = new SigmoidDeltaRule();
104  lr.setMaxError(0.001);
105  lr.setMaxIterations(1000000);
106  pnfc.setLearningRule(lr);
107  pnfc.learn(trainingSet);
108
109  System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.done_training"));
110
111  System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.classifying"));
112  csvp = new CSVParser(new FileReader(
113      MXMiner.PROPERTIES.getProperty("mxminer.data.src.csv.filename")));
114  pLine = csvp.getLine();//First Line contains the headers.
115  pLine = csvp.getLine();
116
117
118  while(pLine != null){
119      date = pLine[0];
120      for(i = 1; i < pLine.length; i++){
121          input[i-1] = Double.parseDouble(pLine[i]);
122      }
123
124      pnfc.setInput(input);
125      pnfc.calculate();
126      output = pnfc.getOutput();
127      for(i=0; i < output.length; i++){
128          output[i] = round(output[i],0);
129      }
130      MXMiner.printIO(date, input, output);
131
132      pLine = csvp.getLine();
133  }
Figure 5.12: Execution
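To make the role of the maximum error and maximum iterations parameters concrete, the following sketch trains a single linear neuron with the plain least-mean-squares (Widrow-Hoff) update. It is a simplified stand-in and not the SigmoidDeltaRule implementation of Neuroph; the class DeltaRuleSketch, the learning rate of 0.5, and the toy data are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical single-neuron trainer using the LMS update
// w_i <- w_i + rate * (desired - output) * x_i.
final class DeltaRuleSketch {

    // Weighted sum of the inputs (linear transfer function).
    static double output(double[] weights, double[] x) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += weights[i] * x[i];
        }
        return sum;
    }

    // Applies one update and returns the error measured before it.
    static double update(double[] weights, double[] x, double desired, double rate) {
        double error = desired - output(weights, x);
        for (int i = 0; i < x.length; i++) {
            weights[i] += rate * error * x[i];
        }
        return error;
    }

    // Repeats passes over the data until the mean squared error of a pass
    // drops below maxError or maxIterations passes have been made.
    static int train(double[] weights, double[][] inputs, double[] desired,
                     double rate, double maxError, int maxIterations) {
        for (int iteration = 1; iteration <= maxIterations; iteration++) {
            double sumSquared = 0.0;
            for (int row = 0; row < inputs.length; row++) {
                double e = update(weights, inputs[row], desired[row], rate);
                sumSquared += e * e;
            }
            if (sumSquared / inputs.length < maxError) {
                return iteration;
            }
        }
        return maxIterations;
    }

    public static void main(String[] args) {
        // Toy data: two rule activations and the desired membership degree.
        double[][] firing = {{1.0, 0.0}, {0.0, 1.0}};
        double[] desired = {1.0, 0.0};
        double[] weights = new double[2];
        int iterations = train(weights, firing, desired, 0.5, 0.001, 1000000);
        System.out.println(iterations + " passes, weights " + Arrays.toString(weights));
    }
}
```

In this reading, the learned weights between a rule neuron and an output neuron play the part of the rule consequents.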
5.4 Training Set
The training set is built up from arbitrarily selected data points. In this experiment, the
desired output is determined by visual inspection. The blue lines in figure 5.14 represent
the data points used in the training set. The network learns the consequent part
of the fuzzy rules based on the desired output in this set. Therefore, the accuracy and
size of the training set have a direct impact on how well the network classifies a given
input set.
2007/02,263998702,24973475,3278289,26,619420,1,0,0
2007/03,310607965,27033069,3308320,28,685089,1,0,0
2007/08,308778193,26892918,3299133,28,688920,1,0,0
2007/10,310607965,27033069,3308320,28,685089,1,0,0
2008/02,298032052,26186693,3310715,25,641760,1,0,0
2009/02,271684225,24486493,3001100,25,557060,0,.90,.10
2009/07,298041996,25485422,2920772,27,603720,0,.20,.80
2010/02,313124956,24852226,2979297,24,564321,0,1,0
2010/05,338645819,26293934,3073525,26,612984,.30,.70,0
2011/03,391026808,29144214,3150472,26,655261,.60,.40,0
2011/10,412777215,28319083,3181799,27,648098,0,0,1
2012/01,406022459,28154321,3171539,26,639749,0,0,1
2012/06,441333819,30070146,3241250,26,659821,0,0,1
2013/06,441702297,30945010,3278432,26,650930,0,0,1
Figure 5.13: Training_Set.csv
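Each row of the training set holds a period label, five input values, and three desired output degrees. The sketch below shows one way to split such a row; it assumes exactly that column layout. The class TrainingRowSketch is hypothetical and is not part of MXMiner.

```java
// Hypothetical value class for one parsed training row.
final class TrainingRowSketch {
    final String date;
    final double[] input;
    final double[] desired;

    TrainingRowSketch(String date, double[] input, double[] desired) {
        this.date = date;
        this.input = input;
        this.desired = desired;
    }

    // Splits "date,in_1..in_n,out_1..out_m" into its three parts.
    static TrainingRowSketch parse(String line, int inputCount, int outputCount) {
        String[] fields = line.trim().split(",");
        if (fields.length != 1 + inputCount + outputCount) {
            throw new IllegalArgumentException("Unexpected column count: " + fields.length);
        }
        // Fresh arrays per row, so parsed rows never share storage.
        double[] input = new double[inputCount];
        double[] desired = new double[outputCount];
        for (int i = 0; i < inputCount; i++) {
            input[i] = Double.parseDouble(fields[1 + i]);
        }
        for (int i = 0; i < outputCount; i++) {
            desired[i] = Double.parseDouble(fields[1 + inputCount + i]);
        }
        return new TrainingRowSketch(fields[0], input, desired);
    }

    public static void main(String[] args) {
        TrainingRowSketch row =
                parse("2009/02,271684225,24486493,3001100,25,557060,0,.90,.10", 5, 3);
        System.out.println(row.date + " -> " + java.util.Arrays.toString(row.desired));
    }
}
```

Since the output neurons are created in the order given by nfc.fz.productivity, the three desired values would correspond to low, medium, and high in that order.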
5.5 Results
An execution of the Neuro-Fuzzy Classifier for Productivity based on the variables
Value, Wages, Work Force, Days Worked, and Hours Worked yields the output file in
Appendix B.4. The entire system is summarized in the following figure:
In the chart of figure 5.15, the numbered lines represent how strongly productivity
belongs to each of the elements in the productivity fuzzy set. From top to bottom, the
first line is the fuzzy value High, the second line is Medium, and the bottom
line is Low. Therefore, the output suggests that productivity was between medium
and high for the months of 2007/01 to 2008/02. In contrast, the months of 2013/03 to
2013/08 were high. Noteworthy is the output for the months of 2012/05 and 2012/06,
which suggests a strongly high productivity for that period. This system output is
consistent with the high ratio of outputs over inputs for the manufacturing sector for the
same period. Overall, the output of the neuro-fuzzy classifier constructed for the present
work is consistent with the expected results based on a visual inspection of the graphed
data. Thus it facilitates the classification of productivity for a given month. However,
the system can be improved in several ways, detailed in section 6.2.
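When a single label per month is wanted, one simple reading of the three output degrees is to report the fuzzy value with the largest degree. The sketch below illustrates that idea only; it is an assumption about how the output could be summarized, not a step performed by MXMiner, and the class name is hypothetical.

```java
// Hypothetical helper that picks the label with the largest membership degree.
final class DominantLabelSketch {

    static String dominant(String[] labels, double[] memberships) {
        if (labels.length == 0 || labels.length != memberships.length) {
            throw new IllegalArgumentException("labels and memberships must match");
        }
        int best = 0;
        for (int i = 1; i < memberships.length; i++) {
            if (memberships[i] > memberships[best]) {
                best = i; // ties keep the earlier label
            }
        }
        return labels[best];
    }

    public static void main(String[] args) {
        String[] labels = {"low", "medium", "high"};
        System.out.println(dominant(labels, new double[]{0.0, 0.2, 0.8}));
    }
}
```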
Chapter 6
Conclusions
In the field of information technology there is always a problem to solve and many
ways to solve it. The present work presents the implementation of a neuro-fuzzy
classifier that classifies productivity as High, Medium, or Low.
6.1 Achievements
The two main contributions of the present work are the NeuroFuzzyClassifier
implementation and the fuzzy classification of manufacturing productivity. The
NeuroFuzzyClassifier is an implementation of a feedforward neuro-fuzzy classifier based
on the neuro-fuzzy reasoner [21]. Compared to the neuro-fuzzy reasoner, the neuro-fuzzy
classifier increases flexibility in implementation and allows for a wider range of
classification applications. The proposed classification of productivity based on
many inputs is a subject to be researched further. With that in mind, a neuro-fuzzy
system is presented as an approach to classify productivity in complex multi-factor
systems.
6.2 Further Work
6.2.1 Neuro-Fuzzy rules
One of the ways to infuse pre-existing knowledge into a neuro-fuzzy system is through
the definition of the fuzzy rules. Better constructed fuzzy rules will yield better results.
To that end, an individual with vast understanding of the intricacies of the system to
be automated will define better fuzzy rules and contribute to its accuracy.
6.2.2 Training set
As with the fuzzy rules, the chosen training set is another way of embedding
knowledge into the neuro-fuzzy system. The desired output is used to learn the
consequent part of the fuzzy rules. Thus, if the training set contains precise information,
the system will be able to cope better with the data it analyzes. The available data set
has only 81 items, of which a subset of 14 items was used for training. If more
data is made available, a bigger training set can be selected to better train the
neuro-fuzzy classifier.
6.2.3 Output
In its present state, the neuro-fuzzy classifier outputs numbers less than 0 and greater
than 1. However the degree of membership to a fuzzy element is a value between 0 and
1 inclusive [13]. Therefore it is necessary to find a way to normalize the output.
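One simple candidate, shown here only as a sketch, is to clamp every output into the interval [0, 1]. Whether clamping is the most appropriate normalization for this classifier is left open; the class and method names are hypothetical.

```java
// Hypothetical post-processing step that forces outputs into [0, 1].
final class OutputNormalizationSketch {

    // Returns a new array; values below 0 become 0 and values above 1 become 1.
    static double[] clamp(double[] raw) {
        double[] result = new double[raw.length];
        for (int i = 0; i < raw.length; i++) {
            result[i] = Math.max(0.0, Math.min(1.0, raw[i]));
        }
        return result;
    }

    public static void main(String[] args) {
        double[] clamped = clamp(new double[]{-0.2, 0.4, 1.3});
        System.out.println(java.util.Arrays.toString(clamped));
    }
}
```

Clamping discards how far outside the interval a value was; a smooth squashing function would be an alternative to evaluate.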
6.2.4 Automated Data Extraction
Given today's advances in information technology, a variety of tools is available
to process information faster. One of those tools is web services.
The INEGI BIE publishes a web services descriptor. When this descriptor is used to
generate a web services client, the classes in Appendix C are created. Those classes
come in handy when automating the extraction, transformation, and loading of the
information available in the BIE. However, due to the scope of the experiment and time
constraints, for the present work it was necessary to download the data using the web
GUI instead of constructing a web services client and executing an extract-transform-load
process automatically.
6.3 Closing Remarks
Throughout the present work, the goal is to explore the use of existing tools geared
towards analytics. Neuro-fuzzy systems are the focus of this work because of their
adaptability and learning capabilities. Neuro-fuzzy systems present a good opportunity to
analyze data using the way humans express quantities and to use the learning capacity of
neural networks to store information and adapt it to their purposes.
Given how flexible neuro-fuzzy systems are, there may be a sea of applications
waiting to be discovered. For the time being, an application of the neuro-fuzzy system to
facilitate the classification of productivity has been presented.
Appendix A
Implementation
A.1 productivityNFC.properties
1 #Layer labels 4 needed.
2 nfc.layer.label.1=Input
3 nfc.layer.label.2=Fuzzification
4 nfc.layer.label.3=Rules
5 nfc.layer.label.4=Output
6
7 #Data Variables.
8 nfc.input_variables=value,wages,work_force,days_worked,hours_worked
9 nfc.output_variable=productivity
10
11 #Fuzzy Sets for each variable.
12 nfc.fz.value=low,medium,high
13 nfc.fz.wages=low,medium,high
14 nfc.fz.work_force=low,medium,high
15 nfc.fz.days_worked=low,medium,high
16 nfc.fz.hours_worked=low,medium,high
17 nfc.fz.productivity=low,medium,high
18
19 #Membership functions delimiters
20 nfc.mf.delims.value.low=14275406,14697580,305349556,327562504
21 nfc.mf.delims.value.medium=305349556,327562504,375477803,397690751
22 nfc.mf.delims.value.high=375477803,397690751,47016501,47438675
23
24 nfc.mf.delims.wages.low=13690175,15282811,26213287,27805923
25 nfc.mf.delims.wages.medium=26213287,27805923,27720643,29313279
26 nfc.mf.delims.wages.high=27720643,29313279,26431270,28023906
27
28 nfc.mf.delims.work_force.low=1896401,1945143,3097305,3146047
29 nfc.mf.delims.work_force.medium=3097305,3146047,3224960,3273702
30 nfc.mf.delims.work_force.high=3224960,3273702,2286344,2335086
31
32 nfc.mf.delims.days_worked.low=13,13,26,26
33 nfc.mf.delims.days_worked.medium=26,26,27,27
34 nfc.mf.delims.days_worked.high=27,27,39,39
35
36 nfc.mf.delims.hours_worked.low=448819,465301,611156,627638
37 nfc.mf.delims.hours_worked.medium=611156,627638,641690,658172
38 nfc.mf.delims.hours_worked.high=641690,658172,780679,797161
39
40 #Fuzzy rules of the form:
41 #IF <input variable> IS <fuzzy value> {AND <input variable> IS <fuzzy value> }*;
42 nfc.fuzzy_rules=\
43 IF value IS high AND wages IS high;\
44 IF value IS high AND wages IS medium;\
45 IF value IS high AND wages IS low;\
46 IF value IS medium AND wages IS high;\
47 IF value IS medium AND wages IS medium;\
48 IF value IS medium AND wages IS low;\
49 IF value IS low AND wages IS high;\
50 IF value IS low AND wages IS medium;\
51 IF value IS low AND wages IS low;\
52 \
53 IF value IS high AND work_force IS high;\
54 IF value IS high AND work_force IS medium;\
55 IF value IS high AND work_force IS low;\
56 IF value IS medium AND work_force IS high;\
57 IF value IS medium AND work_force IS medium;\
58 IF value IS medium AND work_force IS low;\
59 IF value IS low AND work_force IS high;\
60 IF value IS low AND work_force IS medium;\
61 IF value IS low AND work_force IS low;\
62 \
63 IF value IS high AND days_worked IS high;\
64 IF value IS high AND days_worked IS medium;\
65 IF value IS high AND days_worked IS low;\
66 IF value IS medium AND days_worked IS high;\
67 IF value IS medium AND days_worked IS medium;\
68 IF value IS medium AND days_worked IS low;\
69 IF value IS low AND days_worked IS high;\
70 IF value IS low AND days_worked IS medium;\
71 IF value IS low AND days_worked IS low;\
72 \
73 IF value IS high AND hours_worked IS high;\
74 IF value IS high AND hours_worked IS medium;\
75 IF value IS high AND hours_worked IS low;\
76 IF value IS medium AND hours_worked IS high;\
77 IF value IS medium AND hours_worked IS medium;\
78 IF value IS medium AND hours_worked IS low;\
79 IF value IS low AND hours_worked IS high;\
80 IF value IS low AND hours_worked IS medium;\
81 IF value IS low AND hours_worked IS low;
A.2 NeuroFuzzyClassifier.java
  1  package edu.mxminer;
  2
  3  import java.io.StringReader;
  4  import java.util.Properties;
  5
  6  import org.apache.commons.csv.CSVParser;
  7  import org.apache.commons.csv.CSVStrategy;
  8
  9  import org.neuroph.core.Layer;
 10  import org.neuroph.core.NeuralNetwork;
 11  import org.neuroph.core.Neuron;
 12  import org.neuroph.core.input.WeightedSum;
 13  import org.neuroph.core.transfer.Linear;
 14  import org.neuroph.core.transfer.Trapezoid;
 15  import org.neuroph.nnet.learning.LMS;
 16  import org.neuroph.util.ConnectionFactory;
 17  import org.neuroph.util.LayerFactory;
 18  import org.neuroph.util.NeuralNetworkFactory;
 19  import org.neuroph.util.NeuralNetworkType;
 20  import org.neuroph.util.NeuronFactory;
 21  import org.neuroph.util.NeuronProperties;
 22  import org.neuroph.util.TransferFunctionType;
 23  /**
 24   * This class constructs a Neuro-Fuzzy Classifier based on the properties defined
 25   * in a file following the Java properties file standard. An example can be found
 26   * in the file productivityNFC.properties
 27   *
 28   * @author [email protected]
 29   */
 30  public class NeuroFuzzyClassifier extends NeuralNetwork <LMS>{
 31      private static final long serialVersionUID = 1L;
 32
 33      /**
 34       * CSV Strategies for parsing CSV Strings
 35       */
 36      private static final CSVStrategy CSVS_DEFAULT = new CSVStrategy(',','"','#');
 37      private static final CSVStrategy CSVS_RULES = new CSVStrategy(';','"','#');
 38      private static final CSVStrategy CSVS_RULE_PARSER = new CSVStrategy(' ','"','#');
 39
 40      /**
 41       * Builds up the NeuralNetwork containing the Neuro-Fuzzy system to be used
 42       * as a neuro-fuzzy classifier based on the properties provided.
 43       * @param props properties containing the specifications for the neuro-fuzzy
 44       * system.
 45       */
 46      public void loadFromProperties(Properties props){
 47          int i, j;//iterator variables.
 48          try{
 49              CSVStrategy csvs = NeuroFuzzyClassifier.CSVS_DEFAULT;
 50
 51              // set network type
 52              this.setNetworkType(NeuralNetworkType.MULTI_LAYER_PERCEPTRON);
 53
 54              // Create Input Layer
 55              NeuronProperties neuronProperties = new NeuronProperties();
 56
 57              String[] inputVariables = (new CSVParser(new StringReader(
 58                  props.getProperty("nfc.input_variables")),csvs)).getLine();
 59
 60              Layer inputLayer = LayerFactory.createLayer(inputVariables.length, neuronProperties);
 61              inputLayer.setLabel(props.getProperty("nfc.layer.label.1"));
 62
 63              for(i = 0 ; i < inputVariables.length; i++){
 64                  inputLayer.getNeuronAt(i).setLabel(inputVariables[i]);
 65              }
 66              this.addLayer(inputLayer);
 67
 68              //Create Fuzzification Layer
 69              neuronProperties.setProperty("transferFunction",TransferFunctionType.TRAPEZOID);
 70              Layer fuzzyLayer = LayerFactory.createLayer(0,neuronProperties);
 71              fuzzyLayer.setLabel(props.getProperty("nfc.layer.label.2"));
 72
 73              String[] fuzzySet, mfDelims;
 74              double[] delims = new double[4];
 75              Neuron[] inputNeurons = inputLayer.getNeurons();
 76              Neuron fuzzyNeuron;
 77              Trapezoid mf;
 78
 79              for(i = 0; i < inputNeurons.length; i++){
 80                  fuzzySet = (new CSVParser(new StringReader(
 81                      props.getProperty("nfc.fz."+inputNeurons[i].getLabel())),csvs)).getLine();
 82                  for(String fuzzyValue : fuzzySet){
 83                      mfDelims = (new CSVParser(new StringReader(
 84                          props.getProperty("nfc.mf.delims."+inputVariables[i]+"."+fuzzyValue))
 85                          ,csvs)).getLine();
 86                      for(j = 0; j < delims.length; j++){
 87                          delims[j] = Double.parseDouble(mfDelims[j]);
 88                      }
 89                      fuzzyNeuron = NeuronFactory.createNeuron(neuronProperties);
 90                      fuzzyNeuron.setLabel(inputNeurons[i].getLabel()+"."+fuzzyValue);
 91
 92                      mf = (Trapezoid) fuzzyNeuron.getTransferFunction();
 93                      mf.setLeftLow(delims[0]);
 94                      mf.setLeftHigh(delims[1]);
 95                      mf.setRightLow(delims[2]);
 96                      mf.setRightHigh(delims[3]);
 97                      fuzzyLayer.addNeuron(fuzzyNeuron);
 98
 99                      ConnectionFactory.createConnection(inputNeurons[i], fuzzyNeuron, 1);
100                  }
101              }
102              this.addLayer(fuzzyLayer);
103
104              //Create Rules Layer
105              NeuronProperties ruleNeuronProperties = new NeuronProperties(Neuron.class,
106                  WeightedSum.class, Linear.class);
107              Layer rulesLayer = LayerFactory.createLayer(0,ruleNeuronProperties);
108              rulesLayer.setLabel(props.getProperty("nfc.layer.label.3"));
109
110              CSVStrategy rcsvs = NeuroFuzzyClassifier.CSVS_RULES;
111              CSVStrategy rpcsvs = NeuroFuzzyClassifier.CSVS_RULE_PARSER;
112
113              //Rule tokens length must be 4x of the form:
114              //IF <input variable> IS <fuzzy value> {AND <input variable> IS <fuzzy value> }*;
115
116              String [] rules = (new CSVParser(new StringReader(
117                  props.getProperty("nfc.fuzzy_rules")),rcsvs)).getLine();
118              String [] ruleTokens;
119              String antecedentLabel;
120              Neuron antecedent, ruleNeuron;
121              Neuron fuzzyNeurons[] = fuzzyLayer.getNeurons();
122              boolean found;
123
124              for(String rule : rules){
125                  if(rule.length() == 0) continue;
126
127                  ruleTokens =(new CSVParser(new StringReader(rule),rpcsvs)).getLine();
128                  if(ruleTokens.length % 4 != 0){//The rule does not follow the required syntax
129                      continue;
130                  }
131
132                  ruleNeuron = NeuronFactory.createNeuron(ruleNeuronProperties);
133                  ruleNeuron.setLabel(rule);
134
135                  for(i = 0; i < (ruleTokens.length); i = i + 4 ){
136                      //Each neuron has a label <input variable>.<fuzzy value>
137                      antecedentLabel = ruleTokens[i+1]+"."+ruleTokens[i+3];
138                      j = 0;
139                      found = false;
140                      antecedent = null;
141                      while(j < fuzzyNeurons.length && !found){
142                          if(antecedentLabel.equals(fuzzyNeurons[j].getLabel())){
143                              found = true;
144                              antecedent = fuzzyNeurons[j];
145                          }
146                          j++;
147                      }
148                      if(found){
149                          ConnectionFactory.createConnection(antecedent, ruleNeuron, 1);
150                      }else{
151                          System.out.println("Orphan rule neuron. Check rules.");
152                      }
153                  }
154                  rulesLayer.addNeuron(ruleNeuron);
155              }
156              this.addLayer(rulesLayer);
157
158              // create the output layer
159              neuronProperties = new NeuronProperties();
160              neuronProperties.setProperty("transferFunction", TransferFunctionType.LINEAR);
161              Layer outputLayer = LayerFactory.createLayer(0, neuronProperties);
162              outputLayer.setLabel(props.getProperty("nfc.layer.label.4"));
163
164              fuzzySet = (new CSVParser(new StringReader(
165                  props.getProperty("nfc.fz."+props.getProperty("nfc.output_variable"))),
166                  csvs)).getLine();
167              for(String fuzzyValue : fuzzySet){
168                  fuzzyNeuron = NeuronFactory.createNeuron(neuronProperties);
169                  fuzzyNeuron.setLabel(props.getProperty("nfc.output_variable")+"."+fuzzyValue);
170                  outputLayer.addNeuron(fuzzyNeuron);
171              }
172              this.addLayer(outputLayer);
173
174              ConnectionFactory.fullConnect(rulesLayer, outputLayer);
175              NeuralNetworkFactory.setDefaultIO(this);
176              this.setLearningRule(new LMS());
177          }catch(Exception e){
178              e.printStackTrace();
179          }
180      }
181  }
A.3 mxminer.properties
#Training Data Set
mxminer.data.src.csv.training_set.filename=Training_Set.csv
#Data Source
mxminer.data.src.csv.filename=BIE_Manufacturing.csv

#Messages
mxminer.msg.info.start=Begin Execution.
mxminer.msg.info.finished=Finished execution.
mxminer.msg.info.training=Training classifier.
mxminer.msg.info.done_training=Done Training.
mxminer.msg.info.classifying=Classifying data set.
mxminer.msg.info.done_classifying=Done classifying data set.
A.4 MXMiner.java
  1  package edu.mxminer;
  2
  3  import java.io.FileInputStream;
  4  import java.io.FileReader;
  5  import java.math.BigDecimal;
  6  import java.util.Properties;
  7
  8  import org.apache.commons.csv.CSVParser;
  9  import org.neuroph.core.data.DataSet;
 10  import org.neuroph.core.data.DataSetRow;
 11  import org.neuroph.nnet.learning.*;
 12  /**
 13   * This class instantiates a NeuroFuzzyClassifier trains it based on the
 14   * mxminer.data.src.csv.training_set.filename. Then analyzes the data in the
 15   * mxminer.data.src.csv.filename.
 16   *
 17   * @author [email protected]
 18   */
 19  public class MXMiner{
 20      /**
 21       * Properties filename.
 22       */
 23      private static final String PROPERTIES_FILE_NAME = "mxminer.properties";
 24
 25      /**
 26       * Application Properties.
 27       */
 28      private static final Properties PROPERTIES;
 29
 30      /**
 31       * Static initialization.
 32       */
 33      static{
 34          PROPERTIES = new Properties();
 35          try{
 36              PROPERTIES.load(new FileInputStream(MXMiner.PROPERTIES_FILE_NAME));
 37          }catch(Exception e){
 38              e.printStackTrace();
 39          }
 40      }
 41
 42      private static void printIO(String date, double[] input, double[] output){
 43          StringBuilder sb = new StringBuilder();
 44          sb.append(date).append(",");
 45          sb.append(arrayToStringBuilder(input));
 46          sb.append(",").append(arrayToStringBuilder(output));
 47          System.out.println(sb.toString());
 48      }
 49
 50      private static StringBuilder arrayToStringBuilder(double[] array){
 51          StringBuilder sb = new StringBuilder();
 52          if(array != null && array.length > 0){
 53              for(int i = 0; i < array.length-1; i++){
 54                  sb.append(array[i]).append(",");
 55              }
 56              sb.append(array[array.length-1]);
 57          }
 58          return sb;
 59      }
 60
 61      public static double round(double value, int places) {
 62          if (places < 0) throw new IllegalArgumentException();
 63
 64          BigDecimal bd = new BigDecimal(value);
 65          bd = bd.setScale(places, BigDecimal.ROUND_HALF_UP);
 66          return bd.doubleValue();
 67      }
 68
 69      public static void main(String[] args ){
 70          System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.start"));
 71
 72          try {
 73              int i;
 74              String date;
 75              NFCFactory factory = new NFCFactory();
 76              NeuroFuzzyClassifier pnfc = factory.createProductivityNFC();
 77
 78              double[] input = new double[pnfc.getInputNeurons().length];
 79              double[] output = new double[pnfc.getOutputNeurons().length];
 80
 81              CSVParser csvp = new CSVParser(new FileReader(
 82                  MXMiner.PROPERTIES.getProperty("mxminer.data.src.csv.training_set.filename")));
 83              DataSet trainingSet = new DataSet(input.length, output.length);
 84
 85              System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.training"));
 86
 87              String[] pLine = csvp.getLine();
 88              while(pLine != null){
 89                  date = pLine[0];
 90                  for(i = 1; i < pLine.length - output.length; i++){
 91                      input[i - 1] = Double.parseDouble(pLine[i]);
 92                  }
 93
 94                  for(i = input.length + 1; i < pLine.length; i++){
 95                      output[i-(input.length + 1)] = Double.parseDouble(pLine[i]);
 96                  }
 97
 98                  MXMiner.printIO(date, input, output);
 99
100                  trainingSet.addRow(new DataSetRow(input,output));
101                  pLine = csvp.getLine();
102              }
103              SigmoidDeltaRule lr = new SigmoidDeltaRule();
104              lr.setMaxError(0.001);
105              lr.setMaxIterations(1000000);
106              pnfc.setLearningRule(lr);
107              pnfc.learn(trainingSet);
108
109              System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.done_training"));
110
111              System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.classifying"));
112              csvp = new CSVParser(new FileReader(
113                  MXMiner.PROPERTIES.getProperty("mxminer.data.src.csv.filename")));
114              pLine = csvp.getLine();//First Line contains the headers.
115              pLine = csvp.getLine();
116
117
118              while(pLine != null){
119                  date = pLine[0];
120                  for(i = 1; i < pLine.length; i++){
121                      input[i-1] = Double.parseDouble(pLine[i]);
122                  }
123
124                  pnfc.setInput(input);
125                  pnfc.calculate();
126                  output = pnfc.getOutput();
127                  for(i=0; i < output.length; i++){
128                      output[i] = round(output[i],0);
129                  }
130                  MXMiner.printIO(date, input, output);
131
132                  pLine = csvp.getLine();
133              }
134
135              System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.done_classifying"));
136
137          } catch (Exception e) {
138              e.printStackTrace();
139          }
140          System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.finished"));
141      }
142  }
Appendix B
Data Sets
B.1 BIE c20131110114800.txt
2007
2013
,222030,218898,228990,227946,225162,
a
ap
v
Todo
False
False
B.2 BIE Manufacturing.csv
"Period","Value","Wages","Work Force","Days worked","Hours worked"
"2007/01",265871675,25172106,3291091,28,656750
"2007/02",263998702,24973475,3278289,26,619420
"2007/03",300636026,26716185,3286320,28,673400
"2007/04",281769622,26054783,3281394,27,629605
"2007/05",305278703,26737916,3296105,28,671114
"2007/06",305562113,26390991,3275204,28,659151
"2007/07",288834753,25376873,3277529,28,654426
"2007/08",308778193,26892918,3299133,28,688920
"2007/09",292154844,25864843,3300073,27,654324
"2007/10",310607965,27033069,3308320,28,685089
"2007/11",300047312,26471132,3299723,27,659540
"2007/12",277996670,33090879,3277176,28,609577
"2008/01",300751930,26293068,3305799,27,665606
"2008/02",298032052,26186693,3310715,25,641760
"2008/03",306349057,26957981,3307268,27,642214
"2008/04",325122721,27562404,3298995,26,668173
"2008/05",327394287,27230048,3299519,27,667678
"2008/06",328782141,27082204,3269197,26,655125
"2008/07",326554448,26914270,3264595,27,674329
"2008/08",320795490,26615340,3260806,27,657112
"2008/09",314156993,26822137,3231608,26,646811
"2008/10",343472740,27714549,3201666,27,673950
"2008/11",310335067,26066483,3159720,26,614950
"2008/12",290632233,32915736,3093774,26,572181
"2009/01",268634708,25244248,3046084,27,592693
"2009/02",271684225,24486493,3001100,25,557060
"2009/03",299988787,25789790,2965733,26,587142
"2009/04",281915496,25621929,2943663,26,574345
"2009/05",279742797,24796980,2934755,27,566775
"2009/06",287433371,25057454,2926165,26,582412
"2009/07",298041996,25485422,2920772,27,603720
"2009/08",302537145,24784922,2934465,26,586886
"2009/09",310531110,25431704,2959633,26,594593
"2009/10",333414297,26056372,2981907,26,623113
"2009/11",320418549,25301422,2998887,26,589032
"2009/12",318664456,33362030,2987857,26,575759
"2010/01",309015381,25055729,2966063,26,585225
"2010/02",313124956,24852226,2979297,24,564321
"2010/03",345307016,27335872,3002055,27,617803
"2010/04",326811124,26525066,3033240,26,607579
"2010/05",338645819,26293934,3073525,26,612984
"2010/06",352534600,26896039,3068929,26,626006
"2010/07",339512327,26886968,3095700,27,642604
"2010/08",355259052,26909978,3114016,27,634300
"2010/09",348683606,27405315,3125982,26,637040
"2010/10",352331814,27126456,3139467,27,642096
"2010/11",350903744,27028506,3129807,26,622654
"2010/12",344398267,34855546,3112302,26,604328
"2011/01",345958741,26728127,3102121,26,619655
"2011/02",341518120,26576823,3120594,24,592717
"2011/03",391026808,29144214,3150472,26,655261
"2011/04",357781166,27731620,3162669,25,614049
"2011/05",376677787,27743063,3166547,27,638615
"2011/06",381902897,28734739,3172268,26,653228
"2011/07",373437027,27676239,3167570,27,638125
"2011/08",393696126,28698604,3171124,27,655763
"2011/09",396569509,28802392,3184327,26,648700
"2011/10",412777215,28319083,3181799,27,648098
"2011/11",411275503,28635623,3179145,26,635245
"2011/12",396355597,36385165,3158912,26,610546
"2012/01",406022459,28154321,3171539,26,639749
"2012/02",400622565,28683264,3180790,25,622079
"2012/03",432107430,30517276,3200542,26,664776
"2012/04",401091778,29262928,3217021,26,624070
"2012/05",432941650,30176955,3237353,26,667178
"2012/06",441333819,30070146,3241250,26,659821
"2012/07",423222610,29820748,3236974,27,657387
"2012/08",429542249,30580464,3241400,27,682338
"2012/09",406021015,29387218,3253697,25,645822
"2012/10",434966750,29942844,3257610,27,679645
"2012/11",428907802,30455156,3257521,26,659425
"2012/12",398068950,37227588,3245709,26,603840
"2013/01",416846561,30336538,3238170,26,658433
"2013/02",398064498,29330063,3249984,24,617761
"2013/03",415569568,30867138,3264899,26,640296
"2013/04",423175878,30751826,3281684,26,660947
"2013/05",430993977,31792104,3291010,26,679211
"2013/06",441702297,30945010,3278432,26,650930
"2013/07",436312913,31636887,3291829,27,675163
"2013/08",447308814,31560790,3299679,26,685197
B.3 Training Set.csv
2007/02,263998702,24973475,3278289,26,619420,1,0,0
2007/03,310607965,27033069,3308320,28,685089,1,0,0
2007/08,308778193,26892918,3299133,28,688920,1,0,0
2007/10,310607965,27033069,3308320,28,685089,1,0,0
2008/02,298032052,26186693,3310715,25,641760,1,0,0
2009/02,271684225,24486493,3001100,25,557060,0,.90,.10
2009/07,298041996,25485422,2920772,27,603720,0,.20,.80
2010/02,313124956,24852226,2979297,24,564321,0,1,0
2010/05,338645819,26293934,3073525,26,612984,.30,.70,0
2011/03,391026808,29144214,3150472,26,655261,.60,.40,0
2011/10,412777215,28319083,3181799,27,648098,0,0,1
2012/01,406022459,28154321,3171539,26,639749,0,0,1
2012/06,441333819,30070146,3241250,26,659821,0,0,1
2013/06,441702297,30945010,3278432,26,650930,0,0,1
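Each row of the training set holds a period label, five input values, and three target membership degrees. The following is a minimal sketch of how such a row could be split, assuming exactly that column layout. The class and method names (TrainingRowSketch, parseInputs, parseTargets) are hypothetical and are not part of the thesis code, which reads the file through its own CSVParser.

```java
import java.util.Arrays;

final class TrainingRowSketch {
    // Assumed layout, taken from the rows listed in B.3:
    // period, 5 input values, 3 target membership degrees.
    static final int INPUTS = 5;
    static final int TARGETS = 3;

    /** Returns the five input values that follow the period label. */
    static double[] parseInputs(String line) {
        String[] fields = line.split(",");
        double[] in = new double[INPUTS];
        for (int i = 0; i < INPUTS; i++) {
            in[i] = Double.parseDouble(fields[i + 1].trim());
        }
        return in;
    }

    /** Returns the three membership degrees at the end of the row. */
    static double[] parseTargets(String line) {
        String[] fields = line.split(",");
        double[] out = new double[TARGETS];
        for (int i = 0; i < TARGETS; i++) {
            out[i] = Double.parseDouble(fields[INPUTS + 1 + i].trim());
        }
        return out;
    }

    public static void main(String[] args) {
        String row = "2009/02,271684225,24486493,3001100,25,557060,0,.90,.10";
        System.out.println(Arrays.toString(parseInputs(row)));
        System.out.println(Arrays.toString(parseTargets(row)));
    }
}
```

Returning a fresh array from every call keeps the rows independent of one another, which matters when they are later collected into a single training set.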
B.4 Sample Run Results
Begin Execution.
Training classifier.
2007/02,2.63998702E8,2.4973475E7,3278289.0,26.0,619420.0,1.0,0.0,0.0
2007/03,3.10607965E8,2.7033069E7,3308320.0,28.0,685089.0,1.0,0.0,0.0
2007/08,3.08778193E8,2.6892918E7,3299133.0,28.0,688920.0,1.0,0.0,0.0
2007/10,3.10607965E8,2.7033069E7,3308320.0,28.0,685089.0,1.0,0.0,0.0
2008/02,2.98032052E8,2.6186693E7,3310715.0,25.0,641760.0,1.0,0.0,0.0
2009/02,2.71684225E8,2.4486493E7,3001100.0,25.0,557060.0,0.0,0.9,0.1
2009/07,2.98041996E8,2.5485422E7,2920772.0,27.0,603720.0,0.0,0.2,0.8
2010/02,3.13124956E8,2.4852226E7,2979297.0,24.0,564321.0,0.0,1.0,0.0
2010/05,3.38645819E8,2.6293934E7,3073525.0,26.0,612984.0,0.3,0.7,0.0
2011/03,3.91026808E8,2.9144214E7,3150472.0,26.0,655261.0,0.6,0.4,0.0
2011/10,4.12777215E8,2.8319083E7,3181799.0,27.0,648098.0,0.0,0.0,1.0
2012/01,4.06022459E8,2.8154321E7,3171539.0,26.0,639749.0,0.0,0.0,1.0
2012/06,4.41333819E8,3.0070146E7,3241250.0,26.0,659821.0,0.0,0.0,1.0
2013/06,4.41702297E8,3.094501E7,3278432.0,26.0,650930.0,0.0,0.0,1.0
Done Training.
Classifying data set.
2007/01,2.65871675E8,2.5172106E7,3291091.0,28.0,656750.0,-1.0,2.0,2.0
2007/02,2.63998702E8,2.4973475E7,3278289.0,26.0,619420.0,1.0,2.0,2.0
2007/03,3.00636026E8,2.6716185E7,3286320.0,28.0,673400.0,-1.0,2.0,2.0
2007/04,2.81769622E8,2.6054783E7,3281394.0,27.0,629605.0,0.0,3.0,2.0
2007/05,3.05278703E8,2.6737916E7,3296105.0,28.0,671114.0,-1.0,2.0,2.0
2007/06,3.05562113E8,2.6390991E7,3275204.0,28.0,659151.0,-1.0,2.0,2.0
2007/07,2.88834753E8,2.5376873E7,3277529.0,28.0,654426.0,-1.0,2.0,2.0
2007/08,3.08778193E8,2.6892918E7,3299133.0,28.0,688920.0,-1.0,2.0,2.0
2007/09,2.92154844E8,2.5864843E7,3300073.0,27.0,654324.0,-1.0,2.0,2.0
2007/10,3.10607965E8,2.7033069E7,3308320.0,28.0,685089.0,-1.0,2.0,2.0
2007/11,3.00047312E8,2.6471132E7,3299723.0,27.0,659540.0,-1.0,2.0,2.0
2007/12,2.7799667E8,3.3090879E7,3277176.0,28.0,609577.0,0.0,2.0,0.0
2008/01,3.0075193E8,2.6293068E7,3305799.0,27.0,665606.0,-1.0,2.0,2.0
2008/02,2.98032052E8,2.6186693E7,3310715.0,25.0,641760.0,0.0,1.0,2.0
2008/03,3.06349057E8,2.6957981E7,3307268.0,27.0,642214.0,0.0,3.0,2.0
2008/04,3.25122721E8,2.7562404E7,3298995.0,26.0,668173.0,1.0,2.0,3.0
2008/05,3.27394287E8,2.7230048E7,3299519.0,27.0,667678.0,0.0,3.0,3.0
2008/06,3.28782141E8,2.7082204E7,3269197.0,26.0,655125.0,1.0,2.0,3.0
2008/07,3.26554448E8,2.691427E7,3264595.0,27.0,674329.0,1.0,4.0,3.0
2008/08,3.2079549E8,2.661534E7,3260806.0,27.0,657112.0,0.0,3.0,3.0
2008/09,3.14156993E8,2.6822137E7,3231608.0,26.0,646811.0,1.0,2.0,3.0
2008/10,3.4347274E8,2.7714549E7,3201666.0,27.0,673950.0,1.0,4.0,2.0
2008/11,3.10335067E8,2.6066483E7,3159720.0,26.0,614950.0,1.0,2.0,2.0
2008/12,2.90632233E8,3.2915736E7,3093774.0,26.0,572181.0,0.0,1.0,0.0
2009/01,2.68634708E8,2.5244248E7,3046084.0,27.0,592693.0,0.0,3.0,1.0
2009/02,2.71684225E8,2.4486493E7,3001100.0,25.0,557060.0,0.0,1.0,1.0
2009/03,2.99988787E8,2.578979E7,2965733.0,26.0,587142.0,1.0,2.0,1.0
2009/04,2.81915496E8,2.5621929E7,2943663.0,26.0,574345.0,1.0,2.0,1.0
2009/05,2.79742797E8,2.479698E7,2934755.0,27.0,566775.0,0.0,3.0,1.0
2009/06,2.87433371E8,2.5057454E7,2926165.0,26.0,582412.0,1.0,2.0,1.0
2009/07,2.98041996E8,2.5485422E7,2920772.0,27.0,603720.0,0.0,3.0,1.0
2009/08,3.02537145E8,2.4784922E7,2934465.0,26.0,586886.0,1.0,2.0,1.0
2009/09,3.1053111E8,2.5431704E7,2959633.0,26.0,594593.0,1.0,2.0,1.0
2009/10,3.33414297E8,2.6056372E7,2981907.0,26.0,623113.0,2.0,2.0,1.0
2009/11,3.20418549E8,2.5301422E7,2998887.0,26.0,589032.0,1.0,2.0,2.0
2009/12,3.18664456E8,3.336203E7,2987857.0,26.0,575759.0,1.0,1.0,1.0
2010/01,3.09015381E8,2.5055729E7,2966063.0,26.0,585225.0,1.0,2.0,1.0
2010/02,3.13124956E8,2.4852226E7,2979297.0,24.0,564321.0,0.0,2.0,1.0
2010/03,3.45307016E8,2.7335872E7,3002055.0,27.0,617803.0,2.0,4.0,1.0
2010/04,3.26811124E8,2.6525066E7,3033240.0,26.0,607579.0,2.0,3.0,2.0
2010/05,3.38645819E8,2.6293934E7,3073525.0,26.0,612984.0,2.0,2.0,1.0
2010/06,3.525346E8,2.6896039E7,3068929.0,26.0,626006.0,2.0,3.0,1.0
2010/07,3.39512327E8,2.6886968E7,3095700.0,27.0,642604.0,1.0,3.0,1.0
2010/08,3.55259052E8,2.6909978E7,3114016.0,27.0,634300.0,1.0,4.0,1.0
2010/09,3.48683606E8,2.7405315E7,3125982.0,26.0,637040.0,1.0,3.0,2.0
2010/10,3.52331814E8,2.7126456E7,3139467.0,27.0,642096.0,1.0,4.0,1.0
2010/11,3.50903744E8,2.7028506E7,3129807.0,26.0,622654.0,2.0,3.0,1.0
2010/12,3.44398267E8,3.4855546E7,3112302.0,26.0,604328.0,2.0,2.0,0.0
2011/01,3.45958741E8,2.6728127E7,3102121.0,26.0,619655.0,2.0,3.0,1.0
2011/02,3.4151812E8,2.6576823E7,3120594.0,24.0,592717.0,1.0,2.0,1.0
2011/03,3.91026808E8,2.9144214E7,3150472.0,26.0,655261.0,0.0,3.0,1.0
2011/04,3.57781166E8,2.773162E7,3162669.0,25.0,614049.0,2.0,3.0,1.0
2011/05,3.76677787E8,2.7743063E7,3166547.0,27.0,638615.0,1.0,4.0,2.0
2011/06,3.81902897E8,2.8734739E7,3172268.0,26.0,653228.0,1.0,2.0,1.0
2011/07,3.73437027E8,2.7676239E7,3167570.0,27.0,638125.0,1.0,4.0,2.0
2011/08,3.93696126E8,2.8698604E7,3171124.0,27.0,655763.0,0.0,4.0,1.0
2011/09,3.96569509E8,2.8802392E7,3184327.0,26.0,648700.0,1.0,3.0,1.0
2011/10,4.12777215E8,2.8319083E7,3181799.0,27.0,648098.0,0.0,2.0,1.0
2011/11,4.11275503E8,2.8635623E7,3179145.0,26.0,635245.0,0.0,1.0,1.0
2011/12,3.96355597E8,3.6385165E7,3158912.0,26.0,610546.0,2.0,3.0,1.0
2012/01,4.06022459E8,2.8154321E7,3171539.0,26.0,639749.0,0.0,1.0,1.0
2012/02,4.00622565E8,2.8683264E7,3180790.0,25.0,622079.0,0.0,1.0,0.0
2012/03,4.3210743E8,3.0517276E7,3200542.0,26.0,664776.0,0.0,0.0,2.0
2012/04,4.01091778E8,2.9262928E7,3217021.0,26.0,624070.0,1.0,2.0,0.0
2012/05,4.3294165E8,3.0176955E7,3237353.0,26.0,667178.0,0.0,0.0,2.0
2012/06,4.41333819E8,3.0070146E7,3241250.0,26.0,659821.0,0.0,0.0,2.0
2012/07,4.2322261E8,2.9820748E7,3236974.0,27.0,657387.0,-1.0,1.0,1.0
2012/08,4.29542249E8,3.0580464E7,3241400.0,27.0,682338.0,0.0,1.0,2.0
2012/09,4.06021015E8,2.9387218E7,3253697.0,25.0,645822.0,0.0,0.0,1.0
2012/10,4.3496675E8,2.9942844E7,3257610.0,27.0,679645.0,0.0,1.0,2.0
2012/11,4.28907802E8,3.0455156E7,3257521.0,26.0,659425.0,0.0,0.0,2.0
2012/12,3.9806895E8,3.7227588E7,3245709.0,26.0,603840.0,1.0,1.0,1.0
2013/01,4.16846561E8,3.0336538E7,3238170.0,26.0,658433.0,0.0,0.0,2.0
2013/02,3.98064498E8,2.9330063E7,3249984.0,24.0,617761.0,1.0,1.0,1.0
2013/03,4.15569568E8,3.0867138E7,3264899.0,26.0,640296.0,1.0,1.0,1.0
2013/04,4.23175878E8,3.0751826E7,3281684.0,26.0,660947.0,0.0,0.0,1.0
2013/05,4.30993977E8,3.1792104E7,3291010.0,26.0,679211.0,0.0,0.0,1.0
2013/06,4.41702297E8,3.094501E7,3278432.0,26.0,650930.0,0.0,0.0,1.0
2013/07,4.36312913E8,3.1636887E7,3291829.0,27.0,675163.0,0.0,1.0,1.0
2013/08,4.47308814E8,3.156079E7,3299679.0,26.0,685197.0,0.0,1.0,1.0
Done classifying data set.
Finished execution.
Appendix C
Generated Web Services Client
C.1 Package mx.org.inegi.sistemas.bie
Table C.1: Interface Summary
Interface                       Description
WebServiceBackFillAjaxHttpGet   This class was generated by Apache CXF 2.7.6 2013-08-24T12:44:14.182-05:00
WebServiceBackFillAjaxHttpPost  This class was generated by Apache CXF 2.7.6 2013-08-24T12:44:14.106-05:00
WebServiceBackFillAjaxSoap      This class was generated by Apache CXF 2.7.6 2013-08-24T12:44:14.157-05:00
Table C.2: Class Summary
Class                              Description
AddAllChilds                       Java class for anonymous complex type.
AddAllChildsResponse               Java class for anonymous complex type.
AgregaNodos                        Java class for anonymous complex type.
AgregaNodosBuscador                Java class for anonymous complex type.
AgregaNodosBuscadorResponse        Java class for anonymous complex type.
AgregaNodosResponse                Java class for anonymous complex type.
AgregaSeriesCar                    Java class for anonymous complex type.
AgregaSeriesCarResponse            Java class for anonymous complex type.
ArrayOfSerieValores2005Periodo     Java class for ArrayOfSerieValores2005Periodo complex type.
ArrayOfString                      Java class for ArrayOfString complex type.
BuildPanelContains                 Java class for anonymous complex type.
BuildPanelContainsResponse         Java class for anonymous complex type.
CheckCarSeries                     Java class for anonymous complex type.
CheckCarSeriesResponse             Java class for anonymous complex type.
ConsultaCuadro                     Java class for anonymous complex type.
ConsultaCuadroResponse             Java class for anonymous complex type.
DatosXSerie                        Java class for anonymous complex type.
DatosXSerieResponse                Java class for anonymous complex type.
ExportaIQY                         Java class for anonymous complex type.
ExportaIQYResponse                 Java class for anonymous complex type.
ExportaMetadato                    Java class for anonymous complex type.
ExportaMetadatoResponse            Java class for anonymous complex type.
ExportaSeries                      Java class for anonymous complex type.
ExportaSeriesResponse              Java class for anonymous complex type.
ExtraeMetaDato                     Java class for anonymous complex type.
ExtraeMetaDatoResponse             Java class for anonymous complex type.
FillYearsControls                  Java class for anonymous complex type.
FillYearsControlsResponse          Java class for anonymous complex type.
FrecuenciasActivas                 Java class for anonymous complex type.
FrecuenciasActivasResponse         Java class for anonymous complex type.
GuardaSeriesSelecciondas           Java class for anonymous complex type.
GuardaSeriesSelecciondasResponse   Java class for anonymous complex type.
InterSeriesSelecToCar              Java class for anonymous complex type.
InterSeriesSelecToCarResponse      Java class for anonymous complex type.
JerarquiaDownTemaSerie             Java class for anonymous complex type.
JerarquiaDownTemaSerieResponse     Java class for anonymous complex type.
LimpiaSeriesSeleccionadas          Java class for anonymous complex type.
LimpiaSeriesSeleccionadasResponse  Java class for anonymous complex type.
ObjectFactory                      This object contains factory methods.
SeleccionaPorSerie                 Java class for anonymous complex type.
SeleccionaPorSerieResponse         Java class for anonymous complex type.
SeleccionaTodasSeries              Java class for anonymous complex type.
SeleccionaTodasSeriesResponse      Java class for anonymous complex type.
SeleccionaTodo                     Java class for anonymous complex type.
SeleccionaTodoResponse             Java class for anonymous complex type.
SerieInformation                   Java class for anonymous complex type.
SerieInformationResponse           Java class for anonymous complex type.
SeriesGrafica                      Java class for anonymous complex type.
SeriesGraficaResponse              Java class for anonymous complex type.
SeriesParaConsulta                 Java class for anonymous complex type.
SeriesParaConsultaResponse         Java class for anonymous complex type.
SeriesSeleccionadas                Java class for anonymous complex type.
SeriesSeleccionadasResponse        Java class for anonymous complex type.
SerieValores2005Periodo            Java class for SerieValores2005Periodo complex type.
SesionesBuscador                   Java class for anonymous complex type.
SesionesBuscadorResponse           Java class for anonymous complex type.
SetSystemAccess                    Java class for anonymous complex type.
SetSystemAccessFast                Java class for anonymous complex type.
SetSystemAccessFastResponse        Java class for anonymous complex type.
SetSystemAccessResponse            Java class for anonymous complex type.
TotalElementosdelaRama             Java class for anonymous complex type.
TotalElementosdelaRamaResponse     Java class for anonymous complex type.
WebServiceBackFillAjax             This class was generated by Apache CXF 2.7.6 2013-08-24T12:44:14.205-05:00
Bibliography
[1] INEGI, “Instituto nacional de estadística y geografía.” [Online]. Available:
http://www.inegi.org.mx/
[2] INEGI, “Banco de información económica.” [Online]. Available:
http://www.inegi.org.mx/sistemas/bie/
[3] INEGI, “Laboratorio de análisis de datos.” [Online]. Available:
http://www.inegi.org.mx/est/contenidos/proyectos/accesomicrodatos/default lab analisis datos.aspx
[4] INEGI, “Laboratorio de análisis de datos - usuarios de organismos internacionales,
instituciones académicas o de investigación.” [Online]. Available:
http://www.inegi.org.mx/est/contenidos/proyectos/accesomicrodatos/default lad ia.aspx
[5] IBM SPSS Neural Networks 20, IBM, 2011.
[6] P. Schreyer and D. Pilat, “Measuring productivity,” OECD Economic studies,
vol. 33, no. 2001/2, pp. 127–170, 2001.
[7] W. J. Stevenson and M. Hojati, Operations management. McGraw-Hill/Irwin
Boston, 2007, vol. 8.
[8] W. S. McCulloch and W. Pitts, “A logical calculus of the ideas immanent in nervous
activity,” The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115–133, 1943.
[9] F. Rosenblatt, “The perceptron: a probabilistic model for information storage and
organization in the brain,” Psychological Review, vol. 65, no. 6, pp. 386–408, Nov.
1958.
[10] F. Rosenblatt, Two theorems of statistical separability in the perceptron. United States
Department of Commerce, 1958.
[11] M. Minsky and S. Papert, Perceptrons. MIT Press, 1969.
[12] J. Leboeuf Pasquier, Programación Basada en Redes Neuronales. Amate Editorial,
2006.
[13] L. A. Zadeh, “Fuzzy sets,” Information and control, vol. 8, no. 3, pp. 338–353, 1965.
[14] J. Leboeuf Pasquier, Programación Basada en Lógica Difusa. Amate Editorial,
2006.
[15] J. M. Keller and D. J. Hunt, “Incorporating fuzzy membership functions into the
perceptron algorithm,” Pattern Analysis and Machine Intelligence, IEEE Transactions
on, no. 6, pp. 693–699, 1985.
[16] S. K. Pal and S. Mitra, “Multilayer perceptron, fuzzy sets, and classification,”
Neural Networks, IEEE Transactions on, vol. 3, no. 5, pp. 683–697, 1992.
[17] J.-S. Jang, “ANFIS: Adaptive-network-based fuzzy inference system,” Systems, Man
and Cybernetics, IEEE Transactions on, vol. 23, no. 3, pp. 665–685, 1993.
[18] D. Nauck and R. Kruse, “NEFCLASS - a neuro-fuzzy approach for the classification
of data,” in Proceedings of the 1995 ACM symposium on applied computing.
ACM, 1995, pp. 461–465.
[19] D. Nauck, U. Nauck, and R. Kruse, “Generating classification rules with the neuro-fuzzy
system NEFCLASS,” in Fuzzy Information Processing Society, 1996. NAFIPS.
1996 Biennial Conference of the North American. IEEE, 1996, pp. 466–470.
[20] D. D. Nauck, “Fuzzy data analysis with NEFCLASS,” in IFSA World Congress and
20th NAFIPS International Conference, 2001. Joint 9th, vol. 3. IEEE, 2001, pp.
1413–1418.
[21] Z. Sevarac, “Neuro fuzzy reasoner for student modeling,” in Advanced Learning
Technologies, 2006. Sixth International Conference on. IEEE, 2006, pp. 740–744.
[22] I. H. Witten and E. Frank, Data Mining: Practical machine learning tools and
techniques. Morgan Kaufmann, 2005.
[23] T. Khabaza, “Hard hats for data miners: Myths and pitfalls of data mining,”
Business intelligence, data warehousing and analytics editorial from DMReview,
2005.
[24] M. Kantardzic, Data mining: concepts, models, methods, and algorithms. John
Wiley & Sons, 2011.
[25] S. Mitra, S. K. Pal, and P. Mitra, “Data mining in soft computing framework: A
survey,” IEEE transactions on neural networks, vol. 13, no. 1, pp. 3–14, 2002.
[26] E. Hüllermeier, “Fuzzy methods in machine learning and data mining: Status and
prospects,” Fuzzy Sets and Systems, vol. 156, no. 3, pp. 387–406, 2005.
[27] J. Vieira, F. M. Dias, and A. Mota, “Neuro-fuzzy systems: a survey,” in 5th WSEAS
NNA International Conference on Neural Networks and Applications, Udine, Italia,
2004.
[28] D. G. Feitelson, “Experimental computer science: The need for a cultural change,”
Internet version: http://www.cs.huji.ac.il/~feit/papers/exp05.pdf, 2006.
[29] Contributors, “Neuroph.” [Online]. Available:
http://neuroph.sourceforge.net/index.html
[30] M. Rogers, The definition and measurement of productivity. Melbourne Institute
of Applied Economic and Social Research, 1998.
[31] R. Hecht-Nielsen, “Theory of the backpropagation neural network,” in Neural Networks,
1989. IJCNN., International Joint Conference on. IEEE, 1989, pp. 593–605.
[32] B. Widrow and M. A. Lehr, “30 years of adaptive neural networks: perceptron,
madaline, and backpropagation,” Proceedings of the IEEE, vol. 78, no. 9, pp. 1415–1442,
1990.