Universidad de Guadalajara

Centro Universitario de Ciencias Económicas Administrativas

Neuro-Fuzzy Data Mining Mexico’s Economic Data

Thesis submitted to obtain the degree of Maestro de Tecnologías de Información

presented by

Gustavo Becerra Gaviño

Director: Dra. Liliana Ibeth Barbosa Santillán

Co-director: Dr. Jérôme Leboeuf Pasquier

Assessor: Dr. Alberto Ramírez Ruiz

December 2013

UNIVERSIDAD DE GUADALAJARA

Abstract

MAESTRÍA EN TECNOLOGÍAS DE INFORMACIÓN

Centro Universitario de Ciencias Económicas Administrativas

Maestro en Tecnologías de Información

Neuro-Fuzzy Data Mining Mexico’s Economic Data

by Gustavo Becerra Gaviño

Given the increase in the data being collected, there is a need to explore tools that automate the recognition and extraction of patterns within targeted data. The present work explores the use of a neuro-fuzzy classifier for the multifactor productivity of the manufacturing sector in the Mexican economy. The chosen data set contains the time series for the variables Sale Value of Products, Wages, Work Force, Days Worked, and Hours Worked. The data is taken from the Banco de Información Económica at the Instituto Nacional de Estadística y Geografía.

Acknowledgements

Thanks to Dr. Liliana Ibeth Barbosa Santillán, Dr. Jérôme Leboeuf Pasquier, Dr. Alberto Ramírez Ruiz, and MS Leonel Pérez Pelayo, who helped me get through developing and writing this document. Thanks to all the instructors I’ve had throughout my academic life. All of you have contributed to my development as a person and as a computer scientist.


Contents

Abstract
Acknowledgements
List of Figures
List of Tables
Abbreviations

1 Introduction
1.1 Motivation
1.2 Analysis Tools Readily Available
1.3 Analytics tools at INEGI
1.4 Justification

2 Giants’ Work
2.1 Introduction
2.2 Neural Networks
2.3 Fuzzy Logic
2.4 Neuro-Fuzzy Systems
2.4.1 Neuro-Fuzzy Perceptron
2.4.2 ANFIS
2.4.3 NEFCLASS
2.4.4 Neuro-Fuzzy Reasoner
2.5 Data Mining
2.6 Neuro-Fuzzy Systems in Data Mining

3 Methodology
3.1 Idea
3.2 System Design
3.3 Concrete Implementation
3.4 Experimental Evaluation

4 Neuro-Fuzzy Classifying Productivity
4.1 Introduction
4.2 Variables
4.3 Membership Functions
4.4 IF-THEN Constructs
4.5 Neuro-Fuzzy System Design

5 MXMiner Neuro-Fuzzy Classifier
5.1 Introduction
5.2 Neuro-Fuzzy Classifier
5.2.1 productivityNFC.properties A.1
5.2.2 NeuroFuzzyClassifier.java A.2
5.3 Execution
5.3.1 mxminer.properties A.3
5.3.2 MXMiner.java A.4
5.4 Training Set
5.5 Results

6 Conclusions
6.1 Achievements
6.2 Further Work
6.2.1 Neuro-Fuzzy rules
6.2.2 Training set
6.2.3 Output
6.2.4 Automated Data Extraction
6.3 Closing Remarks

A Implementation
A.1 productivityNFC.properties
A.2 NeuroFuzzyClassifier.java
A.3 mxminer.properties
A.4 MXMiner.java

B Data Sets
B.1 BIE c20131110114800.txt
B.2 BIE Manufacturing.csv
B.3 Training Set.csv
B.4 Sample Run Results

C Generated Web Services Client
C.1 Package mx.org.inegi.sistemas.bie

Bibliography

List of Figures

1.1 Data: Valor de ventas de los productos elaborados
1.2 Data: Remuneraciones totales
1.3 Data: Personal ocupado total
1.4 Data: Días trabajados
1.5 Data: Total de horas trabajadas
2.1 History of neuro-fuzzy data mining
2.2 Timeline - Artificial Neuron, Perceptron
2.3 Artificial Neuron
2.4 Multilayer Perceptron
2.5 Timeline - Fuzzy Sets
2.6 MF: Value
2.7 Fuzzy Rules Example
2.8 Timeline - Neuro-Fuzzy Systems
3.1 Experimentation in System Design
3.2 Variables
4.1 Variables used as Inputs and Outputs for productivity
4.2 Fuzzy Sets
4.3 MF: Value
4.4 MF: Wages
4.5 MF: Workforce
4.6 MF: Days Worked
4.7 MF: Hours Worked
4.8 Fuzzy Rules Antecedents
4.9 Productivity Neuro-Fuzzy Classifier Design
5.1 Timeline - Neuro-Fuzzy Classifier
5.2 Layer Labels
5.3 Variable Names
5.4 Fuzzy Sets
5.5 MF Delimiters
5.6 Fuzzy Rule Antecedents
5.7 Input Layer
5.8 Fuzzification Layer
5.9 Rules Layer
5.10 Output Layer
5.11 Training and Data sets
5.12 Execution
5.13 Training Set.csv
5.14 Training Set
5.15 Sample Run Results

List of Tables

4.1 Membership Functions
C.1 Interface Summary
C.2 Class Summary

Abbreviations

ANFIS Adaptive-Network-based Fuzzy Inference System

BIE Banco de Información Económica

INEGI Instituto Nacional de Estadística y Geografía

KDD Knowledge Discovery in Databases

NEFCLASS NEuro-Fuzzy CLASSification


To my mother, Aurora Gaviño Pacheco, who has always looked for a way to ease my journey.


Chapter 1

Introduction

1.1 Motivation

With the advent of the Internet came the opportunity to share information and make it more readily accessible. In Mexico, the Instituto Nacional de Estadística y Geografía (INEGI) is in charge of gathering information about the country. The INEGI maintains the Banco de Información Económica (BIE), which is accessible through an interface on the INEGI website [1][2]. Because the information is readily available, there is an opportunity to contribute to its understanding by providing tools to analyze it.

1.2 Analysis Tools Readily Available

The INEGI BIE web interface provides a simple analytical tool: a graphing utility to visualize the data. The following charts were generated with that utility for the selected time series.

Figure 1.1: Data: Valor de ventas de los productos elaborados

Figure 1.2: Data: Remuneraciones totales

Figure 1.3: Data: Personal ocupado total

Figure 1.4: Data: Días trabajados

Figure 1.5: Data: Total de horas trabajadas

1.3 Analytics tools at INEGI

In addition to the graphing utility provided in the BIE web interface, other analytic tools are provided in the Analysis Lab [3]. The INEGI provides access to analytic tools such as Excel, STATA, and SPSS [4]. IBM SPSS provides the package SPSS Neural Networks [5] as an extension to the main SPSS statistics software package. However, the documentation for these products does not mention any implementation of a neuro-fuzzy system geared toward analytics. This is therefore an opportunity to explore the usefulness of neuro-fuzzy systems on the economic information provided at the BIE [2].

1.4 Justification

The public information about the tools used to analyze data at the INEGI suggests that neuro-fuzzy systems are not being used. Therefore, the present work will exhibit how a neuro-fuzzy system facilitates the classification of productivity in the manufacturing sector for a given month. Measuring productivity is important because it is used for tracing technology, identifying the efficiency of a given production system, and indexing the standard of living, among other uses [6]. The productivity discussed in the present work refers to Mexico’s manufacturing sector. However, the productivity measurement for smaller economies, for example a company, is similarly used for strategic planning in operations management [7]. The data for the variables that will be used to determine productivity is the data graphed in figures 1.1 through 1.5. Section 3.1 presents further details about how the data will be used.

Chapter 2

Giants’ Work

2.1 Introduction

Figure 2.1: History of neuro-fuzzy data mining. [Timeline, 1940 to 2020: 1943 Artificial Neuron; 1958 The Perceptron; 1964 MLP; 1965 Fuzzy Sets; 1992 MLP, FS, Class’n; 1993 ANFIS; 1996 NEFCLASS; 2006 NFR; 2013 MXMiner NFC]

Even in Greek mythology, the existence of intelligent mechanical beings captured the human imagination, to the point of describing a mythical bronze being, Talos. In the current age, the efforts to understand how intelligence exists have provided us with useful tools that help us make better sense of the phenomena around us. The introduction of a mathematical model for the biological neuron gave way to artificial neural networks capable of learning from the data being processed. Fuzzy logic expresses the values of variables in linguistic terms. The combination of neural networks and fuzzy logic provides us with tools that learn and express values in a more human-like language. The use of neuro-fuzzy systems in data mining automates the analysis of data.


2.2 Neural Networks

Figure 2.2: Timeline - Artificial Neuron, Perceptron. [Timeline, 1940 to 2020: 1943 Artificial Neuron; 1958 The Perceptron; 1964 MLP; 1965 Fuzzy Sets; 1985 FMFs Perceptron; 1992 MLP, FS, Class’n; 1993 ANFIS; 1996 NEFCLASS; 2006 NFR; 2013 MXMiner NFC]

In 1943 Warren S. McCulloch and Walter H. Pitts[8] introduced the mathematical model

for the artificial neuron.

Figure 2.3: Artificial Neuron

This is a very simplified model consisting of a set of weighted inputs. The inputs are aggregated, the aggregated value is passed through the activation function, and an output is produced. By itself this model is of limited use. However, it is the building block for more complex systems like the multilayer perceptron.
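The aggregation and activation steps described above can be sketched in a few lines of Java. This is only an illustration: the class name, the unit weights, the threshold, and the choice of a step activation are assumptions made for the example and are not taken from the thesis implementation.

```java
// Illustrative artificial neuron: weighted sum followed by a step activation.
// Class name, weights, and threshold are assumptions for this sketch.
final class ArtificialNeuron {
    private final double[] weights;
    private final double threshold;

    ArtificialNeuron(double[] weights, double threshold) {
        this.weights = weights.clone();
        this.threshold = threshold;
    }

    /** Aggregates the inputs as a weighted sum. */
    double aggregate(double[] inputs) {
        if (inputs.length != weights.length) {
            throw new IllegalArgumentException("expected " + weights.length + " inputs");
        }
        double sum = 0.0;
        for (int i = 0; i < inputs.length; i++) {
            sum += weights[i] * inputs[i];
        }
        return sum;
    }

    /** Step activation: outputs 1 when the aggregate reaches the threshold, else 0. */
    int fire(double[] inputs) {
        return aggregate(inputs) >= threshold ? 1 : 0;
    }

    public static void main(String[] args) {
        // With unit weights and a threshold of 2 the neuron behaves like a logical AND.
        ArtificialNeuron and = new ArtificialNeuron(new double[] {1.0, 1.0}, 2.0);
        System.out.println(and.fire(new double[] {1, 1})); // prints 1
        System.out.println(and.fire(new double[] {1, 0})); // prints 0
    }
}
```

Swapping the step function for a smooth activation would leave the aggregation step unchanged, which is why the same unit can serve as the building block of larger networks.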

Figure 2.4: Multilayer Perceptron. [Diagram: inputs 1 through n feed a network with input, hidden, and output layers that produces the output.]

In 1958 F. Rosenblatt published the mathematical model for the Perceptron[9, 10]. He took the idea of an artificial neuron a bit further by including excitatory inputs, inhibitory inputs, and feedback signals[10]. One of the ways a neural network stores information (learns) is by adjusting the input weights based on the feedback signals. Later, in 1969, Minsky and Papert[11] proved that the single-layer Perceptron could not learn the XOR function[12]. This problem slowed down the advancement of artificial intelligence until the multilayer perceptron was used to find a solution.
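The weight adjustment and the XOR limitation can both be observed with a small experiment. The sketch below uses the classic perceptron learning rule with a learning rate of 1 and a step activation. These choices, like the class and method names, are assumptions for illustration and do not reproduce Rosenblatt's original formulation in detail.

```java
// Illustrative single-unit perceptron trained with the classic error-correction rule.
final class PerceptronDemo {

    /** Step activation over the weighted sum plus bias. */
    static int predict(double[] w, double bias, int[] x) {
        double sum = bias;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * x[i];
        }
        return sum > 0 ? 1 : 0;
    }

    /** Trains on the samples, then returns how many samples are still misclassified. */
    static int trainAndCountErrors(int[][] inputs, int[] targets, int maxEpochs) {
        double[] w = new double[inputs[0].length];
        double bias = 0.0;
        for (int epoch = 0; epoch < maxEpochs; epoch++) {
            boolean changed = false;
            for (int s = 0; s < inputs.length; s++) {
                int error = targets[s] - predict(w, bias, inputs[s]);
                if (error != 0) {
                    // Feedback: move the weights toward the desired output.
                    for (int i = 0; i < w.length; i++) {
                        w[i] += error * inputs[s][i];
                    }
                    bias += error;
                    changed = true;
                }
            }
            if (!changed) {
                break; // every sample is classified correctly
            }
        }
        int errors = 0;
        for (int s = 0; s < inputs.length; s++) {
            if (predict(w, bias, inputs[s]) != targets[s]) {
                errors++;
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        int[][] x = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        // AND is linearly separable, so training converges with no errors left.
        System.out.println("AND errors: " + trainAndCountErrors(x, new int[] {0, 0, 0, 1}, 100));
        // XOR is not linearly separable, so at least one sample stays misclassified.
        System.out.println("XOR errors: " + trainAndCountErrors(x, new int[] {0, 1, 1, 0}, 100));
    }
}
```

No number of epochs removes the XOR errors, because a single threshold unit can only draw one straight decision boundary. Adding a hidden layer lifts that restriction.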

2.3 Fuzzy Logic

Figure 2.5: Timeline - Fuzzy Sets. [Timeline, 1940 to 2020: 1943 Artificial Neuron; 1958 The Perceptron; 1964 MLP; 1965 Fuzzy Sets; 1985 FMFs Perceptron; 1992 MLP, FS, Class’n; 1993 ANFIS; 1996 NEFCLASS; 2006 NFR; 2013 MXMiner NFC]

The notion of fuzzy sets was introduced by Zadeh[13] in 1965. In fuzzy sets, each value is expressed as a degree of membership in the elements of the set. Consider the following membership function for the variable Value:

Figure 2.6: MF: Value


In this membership function, also called a characteristic function, the fuzzy set is {Low, Medium, High}. The fuzzy value Medium has a degree of membership, or truth, of 0 at 305, increasing to 1 at 327; from 327 to 375 it has a degree of 1; it then decreases from 1 at 375 to 0 at 397. The fuzzy value High has a degree of 0 at 375, increasing to 1 at 397 and staying at 1 from there on. This example illustrates that a given crisp value of a variable can be a member of two fuzzy sets once the variable goes through the membership functions and gets fuzzified. For instance, any value between 375 and 397 is partly Medium and partly High.
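The Medium and High functions just described are trapezoids defined by four delimiters. A minimal Java sketch follows, using the delimiters quoted in the text (305, 327, 375, 397). The class name is hypothetical, and the open-ended High set is modeled with infinite right delimiters, which is an assumption made for this example.

```java
// Illustrative trapezoidal membership function with delimiters a <= b <= c <= d.
final class TrapezoidMF {
    private final double a, b, c, d;

    TrapezoidMF(double a, double b, double c, double d) {
        this.a = a;
        this.b = b;
        this.c = c;
        this.d = d;
    }

    /** Degree of membership in [0, 1] for the crisp value x. */
    double degree(double x) {
        if (x < a || x > d) {
            return 0.0;                 // outside the trapezoid
        }
        if (x < b) {
            return (x - a) / (b - a);   // rising edge
        }
        if (x <= c) {
            return 1.0;                 // plateau
        }
        return (d - x) / (d - c);       // falling edge
    }

    public static void main(String[] args) {
        TrapezoidMF medium = new TrapezoidMF(305, 327, 375, 397);
        // High never falls back to 0, so its right delimiters are left open.
        TrapezoidMF high = new TrapezoidMF(375, 397,
                Double.POSITIVE_INFINITY, Double.POSITIVE_INFINITY);
        double x = 386;
        // The crisp value 386 belongs to both sets with degree 0.5.
        System.out.println("Medium: " + medium.degree(x)); // prints Medium: 0.5
        System.out.println("High: " + high.degree(x));     // prints High: 0.5
    }
}
```

The comparisons are ordered so that a rectangular function, where the first two and the last two delimiters coincide, never divides by zero.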

At the same time that a variable loses information through the process of fuzzification, it gains flexibility, tolerance, and expressiveness[14]. Fuzzy logic uses IF-THEN constructs to express the relations between fuzzy variables. For example:

IF value IS high AND wages IS high THEN productivity IS low;
IF value IS high AND wages IS medium THEN productivity IS medium;
IF value IS high AND wages IS low THEN productivity IS high;

Figure 2.7: Fuzzy Rules Example
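Once the inputs have been fuzzified, the rules of Figure 2.7 can be evaluated numerically. The sketch below assumes that AND is computed as the minimum of the two membership degrees. That is a common convention, but the text does not state which operator is used, so treat it as an assumption. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative evaluation of the three example rules, assuming min as the AND operator.
final class FuzzyRuleDemo {

    /** Firing strength of "IF a AND b", with AND taken as the minimum (assumed). */
    static double and(double degreeA, double degreeB) {
        return Math.min(degreeA, degreeB);
    }

    /** Returns the firing strength of each rule, keyed by its productivity label. */
    static Map<String, Double> evaluate(double valueHigh,
                                        double wagesHigh,
                                        double wagesMedium,
                                        double wagesLow) {
        Map<String, Double> strengths = new LinkedHashMap<>();
        strengths.put("low", and(valueHigh, wagesHigh));      // rule 1
        strengths.put("medium", and(valueHigh, wagesMedium)); // rule 2
        strengths.put("high", and(valueHigh, wagesLow));      // rule 3
        return strengths;
    }

    public static void main(String[] args) {
        // Example degrees: value is mostly high, wages are mostly low.
        System.out.println(evaluate(0.8, 0.0, 0.3, 0.7));
        // prints {low=0.0, medium=0.3, high=0.7}
    }
}
```

With these example degrees the third rule fires most strongly, so productivity would be read as mostly high.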

2.4 Neuro-Fuzzy Systems

Figure 2.8: Timeline - Neuro-Fuzzy Systems. [Timeline, 1940 to 2020: 1943 Artificial Neuron; 1958 The Perceptron; 1964 MLP; 1965 Fuzzy Sets; 1985 FMFs Perceptron; 1992 MLP, FS, Class’n; 1993 ANFIS; 1996 NEFCLASS; 2006 NFR; 2013 MXMiner NFC]

2.4.1 Neuro-Fuzzy Perceptron

As early as 1985, Keller and Hunt[15] researched the idea of combining fuzzy logic and the perceptron. Their efforts aimed at alleviating the failure of the crisp perceptron to converge when the classes are not linearly separable. Later, Pal and Mitra[16] introduced the use of fuzzy membership functions along with a supervised learning perceptron for classification. The fuzzy perceptron is a 3-layered network intended to include knowledge defined in the rules it is implementing. In machine learning, supervised learning refers to training the system with examples whose desired outputs are known in advance, as opposed to unsupervised learning, where the system discovers hidden information as it processes the data.

2.4.2 ANFIS

The ANFIS[17], Adaptive-Network-based Fuzzy Inference System, is a multilayer fuzzy

perceptron that uses predefined human knowledge provided in the fuzzy IF-THEN rules.

It also combines adaptive neurons which are neurons with specific parameters that are

updated to achieve a desired input-output mapping as the training set is processed.

2.4.3 NEFCLASS

NEFCLASS[18–20] is a 3-layered feedforward fuzzy perceptron. It is intended to determine the correct class for a given set of values of the input variables. The output neurons represent the fuzzy set for the variable being classified. NEFCLASS was the inspiration for the Neuro-Fuzzy Reasoner.

2.4.4 Neuro-Fuzzy Reasoner

The neuro-fuzzy reasoner[21] is based on the NEFCLASS. It is a 4-layered feedforward

fuzzy perceptron. However, it differs from NEFCLASS in that the membership functions

are not modified throughout its execution. It was originally designed to classify how good

a class was based on the score a student had on an exam and how quickly the student

could answer the given exam. The present work takes on the main ideas from this model

and applies them for the classification of Productivity based on five variables: Value,

Wages, Workforce, Days Worked, and Hours Worked.

2.5 Data Mining

We live in an age when information about our activities is being collected constantly. The amount of data is so vast and varied that the capacity of the traditional tools and ways of analyzing such data is rapidly being surpassed. It has simply become infeasible to meticulously look for patterns hidden within the mountains of data using traditional statistics and specialized personnel [22]. Hence the need to devise artifacts and systems that automate the extraction of hidden patterns within all the information available to us.

With all the tools available for data mining, sometimes data mining may amount to just connecting the output of one model to another using graphical tools [23]. Even so, the idea remains the same: data mining is about discovering new information hidden within the data. It is the analysis phase within the process of Knowledge Discovery in Databases (KDD). To that end, computer scientists have devised, and the computer industry has implemented, various tools geared to ease the effort of analyzing data [22, 24]. In the present work the focus is placed on neuro-fuzzy systems applied to data mining.

2.6 Neuro-Fuzzy Systems in Data Mining

The main uses of neuro-fuzzy systems in analytics are clustering, regression, and classification[25, 26]. In clustering, the data is arranged in groups of similar items as it is being processed. Clustering is therefore used to discover and learn unsuspected associations within the data, and it mainly uses unsupervised learning. The goal of regression is to approximate a relation between two sets X and Y by mapping items of X to items of Y. Regression generally uses supervised learning. Classification places each item of a data set in a predefined class based on an assessment of its features. Classification uses supervised learning[25, 27]. The present work implements a neuro-fuzzy classifier for the variable “Productivity”.

Chapter 3

Methodology

The present work requires an experiment in system design. In experimental computer science [28], the process of experimentation in system design has four phases: idea, system design, concrete implementation, and experimental evaluation.

Figure 3.1: Experimentation in System Design

The following sections explain how the first cycle in the experimentation in system design

process will be accomplished for the present work.


3.1 Idea

The idea is to explore the feasibility of using a neuro-fuzzy system to classify the productivity of the manufacturing sector in Mexico’s economy in order to facilitate its interpretation. The INEGI BIE provides a convenient web-based interface to access the available data. As of November 2013, up to 310,659 time series [2] on various topics of Mexico’s economy are obtainable through it. Given the diversity of the data, it is necessary to study it and focus on what is of interest for a given project. For the present work, the target is a set of five time series belonging to the monthly survey of the manufacturing sector. In the terms used in the BIE portal, all of them belong to the theme route “Manufacturas > Encuesta mensual de la industria manufacturera (EMIM)”. Once at that level, the series are found by following the given routes:

1. Valor de ventas de los productos elaborados
2. Remuneraciones totales pagadas > Remuneraciones totales
3. Total de personal ocupado > Personal ocupado total
4. Días trabajados
5. Total de horas trabajadas > Total de horas trabajadas

For ease of reference, the following corresponding variables will be used for the rest of

this writing.

1. Value

2. Wages

3. Workforce

4. Days Worked

5. Hours Worked

The data for these variables is available at the BIE[2] by executing a query using the file provided in Appendix B.1. The downloaded series range from January 2007 to June 2013 (Appendix B.2). Figures 1.1 through 1.5 display the corresponding graphs for the variables. Superimposing the graphs yields the system visualization in figure 3.2. This system is oversimplified. Even so, it already illustrates how complex the analysis of productivity is.

Figure 3.2: Variables

3.2 System Design

In data mining, neuro-fuzzy systems are used for clustering, regression, and classification[25, 26]. The present work uses a neuro-fuzzy system to classify productivity as low, medium, or high for a given month based on the variables Value, Wages, Workforce, Days Worked, and Hours Worked.

3.3 Concrete Implementation

The neuro-fuzzy system will be implemented in the Java programming language on top of the Neuroph framework [29]. The implementation will consist of three classes: NeuroFuzzyClassifier, NFCFactory, and MXMiner. The NeuroFuzzyClassifier class will encapsulate a flexible implementation of a neuro-fuzzy classifier. The NFCFactory class will be used to produce an instance of the NeuroFuzzyClassifier based on a productivityNFC.properties file holding the features defined for the neuro-fuzzy classifier. The MXMiner class will load the training set, train the system, and feed the data into the system.

3.4 Experimental Evaluation

The implemented system will be executed using the data in Appendix B.2. The results

obtained will tell us how good productivity was for a given month. Since the imple-

mentation of the neuro-fuzzy system in the present work is a prototype, the results can

be improved. The accuracy of the results depends on the information provided to the

system during the learning process. That includes the fuzzy rule constructs and the

training set. The more accurate the information infused into the system the more accu-

rate the results will be. The execution of the neuro-fuzzy system will conclude the first

cycle in experimentation in system design.

Chapter 4

Neuro-Fuzzy Classifying Productivity

4.1 Introduction

The most basic definition of Productivity[30] in a production system is:

$$\text{Productivity} = \frac{\text{Output}}{\text{Input}} \tag{4.1}$$

Definition of Productivity used in the present work

This is a simplified definition. Multifactor productivity involves one output and many inputs[6, 7]. Still, for the present work, what is necessary to understand about productivity is that, by definition, it is directly proportional to the output and inversely proportional to the inputs of a production system. Therefore, a high output tends to improve productivity and a high input tends to decrease it.
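One way to write the multifactor case is to combine the inputs into a weighted sum in the denominator. The weights $w_i$ below are hypothetical and are not defined anywhere in the present work. The expression is only a sketch, meant to show that the same proportionality argument holds when there are several inputs.

```latex
\text{Productivity} = \frac{\text{Output}}{\sum_{i=1}^{n} w_i \,\text{Input}_i}
```

With a single input and $w_1 = 1$ this reduces to equation 4.1. Increasing any one input while the output is held fixed lowers the ratio.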

4.2 Variables

Chapters 1 and 3 discuss the information available at the INEGI BIE. The chosen time series contain the data for the variables Value, Wages, Workforce, Days Worked, and Hours Worked. In manufacturing, these variables can be classified as follows:


Figure 4.1: Variables used as Inputs and Outputs for productivity

Figure 3.2 presents the visualization of the inputs and output for the manufacturing sector of the Mexican economy. The question about productivity in such a system is then: when is productivity low, medium, or high? The values low, medium, and high in this question are the values of the fuzzy set for Productivity. The fuzzy sets for the other variables are defined similarly. Thus the fuzzy sets for the system are:

Figure 4.2: Fuzzy Sets

The delimiters of the membership functions for the fuzzy sets are defined from the available human knowledge; here they are roughly based on the statistical quartiles of the time series. Section 4.3 presents the membership functions for the system. The trapezoid function is used for ease of implementation, since only four points are needed as the delimiters of each function. The data for the variable Wages varies very slightly from month to month, with the exception of December. Thus the delimiters of the membership functions for Wages in figure 4.4 lie closer together than those of the other variables, with the exception of the membership functions for Days Worked. As is evident, the membership functions for Days Worked in figure 4.6 take a rectangular shape. This is because the number of Days Worked in a given month is an integer. Days Worked in the data set ranges between 23 and 28, with most months having 26 days worked.

4.3 Membership Functions

Table 4.1: Membership Functions

Figure 4.3: MF: Value

Figure 4.4: MF: Wages

Figure 4.5: MF: Workforce

Figure 4.6: MF: Days Worked

Figure 4.7: MF: Hours Worked

4.4 IF-THEN Constructs

As already established in equation 4.1, productivity is directly proportional to the output and inversely proportional to the inputs of a production system. Based on that knowledge, the antecedent parts of the fuzzy rules are constructed as shown in figure 4.8. The consequent part of each rule will be learned by the neuro-fuzzy classifier from the desired output in the training set. See figure 5.14.

IF value IS high AND wages IS medium;\
IF value IS high AND wages IS low;\
IF value IS medium AND wages IS high;\
IF value IS medium AND wages IS medium;\
IF value IS medium AND wages IS low;\
IF value IS low AND wages IS high;\
IF value IS low AND wages IS medium;\
IF value IS low AND wages IS low;\
\
IF value IS high AND work_force IS high;\
IF value IS high AND work_force IS medium;\
IF value IS high AND work_force IS low;\
IF value IS medium AND work_force IS high;\
IF value IS medium AND work_force IS medium;\
IF value IS medium AND work_force IS low;\
IF value IS low AND work_force IS high;\
IF value IS low AND work_force IS medium;\
IF value IS low AND work_force IS low;\
\
IF value IS high AND days_worked IS high;\
IF value IS high AND days_worked IS medium;\
IF value IS high AND days_worked IS low;\
IF value IS medium AND days_worked IS high;\
IF value IS medium AND days_worked IS medium;\
IF value IS medium AND days_worked IS low;\
IF value IS low AND days_worked IS high;\
IF value IS low AND days_worked IS medium;\
IF value IS low AND days_worked IS low;\
\
IF value IS high AND hours_worked IS high;\
IF value IS high AND hours_worked IS medium;\
IF value IS high AND hours_worked IS low;\
IF value IS medium AND hours_worked IS high;\
IF value IS medium AND hours_worked IS medium;\
IF value IS medium AND hours_worked IS low;\
IF value IS low AND hours_worked IS high;\
IF value IS low AND hours_worked IS medium;\
IF value IS low AND hours_worked IS low;

Figure 4.8: Fuzzy Rules Antecedents
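The antecedents in figure 4.8 follow a regular pattern: the output variable of the production system (value) is paired with each input variable across every combination of fuzzy labels. The sketch below enumerates the full cross product, which gives 36 antecedents. It is an illustration with hypothetical class and method names, and it does not claim that the thesis file was generated this way or that it contains exactly this set.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative generator for antecedents of the form
// "IF <output> IS <label> AND <input> IS <label>".
final class AntecedentGenerator {

    /** Pairs the production output with each input over all label combinations. */
    static List<String> antecedents(String output, String[] inputs, String[] labels) {
        List<String> rules = new ArrayList<>();
        for (String input : inputs) {
            for (String outputLabel : labels) {
                for (String inputLabel : labels) {
                    rules.add("IF " + output + " IS " + outputLabel
                            + " AND " + input + " IS " + inputLabel);
                }
            }
        }
        return rules;
    }

    public static void main(String[] args) {
        String[] inputs = {"wages", "work_force", "days_worked", "hours_worked"};
        String[] labels = {"high", "medium", "low"};
        List<String> rules = antecedents("value", inputs, labels);
        System.out.println(rules.size()); // prints 36 (4 inputs x 3 labels x 3 labels)
        for (String rule : rules) {
            System.out.println(rule + ";");
        }
    }
}
```

Generating the antecedents this way keeps the rule base consistent with the variable names and labels declared elsewhere, at the cost of including combinations a domain expert might prefer to prune.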

4.5 Neuro-Fuzzy System Design

Figure 4.9 is a pictorial representation of the neuro-fuzzy system topology. The design implements the neuro-fuzzy classifier for the productivity of the Mexican manufacturing production system. In that production system, the inputs are Wages, Work Force, Days Worked, and Hours Worked; the output is Value. The fuzzy antecedents are implemented through the neuron connections between layers 2 and 3. The neural network output is the classification of productivity based on the neural network input variables.


Figure 4.9: Productivity Neuro-Fuzzy Classifier Design

Chapter 5

MXMiner Neuro-Fuzzy Classifier

Figure 5.1: Timeline - Neuro-Fuzzy Classifier. [Timeline, 1940 to 2020: 1943 Artificial Neuron; 1958 The Perceptron; 1964 MLP; 1965 Fuzzy Sets; 1992 MLP, FS, Class’n; 1993 ANFIS; 1996 NEFCLASS; 2006 NFR; 2013 MXMiner NFC]

5.1 Introduction

The implementation of the system is based on the idea presented in the Neuro-Fuzzy Reasoner[21]. However, since the implementation for that work is particular to the problem described in [21], it was necessary to implement a more flexible neuro-fuzzy classifier using the artificial intelligence Java library Neuroph[29]. The system involves four main files: productivityNFC.properties, NeuroFuzzyClassifier.java, mxminer.properties, and MXMiner.java. The entire content of these files is included in Appendix A. What follows is a brief explanation of the purpose of portions of those files.


5.2 Neuro-Fuzzy Classifier

5.2.1 productivityNFC.properties A.1

This file is a standard Java properties file. It provides the specifications for the neuro-

fuzzy system. The first 5 lines define the labels for the layers.

1 #Layer labels 4 needed.
2 nfc.layer.label.1=Input
3 nfc.layer.label.2=Fuzzification
4 nfc.layer.label.3=Rules
5 nfc.layer.label.4=Output

Figure 5.2: Layer Labels

Lines 8-9 define the names for the input and output variables. These names are used throughout the rest of the property keys to define the fuzzy sets, membership function delimiters, and fuzzy rule antecedents.

8 nfc.input_variables=value,wages,work_force,days_worked,hours_worked
9 nfc.output_variable=productivity

Figure 5.3: Variable Names

Lines 12-17 define the fuzzy sets for the variables. The variable names previously defined

are used within the key exactly as they were spelled earlier.

12 nfc.fz.value=low,medium,high
13 nfc.fz.wages=low,medium,high
14 nfc.fz.work_force=low,medium,high
15 nfc.fz.days_worked=low,medium,high
16 nfc.fz.hours_worked=low,medium,high
17 nfc.fz.productivity=low,medium,high

Figure 5.4: Fuzzy Sets


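Lines 20-38 define the membership function delimiters. Each key lists four comma-separated values which, as shown in section 5.2.2, are assigned in order to the left-low, left-high, right-low, and right-high points of a trapezoidal membership function.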

20 nfc.mf.delims.value.low=14275406,14697580,305349556,327562504
21 nfc.mf.delims.value.medium=305349556,327562504,375477803,397690751
22 nfc.mf.delims.value.high=375477803,397690751,47016501,47438675
23
24 nfc.mf.delims.wages.low=13690175,15282811,26213287,27805923
25 nfc.mf.delims.wages.medium=26213287,27805923,27720643,29313279
26 nfc.mf.delims.wages.high=27720643,29313279,26431270,28023906
27
28 nfc.mf.delims.work_force.low=1896401,1945143,3097305,3146047
29 nfc.mf.delims.work_force.medium=3097305,3146047,3224960,3273702
30 nfc.mf.delims.work_force.high=3224960,3273702,2286344,2335086
31
32 nfc.mf.delims.days_worked.low=13,13,26,26
33 nfc.mf.delims.days_worked.medium=26,26,27,27
34 nfc.mf.delims.days_worked.high=27,27,39,39
35
36 nfc.mf.delims.hours_worked.low=448819,465301,611156,627638
37 nfc.mf.delims.hours_worked.medium=611156,627638,641690,658172
38 nfc.mf.delims.hours_worked.high=641690,658172,780679,797161

Figure 5.5: MF Delimiters
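
The role of the four delimiters can be illustrated with a minimal sketch of a trapezoidal membership function. It assumes that the values are read in the order left-low, left-high, right-low, right-high (the order in which NeuroFuzzyClassifier.java assigns them in section 5.2.2), that leftLow <= leftHigh <= rightLow <= rightHigh, and that membership changes linearly along the two slopes. The class TrapezoidSketch and its methods are hypothetical names used for illustration only; they are not the Neuroph Trapezoid transfer function.

```java
/** Illustrative trapezoidal membership function; not the Neuroph Trapezoid class. */
final class TrapezoidSketch {
    private final double leftLow, leftHigh, rightLow, rightHigh;

    TrapezoidSketch(double leftLow, double leftHigh, double rightLow, double rightHigh) {
        this.leftLow = leftLow;
        this.leftHigh = leftHigh;
        this.rightLow = rightLow;
        this.rightHigh = rightHigh;
    }

    /** Parses a delimiter list as written in the properties file, e.g. "26,26,27,27". */
    static TrapezoidSketch parse(String csv) {
        String[] t = csv.split(",");
        if (t.length != 4) {
            throw new IllegalArgumentException("expected 4 delimiters, got " + t.length);
        }
        return new TrapezoidSketch(
                Double.parseDouble(t[0].trim()), Double.parseDouble(t[1].trim()),
                Double.parseDouble(t[2].trim()), Double.parseDouble(t[3].trim()));
    }

    /** Degree of membership of x, a value between 0 and 1 inclusive. */
    double membership(double x) {
        if (x >= leftHigh && x <= rightLow) {
            return 1.0; // plateau
        }
        if (x > leftLow && x < leftHigh) {
            return (x - leftLow) / (leftHigh - leftLow); // rising edge
        }
        if (x > rightLow && x < rightHigh) {
            return (rightHigh - x) / (rightHigh - rightLow); // falling edge
        }
        return 0.0; // outside the trapezoid
    }

    public static void main(String[] args) {
        // Delimiters copied from nfc.mf.delims.hours_worked.medium.
        TrapezoidSketch medium = parse("611156,627638,641690,658172");
        System.out.println(medium.membership(630000)); // on the plateau
        System.out.println(medium.membership(619397)); // halfway up the rising edge
    }
}
```

With the hours_worked.medium delimiters, a month with 630000 hours falls on the plateau (membership 1.0), while 619397 hours lies halfway up the rising edge (membership 0.5).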

Lines 43-81 define the fuzzy antecedents for the rules to be used in the system. The consequent parts of the rules are learned by the system from the training set.

43 IF value IS high AND wages IS high;\
44 IF value IS high AND wages IS medium;\
45 IF value IS high AND wages IS low;\
46 IF value IS medium AND wages IS high;\
47 IF value IS medium AND wages IS medium;\
48 IF value IS medium AND wages IS low;\
49 IF value IS low AND wages IS high;\
50 IF value IS low AND wages IS medium;\
51 IF value IS low AND wages IS low;\
52 \
53 IF value IS high AND work_force IS high;\
54 IF value IS high AND work_force IS medium;\
55 IF value IS high AND work_force IS low;\
56 IF value IS medium AND work_force IS high;\
57 IF value IS medium AND work_force IS medium;\
58 IF value IS medium AND work_force IS low;\
59 IF value IS low AND work_force IS high;\
60 IF value IS low AND work_force IS medium;\
61 IF value IS low AND work_force IS low;\
62 \
63 IF value IS high AND days_worked IS high;\
64 IF value IS high AND days_worked IS medium;\
65 IF value IS high AND days_worked IS low;\
66 IF value IS medium AND days_worked IS high;\
67 IF value IS medium AND days_worked IS medium;\
68 IF value IS medium AND days_worked IS low;\
69 IF value IS low AND days_worked IS high;\
70 IF value IS low AND days_worked IS medium;\
71 IF value IS low AND days_worked IS low;\
72 \
73 IF value IS high AND hours_worked IS high;\
74 IF value IS high AND hours_worked IS medium;\
75 IF value IS high AND hours_worked IS low;\
76 IF value IS medium AND hours_worked IS high;\
77 IF value IS medium AND hours_worked IS medium;\
78 IF value IS medium AND hours_worked IS low;\
79 IF value IS low AND hours_worked IS high;\
80 IF value IS low AND hours_worked IS medium;\
81 IF value IS low AND hours_worked IS low;

Figure 5.6: Fuzzy Rule Antecedents


5.2.2 NeuroFuzzyClassifier.java A.2

This file contains the class that implements a neuro-fuzzy classifier based on the properties provided in the productivityNFC.properties file.

Lines 54-66 construct the input (first) layer.

54 // Create Input Layer
55 NeuronProperties neuronProperties = new NeuronProperties();
56
57 String[] inputVariables = (new CSVParser(new StringReader(
58         props.getProperty("nfc.input_variables")),csvs)).getLine();
59
60 Layer inputLayer = LayerFactory.createLayer(inputVariables.length, neuronProperties);
61 inputLayer.setLabel(props.getProperty("nfc.layer.label.1"));
62
63 for(i = 0 ; i < inputVariables.length; i++){
64     inputLayer.getNeuronAt(i).setLabel(inputVariables[i]);
65 }
66 this.addLayer(inputLayer);

Figure 5.7: Input Layer

Lines 68-102 construct the fuzzification layer. In this layer each neuron represents a

value in the fuzzy sets for the input variables. Each input neuron is connected to as

many neurons as there are values in the corresponding variable’s fuzzy set. If a variable

has three fuzzy values in its fuzzy set, then the neuron representing the input variable

will be connected to three fuzzification neurons. The transfer function for the neuron

implements the membership function for the fuzzy value.

68 //Create Fuzzification Layer
69 neuronProperties.setProperty("transferFunction",TransferFunctionType.TRAPEZOID);
70 Layer fuzzyLayer = LayerFactory.createLayer(0,neuronProperties);
71 fuzzyLayer.setLabel(props.getProperty("nfc.layer.label.2"));
72
73 String[] fuzzySet, mfDelims;
74 double[] delims = new double[4];
75 Neuron[] inputNeurons = inputLayer.getNeurons();
76 Neuron fuzzyNeuron;
77 Trapezoid mf;
78
79 for(i = 0; i < inputNeurons.length; i++){
80     fuzzySet = (new CSVParser(new StringReader(
81             props.getProperty("nfc.fz."+inputNeurons[i].getLabel())),csvs)).getLine();
82     for(String fuzzyValue : fuzzySet){
83         mfDelims = (new CSVParser(new StringReader(
84                 props.getProperty("nfc.mf.delims."+inputVariables[i]+"."+fuzzyValue))
85                 ,csvs)).getLine();
86         for(j = 0; j < delims.length; j++){
87             delims[j] = Double.parseDouble(mfDelims[j]);
88         }
89         fuzzyNeuron = NeuronFactory.createNeuron(neuronProperties);
90         fuzzyNeuron.setLabel(inputNeurons[i].getLabel()+"."+fuzzyValue);
91
92         mf = (Trapezoid) fuzzyNeuron.getTransferFunction();
93         mf.setLeftLow(delims[0]);
94         mf.setLeftHigh(delims[1]);
95         mf.setRightLow(delims[2]);
96         mf.setRightHigh(delims[3]);
97         fuzzyLayer.addNeuron(fuzzyNeuron);
98
99         ConnectionFactory.createConnection(inputNeurons[i], fuzzyNeuron, 1);
100     }
101 }
102 this.addLayer(fuzzyLayer);

Figure 5.8: Fuzzification Layer
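
As an illustration of how many fuzzification neurons this construction yields, the sketch below enumerates the labels <input variable>.<fuzzy value> from the same property keys. It uses only java.util.Properties and plain string splitting instead of the CSV parser used in the thesis code; FuzzificationLabelsSketch and sampleProperties are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Illustrative enumeration of fuzzification neuron labels; not part of MXMiner. */
final class FuzzificationLabelsSketch {

    /** Builds one label per (input variable, fuzzy value) pair, in file order. */
    static List<String> labels(Properties props) {
        List<String> result = new ArrayList<>();
        for (String variable : props.getProperty("nfc.input_variables").split(",")) {
            String name = variable.trim();
            for (String fuzzyValue : props.getProperty("nfc.fz." + name).split(",")) {
                result.add(name + "." + fuzzyValue.trim());
            }
        }
        return result;
    }

    /** Property values copied from lines 8 and 12-16 of productivityNFC.properties. */
    static Properties sampleProperties() {
        Properties props = new Properties();
        props.setProperty("nfc.input_variables",
                "value,wages,work_force,days_worked,hours_worked");
        for (String name : props.getProperty("nfc.input_variables").split(",")) {
            props.setProperty("nfc.fz." + name, "low,medium,high");
        }
        return props;
    }

    public static void main(String[] args) {
        List<String> all = labels(sampleProperties());
        System.out.println(all.size() + " fuzzification neurons: " + all);
    }
}
```

For the five input variables with three fuzzy values each, the sketch lists 15 labels, from value.low to hours_worked.high.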

Lines 104-156 build the rules layer. The nfc.fuzzy_rules key in the productivityNFC.properties file holds a single string in which the character “;” delimits the rule antecedents. The rule antecedents are implemented through the connections from the fuzzification neurons to the neurons in this layer.

104 //Create Rules Layer
105 NeuronProperties ruleNeuronProperties = new NeuronProperties(Neuron.class,
106         WeightedSum.class, Linear.class);
107 Layer rulesLayer = LayerFactory.createLayer(0,ruleNeuronProperties);
108 rulesLayer.setLabel(props.getProperty("nfc.layer.label.3"));
109
110 CSVStrategy rcsvs = NeuroFuzzyClassifier.CSVS_RULES;
111 CSVStrategy rpcsvs = NeuroFuzzyClassifier.CSVS_RULE_PARSER;
112
113 //Rule tokens length must be 4x of the form:
114 //IF <input variable> IS <fuzzy value> {AND <input variable> IS <fuzzy value> }*;
115
116 String [] rules = (new CSVParser(new StringReader(
117         props.getProperty("nfc.fuzzy_rules")),rcsvs)).getLine();
118 String [] ruleTokens;
119 String antecedentLabel;
120 Neuron antecedent, ruleNeuron;
121 Neuron fuzzyNeurons[] = fuzzyLayer.getNeurons();
122 boolean found;
123
124 for(String rule : rules){
125     if(rule.length() == 0) continue;
126
127     ruleTokens =(new CSVParser(new StringReader(rule),rpcsvs)).getLine();
128     if(ruleTokens.length % 4 != 0){//The rule does not follow the required syntax
129         continue;
130     }
131
132     ruleNeuron = NeuronFactory.createNeuron(ruleNeuronProperties);
133     ruleNeuron.setLabel(rule);
134
135     for(i = 0; i < (ruleTokens.length); i = i + 4 ){
136         //Each neuron has a label <input variable>.<fuzzy value>
137         antecedentLabel = ruleTokens[i+1]+"."+ruleTokens[i+3];
138         j = 0;
139         found = false;
140         antecedent = null;
141         while(j < fuzzyNeurons.length && !found){
142             if(antecedentLabel.equals(fuzzyNeurons[j].getLabel())){
143                 found = true;
144                 antecedent = fuzzyNeurons[j];
145             }
146             j++;
147         }
148         if(found){
149             ConnectionFactory.createConnection(antecedent, ruleNeuron, 1);
150         }else{
151             System.out.println("Orphan rule neuron. Check rules.");
152         }
153     }
154     rulesLayer.addNeuron(ruleNeuron);
155 }
156 this.addLayer(rulesLayer);

Figure 5.9: Rules Layer
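
The token arithmetic of lines 135-137 can be tried in isolation with the following sketch. It assumes that rules are separated by ";" and tokens by whitespace, as in productivityNFC.properties, and it uses String.split rather than the CSV parser of the original code; RuleAntecedentSketch is a hypothetical name. Unlike the listing in figure 5.9, which silently skips malformed rules, the sketch raises an exception.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative parser for rule antecedents; not part of NeuroFuzzyClassifier. */
final class RuleAntecedentSketch {

    /**
     * Turns "IF value IS high AND wages IS medium" into
     * ["value.high", "wages.medium"], the labels of the fuzzification
     * neurons that would be connected to the rule neuron.
     */
    static List<String> antecedentLabels(String rule) {
        String[] tokens = rule.trim().split("\\s+");
        // Each clause has four tokens: (IF|AND) <variable> IS <fuzzy value>.
        if (tokens.length % 4 != 0) {
            throw new IllegalArgumentException("malformed rule: " + rule);
        }
        List<String> labels = new ArrayList<>();
        for (int i = 0; i < tokens.length; i += 4) {
            labels.add(tokens[i + 1] + "." + tokens[i + 3]);
        }
        return labels;
    }

    /** Splits a rule string on ';' and parses every non-empty rule. */
    static List<List<String>> parseAll(String rules) {
        List<List<String>> parsed = new ArrayList<>();
        for (String rule : rules.split(";")) {
            if (rule.trim().isEmpty()) {
                continue; // blank separator entries between rule groups
            }
            parsed.add(antecedentLabels(rule));
        }
        return parsed;
    }

    public static void main(String[] args) {
        System.out.println(antecedentLabels("IF value IS high AND wages IS medium"));
    }
}
```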

Lines 158-173 create the output layer. The neurons in this layer represent the fuzzy set

for the output variable.

158 // create the output layer
159 neuronProperties = new NeuronProperties();
160 neuronProperties.setProperty("transferFunction", TransferFunctionType.LINEAR);
161 Layer outputLayer = LayerFactory.createLayer(0, neuronProperties);
162 outputLayer.setLabel(props.getProperty("nfc.layer.label.4"));
163
164 fuzzySet = (new CSVParser(new StringReader(
165         props.getProperty("nfc.fz."+props.getProperty("nfc.output_variable"))),
166         csvs)).getLine();
167 for(String fuzzyValue : fuzzySet){
168     fuzzyNeuron = NeuronFactory.createNeuron(neuronProperties);
169     fuzzyNeuron.setLabel(props.getProperty("nfc.output_variable")+"."+fuzzyValue);
170     outputLayer.addNeuron(fuzzyNeuron);
171 }
172 this.addLayer(outputLayer);

Figure 5.10: Output Layer

5.3 Execution

The execution of the analysis is controlled by the files mxminer.properties and MXMiner.java.

5.3.1 mxminer.properties A.3

The two most meaningful keys in this properties file are mxminer.data.src.csv.training_set.filename and mxminer.data.src.csv.filename. The first names the file used to train the neuro-fuzzy classifier; the second names the file containing the data to be analyzed. The contents of the training set and of the data are listed in Appendices B.3 and B.2, respectively.

1 #Training Data Set
2 mxminer.data.src.csv.training_set.filename=Training_Set.csv
3 #Data Source
4 mxminer.data.src.csv.filename=BIE_Manufacturing.csv
5
6 #Messages
7 mxminer.msg.info.start=Begin Execution.
8 mxminer.msg.info.finished=Finished execution.
9 mxminer.msg.info.training=Training classifier.
10 mxminer.msg.info.done_training=Done Training.
11 mxminer.msg.info.classifying=Classifying data set.
12 mxminer.msg.info.done_classifying=Done classifying data set.

Figure 5.11: Training and Data sets

5.3.2 MXMiner.java A.4

This Java program trains the neuro-fuzzy system and then analyzes the data. Lines 103-107 specify the parameters for the learning process within the neural network. The network is set to use backpropagation [31] to correct the output error as it processes the training set in line 107. The learning rule SigmoidDeltaRule [32] is used in the system.

75 NFCFactory factory = new NFCFactory();
76 NeuroFuzzyClassifier pnfc = factory.createProductivityNFC();
77
78 double[] input = new double[pnfc.getInputNeurons().length];
79 double[] output = new double[pnfc.getOutputNeurons().length];
80
81 CSVParser csvp = new CSVParser(new FileReader(
82         MXMiner.PROPERTIES.getProperty("mxminer.data.src.csv.training_set.filename")));
83 DataSet trainingSet = new DataSet(input.length, output.length);
84
85 System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.training"));
86
87 String[] pLine = csvp.getLine();
88 while(pLine != null){
89     date = pLine[0];
90     for(i = 1; i < pLine.length - output.length; i++){
91         input[i - 1] = Double.parseDouble(pLine[i]);
92     }
93
94     for(i = input.length + 1; i < pLine.length; i++){
95         output[i-(input.length + 1)] = Double.parseDouble(pLine[i]);
96     }
97
98     MXMiner.printIO(date, input, output);
99
100     trainingSet.addRow(new DataSetRow(input,output));
101     pLine = csvp.getLine();
102 }
103 SigmoidDeltaRule lr = new SigmoidDeltaRule();
104 lr.setMaxError(0.001);
105 lr.setMaxIterations(1000000);
106 pnfc.setLearningRule(lr);
107 pnfc.learn(trainingSet);
108
109 System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.done_training"));
110
111 System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.classifying"));
112 csvp = new CSVParser(new FileReader(
113         MXMiner.PROPERTIES.getProperty("mxminer.data.src.csv.filename")));
114 pLine = csvp.getLine();//First Line contains the headers.
115 pLine = csvp.getLine();
116
117
118 while(pLine != null){
119     date = pLine[0];
120     for(i = 1; i < pLine.length; i++){
121         input[i-1] = Double.parseDouble(pLine[i]);
122     }
123
124     pnfc.setInput(input);
125     pnfc.calculate();
126     output = pnfc.getOutput();
127     for(i=0; i < output.length; i++){
128         output[i] = round(output[i],0);
129     }
130     MXMiner.printIO(date, input, output);
131
132     pLine = csvp.getLine();
133 }

Figure 5.12: Execution
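
The layout of a training row (a date, the five input values, and the three desired outputs) can be illustrated with the sketch below, which parses one line of Training_Set.csv into separate arrays. TrainingRowSketch is a hypothetical helper, not part of MXMiner. The sketch allocates new arrays for every row; this is a defensive choice that matters only if the container receiving the arrays keeps references to them instead of copies, which has not been verified here for Neuroph's DataSetRow.

```java
/** Illustrative holder for one training row; not part of MXMiner. */
final class TrainingRowSketch {
    final String date;
    final double[] input;
    final double[] desiredOutput;

    private TrainingRowSketch(String date, double[] input, double[] desiredOutput) {
        this.date = date;
        this.input = input;
        this.desiredOutput = desiredOutput;
    }

    /**
     * Parses "date,in_1,...,in_n,out_1,...,out_m" into a row.
     * Fresh arrays are allocated on every call, so rows never share storage.
     */
    static TrainingRowSketch parse(String csvLine, int inputCount, int outputCount) {
        String[] fields = csvLine.split(",");
        if (fields.length != 1 + inputCount + outputCount) {
            throw new IllegalArgumentException("unexpected field count: " + fields.length);
        }
        double[] input = new double[inputCount];
        double[] output = new double[outputCount];
        for (int i = 0; i < inputCount; i++) {
            input[i] = Double.parseDouble(fields[1 + i].trim());
        }
        for (int i = 0; i < outputCount; i++) {
            output[i] = Double.parseDouble(fields[1 + inputCount + i].trim());
        }
        return new TrainingRowSketch(fields[0].trim(), input, output);
    }

    public static void main(String[] args) {
        // Row 9 of Training_Set.csv as listed in figure 5.13.
        TrainingRowSketch row =
                parse("2010/05,338645819,26293934,3073525,26,612984,.30,.70,0", 5, 3);
        System.out.println(row.date + " -> " + java.util.Arrays.toString(row.desiredOutput));
    }
}
```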


5.4 Training Set

The training set is built from arbitrarily selected data points. In this experiment, the desired output is determined by visual inspection. The blue lines in figure 5.14 represent the data points used in the training set. The network learns the consequent part of the fuzzy rules based on the desired output in this set. Therefore, the accuracy and size of the training set have a direct impact on how well the network will classify a given input set.

1 2007/02,263998702,24973475,3278289,26,619420,1,0,0
2 2007/03,310607965,27033069,3308320,28,685089,1,0,0
3 2007/08,308778193,26892918,3299133,28,688920,1,0,0
4 2007/10,310607965,27033069,3308320,28,685089,1,0,0
5 2008/02,298032052,26186693,3310715,25,641760,1,0,0
6 2009/02,271684225,24486493,3001100,25,557060,0,.90,.10
7 2009/07,298041996,25485422,2920772,27,603720,0,.20,.80
8 2010/02,313124956,24852226,2979297,24,564321,0,1,0
9 2010/05,338645819,26293934,3073525,26,612984,.30,.70,0
10 2011/03,391026808,29144214,3150472,26,655261,.60,.40,0
11 2011/10,412777215,28319083,3181799,27,648098,0,0,1
12 2012/01,406022459,28154321,3171539,26,639749,0,0,1
13 2012/06,441333819,30070146,3241250,26,659821,0,0,1
14 2013/06,441702297,30945010,3278432,26,650930,0,0,1

Figure 5.13: Training_Set.csv


Figure 5.14: Training Set


5.5 Results

An execution of the Neuro-Fuzzy Classifier for Productivity based on the variables Value, Wages, Work Force, Days Worked, and Hours Worked yields the output file in Appendix B.4. The output of the entire system is summarized in figure 5.15.

In the chart of figure 5.15, the numbered lines represent how strongly productivity belongs to each of the elements in the productivity fuzzy set. From top to bottom, the first line is the fuzzy value High, the second line is Medium, and the bottom line is Low. The output suggests that productivity was between medium and high for the months of 2007/01 to 2008/02. In contrast, the months of 2013/03 to 2013/08 were high. Noteworthy is the output for the months of 2012/05 and 2012/06, which suggests a strongly high productivity for that period. This system output is consistent with the high ratio of outputs over inputs for the manufacturing sector for the same period. Overall, the output of the neuro-fuzzy classifier constructed for the present work is consistent with the expected results based on a visual inspection of the graphed data. Thus it facilitates the classification of productivity for a given month. However, the system can be improved in several ways, detailed in section 6.2.


Figure 5.15: Sample Run Results
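
One simple way to turn the three output values of a month into a single label is to pick the fuzzy value with the largest output. The sketch below does this under the assumption that the output neurons follow the order of nfc.fz.productivity (low, medium, high); that ordering, the sample values in the main method, and the class name DominantClassSketch are assumptions made for illustration.

```java
/** Illustrative selection of the dominant fuzzy value; not part of MXMiner. */
final class DominantClassSketch {

    /** Returns the label with the largest membership; ties go to the first label. */
    static String dominant(String[] labels, double[] memberships) {
        if (labels.length == 0 || labels.length != memberships.length) {
            throw new IllegalArgumentException("labels and memberships must match");
        }
        int best = 0;
        for (int i = 1; i < memberships.length; i++) {
            if (memberships[i] > memberships[best]) {
                best = i;
            }
        }
        return labels[best];
    }

    public static void main(String[] args) {
        String[] labels = {"low", "medium", "high"}; // assumed output order
        double[] sample = {0.0, 0.2, 0.8};           // hypothetical output values
        System.out.println(dominant(labels, sample));
    }
}
```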

Chapter 6

Conclusions

In the field of information technology there is always a problem to solve and many ways to solve it. The present work presents the implementation of a neuro-fuzzy classifier that classifies Productivity as High, Medium, or Low.

6.1 Achievements

The two main contributions of the present work are the NeuroFuzzyClassifier implementation and the fuzzy classification of manufacturing productivity. The NeuroFuzzyClassifier is an implementation of a feedforward neuro-fuzzy classifier based on the neuro-fuzzy reasoner [21]. Compared to the neuro-fuzzy reasoner, the neuro-fuzzy classifier increases flexibility in implementation and allows for a wider range of classification applications. The proposed classification of productivity based on many inputs is a subject to be researched further. With that in mind, a neuro-fuzzy system is presented as an approach to classify productivity in complex multi-factor systems.

6.2 Further Work

6.2.1 Neuro-Fuzzy rules

One of the ways to infuse pre-existing knowledge into a neuro-fuzzy system is through

the definition of the fuzzy rules. Better constructed fuzzy rules will yield better results.

To that end, an individual with vast understanding of the intricacies of the system to

be automated will define better fuzzy rules and contribute to its accuracy.


6.2.2 Training set

As with the fuzzy rules, the chosen training set is another way of embedding knowledge into the neuro-fuzzy system. The desired output is used to learn the consequent part of the fuzzy rules. Thus, if the training set contains precise information, the system will be better able to cope with the data it analyzes. The available data set has only 81 items, of which a subset of 14 was used for training. If more data is made available, a bigger training set can be selected to better train the neuro-fuzzy classifier.

6.2.3 Output

In its present state, the neuro-fuzzy classifier can output numbers less than 0 or greater than 1. However, the degree of membership to a fuzzy element is a value between 0 and 1 inclusive [13]. Therefore it is necessary to find a way to normalize the output.
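
A minimal sketch of one possible normalization follows. It is not part of MXMiner: it simply clamps each raw output to the interval [0, 1] and, as an optional extra step that fuzzy membership does not require, rescales the clamped values so that they sum to 1. Other choices, such as a bounded transfer function in the output layer, are equally conceivable. OutputNormalizationSketch is a hypothetical name.

```java
/** Illustrative output normalization; one option among several, not the thesis method. */
final class OutputNormalizationSketch {

    /** Returns a copy of raw with every value limited to the interval [0, 1]. */
    static double[] clamp(double[] raw) {
        double[] out = new double[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = Math.max(0.0, Math.min(1.0, raw[i]));
        }
        return out;
    }

    /**
     * Clamps the values and then divides them by their sum, so that they
     * add up to 1. If every clamped value is 0, the zeros are returned as is.
     */
    static double[] clampAndRescale(double[] raw) {
        double[] out = clamp(raw);
        double sum = 0.0;
        for (double v : out) {
            sum += v;
        }
        if (sum > 0.0) {
            for (int i = 0; i < out.length; i++) {
                out[i] /= sum;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] raw = {-0.2, 0.4, 1.3}; // hypothetical raw network output
        System.out.println(java.util.Arrays.toString(clamp(raw)));
    }
}
```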

6.2.4 Automated Data Extraction

Advances in information technology have produced a variety of tools to process information faster. One of those tools is web services. The INEGI BIE publishes a WebServices Descriptor. When this descriptor is used to generate a web services client, the classes in Appendix C are created. Those classes come in handy when automating the extraction, transformation, and loading of the information available in the BIE. However, due to the scope of the experiment and time constraints, for the present work it was necessary to download the data using the web GUI instead of constructing a web services client and executing an extract-transform-load process automatically.

6.3 Closing Remarks

Throughout the present work, the goal has been to explore the use of existing tools geared towards analytics. Neuro-fuzzy systems are the focus of this work because of their adaptability and learning capabilities. They present a good opportunity to analyze data using the way humans express quantities, and to use the learning capacity of neural networks to store information and use that information to adapt to their purposes. Given how flexible neuro-fuzzy systems are, there may well be a sea of applications waiting to be discovered. For the time being, an application of a neuro-fuzzy system to facilitate the classification of productivity has been presented.

Appendix A

Implementation


A.1 productivityNFC.properties

1 #Layer labels 4 needed.
2 nfc.layer.label.1=Input
3 nfc.layer.label.2=Fuzzification
4 nfc.layer.label.3=Rules
5 nfc.layer.label.4=Output
6
7 #Data Variables.
8 nfc.input_variables=value,wages,work_force,days_worked,hours_worked
9 nfc.output_variable=productivity
10
11 #Fuzzy Sets for each variable.
12 nfc.fz.value=low,medium,high
13 nfc.fz.wages=low,medium,high
14 nfc.fz.work_force=low,medium,high
15 nfc.fz.days_worked=low,medium,high
16 nfc.fz.hours_worked=low,medium,high
17 nfc.fz.productivity=low,medium,high
18
19 #Membership functions delimiters
20 nfc.mf.delims.value.low=14275406,14697580,305349556,327562504
21 nfc.mf.delims.value.medium=305349556,327562504,375477803,397690751
22 nfc.mf.delims.value.high=375477803,397690751,47016501,47438675
23
24 nfc.mf.delims.wages.low=13690175,15282811,26213287,27805923
25 nfc.mf.delims.wages.medium=26213287,27805923,27720643,29313279
26 nfc.mf.delims.wages.high=27720643,29313279,26431270,28023906
27
28 nfc.mf.delims.work_force.low=1896401,1945143,3097305,3146047
29 nfc.mf.delims.work_force.medium=3097305,3146047,3224960,3273702
30 nfc.mf.delims.work_force.high=3224960,3273702,2286344,2335086
31
32 nfc.mf.delims.days_worked.low=13,13,26,26
33 nfc.mf.delims.days_worked.medium=26,26,27,27
34 nfc.mf.delims.days_worked.high=27,27,39,39
35
36 nfc.mf.delims.hours_worked.low=448819,465301,611156,627638
37 nfc.mf.delims.hours_worked.medium=611156,627638,641690,658172
38 nfc.mf.delims.hours_worked.high=641690,658172,780679,797161
39
40 #Fuzzy rules of the form:
41 #IF <input variable> IS <fuzzy value> {AND <input variable> IS <fuzzy value> }*;
42 nfc.fuzzy_rules=\
43 IF value IS high AND wages IS high;\
44 IF value IS high AND wages IS medium;\
45 IF value IS high AND wages IS low;\
46 IF value IS medium AND wages IS high;\
47 IF value IS medium AND wages IS medium;\
48 IF value IS medium AND wages IS low;\
49 IF value IS low AND wages IS high;\
50 IF value IS low AND wages IS medium;\
51 IF value IS low AND wages IS low;\
52 \
53 IF value IS high AND work_force IS high;\
54 IF value IS high AND work_force IS medium;\
55 IF value IS high AND work_force IS low;\
56 IF value IS medium AND work_force IS high;\
57 IF value IS medium AND work_force IS medium;\
58 IF value IS medium AND work_force IS low;\
59 IF value IS low AND work_force IS high;\
60 IF value IS low AND work_force IS medium;\
61 IF value IS low AND work_force IS low;\
62 \
63 IF value IS high AND days_worked IS high;\
64 IF value IS high AND days_worked IS medium;\
65 IF value IS high AND days_worked IS low;\
66 IF value IS medium AND days_worked IS high;\
67 IF value IS medium AND days_worked IS medium;\
68 IF value IS medium AND days_worked IS low;\
69 IF value IS low AND days_worked IS high;\
70 IF value IS low AND days_worked IS medium;\
71 IF value IS low AND days_worked IS low;\
72 \
73 IF value IS high AND hours_worked IS high;\
74 IF value IS high AND hours_worked IS medium;\
75 IF value IS high AND hours_worked IS low;\
76 IF value IS medium AND hours_worked IS high;\
77 IF value IS medium AND hours_worked IS medium;\
78 IF value IS medium AND hours_worked IS low;\
79 IF value IS low AND hours_worked IS high;\
80 IF value IS low AND hours_worked IS medium;\
81 IF value IS low AND hours_worked IS low;


A.2 NeuroFuzzyClassifier.java

1 package edu.mxminer;
2
3 import java.io.StringReader;
4 import java.util.Properties;
5
6 import org.apache.commons.csv.CSVParser;
7 import org.apache.commons.csv.CSVStrategy;
8
9 import org.neuroph.core.Layer;
10 import org.neuroph.core.NeuralNetwork;
11 import org.neuroph.core.Neuron;
12 import org.neuroph.core.input.WeightedSum;
13 import org.neuroph.core.transfer.Linear;
14 import org.neuroph.core.transfer.Trapezoid;
15 import org.neuroph.nnet.learning.LMS;
16 import org.neuroph.util.ConnectionFactory;
17 import org.neuroph.util.LayerFactory;
18 import org.neuroph.util.NeuralNetworkFactory;
19 import org.neuroph.util.NeuralNetworkType;
20 import org.neuroph.util.NeuronFactory;
21 import org.neuroph.util.NeuronProperties;
22 import org.neuroph.util.TransferFunctionType;
23 /**
24  * This class constructs a Neuro-Fuzzy Classifier based on the properties defined
25  * in a file following the Java properties file standard. An example can be found
26  * in the file productivityNFC.properties
27  *
28  * @author [email protected]
29  */
30 public class NeuroFuzzyClassifier extends NeuralNetwork <LMS>{
31     private static final long serialVersionUID = 1L;
32
33     /**
34      * CSV Strategies for parsing CSV Strings
35      */
36     private static final CSVStrategy CSVS_DEFAULT = new CSVStrategy(',','"','#');
37     private static final CSVStrategy CSVS_RULES = new CSVStrategy(';','"','#');
38     private static final CSVStrategy CSVS_RULE_PARSER = new CSVStrategy(' ','"','#');
39
40     /**
41      * Builds up the NeuralNetwork containing the Neuro-Fuzzy system to be used
42      * as a neuro-fuzzy classifier based on the properties provided.
43      * @param props properties containing the specifications for the neuro-fuzzy
44      * system.
45      */
46     public void loadFromProperties(Properties props){
47         int i, j;//iterator variables.
48         try{
49             CSVStrategy csvs = NeuroFuzzyClassifier.CSVS_DEFAULT;
50
51             // set network type
52             this.setNetworkType(NeuralNetworkType.MULTI_LAYER_PERCEPTRON);
53
54             // Create Input Layer
55             NeuronProperties neuronProperties = new NeuronProperties();
56
57             String[] inputVariables = (new CSVParser(new StringReader(
58                     props.getProperty("nfc.input_variables")),csvs)).getLine();
59
60             Layer inputLayer = LayerFactory.createLayer(inputVariables.length, neuronProperties);
61             inputLayer.setLabel(props.getProperty("nfc.layer.label.1"));
62
63             for(i = 0 ; i < inputVariables.length; i++){
64                 inputLayer.getNeuronAt(i).setLabel(inputVariables[i]);
65             }
66             this.addLayer(inputLayer);
67
68             //Create Fuzzification Layer
69             neuronProperties.setProperty("transferFunction",TransferFunctionType.TRAPEZOID);
70             Layer fuzzyLayer = LayerFactory.createLayer(0,neuronProperties);
71             fuzzyLayer.setLabel(props.getProperty("nfc.layer.label.2"));
72
73             String[] fuzzySet, mfDelims;
74             double[] delims = new double[4];
75             Neuron[] inputNeurons = inputLayer.getNeurons();
76             Neuron fuzzyNeuron;
77             Trapezoid mf;
78
79             for(i = 0; i < inputNeurons.length; i++){
80                 fuzzySet = (new CSVParser(new StringReader(
81                         props.getProperty("nfc.fz."+inputNeurons[i].getLabel())),csvs)).getLine();
82                 for(String fuzzyValue : fuzzySet){
83                     mfDelims = (new CSVParser(new StringReader(
84                             props.getProperty("nfc.mf.delims."+inputVariables[i]+"."+fuzzyValue))
85                             ,csvs)).getLine();
86                     for(j = 0; j < delims.length; j++){
87                         delims[j] = Double.parseDouble(mfDelims[j]);
88                     }
89                     fuzzyNeuron = NeuronFactory.createNeuron(neuronProperties);
90                     fuzzyNeuron.setLabel(inputNeurons[i].getLabel()+"."+fuzzyValue);
91
92                     mf = (Trapezoid) fuzzyNeuron.getTransferFunction();
93                     mf.setLeftLow(delims[0]);
94                     mf.setLeftHigh(delims[1]);
95                     mf.setRightLow(delims[2]);
96                     mf.setRightHigh(delims[3]);
97                     fuzzyLayer.addNeuron(fuzzyNeuron);
98
99                     ConnectionFactory.createConnection(inputNeurons[i], fuzzyNeuron, 1);
100                 }
101             }
102             this.addLayer(fuzzyLayer);
103
104             //Create Rules Layer
105             NeuronProperties ruleNeuronProperties = new NeuronProperties(Neuron.class,
106                     WeightedSum.class, Linear.class);
107             Layer rulesLayer = LayerFactory.createLayer(0,ruleNeuronProperties);
108             rulesLayer.setLabel(props.getProperty("nfc.layer.label.3"));
109
110             CSVStrategy rcsvs = NeuroFuzzyClassifier.CSVS_RULES;
111             CSVStrategy rpcsvs = NeuroFuzzyClassifier.CSVS_RULE_PARSER;
112
113             //Rule tokens length must be 4x of the form:
114             //IF <input variable> IS <fuzzy value> {AND <input variable> IS <fuzzy value> }*;
115
116             String [] rules = (new CSVParser(new StringReader(
117                     props.getProperty("nfc.fuzzy_rules")),rcsvs)).getLine();
118             String [] ruleTokens;
119             String antecedentLabel;
120             Neuron antecedent, ruleNeuron;
121             Neuron fuzzyNeurons[] = fuzzyLayer.getNeurons();
122             boolean found;
123
124             for(String rule : rules){
125                 if(rule.length() == 0) continue;
126
127                 ruleTokens =(new CSVParser(new StringReader(rule),rpcsvs)).getLine();
128                 if(ruleTokens.length % 4 != 0){//The rule does not follow the required syntax
129                     continue;
130                 }
131
132                 ruleNeuron = NeuronFactory.createNeuron(ruleNeuronProperties);
133                 ruleNeuron.setLabel(rule);
134
135                 for(i = 0; i < (ruleTokens.length); i = i + 4 ){
136                     //Each neuron has a label <input variable>.<fuzzy value>
137                     antecedentLabel = ruleTokens[i+1]+"."+ruleTokens[i+3];
138                     j = 0;
139                     found = false;
140                     antecedent = null;
141                     while(j < fuzzyNeurons.length && !found){
142                         if(antecedentLabel.equals(fuzzyNeurons[j].getLabel())){
143                             found = true;
144                             antecedent = fuzzyNeurons[j];
145                         }
146                         j++;
147                     }
148                     if(found){
149                         ConnectionFactory.createConnection(antecedent, ruleNeuron, 1);
150                     }else{
151                         System.out.println("Orphan rule neuron. Check rules.");
152                     }
153                 }
154                 rulesLayer.addNeuron(ruleNeuron);
155             }
156             this.addLayer(rulesLayer);
157
158             // create the output layer
159             neuronProperties = new NeuronProperties();
160             neuronProperties.setProperty("transferFunction", TransferFunctionType.LINEAR);
161             Layer outputLayer = LayerFactory.createLayer(0, neuronProperties);
162             outputLayer.setLabel(props.getProperty("nfc.layer.label.4"));
163
164             fuzzySet = (new CSVParser(new StringReader(
165                     props.getProperty("nfc.fz."+props.getProperty("nfc.output_variable"))),
166                     csvs)).getLine();
167             for(String fuzzyValue : fuzzySet){
168                 fuzzyNeuron = NeuronFactory.createNeuron(neuronProperties);
169                 fuzzyNeuron.setLabel(props.getProperty("nfc.output_variable")+"."+fuzzyValue);
170                 outputLayer.addNeuron(fuzzyNeuron);
171             }
172             this.addLayer(outputLayer);
173
174             ConnectionFactory.fullConnect(rulesLayer, outputLayer);
175             NeuralNetworkFactory.setDefaultIO(this);
176             this.setLearningRule(new LMS());
177         }catch(Exception e){
178             e.printStackTrace();
179         }
180     }
181 }


A.3 mxminer.properties

1 #Training Data Set
2 mxminer.data.src.csv.training_set.filename=Training_Set.csv
3 #Data Source
4 mxminer.data.src.csv.filename=BIE_Manufacturing.csv
5
6 #Messages
7 mxminer.msg.info.start=Begin Execution.
8 mxminer.msg.info.finished=Finished execution.
9 mxminer.msg.info.training=Training classifier.
10 mxminer.msg.info.done_training=Done Training.
11 mxminer.msg.info.classifying=Classifying data set.
12 mxminer.msg.info.done_classifying=Done classifying data set.


A.4 MXMiner.java

package edu.mxminer;

import java.io.FileInputStream;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStream;
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Properties;

import org.apache.commons.csv.CSVParser;
import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;
import org.neuroph.nnet.learning.*;

/**
 * This class instantiates a NeuroFuzzyClassifier and trains it with the file
 * named by mxminer.data.src.csv.training_set.filename. It then classifies the
 * data in the file named by mxminer.data.src.csv.filename.
 *
 * @author Gustavo Becerra Gavino
 */
public class MXMiner {
    /**
     * Properties filename.
     */
    private static final String PROPERTIES_FILE_NAME = "mxminer.properties";

    /**
     * Application Properties.
     */
    private static final Properties PROPERTIES;

    /**
     * Static initialization.
     */
    static {
        PROPERTIES = new Properties();
        // try-with-resources closes the stream even if load() fails.
        try (InputStream in = new FileInputStream(MXMiner.PROPERTIES_FILE_NAME)) {
            PROPERTIES.load(in);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /** Prints one record as "date,inputs...,outputs...". */
    private static void printIO(String date, double[] input, double[] output) {
        StringBuilder sb = new StringBuilder();
        sb.append(date).append(",");
        sb.append(arrayToStringBuilder(input));
        sb.append(",").append(arrayToStringBuilder(output));
        System.out.println(sb.toString());
    }

    /** Joins the array elements with commas; null or empty yields "". */
    private static StringBuilder arrayToStringBuilder(double[] array) {
        StringBuilder sb = new StringBuilder();
        if (array != null && array.length > 0) {
            for (int i = 0; i < array.length - 1; i++) {
                sb.append(array[i]).append(",");
            }
            sb.append(array[array.length - 1]);
        }
        return sb;
    }

    /** Rounds value to the given number of decimal places, half up. */
    public static double round(double value, int places) {
        if (places < 0) {
            throw new IllegalArgumentException("places must not be negative");
        }
        // valueOf uses the decimal text of the double, not its binary expansion.
        BigDecimal bd = BigDecimal.valueOf(value);
        bd = bd.setScale(places, RoundingMode.HALF_UP);
        return bd.doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.start"));

        try {
            int i;
            String date;
            NFCFactory factory = new NFCFactory();
            NeuroFuzzyClassifier pnfc = factory.createProductivityNFC();

            double[] input = new double[pnfc.getInputNeurons().length];
            double[] output = new double[pnfc.getOutputNeurons().length];

            // Read the training set: date, inputs, then desired outputs.
            try (FileReader reader = new FileReader(
                    MXMiner.PROPERTIES.getProperty("mxminer.data.src.csv.training_set.filename"))) {
                CSVParser csvp = new CSVParser(reader);
                DataSet trainingSet = new DataSet(input.length, output.length);

                System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.training"));

                String[] pLine = csvp.getLine();
                while (pLine != null) {
                    date = pLine[0];
                    for (i = 1; i < pLine.length - output.length; i++) {
                        input[i - 1] = Double.parseDouble(pLine[i]);
                    }

                    for (i = input.length + 1; i < pLine.length; i++) {
                        output[i - (input.length + 1)] = Double.parseDouble(pLine[i]);
                    }

                    MXMiner.printIO(date, input, output);

                    // The buffers are reused on every iteration, so each row
                    // receives its own copy instead of a shared reference.
                    trainingSet.addRow(new DataSetRow(input.clone(), output.clone()));
                    pLine = csvp.getLine();
                }
                SigmoidDeltaRule lr = new SigmoidDeltaRule();
                lr.setMaxError(0.001);
                lr.setMaxIterations(1000000);
                pnfc.setLearningRule(lr);
                pnfc.learn(trainingSet);
            }

            System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.done_training"));

            System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.classifying"));
            try (FileReader reader = new FileReader(
                    MXMiner.PROPERTIES.getProperty("mxminer.data.src.csv.filename"))) {
                CSVParser csvp = new CSVParser(reader);
                String[] pLine = csvp.getLine(); // First line contains the headers.
                pLine = csvp.getLine();

                while (pLine != null) {
                    date = pLine[0];
                    for (i = 1; i < pLine.length; i++) {
                        input[i - 1] = Double.parseDouble(pLine[i]);
                    }

                    pnfc.setInput(input);
                    pnfc.calculate();
                    output = pnfc.getOutput();
                    // Round every output to zero decimal places before printing.
                    for (i = 0; i < output.length; i++) {
                        output[i] = round(output[i], 0);
                    }
                    MXMiner.printIO(date, input, output);

                    pLine = csvp.getLine();
                }
            }

            System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.done_classifying"));

        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println(MXMiner.PROPERTIES.getProperty("mxminer.msg.info.finished"));
    }
}
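The rounding applied to the classifier outputs can be tried on its own. The following is a minimal sketch, assuming a standalone class named RoundingSketch (a hypothetical name, not part of the thesis sources); it mirrors the round helper above using BigDecimal.valueOf and RoundingMode.HALF_UP.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical standalone class; it is not part of the thesis sources.
final class RoundingSketch {

    // Rounds value to the given number of decimal places, half up
    // (ties are rounded away from zero).
    static double round(double value, int places) {
        if (places < 0) {
            throw new IllegalArgumentException("places must not be negative");
        }
        return BigDecimal.valueOf(value)
                .setScale(places, RoundingMode.HALF_UP)
                .doubleValue();
    }

    public static void main(String[] args) {
        // A few sample values, rounded the same way as the classifier outputs.
        System.out.println(round(0.49, 0));
        System.out.println(round(0.5, 0));
        System.out.println(round(1.74, 1));
    }
}
```

With places set to 0, as in the classification loop, every output is reduced to a whole number, which is why the sample run in B.4 lists values such as -1.0, 2.0 or 4.0 rather than fractional values.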

Appendix B

Data Sets

B.1 BIE_c20131110114800.txt

2007
2013
,222030,218898,228990,227946,225162,
a
ap
v
Todo
False
False

B.2 BIE_Manufacturing.csv

"Period","Value","Wages","Work Force","Days worked","Hours worked"
"2007/01",265871675,25172106,3291091,28,656750
"2007/02",263998702,24973475,3278289,26,619420
"2007/03",300636026,26716185,3286320,28,673400
"2007/04",281769622,26054783,3281394,27,629605
"2007/05",305278703,26737916,3296105,28,671114
"2007/06",305562113,26390991,3275204,28,659151
"2007/07",288834753,25376873,3277529,28,654426
"2007/08",308778193,26892918,3299133,28,688920
"2007/09",292154844,25864843,3300073,27,654324
"2007/10",310607965,27033069,3308320,28,685089
"2007/11",300047312,26471132,3299723,27,659540
"2007/12",277996670,33090879,3277176,28,609577
"2008/01",300751930,26293068,3305799,27,665606
"2008/02",298032052,26186693,3310715,25,641760
"2008/03",306349057,26957981,3307268,27,642214
"2008/04",325122721,27562404,3298995,26,668173
"2008/05",327394287,27230048,3299519,27,667678
"2008/06",328782141,27082204,3269197,26,655125
"2008/07",326554448,26914270,3264595,27,674329
"2008/08",320795490,26615340,3260806,27,657112
"2008/09",314156993,26822137,3231608,26,646811
"2008/10",343472740,27714549,3201666,27,673950
"2008/11",310335067,26066483,3159720,26,614950
"2008/12",290632233,32915736,3093774,26,572181
"2009/01",268634708,25244248,3046084,27,592693
"2009/02",271684225,24486493,3001100,25,557060
"2009/03",299988787,25789790,2965733,26,587142
"2009/04",281915496,25621929,2943663,26,574345
"2009/05",279742797,24796980,2934755,27,566775
"2009/06",287433371,25057454,2926165,26,582412
"2009/07",298041996,25485422,2920772,27,603720
"2009/08",302537145,24784922,2934465,26,586886
"2009/09",310531110,25431704,2959633,26,594593
"2009/10",333414297,26056372,2981907,26,623113
"2009/11",320418549,25301422,2998887,26,589032
"2009/12",318664456,33362030,2987857,26,575759
"2010/01",309015381,25055729,2966063,26,585225
"2010/02",313124956,24852226,2979297,24,564321
"2010/03",345307016,27335872,3002055,27,617803
"2010/04",326811124,26525066,3033240,26,607579
"2010/05",338645819,26293934,3073525,26,612984
"2010/06",352534600,26896039,3068929,26,626006
"2010/07",339512327,26886968,3095700,27,642604
"2010/08",355259052,26909978,3114016,27,634300
"2010/09",348683606,27405315,3125982,26,637040
"2010/10",352331814,27126456,3139467,27,642096
"2010/11",350903744,27028506,3129807,26,622654
"2010/12",344398267,34855546,3112302,26,604328
"2011/01",345958741,26728127,3102121,26,619655
"2011/02",341518120,26576823,3120594,24,592717
"2011/03",391026808,29144214,3150472,26,655261
"2011/04",357781166,27731620,3162669,25,614049
"2011/05",376677787,27743063,3166547,27,638615
"2011/06",381902897,28734739,3172268,26,653228
"2011/07",373437027,27676239,3167570,27,638125
"2011/08",393696126,28698604,3171124,27,655763
"2011/09",396569509,28802392,3184327,26,648700
"2011/10",412777215,28319083,3181799,27,648098
"2011/11",411275503,28635623,3179145,26,635245
"2011/12",396355597,36385165,3158912,26,610546
"2012/01",406022459,28154321,3171539,26,639749
"2012/02",400622565,28683264,3180790,25,622079
"2012/03",432107430,30517276,3200542,26,664776
"2012/04",401091778,29262928,3217021,26,624070
"2012/05",432941650,30176955,3237353,26,667178
"2012/06",441333819,30070146,3241250,26,659821
"2012/07",423222610,29820748,3236974,27,657387
"2012/08",429542249,30580464,3241400,27,682338
"2012/09",406021015,29387218,3253697,25,645822
"2012/10",434966750,29942844,3257610,27,679645
"2012/11",428907802,30455156,3257521,26,659425
"2012/12",398068950,37227588,3245709,26,603840
"2013/01",416846561,30336538,3238170,26,658433
"2013/02",398064498,29330063,3249984,24,617761
"2013/03",415569568,30867138,3264899,26,640296
"2013/04",423175878,30751826,3281684,26,660947
"2013/05",430993977,31792104,3291010,26,679211
"2013/06",441702297,30945010,3278432,26,650930
"2013/07",436312913,31636887,3291829,27,675163
"2013/08",447308814,31560790,3299679,26,685197
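Each data row of BIE_Manufacturing.csv carries a quoted period followed by five numeric columns. The sketch below shows one way to read such a row in plain Java. The class ManufacturingRowSketch and the method valuePerHour are assumptions introduced for illustration: the ratio of the Value column to the Hours worked column is only a simple single-factor figure in the file's own units, and it is not the productivity measure used in this thesis.

```java
// Hypothetical helper; not part of the thesis sources.
final class ManufacturingRowSketch {
    final String period;
    final double value;
    final double wages;
    final double workForce;
    final double daysWorked;
    final double hoursWorked;

    private ManufacturingRowSketch(String period, double[] v) {
        this.period = period;
        this.value = v[0];
        this.wages = v[1];
        this.workForce = v[2];
        this.daysWorked = v[3];
        this.hoursWorked = v[4];
    }

    // Parses one data row (not the header row) of the CSV listing.
    static ManufacturingRowSketch parse(String line) {
        String[] fields = line.split(",");
        if (fields.length != 6) {
            throw new IllegalArgumentException("expected 6 fields, got " + fields.length);
        }
        double[] v = new double[5];
        for (int i = 0; i < v.length; i++) {
            v[i] = Double.parseDouble(fields[i + 1].trim());
        }
        // The period is the only quoted field; strip the quotes.
        String period = fields[0].trim().replace("\"", "");
        return new ManufacturingRowSketch(period, v);
    }

    // Illustrative ratio only (assumption): Value divided by Hours worked.
    double valuePerHour() {
        return value / hoursWorked;
    }

    public static void main(String[] args) {
        ManufacturingRowSketch row =
                parse("\"2007/01\",265871675,25172106,3291091,28,656750");
        System.out.println(row.period + " value per hour worked: " + row.valuePerHour());
    }
}
```

The header row has to be skipped before calling parse, as MXMiner.main does by discarding the first line of the file.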

B.3 Training_Set.csv

2007/02,263998702,24973475,3278289,26,619420,1,0,0
2007/03,310607965,27033069,3308320,28,685089,1,0,0
2007/08,308778193,26892918,3299133,28,688920,1,0,0
2007/10,310607965,27033069,3308320,28,685089,1,0,0
2008/02,298032052,26186693,3310715,25,641760,1,0,0
2009/02,271684225,24486493,3001100,25,557060,0,.90,.10
2009/07,298041996,25485422,2920772,27,603720,0,.20,.80
2010/02,313124956,24852226,2979297,24,564321,0,1,0
2010/05,338645819,26293934,3073525,26,612984,.30,.70,0
2011/03,391026808,29144214,3150472,26,655261,.60,.40,0
2011/10,412777215,28319083,3181799,27,648098,0,0,1
2012/01,406022459,28154321,3171539,26,639749,0,0,1
2012/06,441333819,30070146,3241250,26,659821,0,0,1
2013/06,441702297,30945010,3278432,26,650930,0,0,1
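Each line of Training_Set.csv holds a period, five input values, and three target values. The sketch below shows one way to split such a line, mirroring the index arithmetic in MXMiner.main. The class TrainingRowSketch and the constants INPUTS and OUTPUTS are assumptions made for illustration; the thesis code takes both sizes from the classifier itself.

```java
// Hypothetical helper; the names below are illustrative, not thesis code.
final class TrainingRowSketch {
    static final int INPUTS = 5;   // Value, Wages, Work Force, Days worked, Hours worked
    static final int OUTPUTS = 3;  // three target values per row

    final String period;
    final double[] input;
    final double[] target;

    private TrainingRowSketch(String period, double[] input, double[] target) {
        this.period = period;
        this.input = input;
        this.target = target;
    }

    // Splits "period,in1..in5,out1..out3" into its three parts.
    static TrainingRowSketch parse(String line) {
        String[] fields = line.split(",");
        if (fields.length != 1 + INPUTS + OUTPUTS) {
            throw new IllegalArgumentException(
                    "expected " + (1 + INPUTS + OUTPUTS) + " fields, got " + fields.length);
        }
        // Fresh arrays per row, so rows never share storage.
        double[] in = new double[INPUTS];
        double[] out = new double[OUTPUTS];
        for (int i = 0; i < INPUTS; i++) {
            in[i] = Double.parseDouble(fields[1 + i].trim());
        }
        for (int i = 0; i < OUTPUTS; i++) {
            // Values such as ".90" are accepted by Double.parseDouble.
            out[i] = Double.parseDouble(fields[1 + INPUTS + i].trim());
        }
        return new TrainingRowSketch(fields[0].trim(), in, out);
    }

    public static void main(String[] args) {
        TrainingRowSketch row =
                parse("2009/02,271684225,24486493,3001100,25,557060,0,.90,.10");
        System.out.println(row.period + " -> "
                + java.util.Arrays.toString(row.input) + " / "
                + java.util.Arrays.toString(row.target));
    }
}
```

Allocating new arrays for every row avoids the situation where several rows of a data set end up pointing at the same reused buffer.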


B.4 Sample Run Results

Begin Execution.
Training classifier.
2007/02,2.63998702E8,2.4973475E7,3278289.0,26.0,619420.0,1.0,0.0,0.0
2007/03,3.10607965E8,2.7033069E7,3308320.0,28.0,685089.0,1.0,0.0,0.0
2007/08,3.08778193E8,2.6892918E7,3299133.0,28.0,688920.0,1.0,0.0,0.0
2007/10,3.10607965E8,2.7033069E7,3308320.0,28.0,685089.0,1.0,0.0,0.0
2008/02,2.98032052E8,2.6186693E7,3310715.0,25.0,641760.0,1.0,0.0,0.0
2009/02,2.71684225E8,2.4486493E7,3001100.0,25.0,557060.0,0.0,0.9,0.1
2009/07,2.98041996E8,2.5485422E7,2920772.0,27.0,603720.0,0.0,0.2,0.8
2010/02,3.13124956E8,2.4852226E7,2979297.0,24.0,564321.0,0.0,1.0,0.0
2010/05,3.38645819E8,2.6293934E7,3073525.0,26.0,612984.0,0.3,0.7,0.0
2011/03,3.91026808E8,2.9144214E7,3150472.0,26.0,655261.0,0.6,0.4,0.0
2011/10,4.12777215E8,2.8319083E7,3181799.0,27.0,648098.0,0.0,0.0,1.0
2012/01,4.06022459E8,2.8154321E7,3171539.0,26.0,639749.0,0.0,0.0,1.0
2012/06,4.41333819E8,3.0070146E7,3241250.0,26.0,659821.0,0.0,0.0,1.0
2013/06,4.41702297E8,3.094501E7,3278432.0,26.0,650930.0,0.0,0.0,1.0
Done Training.
Classifying data set.
2007/01,2.65871675E8,2.5172106E7,3291091.0,28.0,656750.0,-1.0,2.0,2.0
2007/02,2.63998702E8,2.4973475E7,3278289.0,26.0,619420.0,1.0,2.0,2.0
2007/03,3.00636026E8,2.6716185E7,3286320.0,28.0,673400.0,-1.0,2.0,2.0
2007/04,2.81769622E8,2.6054783E7,3281394.0,27.0,629605.0,0.0,3.0,2.0
2007/05,3.05278703E8,2.6737916E7,3296105.0,28.0,671114.0,-1.0,2.0,2.0
2007/06,3.05562113E8,2.6390991E7,3275204.0,28.0,659151.0,-1.0,2.0,2.0
2007/07,2.88834753E8,2.5376873E7,3277529.0,28.0,654426.0,-1.0,2.0,2.0
2007/08,3.08778193E8,2.6892918E7,3299133.0,28.0,688920.0,-1.0,2.0,2.0
2007/09,2.92154844E8,2.5864843E7,3300073.0,27.0,654324.0,-1.0,2.0,2.0
2007/10,3.10607965E8,2.7033069E7,3308320.0,28.0,685089.0,-1.0,2.0,2.0
2007/11,3.00047312E8,2.6471132E7,3299723.0,27.0,659540.0,-1.0,2.0,2.0
2007/12,2.7799667E8,3.3090879E7,3277176.0,28.0,609577.0,0.0,2.0,0.0
2008/01,3.0075193E8,2.6293068E7,3305799.0,27.0,665606.0,-1.0,2.0,2.0
2008/02,2.98032052E8,2.6186693E7,3310715.0,25.0,641760.0,0.0,1.0,2.0
2008/03,3.06349057E8,2.6957981E7,3307268.0,27.0,642214.0,0.0,3.0,2.0
2008/04,3.25122721E8,2.7562404E7,3298995.0,26.0,668173.0,1.0,2.0,3.0
2008/05,3.27394287E8,2.7230048E7,3299519.0,27.0,667678.0,0.0,3.0,3.0
2008/06,3.28782141E8,2.7082204E7,3269197.0,26.0,655125.0,1.0,2.0,3.0
2008/07,3.26554448E8,2.691427E7,3264595.0,27.0,674329.0,1.0,4.0,3.0
2008/08,3.2079549E8,2.661534E7,3260806.0,27.0,657112.0,0.0,3.0,3.0
2008/09,3.14156993E8,2.6822137E7,3231608.0,26.0,646811.0,1.0,2.0,3.0
2008/10,3.4347274E8,2.7714549E7,3201666.0,27.0,673950.0,1.0,4.0,2.0
2008/11,3.10335067E8,2.6066483E7,3159720.0,26.0,614950.0,1.0,2.0,2.0
2008/12,2.90632233E8,3.2915736E7,3093774.0,26.0,572181.0,0.0,1.0,0.0
2009/01,2.68634708E8,2.5244248E7,3046084.0,27.0,592693.0,0.0,3.0,1.0
2009/02,2.71684225E8,2.4486493E7,3001100.0,25.0,557060.0,0.0,1.0,1.0
2009/03,2.99988787E8,2.578979E7,2965733.0,26.0,587142.0,1.0,2.0,1.0
2009/04,2.81915496E8,2.5621929E7,2943663.0,26.0,574345.0,1.0,2.0,1.0
2009/05,2.79742797E8,2.479698E7,2934755.0,27.0,566775.0,0.0,3.0,1.0
2009/06,2.87433371E8,2.5057454E7,2926165.0,26.0,582412.0,1.0,2.0,1.0
2009/07,2.98041996E8,2.5485422E7,2920772.0,27.0,603720.0,0.0,3.0,1.0
2009/08,3.02537145E8,2.4784922E7,2934465.0,26.0,586886.0,1.0,2.0,1.0
2009/09,3.1053111E8,2.5431704E7,2959633.0,26.0,594593.0,1.0,2.0,1.0
2009/10,3.33414297E8,2.6056372E7,2981907.0,26.0,623113.0,2.0,2.0,1.0
2009/11,3.20418549E8,2.5301422E7,2998887.0,26.0,589032.0,1.0,2.0,2.0
2009/12,3.18664456E8,3.336203E7,2987857.0,26.0,575759.0,1.0,1.0,1.0
2010/01,3.09015381E8,2.5055729E7,2966063.0,26.0,585225.0,1.0,2.0,1.0
2010/02,3.13124956E8,2.4852226E7,2979297.0,24.0,564321.0,0.0,2.0,1.0
2010/03,3.45307016E8,2.7335872E7,3002055.0,27.0,617803.0,2.0,4.0,1.0
2010/04,3.26811124E8,2.6525066E7,3033240.0,26.0,607579.0,2.0,3.0,2.0
2010/05,3.38645819E8,2.6293934E7,3073525.0,26.0,612984.0,2.0,2.0,1.0
2010/06,3.525346E8,2.6896039E7,3068929.0,26.0,626006.0,2.0,3.0,1.0
2010/07,3.39512327E8,2.6886968E7,3095700.0,27.0,642604.0,1.0,3.0,1.0
2010/08,3.55259052E8,2.6909978E7,3114016.0,27.0,634300.0,1.0,4.0,1.0
2010/09,3.48683606E8,2.7405315E7,3125982.0,26.0,637040.0,1.0,3.0,2.0
2010/10,3.52331814E8,2.7126456E7,3139467.0,27.0,642096.0,1.0,4.0,1.0
2010/11,3.50903744E8,2.7028506E7,3129807.0,26.0,622654.0,2.0,3.0,1.0
2010/12,3.44398267E8,3.4855546E7,3112302.0,26.0,604328.0,2.0,2.0,0.0
2011/01,3.45958741E8,2.6728127E7,3102121.0,26.0,619655.0,2.0,3.0,1.0
2011/02,3.4151812E8,2.6576823E7,3120594.0,24.0,592717.0,1.0,2.0,1.0
2011/03,3.91026808E8,2.9144214E7,3150472.0,26.0,655261.0,0.0,3.0,1.0
2011/04,3.57781166E8,2.773162E7,3162669.0,25.0,614049.0,2.0,3.0,1.0
2011/05,3.76677787E8,2.7743063E7,3166547.0,27.0,638615.0,1.0,4.0,2.0
2011/06,3.81902897E8,2.8734739E7,3172268.0,26.0,653228.0,1.0,2.0,1.0
2011/07,3.73437027E8,2.7676239E7,3167570.0,27.0,638125.0,1.0,4.0,2.0
2011/08,3.93696126E8,2.8698604E7,3171124.0,27.0,655763.0,0.0,4.0,1.0
2011/09,3.96569509E8,2.8802392E7,3184327.0,26.0,648700.0,1.0,3.0,1.0
2011/10,4.12777215E8,2.8319083E7,3181799.0,27.0,648098.0,0.0,2.0,1.0
2011/11,4.11275503E8,2.8635623E7,3179145.0,26.0,635245.0,0.0,1.0,1.0
2011/12,3.96355597E8,3.6385165E7,3158912.0,26.0,610546.0,2.0,3.0,1.0
2012/01,4.06022459E8,2.8154321E7,3171539.0,26.0,639749.0,0.0,1.0,1.0
2012/02,4.00622565E8,2.8683264E7,3180790.0,25.0,622079.0,0.0,1.0,0.0
2012/03,4.3210743E8,3.0517276E7,3200542.0,26.0,664776.0,0.0,0.0,2.0
2012/04,4.01091778E8,2.9262928E7,3217021.0,26.0,624070.0,1.0,2.0,0.0
2012/05,4.3294165E8,3.0176955E7,3237353.0,26.0,667178.0,0.0,0.0,2.0
2012/06,4.41333819E8,3.0070146E7,3241250.0,26.0,659821.0,0.0,0.0,2.0
2012/07,4.2322261E8,2.9820748E7,3236974.0,27.0,657387.0,-1.0,1.0,1.0
2012/08,4.29542249E8,3.0580464E7,3241400.0,27.0,682338.0,0.0,1.0,2.0
2012/09,4.06021015E8,2.9387218E7,3253697.0,25.0,645822.0,0.0,0.0,1.0
2012/10,4.3496675E8,2.9942844E7,3257610.0,27.0,679645.0,0.0,1.0,2.0
2012/11,4.28907802E8,3.0455156E7,3257521.0,26.0,659425.0,0.0,0.0,2.0
2012/12,3.9806895E8,3.7227588E7,3245709.0,26.0,603840.0,1.0,1.0,1.0
2013/01,4.16846561E8,3.0336538E7,3238170.0,26.0,658433.0,0.0,0.0,2.0
2013/02,3.98064498E8,2.9330063E7,3249984.0,24.0,617761.0,1.0,1.0,1.0
2013/03,4.15569568E8,3.0867138E7,3264899.0,26.0,640296.0,1.0,1.0,1.0
2013/04,4.23175878E8,3.0751826E7,3281684.0,26.0,660947.0,0.0,0.0,1.0
2013/05,4.30993977E8,3.1792104E7,3291010.0,26.0,679211.0,0.0,0.0,1.0
2013/06,4.41702297E8,3.094501E7,3278432.0,26.0,650930.0,0.0,0.0,1.0
2013/07,4.36312913E8,3.1636887E7,3291829.0,27.0,675163.0,0.0,1.0,1.0
2013/08,4.47308814E8,3.156079E7,3299679.0,26.0,685197.0,0.0,1.0,1.0
Done classifying data set.
Finished execution.

Appendix C

Generated Web Services Client

C.1 Package mx.org.inegi.sistemas.bie

Table C.1: Interface Summary

Interface                       Description

WebServiceBackFillAjaxHttpGet   This class was generated by Apache CXF 2.7.6 2013-08-24T12:44:14.182-05:00
WebServiceBackFillAjaxHttpPost  This class was generated by Apache CXF 2.7.6 2013-08-24T12:44:14.106-05:00
WebServiceBackFillAjaxSoap      This class was generated by Apache CXF 2.7.6 2013-08-24T12:44:14.157-05:00


Table C.2: Class Summary

Class                              Description

AddAllChilds                       Java class for anonymous complex type.
AddAllChildsResponse               Java class for anonymous complex type.
AgregaNodos                        Java class for anonymous complex type.
AgregaNodosBuscador                Java class for anonymous complex type.
AgregaNodosBuscadorResponse        Java class for anonymous complex type.
AgregaNodosResponse                Java class for anonymous complex type.
AgregaSeriesCar                    Java class for anonymous complex type.
AgregaSeriesCarResponse            Java class for anonymous complex type.
ArrayOfSerieValores2005Periodo     Java class for ArrayOfSerieValores2005Periodo complex type.
ArrayOfString                      Java class for ArrayOfString complex type.
BuildPanelContains                 Java class for anonymous complex type.
BuildPanelContainsResponse         Java class for anonymous complex type.
CheckCarSeries                     Java class for anonymous complex type.
CheckCarSeriesResponse             Java class for anonymous complex type.
ConsultaCuadro                     Java class for anonymous complex type.
ConsultaCuadroResponse             Java class for anonymous complex type.
DatosXSerie                        Java class for anonymous complex type.
DatosXSerieResponse                Java class for anonymous complex type.
ExportaIQY                         Java class for anonymous complex type.
ExportaIQYResponse                 Java class for anonymous complex type.
ExportaMetadato                    Java class for anonymous complex type.
ExportaMetadatoResponse            Java class for anonymous complex type.
ExportaSeries                      Java class for anonymous complex type.
ExportaSeriesResponse              Java class for anonymous complex type.
ExtraeMetaDato                     Java class for anonymous complex type.
ExtraeMetaDatoResponse             Java class for anonymous complex type.
FillYearsControls                  Java class for anonymous complex type.
FillYearsControlsResponse          Java class for anonymous complex type.
FrecuenciasActivas                 Java class for anonymous complex type.
FrecuenciasActivasResponse         Java class for anonymous complex type.
GuardaSeriesSelecciondas           Java class for anonymous complex type.
GuardaSeriesSelecciondasResponse   Java class for anonymous complex type.
InterSeriesSelecToCar              Java class for anonymous complex type.
InterSeriesSelecToCarResponse      Java class for anonymous complex type.
JerarquiaDownTemaSerie             Java class for anonymous complex type.
JerarquiaDownTemaSerieResponse     Java class for anonymous complex type.
LimpiaSeriesSeleccionadas          Java class for anonymous complex type.
LimpiaSeriesSeleccionadasResponse  Java class for anonymous complex type.
ObjectFactory                      This object contains factory methods.
SeleccionaPorSerie                 Java class for anonymous complex type.
SeleccionaPorSerieResponse         Java class for anonymous complex type.
SeleccionaTodasSeries              Java class for anonymous complex type.
SeleccionaTodasSeriesResponse      Java class for anonymous complex type.
SeleccionaTodo                     Java class for anonymous complex type.
SeleccionaTodoResponse             Java class for anonymous complex type.
SerieInformation                   Java class for anonymous complex type.
SerieInformationResponse           Java class for anonymous complex type.
SeriesGrafica                      Java class for anonymous complex type.
SeriesGraficaResponse              Java class for anonymous complex type.
SeriesParaConsulta                 Java class for anonymous complex type.
SeriesParaConsultaResponse         Java class for anonymous complex type.
SeriesSeleccionadas                Java class for anonymous complex type.
SeriesSeleccionadasResponse        Java class for anonymous complex type.
SerieValores2005Periodo            Java class for SerieValores2005Periodo complex type.
SesionesBuscador                   Java class for anonymous complex type.
SesionesBuscadorResponse           Java class for anonymous complex type.
SetSystemAccess                    Java class for anonymous complex type.
SetSystemAccessFast                Java class for anonymous complex type.
SetSystemAccessFastResponse        Java class for anonymous complex type.
SetSystemAccessResponse            Java class for anonymous complex type.
TotalElementosdelaRama             Java class for anonymous complex type.
TotalElementosdelaRamaResponse     Java class for anonymous complex type.
WebServiceBackFillAjax             This class was generated by Apache CXF 2.7.6 2013-08-24T12:44:14.205-05:00
