
Data Mining

Data Mining Taxonomy

Predictive Method

- …predict the value of a particular attribute…

Descriptive Method

- …foundation of human-interpretable patterns that describe the data…

Overview

- Introduction
- Data Mining Taxonomy
- Data Mining Models and Algorithms
- Quick Wins with Data Mining
- Privacy-Preserving Data Mining

Definition of Data Mining

“…The non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data…”

Fayyad, Piatetsky-Shapiro, Smyth [1996]

Data Mining Taxonomy

Descriptive Models

- Clustering
- Association

Examples:

- creation of different customer segments,
- unrelated products that are bought together (market basket analysis).

Predictive Models

- Classification
- Regression

Examples:

- a customer’s likelihood of switching to a competitor,
- an insurance claim’s likelihood of being fraudulent,
- the likelihood someone will place a catalog order,
- the revenue a customer will generate during the next year.

Classification & Regression

Classification: …aim to identify the characteristics that indicate the group to which each case belongs…

Two Crows Corporation

Regression: …uses existing values to forecast what other values will be…

Two Crows Corporation

Clustering & Association

Clustering: …divides a database into different groups… …find groups that are very different from each other, with similar members…

Two Crows Corporation

Association: …involve determinations of affinity-how frequently two or more things occur together…

Two Crows Corporation
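The idea of "how frequently two or more things occur together" can be sketched by counting item pairs across baskets. This is a minimal illustration, not a full association-rule miner: the function names `pair_support` and `confidence` and the sample baskets are assumptions made up for this example.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Return {item pair: fraction of baskets that contain both items}."""
    counts = Counter()
    for basket in baskets:
        # Count each unordered pair once per basket.
        for pair in combinations(sorted(set(basket)), 2):
            counts[frozenset(pair)] += 1
    total = len(baskets)
    return {pair: count / total for pair, count in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Fraction of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


# Hypothetical market-basket data (not from the slides).
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
]
support = pair_support(baskets)
```

Support measures how often a pair appears overall, while confidence measures how often the second item appears given the first. Real miners add pruning to avoid counting every possible pair.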

Deviation Detection & Pattern Discovery

Deviation Detection:

…discovering most significant changes in data from previously measured or normative values…

V. Kumar, M. Joshi, Tutorial on High Performance Data Mining.
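One simple way to compare new data against "previously measured or normative values" is a z-score test against a baseline. The sketch below is only one possible reading of deviation detection: the function name `deviations`, the threshold of 3 standard deviations, and the sample numbers are all assumptions for illustration.

```python
from statistics import mean, stdev


def deviations(baseline, new_values, threshold=3.0):
    """Return (value, z-score) pairs for new values far from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has no variation; z-scores are undefined")
    flagged = []
    for value in new_values:
        z = (value - mu) / sigma
        if abs(z) > threshold:
            flagged.append((value, z))
    return flagged


# Hypothetical previously measured values and new observations.
baseline = [100, 102, 98, 101, 99, 100, 103, 97]
flagged = deviations(baseline, [101, 99, 112, 95])
```

Only observations whose distance from the baseline mean exceeds the threshold are reported, so the threshold controls how "significant" a change must be.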

Sequential Pattern Discovery:

…process of looking for patterns and rules that predict strong sequential dependencies among different events…

V. Kumar, M. Joshi, Tutorial on High Performance Data Mining.

Data Mining Models & Algorithms

- Neural Networks
- Decision Trees
- Rule Induction
- K-nearest Neighbor
- Logistic Regression
- Discriminant Analysis

Neural Networks

- efficiently model large and complex problems;
- may be used in classification problems or for regressions;
- start with an input layer, followed by a hidden layer and an output layer.

[Figure: neural network with nodes numbered 1 to 6, arranged as inputs, a hidden layer, and an output]
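The flow from input layer to hidden layer to output layer can be sketched as a forward pass. This is a minimal sketch under assumptions: the split into three inputs, two hidden units, and one output is a guess at the six-node figure, the sigmoid transform is a common choice rather than something the slides specify, and all weights are made-up values.

```python
import math


def sigmoid(a):
    """Squash an activation into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))


def layer(inputs, weights, biases):
    """One fully connected layer: each row of `weights` feeds one unit."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Propagate inputs through the hidden layer, then the output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)


# Hypothetical 3-2-1 network with arbitrary weights.
hidden_w = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]]
hidden_b = [0.0, 0.1]
output_w = [[1.5, -2.0]]
output_b = [0.05]
prediction = forward([1.0, 0.5, -1.0], hidden_w, hidden_b, output_w, output_b)
```

Training would adjust the weights; this block only shows how a fixed set of weights turns inputs into an output.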

Neural Networks (cont.)

- can be easily implemented to run on massively parallel computers;
- cannot be easily interpreted;
- require an extensive amount of training time;
- require a lot of data preparation (very careful data cleansing, selection, preparation, and pre-processing);
- require a sufficiently large data set and a high signal-to-noise ratio.

Decision Trees

- a way of representing a series of rules that lead to a class or value;
- basic components of a decision tree: decision nodes, branches, and leaves.

[Figure: decision tree for credit risk. Root node: Income > 40,000. If No, test Job > 5: Yes gives Low Risk, No gives High Risk. If Yes, test High Debt: Yes gives High Risk, No gives Low Risk.]

Decision Trees (cont.)

- handle non-numeric data very well;
- work best when the predictor variables are categorical.
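A decision tree read as "a series of rules that lead to a class" maps directly onto nested conditionals. The sketch below encodes the credit-risk tree from the figure. The branch orientation is reconstructed from the flattened figure labels and should be treated as an assumption, as should the function and parameter names.

```python
def credit_risk(income, years_in_job, high_debt):
    """Classify a case by walking the tree from the root to a leaf."""
    if income > 40_000:
        # Right branch of the root: the next decision node tests debt.
        return "High Risk" if high_debt else "Low Risk"
    # Left branch of the root: the next decision node tests job tenure.
    return "Low Risk" if years_in_job > 5 else "High Risk"
```

Each path from the root to a leaf is one rule, for example "income above 40,000 and no high debt implies Low Risk".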

Rule Induction

- method of deriving a set of rules to classify cases;
- generates a set of independent rules which do not necessarily form a tree;
- the rules may not cover all possible situations;
- the rules may sometimes conflict in their predictions.
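The last two properties, incomplete coverage and conflicting predictions, are easy to see once a rule set is applied to cases. The sketch below does not induce rules from data. It only applies a small hand-written rule set, and the rules, field names, and the `describe` helper are hypothetical.

```python
# Each rule is (description, condition, predicted class). Rules are independent:
# they are checked separately rather than arranged in a tree.
RULES = [
    ("income > 40,000 -> Low Risk", lambda case: case["income"] > 40_000, "Low Risk"),
    ("high debt -> High Risk", lambda case: case["high_debt"], "High Risk"),
]


def apply_rules(case, rules=RULES):
    """Return the set of classes predicted by every rule that fires."""
    return {label for _, condition, label in rules if condition(case)}


def describe(case):
    """Summarize the outcome: a class, 'uncovered', or 'conflict'."""
    predicted = apply_rules(case)
    if not predicted:
        return "uncovered"
    if len(predicted) > 1:
        return "conflict"
    return next(iter(predicted))
```

A practical system needs a policy for both situations, such as a default class for uncovered cases and a rule-confidence ranking to break conflicts.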

K-nearest neighbor

- decides in which class to place a new case by examining some number of the most similar cases or neighbors;

- assigns the new case to the same class to which most of its neighbors belong;

[Figure: cases labeled X and Y plotted around a new case N]
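The two steps above, finding the most similar cases and taking a majority vote, can be sketched in a few lines. This is a minimal sketch assuming numeric features and Euclidean distance. The function name `knn_classify` and the sample points are made up for illustration.

```python
import math
from collections import Counter


def knn_classify(train, new_point, k=3):
    """Assign `new_point` the majority class among its k nearest training cases."""
    # Sort training cases by Euclidean distance to the new case and keep k.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], new_point))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical labeled cases: (features, class).
train = [
    ((1.0, 1.0), "X"),
    ((1.5, 2.0), "X"),
    ((2.0, 1.0), "X"),
    ((5.0, 5.0), "Y"),
    ((6.0, 5.5), "Y"),
]
```

With an even `k` or more than two classes, votes can tie. Here a tie goes to the class encountered first among the nearest neighbors, which is an arbitrary choice.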

Artificial Neural Networks

Introduction

What is neural computing/neural networks?

The brain is a remarkable computer. It interprets imprecise information from the senses at an incredibly high speed.

• A good example is the processing of visual information: a one-year-old baby is much better and faster at recognising objects, faces, and other visual features than even the most advanced AI system running on the fastest supercomputer.

• Most impressive of all, the brain learns (without any explicit instructions) to create the internal representations that make these skills possible.

Biological Neural Systems

The brain is composed of approximately 100 billion ($10^{11}$) neurons.

[Figure: schematic drawing of two biological neurons connected by synapses, with the dendrites, synapse, and axon labeled]

A typical neuron collects signals from other neurons through a host of fine structures called dendrites.

The neuron sends out spikes of electrical activity through a long, thin strand known as an axon, which splits into thousands of branches.

At the end of the branch, a structure called a synapse converts the activity from the axon into electrical effects that inhibit or excite activity in the connected neurons.

When a neuron receives excitatory input that is sufficiently large compared with its inhibitory input, it sends a spike of electrical activity down its axon.

Learning occurs by changing the effectiveness of the synapses so that the influence of one neuron on another changes.

What is a Neural Net?

A neural net simulates some of the learning functions of the human brain. It can recognize patterns and "learn." You can use it to forecast and make smarter business decisions. It can also serve as an "expert system" that simulates the thinking of an expert and can offer advice. Unlike conventional rule-based artificial-intelligence software, a neural net extracts expertise from data automatically: no rules are required. In other words, through trial and error the system “learns” to become an “expert” in the field the user gives it to study.

Components Needed: In order for a neural network to learn, it needs two basic components:

• Inputs, which consist of any information the expert uses to determine his/her final decision or outcome.

• Outputs, which are the decisions or outcomes arrived at by the expert that correspond to the inputs entered.

How does a neural network learn?

A neural network learns by determining the relation between the inputs and outputs.

By calculating the relative importance of the inputs and outputs the system can determine such relationships.

Through trial and error, the system compares its results with the expert-provided results in the data until it has reached an accuracy level defined by the user. With each trial, the weights assigned to the inputs are changed until the desired results are reached.
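The trial-and-error loop described above can be sketched with the classic perceptron learning rule on a single neuron. The slides do not name a specific training algorithm, so the update rule, the learning rate, and the names `predict` and `train` are assumptions. The AND truth table stands in for the expert-provided results.

```python
def predict(weights, bias, inputs):
    """Threshold unit: output 1 when the weighted sum is positive, else 0."""
    a = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if a > 0 else 0


def train(examples, target_accuracy=1.0, rate=1.0, max_epochs=100):
    """Adjust weights after each trial until the user-defined accuracy is reached."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for epoch in range(1, max_epochs + 1):
        for inputs, expected in examples:
            # Compare the system's result with the expert-provided result.
            error = expected - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
        correct = sum(predict(weights, bias, x) == y for x, y in examples)
        if correct / len(examples) >= target_accuracy:
            return weights, bias, epoch
    return weights, bias, max_epochs


# Hypothetical training data: inputs paired with the expert's decision (logical AND).
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias, epochs_used = train(examples)
```

This rule only converges when a single threshold unit can separate the classes. Multi-layer networks typically use gradient-based methods such as backpropagation instead.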


Artificial neurons are analogous to their biological inspirers.

[Figure: An artificial neuron. Inputs $x_1, x_2, \ldots, x_N$ are weighted by $w_1, w_2, \ldots, w_N$ and summed into the activation $a$, which passes through the transform function $f$ to give the output $y$.]

Here the neuron is a processing unit: it calculates the weighted sum of its input signals to generate the activation signal a, given by

$$a = \sum_{i=1}^{N} w_i x_i$$

where $w_i$ is the strength of the synapse connected to the neuron and $x_i$ is an input feature to the neuron.


The activation signal is passed through a transform function to produce the output of the neuron, given by

$$y = f(a)$$

The transform function can be linear, or non-linear, such as a threshold or sigmoid function [more later …].

For a linear function, the output y is proportional to the activation signal a. For a threshold function, the output y is set at one of two levels, depending on whether the activation signal a is greater than or less than some threshold value. For a sigmoid function, the output y varies continuously as the activation signal a changes.
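The weighted sum and the three transform functions described above can be sketched directly. The function names, the default slope, the threshold value and output levels, and the sample weights are assumptions chosen for illustration. The logistic function is used as one common sigmoid.

```python
import math


def activation(weights, inputs):
    """Weighted sum a = sum_i w_i * x_i."""
    return sum(w * x for w, x in zip(weights, inputs))


def linear(a, slope=1.0):
    """Output proportional to the activation."""
    return slope * a


def threshold(a, theta=0.0, low=0.0, high=1.0):
    """Output one of two levels depending on whether a exceeds theta."""
    return high if a > theta else low


def sigmoid(a):
    """Output varies continuously between 0 and 1 (logistic function)."""
    return 1.0 / (1.0 + math.exp(-a))


def neuron_output(weights, inputs, transform):
    """y = f(a) for a chosen transform function f."""
    return transform(activation(weights, inputs))


# Hypothetical synapse strengths and input features.
weights = [0.5, -1.0, 2.0]
inputs = [1.0, 2.0, 0.5]
a = activation(weights, inputs)
```

Passing `linear`, `threshold`, or `sigmoid` as the `transform` argument shows how the same activation yields a proportional, two-level, or smoothly varying output.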


Artificial neural network models (or simply neural networks) are typically composed of interconnected units or artificial neurons. How the neurons are connected depends on some specific task that the neural network performs.

Two key features of neural networks distinguish them from any other sort of computing developed to date:

- Neural networks are adaptive, or trainable
- Neural networks are naturally massively parallel

These features suggest the potential for neural network systems capable of learning, autonomously improving their own performance, adapting automatically to changing environments, being able to make decisions at high speed and being fault tolerant.

Neural Network Architectures

Feed-forward single layered networks

Feed-forward multi-layer networks

Recurrent networks

Neural Network Applications

- Speech/Voice recognition
- Optical character recognition
- Face detection/Recognition
- Pronunciation (NETtalk)
- Stock-market prediction
- Navigation of a car
- Signal processing/Communication
- Imaging/Vision
- …
