20
Induction and Decision Trees

Induction and Decision Trees

  • Upload
    ophira

  • View
    53

  • Download
    0

Embed Size (px)

DESCRIPTION

Induction and Decision Trees. Artificial Intelligence. The design and development of computer systems that exhibit intelligent behavior. What is intelligence? Turing test: Developed in 1950 by Alan Turing (pioneer in computer science) Computer and human in one room - PowerPoint PPT Presentation

Citation preview

Page 1: Induction and Decision Trees

Induction and Decision Trees

Page 2: Induction and Decision Trees

Artificial Intelligence

The design and development of computer systems that exhibit intelligent behavior.

What is intelligence?

Turing test: Developed in 1950 by Alan Turing (pioneer in computer

science) Computer and human in one room Human “interrogator” in another room Interrogator asks questions...human OR computer answers If interrogator cannot tell whether the human or the computer

is answering, then the computer is “intelligent”

Page 3: Induction and Decision Trees

Classification of AI Systems

Knowledge Representation Systems Capture existing expert knowledge and use it to consult end-users

and provide decision support Main types: Rule-based expert systems, Case-base reasoning systems, Frame-based

knowledge systems, Semantic networks

Machine Learning Algorithms that use mathematical or logical techniques for finding

patterns in data and discovering or creating new knowledge Main types: Artificial neural networks, genetic algorithms, inductive decision

trees, Naïve Bayesian algorithms, Clustering and pattern-recognition algorithms

Data mining involves primarily a “machine learning” form of AI

Page 4: Induction and Decision Trees

Data Mining

Textbook definition: Knowledge discovery in databasesUsing statistical, mathematical, AI, and

machine learning techniques to extract useful information and subsequent knowledge from large databases

Key point: identifying patterns in large data sets

Page 5: Induction and Decision Trees

5

Microsoft SQL Server Data Mining Algorithms

Decision TreesNaïve BayesianClusteringSequence ClusteringAssociation RulesNeural NetworkTime Series

Page 6: Induction and Decision Trees

Decision Trees for Machine Learning

Based on Inductive Logic

Three types of logical structures commonly used in AI systems:DeductionAbduction Induction

Page 7: Induction and Decision Trees

Deduction

Premise (rule): if p then q

Fact (axiom, observation): p

Conclude: q

This is classical logic (Modus Ponens). If the rule is correct, and the fact is correct, then you know that the conclusion will be correct.

We are given the rule

Page 8: Induction and Decision Trees

Abduction

Premise (rule): if p then q

Fact (axiom, observation): q

Conclude: p

This form of reasoning is a logical fallacy called “affirming the consequent” (Post hoc ergo propter hoc). The conclusion may be wrong, but it is a plausible explanation of the fact, given the rule. Useful for diagnostic tasks.

We are given the rule

Page 9: Induction and Decision Trees

Induction

1. Observe p and q together

2. .

3. .

4. .

n. Observe p and q together

Conclude: if p then q

This is stereotypical thinking…highly error prone.

We create the rule

Page 10: Induction and Decision Trees

Example – Baby in the kitchen

Page 11: Induction and Decision Trees

ID3 Decision Tree Algorithm

“Iterative Dichotomizer”

Developed by Ross Quinlan (1979)

This is the basis for many commercial induction products

The goal of this algorithm is to find rules resulting in YES or NO values. (Therefore, the output of generated rules have 2 possible outcomes)

ID3 generates a tree, where each path of the tree represents a rule. The leaf node is the THEN part of the rule, and the nodes leading to it are the ANDS of attribute-value combinations in the IF part of the rule.

Page 12: Induction and Decision Trees

ID3 Algorithm

Starting Point: an empty tree (this tree will eventually represent the final

rules created by ID3) a recordset of data elements (e.g. records from a database) a set attributes (fields), each with some finite number of

possible values NOTE: one of the attributes is the “decision” field, with a

YES or NO value (or some other 2-valued option...GOOD/BAD, HIGH/LOW, WIN/LOSE, etc.)

Output: a tree, where each path of the tree represents a rule

Page 13: Induction and Decision Trees

ID3 algorithmIf all records in your recordset are positive (i.e. have YES values for their decision attribute), create a YES node and stop (end recursion)

If all records in your recordset are negative, create a NO node and stop (end recursion)

Select the attribute that best discriminates among the records (using an entropy function)

Create a tree-node representing that attribute, with n branches, where n is the number of values for the selected attribute

Divide the records of the recordset into subsets subrecordset 1, subrecordset 2, ..., subrecordset n corresponding with each value of the selected attribute

Recursively apply the algorithm to each subrecordset i, with reduced attribute set (don’t include already used attributes further down the path)

Page 14: Induction and Decision Trees

Calculating Entropy

Entropy = mixture, chaos

We want to pick the attribute with the lowest entropy: ideally, a particular value for the input attribute

leads to ALL yes or ALL no in the outcome attribute…or come as close to this as possible

An attribute’s entropy =

Where n is the total number of possible values for the attributeand xi is the ith value

Page 15: Induction and Decision Trees

Baby’s RecordSet of Oven-Touching Experiences

Page 16: Induction and Decision Trees

ID3 Applied to Baby-in-the-Kitchen

Which attribute to start with? Based on Entropy measure (assuming log base 2),Touch stove entropy = 0.918Mom in kitchen entropy = 1.0To see this, note that:

Probability of touching stove leading to ouch is .67, and not leading to ouch is .33

.67 * .33 = .22

Probability of mom being in kitchen leading to ouch is .5 and mom being in kitchen not leading to ouch is also .5

.5 * .5 = .25

Page 17: Induction and Decision Trees

Applying the Touch Stove Attribute

Page 18: Induction and Decision Trees

Recurse …apply the Mom in Kitchen attribute where needed

Page 19: Induction and Decision Trees

Resulting decision rules

If Touch_Oven = No then BOO_BOO = No

If Touch_Oven = Yes and Mom_In_Kitchen = Yes then BOO_BOO = Yes

If Touch_Oven = Yes and Mom_In_Kitchen = No then BOO_BOO = No

Page 20: Induction and Decision Trees

Now we’ll do this with Microsoft SQL Server