Tutorial 4

Association rule mining Goal: Find all rules that satisfy the user-specified minimum support (minsup) and minimum confidence (minconf). Assume all data


Association rule mining

• Goal: Find all rules that satisfy the user-specified minimum support (minsup) and minimum confidence (minconf).

• Assume all data are categorical. No good algorithm for numeric data.

• Initially used for Market Basket Analysis to find how items purchased by customers are related.


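Association rule

IF A → B

$$\text{Support}(A \rightarrow B) = \frac{\#\text{ of tuples containing both } A \text{ and } B}{\text{Total } \#\text{ of tuples}}$$

$$\text{Confidence}(A \rightarrow B) = \frac{\#\text{ of tuples containing both } A \text{ and } B}{\#\text{ of tuples containing } A}$$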
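The two definitions can be sketched in a few lines of Python. This is only an illustration: the helper names `count`, `support`, and `confidence` are made up here, and the data are the five supermarket transactions from the example later in this tutorial. Exact fractions are used so the results match the hand-computed values.

```python
from fractions import Fraction

# The five supermarket transactions from this tutorial's example.
TRANSACTIONS = [
    {"Egg", "Butter", "Baby Powder", "Bread", "Umbrella"},
    {"Butter", "Baby Powder"},
    {"Egg", "Butter", "Milk"},
    {"Butter", "Egg", "Chicken"},
    {"Egg", "Milk", "Coca-Cola"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of all transactions that contain `itemset`."""
    return Fraction(count(itemset, transactions), len(transactions))


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent.

    Raises ZeroDivisionError if no transaction contains the antecedent.
    """
    both = count(set(antecedent) | set(consequent), transactions)
    return Fraction(both, count(antecedent, transactions))


print(support({"Egg", "Butter"}, TRANSACTIONS))       # 3/5
print(confidence({"Egg"}, {"Butter"}, TRANSACTIONS))  # 3/4
```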


The Apriori algorithm

The best known algorithm. Two steps:

1. Find all itemsets that have minimum support (frequent itemsets, also called large itemsets).

2. Use frequent itemsets to generate rules.


Example: Five transactions from a supermarket

T id   List of items
1      Egg, Butter, Baby Powder, Bread, Umbrella
2      Butter, Baby Powder
3      Egg, Butter, Milk
4      Butter, Egg, Chicken
5      Egg, Milk, Coca-Cola


Minimum support

Support of each single item:

Item          Support
Egg           4/5
Baby Powder   2/5
Umbrella      1/5
Milk          2/5
Bread         1/5
Chicken       1/5
Coca-Cola     1/5
Butter        4/5

• Minimum support = 2/5 = 40%

Items that meet the minimum support:

Item          Support
Egg           4/5
Baby Powder   2/5
Milk          2/5
Butter        4/5
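The counting and filtering step above can be sketched as follows. The variable names (`item_counts`, `frequent_items`) are illustrative only. Supports are kept as exact fractions so they can be compared with the table directly.

```python
from collections import Counter
from fractions import Fraction

# The five supermarket transactions from the example.
TRANSACTIONS = [
    {"Egg", "Butter", "Baby Powder", "Bread", "Umbrella"},
    {"Butter", "Baby Powder"},
    {"Egg", "Butter", "Milk"},
    {"Butter", "Egg", "Chicken"},
    {"Egg", "Milk", "Coca-Cola"},
]
MINSUP = Fraction(2, 5)  # minimum support = 40%

# Count how many transactions contain each individual item.
item_counts = Counter(item for t in TRANSACTIONS for item in t)

n = len(TRANSACTIONS)
supports = {item: Fraction(c, n) for item, c in item_counts.items()}

# Keep only the items whose support meets the minimum.
frequent_items = {item: s for item, s in supports.items() if s >= MINSUP}

for item in sorted(frequent_items):
    print(item, frequent_items[item])
# Baby Powder 2/5
# Butter 4/5
# Egg 4/5
# Milk 2/5
```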


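Example

Candidate 2-itemsets built from the frequent items:

Itemset               Support
Egg, Milk             2/5
Egg, Butter           3/5
Egg, Baby Powder      1/5
Baby Powder, Milk     0
Baby Powder, Butter   2/5
Milk, Butter          1/5

Frequent 2-itemsets (L2), with support of at least 2/5:

Itemset               Support
Egg, Milk             2/5
Egg, Butter           3/5
Baby Powder, Butter   2/5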
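A minimal sketch of this second pass: pair up the four frequent items, count each pair, and keep those that meet the minimum support. The names `candidates` and `frequent_pairs` are illustrative assumptions, not part of any library.

```python
from fractions import Fraction
from itertools import combinations

# The five supermarket transactions from the example.
TRANSACTIONS = [
    {"Egg", "Butter", "Baby Powder", "Bread", "Umbrella"},
    {"Butter", "Baby Powder"},
    {"Egg", "Butter", "Milk"},
    {"Butter", "Egg", "Chicken"},
    {"Egg", "Milk", "Coca-Cola"},
]
MINSUP = Fraction(2, 5)

# Frequent single items found in the previous step.
FREQUENT_ITEMS = ["Baby Powder", "Butter", "Egg", "Milk"]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in TRANSACTIONS if itemset <= t)
    return Fraction(hits, len(TRANSACTIONS))


# Every pair of frequent items is a candidate 2-itemset.
candidates = {frozenset(p): support(p) for p in combinations(FREQUENT_ITEMS, 2)}

# Keep the pairs that meet the minimum support.
frequent_pairs = {c: s for c, s in candidates.items() if s >= MINSUP}

for pair, s in sorted(frequent_pairs.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pair), s)
# ['Baby Powder', 'Butter'] 2/5
# ['Butter', 'Egg'] 3/5
# ['Egg', 'Milk'] 2/5
```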


Example

Join {Egg, Milk} and {Egg, Butter} from the frequent 2-itemsets (L2) to form the candidate 3-itemset {Egg, Milk, Butter}.

After that, check all possible pairs in the candidate against L2:

{Egg, Milk}: ok
{Egg, Butter}: ok
{Milk, Butter}: No

{Milk, Butter} is not in L2, so remove the candidate {Egg, Milk, Butter}.
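The join-and-prune check can be sketched as below. Note the simplification: this sketch joins any two frequent itemsets whose union has exactly one extra item, whereas the standard Apriori join only merges itemsets that share a common prefix. The simpler join can produce extra candidates (here also {Egg, Butter, Baby Powder}), but the pruning step removes them in the same way. The function name `generate_candidates` is hypothetical.

```python
from itertools import combinations

# Frequent 2-itemsets (L2) from the example.
L2 = {
    frozenset({"Egg", "Milk"}),
    frozenset({"Egg", "Butter"}),
    frozenset({"Baby Powder", "Butter"}),
}


def generate_candidates(frequent_k, k):
    """Join frequent k-itemsets into (k+1)-item candidates, then prune.

    Returns (kept, pruned) as two sets of frozensets.
    """
    # Join step (simplified): any union that has exactly k + 1 items.
    joined = {a | b for a in frequent_k for b in frequent_k if len(a | b) == k + 1}

    kept, pruned = set(), set()
    for cand in joined:
        # Prune step: every k-item subset of a frequent itemset must be frequent.
        if all(frozenset(sub) in frequent_k for sub in combinations(cand, k)):
            kept.add(cand)
        else:
            pruned.add(cand)
    return kept, pruned


kept, pruned = generate_candidates(L2, 2)
print(len(kept))                                          # 0
print(frozenset({"Egg", "Milk", "Butter"}) in pruned)     # True
```

Since no candidate survives, there are no frequent 3-itemsets and the search for frequent itemsets stops at L2.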


Example (cont.)

Rule                    Support(A,B)   Support(A)   Confidence
Egg → Butter            60%            80%          75%
Egg → Milk              40%            80%          50%
Butter → Baby Powder    40%            80%          50%
Butter → Egg            60%            80%          75%
Milk → Egg              40%            40%          100%
Baby Powder → Butter    40%            40%          100%

• Minimum support = 2/5 = 40%, minimum confidence = 70%


Results

Egg → Butter: support 60%, confidence 75%

Butter → Egg: support 60%, confidence 75%

Milk → Egg: support 40%, confidence 100%

Baby Powder → Butter: support 40%, confidence 100%
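As a rough sketch of the rule-generation step, the code below tries both directions of each frequent pair and keeps the rules whose confidence is at least 70%. The names `FREQUENT_PAIRS` and `rules` are illustrative. The frequent pairs are copied from the earlier step rather than recomputed.

```python
from fractions import Fraction
from itertools import permutations

# The five supermarket transactions from the example.
TRANSACTIONS = [
    {"Egg", "Butter", "Baby Powder", "Bread", "Umbrella"},
    {"Butter", "Baby Powder"},
    {"Egg", "Butter", "Milk"},
    {"Butter", "Egg", "Chicken"},
    {"Egg", "Milk", "Coca-Cola"},
]
MINCONF = Fraction(70, 100)  # minimum confidence = 70%

# Frequent 2-itemsets found earlier (support of at least 40%).
FREQUENT_PAIRS = [("Egg", "Butter"), ("Egg", "Milk"), ("Baby Powder", "Butter")]


def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in TRANSACTIONS if set(itemset) <= t)


rules = []
for pair in FREQUENT_PAIRS:
    # Each pair yields two candidate rules: a -> b and b -> a.
    for a, b in permutations(pair):
        sup = Fraction(count({a, b}), len(TRANSACTIONS))
        conf = Fraction(count({a, b}), count({a}))
        if conf >= MINCONF:
            rules.append((a, b, sup, conf))

for a, b, sup, conf in rules:
    print(f"{a} -> {b}: support {float(sup):.0%}, confidence {float(conf):.0%}")
# Egg -> Butter: support 60%, confidence 75%
# Butter -> Egg: support 60%, confidence 75%
# Milk -> Egg: support 40%, confidence 100%
# Baby Powder -> Butter: support 40%, confidence 100%
```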


Insert the same example into Weka

Try the same example in Weka: load marketing-list.csv.


Reference: "Association Rules Apriori Algorithm", https://dspace.ist.utl.pt/bitstream/2295/55704/1/licao_9.pdf