Elnaz Delpisheh York University Department of Computer Science and Engineering June 23, 2022 Identifying Interesting Association Rules with Genetic Algorithms


Page 1: Identifying Interesting Association Rules with Genetic Algorithms

Elnaz Delpisheh

York University

Department of Computer Science and Engineering

April 20, 2023

Identifying Interesting Association Rules with Genetic Algorithms

Page 2: Identifying Interesting Association Rules with Genetic Algorithms

Data mining

[Diagram labels: Data, Data Mining, Association rules, Too much data]

• I = {i1, i2, ..., in} is a set of items.
• D = {t1, t2, ..., tn} is a transactional database.
• Each ti is a nonempty subset of I.
• An association rule is of the form A → B, where A and B are itemsets, A ⊂ I, B ⊂ I, and A ∩ B = ∅.
• The Apriori algorithm is mostly used for association rule mining.
• Example: {milk, eggs} → {bread}.

Page 3: Identifying Interesting Association Rules with Genetic Algorithms

Apriori Algorithm

TID     List of item IDs
T100    I1, I2, I3
T200    I2, I4
T300    I2, I3
T400    I1, I2, I4
T500    I1, I3
T600    I2, I3
T700    I1, I3
T800    I1, I2, I3, I5
T900    I1, I2, I3
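The slide gives only the transaction table, so the following is a minimal Python sketch of level-wise Apriori run over it. The function name `apriori` and the minimum support count of 2 are assumptions made for illustration; the slides do not state a threshold.

```python
from itertools import combinations

# Transactions copied from the table on this slide.
TRANSACTIONS = [
    {"I1", "I2", "I3"},        # T100
    {"I2", "I4"},              # T200
    {"I2", "I3"},              # T300
    {"I1", "I2", "I4"},        # T400
    {"I1", "I3"},              # T500
    {"I2", "I3"},              # T600
    {"I1", "I3"},              # T700
    {"I1", "I2", "I3", "I5"},  # T800
    {"I1", "I2", "I3"},        # T900
]

def apriori(transactions, min_count):
    """Return {itemset: support count} for every frequent itemset."""
    tx = [frozenset(t) for t in transactions]
    items = sorted({item for t in tx for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in tx if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent

# min_count=2 is an assumed threshold, not a value from the slides.
frequent_itemsets = apriori(TRANSACTIONS, min_count=2)
for itemset, count in sorted(frequent_itemsets.items(),
                             key=lambda p: (len(p[0]), sorted(p[0]))):
    print(sorted(itemset), count)
```

With the assumed threshold, the only frequent 3-itemset is {I1, I2, I3}, which appears in T100, T800 and T900.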

Page 4: Identifying Interesting Association Rules with Genetic Algorithms

Apriori Algorithm (Cont.)


Page 5: Identifying Interesting Association Rules with Genetic Algorithms

Association rule mining

[Diagram labels: Data, Data Mining, Association rules, Too much data, Too many association rules]

Page 6: Identifying Interesting Association Rules with Genetic Algorithms

Interestingness criteria

• Comprehensibility
• Conciseness
• Diversity
• Generality
• Novelty
• Utility
• ...

Page 7: Identifying Interesting Association Rules with Genetic Algorithms

Interestingness measures

Subjective measures
• Data and the user's prior knowledge are considered.
• Comprehensibility, novelty, surprisingness, utility.

Objective measures
• The structure of an association rule is considered.
• Conciseness, diversity, generality, peculiarity.
• Example: Support
  • It represents the generality of a rule.
  • It counts the number of transactions containing both A and B.
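As a small illustration of the support measure described above, this Python sketch counts the transactions that contain both A and B. The data is the transaction table from the Apriori slide; the function name `support_count` is an assumption chosen for this example.

```python
# Transactions T100..T900 from the Apriori slide.
TRANSACTIONS = [
    {"I1", "I2", "I3"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]

def support_count(transactions, a, b):
    """Number of transactions that contain every item of A and of B."""
    itemset = set(a) | set(b)
    return sum(1 for t in transactions if itemset <= t)

# Rule {I1, I2} -> {I3}: contained in T100, T800 and T900.
count = support_count(TRANSACTIONS, {"I1", "I2"}, {"I3"})
relative = count / len(TRANSACTIONS)
print(count, round(relative, 3))
```

Dividing the count by the number of transactions gives the relative support, which is the form usually compared against a threshold.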

Page 8: Identifying Interesting Association Rules with Genetic Algorithms

Drawbacks of objective measures
• Database dependence
  • Lack of knowledge about the database
• Threshold dependence

Solution
• Multiple database reanalysis
  • Problem: large number of disk I/O operations
• Database independence

Page 9: Identifying Interesting Association Rules with Genetic Algorithms

Genetic algorithm-based learning (ARMGA)
1. Initialize population
2. Evaluate individuals in population
3. Repeat until a stopping criterion is met
   A. Select individuals from the current population
   B. Recombine them to obtain more individuals
   C. Evaluate new individuals
   D. Replace some or all the individuals of the current population by offspring
4. Return the best individual seen so far
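The loop above can be sketched generically in Python. Everything concrete here is an assumption for illustration: the function names, the fixed generation budget as stopping criterion, full replacement in step D, and the toy bit-counting fitness used in `demo` (it is not the ARMGA fitness).

```python
import random

def genetic_search(init, fitness, select, recombine, generations, rng):
    population = init(rng)                                    # 1. initialize
    scored = [(fitness(c), c) for c in population]            # 2. evaluate
    best = max(scored, key=lambda p: p[0])
    for _ in range(generations):                              # 3. assumed stopping criterion
        parents = select(scored, rng)                         # A. select
        children = recombine(parents, rng)                    # B. recombine
        scored_children = [(fitness(c), c) for c in children] # C. evaluate
        scored = scored_children or scored                    # D. replace all (assumption)
        best = max(scored + [best], key=lambda p: p[0])
    return best                                               # 4. best seen so far

def demo(seed=0):
    """Toy run: maximise the number of 1-bits in a 12-bit chromosome."""
    rng = random.Random(seed)
    n_bits, pop_size = 12, 20

    def init(r):
        return [[r.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def select(scored, r):
        # Binary tournament: keep the fitter of two randomly drawn individuals.
        return [max(r.sample(scored, 2), key=lambda p: p[0])[1]
                for _ in range(pop_size)]

    def recombine(parents, r):
        children = []
        for a, b in zip(parents[::2], parents[1::2]):
            cut = r.randrange(1, n_bits)  # one-point cut, for brevity only
            children += [a[:cut] + b[cut:], b[:cut] + a[cut:]]
        return children

    return genetic_search(init, sum, select, recombine, generations=30, rng=rng)

best_fitness, best_chromosome = demo()
print(best_fitness, best_chromosome)
```

Tracking `best` separately from the population is what makes step 4 return the best individual seen so far, even when full replacement discards it.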

Page 10: Identifying Interesting Association Rules with Genetic Algorithms

ARMGA Modeling

Given an association rule X → Y

Requirement

Conf(X → Y) > Supp(Y)

Aim is to maximise
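The requirement can be checked directly on the transaction table from the Apriori slide, as in the Python sketch below. The quantity to maximise did not survive in the slide text, so `relative_confidence` is an assumption: it uses (Supp(X ∪ Y) − Supp(X)·Supp(Y)) / (Supp(X)·(1 − Supp(Y))), a measure sometimes used as a fitness in this setting, and should not be read as the author's formula. All function names are hypothetical.

```python
from fractions import Fraction

# Transactions T100..T900 from the Apriori slide.
TRANSACTIONS = [
    {"I1", "I2", "I3"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]

def supp(itemset):
    """Relative support: fraction of transactions containing the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in TRANSACTIONS if itemset <= t)
    return Fraction(hits, len(TRANSACTIONS))

def conf(x, y):
    """Confidence of X -> Y; assumes Supp(X) > 0."""
    return supp(set(x) | set(y)) / supp(x)

def satisfies_requirement(x, y):
    """The slide's requirement: Conf(X -> Y) > Supp(Y)."""
    return conf(x, y) > supp(y)

def relative_confidence(x, y):
    # ASSUMPTION: stand-in for the missing "aim is to maximise" formula.
    # Requires Supp(X) > 0 and Supp(Y) < 1.
    return (supp(set(x) | set(y)) - supp(x) * supp(y)) / (supp(x) * (1 - supp(y)))

print(conf({"I1"}, {"I3"}), supp({"I3"}), satisfies_requirement({"I1"}, {"I3"}))
print(relative_confidence({"I1"}, {"I3"}))
print(satisfies_requirement({"I1"}, {"I2"}))
```

Under the assumed measure, a positive value is equivalent to the requirement Conf(X → Y) > Supp(Y), because the measure equals (Conf(X → Y) − Supp(Y)) / (1 − Supp(Y)).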

Page 11: Identifying Interesting Association Rules with Genetic Algorithms

ARMGA Encoding

Michigan Strategy

Given an association k-rule X → Y, where X, Y ⊂ I, I = {i1, i2, ..., in} is a set of items, and X ∩ Y = ∅.

For example: {A1, ..., Aj} → {Aj+1, ..., Ak}
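One way to picture a chromosome for a single k-rule is sketched below in Python. The layout is an assumption, since the slide's figure is not available: the chromosome is taken to be a list holding the split index j followed by the k items, so that the first j items form X and the rest form Y. The names `encode_rule` and `decode_rule` are hypothetical.

```python
def encode_rule(antecedent, consequent):
    """Encode X -> Y as [j, A1, ..., Aj, Aj+1, ..., Ak] (assumed layout)."""
    antecedent, consequent = set(antecedent), set(consequent)
    if not antecedent or not consequent:
        raise ValueError("both sides of the rule must be non-empty")
    if antecedent & consequent:
        raise ValueError("X and Y must be disjoint")
    return [len(antecedent), *sorted(antecedent), *sorted(consequent)]

def decode_rule(chromosome):
    """Split [j, A1, ..., Ak] back into (X, Y)."""
    j, *items = chromosome
    if not 0 < j < len(items):
        raise ValueError("split index must leave both sides non-empty")
    if len(set(items)) != len(items):
        raise ValueError("an item may appear only once in a rule")
    return set(items[:j]), set(items[j:])

chromosome = encode_rule({"milk", "eggs"}, {"bread"})
print(chromosome)               # one chromosome represents one rule
print(decode_rule(chromosome))
```

In this layout the chromosome length is tied to k, so every individual in a population must describe a rule with the same number of items.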

Page 12: Identifying Interesting Association Rules with Genetic Algorithms

ARMGA Encoding (Cont.)

The aforementioned encoding highly depends on the length of the chromosome.

We use another type of encoding:
• Given a set of items {A, B, C, D, E, F}
• Association rule ACF → B is encoded as follows:

00 A   11 B   00 C   01 D   11 E   00 F

• 00: Item is antecedent
• 11: Item is consequence
• 01/10: Item is absent
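The two-bit scheme can be sketched in Python as follows. The function names and the choice of "01" as the default code for absent items are assumptions. Note that, read literally under the stated legend, the bit string printed on the slide also marks E as a consequence; the sketch follows the legend, so encoding ACF → B gives E an "absent" code.

```python
ANTECEDENT, CONSEQUENT = "00", "11"
ABSENT_CODES = ("01", "10")

def encode(items, antecedent, consequent, absent_code="01"):
    """Two bits per item, in the fixed order given by `items`."""
    antecedent, consequent = set(antecedent), set(consequent)
    if antecedent & consequent:
        raise ValueError("antecedent and consequence must be disjoint")
    genes = []
    for item in items:
        if item in antecedent:
            genes.append(ANTECEDENT)
        elif item in consequent:
            genes.append(CONSEQUENT)
        else:
            genes.append(absent_code)
    return "".join(genes)

def decode(items, chromosome):
    """Recover (antecedent, consequence) lists from a bit string."""
    if len(chromosome) != 2 * len(items):
        raise ValueError("chromosome must hold two bits per item")
    antecedent, consequent = [], []
    for pos, item in enumerate(items):
        gene = chromosome[2 * pos: 2 * pos + 2]
        if gene == ANTECEDENT:
            antecedent.append(item)
        elif gene == CONSEQUENT:
            consequent.append(item)
        elif gene not in ABSENT_CODES:
            raise ValueError(f"invalid gene {gene!r}")
    return antecedent, consequent

ITEMS = "ABCDEF"
encoded = encode(ITEMS, "ACF", "B")
print(encoded)                          # E is absent under the legend
print(decode(ITEMS, encoded))
print(decode(ITEMS, "001100011100"))    # the slide's string, read literally
```

Because every item always occupies two bits, the chromosome length depends only on the number of items, not on the length of the rule.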

Page 13: Identifying Interesting Association Rules with Genetic Algorithms

ARMGA Operators

• Select
• Crossover
• Mutation

Page 14: Identifying Interesting Association Rules with Genetic Algorithms

ARMGA Operators-Select

Select(c, ps): acts as a filter of the chromosome
• c: chromosome
• ps: pre-specified probability
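The slide does not say how the filter decides, so the Python sketch below takes the simplest reading as an assumption: a chromosome passes with probability ps, independent of its fitness. The helper `filter_population` is hypothetical and only shows how such a filter might be applied to a whole population.

```python
import random

def select(c, ps, rng=random):
    """Return True if chromosome c passes the filter.

    ASSUMPTION: c passes with probability ps regardless of its content;
    a fitness-aware variant would inspect c here.
    """
    if not 0.0 <= ps <= 1.0:
        raise ValueError("ps must be a probability")
    return rng.random() < ps

def filter_population(population, ps, rng=random):
    """Keep the chromosomes that pass select()."""
    return [c for c in population if select(c, ps, rng)]

rng = random.Random(42)          # seeded only to make the run repeatable
population = ["c1", "c2", "c3", "c4"]
print(filter_population(population, 0.5, rng))
```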

Page 15: Identifying Interesting Association Rules with Genetic Algorithms

ARMGA Operators-Crossover

This operation uses a two-point strategy
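A generic two-point crossover can be sketched in Python as below. This is the textbook operator, not necessarily the exact ARMGA variant: the function name is an assumption, and the cut points are drawn uniformly over all positions.

```python
import random

def two_point_crossover(a, b, rng=random):
    """Swap the segment between two random cut points of equal-length parents."""
    if len(a) != len(b):
        raise ValueError("parents must have the same length")
    # Two distinct cut positions in 0..len(a), in ascending order.
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    child1 = a[:i] + b[i:j] + a[j:]
    child2 = b[:i] + a[i:j] + b[j:]
    return child1, child2

rng = random.Random(7)           # seeded only to make the run repeatable
parent_a, parent_b = "000000000000", "111111111111"
child1, child2 = two_point_crossover(parent_a, parent_b, rng)
print(child1, child2)
```

With a two-bits-per-item encoding, cut points drawn at bit level can split an item's gene; drawing them over a list of two-bit genes instead keeps each gene intact.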

Page 16: Identifying Interesting Association Rules with Genetic Algorithms

ARMGA Operators-Mutate


Page 17: Identifying Interesting Association Rules with Genetic Algorithms

ARMGA Initialization


Page 18: Identifying Interesting Association Rules with Genetic Algorithms

ARMGA Algorithm


Page 19: Identifying Interesting Association Rules with Genetic Algorithms

Empirical studies and Evaluation
• Implement the entire procedure using Visual C++
• Use WEKA to produce interesting association rules
• Compare the results
