8/3/2019 Ppt Devika
PATTERN SEQUENCE
MINING
Presented By:
DEVIKA MITTAL
O915CS081019
CONTENTS
Some terminology
association rule
sequential pattern
sequence database
support
What is Sequential Pattern Mining?
Challenges
Algorithms
Applications
association rule
An association rule can take the form Buy A => Buy B.
Association rule mining does not take the time stamp into
account.
NOTE:
If we take the time stamp into account, then we
can get more accurate and useful rules,
such as: Buy A implies Buy B within a week, or people usually Buy A every week.
Such rules help make sounder decisions.
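As a rough illustration of the note above, the following sketch checks how often "Buy A" is followed by "Buy B" within a week. The purchase histories, the customer ids, and the helper name follows_within are all hypothetical and invented only for this example.

```python
from datetime import date, timedelta

# Hypothetical time-stamped purchase histories, one list per customer.
purchases = {
    "c1": [(date(2019, 1, 1), "A"), (date(2019, 1, 4), "B")],
    "c2": [(date(2019, 1, 2), "A"), (date(2019, 1, 20), "B")],
    "c3": [(date(2019, 1, 3), "A")],
    "c4": [(date(2019, 1, 5), "B")],
}


def follows_within(history, first, second, window):
    """True if some purchase of `second` happens at most `window` after a purchase of `first`."""
    first_dates = [d for d, item in history if item == first]
    second_dates = [d for d, item in history if item == second]
    return any(
        timedelta(0) <= s - f <= window
        for f in first_dates
        for s in second_dates
    )


# Customers who bought A at all, and those who then bought B within 7 days.
bought_a = [h for h in purchases.values() if any(item == "A" for _, item in h)]
hits = [h for h in bought_a if follows_within(h, "A", "B", timedelta(days=7))]

# Share of A-buyers for whom the time-constrained rule holds.
confidence = len(hits) / len(bought_a)
print(len(hits), "of", len(bought_a), "customers who bought A bought B within a week")
```

Without the time stamps, customers c1 and c2 would both count as evidence for Buy A => Buy B; with them, only c1 satisfies the one-week version of the rule.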
Sequential pattern
A sequential pattern is a sequence of itemsets that frequently
occurs in a specific order. All items in the
same itemset are supposed to have the
same transaction time value or to fall within a time gap.
All transactions of a customer are together
viewed as a sequence.
Sequence Database
A sequence database S is shown below, with min_support = 2.
The set of items in the database is {a, b, c, d, e, f, g}.
The sequence <a(abc)(ac)d(cf)> has five elements.
It is also a 9-sequence, since there are 9 item instances in the sequence.

Sequence Id    Sequence
10             <a(abc)(ac)d(cf)>
20             <(ad)c(bc)(ae)>
30             <(ef)(ab)(df)cb>
40             <eg(af)cbc>
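The element count and the length of sequence 10 can be checked with a short sketch. The representation used here (a Python list of sets, one set per element) is an assumption chosen for illustration, not something prescribed by the slides.

```python
# Sequence 10 from the table: a, then (abc), then (ac), then d, then (cf).
# Each element is an itemset; the list order is the order in time.
seq_10 = [{"a"}, {"a", "b", "c"}, {"a", "c"}, {"d"}, {"c", "f"}]

# Number of elements (itemsets) in the sequence.
num_elements = len(seq_10)

# Length of the sequence: total number of item instances over all elements.
length = sum(len(element) for element in seq_10)

print(num_elements, length)  # prints: 5 9
```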
Support
A customer supports a sequence s if
s is contained in the corresponding customer-sequence. The support of sequence s is
defined as the fraction of customers who support this sequence.
Support(s) = (Number of supporting customers) / (Total number of customers)
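The support definition can be sketched in a few lines. The function names contains and support, and the list-of-sets encoding of the four sequences from the Sequence Database slide, are assumptions made for this example. The function returns both the raw count (comparable to min support = 2) and the fraction from the formula above.

```python
def contains(sequence, pattern):
    """True if `pattern` is a subsequence of `sequence`.

    Each pattern element must be a subset of some sequence element,
    and the matched elements must appear in the same order.
    Matching each pattern element as early as possible is sufficient.
    """
    pos = 0
    for element in sequence:
        if pos < len(pattern) and pattern[pos] <= element:
            pos += 1
    return pos == len(pattern)


# The four customer sequences, keyed by sequence id.
database = {
    10: [{"a"}, {"a", "b", "c"}, {"a", "c"}, {"d"}, {"c", "f"}],
    20: [{"a", "d"}, {"c"}, {"b", "c"}, {"a", "e"}],
    30: [{"e", "f"}, {"a", "b"}, {"d", "f"}, {"c"}, {"b"}],
    40: [{"e"}, {"g"}, {"a", "f"}, {"c"}, {"b"}, {"c"}],
}


def support(pattern, db):
    """Return (number of supporting sequences, fraction of all sequences)."""
    count = sum(1 for seq in db.values() if contains(seq, pattern))
    return count, count / len(db)


# Pattern: the itemset (ab) followed later by the item c.
pattern = [{"a", "b"}, {"c"}]
print(support(pattern, database))  # prints: (2, 0.5)
```

Only sequences 10 and 30 contain an element with both a and b followed by an element with c, so the pattern has a support count of 2 and meets the minimum support of 2.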
What Is Sequential Pattern Mining?
Given a set of sequences, find the complete set of
frequent subsequences.
Sequential pattern mining tries to find the
relationships between occurrences of sequential
events, to determine whether there exists any
specific order of the occurrences.
Sequential pattern mining is the process
of extracting the sequential patterns
whose support exceeds a predefined
minimal support threshold.
Example
From a book store's transaction database history, we can find the frequent sequential
purchasing patterns.
For example, 80% of customers who bought the book Database Management typically bought the book Data Warehouse and then bought the book Web Information System within a
certain time gap.
Types:
String mining:
used in biology, to examine gene and protein sequences
primarily concerned with sequences with a single member at each position.
Itemset mining:
used more often in marketing
concerned with multiple symbols at each position.
popular approach to text mining.
Challenges on Sequential Pattern Mining
A huge number of possible sequential patterns are
hidden in databases
A mining algorithm should
find the complete set of patterns, when possible, satisfying the minimum support
(frequency) threshold
be highly efficient, scalable, involving only a small number of database scans
be able to incorporate various kinds of user-specific constraints
Sequential Pattern Mining Algorithms
Apriori-based Approaches
GSP
SPADE
Sequential pattern mining methods that follow the
methodology of Apriori encounter problems when a sequence
database is large
Pattern-Growth-based Approaches
FreeSpan
PrefixSpan
substantially reduces the size of projected databases and leads to efficient processing.
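To give a feel for the pattern-growth idea, here is a minimal sketch in the spirit of PrefixSpan. It is deliberately simplified: it assumes every element holds exactly one item (the string-like case), so it does not handle the itemset elements used in the Sequence Database slide. The function name prefixspan, the pointer-based projection, and the small example database are assumptions made for illustration.

```python
from collections import defaultdict


def prefixspan(db, min_support):
    """Return {pattern: support count} for sequences of single items.

    `db` is a list of sequences, each a list of items.
    A projected database is stored as (sequence index, start position)
    pairs, so suffixes are never copied.
    """
    results = {}

    def mine(prefix, projected):
        # Count, per item, how many projected suffixes contain it.
        counts = defaultdict(int)
        for sid, start in projected:
            for item in set(db[sid][start:]):
                counts[item] += 1

        for item, count in sorted(counts.items()):
            if count < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = count

            # Project on the first occurrence of `item` in each suffix.
            new_projected = []
            for sid, start in projected:
                try:
                    pos = db[sid].index(item, start)
                except ValueError:
                    continue  # item does not occur in this suffix
                new_projected.append((sid, pos + 1))
            mine(pattern, new_projected)

    mine((), [(sid, 0) for sid in range(len(db))])
    return results


# Hypothetical database of three single-item sequences.
example_db = [list("abcb"), list("acb"), list("bca")]
patterns = prefixspan(example_db, min_support=2)

for pattern, count in sorted(patterns.items()):
    print("".join(pattern), count)
```

Each recursive call only scans the suffixes that follow the current prefix, which is why the projected databases shrink as the patterns grow.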
Applications
Applications of sequential pattern mining
Customer shopping sequences: First buy computer, then CD-ROM, and then digital
camera, within 3 months.
Medical treatments, natural disasters (e.g.,
earthquakes), science & eng. processes, stocks
and markets, etc.
Telephone calling patterns, Weblog click streams
DNA sequences and gene structures
CONCLUSION:
Still more improvements are likely to be done.
Balance and more clarity for results.
More research is needed.
In essence, the database needs a way to store
more pages, combat data, and still provide (or
attempt to provide) pertinent results.
THANK YOU
ANY QUERY..???