Mining Change Events in Large Datasets
1.
- Hashmat Rohian, Jiashu Zhao
2.
- Discover patterns whose frequency dramatically changes over
time or any other dimension (FP mining extension)
- Discover new rules associating changes (Financial markets)
- Predict changes in one variable based on the changes in another
dimension (Outbreak detection)
3.
- Design a practical and useful approach to discovering novel and
interesting change knowledge from large databases
- Analyze and present the knowledge mined in a clear and coherent
manner
- Evaluate the knowledge based on a gold standard
4.
- Qian's CPD (Change Point Detection) Algorithm
- Improved CPD1 { Divide and Conquer }
- Using Divide & Conquer with global ratios
- Improved CPD2 { Divide and Conquer }
- Using Divide & Conquer with local ratios
- The Kolmogorov-Smirnov test (KS-test)
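As a rough illustration of how a KS-style comparison can drive a divide-and-conquer search for change points, here is a minimal sketch. It is not Qian's CPD algorithm or either of the improved variants, whose global and local ratios are not described in the slide text. The function names, `min_size`, and the fixed `threshold` on the KS statistic are assumptions made for illustration; a real test would compare the statistic against a significance level.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


def find_change_points(series, lo=0, hi=None, min_size=5, threshold=0.8):
    """Split series[lo:hi] where the two sides differ most, then recurse.

    threshold is a hypothetical cut-off on the KS statistic, not a p-value.
    """
    if hi is None:
        hi = len(series)
    best_d, best_i = 0.0, None
    # Try every split that leaves at least min_size points on each side.
    for i in range(lo + min_size, hi - min_size + 1):
        d = ks_statistic(series[lo:i], series[i:hi])
        if d > best_d:
            best_d, best_i = d, i
    if best_i is None or best_d < threshold:
        return []
    left = find_change_points(series, lo, best_i, min_size, threshold)
    right = find_change_points(series, best_i, hi, min_size, threshold)
    return left + [best_i] + right
```

Calling `find_change_points(series)` returns the indices at which a new segment starts, in increasing order. Each recursive call only looks at its own segment, which is the divide-and-conquer part of the sketch.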
5.
- k-itemsets (itemsets with k items) are used to explore (k+1)-itemsets
from transactional databases
- First, the set of frequent 1-itemsets is found (denoted
L1)
- L1 is used to find L2, the set of frequent 2-itemsets
- L2 is used to find L3, and so on, until no frequent k-itemsets
can be found
- Generate strong association rules from the frequent
itemsets
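The level-wise search described above can be sketched as follows. This is a minimal, unoptimized illustration of the Apriori idea, assuming transactions are given as collections of hashable items and that `min_support` is an absolute count. The function names `apriori` and `rules` are chosen here for illustration and do not come from the slides.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Count each candidate's support and keep only the frequent ones.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    items = {item for t in transactions for item in t}
    level = count({frozenset([i]) for i in items})  # L1
    frequent = dict(level)
    k = 1
    while level:
        k += 1
        # Join step: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        level = count(candidates)  # Lk
        frequent.update(level)
    return frequent


def rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for each strong rule."""
    out = []
    for itemset, n in frequent.items():
        for r in range(1, len(itemset)):
            for ante in combinations(itemset, r):
                ante = frozenset(ante)
                conf = n / frequent[ante]  # support(itemset) / support(ante)
                if conf >= min_confidence:
                    out.append((ante, itemset - ante, conf))
    return out
```

The loop stops as soon as a level comes back empty, which matches "until no frequent k-itemsets can be found". The lookup `frequent[ante]` is safe because every subset of a frequent itemset is itself frequent and was recorded at an earlier level.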
6.
- the rate of change of the rate of change
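One way to read "the rate of change of the rate of change" on a sampled series is as a second-order discrete difference. The slide's own formula is not preserved in the text, so the sketch below is an assumption: it treats the samples as equally spaced and uses hypothetical helper names.

```python
def differences(values):
    """First differences: the change between consecutive samples."""
    return [b - a for a, b in zip(values, values[1:])]


def second_differences(values):
    """Change of the change: the first difference applied twice."""
    return differences(differences(values))
```

For a series that grows quadratically, such as `[1, 4, 9, 16, 25]`, the first differences keep increasing while the second differences stay constant.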
7. 8. 9. 10. 11. 12.
- A stock market index is a method of measuring a section of the
stock market. We use 27 stock market indices.
13. 14. 15.
- Statistical tools are more accurate for CPD
- Binary points produce robust change points
- The transitional ratio and the slope change measures give very
similar results
- Local change point estimation based on true and false points
produces a consistent measure
- Both the transitional ratio and the slope are robust to noisy or
incomplete datasets
16.
- Use binary data for CPD and real data for change measure
- Use regression to predict changes in one dimension using
variables from other dimensions
- Incorporate our system into FP mining
- Apply our methods on other real datasets
- Make our system more efficient and automated
17.