8/9/2019 Content DM
1/10
UNIT Chapter in Book Topic Page Number
UNIT 1
1 INTRODUCTION
What is Data Mining? 1
Motivating Challenges 2
The origins of data mining 4
Data Mining Tasks 6
2 DATA 1
Types of Data
Attributes and Measurement 19
Types of Data Sets 23
Data Quality
Measurement and Data Collection Issues 36
Issues Related to Applications 43
UNIT 2
2 DATA 2
Data Preprocessing
Aggregation 45
Sampling 47
Dimensionality Reduction 50
Feature Subset Selection 52
Feature Creation 55
Discretization and Binarization 57
Variable Transformation 63
Measures of Similarity and Dissimilarity
Basics 66
Similarity and Dissimilarity between Simple Attributes 67
Dissimilarities between Data Objects 69
Similarities between Data Objects 72
Examples of Proximity Measures 73
Issues in Proximity Calculation 80
Selecting the Right Proximity Measure 83
UNIT CHAPTER IN BOOK TOPIC CONTENT PAGE NO.
UNIT 3
4 CLASSIFICATION
Preliminaries 146
General Approach to Solving a Classification Problem 148
Decision Tree Induction
How a Decision Tree Works 150
How to Build a Decision Tree 151
Methods for Expressing Attribute Test Conditions 155
Measures for Selecting the Best Split 158
Algorithm for Decision Tree Induction 164
An Example: Web Robot Detection 166
Characteristics of Decision Tree Induction 168
5 CLASSIFICATION
Rule-Based Classifier
How a Rule-Based Classifier Works 207
Rule-Ordering Schemes 211
How to Build a Rule-Based Classifier 212
Direct Methods for Rule Extraction 213
Indirect Methods for Rule Extraction 221
Characteristics of Rule-Based Classifiers 223
Nearest-Neighbor Classifier
Algorithm 223
Characteristics of Nearest-Neighbor Classifiers 225
UNIT - 4
6 ASSOCIATION ANALYSIS
Problem Definition 328
Frequent Itemset Generation
The Apriori Principle 333
Frequent Itemset Generation in the Apriori Algorithm 335
Candidate Generation and Pruning 338
Support Counting 342
Computational Complexity 345
Rule Generation
Confidence-Based Pruning 350
Rule Generation in Apriori Algorithm 350
An Example: Congressional Voting Records 352
Compact Representation of Frequent Itemsets
Maximal Frequent Itemsets 354
Closed Frequent Itemsets 355
Alternative Methods for Generating Frequent Itemsets 359
UNIT - 5
FP-Growth Algorithm
FP-Tree Representation 363
Frequent Itemset Generation in FP-Growth Algorithm 366
Evaluation of Association Patterns
Objective Measures of Interestingness 371
Measures beyond Pairs of Binary Variables 382
Simpson's Paradox 384
Effect of Skewed Support Distribution 386
ASSOCIATION ANALYSIS 2
Sequential Patterns
Problem Formulation 429
Sequential Pattern Discovery 431
Timing Constraints 436
Alternative Counting Schemes 439
UNIT CHAPTER IN BOOK TOPIC CONTENT PAGE NO.
UNIT - 6
CLUSTER ANALYSIS
Overview
What Is Cluster Analysis? 490
Different Types of Clustering 491
Different Types of Clusters 493
K-means
The Basic K-means Algorithm 497
K-means: Additional Issues 506
Bisecting K-means 508
K-means and Different Types of Clusters 510
Strengths and Weaknesses 510
K-means as an Optimization Problem 513
Agglomerative Hierarchical Clustering
Basic Agglomerative Hierarchical Clustering Algorithm 516
Specific Techniques 518
The Lance-Williams Formula for Cluster Proximity 524
Key Issues in Hierarchical Clustering 524
Strengths and Weaknesses 526
DBSCAN
Traditional Density: Center-Based Approach 527
The DBSCAN Algorithm 528
Strengths and Weaknesses 530
Overview of Cluster Evaluation
Overview 533
Unsupervised Cluster Evaluation Using Cohesion and Separation 536
Unsupervised Cluster Evaluation Using the Proximity Matrix 542
Unsupervised Evaluation of Hierarchical Clustering 544
Determining the Correct Number of Clusters 546
Clustering Tendency 547
Supervised Measures of Cluster Validity 548
Assessing the Significance of Cluster Validity Measures 553
UNIT CHAPTER IN BOOK 2 TOPIC CONTENT PAGE NO.
UNIT - 7
FURTHER TOPICS IN DATA MINING
Multidimensional Analysis and Descriptive Mining of Complex Data Objects
Generalization of Structured Data 592
Aggregation and Approximation in Spatial and Multimedia Data Generalization 593
Generalization of Object Identifiers and Class/Subclass Hierarchies 594
Generalization of Class Composition Hierarchies 595
Construction and Mining of Object Cubes 596
Generalization-Based Mining of Plan Databases by Divide and Conquer 596
Spatial Data Mining
Spatial Data Cube Construction and Spatial OLAP 601
Mining Spatial Association and Co-location Patterns 605
Spatial Clustering Methods 606
Spatial Classification and Spatial Trend Analysis 606
Mining Raster Databases 607
Multimedia Data Mining
Similarity Search in Multimedia Data 608
Multidimensional Analysis of Multimedia Data 609
Classification and Prediction Analysis of Multimedia Data 611
Mining Associations in Multimedia Data 612
Audio and Video Data Mining 613
Text Mining
Text Data Analysis and Information Retrieval 615
Dimensionality Reduction for Text 621
Text Mining Approaches 624
Mining the WWW
Mining the Web Page Layout Structure 628-630
Mining the Web Link Structure to Identify Authoritative Web Pages 631
Mining Multimedia Data on the Web 637
Automatic Classification of Web Documents 638
Web Usage Mining 640
UNIT CHAPTER IN BOOK TOPIC CONTENT PAGE NO.
UNIT - 8
APPLICATIONS
Data Mining Applications
Data Mining for Financial Data Analysis 649
Retail Industry 651
Telecommunication Industry 652
Biological Data Analysis 654
Other Scientific Applications 657
Intrusion Detection 658
Data Mining System Products and Research Prototypes
How to Choose a Data Mining System 660
Examples of Commercial Data Mining Systems 663
Additional Themes on Data Mining
Theoretical Foundations of Data Mining 665
Statistical Data Mining 666
Visual and Audio Data Mining 667
Data Mining Privacy and Data Security 670
Social Impact of Data Mining
Ubiquitous and Invisible Data Mining 675
Data Mining Privacy and Data Security 678
Trends in Data Mining 681
TEXT BOOKS:
1. Introduction to Data Mining - Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Pearson Education, 2007.
2. Data Mining: Concepts and Techniques - Jiawei Han and Micheline Kamber, 2nd Edition, Morgan Kaufmann, 2006.
REFERENCE BOOKS:
1. Insight into Data Mining: Theory and Practice - K. P. Soman, Shyam Diwakar, V. Ajay, PHI, 2006.