Page 1: CSC 466: Knowledge Discovery From Data

CSC 466: Knowledge Discovery From Data

Alex Dekhtyar, Department of Computer Science, Cal Poly

New Computer Science Elective

Page 2

Outline

Why?

What?

How?

Discussion

Page 3

Why?

Information Retrieval

Page 4

Why?

Text Classification? Link Analysis?

Page 5

Why?

Recommender Systems

Page 6

Why?

Market Basket Analysis. Purchasing trends analysis.

Page 7

Why?

Data Warehouse… and so much more…

Page 8

Why?

Link Analysis

Page 9

Why?

Cluster Analysis

Page 10

Buzzwords

Data warehousing

Data mining

Information filtering

Recommender Systems

Information retrieval

Text classification

Web mining

OLAP

Cluster Analysis

Market basket analysis

Page 11

Why?

As professionals, hobbyists, and consumers, students constantly interact with intelligent information management technologies

This is moving into the realm of undergraduate-level knowledge

Page 12

@Calstate.edu

CSU Fullerton: CPSC 483 Data Mining and Pattern Recognition

CSU LA: CS 461 Machine Learning; CS 560 Advanced Topics in Artificial Intelligence

CSU Northridge: 595DM Data Mining

CSU Sacramento: CSC 177 Data Warehousing and Data Mining

CSU SF: CSC 869 Data Mining

CSU San Marcos: CS 475 Machine Learning; CS 574 Intelligent Information Retrieval

Page 13

What?

Undergraduate course

Informed consumers

Professionals

OLAP/Data Warehousing

Data Mining

Collaborative Filtering

Information Retrieval

1 quarter = 10 weeks

Knowledge Discovery from Data

Page 14

What? (goals)

Understand KDD technologies @ consumer level

Understand basic types of data mining, information filtering, and information retrieval techniques

Use KDD to analyze information

Implement KDD algorithms

Understand/appreciate societal impacts

Page 15

What? (syllabus in a nutshell)

Intro (data collections, measurement): 2 lectures

Data Warehousing/OLAP: 2 lectures

Data Mining:

    Association Rule Mining: 3 lectures

    Classification: 3 lectures

    Clustering: 3 lectures

Collaborative Filtering/Recommendations: 2 lectures

Information Retrieval: 4 lectures

19 lectures (= spring quarter)

CSC 466, Spring 2009 quarter

Page 16

How? (Alex’s ideas)

Learn-by-doing...

Labs: work with existing software, analyze data, interpret

Labs: small groups, implement simple KDD techniques

Project: groups, find interesting data, analyze it…

Need to incorporate “societal issues”: privacy vs. data access, etc…

Students to make informed choices

Lectures

Breadth over depth

Do a follow-up CSC 560 (grad. DB topics class)

Page 17

How?

TODO List:

Find data for labs and projects

Investigate open source mining/retrieval software

Figure out the textbook (Web Data Mining by Bing Liu is promising)

Page 18

How?

This slide intentionally left blank

