Data Mining Process By Prithvi Raj, Vineesha, Varun, Swaroop, Saranya


Data Mining process explained in brief and simple terminology.



Content Layout

• Introduction: What is Data Mining?

– Invention: Why Data Mining?

– Evolution of database technologies

• Applications

• Steps Involved

• Data Mining Techniques

• Data Mining Tools


Introduction

From a commercial point of view, a lot of data is being collected and stored:

• Web data, e-commerce

• Grocery store billing

• Bank/credit card transactions

• Railway bookings, etc.

Even minor data entries such as these can become important later (for example, in crime investigations or product returns).


Definition

Many definitions exist:

• Extraction of implicit, previously unknown and potentially useful information from data

• Exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns


What is data mining?

Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases.

Alternative names: Knowledge Discovery in Databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.


What is not data mining?

What is NOT data mining:

• Looking up phone numbers in a phone directory

• Querying a web search engine for information about “Amazon”

What is data mining:

• Discovering that certain names are more prevalent in certain areas (Srinu, Venkat, Harsha)

• Grouping together similar documents returned by a search engine according to their context (Amazon forest, Amazon.com)
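
The distinction above can be sketched in a few lines of Python. This is only an illustration: the records, area labels and phone numbers are invented, and the function names are hypothetical. The first function is a plain lookup (it returns what was stored); the second summarises the data to reveal which name is most common in each area, which is closer in spirit to data mining.

```python
from collections import Counter, defaultdict

# Hypothetical records (name, area, phone); all values are invented.
RECORDS = [
    ("Srinu", "Area A", "555-0101"),
    ("Venkat", "Area A", "555-0102"),
    ("Srinu", "Area A", "555-0103"),
    ("Harsha", "Area B", "555-0104"),
    ("Harsha", "Area B", "555-0105"),
    ("Venkat", "Area B", "555-0106"),
]


def lookup_phone(name, area):
    """Plain retrieval: returns stored values, discovers nothing new."""
    return [phone for n, a, phone in RECORDS if n == name and a == area]


def most_common_name_per_area(records):
    """Pattern discovery: finds which name dominates each area."""
    counts = defaultdict(Counter)
    for name, area, _phone in records:
        counts[area][name] += 1
    return {area: c.most_common(1)[0][0] for area, c in counts.items()}


if __name__ == "__main__":
    print(lookup_phone("Venkat", "Area A"))
    print(most_common_name_per_area(RECORDS))
```

The lookup answers a question whose answer was already in the directory; the per-area summary produces information that no single record contains.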


“History of data mining”

Necessity is the Mother of Invention

Data explosion problem

• Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories.

“Solution”

Data Mining

– Extraction of interesting knowledge (rules, regularities, patterns, constraints) from data in large databases


Evolution of Database Technology

• 1960s:

– Data collection, database creation, IMS and network DBMS

• 1970s:

– Relational data model, relational DBMS implementation

• 1980s:

– RDBMS, advanced data models and application-oriented DBMS

• 1990s – present:

– Data mining and data warehousing, multimedia databases and web databases


Origins of Data Mining

• Draws ideas from machine learning/AI, pattern recognition, statistics and database systems

• Traditional techniques may be unsuitable due to

– Enormity of data

– High dimensionality of data

– Heterogeneous distributed nature of data


Applications of data mining:

Industry              Application
Finance               Card transaction analysis
Insurance             Claims, fraud analysis
Telecommunication     Call recordings
Transport             Logistics management
Consumer goods        Promotion analysis
Scientific research   Image, video, speech
Utilities             Power usage analysis


Business Intelligence

Business intelligence (BI) refers to the applications and technologies used to gather, provide access to, and analyze data and information about a company's operations.

[Figure: diagram showing the use of data mining in decision making and business intelligence]


Steps involved in Data Mining

• Data integration

• Data selection

• Data cleaning

• Data transformation

• Data mining

• Pattern evaluation

• Knowledge presentation
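
The seven steps above can be sketched as a chain of small Python functions. This is a minimal illustration under stated assumptions, not a standard implementation: the two billing sources, their field names and the threshold `min_count` are all invented, and the "mining" step here is just a frequency count.

```python
from collections import Counter

# Hypothetical billing records from two sources; all values are invented.
STORE_A = [
    {"customer": "c1", "item": "Milk", "amount": 40.0},
    {"customer": "c2", "item": "bread", "amount": None},  # missing amount
]
STORE_B = [
    {"customer": "c3", "item": "milk ", "amount": 42.0},
    {"customer": "c4", "item": "Eggs", "amount": 60.0},
    {"customer": "c5", "item": "MILK", "amount": 38.0},
]


def integrate(*sources):
    """Data integration: combine several sources into one list of rows."""
    return [row for source in sources for row in source]


def select(rows, fields):
    """Data selection: keep only the fields relevant to the task."""
    return [{f: row[f] for f in fields} for row in rows]


def clean(rows):
    """Data cleaning: drop rows that contain missing values."""
    return [row for row in rows if all(v is not None for v in row.values())]


def transform(rows):
    """Data transformation: normalise item names to a common form."""
    return [{**row, "item": row["item"].strip().lower()} for row in rows]


def mine(rows):
    """Data mining: here, simply count how often each item is bought."""
    return Counter(row["item"] for row in rows)


def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns that are frequent enough."""
    return {item: n for item, n in patterns.items() if n >= min_count}


def present(patterns):
    """Knowledge presentation: turn patterns into readable statements."""
    return [f"{item}: {n} purchases" for item, n in sorted(patterns.items())]


def run_pipeline():
    rows = integrate(STORE_A, STORE_B)
    rows = select(rows, ["item", "amount"])
    rows = clean(rows)
    rows = transform(rows)
    patterns = mine(rows)
    return present(evaluate(patterns, min_count=2))


if __name__ == "__main__":
    print(run_pipeline())
```

Each function corresponds to one bullet, in the order the slide lists them; in practice the steps are often repeated or reordered as problems in the data come to light.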


Data Mining Techniques

• Classification and prediction

– Focused hiring

• Cluster analysis

– Market segmentation

• Outlier analysis

– Fraud detection

• Association analysis

– Market basket analysis

• Evolution analysis
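
As one concrete case from the list above, market basket analysis can be sketched with two commonly used measures: support (how often items appear together) and confidence (how often the second item appears when the first does). The baskets below are invented, and the function names are hypothetical; real tools use more efficient algorithms than this brute-force count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets; all values are invented.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
]


def pair_support(baskets):
    """Fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair has one canonical (alphabetical) form.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing `antecedent`, the fraction that also
    contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    hits = sum(consequent in b for b in with_antecedent)
    return hits / len(with_antecedent)


if __name__ == "__main__":
    print(pair_support(BASKETS))
    print(confidence(BASKETS, "butter", "bread"))
```

With these invented baskets, every basket that contains butter also contains bread, so the rule "butter implies bread" has confidence 1.0, while bread and milk appear together in half of the baskets.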


Data Mining Tools

• Microsoft SQL Server 2005

• Microsoft SQL Server 2008

• Oracle Data Mining

• DBMiner
