Data Mining process explained in brief and simple terminology.
Data Mining Process
By Prithvi Raj, Vineesha, Varun, Swaroop, Saranya
Content Layout
• Introduction: What is Data Mining?
– Invention: Why Data Mining?
– Evolution of database technologies
• Applications
• Steps Involved
• Data Mining Techniques
• Data Mining Tools
Introduction
From a commercial point of view, large amounts of data are being collected and stored:
• Web data, e-commerce
• Grocery store billings
• Bank/credit card transactions
• Railway bookings, etc.
Individual entries that look minor can become very important later (crime investigations, product returns, etc.)
Definition
Many definitions:
• Extraction of implicit, previously unknown and potentially useful information from data
• Exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns
What is data mining?
Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases.
Alternative names: Knowledge Discovery in Databases (KDD), knowledge extraction, data/pattern analysis, data archaeology, data dredging, information harvesting, business intelligence, etc.
What is NOT data mining?
• Looking up phone numbers in a phone directory
• Querying a web search engine for information about “Amazon”
What is data mining?
• Discovering that certain names are more prevalent in certain areas (Srinu, Venkat, Harsha)
• Grouping together similar documents returned by a search engine according to their context (Amazon forest, Amazon.com)
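The contrast above can be made concrete with a minimal sketch. The directory records, the area labels, and both function names are hypothetical and invented for illustration. `lookup_phone` only retrieves what is already stored, while `most_common_name_by_area` derives a pattern that no single record states.

```python
from collections import Counter, defaultdict

# Hypothetical directory records: (name, area, phone). All values are invented.
directory = [
    ("Srinu", "Area A", "555-0101"),
    ("Venkat", "Area A", "555-0102"),
    ("Srinu", "Area A", "555-0103"),
    ("Harsha", "Area B", "555-0104"),
    ("Harsha", "Area B", "555-0105"),
    ("Venkat", "Area B", "555-0106"),
    ("Harsha", "Area B", "555-0107"),
]

def lookup_phone(name, area):
    """NOT data mining: return phone numbers already stored for a name and area."""
    return [phone for n, a, phone in directory if n == name and a == area]

def most_common_name_by_area(records):
    """Data mining (toy scale): find the most prevalent name in each area."""
    counts = defaultdict(Counter)
    for name, area, _phone in records:
        counts[area][name] += 1
    return {area: c.most_common(1)[0][0] for area, c in counts.items()}
```

On real data the second kind of question would be asked over millions of records, which is where dedicated mining techniques become necessary.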
“History of data mining”
Necessity is the Mother of Invention
Data explosion problem
• Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories.
“Solution”
• Data Mining: extraction of interesting knowledge (rules, regularities, patterns, constraints) from data in large databases
Evolution of Database Technology
• 1960s:
– Data collection, database creation, IMS and network DBMS
• 1970s:
– Relational data model, relational DBMS implementation
• 1980s:
– RDBMS, advanced data models and application-oriented DBMS
• 1990s – present:
– Data mining and data warehousing, multimedia databases and web databases
Origins of Data Mining
• Draws ideas from machine learning/AI, pattern recognition, statistics and database systems
• Traditional techniques may be unsuitable due to
– Enormity of data
– High dimensionality of data
– Heterogeneous distributed nature of data
Applications of data mining:
Industry               Application
Finance                Card transaction analysis
Insurance              Claims, fraud analysis
Telecommunication      Call recordings
Transport              Logistics management
Consumer goods         Promotion analysis
Scientific research    Image, video, speech
Utilities              Power usage analysis
Business intelligence (BI) refers to applications and technologies that are used to gather, provide access to, and analyze data and information about a company's operations.
[Figure: diagram showing the use of data mining in decision making and business intelligence]
Steps involved in Data Mining
• Data integration
• Data selection
• Data cleaning
• Data transformation
• Data mining
• Pattern evaluation
• Knowledge presentation
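The seven steps above can be sketched as a chain of small functions, one per step. This is a toy illustration under stated assumptions, not a real mining system: the two data sources, their field names, and every function name are hypothetical, and the "data mining" step is reduced to a simple frequency count.

```python
from collections import Counter

# Hypothetical raw sources; every record and field name here is invented.
store_sales = [
    {"customer": "c1", "item": "Milk ", "amount": "40"},
    {"customer": "c2", "item": "bread", "amount": "25"},
    {"customer": "c3", "item": "milk", "amount": None},  # missing amount
]
web_sales = [
    {"customer": "c1", "item": "MILK", "amount": "42"},
    {"customer": "c4", "item": "Bread", "amount": "30"},
]

def integrate(*sources):
    """Data integration: combine records from several sources."""
    return [record for source in sources for record in source]

def select(records, fields):
    """Data selection: keep only the fields relevant to the analysis."""
    return [{f: r[f] for f in fields} for r in records]

def clean(records):
    """Data cleaning: drop records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]

def transform(records):
    """Data transformation: normalize item names, convert amounts to numbers."""
    return [
        {"item": r["item"].strip().lower(), "amount": float(r["amount"])}
        for r in records
    ]

def mine(records):
    """Data mining (toy stand-in): count how often each item is sold."""
    return Counter(r["item"] for r in records)

def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns that occur often enough."""
    return {item: n for item, n in patterns.items() if n >= min_count}

def present(patterns):
    """Knowledge presentation: format the surviving patterns as text lines."""
    return [f"{item}: sold {n} times" for item, n in sorted(patterns.items())]

def run_pipeline():
    records = integrate(store_sales, web_sales)
    records = select(records, ["item", "amount"])
    records = clean(records)
    records = transform(records)
    patterns = evaluate(mine(records), min_count=2)
    return present(patterns)
```

Calling `run_pipeline()` walks the records through the steps in the order listed. In practice the steps are often revisited, for example returning to cleaning after pattern evaluation exposes bad data.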
Data Mining Techniques
• Classification and Prediction
– Focused hiring
• Cluster analysis
– Market segmentation
• Outlier analysis
– Fraud detection
• Association analysis
– Market basket analysis
• Evolution analysis
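As an illustration of one technique from the list above, association analysis for market baskets is commonly described with two measures: support (the fraction of baskets that contain an itemset) and confidence (how often the consequent appears in baskets that contain the antecedent). The sketch below is a brute-force version under stated assumptions: the baskets are invented, the function names are hypothetical, and only item pairs are examined rather than using a full algorithm such as Apriori.

```python
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support of (antecedent and consequent) divided by support of antecedent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def frequent_pairs(transactions, min_support):
    """Brute force: every item pair whose support reaches min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for pair in combinations(items, 2):
        s = support(pair, transactions)
        if s >= min_support:
            result[pair] = s
    return result
```

With these baskets, bread and milk appear together in half of the transactions, and the rule "butter implies bread" has confidence 1.0 because every basket containing butter also contains bread.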
Data Mining Tools
• Microsoft SQL Server 2005
• Microsoft SQL Server 2008
• Oracle Data Mining
• DBMiner