Data Mining Tools: KnowledgeSeeker 4.5

Data Mining Tools

Alexandra Bahl

Ray Bichard

Wes Griffin

Kitty Roberts

Presentation Outline

Overview of Data Mining

Introduction to KnowledgeSeeker

Major Competitors

Current Applications

Introduction to Key Terms

Interactive Demonstration

Summary

Questions & Answers

Wes Griffin

Ray Bichard

Kitty Roberts

Alex Bahl

What is Data Mining?

“Data mining is a continuous, iterative process. It involves the use of software, sound methodology, and human creativity to achieve new insight through the exploration of data to uncover patterns, relationships, anomalies and dependencies.”

Data Mining History

Data Collection and Database Creation: 1960s

Database Management Systems: 1970s - early 1980s

Advanced DBMS: mid 1980s - present

DW and Data Mining: late 1980s - present

Web-based DBMS: 1990 - present

New Generation Integrated Information Systems

Data Mining Architecture

Graphical User Interface

Pattern Evaluation

Data Mining Engine

Database or DW server

Data Cleaning

Data Integration

Filtering

Database

Data Warehouse

Knowledge Base

What is KnowledgeSeeker?

A data analysis, data mining package

Enables users to quickly analyze and understand the relationships between variables in a data set.

First generation data mining tool

Most widely used “decision tree” data mining analytical tool

Price per copy: $4750.00 USD

What is KnowledgeSeeker?

Produced by ANGOSS Software Corporation, which focuses “solely” on data mining software.

Offers training and consulting services

Produces data mining add-ins that accept data from all major databases

Works with popular query and reporting, spreadsheet, statistical, OLAP and ROLAP tools.

The KnowledgeSeeker Process

Define business goal

Prepare the data

Analyze the data

The KnowledgeSeeker Process

Define business goal

What question needs to be answered?

What type of analysis will be performed?

What functionalities does the business require?

The KnowledgeSeeker Process

Prepare the data

Consider the various factors that could influence the outcome.

Examine the database to identify the data fields that provide measurements of potential dependencies.

Create a subset of the database containing only those data fields.
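
The preparation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (such as favorite_color) and the helper subset_fields are hypothetical and do not come from KnowledgeSeeker or from the presentation data.

```python
import csv
import io

# Hypothetical raw extract; every column name and value here is illustrative.
RAW = """id,age,gender,smoking,salt,favorite_color,blood_pressure
1,66,M,regular,low,blue,high
2,54,F,never,high,red,normal
3,47,M,quit,moderate,green,normal
"""

# Fields judged to be potential influences, plus the outcome field itself.
KEEP = ["age", "gender", "smoking", "salt", "blood_pressure"]


def subset_fields(rows, keep):
    """Return a copy of each record restricted to the chosen fields."""
    return [{name: row[name] for name in keep} for row in rows]


rows = list(csv.DictReader(io.StringIO(RAW)))
subset = subset_fields(rows, KEEP)
print(subset[0])
```

Fields such as the record id or favorite_color are dropped here because they are not expected to influence the outcome; which fields to keep is a judgment made while defining the business goal.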

The KnowledgeSeeker Process

Analyze the data

KnowledgeSeeker automatically scans all the fields in the data set, summarizes the statistically significant patterns and relationships among the fields, and displays the result as a graphical decision tree or as a knowledge base of rules.
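
To make the idea of scanning every field for a relationship with the outcome concrete, the following Python sketch ranks candidate fields by a Pearson chi-square statistic. It is a minimal illustration, not KnowledgeSeeker's actual algorithm: the records, the field names and the functions chi_square and rank_splits are assumptions made for this example, and the sketch ranks by the raw statistic only, without the significance test a real tool would apply.

```python
from collections import Counter

# Hypothetical records: smoking is perfectly associated with blood pressure,
# gender is not associated with it at all.
RECORDS = [
    {"smoking": s, "gender": g, "blood_pressure": bp}
    for s, bp in (("regular", "high"), ("never", "normal"))
    for g in ("M", "M", "F", "F")
]


def chi_square(records, field, target):
    """Pearson chi-square statistic for the field-by-target contingency table."""
    n = len(records)
    field_totals = Counter(r[field] for r in records)
    target_totals = Counter(r[target] for r in records)
    observed = Counter((r[field], r[target]) for r in records)
    stat = 0.0
    for f, f_count in field_totals.items():
        for t, t_count in target_totals.items():
            expected = f_count * t_count / n
            stat += (observed[(f, t)] - expected) ** 2 / expected
    return stat


def rank_splits(records, target):
    """Score every non-target field and sort from strongest to weakest."""
    fields = [name for name in records[0] if name != target]
    scores = {f: chi_square(records, f, target) for f in fields}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_splits(RECORDS, "blood_pressure")
print(ranking)  # [('smoking', 8.0), ('gender', 0.0)]
```

In this toy data the first split would be made on smoking. Fields with different numbers of categories should be compared by p-value rather than by the raw statistic, which this sketch leaves out.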

KnowledgeSeeker Pulsepoints

ADVANTAGES

Easy to use

Powerful

Scalability

Flexibility

DISADVANTAGES

Less than impressive GUI

Major Competitors

Company Software

Clementine 6.0

Enterprise Miner 3.0

Intelligent Miner

Major Competitors

Company Software

Mineset 3.1

Darwin

Scenario

Current Applications

Manufacturing

Used by the R.R. Donnelley & Sons commercial printing company to improve process control, cut costs and increase productivity.

Used extensively by Hewlett Packard in their United States manufacturing plants as a process control tool, both to analyze factors impacting product quality and to generate rules for production control systems.

Current Applications

Auditing

Used by the IRS to combat fraud, reduce risk, and increase collection rates.

Finance

Used by the Canadian Imperial Bank of Commerce (CIBC) to create models for fraud detection and risk management.

Current Applications

CRM

Telephony

Used by US West to reduce churn and increase customer loyalty for a new voice messaging technology.

Current Applications

Marketing

Used by the Washington Post to improve their direct mail targeting and to conduct survey analysis.

Health Care

Used by the Oxford Transplant Center to discover factors affecting transplant survival rates.

Used by the University of Rochester Cancer Center to study the effect of anxiety on chemotherapy-related nausea.

More Customers

Introduction to Key Terms

Dependent / Independent variables

Root node / nodes

Decision tree

Splits

Clustering

Questions

1. What percentage of people in the test group have high blood pressure with these characteristics: a 66-year-old male regular smoker who has low to moderate salt consumption?

2. Do the risk levels change for a male with the same characteristics who quit smoking? What are the percentages?

3. If you are a 2% milk drinker, how many factors are still interesting?

4. Knowing that salt consumption and smoking habits are interesting factors, which one has a stronger correlation to blood pressure levels?

5. Grow an automatic tree. Is gender an interesting factor for a 55-year-old regular smoker who does not eat cheese?
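
Questions 1 and 2 amount to reading the outcome percentage at one node of the tree, that is, among the records that satisfy a set of conditions. The Python sketch below shows that calculation on a handful of made-up records; the data, the field names and the function percent_with_outcome are hypothetical and are not the demonstration data set.

```python
# Hypothetical survey records; field names and values are illustrative only.
RECORDS = [
    {"gender": "M", "smoking": "regular", "blood_pressure": "high"},
    {"gender": "M", "smoking": "regular", "blood_pressure": "high"},
    {"gender": "M", "smoking": "regular", "blood_pressure": "normal"},
    {"gender": "M", "smoking": "regular", "blood_pressure": "high"},
    {"gender": "M", "smoking": "quit", "blood_pressure": "normal"},
    {"gender": "F", "smoking": "regular", "blood_pressure": "normal"},
]


def percent_with_outcome(records, conditions, target, value):
    """Percentage of records matching `conditions` whose `target` equals `value`."""
    node = [r for r in records if all(r[k] == v for k, v in conditions.items())]
    if not node:
        return None  # no records fall into this node
    hits = sum(1 for r in node if r[target] == value)
    return 100.0 * hits / len(node)


male_regular = percent_with_outcome(
    RECORDS, {"gender": "M", "smoking": "regular"}, "blood_pressure", "high"
)
male_quit = percent_with_outcome(
    RECORDS, {"gender": "M", "smoking": "quit"}, "blood_pressure", "high"
)
print(male_regular)  # 75.0
print(male_quit)     # 0.0
```

Comparing the two percentages is the same comparison question 2 asks for: the conditions stay fixed except for the smoking value.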

Summary

Data mining has evolved into knowledge discovery

KnowledgeSeeker provides rapid data analysis

KnowledgeSeeker is flexible and inexpensive

KnowledgeSeeker is easy to use
