Data Mining: Implementation of Data Mining Techniques using RapidMiner software

Prepared by Mohammed Kharma

Definitions review

• Cluster: a collection of data objects that are
– similar (or related) to one another within the same group
– dissimilar (or unrelated) to the objects in other groups
• Cluster analysis
– Finding similarities between data according to the characteristics found in the data, and grouping similar data objects into clusters

Clustering Methods

• Partitioning:
– Unsupervised learning algorithms that construct various partitions and then evaluate them by some criterion, e.g., minimizing the sum of squared errors
– Typical methods: k-means, k-medoids
• Hierarchical:
– Create a hierarchical decomposition of the set of data (or objects) using some criterion
– Typical methods: DIANA, AGNES, BIRCH, ROCK, CHAMELEON
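The hierarchical decomposition described above can be sketched as a bottom-up (AGNES-style) merge loop. This is only a minimal sketch under stated assumptions: the class name `AgglomerativeSketch`, the single-linkage criterion, the Euclidean distance, and the stopping rule (a hypothetical `target` number of clusters) are illustrative choices and are not taken from the slides.

```java
import java.util.ArrayList;
import java.util.List;

class AgglomerativeSketch {

    // Euclidean distance between two points of equal dimension (assumed measure).
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Single linkage: distance between the closest pair of members.
    static double linkage(List<double[]> a, List<double[]> b) {
        double best = Double.MAX_VALUE;
        for (double[] p : a) {
            for (double[] q : b) {
                best = Math.min(best, distance(p, q));
            }
        }
        return best;
    }

    // Starts with one cluster per instance and repeatedly merges the two
    // closest clusters until only `target` clusters remain.
    static List<List<double[]>> cluster(double[][] data, int target) {
        List<List<double[]>> clusters = new ArrayList<>();
        for (double[] p : data) {
            List<double[]> single = new ArrayList<>();
            single.add(p);
            clusters.add(single);
        }
        while (clusters.size() > target) {
            int bestA = 0;
            int bestB = 1;
            double best = Double.MAX_VALUE;
            for (int a = 0; a < clusters.size(); a++) {
                for (int b = a + 1; b < clusters.size(); b++) {
                    double d = linkage(clusters.get(a), clusters.get(b));
                    if (d < best) {
                        best = d;
                        bestA = a;
                        bestB = b;
                    }
                }
            }
            // bestB > bestA, so removing bestB does not shift bestA.
            List<double[]> merged = clusters.remove(bestB);
            clusters.get(bestA).addAll(merged);
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Hypothetical 2-D data, not the data set used in the slides.
        double[][] data = { {1, 1}, {1, 2}, {2, 1}, {8, 8}, {9, 8}, {8, 9} };
        for (List<double[]> c : cluster(data, 2)) {
            System.out.println("cluster of size " + c.size());
        }
    }
}
```

Recording each merge instead of stopping at `target` would give the full hierarchy (a dendrogram); the sketch stops early only to keep the output small.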

Illustration and comparison of two clustering techniques using the RapidMiner tool and a Java application

• K-means algorithm: we performed two tests
1. Using a Java program. Program parameters: K = 2; Data: 22 2123 2024 2225 33 2

K-means Clustering
• Input: the number of clusters k and the collection of n instances
• Output: a set of k clusters that minimizes the squared error criterion
• Method:
– Arbitrarily choose k instances as the initial cluster centers
– Repeat
  • (Re)assign each instance to the cluster to which it is most similar, based on the mean value of the instances in the cluster
  • Update the cluster means (compute the mean value of the instances in each cluster)
– Until no change in the assignment

• Squared Error Criterion
– $E = \sum_{i=1}^{k} \sum_{p \in C_i} |p - m_i|^2$
– where $m_i$ is the mean of cluster $C_i$ and $p$ ranges over the points in $C_i$
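The method and the squared error criterion above can be sketched in Java as follows. This is a minimal sketch, not the Java program used for the test: the class name `KMeansSketch`, the choice of the first k instances as the "arbitrary" initial centers, the use of squared Euclidean distance as the similarity measure, and the sample data are all assumptions made for illustration.

```java
import java.util.Arrays;

class KMeansSketch {

    // Squared Euclidean distance |a - b|^2 between two points.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    // Mean of each cluster; a cluster with no members keeps its fallback center.
    static double[][] means(double[][] data, int[] assignment, double[][] fallback) {
        int k = fallback.length;
        int dim = data[0].length;
        double[][] result = new double[k][dim];
        int[] counts = new int[k];
        for (int i = 0; i < data.length; i++) {
            counts[assignment[i]]++;
            for (int d = 0; d < dim; d++) {
                result[assignment[i]][d] += data[i][d];
            }
        }
        for (int c = 0; c < k; c++) {
            for (int d = 0; d < dim; d++) {
                result[c][d] = counts[c] > 0 ? result[c][d] / counts[c] : fallback[c][d];
            }
        }
        return result;
    }

    // Returns, for each instance, the index of the cluster it ends up in.
    static int[] cluster(double[][] data, int k) {
        // Assumption: the "arbitrary" initial centers are the first k instances.
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = data[c].clone();
        }
        int[] assignment = new int[data.length];
        Arrays.fill(assignment, -1);
        boolean changed = true;
        while (changed) { // repeat until no change in the assignment
            changed = false;
            for (int i = 0; i < data.length; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (squaredDistance(data[i], centers[c])
                            < squaredDistance(data[i], centers[best])) {
                        best = c;
                    }
                }
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            centers = means(data, assignment, centers); // update cluster means
        }
        return assignment;
    }

    // Squared error criterion E for a given assignment.
    static double squaredError(double[][] data, int[] assignment, int k) {
        double[][] m = means(data, assignment, new double[k][data[0].length]);
        double e = 0.0;
        for (int i = 0; i < data.length; i++) {
            e += squaredDistance(data[i], m[assignment[i]]);
        }
        return e;
    }

    public static void main(String[] args) {
        // Hypothetical 2-D data, not the data set used in the slides.
        double[][] data = { {1, 1}, {1, 2}, {2, 1}, {8, 8}, {9, 8}, {8, 9} };
        int[] assignment = cluster(data, 2);
        System.out.println(Arrays.toString(assignment));
        System.out.println("E = " + squaredError(data, assignment, 2));
    }
}
```

With this sample data the two tight groups end up in separate clusters. A different choice of initial centers can lead to a different result, since k-means only converges to a local minimum of E.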

The result of the K-Means Java program

The result of K-Means-RapidMiner

Continued-The result of K-Means-RapidMiner

K-medoids
• Input: the number of clusters k and the collection of n instances
• Output: a set of k clusters that minimizes the sum of the dissimilarities of all the instances to their nearest medoids
• Method:
– Arbitrarily choose k instances as the initial medoids
– Repeat
  • (Re)assign each remaining instance to the cluster with the nearest medoid
  • Randomly select a non-medoid instance, O_r
  • Compute the total cost, S, of swapping a current medoid O_j with O_r
  • If S < 0 then swap O_j with O_r to form the new set of k medoids
– Until no change
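The swap-based method above can be sketched in Java as follows. This is a minimal sketch under stated assumptions, not the RapidMiner implementation: instead of selecting a non-medoid instance at random, it tries every (medoid, non-medoid) pair in turn so that the run is deterministic. The class name `KMedoidsSketch`, the Euclidean dissimilarity, the first-k initialization, and the sample data are also illustrative choices.

```java
import java.util.Arrays;

class KMedoidsSketch {

    // Euclidean distance, used here as the dissimilarity (an assumption).
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Sum of the dissimilarities of all instances to their nearest medoid.
    static double totalCost(double[][] data, int[] medoids) {
        double cost = 0.0;
        for (double[] p : data) {
            double nearest = Double.MAX_VALUE;
            for (int m : medoids) {
                nearest = Math.min(nearest, distance(p, data[m]));
            }
            cost += nearest;
        }
        return cost;
    }

    static boolean contains(int[] values, int x) {
        for (int v : values) {
            if (v == x) {
                return true;
            }
        }
        return false;
    }

    // Returns the indices (into data) of the k medoids that were found.
    static int[] findMedoids(double[][] data, int k) {
        // Assumption: the "arbitrary" initial medoids are the first k instances.
        int[] medoids = new int[k];
        for (int j = 0; j < k; j++) {
            medoids[j] = j;
        }
        boolean changed = true;
        while (changed) { // repeat until no swap improves the cost
            changed = false;
            double current = totalCost(data, medoids);
            for (int j = 0; j < k && !changed; j++) {
                for (int r = 0; r < data.length && !changed; r++) {
                    if (contains(medoids, r)) {
                        continue; // O_r must be a non-medoid instance
                    }
                    int[] candidate = medoids.clone();
                    candidate[j] = r; // swap O_j with O_r
                    double s = totalCost(data, candidate) - current; // swap cost S
                    if (s < 0) {
                        medoids = candidate;
                        changed = true;
                    }
                }
            }
        }
        return medoids;
    }

    public static void main(String[] args) {
        // Hypothetical 2-D data, not the data set used in the slides.
        double[][] data = { {1, 1}, {1, 2}, {2, 1}, {8, 8}, {9, 8}, {8, 9} };
        int[] medoids = findMedoids(data, 2);
        System.out.println(Arrays.toString(medoids));
        System.out.println("cost = " + totalCost(data, medoids));
    }
}
```

Each accepted swap strictly lowers the total cost and there are only finitely many medoid sets, so the loop terminates. Assigning each instance to its nearest medoid then gives the clusters.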

The result of k-medoids-RapidMiner

Java Live Demo: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html

Comparison

• Both algorithms produced the same results in our tests
• Both require K to be specified in the input
• K-medoids is less influenced by outliers in the data
• Both methods assign each instance to exactly one cluster
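The outlier point above can be illustrated on a single one-dimensional cluster. In this minimal sketch the class name `OutlierSketch` and the sample values are hypothetical. It compares the cluster mean (the k-means representative) with the medoid, the actual instance that minimizes the summed absolute distance to all others.

```java
class OutlierSketch {

    // Arithmetic mean, the representative used by k-means.
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Medoid: the instance with the smallest total distance to all instances.
    static double medoid(double[] values) {
        double best = values[0];
        double bestCost = Double.MAX_VALUE;
        for (double candidate : values) {
            double cost = 0.0;
            for (double v : values) {
                cost += Math.abs(candidate - v);
            }
            if (cost < bestCost) {
                bestCost = cost;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical cluster with one outlier (100).
        double[] values = { 1, 2, 3, 4, 100 };
        System.out.println("mean   = " + mean(values));   // 22.0
        System.out.println("medoid = " + medoid(values)); // 3.0
    }
}
```

The single outlier drags the mean to 22, far from most of the instances, while the medoid stays at 3.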

Thank you
