Advanced Network Database Lab Kaggle Competition Prudential Life Insurance Assessment Can you make buying life insurance easier?

Advanced Network Database Lab

Kaggle Competition

Prudential Life Insurance Assessment

Can you make buying life insurance easier?

Registration

• Site: https://www.kaggle.com/competitions

• Account: IKDD1(Group Number)

Prudential

• Prudential Financial, Inc.

• An American Fortune Global 500 and Fortune 500 company

• https://www.prulife.com.tw/page/index.htm

• $30,000

Data Attribute

Data Attribute

• Nominal type

• Numbers may be used to represent the variables, but the numbers have no numerical value or ordering relationship.
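A minimal sketch of how a nominal attribute might be handled before training, assuming pandas is available. The column name `plan_code` and its values are hypothetical and not taken from the competition data. One-hot encoding is one common option; it keeps a model from reading an order or distance into the codes.

```python
import pandas as pd

# Hypothetical nominal column: the codes 1, 2, 3 are labels, not quantities.
df = pd.DataFrame({"plan_code": [1, 3, 2, 3]})

# One-hot encode: one indicator column per distinct code.
encoded = pd.get_dummies(df, columns=["plan_code"], prefix="plan")

print(encoded.columns.tolist())
```

Each row ends up with exactly one active indicator column, so no code is treated as "larger" than another.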

Classification

Prediction

Decision Tree

Sklearn – Python tool

• Simple and efficient tools for data mining and data analysis!

• Decision tree url : http://scikit-learn.org/stable/modules/tree.html
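As a rough illustration of the scikit-learn decision tree interface, the sketch below fits a `DecisionTreeClassifier` on a tiny made-up data set. The feature values and labels are invented for the example; only the fit/predict pattern is the point.

```python
from sklearn.tree import DecisionTreeClassifier

# Toy data (made up): two numeric features per row, one class label each.
X_train = [[25, 0], [30, 0], [55, 1], [60, 1]]
y_train = [1, 1, 8, 8]

# A shallow tree is enough for this separable toy set.
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(X_train, y_train)

# Predict the class of two unseen rows.
pred = clf.predict([[28, 0], [58, 1]])
print(pred)
```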

Homework 1

• Registration

• Apply a simple algorithm to build the classifier

• To predict the "Response" variable for each Id in the test set

• Submit the result to Kaggle

• Deadline: next Thursday (12/31)
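The Homework 1 steps could look roughly like the sketch below: train a simple classifier, predict "Response" for each Id in the test set, and build a two-column submission. The small in-memory frames and the feature names `feat_a` and `feat_b` are placeholders (assumptions); in practice the data would be loaded from the files downloaded from Kaggle.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Stand-in frames; the real data would be read from the competition's CSV files.
train = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "feat_a": [0.1, 0.2, 0.8, 0.9],  # hypothetical feature names
    "feat_b": [1, 1, 0, 0],
    "Response": [2, 2, 7, 7],
})
test = pd.DataFrame({"Id": [5, 6], "feat_a": [0.15, 0.85], "feat_b": [1, 0]})

# Use every column except the identifier and the target as a feature.
features = [c for c in train.columns if c not in ("Id", "Response")]

clf = DecisionTreeClassifier(random_state=0)
clf.fit(train[features], train["Response"])

# One predicted Response per test Id, serialized as CSV text.
submission = pd.DataFrame({"Id": test["Id"], "Response": clf.predict(test[features])})
csv_text = submission.to_csv(index=False)
print(csv_text)
```

Passing a file name to `to_csv` instead (for example a hypothetical `submission.csv`) would produce a file that can be uploaded.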

Homework 2

• Improve your prediction results

• Oral report

• Deadline: next Thursday (1/7)

Homework 3 (Final project)

• Try different algorithms to build the best classifier

• Submit the result to Kaggle
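One possible way to compare algorithms before submitting is cross-validation on the training data. The sketch below uses a synthetic data set from `make_classification` and plain accuracy, both of which are assumptions for illustration; the candidate models and the scoring metric would be chosen to suit the competition.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the training data.
X, y = make_classification(n_samples=200, n_features=10, n_informative=5, random_state=0)

candidates = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate.
scores = {name: cross_val_score(model, X, y, cv=5).mean() for name, model in candidates.items()}

# Keep the candidate with the highest mean score.
best = max(scores, key=scores.get)
print(scores, best)
```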

Final project

• Deadline: 1/14 23:59

• Submission:

• Submit the results to Kaggle

• Email your project to [email protected]

• Project file content: code, prediction result, report

Report

• The details of your best method

• A description of the methods that you tried

• The important attributes or surprising features you found

Grading

• Homework 1: 20%

• Homework 2: 10%

• Final Project: 70% (the ranking: 20%, algorithm and coding: 25%, report: 25%)

XGBoost

• General purpose gradient boosting library, including generalized linear model and gradient boosted decision tree

• SITE: http://dmlc.ml/

tslm

• A linear model with time series components

• SITE: http://www.inside-r.org/packages/cran/forecast/docs/tslm

H2o.randomForest

• Random Forest (RF) is a powerful classification tool. When given a set of data, RF generates a forest of classification trees, rather than a single classification tree. Each of these trees generates a classification for a given set of attributes. The classification from each H2O tree can be thought of as a vote; the class with the most votes becomes the forest's classification.

• SITE: http://docs.h2o.ai/h2oclassic/datascience/rf.html
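The voting idea described above can be sketched in a few lines of plain Python. This is only an illustration of majority voting over per-tree predictions, not H2O's implementation; the function name `majority_vote` and the vote values are made up for the example.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the class predicted by the most trees (ties go to the class seen first)."""
    return Counter(tree_predictions).most_common(1)[0][0]

# Hypothetical predictions from five trees for one row.
votes = [3, 8, 8, 1, 8]
print(majority_vote(votes))
```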