23
partykit: A Toolbox for Recursive Partytioning Torsten Hothorn Ludwig-Maximilians-Universit¨ at M¨ unchen Achim Zeileis WU Wirtschaftsuniversit¨ at Wien

partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

partykit:

A Toolbox for Recursive Partytioning

Torsten Hothorn

Ludwig-Maximilians-Universitat Munchen

Achim Zeileis

WU Wirtschaftsuniversitat Wien

Page 2: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Trees in R

The Machine Learning task view lists the following tree-related packages

• rpart (CART)

• tree (CART)

• mvpart (multivariate CART)

• knnTree (nearest-neighbor-based trees)

• RWeka (J4.8, M5′, LMT)

• LogicReg (Logic Regression)

• BayesTree (with Bayesian flavor)

• TWIX (with extra splits)

• party (conditional inference trees, model-based trees)

• ...

In addition, some packages implementing ensemble methods are based

on trees as well, for example

2009-07-09 1

Page 3: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Trees in R

• randomForest (CART-based random forests)

• randomSurvivalForest (for censored responses)

• party (conditional random forests)

• gbm (tree-based gradient boosting)

• mboost (model-based and tree-based gradient boosting)

• ...

These packages deal with very similar problems. However, similar or

even the same tasks are implemented differently by each package. No

code reuse takes place.

2009-07-09 2

Page 4: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Missing in Action

Moreover, important methods in this field, mostly from the statistics

literature, are missing from the above list, for example

• CHAID

• FACT

• QUEST

• GUIDE

• LOTUS

• CRUISE

• ...

2009-07-09 3

Page 5: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Adding more Trees?

It is hard to come up with implementations of new tree-based algorithms,

because one needs to take care not only of fitting the tree but also of

• representing fitted trees,

• printing trees,

• plotting trees, and

• computing predictions from trees.

2009-07-09 4

Page 6: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Wouldn’t it be nice to...

hansi

... have a common infrastructure for

• representing fitted trees,

• printing trees,

• plotting trees, and

• computing predictions from trees?

2009-07-09 5

Page 7: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

R-Forge Package partykit

hansi

offers a unified approach to

• representing fitted trees,

• printing trees,

• plotting trees, and

• computing predictions from trees.

2009-07-09 6

Page 8: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Unified Representation of Trees

The most important classes in partykit are

partysplit: binary, multiway and functional splits in ordered and

unordered variables,

partynode: inner nodes containing splits, surrogate splits, kid nodes, and

terminal nodes containing models or constant fits,

party: trees (nodes and data).

The API essentially consists of the corresponding constructors.

2009-07-09 7

Page 9: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Methods

Coercing:

as.party.rpart, as.party.J48, as.party.pmmlTreeModel

Inspecting:

print.party, plot.party, both are extensible

Computing:

predict.party, subset methods

2009-07-09 8

Page 10: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Example: Pretty rpart Trees

R> data("GlaucomaM", package = "ipred")R> library("rpart")R> g_rpart <- rpart(Class ~ ., data = GlaucomaM)R> print(g_rpart)n= 196

node), split, n, loss, yval, (yprob)* denotes terminal node

1) root 196 98 glaucoma (0.50000000 0.50000000)2) varg< 0.209 76 6 glaucoma (0.92105263 0.07894737) *3) varg>=0.209 120 28 normal (0.23333333 0.76666667)

6) mhcg>=0.1695 7 0 glaucoma (1.00000000 0.00000000) *7) mhcg< 0.1695 113 21 normal (0.18584071 0.81415929)14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) *15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

30) tms>=-0.0655 32 8 normal (0.25000000 0.75000000)60) eas< 0.4455 7 2 glaucoma (0.71428571 0.28571429) *61) eas>=0.4455 25 3 normal (0.12000000 0.88000000) *

31) tms< -0.0655 62 3 normal (0.04838710 0.95161290) *

2009-07-09 9

Page 11: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Example: Pretty rpart Trees

R> plot(g_rpart)R> text(g_rpart)

|varg< 0.209

mhcg>=0.1695

vars< 0.064tms>=−0.0655

eas< 0.4455

glaucoma

glaucomaglaucoma

glaucoma normal normal

2009-07-09 10

Page 12: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Example: Pretty rpart Trees

R> library("partykit")R> g_party <- as.party(g_rpart)R> print(g_party, header = FALSE)[1] root| [2] varg < 0.209: glaucoma (n = 76, err = 7.9%)| [3] varg >= 0.209| | [4] mhcg >= 0.1695: glaucoma (n = 7, err = 0.0%)| | [5] mhcg < 0.1695| | | [6] vars < 0.064: glaucoma (n = 19, err = 47.4%)| | | [7] vars >= 0.064| | | | [8] tms >= -0.0655| | | | | [9] eas < 0.4455: glaucoma (n = 7, err = 28.6%)| | | | | [10] eas >= 0.4455: normal (n = 25, err = 12.0%)| | | | [11] tms < -0.0655: normal (n = 62, err = 4.8%)

Number of inner nodes: 5Number of terminal nodes: 6

2009-07-09 11

Page 13: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Example: Pretty rpart Trees

R> plot(g_party, type = "simple")

varg

1

< 0.209 >= 0.209

terminalnode

2

mhcg

3

>= 0.1695 < 0.1695

terminalnode

4

vars

5

< 0.064 >= 0.064

terminalnode

6

tms

7

>= −0.0655 < −0.0655

eas

8

< 0.4455>= 0.4455

terminalnode

9terminal

node

10

terminalnode

11

2009-07-09 12

Page 14: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Example: Pretty rpart Trees

R> plot(g_party, type = "extended")

varg

1

< 0.209 >= 0.209

Node 2 (n = 76)

norm

algl

auco

ma

0

0.2

0.4

0.6

0.8

1

mhcg

3

>= 0.1695 < 0.1695

Node 4 (n = 7)

norm

algl

auco

ma

0

0.2

0.4

0.6

0.8

1

vars

5

< 0.064 >= 0.064

Node 6 (n = 19)

norm

algl

auco

ma

0

0.2

0.4

0.6

0.8

1

tms

7

>= −0.0655 < −0.0655

eas

8

< 0.4455 >= 0.4455

Node 9 (n = 7)

norm

algl

auco

ma

0

0.2

0.4

0.6

0.8

1Node 10 (n = 25)

norm

algl

auco

ma

0

0.2

0.4

0.6

0.8

1Node 11 (n = 62)

norm

algl

auco

ma

0

0.2

0.4

0.6

0.8

1

2009-07-09 13

Page 15: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Example: Fast Predictions

R> g_J48 <- J48(Class ~ ., data = GlaucomaM)R> g_party <- as.party(g_J48)R> nd <- GlaucomaM[rep(1:nrow(GlaucomaM), 100),]R> system.time(p1 <- predict(g_J48, newdata = nd))

user system elapsed1.468 0.156 1.623

R> system.time(p2 <- predict(g_party, newdata = nd))user system elapsed

0.096 0.000 0.097R> table(p1, p2)

p2p1 glaucoma normal

glaucoma 9900 0normal 0 9700

2009-07-09 14

Page 16: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Example: CHAID

partykit makes it ‘easy’ to implement new or old tree algorithms.

My students implemented CHAID as a software project last winter.

I wanted them to focus on the algorithm, not on technical details.

They only had to implement variable selection, splitting and stopping

rules, partykit takes care of the rest.

The resulting package CHAID is available from R-Forge and consists of

only 362 lines of R code and was validated against SPSS 15.

2009-07-09 15

Page 17: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Example: CHAID

Fit a CHAID tree to BreastCancer data from package mlbench:

R> data("BreastCancer", package = "mlbench")R> library("CHAID")R> b_chaid <- chaid(Class ~ Cl.thickness + Cell.size + Cell.shape ++ Marg.adhesion + Epith.c.size + Bare.nuclei ++ Bl.cromatin + Normal.nucleoli + Mitoses,+ data = BreastCancer)

R> plot(b_chaid)

2009-07-09 16

Page 18: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Example: CHAID

Cell.size

1

1 2 34 5, 6, 7, 8, 9, 10

Normal.nucleoli

2

1, 2, 3, 5, 6, 8, 94, 7, 10

Bare.nuclei

3

1, 2, 3, 4, 6, 7, 8, 9, 105

Node 4 (n = 362)

mal

igna

ntbe

nign

0

0.2

0.4

0.6

0.8

1Node 5 (n = 8)

mal

igna

ntbe

nign

0

0.2

0.4

0.6

0.8

1Node 6 (n = 3)

mal

igna

ntbe

nign

0

0.2

0.4

0.6

0.8

1

Cell.shape

7

1, 23, 4, 5, 6, 7, 8, 9, 10

Node 8 (n = 32)

mal

igna

ntbe

nign

0

0.2

0.4

0.6

0.8

1Node 9 (n = 13)

mal

igna

ntbe

nign

0

0.2

0.4

0.6

0.8

1

Cell.shape

10

1, 2, 34, 5, 6, 7, 8, 9, 10

Bare.nuclei

11

1, 2, 5, 6, 7, 8, 93, 4, 10

Node 12 (n = 21)

mal

igna

ntbe

nign

0

0.2

0.4

0.6

0.8

1Node 13 (n = 12)

mal

igna

ntbe

nign

0

0.2

0.4

0.6

0.8

1Node 14 (n = 19)

mal

igna

ntbe

nign

0

0.2

0.4

0.6

0.8

1Node 15 (n = 38)

mal

igna

ntbe

nign

0

0.2

0.4

0.6

0.8

1

Marg.adhesion

16

12, 3, 4, 5, 6, 7, 8, 9, 10

Node 17 (n = 11)

mal

igna

ntbe

nign

0

0.2

0.4

0.6

0.8

1

Normal.nucleoli

18

1, 3, 4, 5, 6, 7, 8, 9, 102

Node 19 (n = 159)

mal

igna

ntbe

nign

0

0.2

0.4

0.6

0.8

1Node 20 (n = 5)

mal

igna

ntbe

nign

0

0.2

0.4

0.6

0.8

1

2009-07-09 17

Page 19: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Example: CHAID

Benchmark experiment misclassification error CHAID vs. CTREE(Bachelor thesis by Stefanie Thiemichen)

CHAID / CTREE

Vehicle

Soybean

Sonar

Ionosphere

Glaucoma

Glass

Diabetes

BreastCancer

2 4 6

●●

● ●●●● ●● ●

● ●●● ●●

● ●● ●●● ●● ●●

● ● ●● ●● ●● ●●●●● ● ● ●● ●●●

●● ●●

●●●●● ●●

● ●● ● ●●● ●●● ● ●● ●● ● ●●● ● ●●●● ●● 1.168 (1.138, 1.199)

1.099 (1.09, 1.109)

1.000 (0.986, 1.014) *

1.107 (1.083, 1.131)

1.121 (1.096, 1.146)

1.080 (1.062, 1.098) *

0.504 (0.49, 0.519)

1.105 (1.095, 1.114)

2009-07-09 18

Page 20: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Example: PMML & SPSS

Toy PMML example:

<PMMLversion="3.1"xmlns="http://www.dmg.org/PMML-3_1"xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xsi:schemaLocation="http://www.dmg.org/PMML-3_1 pmml-3-1.xsd"><Header

copyright="Copyright (c) 1989-2006 by SPSS Inc. All rights reserved."><Application

name="SPSS for Microsoft Windows"version="15.0.1"/>

<Timestamp/></Header><DataDictionary

numberOfFields="4"><DataField

...

R> pm <- pmmlTreeModel("pmml.xml")R> plot(pm)

2009-07-09 19

Page 21: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Example: PMML & SPSS

X1

1

2 1 3

X2

2

1 2, 3

terminalnode

3terminal

node

4

terminalnode

5terminal

node

6

2009-07-09 20

Page 22: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

Example: PMML & SPSS

R> nd <- data.frame(X1 = gl(3, 10), X2 = sample(gl(3, 10)))R> predict(pm, newdata = nd)1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 261 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

27 28 29 301 1 1 1

Levels: 1 2

2009-07-09 21

Page 23: partykit: A Toolbox for Recursive Partytioning · 2010. 5. 18. · 14) vars< 0.064 19 9 glaucoma (0.52631579 0.47368421) * 15) vars>=0.064 94 11 normal (0.11702128 0.88297872)

What’s next?

Once a stable version of partykit is available from CRAN, we hope that useRs willstart

• adding new tree algorithms,

• adding new ensemble methods,

• adding coercing methods from/to other packages,

• adding new plot functionality, and generally

• will use our kit to party harder.

2009-07-09 22