

[Figure, panels (a) and (b): attribute lists (value, rid, cid) for attributes A and B, with count matrices (CM) over the L and R children and gini_split values at candidate split points. Labels shown in the tree: A < 35, B=H, B=C, CLASS 0, CLASS 1. The gini_split values that survive extraction are 0.49, 0.43, 0.34, 0.21, 0.40, 0.49, 0.43, 0.49 and 0.34; the remaining cell values are not recoverable.]

Legend: rid : record id; cid : class label; SP : split point; CM : count matrix; L : left child (child 0); R : right child (child 1).

[Figure: attribute lists (value, rid, cid) for attributes B1 and B2, distributed over processors P0 and P1 and partitioned at split points (SP). Split conditions shown: B1 < 12.5, B1 < 27.5, B2 < 51.0. The individual list entries are not reliably recoverable.]

Legend: rid : record id; cid : class label; SP : split point.

[Figure, panels (a) to (d): Age and Salary attribute lists (value, rid, cid) distributed over processors P0, P1 and P2, with a split point (SP) marked, and the distributed Node Table (rid, kid) holding an L or R entry for each of records 0 to 8. The panels show the communication (comm) steps among P0, P1 and P2, labeled with (processor, child) pairs such as (0,L), (1,L), (2,L), (0,R) and (1,R). Buffers labeled in the figure: hash buffers, enquiry buffers, intermediate index buffers, intermediate value buffers, result buffers. Operations labeled: update, retrieve. The individual list and buffer entries are not reliably recoverable.]

Legend: rid : global record id; cid : class label; SP : split point; Pi : Processor i; kid : child number; L : left child (child 0); R : right child (child 1).

[Figure: attribute lists (value, rid, cid) for attributes B1 and B2, distributed over processors P0 and P1 and partitioned at split points (SP). Split conditions shown: B2 < 31.5, B1 < 12.5, B1 < 27.5. The individual list entries are not reliably recoverable.]

[Figure: decision tree with split conditions Salary < 112481 and Age < 21 and leaves labeled CLASS 0 and CLASS 1, shown together with the Age and Salary attribute lists held at each node on processors P0, P1 and P2. The individual list entries are not reliably recoverable.]

[Figure: Age and Salary attribute lists (value, rid, cid) distributed over processors P0, P1 and P2, with the split point (SP) marked; local and global Count Matrices over the L and R children for classes 0 and 1; and the Node Table (rid, kid) for records 0 to 8, shown in two states with L and R entries. The individual entries are not reliably recoverable.]

Legend: rid : record id; cid : class label; SP : split point; Pi : Processor i; kid : child number; L : left child (child 0); R : right child (child 1).

[Figure, panels (a), (b) and (c), listed here in order of appearance in the extracted text:

Plot 1: Parallel Runtime (Seconds), axis 0 to 120, against Number of Processors, axis 0 to 140. Curves: 0.2m, 0.4m, 0.8m, 1.6m, 3.2m, 6.4m.

Plot 2: Parallel Runtime (Seconds), axis 0 to 100, against Number of Processors, axis 0 to 140. Curves: 100 k/proc, 50 k/proc, 25 k/proc, 12.5 k/proc, 6.25 k/proc, 3.125 k/proc.

Plot 3: Memory Requirements per processor (in million bytes), axis 0 to 20, against Number of Processors, axis 0 to 140. Curves: 0.2m, 0.4m, 0.8m, 1.6m, 3.2m, 6.4m.]

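[Figure: grid with both axes numbered 0 to 8, divided among processors P0, P1 and P2, with nine cells marked X. The legend names X, Salary, Age and Node Table Entries; its other symbols were lost in extraction.]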