Page 1: View Learning Extended: Learning New Tables

© Jesse Davis 2006

Jesse Davis (1), Elizabeth Burnside (1), David Page (1), Vítor Santos Costa (2)

(1) University of Wisconsin-Madison, USA
(2) Federal University of Rio de Janeiro, Brazil

Page 2: View Learning Framework [Davis et al. IJCAI05]

Abnormality  Patient  Date  Calcification Fine/Linear  …  Mass Size  Loc  Benign/Malignant
1            P1       5/02  No                            0.03       RU4  B
2            P1       5/04  Yes                           0.05       RU4  M
3            P1       5/04  No                            0.04       LL3  B
4            P2       6/00  No                            0.02       RL2  B
…            …        …     …                             …          …    …

Learn fields predictive of target concept

Page 3: Extend Schema

Abnormality  Patient  Date  Calcification Fine/Linear  …  Mass Size  Increase in size  Loc  Benign/Malignant
1            P1       5/02  No                            0.03       No                RU4  B
2            P1       5/04  Yes                           0.05       Yes               RU4  M
3            P1       5/04  No                            0.04       No                LL3  B
4            P2       6/00  No                            0.02       No                RL2  B
…            …        …     …                             …          …                 …    …

New field added to the schema: Increase in size

Page 4: Integrated Search for New Fields
[Landwehr et al. AAAI 2005, Davis et al. ECML 2005]

Old approach:
  Step 1: use ILP to learn new fields
  Step 2: learn statistical model

Score As You Use (SAYU):
  Combine steps 1 and 2
  Score new field by how much it helps statistical model

Parallel development: nFOIL
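
The SAYU idea on this slide can be sketched as a greedy loop: each candidate field is added to the statistical model tentatively and kept only if the model's score improves. This is a minimal sketch under stated assumptions, not the authors' implementation. The naive Bayes model, the accuracy score on a tuning set, the toy data, and every name (`train_nb`, `score`, `sayu_select`, `size_up`, `noise`) are hypothetical; the cited papers describe the actual models and scoring function.

```python
import math

def train_nb(rows, labels, feats):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing."""
    model = {"prior": {}, "cond": {}}
    for c in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == c]
        model["prior"][c] = (len(idx) + 1) / (len(labels) + 2)
        for f in feats:
            on = sum(1 for i in idx if f(rows[i]))
            model["cond"][(f, c)] = (on + 1) / (len(idx) + 2)
    return model

def predict(model, row, feats):
    """Return the more probable class (0 or 1) for one row."""
    best, best_lp = None, -math.inf
    for c in (0, 1):
        lp = math.log(model["prior"][c])
        for f in feats:
            p = model["cond"][(f, c)]
            lp += math.log(p if f(row) else 1 - p)
        if lp > best_lp:
            best, best_lp = c, lp
    return best

def score(feats, train, tune):
    """Assumed score: accuracy of the model on a held-out tuning set."""
    model = train_nb(train[0], train[1], feats)
    hits = sum(predict(model, r, feats) == y for r, y in zip(*tune))
    return hits / len(tune[1])

def sayu_select(candidates, train, tune):
    """Keep a candidate field only if it improves the model's score."""
    kept, best = [], score([], train, tune)
    for cand in candidates:
        s = score(kept + [cand], train, tune)
        if s > best:
            kept.append(cand)
            best = s
    return kept, best

# Hypothetical candidate fields over toy rows.
def size_up(row):
    return row["size_up"]

def noise(row):
    return row["noise"]

def make(rows):
    data = [{"size_up": s, "noise": n} for s, n, _ in rows]
    return data, [y for _, _, y in rows]

train = make([(True, True, 1), (True, False, 1), (False, True, 0),
              (False, False, 0), (False, True, 0), (False, False, 0)])
tune = make([(True, False, 1), (False, True, 0),
             (True, True, 1), (False, False, 0)])

kept, best = sayu_select([noise, size_up], train, tune)
print([f.__name__ for f in kept], best)
```

In this toy run the uninformative `noise` field is rejected because it leaves the tuning score unchanged, while `size_up` is kept.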

Page 5: Relevant Intermediate Concepts

Advisedby(Student,Professor)

ta_for(Student,Professor)
ta(Student,Class)    teach(Professor,Class)

coauthor(Person,Person)
paper(Person,Ref)

Goal: Automatically generate and incorporate intermediate concepts
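
One plausible reading of the predicates above is that `ta_for(Student,Professor)` is an intermediate concept obtained by joining `ta(Student,Class)` with `teach(Professor,Class)` on the class. The slide lists only the predicate names, so the definition below is an assumption, and the facts are invented purely for illustration.

```python
# Hypothetical base facts; names and values are illustrative only.
ta = {("ann", "cs101"), ("bob", "cs202")}
teach = {("prof_x", "cs101"), ("prof_y", "cs202"), ("prof_y", "cs303")}

def ta_for(ta_facts, teach_facts):
    """Assumed definition: ta_for(S, P) :- ta(S, C), teach(P, C)."""
    return {(s, p)
            for (s, c1) in ta_facts
            for (p, c2) in teach_facts
            if c1 == c2}

pairs = ta_for(ta, teach)
print(sorted(pairs))
```

A learner that invents such a relation can then use it as a single literal when searching for clauses about `Advisedby`.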

Page 6: Limitations to Our Old Work

Previously View Learning adds new fields

More expressive to learn predicates that
  are not approximations to the target concept
  represent new tables

Solution: Extend SAYU

Page 7: VISTA Algorithm

VISTA: View Invention through Scoring Tables with Aggregation

Page 8: Algorithm Illustration

Distinguished types: [id, patient, visit]

[Figure: candidate heads p1(id,id) (p1/2) and p2(patient) (p2/1) are built from the distinguished types. Clause bodies drawn from the Background Knowledge, such as :- sameStudy(Id1,Id2), :- historyOfBC(Patient) and :- hadBiopsy(_,Patient), give Rule 1, Rule 2, Rule 3, …, Rule 14, …, Rule N, which feed a Class Value node. Scores shown: 0.0, 0.12, 0.10, 0.15, 0.35.]

Page 9: Algorithm Details

Learn predicates with
  Target predicate arity
  Target predicate arity + 1

Moded language

Breadth first search over clause bodies
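
A breadth-first search over clause bodies visits every body of length 1 before any body of length 2, and so on. The sketch below illustrates only that visiting order. It is a simplification under stated assumptions: literals are plain strings, mode declarations and variable typing are omitted, and `bfs_bodies` and `max_len` are hypothetical names.

```python
from collections import deque

def bfs_bodies(literals, max_len):
    """Yield clause bodies (tuples of literals) in breadth-first order."""
    queue = deque([()])  # start from the empty body
    while queue:
        body = queue.popleft()
        if body:
            yield body
        if len(body) == max_len:
            continue
        # Extend only with literals later in the list, so each
        # unordered conjunction is generated exactly once.
        start = literals.index(body[-1]) + 1 if body else 0
        for lit in literals[start:]:
            queue.append(body + (lit,))

literals = ["historyOfBC(P)", "hadBiopsy(_, P)", "sameStudy(A, B)"]
bodies = list(bfs_bodies(literals, max_len=2))
for body in bodies:
    print("p(...) :- " + ", ".join(body) + ".")
```

In a scoring loop, each yielded body would be paired with a candidate head and scored before the search moves on to longer bodies.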

Page 10: Count Aggregation

density_increase(A,B) :- density(A,D1), prior_mammogram_same_loc(A,B), density(B,D2), D1 > D2.

Id  Patient  Date  …  Mass Density  Loc  Benign/Malignant
1   P1       5/02     low           RU4  B
2   P1       5/04     high          RU4  M
3   P1       5/04     none          LL3  B
4   P2       6/00     none          RL2  B
5   P2       6/02     low           RL2  B
6   P2       9/03     high          RL2  M
…   …        …        …             …    …

Id  Count
1   0
2   1
3   0
4   0
5   1
6   2
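
The Count column can be reproduced by counting, for each abnormality A, how many B satisfy density_increase(A,B). The sketch below does this for the six rows shown. Several details are assumptions not stated on the slide: densities are ordered none < low < high, dates are read as month/two-digit year, and prior_mammogram_same_loc(A,B) is taken to mean that B belongs to the same patient, at the same location, at an earlier date.

```python
# Rows copied from the slide's table: id -> (patient, date, density, loc).
rows = {
    1: ("P1", "5/02", "low", "RU4"),
    2: ("P1", "5/04", "high", "RU4"),
    3: ("P1", "5/04", "none", "LL3"),
    4: ("P2", "6/00", "none", "RL2"),
    5: ("P2", "6/02", "low", "RL2"),
    6: ("P2", "9/03", "high", "RL2"),
}
RANK = {"none": 0, "low": 1, "high": 2}  # assumed ordering of densities

def when(date):
    """Parse 'M/YY' into a sortable (year, month) pair (assumed format)."""
    month, year = date.split("/")
    return int(year), int(month)

def prior_same_loc(a, b):
    """Assumed meaning: b is an earlier finding, same patient and location."""
    pat_a, date_a, _, loc_a = rows[a]
    pat_b, date_b, _, loc_b = rows[b]
    return pat_a == pat_b and loc_a == loc_b and when(date_b) < when(date_a)

def density_increase(a, b):
    """density(A,D1), prior_mammogram_same_loc(A,B), density(B,D2), D1 > D2."""
    return prior_same_loc(a, b) and RANK[rows[a][2]] > RANK[rows[b][2]]

# Count aggregation: number of satisfying B for each A.
counts = {a: sum(density_increase(a, b) for b in rows) for a in rows}
print(counts)
```

Under these assumptions the computed counts agree with the Id/Count table on the slide.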

Page 11: Linkage

Distinguished variable may not correspond to example key

p1(Patient) :- historyOfBC(Patient), hadBiopsy(_, Patient).

Above rule adds a field to Patient table

Q: How do we score p1?

Page 12: Linkage Example

p1(Patient) :- historyOfBC(Patient), hadBiopsy(_, Patient).

Patient  Family History
P1       No
P2       Yes
P3       No
P4       No
P5       Yes
P6       No

Id  Patient  Date  …  Mass Density  Loc  Benign/Malignant
1   P1       5/02     low           RU4  B
2   P1       5/04     high          RU4  M
3   P1       5/04     none          LL3  B
4   P2       6/00     none          RL2  B
5   P2       6/02     low           RL2  B
6   P2       9/03     high          RL2  M
…   …        …        …             …    …

Page 13: Linkage Example

p1(Patient) :- historyOfBC(Patient), hadBiopsy(_, Patient).

Patient  Family History  p1
P1       No              No
P2       Yes             Yes
P3       No              No
P4       No              No
P5       Yes             No
P6       No              No
…        …               …

Id  Patient  Date  …  Mass Density  Loc  Benign/Malignant
1   P1       5/02     low           RU4  B
2   P1       5/04     high          RU4  M
3   P1       5/04     none          LL3  B
4   P2       6/00     none          RL2  B
5   P2       6/02     low           RL2  B
6   P2       9/03     high          RL2  M
…   …        …        …             …    …

Page 14: Linkage Example

p1(Patient) :- historyOfBC(Patient), hadBiopsy(_, Patient).

Id  Patient  Date  …  Mass Density  Loc  Benign/Malignant  p1
1   P1       5/02     low           RU4  B                 No
2   P1       5/04     high          RU4  M                 No
3   P1       5/04     none          LL3  B                 No
4   P2       6/00     none          RL2  B                 Yes
5   P2       6/02     low           RL2  B                 Yes
6   P2       9/03     high          RL2  M                 Yes
…   …        …        …             …    …                 …
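
Linkage, as illustrated on these slides, can be read as a join: p1 is computed once per patient, and its value is then copied to every abnormality row that shares that patient key, which lets the new field be scored against the abnormality-level target. The sketch below is an illustration under assumptions. The `had_biopsy` set is hypothetical, chosen to agree with the p1 column of the Patient table (Yes only for P2), and rows 5 and 6 are assigned to P2 as in the abnormality table on the earlier Linkage Example slides.

```python
# Patient-level facts from the Patient table (Family History column).
history_of_bc = {"P1": False, "P2": True, "P3": False,
                 "P4": False, "P5": True, "P6": False}
# Hypothetical: patients with at least one biopsy, chosen so that
# p1 comes out Yes only for P2, as in the slide's Patient table.
had_biopsy = {"P2"}

def p1(patient):
    """p1(Patient) :- historyOfBC(Patient), hadBiopsy(_, Patient)."""
    return history_of_bc[patient] and patient in had_biopsy

# (abnormality id, patient key) pairs from the abnormality table.
abnormalities = [(1, "P1"), (2, "P1"), (3, "P1"),
                 (4, "P2"), (5, "P2"), (6, "P2")]

# Linkage: copy the patient-level value onto each abnormality row.
linked = [(abn, pat, p1(pat)) for abn, pat in abnormalities]
for abn, pat, flag in linked:
    print(abn, pat, "Yes" if flag else "No")
```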

Page 15: New Features in VISTA

User declares set of distinguished types that appear in clause head

Allow reuse of learned predicates

Count aggregation

Linkage permits learning predicates with:
  Higher arity than target (new tables)
  Different types than target

Page 16: Experiment

Q: Does VISTA or SAYU perform better?

Page 17: Datasets

Cora (5 x 2 fold cross validation) [McCallum et al. 00, Kok & Domingos 05]

UW-CSE (5 fold cross validation) [Richardson & Domingos 04]

Mammography (10 fold cross validation) [Davis et al. 05]

Page 18: Area Under Precision-Recall Curve

Generate whole PR curve

Area under PR curve for Recall > 0.5

[Figure: precision-recall curve; x-axis Recall, y-axis Precision.]
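
The restricted area can be sketched as follows: rank the examples by score, compute a (recall, precision) point after each one, and integrate only over the points at or beyond the recall threshold. This is a rough illustration, not necessarily how the authors computed their numbers. It assumes distinct scores, treats the threshold as inclusive, and uses plain trapezoids between points, which is only an approximation in precision-recall space. The names `pr_points` and `area_above_recall` and the toy scores are hypothetical.

```python
def pr_points(scores, labels):
    """Return (recall, precision) after each prediction, best score first."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    total_pos = sum(labels)
    tp, points = 0, []
    for k, (_, y) in enumerate(ranked, start=1):
        tp += y
        points.append((tp / total_pos, tp / k))
    return points

def area_above_recall(points, min_recall=0.5):
    """Trapezoidal area under the PR points with recall >= min_recall."""
    kept = [p for p in points if p[0] >= min_recall]
    area = 0.0
    for (r0, p0), (r1, p1) in zip(kept, kept[1:]):
        area += (r1 - r0) * (p0 + p1) / 2
    return area

# Toy data: 1 = positive example, 0 = negative example.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
points = pr_points(scores, labels)
area = area_above_recall(points)
print(points, area)
```

Restricting the area to the high-recall region focuses the comparison on operating points where most positives are found.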

Page 19: Cora

[Figure: precision-recall curves for VISTA and SAYU on Cora; x-axis Recall (0 to 1), y-axis Precision (0 to 1).]

Page 20: UW-CSE

[Figure: precision-recall curves for VISTA and SAYU on UW-CSE; x-axis Recall (0 to 1), y-axis Precision (0 to 1).]

Page 21: Mammography

[Figure: precision-recall curves for VISTA and SAYU on Mammography; x-axis Recall (0 to 1), y-axis Precision (0 to 1).]

Page 22: UW-CSE

[Figure: bar chart of Average Area Under PR Curve (axis from 0.00 to 0.50) for VISTA, SAYU and MLN.]

MLN data from Singla & Domingos AAAI 2005

Page 23: Related Topic: Predicate Invention

Cigol: Muggleton & Buntine (1988)

CHILLIN: Zelle & Mooney (1994)

FOIL-PILFS: Craven & Slattery (2001)

SLR: Popescul & Ungar (2004)

Page 24: Related Work: Feature Construction

Pompe & Kononenko, ILP’95

Srinivasan & King, ILP’97

Perlich & Provost, KDD’03

Knobbe, de Haas & Siebes, PKDD’01

Page 25: Future Work

Further investigate benefits of VISTA
  Linkage as jumping deeper into search space
  Reuse of predicates

Extensions to VISTA
  Negation
  Disjunction
  Stochastic search

Comparisons to other SRL systems

Page 26: Conclusions

VISTA adds capabilities
  Add fields to tables other than target relation
  Learn new relations

VISTA empirically
  Better on Cora (p-value < 0.001)
  Almost better on UW-CSE (p-value < 0.06)
  No worse on Mammography (p-value < 0.94)

Page 27: Acknowledgements

Mark Craven, Jude Shavlik, Inês Dutra, Mark Goadrich, Irene Ong, Trevor Walker, Raghu Ramakrishnan, Rich Maclin, Lisa Torrey, Jan Struyf, Allison Holloway

This work was partially supported by Air Force grant F30602-01-2-0571

Page 28: Thank You!

Questions?