
1/23

Learning from positive examples

Main ideas and the particular case of CProgol4.2

Daniel Fredouille, CIG talk, 11/2005

2/23

What is it all about?

• Symbolic machine learning.
• Learning from positive examples instead of positive and negative examples.
• The talk contains two parts:

1. General ideas and tactics to learn from positives.

2. How the particular ILP system CProgol4.2 of S. Muggleton (1997) deals with positive-only learning.

3/23

Disclaimer

• This talk is not extracted from a survey or any particular article: it is more a patchwork of my experiences in the domain and how I interpret them.

• Feel free to criticize: I would like feedback on these ideas since I have never shared them before.

• I would really appreciate comments on the slides marked with the ? sign.

4/23

Definitions

[Figure: the concept space and the instance space. The target concept C and the inferred concept C' live in the concept space, which is ordered by generality; positive and negative examples of C live in the instance space.]

• "Is more general than" / "is less specific than": the concept space is usually partially ordered by this relation.
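As a minimal illustration (not from the slides; the finite instance space and the concepts below are invented), one can identify a concept with the set of instances it covers, so that "more general than" is simply coverage inclusion, and the ordering is only partial:

```python
# Toy sketch (invented example): over a finite instance space, identify each
# concept with the set of instances it covers. "More general than" is then
# coverage inclusion, which gives a partial (not total) order.

instances = range(10)

def covered(predicate):
    return {x for x in instances if predicate(x)}

even       = covered(lambda x: x % 2 == 0)              # {0, 2, 4, 6, 8}
small_even = covered(lambda x: x % 2 == 0 and x < 5)    # {0, 2, 4}
multiple_3 = covered(lambda x: x % 3 == 0)              # {0, 3, 6, 9}

def more_general(c1, c2):
    """c1 is more general than (or equal to) c2: it covers everything c2 covers."""
    return c1 >= c2

print(more_general(even, small_even))   # True: 'even' generalises 'small even'
print(more_general(even, multiple_3))   # False
print(more_general(multiple_3, even))   # False: incomparable, hence a partial order
```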

5/23

Positive and Negative Learning

Possibility 1: Discrimination of classes
• Characterise the difference between the positive and negative examples.
• No model of the positive concept!

?

6/23

Positive and Negative Learning

Possibility 2: Characterisation of a class
• Use negative examples to prevent over-generalisation.
• Needs negative examples "close" to the concept border.

?

7/23

Positive Only Learning

Aim: Characterisation of a class

Choice ?

8/23

Positive Only Learning

• Two strategies:
1. Bias in the search space: choosing a space with a (very) strong structure.
2. Bias in the evaluation function: choose a concept making a compromise between:
– Generality/specificity of the concept
– Coverage of the positives by the concept
– Complexity of the hypothesis representing the concept

?

9/23

Search space bias approach

• Main idea: consider strongly organised concept spaces.

• Possible inference algorithm:
– Select the least general concept covering all examples.
– The constraints on the search space ensure there is only one such concept.

Trivial example (generally not useful): a "tree organisation" of the concept space (a small sketch follows).
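A minimal Python sketch of this idea, using an invented toy taxonomy: when the concept space is organised as a tree, the unique least general concept covering a set of positive examples is their lowest common ancestor.

```python
# Hypothetical sketch: positive-only inference in a tree-ordered concept space.
# Each concept has a single parent (a more general concept). The least general
# concept covering all positives is the lowest common ancestor of the examples'
# concepts, and the tree structure guarantees it is unique.

parents = {                      # toy taxonomy, invented for illustration
    "animal": None,
    "bird": "animal", "mammal": "animal",
    "penguin": "bird", "eagle": "bird",
    "dog": "mammal", "cat": "mammal",
}

def ancestors(concept):
    """Chain of concepts from `concept` up to the root (most general)."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = parents[concept]
    return chain

def least_general_cover(examples):
    """Unique least general concept covering every positive example."""
    common = set(ancestors(examples[0]))
    for e in examples[1:]:
        common &= set(ancestors(e))
    # among the shared ancestors, the most specific one is the deepest
    return min(common, key=lambda c: -len(ancestors(c)))

print(least_general_cover(["penguin", "eagle"]))   # -> 'bird'
print(least_general_cover(["penguin", "cat"]))     # -> 'animal'
```

Negative examples are never needed here: the structure of the space alone prevents over-generalisation beyond the lowest common ancestor.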

10/23

Search space bias approach

• Advantages:
– Strong theoretical convergence results possible.
– Can lead to (very) fast inference algorithms.

• Drawbacks:
– Not available for all concept spaces!
– Theorem: super-finite classes of concepts are not inferable in the limit this way (Gold 67).
Super-finite = contains all concepts covering a finite number of examples, plus at least one concept covering an infinity of them.

11/23

Heuristic Approach

• Scoring makes a compromise between:
1. Specificity of the concept
2. Coverage of the positives by the concept
3. Complexity of the concept

• Implementations:
– Ad-hoc measures of points 1, 2, 3, combined in a formula, e.g.:
Score = Coverage + Specificity – Complexity
– Minimum Message Length ideas (~MDL)

?

12/23

Heuristic Approach: Ad-hoc implementation

• Elements of the score:
– Coverage: counting covered instances
– Specificity: measure of the "proportion" of the instance space covered
– Complexity: the size of the concept representation (e.g., number of rules)

• Advantages:
– Usually easy to implement
– Usually provides parameters to tune the compromise

• Disadvantages:
– No theory
– Bias not always clear
– How to combine coverage/specificity/complexity? (a toy sketch follows)

?
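To make the combination question concrete, here is a small sketch of one such ad-hoc score; every concrete measure, weight and toy concept below is invented for illustration. Coverage counts covered positives, specificity is derived from the fraction of a random instance sample that the hypothesis covers, and complexity is the number of rules.

```python
import math
import random

# Hypothetical ad-hoc score combining the three ingredients of the slides:
#   Score = Coverage + Specificity - Complexity
# Every concrete measure and weight below is invented purely for illustration.

def adhoc_score(covers, n_rules, positives, instance_sample,
                w_cov=1.0, w_spec=1.0, w_cplx=1.0):
    """covers(x) -> bool tells whether the hypothesis covers instance x."""
    coverage = sum(covers(e) for e in positives)              # covered positives
    covered_frac = sum(covers(x) for x in instance_sample) / max(1, len(instance_sample))
    specificity = -math.log(covered_frac + 1e-9)              # rare coverage = specific
    complexity = n_rules                                       # size of the representation
    return w_cov * coverage + w_spec * specificity - w_cplx * complexity

# Toy usage: hypothesis "even numbers not larger than 20" over instances 0..99,
# represented by two rules (hence complexity 2).
def covers(x):
    return x % 2 == 0 and x <= 20

positives = [0, 2, 8, 14, 20]
sample = [random.randrange(100) for _ in range(1000)]
print(adhoc_score(covers, n_rules=2, positives=positives, instance_sample=sample))
```

The weights w_cov, w_spec and w_cplx are exactly the tuning parameters mentioned above: convenient in practice, but with no principled way to choose them.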

13/23

Heuristic Approach: MML implementation

[Figure: two channel diagrams. MML for discrimination: the message sent is the hypothesis, followed by the classes of the examples encoded given the hypothesis. MML for characterisation: the message sent is the hypothesis, followed by the examples and their classes encoded given the hypothesis.]

Gain = number of bits needed to send the message without compression – number of bits needed to send the message with compression (a toy sketch follows).

?
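A toy numeric sketch of the characterisation view, with an entirely invented encoding (far simpler than anything a real MML system would use): instances are integers in [0, N), a hypothesis is an interval, and the gain is positive only when the hypothesis genuinely compresses the positives.

```python
import math

# Minimal MML-for-characterisation sketch (all encoding choices invented):
# instances are integers in [0, N); a hypothesis is an interval [lo, hi].
# Without compression every example costs log2(N) bits; with the hypothesis we
# first pay for the interval itself, then log2(hi - lo + 1) bits per example.

N = 1024

def bits_raw(examples):
    return len(examples) * math.log2(N)

def bits_with_hypothesis(examples, lo, hi):
    assert all(lo <= e <= hi for e in examples), "hypothesis must cover the examples"
    cost_hyp = 2 * math.log2(N)                       # send the two interval bounds
    cost_data = len(examples) * math.log2(hi - lo + 1)
    return cost_hyp + cost_data

def mml_gain(examples, lo, hi):
    # positive gain = the hypothesis compresses the positives
    return bits_raw(examples) - bits_with_hypothesis(examples, lo, hi)

examples = [100, 104, 111, 120, 125, 127]
print(mml_gain(examples, 100, 127))   # tight interval: positive gain
print(mml_gain(examples, 0, 1023))    # maximally general interval: negative gain
```

The over-general hypothesis loses because it does not shorten the description of the examples at all, while still costing bits to transmit: this is how MML penalises over-generalisation without any negative examples.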

14/23

Heuristic Approach: MML implementation

• Advantages:
– Some theoretical justifications in the works of Kolmogorov, Solomonoff, Occam, Bayes and Chaitin.
– Absolute and meaningful score.

• Disadvantages:
– Limit of the theory: the optimal code can NOT be computed!
– Difficult implementation: the choice of the encoding creates the inference biases, which is not very intuitive.

15/23

Positive only learning in ILP with CProgol4.2

16/23

Positive only learning in ILP

• The following is not a survey! It comes from what I have already encountered; I have not looked for further references.

• MML implementations:
– Muggleton [88]
– Srinivasan, Muggleton, Bain [93]
– Stahl [96]

• Other implementations:
– Muggleton, CProgol4.2 [97]
– A heuristic ad-hoc method
– Somehow based on MML, but the implementation details make it quite different.

17/23

CProgol4.2 uses Bayes

[Figure: the hypothesis space H, with distribution D_H over hypotheses h, and the instance space I, with distribution D_I over instances i. A hypothesis h covers a region of I, inducing the conditional distribution D_I|h; the examples E lie in that region.]

• Score: P(h | E) = P(h) * P(E | h) / P(E)
• Approach: fix the distributions and compute P(h), P(E | h) and P(E).

18/23

Assumptions for the distributions

• P(h) = e^(-size(h))
– Large theories are less probable than small ones.
– size(h) = sum, over the rules c_i of h, of the number of literals in the body of c_i.

• P(E | h) = Π_{e ∈ E} D_I|h(e) = Π_{e ∈ E} D_I(e) / D_I(h)
– Assumption that D_I and D_H give D_I|h, i.e. D_I|h(e) = D_I(e) / D_I(h) for instances e covered by h.
– Independence assumption between the examples.
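A small numeric sketch of these two assumptions (the explicit instance distribution, the hypotheses and their sizes are invented): the prior is e^(-size(h)), and the likelihood renormalises D_I over the instances covered by h.

```python
import math

# Toy sketch of the two distributional assumptions (all numbers invented):
# a tiny explicit instance distribution D_I, hypotheses given as the sets of
# instances they cover, P(h) = exp(-size(h)), and
# P(E | h) = prod over e in E of D_I(e) / D_I(h),
# where D_I(h) is the total D_I-weight of the instances h covers.

D_I = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}      # instance distribution

def D_I_of(h):
    """Weight of hypothesis h = sum of D_I over the instances it covers."""
    return sum(D_I[x] for x in h)

def log_prior(size_h):
    return -size_h                                   # ln P(h) = -size(h)

def log_likelihood(examples, h):
    assert all(e in h for e in examples), "h must cover every positive example"
    return sum(math.log(D_I[e] / D_I_of(h)) for e in examples)

def log_posterior_up_to_constant(examples, h, size_h):
    return log_prior(size_h) + log_likelihood(examples, h)

E = ["a", "a", "b"]
print(log_posterior_up_to_constant(E, {"a", "b"}, size_h=2))            # specific h
print(log_posterior_up_to_constant(E, {"a", "b", "c", "d"}, size_h=1))  # general h
```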

19/23

Replacing in Bayes

• P(h | E) = e^(-size(h)) * [ Π_{e ∈ E} D_I(e) / D_I(h) ] / P(E)

• As we only want to compare hypotheses:
P(h | E) = [ e^(-size(h)) / D_I(h)^|E| ] * Cste1

• Taking the log:
ln(P(h | E)) = -size(h) + |E| * ln( 1 / D_I(h) ) + Cste2

• We still have to compute D_I(h) ...

20/23

D_I(h): weight of h in the instance space

• Computing D_I:
– Using a stochastic logic program S trained with the BK (background knowledge) to model D_I (not included in the talk).

• Computing D_I(h):
– Generate R instances from D_I.
– h covers r of them.
– D_I(h) = (r+1) / (R+2)
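Putting the last two slides together, a sketch of the resulting score (the sampler, the hypotheses and their sizes are invented; in CProgol the sampling role is played by the stochastic logic program): estimate D_I(h) with the Laplace-corrected count (r+1)/(R+2), then compare hypotheses with ln P(h | E) ≈ -size(h) + |E| * ln(1/D_I(h)).

```python
import math
import random

# Sketch (all concrete choices invented): estimate D_I(h) by sampling from D_I,
# then score a hypothesis with  ln P(h|E) ~ -size(h) + |E| * ln(1 / D_I(h)).

def sample_instance():
    """Stand-in for sampling from D_I (in CProgol this role is played by a
    stochastic logic program built from the background knowledge)."""
    return random.randrange(100)

def estimate_D_I(covers, R=10_000):
    r = sum(covers(sample_instance()) for _ in range(R))
    return (r + 1) / (R + 2)                        # Laplace-corrected estimate

def log_score(covers, size_h, examples):
    d = estimate_D_I(covers)
    return -size_h + len(examples) * math.log(1.0 / d)

E = [0, 2, 8, 14, 20]                               # positive examples
tight = lambda x: x % 2 == 0 and x <= 20            # covers ~10% of instances
loose = lambda x: x % 2 == 0                        # covers ~50% of instances
print("tight:", log_score(tight, size_h=2, examples=E))
print("loose:", log_score(loose, size_h=1, examples=E))
```

The more specific hypothesis that still covers all positives gets the better score: the |E| * ln(1/D_I(h)) term rewards specificity, while -size(h) keeps the theory from growing arbitrarily.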

21/23

Formula for a whole theory covering E

• ln(P(h | E)) = -size(h) - |E| * ln( (r+1)/(R+2) ) + C2
(complexity: size(h); coverage: |E|; specificity: ln((r+1)/(R+2)))

Estimation of the final theory score from a partially inferred theory h' covering p of the |E| positives:
• ln(P(h' | E)) = -(|E|/p) * size(h') - |E| * ln( (|E|/p) * (r'+1)/(R+2) ) + C3

22/23

Final evaluation

• Suppressing |E| and the constant:
– f(h') = size(h')/p + ln(p) - ln( |E| * (r'+1)/(R+2) )

• Possible boost of the positives by a factor k:
– f(h') = size(h')/(k*p) + ln(k*p) - ln( |E| * (r'+1)/(R+2) )

• This formula is not written anywhere (the one above is my best guess!).
• The papers are hard to understand.
• But it seems to work ...

(the terms again play the roles of complexity, specificity and coverage)

23/23

Conclusion

• Learning from positives only is a real challenge, and methods for learning from positives and negatives can hardly be adapted to it.

• Some nice theoretical frameworks exist.

• When it comes to implementing heuristic frameworks:
– The theory is often lost in the approximations and implementation choices.
– Useful systems can be created, but tuning and understanding the biases have to be treated as very important stages of inference.
