
Instance Based Learning


Page 1: Instance Based Learning

Instance Based Learning

Ata Kaban, The University of Birmingham

Page 2: Instance Based Learning

Today we learn:
  • K-Nearest Neighbours
  • Locally weighted regression
  • Case-based reasoning
  • Lazy and eager learning

Page 3: Instance Based Learning

Instance-based learning

• One way of solving tasks of approximating discrete- or real-valued target functions
• Have training examples (xn, f(xn)), n = 1..N
• Key idea:
  – just store the training examples
  – when a test example is given, find the closest matches

Page 4: Instance Based Learning

1-Nearest neighbour:
Given a query instance xq,
  • first locate the nearest training example xn
  • then f(xq) := f(xn)

K-Nearest neighbour:
Given a query instance xq,
  • first locate the k nearest training examples
  • if the target function is discrete-valued, take a majority vote among the k nearest neighbours; if it is real-valued, take the mean of the f values of the k nearest neighbours:

$$\hat{f}(x_q) := \frac{1}{k} \sum_{i=1}^{k} f(x_i)$$
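A minimal sketch of this prediction rule in Python (NumPy-based; the function name and array layout are illustrative, not from the slides):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_q, k=3, discrete=True):
    """k-NN prediction: majority vote for a discrete target,
    mean of the neighbours' f values for a real-valued one."""
    # Euclidean distance from the query x_q to every training point
    dists = np.sqrt(((X_train - x_q) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]        # indices of the k closest examples
    if discrete:
        return Counter(y_train[i] for i in nearest).most_common(1)[0][0]
    return float(np.mean([y_train[i] for i in nearest]))
```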

Page 5: Instance Based Learning

The distance between examples

We need a measure of distance in order to know which examples are the neighbours.

Assume that we have T attributes for the learning problem. Then one example point x has elements xt, t = 1,…,T.

The distance between two points xi and xj is often defined as the Euclidean distance:

$$d(x_i, x_j) = \sqrt{\sum_{t=1}^{T} \left( x_{i,t} - x_{j,t} \right)^2}$$
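For concreteness, the same distance computed with NumPy on the first two rows of the training table a few slides below (a sketch, not from the slides):

```python
import numpy as np

x_i = np.array([6, 1, 10, 4], dtype=float)  # lines, line types, rectangles, colours
x_j = np.array([4, 2, 8, 5], dtype=float)

d = np.sqrt(((x_i - x_j) ** 2).sum())  # sum over the T attributes, then square root
print(d)  # 3.1622... = sqrt(10)
```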

Page 6: Instance Based Learning

Voronoi Diagram

[Figure: the 1-NN decision surface partitions the input space into Voronoi cells, one per training example; each cell contains the points closer to that example than to any other.]

Page 7: Instance Based Learning

Characteristics of Instance-based Learning

An instance-based learner is a lazy learner: it does all the work when the test example is presented. This is opposed to so-called eager learners, which build a parameterised compact model of the target.

It produces a local approximation to the target function (a different one for each test instance).

Page 8: Instance Based Learning

When to consider Nearest Neighbour algorithms?

• Instances map to points in $\mathbb{R}^n$
• Not more than, say, 20 attributes per instance
• Lots of training data

Advantages:
  – Training is very fast
  – Can learn complex target functions
  – Don't lose information

Disadvantages:
  – ? (will see them shortly…)

Page 9: Instance Based Learning

[Figure: example instances labelled "one" to "eight", with an unlabelled query instance marked "?".]

Page 10: Instance Based Learning

Training data

Number  Lines  Line types  Rectangles  Colours  Mondrian?
1       6      1           10          4        No
2       4      2           8           5        No
3       5      2           7           4        Yes
4       5      1           8           4        Yes
5       5      1           10          5        No
6       6      1           8           6        Yes
7       7      1           14          5        No

Test instance

Number  Lines  Line types  Rectangles  Colours  Mondrian?
8       7      2           9           4        ?

Page 11: Instance Based Learning

Keep data in normalised form

One way to normalise the data xt to x′t is:

$$x'_t = \frac{x_t - \bar{x}_t}{\sigma_t}$$

where $\bar{x}_t$ is the mean of the t-th attribute and $\sigma_t$ is the standard deviation of the t-th attribute.
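A small sketch of this normalisation; it reproduces the table on the next slide. Note the slides use the population standard deviation, and the test instance is normalised with the training-set statistics:

```python
import numpy as np

# The seven training rows: lines, line types, rectangles, colours
X_train = np.array([[6, 1, 10, 4], [4, 2, 8, 5], [5, 2, 7, 4], [5, 1, 8, 4],
                    [5, 1, 10, 5], [6, 1, 8, 6], [7, 1, 14, 5]], dtype=float)
x_test = np.array([7, 2, 9, 4], dtype=float)

mu = X_train.mean(axis=0)          # per-attribute mean
sigma = X_train.std(axis=0)        # per-attribute (population) standard deviation
X_norm = (X_train - mu) / sigma    # normalised training data
x_norm = (x_test - mu) / sigma     # test instance, same statistics
print(x_norm.round(3))             # [ 1.739  1.581 -0.131 -1.021]
```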

Page 12: Instance Based Learning

Normalised training data

Number  Lines   Line types  Rectangles  Colours  Mondrian?
1        0.632  -0.632       0.327      -1.021   No
2       -1.581   1.581      -0.588       0.408   No
3       -0.474   1.581      -1.046      -1.021   Yes
4       -0.474  -0.632      -0.588      -1.021   Yes
5       -0.474  -0.632       0.327       0.408   No
6        0.632  -0.632      -0.588       1.837   Yes
7        1.739  -0.632       2.157       0.408   No

Test instance

Number  Lines   Line types  Rectangles  Colours  Mondrian?
8        1.739   1.581      -0.131      -1.021   ?

Page 13: Instance Based Learning

Distances of test instance from training data

Example  Distance of test from example  Mondrian?
1        2.517                          No
2        3.644                          No
3        2.395                          Yes
4        3.164                          Yes
5        3.472                          No
6        3.808                          Yes
7        3.490                          No

Classification:
1-NN  Yes
3-NN  Yes
5-NN  No
7-NN  No
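The classifications in the table can be checked mechanically; a short sketch using the distances and labels above:

```python
from collections import Counter

# (distance, Mondrian?) pairs from the table above
neighbours = [(2.517, "No"), (3.644, "No"), (2.395, "Yes"), (3.164, "Yes"),
              (3.472, "No"), (3.808, "Yes"), (3.490, "No")]
neighbours.sort()  # nearest first

for k in (1, 3, 5, 7):
    vote = Counter(label for _, label in neighbours[:k]).most_common(1)[0][0]
    print(f"{k}-NN: {vote}")  # 1-NN: Yes, 3-NN: Yes, 5-NN: No, 7-NN: No
```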

Page 14: Instance Based Learning

What if the target function is real-valued?

The k-nearest neighbour algorithm would simply calculate the mean of the f values of the k nearest neighbours.

Page 15: Instance Based Learning

Variant of kNN: Distance-Weighted kNN

We might want to weight nearer neighbours more heavily.

Then it makes sense to use all training examples instead of just k (Shepard's method):

$$\hat{f}(x_q) := \frac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i}, \qquad \text{where } w_i = \frac{1}{d(x_q, x_i)^2}$$
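A sketch of the distance-weighted rule for a real-valued target, assuming NumPy arrays; with k=None all training examples are used, as in Shepard's method (names are illustrative):

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x_q, k=None):
    """Distance-weighted k-NN with w_i = 1 / d(x_q, x_i)^2."""
    dists = np.sqrt(((X_train - x_q) ** 2).sum(axis=1))
    if np.any(dists == 0):                 # query coincides with a training point
        return y_train[np.argmin(dists)]
    idx = np.argsort(dists)[:k] if k else np.arange(len(dists))
    w = 1.0 / dists[idx] ** 2              # nearer neighbours get larger weights
    return float(np.dot(w, y_train[idx]) / w.sum())
```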

Page 16: Instance Based Learning

Difficulties with k-nearest neighbour algorithms

• Have to calculate the distance of the test case from all training cases
• There may be irrelevant attributes amongst the attributes – the curse of dimensionality

Page 17: Instance Based Learning

A generalisation: Locally Weighted Regression

Some useful terminology:
  – Regression = approximating a real-valued function
  – Residual error = the error in approximating the target function
  – Kernel function = the function of distance that is used to determine the weight of each training example, i.e.

$$w_i := K(d(x_q, x_i))$$

Page 18: Instance Based Learning

(Locally Weighted Regression)

Note that kNN forms a local approximation to f for each query point. Why not form an explicit local approximation for the region surrounding the query point, e.g.:
  – fit a linear function to the k nearest neighbours (or a quadratic, …), e.g.
  – …

$$\text{let } \hat{f}(x) = w_0 + w_1 a_1(x) + \dots + w_n a_n(x)$$
$$\text{minimise } E(x_q) = \frac{1}{2} \sum_{x \in kNN(x_q)} \left( f(x) - \hat{f}(x) \right)^2$$
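A minimal sketch of this idea: fit the linear model to the k nearest neighbours by least squares and evaluate it at the query point. Here the attributes a_1(x), …, a_n(x) are assumed to be just the components of x, and the names are illustrative:

```python
import numpy as np

def lwr_predict(X_train, y_train, x_q, k=5):
    """Fit f_hat(x) = w0 + w1*x[0] + ... + wn*x[n-1] to the k nearest
    neighbours of x_q, then return f_hat(x_q)."""
    dists = np.sqrt(((X_train - x_q) ** 2).sum(axis=1))
    idx = np.argsort(dists)[:k]                       # the k nearest neighbours
    A = np.hstack([np.ones((k, 1)), X_train[idx]])    # leading 1s for the intercept w0
    w, *_ = np.linalg.lstsq(A, y_train[idx], rcond=None)  # minimises the squared error E
    return float(np.concatenate(([1.0], x_q)) @ w)
```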

Page 19: Instance Based Learning

Case-based reasoning (CBR)

• CBR is an advanced form of instance-based learning, applied to more complex instance objects
• Objects may include complex structural descriptions of cases & adaptation rules
• It doesn't use Euclidean distance measures, but can do matching between objects using e.g. semantic nets
• It tries to model human problem-solving:
  – uses past experience (cases) to solve new problems
  – retains solutions to new problems
• CBR is an ongoing area of machine learning research with many applications

Page 20: Instance Based Learning

Applications of CBR

• Design – landscape, building, mechanical, conceptual design of aircraft sub-systems
• Planning – repair schedules
• Diagnosis – medical
• Adversarial reasoning – legal

Page 21: Instance Based Learning

CBR process

[Diagram: a New Case is matched against the Case Base to Retrieve the Matched Cases and the Closest Case. If no adaptation is needed, the Closest Case directly suggests a solution (Reuse); otherwise it is adapted using Knowledge and Adaptation rules (Revise). The solved case is then Retained in the Case Base, which is how the system Learns.]

Page 22: Instance Based Learning

CBR example: Property pricing

Case  Location code  Bedrooms  Recep rooms  Type      Floors  Condition  Price £
1     8              2         1            terraced  1       poor       20,500
2     8              2         2            terraced  1       fair       25,000
3     5              1         2            semi      2       good       48,000
4     5              1         2            terraced  2       good       41,000

Test instance

Case  Location code  Bedrooms  Recep rooms  Type      Floors  Condition  Price £
5     7              2         2            semi      1       poor       ???

Page 23: Instance Based Learning

How rules are generated

Examine cases and look for ones that are almost identical:
  – case 1 and case 2
    • R1: if recep-rooms changes from 2 to 1, then reduce price by £5,000
  – case 3 and case 4
    • R2: if type changes from semi to terraced, then reduce price by £7,000

Page 24: Instance Based Learning

Matching

Comparing the test instance with each stored case:
  – matches(5,1) = 3
  – matches(5,2) = 3
  – matches(5,3) = 2
  – matches(5,4) = 1

Estimate the price of case 5 as £25,000 (the price of case 2, one of the closest matches).
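The matching score is just a count of shared attribute values; a sketch reproducing matches(5,1) from the table:

```python
def matches(case_a, case_b):
    """Count how many attribute values two cases have in common."""
    return sum(a == b for a, b in zip(case_a, case_b))

# (location code, bedrooms, recep rooms, type, floors, condition)
case_5 = (7, 2, 2, "semi", 1, "poor")
case_1 = (8, 2, 1, "terraced", 1, "poor")
print(matches(case_5, case_1))  # 3  (bedrooms, floors, condition)
```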

Page 25: Instance Based Learning

Adapting

• Reverse rule 2:
  – if type changes from terraced to semi, then increase price by £7,000
• Apply the reversed rule 2:
  – the new estimate of the price of property 5 is £32,000

Page 26: Instance Based Learning

Learning

So far we have a new case and an estimated price:
  – nothing is added yet to the case base

If later we find the house sold for £35,000, then the case would be added:
  – we could also add a new rule:
    • if location changes from 8 to 7, increase price by £3,000

Page 27: Instance Based Learning

Problems with CBR

• How should cases be represented?
• How should cases be indexed for fast retrieval?
• How can good adaptation heuristics be developed?
• When should old cases be removed?

Page 28: Instance Based Learning

Advantages

• A local approximation is found for each test case
• Knowledge is in a form understandable to human beings
• Fast to train

Page 29: Instance Based Learning

Summary

• K-Nearest Neighbour
• (Locally weighted regression)
• Case-based reasoning
• Lazy and eager learning

Page 30: Instance Based Learning

Lazy and Eager Learning

Lazy: wait for the query before generalising
  – k-Nearest Neighbour, Case-based reasoning

Eager: generalise before seeing the query
  – Radial Basis Function Networks, ID3, …

Does it matter?
  – An eager learner must create a global approximation
  – A lazy learner can create many local approximations