Problem Definition & Interpolation — Univariate Smoothing
web.cecs.pdx.edu/~edam/Slides/UnivariateSmoothingx4.pdf

Interpolation versus Extrapolation

• Interpolation is technically defined only for inputs that are within the range of the data set, min_i x_i ≤ x ≤ max_i x_i

• If an input is outside of this range, the model is said to be extrapolating

• A good model should do reasonable things for both cases

• Extrapolation is a much harder problem
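A minimal sketch of the distinction, in Python/NumPy rather than the course's MATLAB (illustrative only; `np.interp` is my choice, not the slides'). Inside the data range the query is interpolated; outside it, `np.interp` clamps to the endpoint value, which is one crude extrapolation policy:

```python
import numpy as np

# Five training points on [0, 1]; the data range is [min(x), max(x)] = [0, 1]
x = np.linspace(0.0, 1.0, 5)
y = x**2

# Query inside the range: interpolation (0.5 is a knot, so this is exact)
print(np.interp(0.5, x, y))   # 0.25

# Query outside the range: extrapolation; np.interp simply clamps to y[-1],
# illustrating that behavior beyond the data is a modeling choice
print(np.interp(2.0, x, y))   # 1.0
```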

J. McNames Portland State University ECE 4/557 Univariate Smoothing Ver. 1.25 3

Univariate Smoothing Overview

• Problem definition

• Interpolation

• Polynomial smoothing

• Cubic splines

• Basis splines

• Smoothing splines

• Bayes’ rule

• Density estimation

• Kernel smoothing

• Local averaging

• Weighted least squares

• Local linear models

• Prediction error estimates


Example 1: Linear Interpolation

[Figure: chirp data points with a linear interpolant — "Chirp Linear Interpolation", Output y vs. Input x on [0, 1] × [−2, 2]]

Problem Definition & Interpolation

• Smoothing Problem: Given a data set with a single input variable x, find the best function ĝ(x) that minimizes the prediction error on new inputs (probably not in the data set)

• Interpolation Problem: Same as the smoothing problem except the model is subject to the constraint ĝ(x_i) = y_i for every input-output pair (x_i, y_i) in the data set

– Linear Interpolation: Use a line between each pair of points

– Nearest Neighbor Interpolation: Find the nearest input in the data set and use the corresponding output as an approximate fit

– Polynomial Interpolation: Fit a polynomial of order n − 1 to the input-output data: ĝ(x) = Σ_{i=1}^{n} w_i x^{i−1}

– Cubic Spline Interpolation: Fit a cubic polynomial with continuous second derivatives between each pair of points (more on this later)
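The four methods above can be sketched with SciPy (an illustrative Python translation, not the slides' MATLAB; the chirp-like data below is my own stand-in):

```python
import numpy as np
from scipy.interpolate import interp1d, CubicSpline

rng = np.random.default_rng(2)
x = np.sort(rng.random(8))               # inputs
y = np.sin(2 * np.pi * 2 * x**2)         # chirp-like outputs
xt = np.linspace(x.min(), x.max(), 200)  # test inputs within the data range

lin = interp1d(x, y, kind="linear")(xt)   # line between each pair of points
nn = interp1d(x, y, kind="nearest")(xt)   # output of the nearest input
# Order n-1 polynomial through all n points via a Vandermonde system
w = np.linalg.solve(np.vander(x, increasing=True), y)
poly = np.vander(xt, len(x), increasing=True) @ w
spl = CubicSpline(x, y)(xt)               # piecewise cubic, continuous g''

# All of these satisfy the interpolation constraint g(x_i) = y_i
assert np.allclose(np.vander(x, len(x), increasing=True) @ w, y)
```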


yh = At*w;
figure;
FigureSet(1,'LTX');
h = plot(xt,yh,'b',x,y,'r.');
set(h,'MarkerSize',8);
set(h,'LineWidth',1.2);
xlabel('Input x');
ylabel('Output y');
title('Chirp Polynomial Interpolation');
set(gca,'Box','Off');
grid on;
axis([0 1 -2 2]);
AxisSet(8);
print -depsc InterpolationPolynomial;

%================================================
% Cubic Spline Interpolation
%================================================
figure;
FigureSet(1,'LTX');
yh = spline(x,y,xt);
h = plot(xt,yh,'b',x,y,'r.');
set(h,'MarkerSize',8);
set(h,'LineWidth',1.2);
xlabel('Input x');
ylabel('Output y');
title('Chirp Cubic Spline Interpolation');
set(gca,'Box','Off');

Example 1: MATLAB Code

%function [] = Interpolation();
close all;
N = 15;
rand('state',2);
x = rand(N,1);
y = sin(2*pi*2*x.^2) + 0.2*randn(N,1);
xt = (0:0.0001:1)'; % Test inputs

%================================================
% Linear Interpolation
%================================================
figure;
FigureSet(1,'LTX');
yh = interp1(x,y,xt,'linear');
h = plot(xt,yh,'b',x,y,'r.');
set(h,'MarkerSize',8);
set(h,'LineWidth',1.2);
xlabel('Input x');
ylabel('Output y');
title('Chirp Linear Interpolation');
set(gca,'Box','Off');
grid on;
axis([0 1 -2 2]);
AxisSet(8);
print -depsc InterpolationLinear;

grid on;
axis([0 1 -2 2]);
AxisSet(8);
print -depsc InterpolationCubicSpline;

%================================================
% Optimal Model
%================================================
figure;
FigureSet(1,'LTX');
yt = sin(2*pi*2*xt.^2);
h = plot(xt,yt,'b',x,y,'r.');
set(h,'MarkerSize',8);
set(h,'LineWidth',1.2);
xlabel('Input x');
ylabel('Output y');
title('Chirp Optimal Model');
set(gca,'Box','Off');
grid on;
axis([0 1 -2 2]);
AxisSet(8);
print -depsc InterpolationOptimalModel;

%================================================
% Nearest Neighbor Interpolation
%================================================
figure;
FigureSet(1,'LTX');
yh = interp1(x,y,xt,'nearest');
h = plot(xt,yh,'b',x,y,'r.');
set(h,'MarkerSize',8);
set(h,'LineWidth',1.2);
xlabel('Input x');
ylabel('Output y');
title('Chirp Nearest Neighbor Interpolation');
set(gca,'Box','Off');
grid on;
axis([0 1 -2 2]);
AxisSet(8);
print -depsc InterpolationNearestNeighbor;

%================================================
% Polynomial Interpolation
%================================================
A = zeros(N,N);
for cnt = 1:size(A,2),
    A(:,cnt) = x.^(cnt-1);
end;
w = pinv(A)*y;
At = zeros(length(xt),N);
for cnt = 1:size(A,2),
    At(:,cnt) = xt.^(cnt-1);
end;


Example 3: Polynomial Interpolation

[Figure: chirp data points with an order n − 1 polynomial interpolant — "Chirp Polynomial Interpolation", Output y vs. Input x on [0, 1] × [−2, 2]]

Example 2: Nearest Neighbor Interpolation

[Figure: chirp data points with a nearest neighbor interpolant — "Chirp Nearest Neighbor Interpolation", Output y vs. Input x on [0, 1] × [−2, 2]]

Example 3: MATLAB Code

Same data set and test inputs as linear interpolation example.


Example 2: MATLAB Code

Same data set and test inputs as linear interpolation example.



Interpolation Comments

• There are an infinite number of functions that satisfy the interpolation constraint:

g(x_i) = y_i ∀i

• Of course, we would like to choose the model that minimizes the prediction error

• Given only data, there is no way to do this exactly

• Our data set only specifies what g(x) should be at specific points

• What should it be in between these points?

• In practice, the method of interpolation is usually chosen by the user

Example 4: Cubic Spline Interpolation

[Figure: chirp data points with a cubic spline interpolant — "Chirp Cubic Spline Interpolation", Output y vs. Input x on [0, 1] × [−2, 2]]

Smoothing

• For the smoothing problem, even this constraint is relaxed

g(x_i) ≈ y_i ∀i

• The data set merely suggests what the model output should be, approximately, at some specified points

• We need another constraint or assumption about the relationship between x and y to have enough constraints to uniquely specify the model

Example 4: MATLAB Code

Same data set and test inputs as linear interpolation example.



Smoothing

y_i = g(x_i) + ε_i

• When we add noise, we can drop the interpolation constraint g(x_i) = y_i ∀i

• But we still want g(·) to be consistent with (i.e., close to) the data: g(x_i) ≈ y_i

• The methods we will discuss are biased in favor of models that are smooth

Smoothing Assumptions and Statistical Model

y = g(x) + ε

• Generally we assume that the data was generated from the statistical model above

• ε_i is a random variable with the following assumed properties

– Zero mean: E[ε] = 0
– ε_i and ε_j are independently distributed for i ≠ j
– ε_i is identically distributed

• Two additional assumptions are usually made for the smoothing problem

– g(x) is continuous
– g(x) is smooth

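A short sketch of this generative model in Python/NumPy, mirroring the chirp example's constants (illustrative, not the original code):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 15
x = rng.random(n)                    # inputs
g = np.sin(2 * np.pi * 2 * x**2)     # true function g(x): continuous and smooth
eps = 0.2 * rng.standard_normal(n)   # i.i.d. zero-mean noise eps_i
y = g + eps                          # observations y_i = g(x_i) + eps_i
```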

Bias-Variance Tradeoff

Recall that

MSE(x) = E[(ĝ(x) − g(x))²]
       = (g(x) − E[ĝ(x)])² + E[(ĝ(x) − E[ĝ(x)])²]
       = Bias² + Variance

• Fundamental smoother tradeoff:

– Smoothness of the estimate ĝ(x)
– Fit to the data

• This can also be framed as a bias-variance tradeoff

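The decomposition can be checked numerically by refitting a smoother over many simulated data sets and measuring bias and variance at one input (a Monte Carlo sketch; the degree-3 polynomial smoother and all constants are my choices):

```python
import numpy as np

rng = np.random.default_rng(0)
g = lambda x: np.sin(2 * np.pi * x)   # true function
x0 = 0.3                              # input at which MSE(x0) is evaluated

fits = []
for _ in range(2000):                 # many independent data sets
    x = np.linspace(0, 1, 20)
    y = g(x) + 0.3 * rng.standard_normal(x.size)
    w = np.polyfit(x, y, deg=3)       # a degree-3 polynomial smoother
    fits.append(np.polyval(w, x0))
fits = np.array(fits)

bias2 = (g(x0) - fits.mean())**2      # squared bias of g_hat(x0)
var = fits.var()                      # variance of g_hat(x0)
mse = ((g(x0) - fits)**2).mean()
assert abs(mse - (bias2 + var)) < 1e-9   # MSE = Bias^2 + Variance
```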

Example 5: Interpolation Optimal Model

[Figure: chirp data points with the true function sin(2π·2x²) — "Chirp Optimal Model", Output y vs. Input x on [0, 1] × [−2, 2]]


Example 6: Smoothing Problem MATLAB Code

function [] = SmoothingProblem();
A = load('MotorCycle.txt');
x = A(:,1);
y = A(:,2);
figure;
FigureSet(1,'LTX');
h = plot(x,y,'r.');
set(h,'MarkerSize',6);
xlabel('Input x');
ylabel('Output y');
title('Motorcycle Data Set');
set(gca,'Box','Off');
grid on;
ymin = min(y);
ymax = max(y);
yrng = ymax-ymin;
ymin = ymin - 0.05*yrng;
ymax = ymax + 0.05*yrng;
axis([min(x) max(x) ymin ymax]);
AxisSet(8);
%print -depsc Test;
print -depsc SmoothingProblem;

Bias-Variance Tradeoff Continued

MSE(x) = (g(x) − E[ĝ(x)])² + E[(ĝ(x) − E[ĝ(x)])²]

• Smooth models

– Less sensitive to the data
– Less variance
– Potentially high bias since they don't fit the data well

• Flexible models

– Sensitive to the data
– In the most extreme case, they interpolate the data
– High variance since they are sensitive to the data
– Low bias

return;

%================================================
% Linear
%================================================
figure;
FigureSet(1,4.5,2.8);
A = [ones(N,1) x];
w = pinv(A)*y;
yh = [ones(size(xt)) xt]*w;
h = plot(xt,yh,'b',x,y,'r.');
set(h,'MarkerSize',8);
set(h,'LineWidth',1.2);
xlabel('Input x');
ylabel('Output y');
title('Chirp Linear Least Squares');
set(gca,'Box','Off');
grid on;
ymin = min(y);
ymax = max(y);
yrng = ymax-ymin;
ymin = ymin - 0.05*yrng;
ymax = ymax + 0.05*yrng;
axis([min(x) max(x) ymin ymax]);
AxisSet(8);
print -depsc LinearLeastSquares;

Example 6: Univariate Smoothing Data

[Figure: "Motorcycle Data Set", Output y vs. Input x, roughly [5, 55] × [−100, 50]]


Example 7: MATLAB Code

Matlab/PolynomialSmoothing.m


Polynomial Smoothing

• We can fit a polynomial g(x) = Σ_{i=0}^{p−1} w_i x^i to the data using the linear modeling methods

• Note that linear models are linear in the parameters w_i

• They need not be linear in the inputs

• Alternatively, you can think of this as a linear model with p different inputs where the ith input is given by x_i = x^i

• This model is smooth in the sense that all derivatives of g(x) are continuous

• This is one measure of model smoothness

• In general, this is a terrible smoother

– Terrible at extrapolation
– The matrix inverse is often poorly conditioned and regularization is necessary
– The user has to pick the order of the polynomial p − 1

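A least-squares polynomial fit is linear in the weights, so it reduces to one matrix solve (Python/NumPy sketch; `lstsq` plays the role of `pinv(A)*y` in the MATLAB examples, and the data is my stand-in):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.sort(rng.random(50))
y = np.sin(2 * np.pi * 2 * x**2) + 0.2 * rng.standard_normal(50)

p = 6                                  # p coefficients -> order p-1 polynomial
A = np.vander(x, p, increasing=True)   # columns 1, x, ..., x^(p-1)
w, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares weights
yh = A @ w                             # smoothed outputs at the inputs

# The design matrix conditioning degrades quickly as p grows, which is
# why regularization is often necessary for high-order fits
print(np.linalg.cond(A))
```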

Cubic Splines

• Cubic splines are modeled after the flexible rods ship designers used to draw smooth curves

• The rod would be rigidly constrained to go through specific points (interpolation)

• The rod smoothly bent from one point to the next

• The rod naturally minimized its bending energy (i.e., curvature)

• This can be approximated by a piecewise cubic polynomial

Example 7: Polynomial Smoothing

[Figure: "Motorcycle Linear Regression" — motorcycle data with linear, quadratic, cubic, 4th-order, and 5th-order polynomial fits; Output y vs. Input x on [−10, 70] × [−150, 50]]


Cubic Spline Constraints

[Figure: sketch of a cubic spline, y vs. x]

• Cubic splines are piecewise cubic

• This means g(x) = Σ_{i=0}^{3} w_i(x) x^i has different weights between each pair of points

• For the entire region between each pair of points, the weights are fixed

• Each polynomial is defined by 4 parameters w_i(x)

• We have n + 1 regions, where n is the number of points in the data set

• Thus, we need at least 4 × (n + 1) constraints for the weights in each region to be uniquely specified

Cubic Splines Functional Form

g(x) = Σ_{i=0}^{3} w_i(x) x^i

• Unlike polynomial regression, here the parameters w_i(x) are also a function of x

• Consider a class of functions g(x) that have the following properties

– Continuous
– Continuous 1st derivative
– Continuous 2nd derivative
– Interpolates the data: g(x_i) = y_i

Cubic Spline Constraints Continued

[Figure: sketch of a cubic spline, y vs. x]

Let p_k(x) be the polynomial between the points x_k and x_{k+1}. We need 4 × (n + 1) constraints for the problem to be well defined.

Property                     Expression                          Constraints
Interpolation                g(x_i) = y_i                        n
Continuous                   p_k(x_{k+1}) = p_{k+1}(x_{k+1})     n
Continuous derivative        p′_k(x_{k+1}) = p′_{k+1}(x_{k+1})   n
Continuous 2nd derivative    p″_k(x_{k+1}) = p″_{k+1}(x_{k+1})   n

Natural splines have 4 additional constraints:

p‴_0(x_1) = 0    p″_0(x_1) = 0
p‴_n(x_n) = 0    p″_n(x_n) = 0

Cubic Splines Smoothness Definition

• Out of all the functions that meet the above criteria, consider those that also minimize the approximate "curvature" of g(x)

C ≡ ∫_{x_min}^{x_max} (d²g(x)/dx²)² dx

• These are piecewise cubics and are called cubic splines

• In the sense of satisfying the criteria listed above and minimizing the curvature C, cubic splines are optimal

• Even with all of these constraints, g(x) is not uniquely specified

• There are several cubic splines that meet the strict criteria and have the same curvature

• The most popular additional constraints are

g″(x_min) = 0    g″(x_max) = 0

• These are called natural cubic splines

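SciPy's `CubicSpline` supports these boundary conditions directly via `bc_type="natural"`, which makes the two defining properties easy to verify (an illustrative Python check, not the course code; the data points are mine):

```python
import numpy as np
from scipy.interpolate import CubicSpline

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 1.0, 0.0, -1.0, 0.0])
cs = CubicSpline(x, y, bc_type="natural")   # enforces g'' = 0 at both ends

assert np.allclose(cs(x), y)       # interpolates the data
assert abs(cs(x[0], 2)) < 1e-9     # zero second derivative at x_min
assert abs(cs(x[-1], 2)) < 1e-9    # zero second derivative at x_max
```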


Example 8: Basis Function 0

[Figure: "Basis Function B0(x)", Output y vs. Input x on [0, 10] × [0, 1]]

Basis Splines

• You could solve for the 4(n + 1) model coefficients by solving a set of 4(n + 1) linear equations

• This is cumbersome and mathematically very inefficient

• An easier way is to use basis functions

• Mathematically, each basis function is defined recursively

b_{i,j}(x) = (x − k_j)/(k_{i+j} − k_j) · b_{i−1,j}(x) + (k_{i+j+1} − x)/(k_{i+j+1} − k_{j+1}) · b_{i−1,j+1}(x)

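The recursion above (the Cox–de Boor recursion) can be implemented directly; this sketch follows the slide's b_{i,j} indexing on a uniform knot vector (the knots and test values are my own):

```python
import numpy as np

def bspline_basis(i, j, k, x):
    """Degree-i B-spline basis function b_{i,j} on knot vector k."""
    if i == 0:
        # Base case: indicator of the knot span [k_j, k_{j+1})
        return np.where((k[j] <= x) & (x < k[j + 1]), 1.0, 0.0)
    left = (x - k[j]) / (k[i + j] - k[j]) * bspline_basis(i - 1, j, k, x)
    right = ((k[i + j + 1] - x) / (k[i + j + 1] - k[j + 1])
             * bspline_basis(i - 1, j + 1, k, x))
    return left + right

k = np.arange(11.0)               # uniform knots 0, 1, ..., 10
x = np.linspace(0.0, 6.9, 50)
b3 = bspline_basis(3, 0, k, x)    # cubic basis, finite support [k_0, k_4)
assert np.all(b3 >= 0) and np.all(b3[x >= 4.0] == 0)
```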

Example 8: Basis Function 1

[Figure: "Basis Function B1(x)", Output y vs. Input x on [0, 10] × [0, 1]]

Basis Splines Continued

• The output of our model can then be written as

g(x) = Σ_{i=−2}^{n−1} w_i b_{3,i}(x)

• Numerically, this can be solved much more quickly (the cost is proportional to n)

• Since the basis functions have finite support (i.e., finite span), the equivalent A matrix is banded

• Basis splines also have the nice property that they sum to unity

Σ_{j=1−i}^{n−1} b_{i,j}(x) = 1    ∀x ∈ [k_1, k_n]


Example 8: MATLAB Code

Matlab/BasisFunctions.m


Example 8: Basis Function 2

[Figure: "Basis Function B2(x)", Output y vs. Input x on [0, 10] × [0, 1]]

Smoothing Splines

• For smoothing, we do not require g(x_i) = y_i

• But we would like it to be close: g(x_i) ≈ y_i

• How do we trade off smoothness (low variance) for a good fit to the data (low bias)?

• One way is to find the g(x) that minimizes the following performance criterion:

E_λ = Σ_{i=1}^{n} (y_i − g(x_i))² + λ ∫_{−∞}^{+∞} (g″(x))² dx

• Contrast this with cubic splines, for which we required the first term to be zero

• The second term is a roughness penalty

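SciPy's `make_smoothing_spline` minimizes exactly this penalized criterion, with `lam` playing the role of λ (available in SciPy ≥ 1.10, an assumption about your install; the data is my stand-in for the motorcycle set):

```python
import numpy as np
from scipy.interpolate import make_smoothing_spline

rng = np.random.default_rng(3)
x = np.sort(rng.random(40))
y = np.sin(2 * np.pi * 2 * x**2) + 0.2 * rng.standard_normal(40)

rough = make_smoothing_spline(x, y, lam=1e-8)   # small lambda: near interpolation
smooth = make_smoothing_spline(x, y, lam=1e-1)  # large lambda: near linear fit

# Raising the roughness penalty can only increase the residual sum of squares
rss = lambda g: float(np.sum((y - g(x))**2))
assert rss(rough) <= rss(smooth) + 1e-12
```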

Example 8: Basis Function 3

[Figure: "Basis Function B3(x)", Output y vs. Input x on [0, 10] × [0, 1]]


Example 9: Smoothing Spline

[Figure: "Motorcycle Data Smoothing Spline Regression" — fits for α = 1.0, α = 0.5, and α = 0.0; Output y vs. Input x on [−10, 70] × [−150, 50]]

Smoothing Splines Continued

E_λ = Σ_{i=1}^{n} (y_i − g(x_i))² + λ ∫_{−∞}^{+∞} (g″(x))² dx

• λ is a user-specified parameter that controls the tradeoff

• It turns out the optimal solution (in the sense of minimizing E_λ) is a smoothing spline

• A smoothing spline is identical to a cubic spline in form

– There is a cubic polynomial between each pair of points
– Same number of knots
– Same number of different sets of polynomials

• Unlike the cubic spline, we now drop the constraint that g(x_i) = y_i

• Instead, g(x_i) = ŷ_i for some set of values ŷ_i

Example 10: Smoothing Spline

[Figure: "Motorcycle Data Smoothing Spline Regression α = 0.0001", Output y vs. Input x on [−10, 70] × [−150, 50]]

Smoothing Splines Comments

E_λ = Σ_{i=1}^{n} (y_i − g(x_i))² + λ ∫_{−∞}^{+∞} (g″(x))² dx

• Smoothing splines are smooth in the same sense as cubic splines

– If cubic, the second derivative is continuous
– If quadratic, the first derivative is continuous
– If linear, the function is continuous

• For a cubic smoothing spline,

– As λ → ∞, g(x) approaches a linear least squares fit to the data (i.e., g″(x) → 0)
– As λ → 0, g(x) becomes an interpolating cubic spline

• This is implemented in MATLAB as csaps

• Instead of λ, it takes an equivalent parameter scaled between 0 (linear least squares fit) and 1 (cubic spline interpolation)


Example 10: Smoothing Spline

[Figure: "Motorcycle Data Smoothing Spline Regression α = 0.2000", Output y vs. Input x on [−10, 70] × [−150, 50]]

Example 10: Smoothing Spline

[Figure: "Motorcycle Data Smoothing Spline Regression α = 0.0010", Output y vs. Input x on [−10, 70] × [−150, 50]]

Example 10: Smoothing Spline

[Figure: "Motorcycle Data Smoothing Spline Regression α = 0.5000", Output y vs. Input x on [−10, 70] × [−150, 50]]

Example 10: Smoothing Spline

[Figure: "Motorcycle Data Smoothing Spline Regression α = 0.0100", Output y vs. Input x on [−10, 70] × [−150, 50]]


Example 10: MATLAB Code

function [] = SmoothingSplineEx();
close all;
A = load('MotorCycle.txt');
xr = A(:,1); % Raw values
yr = A(:,2); % Raw values
x = unique(xr);
y = zeros(size(x));
for cnt = 1:length(x),
    y(cnt) = mean(yr(xr==x(cnt)));
end;
N = size(A,1); % No. data set points
xt = (-10:0.2:70)';
NT = length(xt); % No. test points
NS = 3; % No. of different splines
yh = zeros(NT,NS);
yh(:,3) = csaps(x',y',0,xt')';
yh(:,2) = csaps(x',y',0.5,xt')';
yh(:,1) = csaps(x',y',1.0,xt')';
FigureSet(1,'LTX');
h = plot(xt,yh,x,y,'k.');
set(h,'MarkerSize',8);
set(h,'LineWidth',1.2);

Example 10: Smoothing Spline

[Figure: "Motorcycle Data Smoothing Spline Regression α = 0.9000", Output y vs. Input x on [−10, 70] × [−150, 50]]

xlabel('Input x');
ylabel('Output y');
title('Motorcycle Data Smoothing Spline Regression');
set(gca,'Box','Off');
grid on;
axis([-10 70 -150 90]);
AxisSet(8);
legend('\alpha = 1.0','\alpha = 0.5','\alpha = 0.0',4);
print -depsc SmoothingSplineEx;

L = [0.0001 0.001 0.01 0.2 0.5 0.9 0.99];
for cnt = 1:length(L),
    alpha = L(cnt);
    figure;
    FigureSet(1,'LTX');
    yh = csaps(x',y',alpha,xt')';
    h = plot(xt,yh,x,y,'k.');
    set(h,'MarkerSize',8);
    set(h,'LineWidth',1.2);
    xlabel('Input x');
    ylabel('Output y');
    st = sprintf('Motorcycle Data Smoothing Spline Regression \\alpha=%6.4f',alpha);
    title(st);
    set(gca,'Box','Off');
    grid on;
    axis([-10 70 -150 90]);
    AxisSet(8);
    st = sprintf('print -depsc SmoothingSplineEx%04d;',round(alpha*10000));
    eval(st);
end;

Example 10: Smoothing Spline

[Figure: "Motorcycle Data Smoothing Spline Regression α = 0.9900", Output y vs. Input x on [−10, 70] × [−150, 50]]


Density Estimation

• Then a kernel density estimator is simply expressed as

f̂(x) = (1/n) Σ_{i=1}^{n} b_σ(x − x_i)

• The width of the kernel is specified by σ. Typically

b_σ(u) = (1/σ) b(u/σ)

where it is easy to show that ∫ b_σ(u) du = 1 for any value of σ

• Bumps shaped like a Gaussian are popular

b(u) = (1/√(2π)) e^{−u²/2}

• Typically the bumps have even symmetry:

b(u) = b(−u) = b(|u|)

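With Gaussian bumps the estimator is a few lines in Python/NumPy (an illustrative sketch; the bandwidth and synthetic data are my choices):

```python
import numpy as np

def kde(xq, data, sigma):
    """Kernel density estimate f(x) = (1/n) sum_i b_sigma(x - x_i),
    with Gaussian bumps b_sigma(u) = (1/sigma) b(u/sigma)."""
    u = (xq[:, None] - data[None, :]) / sigma
    b = np.exp(-0.5 * u**2) / (np.sqrt(2 * np.pi) * sigma)
    return b.mean(axis=1)

rng = np.random.default_rng(4)
data = rng.standard_normal(500)
xq = np.linspace(-6.0, 6.0, 1201)
f = kde(xq, data, sigma=0.5)

# The estimate integrates to (approximately) 1, as a density must
area = np.sum((f[1:] + f[:-1]) / 2 * np.diff(xq))
assert abs(area - 1.0) < 1e-2
```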

Review of Bayes’ Rule

• Bayes' rule says that two discrete-valued random variables A and B have the following relationship

Pr{B|A} = Pr{A, B}/Pr{A} = Pr{A|B} Pr{B}/Pr{A}

• Recall that earlier we found that the g(x) that minimizes the MSE is given by

Ŷ = g*(x) = E[Y|X = x]

• For smoothing, we can use the continuous analog of Bayes' rule to estimate E[Y|X = x]

f(y|X = x) = f(x, y)/f(x) = f(x|Y = y) f(y)/f(x)

Example 11: Density Estimation

[Figure: "Motorcycle Data Density Estimation w = 0.1", Density p(x) vs. Input x on [−10, 70] × [0, 0.1]]

Continuous Bayes’ Rule

f(y|X = x) = f(x, y)/f(x) = f(x|Y = y) f(y)/f(x)

• E[Y|X = x] is given by

E[Y|X = x] = ∫_{−∞}^{+∞} y f(y|X = x) dy

• In order to estimate these equations we need a means of estimating the densities f(x) and f(x, y)

• A popular method of estimating a density is to add a series of "bumps" together

• The bumps are called kernels and should have the following property

∫_{−∞}^{+∞} b_σ(u) du = 1

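Substituting kernel density estimates for f(x, y) and f(x) into E[Y|X = x] reduces, after the integral over y, to a kernel-weighted average of the y_i (the Nadaraya–Watson estimator; the name and this Python sketch are mine, not the slides'):

```python
import numpy as np

def cond_mean(xq, x, y, sigma):
    """Estimate E[Y|X=x] as a kernel-weighted average of the outputs."""
    w = np.exp(-0.5 * ((xq[:, None] - x[None, :]) / sigma)**2)
    return (w @ y) / w.sum(axis=1)

rng = np.random.default_rng(5)
x = rng.uniform(0.0, 1.0, 300)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(300)

xq = np.array([0.25, 0.75])          # sin(2*pi*x) is +1 and -1 here
yh = cond_mean(xq, x, y, sigma=0.05)
assert abs(yh[0] - 1.0) < 0.2 and abs(yh[1] + 1.0) < 0.2
```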


Example 11: Density Estimation

[Figure: "Motorcycle Data Density Estimation w = 1.0", Density p(x) vs. Input x on [−10, 70] × [0, 0.1]]

Example 11: Density Estimation

[Figure: "Motorcycle Data Density Estimation w = 0.2", Density p(x) vs. Input x on [−10, 70] × [0, 0.1]]

Example 11: Density Estimation

[Figure: "Motorcycle Data Density Estimation w = 5.0", Density p(x) vs. Input x on [−10, 70] × [0, 0.1]]

Example 11: Density Estimation

[Figure: "Motorcycle Data Density Estimation w = 0.5", Density p(x) vs. Input x on [−10, 70] × [0, 0.1]]


Density Estimation in Higher Dimensions

• Density estimation can be extended to higher dimensions in the obvious way

f̂(x) = (1/n) Σ_{i=1}^{n} Π_{j=1}^{p} b_σ(x_j − x_{i,j})

where x_j is the jth element of the input vector x and x_{i,j} is the jth element of the ith input vector in the data set

• Although you can use this for large values of p, it is not recommended

• The estimate becomes inaccurate very quickly as the number of dimensions grows

• For one or two dimensions this is a pretty good technique

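The product-kernel estimator vectorizes naturally (Python sketch with Gaussian bumps; the dimensions and synthetic data are my choices):

```python
import numpy as np

def kde_nd(xq, data, sigma):
    """Product-kernel estimate f(x) = (1/n) sum_i prod_j b_sigma(x_j - x_ij)."""
    u = (xq[:, None, :] - data[None, :, :]) / sigma        # (queries, n, p)
    b = np.exp(-0.5 * u**2) / (np.sqrt(2 * np.pi) * sigma)
    return b.prod(axis=2).mean(axis=1)   # product over dims, average over data

rng = np.random.default_rng(6)
data = rng.standard_normal((400, 2))     # n = 400 points in p = 2 dimensions
xq = np.array([[0.0, 0.0], [4.0, 4.0]])
f = kde_nd(xq, data, sigma=0.5)
assert f[0] > f[1]    # higher density near the bulk of the data
```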

Example 11: MATLAB Code

function [] = DensityEx();
close all;
A = load('MotorCycle.txt');
x = A(:,1); % Raw values
y = A(:,2); % Raw values
W = [0.1 0.2 0.5 1.0 5.0];
xt = (-10:0.05:70)';
for c1 = 1:length(W),
    w = W(c1);
    bs = zeros(size(xt)); % Bump sum
    for c2 = 1:length(x),
        bs = bs + exp(-(xt-x(c2)).^2/(2*w.^2))/sqrt(2*pi*w^2);
    end;
    bs = bs/length(x);
    figure;
    FigureSet(1,'LTX');
    h = plot(x,zeros(size(x)),'k.',xt,bs);
    set(h,'LineWidth',1.5);
    xlabel('Input x');
    ylabel('Density p(x)');
    st = sprintf('Motorcycle Data Density Estimation w=%5.1f',w);

Example 12: 2D Density Estimation

[Figure: "Motorcycle Data Input−Output Density Estimation w = 0.05", Output y vs. Input x with a density color scale]

    title(st);
    set(gca,'Box','Off');
    grid on;
    axis([-10 70 0 0.1]);
    AxisSet(8);
    st = sprintf('print -depsc DensityEx%02d;',round(w*10));
    eval(st);
end;


Example 12: 2D Density Estimation

[Figure: "Motorcycle Data Input−Output Density Estimation w = 0.50", Output y vs. Input x with a density color scale]

Example 12: 2D Density Estimation

[Figure: "Motorcycle Data Input−Output Density Estimation w = 0.10", Output y vs. Input x with a density color scale]

[Figure — Example 12: 2D Density Estimation. "Motorcycle Data Input-Output Density Estimation w = 1.00"; Output y vs. Input x.]

[Figure — Example 12: 2D Density Estimation. "Motorcycle Data Input-Output Density Estimation w = 0.20"; Output y vs. Input x.]


Density Estimation and Scaling

• In higher dimensions it is important to scale each input to have the same variance

• The following example shows the same data set without scaling

• Notice the oval-shaped bumps

Example 12: MATLAB Code

function [] = DensityEx2D();
close all;
A = load('MotorCycle.txt');
xr = A(:,1); % Raw values
yr = A(:,2); % Raw values
xm = mean(xr);
ym = mean(yr);
xs = std(xr);
ys = std(yr);
x = (xr-xm)/xs;
y = (yr-ym)/ys;
W = [0.05 0.1 0.2 0.5 1.0];
xst = -2.0:0.02:2.5; % X-test points
yst = -2.5:0.02:2.5; % Y-test points
[xmt,ymt] = meshgrid(xst,yst); % Grids of scaled test points
xt = xst*xs + xm; % Unscaled x-test values
yt = yst*ys + ym; % Unscaled y-test values
for c1 = 1:length(W),
    w = W(c1);
    bs = zeros(size(xmt)); % Bump sum

[Figure — Example 13: 2D Density Estimation. "Motorcycle Data No Scaling Density Estimation w = 0.50"; Output y vs. Input x.]

    for c2 = 1:length(x),
        bx = exp(-(xmt-x(c2)).^2/(2*w^2))/sqrt(2*pi*w^2);
        by = exp(-(ymt-y(c2)).^2/(2*w^2))/sqrt(2*pi*w^2);
        bs = bs + bx.*by;
    end;
    bs = bs/length(x);
    figure;
    FigureSet(1,'LTX');
    h = imagesc(xt,yt,bs);
    hold on;
    h = plot(xr,yr,'k.',xr,yr,'w.');
    set(h(1),'MarkerSize',4);
    set(h(2),'MarkerSize',2);
    hold off;
    set(gca,'YDir','Normal');
    xlabel('Input x');
    ylabel('Output y');
    st = sprintf('Motorcycle Data Input-Output Density Estimation w=%5.2f',w);
    title(st);
    set(gca,'Box','Off');
    colorbar;
    AxisSet(8);
    st = sprintf('print -depsc DensityEx2D%03d;',round(w*100));
    eval(st);
end;


[Figure — Example 13: 2D Density Estimation. "Motorcycle Data No Scaling Density Estimation w = 10.00"; Output y vs. Input x.]

[Figure — Example 13: 2D Density Estimation. "Motorcycle Data No Scaling Density Estimation w = 1.00"; Output y vs. Input x.]

[Figure — Example 13: 2D Density Estimation. "Motorcycle Data No Scaling Density Estimation w = 20.00"; Output y vs. Input x.]

[Figure — Example 13: 2D Density Estimation. "Motorcycle Data No Scaling Density Estimation w = 5.00"; Output y vs. Input x.]


    h = plot(x,y,'k.',x,y,'w.');
    set(h(1),'MarkerSize',4);
    set(h(2),'MarkerSize',2);
    hold off;
    set(gca,'YDir','Normal');
    xlabel('Input x');
    ylabel('Output y');
    st = sprintf('Motorcycle Data No Scaling Density Estimation w=%6.2f',w);
    title(st);
    set(gca,'Box','Off');
    colorbar;
    AxisSet(8);
    st = sprintf('print -depsc DensityEx2Db%03d;',round(w*10));
    eval(st);
end;

[Figure — Example 13: 2D Density Estimation. "Motorcycle Data No Scaling Density Estimation w = 50.00"; Output y vs. Input x.]

Kernel Smoothing Derivation

The following equations compose the Nadaraya-Watson estimator of $E[y|x]$:

$$E[y|x] = \int_{-\infty}^{\infty} y\, f(y|x)\, dy = \int_{-\infty}^{\infty} y\, \frac{f(x,y)}{f(x)}\, dy = \frac{\int_{-\infty}^{\infty} y\, f(x,y)\, dy}{f(x)}$$

The two densities can be estimated as follows:

$$\hat{f}(x,y) = \frac{1}{n}\sum_{i=1}^{n} b_\sigma(|x - x_i|)\, b_\sigma(|y - y_i|) \qquad \hat{f}(x) = \frac{1}{n}\sum_{i=1}^{n} b_\sigma(|x - x_i|)$$

Example 13: MATLAB Code

function [] = DensityEx2Db();
% This is the same as DensityEx2D, except no scaling is used.
close all;
A = load('MotorCycle.txt');
x = A(:,1); % Raw values
y = A(:,2); % Raw values
W = [0.5 1.0 2.0 5.0 10.0 20.0 50.0];
xt = 0:0.5:60; % X-test points
yt = -150:90; % Y-test points
[xmt,ymt] = meshgrid(xt,yt); % Grids of (unscaled) test points
for c1 = 1:length(W),
    w = W(c1);
    bs = zeros(size(xmt)); % Bump sum
    for c2 = 1:length(x),
        bx = exp(-(xmt-x(c2)).^2/(2*w^2))/sqrt(2*pi*w^2);
        by = exp(-(ymt-y(c2)).^2/(2*w^2))/sqrt(2*pi*w^2);
        bs = bs + bx.*by;
    end;
    bs = bs/length(x);
    figure;
    FigureSet(1,'LTX');
    h = imagesc(xt,yt,bs);
    hold on;


Kernel Smoothing Derivation Continued

Thus, by combining the equations on the previous slides we obtain

$$E[y|x] \approx \hat{g}(x) = \frac{\sum_{i=1}^{n} y_i\, b_\sigma(|x - x_i|)}{\sum_{i=1}^{n} b_\sigma(|x - x_i|)}$$

• Popular kernels include
  – Epanechnikov: $b(u) = c\,(1 - u^2)\, p(u)$
  – Biweight: $b(u) = c\,(1 - u^2)^2\, p(u)$
  – Triweight: $b(u) = c\,(1 - u^2)^3\, p(u)$
  – Triangular: $b(u) = c\,(1 - |u|)\, p(u)$
  – Gaussian: $b(u) = c\, e^{-u^2}$
  – Sinc: $b(u) = c\, \mathrm{sinc}(u)$

• Here $c$ is a constant chosen to meet the constraint $\int_{-\infty}^{\infty} b(u)\, du = 1$

• $p(u)$ is the unit pulse:

$$p(u) = \begin{cases} 1 & |u| \le 1 \\ 0 & \text{otherwise} \end{cases}$$
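The constants $c$ can be checked numerically. A short Python sketch (mine, not from the slides) that recovers the normalization constants for the compact-support kernels and the Gaussian:

```python
import numpy as np

def c_for(b, u):
    """Normalization constant c such that the integral of c*b(u) du is 1
    (simple Riemann sum; each kernel vanishes at the grid endpoints)."""
    return 1.0 / (b.sum() * (u[1] - u[0]))

u = np.linspace(-1, 1, 200001)           # unit-pulse support, |u| <= 1
c_epan = c_for(1 - u**2, u)              # Epanechnikov -> 3/4
c_biwt = c_for((1 - u**2)**2, u)         # Biweight     -> 15/16
c_triw = c_for((1 - u**2)**3, u)         # Triweight    -> 35/32
c_tria = c_for(1 - np.abs(u), u)         # Triangular   -> 1

ug = np.linspace(-10, 10, 200001)        # wide grid for the Gaussian e^{-u^2}
c_gauss = c_for(np.exp(-ug**2), ug)      # -> 1/sqrt(pi)
```

The closed forms $3/4$, $15/16$, $35/32$, $1$, and $1/\sqrt{\pi}$ all follow from the constraint $\int b(u)\,du = 1$.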

Kernel Smoothing Derivation Continued (1)

$$E[y|x] \approx \frac{\int_{-\infty}^{\infty} y\, \hat{f}(x,y)\, dy}{\hat{f}(x)}$$

$$\hat{f}(x)\, E[y|x] \approx \int_{-\infty}^{\infty} y \left( \frac{1}{n}\sum_{i=1}^{n} b_\sigma(|x - x_i|)\, b_\sigma(|y - y_i|) \right) dy$$

$$= \frac{1}{n}\sum_{i=1}^{n} b_\sigma(|x - x_i|) \int_{-\infty}^{\infty} y\, b_\sigma(|y - y_i|)\, dy$$

$$= \frac{1}{n}\sum_{i=1}^{n} b_\sigma(|x - x_i|) \int_{-\infty}^{\infty} (y - y_i + y_i)\, b_\sigma(|y - y_i|)\, dy$$

$$= \frac{1}{n}\sum_{i=1}^{n} b_\sigma(|x - x_i|) \left[ y_i \int_{-\infty}^{\infty} b_\sigma(|y - y_i|)\, dy + \int_{-\infty}^{\infty} (y - y_i)\, b_\sigma(|y - y_i|)\, dy \right]$$

[Figure — Example 14: Kernels. Six normalized kernels plotted over u ∈ [−2, 2]: Epanechnikov, Biweight, Triweight, Triangular, Gaussian, Sinc.]

Kernel Smoothing Derivation Continued (2)

$$\hat{f}(x)\, E[y|x] = \frac{1}{n}\sum_{i=1}^{n} b_\sigma(|x - x_i|) \left[ y_i \int_{-\infty}^{\infty} b_\sigma(|y - y_i|)\, dy + \int_{-\infty}^{\infty} (y - y_i)\, b_\sigma(|y - y_i|)\, dy \right]$$

$$= \frac{1}{n}\sum_{i=1}^{n} b_\sigma(|x - x_i|) \left[ y_i + \int_{-\infty}^{\infty} u\, b_\sigma(|u|)\, du \right]$$

$$= \frac{1}{n}\sum_{i=1}^{n} y_i\, b_\sigma(|x - x_i|)$$

since $\int b_\sigma(|y - y_i|)\, dy = 1$ and $\int u\, b_\sigma(|u|)\, du = 0$ for a symmetric kernel. Therefore

$$E[y|x] = \frac{\frac{1}{n}\sum_{i=1}^{n} y_i\, b_\sigma(|x - x_i|)}{\frac{1}{n}\sum_{i=1}^{n} b_\sigma(|x - x_i|)} = \frac{\sum_{i=1}^{n} y_i\, b_\sigma(|x - x_i|)}{\sum_{i=1}^{n} b_\sigma(|x - x_i|)}$$


Kernel Smoothing Comments

$$E[y|x] \approx \hat{g}(x) = \frac{\sum_{i=1}^{n} y_i\, b_\sigma(|x - x_i|)}{\sum_{i=1}^{n} b_\sigma(|x - x_i|)} = \sum_{i=1}^{n} y_i \left( \frac{b_\sigma(|x - x_i|)}{\sum_{j=1}^{n} b_\sigma(|x - x_j|)} \right)$$

• Kernel smoothing can be written as a weighted average

$$\hat{g}(x) = \sum_{i=1}^{n} y_i\, w_i(x) \qquad w_i(x) = \frac{b_\sigma(|x - x_i|)}{\sum_{j=1}^{n} b_\sigma(|x - x_j|)}$$

• Note that by definition $\sum_{i=1}^{n} w_i(x) = 1$

• If all the weights were equal, $w_i(x) = \frac{1}{n}$, then $\hat{g}(x) = \bar{y}$

• This occurs as $\sigma \to \infty$
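This weighted-average form is easy to implement directly. A minimal Python sketch (a Gaussian-bump Nadaraya-Watson smoother on made-up data, not the motorcycle set; `kernel_smooth` is my own name):

```python
import numpy as np

def kernel_smooth(x, y, xt, sigma):
    """Nadaraya-Watson estimate of E[y|x] at the query points xt
    using Gaussian bumps b_sigma(|x - x_i|)."""
    d = xt[:, None] - x[None, :]              # all query/data differences
    b = np.exp(-d**2 / (2 * sigma**2))        # unnormalized kernel weights
    w = b / b.sum(axis=1, keepdims=True)      # w_i(x): each row sums to 1
    return w @ y                              # weighted average of outputs

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 100))
y = np.sin(x) + 0.3 * rng.standard_normal(100)
xt = np.linspace(0, 10, 50)

g = kernel_smooth(x, y, xt, sigma=0.5)
# As sigma grows, every w_i(x) approaches 1/n and g(x) approaches mean(y)
g_wide = kernel_smooth(x, y, xt, sigma=1e6)
```

Because $\hat{g}(x)$ is a convex combination of the $y_i$, it stays between $\min_i y_i$ and $\max_i y_i$, the boundedness property noted on a later slide.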

Example 14: MATLAB Code

function [] = Kernels();
ST = 0.01;
x = (-2.2:ST:2.2)';
u = abs(x);
I = (u <= 1);
kep = (1-u.^2).*I;       % Epanechnikov
kbw = (1-u.^2).^2.*I;    % Biweight
ktw = (1-u.^2).^3.*I;    % Triweight
ktr = (1-u).*I;          % Triangular
kga = exp(-u.^2);        % Gaussian
ksn = sinc(u);           % Sinc
kep = kep/(sum(kep)*ST); % Normalize
kbw = kbw/(sum(kbw)*ST); % Normalize
ktw = ktw/(sum(ktw)*ST); % Normalize
ktr = ktr/(sum(ktr)*ST); % Normalize
kga = kga/(sum(kga)*ST); % Normalize
ksn = ksn/(sum(ksn)*ST); % Normalize
K = [kep kbw ktw ktr kga ksn];
L = {'Epanechnikov','Biweight','Triweight','Triangular','Gaussian','Sinc'};
FigureSet(1,4.5,2.8);
for cnt = 1:6,
    subplot(2,3,cnt);

Kernel Smoothing Effect of Support

$$E[y|x] \approx \hat{g}(x) = \frac{\sum_{i=1}^{n} y_i\, b_\sigma(|x - x_i|)}{\sum_{i=1}^{n} b_\sigma(|x - x_i|)}$$

As the width decreases ($\sigma \downarrow$) one of two things happens

• If $b(u)$ has infinite support,
  – All of the equivalent weights become nearly equal to zero
  – The weight from the nearest neighbor dominates
  – Thus $\hat{g}(x)$ does nearest neighbor interpolation as $\sigma \to 0$

• If $b(u)$ has finite support,
  – At some values of $x$ all of the weights may be 0
  – If this happens, $\hat{g}(x)$ at these points is not defined (depends on implementation)

    h = plot([-5 5],[0 0],'k:',x,K(:,cnt));
    set(h(2),'LineWidth',1.5);
    title(char(L(cnt)));
    box off;
    axis([min(x) max(x) -0.3 1.2]);
end;
AxisSet(8);
print -depsc Kernels;


[Figure — Example 15: Kernel Smoothing. "Motorcycle Data Kernel Smoothing Epanechnikov Kernel w = 1.0000"; Output y vs. Input x.]

Kernel Smoothing Bias-Variance Tradeoff

$$E[y|x] \approx \hat{g}(x) = \frac{\frac{1}{n}\sum_{i=1}^{n} y_i\, b_\sigma(|x - x_i|)}{\frac{1}{n}\sum_{i=1}^{n} b_\sigma(|x - x_i|)}$$

• Thus, as with smoothing splines, there is a single parameter that controls the tradeoff of smoothness (high bias) for the ability of the model to fit the data (high variance)

• Kernel smoothers have bounded outputs

$$\min_i y_i \le \min_x \hat{g}(x) \le \hat{g}(x) \le \max_x \hat{g}(x) \le \max_i y_i$$

• In this sense, they are more stable than smoothing splines

• Recall that smoothing splines diverge outside of the data range

• However, kernel smoothers are more likely to round off sharp edges, peaks, and troughs

[Figure — Example 15: Kernel Smoothing. "Motorcycle Data Kernel Smoothing Epanechnikov Kernel w = 2.0000"; Output y vs. Input x.]

[Figure — Example 15: Kernel Smoothing. "Motorcycle Data Kernel Smoothing Epanechnikov Kernel w = 0.1000"; Output y vs. Input x.]


[Figure — Example 15: Kernel Smoothing. "Motorcycle Data Kernel Smoothing Gaussian Kernel w = 0.1000"; Output y vs. Input x.]

[Figure — Example 15: Kernel Smoothing. "Motorcycle Data Kernel Smoothing Epanechnikov Kernel w = 5.0000"; Output y vs. Input x.]

[Figure — Example 15: Kernel Smoothing. "Motorcycle Data Kernel Smoothing Gaussian Kernel w = 1.0000"; Output y vs. Input x.]

[Figure — Example 15: Kernel Smoothing. "Motorcycle Data Kernel Smoothing Epanechnikov Kernel w = 10.0000"; Output y vs. Input x.]


[Figure — Example 15: Kernel Smoothing. "Motorcycle Data Kernel Smoothing Gaussian Kernel w = 10.0000"; Output y vs. Input x.]

[Figure — Example 15: Kernel Smoothing. "Motorcycle Data Kernel Smoothing Gaussian Kernel w = 2.0000"; Output y vs. Input x.]

Example 15: MATLAB Code

function [] = KernelSmoothingEx();
close all;
A = load('MotorCycle.txt');
x = A(:,1); % Raw values
y = A(:,2); % Raw values
xt = (-10:0.05:70)';
W = [0.1 1.0 2.0 3.0 5.0 10.0];
% Epanechnikov Kernel
for cnt = 1:length(W),
    w = W(cnt);
    figure;
    FigureSet(1,'LTX');
    yh = Kernel(x,y,xt,w,2);
    h = plot(xt,yh,'b',x,y,'k.');
    set(h,'MarkerSize',8);
    set(h,'LineWidth',1.2);
    xlabel('Input x');
    ylabel('Output y');
    st = sprintf('Motorcycle Data Kernel Smoothing Epanechnikov Kernel w=%6.4f',w);
    title(st);
    set(gca,'Box','Off');
    grid on;
    axis([-10 70 -150 90]);

[Figure — Example 15: Kernel Smoothing. "Motorcycle Data Kernel Smoothing Gaussian Kernel w = 5.0000"; Output y vs. Input x.]


Local Averaging Concept

[Figure — "Motorcycle Data Set"; Head Acceleration (g) vs. Time (ms), with the neighborhood of a query point and its local average highlighted.]

    AxisSet(8);
    st = sprintf('print -depsc EKernelSmoothingEx%03d;',round(w*10));
    eval(st);
end;
% Gaussian Kernel
for cnt = 1:length(W),
    w = W(cnt);
    figure;
    FigureSet(1,'LTX');
    yh = Kernel(x,y,xt,w,1);
    h = plot(xt,yh,'b',x,y,'k.');
    set(h,'MarkerSize',8);
    set(h,'LineWidth',1.2);
    xlabel('Input x');
    ylabel('Output y');
    st = sprintf('Motorcycle Data Kernel Smoothing Gaussian Kernel w=%6.4f',w);
    title(st);
    set(gca,'Box','Off');
    grid on;
    axis([-10 70 -150 90]);
    AxisSet(8);
    st = sprintf('print -depsc GKernelSmoothingEx%03d;',round(w*10));
    eval(st);
end;

MATLAB Code

function [] = LocalAverageConcept();
D = load('Motorcycle.txt');
x = D(:,1);
y = D(:,2);
[x,is] = sort(x);
y = y(is);
q = 30; % Query point (test input)
k = 10; % No. of neighbors
d = (x-q).^2;
[ds,is] = sort(d);
xs = x(is);
ys = y(is);
xn = xs(1:k);
yn = ys(1:k);
[xsmin,imin] = min(xs(1:k));
[xsmax,imax] = max(xs(1:k));
imin = is(imin);
imax = is(imax);
xll = (x(imin) + x(imin-1))/2; % lower limit
xul = (x(imax) + x(imax+1))/2; % upper limit

Local Averaging

$$\hat{g}(x) = \sum_{i=1}^{n} w_i(x)\, y_i$$

• We saw that kernel smoothers can be viewed as a weighted average

• Instead, we could take a local average of the k-nearest neighbors of x

$$\hat{g}(x) = \frac{1}{k}\sum_{i=1}^{k} y_{c(i)}$$

where $c(i)$ is the data set index of the $i$th nearest point

• For this type of model, k controls the smoothness
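The k-nearest-neighbor average above can be sketched in a few lines of Python (illustrative data, not the motorcycle set; `local_average` is my own name):

```python
import numpy as np

def local_average(x, y, xt, k):
    """k-nearest-neighbor local average: g(x) = (1/k) * sum of the
    outputs y_c(i) of the k points nearest to each query point."""
    d = np.abs(xt[:, None] - x[None, :])   # distances to every data point
    idx = np.argsort(d, axis=1)[:, :k]     # indices c(1..k) of the nearest points
    return y[idx].mean(axis=1)

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 1.0, 4.0, 9.0, 16.0])

# Query at 2.1: the two nearest inputs are x=2 and x=3, so g = (4+9)/2 = 6.5
print(local_average(x, y, np.array([2.1]), k=2))   # -> [6.5]
```

Note that the output is piecewise constant in the query point, which is why $\hat{g}(x)$ is discontinuous, as discussed on the next slide.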


Local Averaging Discussion

$$\hat{g}(x) = \frac{1}{k}\sum_{i=1}^{k} y_{c(i)}$$

• Local averaging has a number of disadvantages
  – The data set must be stored in memory (this is essentially true for kernel smoothers and smoothing splines also)
  – The output $\hat{g}(x)$ is discontinuous
  – Finding the k nearest neighbors can be computationally expensive

rg = max(y)-min(y);
ymin = min(y)-0.1*rg;
ymax = max(y)+0.1*rg;
xbox = [xll xul xul xll];
ybox = [ymin ymin ymax ymax];
yav = mean(yn)*[1 1];
xav = [xll xul];
A = [xn ones(k,1)];
b = yn;
v = pinv(A)*b;
xl1 = 0;
yl1 = [xl1 1]*v;
xl2 = 1.5;
yl2 = [xl2 1]*v;
xll = [xl1 xl2];
yll = [yl1 yl2];
figure;
FigureSet(1,'LTX');
h = patch(xbox,ybox,'g');
set(h,'FaceColor',.8*[1 1 1]);
set(h,'EdgeColor',.8*[1 1 1]);
hold on;
h = plot(x,y,'k.');
set(h,'MarkerSize',8);

[Figure — Example 16: Local Averaging. "Motorcycle Data Set k=2"; Head Acceleration (g) vs. Time (ms).]

h = plot(xav,yav,'r-');
set(h,'LineWidth',1.5);
%h = plot(xll,yll,'b:');
h = plot(q*[1 1],[ymin ymax],'b--');
set(h,'LineWidth',1.5);
hold off;
rg = max(y)-min(y);
axis([min(x) max(x) ymin ymax]);
xlabel('Time (ms)');
ylabel('Head Acceleration (g)');
title('Motorcycle Data Set');
set(gca,'Layer','top');
set(gca,'Box','off');
AxisSet(8);
print -depsc LocalAverageConcept;


[Figure — Example 16: Local Averaging. "Motorcycle Data Set k=20"; Head Acceleration (g) vs. Time (ms).]

[Figure — Example 16: Local Averaging. "Motorcycle Data Set k=5"; Head Acceleration (g) vs. Time (ms).]

[Figure — Example 16: Local Averaging. "Motorcycle Data Set k=50"; Head Acceleration (g) vs. Time (ms).]

[Figure — Example 16: Local Averaging. "Motorcycle Data Set k=10"; Head Acceleration (g) vs. Time (ms).]


Weighted Local Averaging

$$\hat{g}(x) = \frac{\sum_{i=1}^{k} b_k(|x - x_{c(i)}|)\, y_{c(i)}}{\sum_{i=1}^{k} b_k(|x - x_{c(i)}|)} = \frac{\sum_{i=1}^{k} b_i\, y_{c(i)}}{\sum_{i=1}^{k} b_i}$$

• Local averaging can be tweaked to produce a continuous $\hat{g}(x)$

• We simply take a weighted average where $b(u)$ is a smoothly decreasing function of the distance

• We can use our familiar (non-negative) kernels to achieve this

• My favorite is the biweight function

$$b_i = \left( 1 - \frac{d_i^2}{d_{k+1}^2} \right)^2$$

where $d_i = |x - x_{c(i)}|$ is the distance between the input and the $i$th nearest neighbor
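A Python sketch of this weighted k-nearest-neighbor average with the biweight bump, where the width is set by the distance to the (k+1)th neighbor as in the formula above (synthetic data; the function name is my own):

```python
import numpy as np

def weighted_local_average(x, y, xt, k):
    """Weighted local average with biweight bumps
    b_i = (1 - d_i^2 / d_{k+1}^2)^2 over the k nearest neighbors."""
    out = np.empty(len(xt))
    for j, q in enumerate(xt):
        d = np.abs(q - x)
        order = np.argsort(d)
        dn = d[order[:k]]              # distances to the k nearest neighbors
        dmax = d[order[k]]             # distance to the (k+1)th neighbor
        b = (1 - (dn / dmax)**2)**2    # biweight weights; 0 at the (k+1)th
        out[j] = np.sum(b * y[order[:k]]) / np.sum(b)
    return out

x = np.linspace(0, 10, 50)
y = x**2
g = weighted_local_average(x, y, np.array([5.0]), k=4)
```

Because the weights fall smoothly to zero exactly as a neighbor drops out of the k-nearest set, the estimate varies continuously with the query point, unlike the plain local average.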

Example 16: MATLAB Code

function [] = LocalAverageFit();
close all;
D = load('Motorcycle.txt');
x = D(:,1);
y = D(:,2);
[x,is] = sort(x);
y = y(is);
xt = (-10:0.1:70)';
yh = zeros(size(xt));
K = [2 5 10 20 50];
for c = 1:length(K),
    k = K(c);
    for cnt = 1:length(xt),
        d = (x-xt(cnt)).^2;
        [ds,is] = sort(d);
        xs = x(is);
        ys = y(is);
        xn = xs(1:k);
        yn = ys(1:k);

[Figure — Example 17: Local Averaging Weighting Functions. "Weighted Averaging Weighting Functions"; weighting function vs. distance u ∈ [0, 1.2] for the Epanechnikov, Biweight, Triweight, and Triangular kernels.]

        yh(cnt) = mean(yn);
    end;
    figure;
    FigureSet(1,'LTX');
    h = plot(x,y,'k.');
    set(h,'MarkerSize',8);
    hold on;
    h = stairs(xt,yh,'b');
    set(h,'LineWidth',1.2);
    hold off;
    axis([-10 70 -150 90]);
    xlabel('Time (ms)');
    ylabel('Head Acceleration (g)');
    st = sprintf('Motorcycle Data Set k=%d',k);
    title(st);
    set(gca,'Layer','top');
    set(gca,'Box','off');
    AxisSet(8);
    st = sprintf('print -depsc LocalAverageEx%02d;',k);
    eval(st);
end;


[Figure — Example 17: Weighted Averaging. "Motorcycle Data Set k=10"; Head Acceleration (g) vs. Time (ms).]

[Figure — Example 17: Weighted Averaging. "Motorcycle Data Set k=2"; Head Acceleration (g) vs. Time (ms).]

[Figure — Example 17: Weighted Averaging. "Motorcycle Data Set k=20"; Head Acceleration (g) vs. Time (ms).]

[Figure — Example 17: Weighted Averaging. "Motorcycle Data Set k=5"; Head Acceleration (g) vs. Time (ms).]


    w = (1-(dn/dmax)).^2; % biweight: dn and dmax are squared distances
    yt(cnt) = sum(w.*yn)/sum(w);
end;

[Figure — Example 17: Weighted Averaging. "Motorcycle Data Set k=50"; Head Acceleration (g) vs. Time (ms).]

Weighted Local Averaging Comments

$$\hat{g}(x) = \frac{\sum_{i=1}^{k} b_k(|x - x_{c(i)}|)\, y_{c(i)}}{\sum_{i=1}^{k} b_k(|x - x_{c(i)}|)}$$

• Like kernel smoothers, weighted local averaging models are stable (bounded)

$$\min_{i \in 1,\dots,k} y_{c(i)} \le \hat{g}(x) \le \max_{i \in 1,\dots,k} y_{c(i)}$$

• The key difference here is that the kernel width is determined by the distance to the (k+1)th nearest neighbor

• This is advantageous
  – In regions of dense data, the equivalent kernel width shrinks
  – In regions of sparse data, the equivalent kernel width expands

Example 17: MATLAB Code

function [yt] = WeightedAverage(x,y,xt,k);
xarg = x;
yarg = y;
x = unique(xarg);
y = zeros(size(x));
for cnt = 1:length(x),
    y(cnt) = mean(yarg(xarg==x(cnt)));
end;
yt = zeros(length(xt),1);
[Np,Ni] = size(x);
for cnt = 1:length(xt),
    d = zeros(Np,1);
    for cnt2 = 1:Ni,
        d = d + (x(:,cnt2)-xt(cnt,cnt2)).^2;
    end;
    [ds,is] = sort(d);
    xs = x(is);
    ys = y(is);
    dn = ds(1:k);
    dmax = ds(k+1);
    xn = xs(1:k);
    yn = ys(1:k);


Local Model Consistency

• Under general assumptions, it can be shown that smoothing splines, kernel smoothers, and local models are consistent

• Consistency means that as $n \to \infty$, if the following conditions are satisfied
  – $\int |b_\sigma(u)|\, du < \infty$
  – $\lim_{u \to \infty} u\, b_\sigma(u) = 0$
  – $E[\varepsilon_i^2] < \infty$
  – As $n \to \infty$, $\sigma \to 0$ and $n\sigma \to \infty$

then at every point where $g(x)$ and $f(x)$ are continuous with $f(x) > 0$,

$$\hat{g}(x) \to g(x)$$

with probability 1.

Local Model Optimality

It can be shown that for fixed weighting functions, $w(x)$, both kernel smoothers and weighted local averaging models minimize the weighted average squared error

$$\mathrm{ASE} \equiv \frac{1}{n}\sum_{i=1}^{n} (y_i - g(x))^2\, b(|x - x_i|)$$

$$\frac{d\,\mathrm{ASE}}{d\,g(x)} \propto \sum_{i=1}^{n} (y_i - g(x))\, b(|x - x_i|) = 0$$

$$0 = \sum_{i=1}^{n} y_i\, b(|x - x_i|) - \sum_{i=1}^{n} g(x)\, b(|x - x_i|) = \sum_{i=1}^{n} y_i\, b(|x - x_i|) - g(x)\sum_{i=1}^{n} b(|x - x_i|)$$

$$g(x) = \frac{\sum_{i=1}^{n} y_i\, b(|x - x_i|)}{\sum_{i=1}^{n} b(|x - x_i|)}$$

Bias-Variance Tradeoff

$$\mathrm{PE} \equiv E\!\left[(y - \hat{g}(x))^2\right] = \underbrace{(g(x) - E[\hat{g}(x)])^2}_{\text{Bias}^2} + \underbrace{E\!\left[(\hat{g}(x) - E[\hat{g}(x)])^2\right]}_{\text{Variance}}$$

• For each, we discussed a bias-variance tradeoff
  – Less smooth ⇒ More variance and less bias
  – More smooth ⇒ Less variance and more bias

• Recall that the prediction error can be written as shown above

• The expectation is taken over the distribution of data sets used to construct $\hat{g}(x)$

• Conceptually, this can be plotted

Local Model Optimality Continued

$$\mathrm{ASE} \equiv \frac{1}{n}\sum_{i=1}^{n} (y_i - g(x))^2\, b(|x - x_i|) \qquad g^*(x) = \frac{\sum_{i=1}^{n} y_i\, b(|x - x_i|)}{\sum_{i=1}^{n} b(|x - x_i|)}$$

• Thus, we have an alternative derivation of kernel smoothers and weighted local averaging models

• They are the models that minimize the weighted ASE

• The only difference between kernel smoothers, local averaging models, and weighted local averaging models are the weighting functions, $b(\cdot)$


Model Selection Continued

• The prediction error can then be estimated by

$$\mathrm{PE} \approx r\!\left(\frac{p}{n}\right) \times \mathrm{ASE}$$

where $r(\cdot)$ is a function that adjusts the ASE to be a more accurate estimate of the PE

• A number of different functions $r(\cdot)$ have been proposed
  – Final Prediction Error: $r(u) = \frac{1+u}{1-u}$
  – Schwarz's Criterion: $r(u) = 1 + \frac{\log_e n}{2}\,\frac{u}{1-u}$
  – Generalized CVE: $r(u) = \frac{1}{(1-u)^2}$
  – Shibata's Model Selector: $r(u) = 1 + 2u$

• In each case $u = \frac{p}{n}$
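The proposed corrections are easy to compare side by side; a small Python sketch (my own, not from the slides):

```python
import math

def fpe(u):            # Final Prediction Error
    return (1 + u) / (1 - u)

def schwarz(u, n):     # Schwarz's criterion (needs the sample size n)
    return 1 + (math.log(n) / 2) * u / (1 - u)

def gcv(u):            # Generalized cross-validation
    return 1 / (1 - u)**2

def shibata(u):        # Shibata's model selector
    return 1 + 2*u

# FPE, GCV, and Shibata all agree to first order: r(u) = 1 + 2u + O(u^2),
# so they behave similarly when p << n and diverge as u = p/n grows.
u = 0.1
print(fpe(u), gcv(u), shibata(u), schwarz(u, n=100))
```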

Bias-Variance Tradeoff Continued

[Figure — conceptual sketch: Prediction Error vs. Smoothness, with its Bias and Variance components.]

• Our goal is to minimize the prediction error

• How do we choose the best smoothing parameter?

• All of the methods we discussed had a single parameter that controlled smoothness
  – Smoothing splines had a smoothness penalty parameter λ
  – Kernel methods had the bump width σ
  – Local averaging models had the number of neighbors k

• How do we pick the best smoothness?

• We would like an accurate estimate of the prediction error

Resampling Techniques

• It is also possible to use resampling techniques
  – N-Fold Cross-Validation: Divide the data set into N different sets. Pick the first set as the test set and build the model using the remaining N − 1 sets of points. Calculate the ASE on the test set. Repeat for all of the sets and average all N estimates of the ASE.
  – Leave-one-out Cross-Validation: Same as above with N = n.
  – Bootstrap: Select n points from the data set with replacement and calculate the ASE.

• I personally prefer to use leave-one-out CVE

• A study I conducted last spring with weighted averaging models indicated that CVE and Generalized CVE were the best (for that type of model)
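Leave-one-out cross-validation can be sketched as follows (Python, with a k-NN local average standing in for the smoother and synthetic data; all names are mine):

```python
import numpy as np

def knn_predict(x, y, q, k):
    """k-NN local average prediction at a single query point q."""
    idx = np.argsort(np.abs(x - q))[:k]
    return y[idx].mean()

def loo_cve(x, y, k):
    """Leave-one-out CV error: hold out each point, predict it from
    the remaining n-1 points, and average the squared errors."""
    errs = []
    for i in range(len(x)):
        mask = np.ones(len(x), dtype=bool)
        mask[i] = False                       # leave point i out
        pred = knn_predict(x[mask], y[mask], x[i], k)
        errs.append((y[i] - pred)**2)
    return np.mean(errs)

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 10, 80))
y = np.sin(x) + 0.2 * rng.standard_normal(80)

# Sweep the smoothness parameter and keep the k with the smallest CVE
cve = {k: loo_cve(x, y, k) for k in (1, 3, 5, 10, 40)}
best_k = min(cve, key=cve.get)
```

The CVE curve is typically U-shaped in the smoothness parameter: very small k overfits the noise and very large k oversmooths, mirroring the bias-variance tradeoff above.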

Model Selection

• How do we estimate the prediction error with only one data set?

• The ASE won't work: it monotonically decreases as the smoothness decreases

• All of our smoothers can be written as

$$\hat{g}(x) = \hat{y} = H(x)\, y$$

for a given input vector $x$

• This is very similar to the hat matrix of linear models, except now the $H$ matrix is a function of $x$

• The equivalent degrees of freedom can be estimated by

$$p \approx \mathrm{trace}(H H^T)$$
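For a kernel smoother evaluated at the data points, each row of H holds the normalized weights $w_i$, so the slide's trace estimate can be computed directly. A hedged Python sketch on synthetic inputs (Gaussian kernel; names are mine):

```python
import numpy as np

def smoother_matrix(x, sigma):
    """H for a Gaussian-kernel smoother at the data points: row i holds
    the normalized weights used to predict y at x_i, so yhat = H @ y."""
    d = x[:, None] - x[None, :]
    b = np.exp(-d**2 / (2 * sigma**2))
    return b / b.sum(axis=1, keepdims=True)

x = np.linspace(0, 10, 50)

H_narrow = smoother_matrix(x, 0.01)   # nearly the identity matrix
H_wide = smoother_matrix(x, 1e3)      # every row nearly uniform (1/n)

p_narrow = np.trace(H_narrow @ H_narrow.T)   # close to n = 50 (interpolation)
p_wide = np.trace(H_wide @ H_wide.T)         # close to 1 (global average)
```

The two extremes match intuition: an interpolating smoother spends about n degrees of freedom, while a global average spends about one.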


Weighted Least Squares

$$\mathrm{ASE}_b = \frac{1}{n}\sum_{i=1}^{n} b_i^2\, (y_i - \hat{y}_i)^2 = \frac{1}{n}\,(y - Aw)^T B^T B\,(y - Aw)$$

where

$$B = \begin{bmatrix} b_1 & 0 & \dots & 0 \\ 0 & b_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & b_n \end{bmatrix}$$

• When we discussed linear models, we found the weights that minimized the average squared error

• We can generalize this easily to find the best linear model that minimizes the weighted ASE

[Figure — Example: Weighted Averaging CVE. "Local Averaging Cross-Validation Error"; CVE vs. number of neighbors (k), 0 to 30.]

Weighted Least Squares Continued

$$\mathrm{ASE}_b = \frac{1}{n}\,(y - Aw)^T B^T B\,(y - Aw)$$

• This can be framed as a typical (unweighted) least squares problem if we add the following definitions

$$A_b \equiv BA \qquad y_b \equiv By$$

• Then the $\mathrm{ASE}_b$ can be written as

$$\mathrm{ASE}_b = \frac{1}{n}\,(y_b - A_b w)^T (y_b - A_b w)$$

which has the known optimal least squares solution

$$\hat{w} = (A_b^T A_b)^{-1} A_b^T y_b$$

• Now we can easily generalize kernel methods and local averaging models to create localized linear models

• We merely specify the weights so that points near the input have the most influence on the model output
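The reduction to ordinary least squares can be sketched in Python (synthetic data; the biweight-style weighting around x = 0.5 is an arbitrary illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
x = np.sort(rng.uniform(0, 1, n))
y = 2.0 + 3.0 * x + 0.1 * rng.standard_normal(n)

A = np.column_stack([x, np.ones(n)])        # local linear design matrix
b = (1 - (np.abs(x - 0.5) / 0.6)**2)**2     # weights emphasizing x near 0.5
B = np.diag(b)

# Weighted problem rewritten as ordinary least squares on Ab = B A, yb = B y
Ab, yb = B @ A, B @ y
w = np.linalg.solve(Ab.T @ Ab, Ab.T @ yb)   # (Ab' Ab)^{-1} Ab' yb

# Equivalent direct normal equations for the weighted problem
w2 = np.linalg.solve(A.T @ B.T @ B @ A, A.T @ B.T @ B @ y)
```

Both routes give the same coefficients, which is the point of the reduction: any ordinary least squares solver handles the weighted problem once the rows are scaled by B.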

[Figure — Example: Weighted Averaging CVE. "Motorcycle Data Local Averaging k_opt = 11"; Head Acceleration (g) vs. Time (ms).]


Example 18: Local Linear Model

Time (ms)

Hea

d A

ccel

erat

ion

(g)

Motorcycle Data Set k=10

−10 0 10 20 30 40 50 60 70−150

−100

−50

0

50

J. McNames Portland State University ECE 4/557 Univariate Smoothing Ver. 1.25 139

Example 18: Local Linear Model

Time (ms)

Hea

d A

ccel

erat

ion

(g)

Motorcycle Data Set k=2

−10 0 10 20 30 40 50 60 70−150

−100

−50

0

50

J. McNames Portland State University ECE 4/557 Univariate Smoothing Ver. 1.25 137

Example 18: Local Linear Model

Time (ms)

Hea

d A

ccel

erat

ion

(g)

Motorcycle Data Set k=20

−10 0 10 20 30 40 50 60 70−150

−100

−50

0

50

J. McNames Portland State University ECE 4/557 Univariate Smoothing Ver. 1.25 140

Example 18: Local Linear Model

[Figure: Motorcycle Data Set, local linear model with k=5; Time (ms) vs. Head Acceleration (g)]

Example 18: Local Linear Model

[Figure: Motorcycle Data Set, local linear model with k=93; Time (ms) vs. Head Acceleration (g)]

Example 18: Local Linear Model

[Figure: Motorcycle Data Set, local linear model with k=30; Time (ms) vs. Head Acceleration (g)]

Example 18: MATLAB Code

function [] = LocalLinearEx();
    close all;
    D = load('Motorcycle.txt');
    x = D(:,1);
    y = D(:,2);
    [x,is] = sort(x);
    y  = y(is);
    xt = (-10:0.05:70)';
    k  = [2 5 10 20 30 50 93];
    for cnt = 1:length(k),
        yh = LocalLinear(x,y,xt,k(cnt));
        figure;
        FigureSet(1,'LTX');
        h = plot(x,y,'k.',xt,yh,'b');
        set(h(1),'MarkerSize',8);
        set(h(2),'LineWidth',1.2);
        axis([-10 70 -150 90]);
        xlabel('Time (ms)');
        ylabel('Head Acceleration (g)');
        st = sprintf('Motorcycle Data Set k=%d',k(cnt));
        title(st);
        set(gca,'Layer','top');

Example 18: Local Linear Model

[Figure: Motorcycle Data Set, local linear model with k=50; Time (ms) vs. Head Acceleration (g)]

Univariate Smoothing Summary Continued

• All of the smoothing methods had a single parameter that controls the smoothness of the model

• For each, we discussed a bias-variance tradeoff

– Less smooth ⇒ More variance and less bias

– More smooth ⇒ Less variance and more bias

• We discussed several methods of estimating the “true” prediction error of the model

– Some of the methods were simple modifications of the ASE
– Other methods were based on resampling (cross-validation & the bootstrap)

• Most can be generalized to the multivariate case

set(gca ,’Box’,’off’);

AxisSet (8);

st = sprintf(’print -depsc LocalLinearEx %02d;’,k(cnt ));

eval(st);

end;
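The LocalLinear helper called in the listing above is not shown on the slides. A minimal Python sketch (not MATLAB, and only one plausible variant: an unweighted line fit to the k nearest neighbors of each query point) of what such a function might compute:

```python
def local_linear(x, y, xt, k):
    """For each query point in xt, fit a line by least squares to the
    k training points nearest in x, and evaluate the line at the query."""
    yh = []
    for q in xt:
        # Indices of the k training points nearest to the query
        idx = sorted(range(len(x)), key=lambda i: abs(x[i] - q))[:k]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        n, sx, sy = len(xs), sum(xs), sum(ys)
        sxx = sum(v * v for v in xs)
        sxy = sum(a * b for a, b in zip(xs, ys))
        det = n * sxx - sx * sx
        if det == 0:  # degenerate neighborhood (all x equal): use the mean
            yh.append(sy / n)
        else:
            w1 = (n * sxy - sx * sy) / det  # slope
            w0 = (sy - w1 * sx) / n         # intercept
            yh.append(w0 + w1 * q)
    return yh
```

The weighted version discussed earlier would replace the plain least squares fit in the loop with the weighted solution w = (A_b^T A_b)^{-1} A_b^T y_b.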

Univariate Smoothing Summary

• We discussed four methods of interpolation

– Linear Interpolation

– Nearest Neighbor Interpolation

– Polynomial Interpolation

– Cubic Spline Interpolation

• We discussed six methods of univariate smoothing

– Polynomial regression (generalization of linear models)

– Smoothing splines

– Kernel smoothing

– Local averaging

– Weighted local averaging

– Local linear models (weighted)

• Discussed one method of density estimation based on kernels
