Chem 302 - Math 252
Chapter 5: Regression
Linear & Nonlinear Regression
• Linear regression
  – Linear in the parameters
  – Does not have to be linear in the independent variable(s)
  – Can be solved through a system of linear equations
  – Examples: $y = a_0 + a_1 x$, $y = a_0 + a_1 x + a_2 x^2$, $y = a_0 + a_1 e^{x}$
• Nonlinear regression
  – Nonlinear in the parameters
  – Usually requires linearization and iteration
  – Examples: $y = a_0 e^{a_1 x}$, $y = a_0 + a_1 x^{a_2}$
Linear Least-Squares Regression
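Data: $\{x_i, y_i\}_{i=1}^{n}$
Residual: $\varepsilon_i = y_{i,\mathrm{obs}} - y_{i,\mathrm{calc}}$
Sum of Square Residuals: $Z = \sum_{i=1}^{n} \varepsilon_i^2$
Want to minimize Z, where $y_{i,\mathrm{calc}} = f(x_i; \{a_m\})$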
Linear Least-Squares Regression
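Data: $\{x_i, y_i\}_{i=1}^{n}$, with $y_{i,\mathrm{calc}} = a_0 + a_1 x_{i,\mathrm{obs}}$

$$\varepsilon_i = y_{i,\mathrm{obs}} - y_{i,\mathrm{calc}} = y_{i,\mathrm{obs}} - a_0 - a_1 x_{i,\mathrm{obs}}$$

$$Z = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left(y_{i,\mathrm{obs}} - a_0 - a_1 x_{i,\mathrm{obs}}\right)^2$$

At the min:

$$\frac{\partial Z}{\partial a_0} = 0, \qquad \frac{\partial Z}{\partial a_1} = 0$$

Derivative with respect to $a_0$:

$$\frac{\partial Z}{\partial a_0} = 0 = -2 \sum_{i=1}^{n} \left(y_{i,\mathrm{obs}} - a_0 - a_1 x_{i,\mathrm{obs}}\right)$$

$$\sum_{i=1}^{n} y_i - a_0 \sum_{i=1}^{n} 1 - a_1 \sum_{i=1}^{n} x_i = 0$$

$$s_y - a_0 n - a_1 s_x = 0 \quad\Rightarrow\quad a_0 n + a_1 s_x = s_y$$

Derivative with respect to $a_1$:

$$\frac{\partial Z}{\partial a_1} = 0 = -2 \sum_{i=1}^{n} \left(y_{i,\mathrm{obs}} - a_0 - a_1 x_{i,\mathrm{obs}}\right) x_{i,\mathrm{obs}}$$

$$a_0 s_x + a_1 s_{xx} = s_{xy}$$

where

$$s_x = \sum_{i=1}^{n} x_{i,\mathrm{obs}}, \quad s_y = \sum_{i=1}^{n} y_{i,\mathrm{obs}}, \quad s_{xx} = \sum_{i=1}^{n} x_{i,\mathrm{obs}}^2, \quad s_{xy} = \sum_{i=1}^{n} x_{i,\mathrm{obs}}\, y_{i,\mathrm{obs}}$$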
Linear Least-Squares Regression
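$$\begin{pmatrix} n & s_x \\ s_x & s_{xx} \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} = \begin{pmatrix} s_y \\ s_{xy} \end{pmatrix}$$

$$a_0 = \frac{s_y s_{xx} - s_{xy} s_x}{n s_{xx} - s_x^2}, \qquad a_1 = \frac{n s_{xy} - s_x s_y}{n s_{xx} - s_x^2}$$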
Linear Regression.mws
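Example: data points (1.00, 3.0), (2.00, 6.0), (3.00, 7.0), (4.00, 10.0)

$$n = 4$$
$$s_x = 1.00 + 2.00 + 3.00 + 4.00 = 10.00$$
$$s_y = 3.0 + 6.0 + 7.0 + 10.0 = 26.0$$
$$s_{xx} = 1.00 + 4.00 + 9.00 + 16.00 = 30.00$$
$$s_{xy} = 3.0 + 12.0 + 21.0 + 40.0 = 76.0$$

$$a_0 = \frac{s_y s_{xx} - s_{xy} s_x}{n s_{xx} - s_x^2} = \frac{26.0(30.00) - 76.0(10.00)}{4(30.00) - 10.00^2} = 1.0$$

$$a_1 = \frac{n s_{xy} - s_x s_y}{n s_{xx} - s_x^2} = \frac{4(76.0) - 10.00(26.0)}{4(30.00) - 10.00^2} = 2.20$$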
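The closed-form expressions for $a_0$ and $a_1$ can be checked numerically. The following is a minimal Python sketch (the slides themselves use Maple worksheets); the function name `fit_line` is an illustrative choice, not something from the course material.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = a0 + a1*x from the closed-form sums."""
    n = len(xs)
    s_x = sum(xs)
    s_y = sum(ys)
    s_xx = sum(x * x for x in xs)
    s_xy = sum(x * y for x, y in zip(xs, ys))
    denom = n * s_xx - s_x ** 2          # common denominator n*s_xx - s_x^2
    a0 = (s_y * s_xx - s_xy * s_x) / denom
    a1 = (n * s_xy - s_x * s_y) / denom
    return a0, a1


# Data from the example above
xs = [1.00, 2.00, 3.00, 4.00]
ys = [3.0, 6.0, 7.0, 10.0]

a0, a1 = fit_line(xs, ys)
print(f"a0 = {a0:.2f}, a1 = {a1:.2f}")   # prints: a0 = 1.00, a1 = 2.20
```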
Linear Least-Squares Regression: Uncertainties in Parameters
Linear Regression.mws
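$$\sigma_{a_0}^2 = \sum_{i=1}^{n} \left(\frac{\partial a_0}{\partial y_i}\right)^2 \sigma_{y_i}^2, \qquad \frac{\partial a_0}{\partial y_i} = \frac{s_{xx} - s_x x_i}{n s_{xx} - s_x^2}$$

Taking the same uncertainty $\sigma_y$ for every point:

$$\sigma_{a_0}^2 = \sigma_y^2 \sum_{i=1}^{n} \frac{\left(s_{xx} - s_x x_i\right)^2}{\left(n s_{xx} - s_x^2\right)^2} = \frac{\sigma_y^2}{\left(n s_{xx} - s_x^2\right)^2} \sum_{i=1}^{n} \left(s_{xx}^2 - 2 s_{xx} s_x x_i + s_x^2 x_i^2\right)$$

$$= \sigma_y^2\, \frac{n s_{xx}^2 - 2 s_{xx} s_x^2 + s_x^2 s_{xx}}{\left(n s_{xx} - s_x^2\right)^2} = \sigma_y^2\, \frac{s_{xx}\left(n s_{xx} - s_x^2\right)}{\left(n s_{xx} - s_x^2\right)^2} = \frac{\sigma_y^2\, s_{xx}}{n s_{xx} - s_x^2}$$

With

$$\sigma_y^2 = \frac{Z}{n - m}$$

and $m = 2$ parameters:

$$\sigma_{a_0}^2 = \frac{Z}{n - 2}\, \frac{s_{xx}}{n s_{xx} - s_x^2}$$

$$\sigma_{a_1}^2 = \sum_{i=1}^{n} \left(\frac{\partial a_1}{\partial y_i}\right)^2 \sigma_{y_i}^2 = \frac{Z}{n - 2}\, \frac{n}{n s_{xx} - s_x^2}$$

Example: $Z = 0.80$

$$\sigma_{a_0}^2 = \frac{Z}{n - 2}\, \frac{s_{xx}}{n s_{xx} - s_x^2} = \frac{0.80}{4 - 2} \cdot \frac{30.00}{4(30.00) - 10.00^2} = 0.6, \qquad \sigma_{a_0} = 0.8$$

$$\sigma_{a_1} = 0.28$$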
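As a numerical check of the example, the sketch below recomputes Z and the two parameter uncertainties in Python from the same four data points. The function name and return layout are illustrative assumptions; the formulas are the ones derived on this slide with m = 2.

```python
import math


def line_fit_with_uncertainty(xs, ys):
    """Fit y = a0 + a1*x and return (a0, a1, Z, sigma_a0, sigma_a1)."""
    n = len(xs)
    s_x, s_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs)
    s_xy = sum(x * y for x, y in zip(xs, ys))
    denom = n * s_xx - s_x ** 2
    a0 = (s_y * s_xx - s_xy * s_x) / denom
    a1 = (n * s_xy - s_x * s_y) / denom
    # Sum of squared residuals
    Z = sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))
    var_y = Z / (n - 2)                  # sigma_y^2 = Z / (n - m), m = 2
    sigma_a0 = math.sqrt(var_y * s_xx / denom)
    sigma_a1 = math.sqrt(var_y * n / denom)
    return a0, a1, Z, sigma_a0, sigma_a1


xs = [1.00, 2.00, 3.00, 4.00]
ys = [3.0, 6.0, 7.0, 10.0]
a0, a1, Z, sigma_a0, sigma_a1 = line_fit_with_uncertainty(xs, ys)
print(f"Z = {Z:.2f}, sigma_a0 = {sigma_a0:.2f}, sigma_a1 = {sigma_a1:.2f}")
```

The unrounded value of $\sigma_{a_0}$ is $\sqrt{0.6} \approx 0.77$, which the slide reports to one significant figure as 0.8.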
Linear Least-Squares Regression
Linear Regression.mws
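Regression on “y”
Treat x as y and y as x

Data: (1.00, 3.0), (2.00, 6.0), (3.00, 7.0), (4.00, 10.0)

$$n = 4$$
$$s_x = 3.0 + 6.0 + 7.0 + 10.0 = 26.0$$
$$s_y = 1.00 + 2.00 + 3.00 + 4.00 = 10.00$$
$$s_{xx} = 9.0 + 36.0 + 49.0 + 100.0 = 194.0$$
$$s_{xy} = 3.0 + 12.0 + 21.0 + 40.0 = 76.0$$

$$a_0 = \frac{s_y s_{xx} - s_{xy} s_x}{n s_{xx} - s_x^2} = -0.36, \qquad a_1 = \frac{n s_{xy} - s_x s_y}{n s_{xx} - s_x^2} = 0.44$$

In terms of the original variables:

$$x = 0.44\,y - 0.36 \quad\Rightarrow\quad y = \frac{x + 0.36}{0.44} = 2.27\,x + 0.82$$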
Choose x as variable with smallest error
Can also be determined by equation
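The effect of swapping the roles of x and y can be reproduced with the same closed-form fit. This is a small Python sketch; `fit_line` and the variable names `b0`, `b1`, `slope`, `intercept` are illustrative, not from the slides.

```python
def fit_line(u, v):
    """Least-squares line v = c0 + c1*u (u is treated as the error-free variable)."""
    n = len(u)
    s_u, s_v = sum(u), sum(v)
    s_uu = sum(ui * ui for ui in u)
    s_uv = sum(ui * vi for ui, vi in zip(u, v))
    denom = n * s_uu - s_u ** 2
    return (s_v * s_uu - s_uv * s_u) / denom, (n * s_uv - s_u * s_v) / denom


xs = [1.00, 2.00, 3.00, 4.00]
ys = [3.0, 6.0, 7.0, 10.0]

# Regress x on y: x = b0 + b1*y
b0, b1 = fit_line(ys, xs)

# Invert to compare with the ordinary fit y = 1.0 + 2.20 x
slope = 1.0 / b1
intercept = -b0 / b1

print(f"x = {b0:.2f} + {b1:.2f} y")
print(f"y = {intercept:.2f} + {slope:.2f} x")
```

The two regressions give different lines (slope 2.27 against 2.20) because each one minimizes residuals in a different variable.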
Linear Least-Squares Regression
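Data: $\{x_i, y_i\}_{i=1}^{n}$

$$y_{i,\mathrm{calc}} = \sum_{k=1}^{m} a_k f_k(x_i)$$

$$Z = \sum_{i=1}^{n} \left(y_{i,\mathrm{obs}} - \sum_{k=1}^{m} a_k f_k(x_i)\right)^2$$

At the min:

$$\frac{\partial Z}{\partial a_j} = 0 = -2 \sum_{i=1}^{n} \left(y_{i,\mathrm{obs}} - \sum_{k=1}^{m} a_k f_k(x_i)\right) f_j(x_i)$$

$$\sum_{i=1}^{n} y_{i,\mathrm{obs}} f_j(x_i) - \sum_{i=1}^{n} \sum_{k=1}^{m} a_k f_k(x_i) f_j(x_i) = 0$$

$$\sum_{k=1}^{m} a_k \sum_{i=1}^{n} f_k(x_i) f_j(x_i) = \sum_{i=1}^{n} y_{i,\mathrm{obs}} f_j(x_i)$$

In matrix form

$$C A = D$$

$$C_{jk} = C_{kj} = \sum_{i=1}^{n} f_k(x_i) f_j(x_i), \qquad A_k = a_k, \qquad D_k = \sum_{i=1}^{n} y_{i,\mathrm{obs}} f_k(x_i)$$

$$A = C^{-1} D$$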
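The matrix formulation can be sketched in a few lines of Python. The code below builds C and D from a list of basis functions and solves CA = D by Gaussian elimination; the helper names `solve` and `linear_lsq` are illustrative assumptions, and in practice a linear-algebra library would normally replace the hand-written solver. The check uses the straight-line data from the earlier example with $f_1(x) = 1$ and $f_2(x) = x$.

```python
def solve(C, D):
    """Solve C A = D by Gaussian elimination with partial pivoting."""
    m = len(D)
    M = [row[:] + [d] for row, d in zip(C, D)]    # augmented matrix
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, m):
            factor = M[r][col] / M[col][col]
            for c in range(col, m + 1):
                M[r][c] -= factor * M[col][c]
    A = [0.0] * m
    for r in range(m - 1, -1, -1):                # back substitution
        tail = sum(M[r][c] * A[c] for c in range(r + 1, m))
        A[r] = (M[r][m] - tail) / M[r][r]
    return A


def linear_lsq(basis, xs, ys):
    """Return the parameters a_k of y = sum_k a_k f_k(x)."""
    m = len(basis)
    F = [[f(x) for f in basis] for x in xs]       # F[i][k] = f_k(x_i)
    C = [[sum(row[j] * row[k] for row in F) for k in range(m)] for j in range(m)]
    D = [sum(y * row[k] for row, y in zip(F, ys)) for k in range(m)]
    return solve(C, D)


# Straight line as a special case: f1(x) = 1, f2(x) = x
basis = [lambda x: 1.0, lambda x: x]
xs = [1.00, 2.00, 3.00, 4.00]
ys = [3.0, 6.0, 7.0, 10.0]
A = linear_lsq(basis, xs, ys)
print(A)
```

Other models that are linear in the parameters only need a different `basis` list, for example `[lambda T: 1.0, lambda T: 1.0 / T, math.log]` for a fit in 1, 1/T and ln T.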
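Example – Vapour Pressure of Cadmium

$$\ln p = a_1 + \frac{a_2}{T} + a_3 \ln T$$

$$y = \ln p, \qquad f_1(T) = 1, \qquad f_2(T) = \frac{1}{T}, \qquad f_3(T) = \ln T$$

$$C = \begin{pmatrix} 9.00 & 0.00156 & 60.04 \\ 0.00156 & 0.0000153 & 0.07679 \\ 60.04 & 0.07679 & 400.8 \end{pmatrix}$$

$$C^{-1} = \begin{pmatrix} 45882.2 & -4598324.9 & -5992.4 \\ -4598324.9 & 462691895.1 & 600209.4 \\ -5992.4 & 600209.4 & 782.7 \end{pmatrix}$$

$$D = \begin{pmatrix} 24.27 \\ 0.02605 \\ 165.8 \end{pmatrix}, \qquad A = C^{-1} D = \begin{pmatrix} 28.74 \\ -13449 \\ -1.315 \end{pmatrix}$$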
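Linear Least-Squares Regression: Uncertainties in Parameters

$$\sigma_{a_l}^2 = \sum_{i=1}^{n} \left(\frac{\partial a_l}{\partial y_i}\right)^2 \sigma_{y_i}^2$$

$$a_l = \sum_{k=1}^{m} C^{-1}_{lk} D_k \quad\Rightarrow\quad \frac{\partial a_l}{\partial y_i} = \sum_{k=1}^{m} C^{-1}_{lk} \frac{\partial D_k}{\partial y_i} = \sum_{k=1}^{m} C^{-1}_{lk} f_k(x_i)$$

$$\sigma_{a_l}^2 = \sigma_y^2 \sum_{i=1}^{n} \left(\sum_{k=1}^{m} C^{-1}_{lk} f_k(x_i)\right)^2 = \sigma_y^2 \sum_{i=1}^{n} \left(\sum_{j=1}^{m} C^{-1}_{lj} f_j(x_i)\right) \left(\sum_{k=1}^{m} C^{-1}_{lk} f_k(x_i)\right)$$

$$= \sigma_y^2 \sum_{j=1}^{m} \sum_{k=1}^{m} C^{-1}_{lj} C^{-1}_{lk} \sum_{i=1}^{n} f_j(x_i) f_k(x_i) = \sigma_y^2 \sum_{j=1}^{m} \sum_{k=1}^{m} C^{-1}_{lj} C^{-1}_{lk} C_{jk}$$

$$= \sigma_y^2 \sum_{k=1}^{m} C^{-1}_{lk} \sum_{j=1}^{m} C^{-1}_{lj} C_{jk} = \sigma_y^2 \sum_{k=1}^{m} C^{-1}_{lk} \delta_{lk} = \sigma_y^2\, C^{-1}_{ll}$$

With

$$\sigma_y^2 = \frac{Z}{n - m}$$

the result is

$$\sigma_{a_l}^2 = C^{-1}_{ll}\, \frac{Z}{n - m}$$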
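The result $\sigma_{a_l}^2 = C^{-1}_{ll} Z/(n-m)$ can be checked against the straight-line example, where the closed-form expressions gave $\sigma_{a_0}^2 = 0.6$ and $\sigma_{a_1}^2 = 0.08$. The Python sketch below is illustrative only: `invert` is a plain Gauss-Jordan inverse written for this example, and `fit_with_sigmas` is a hypothetical helper name.

```python
import math


def invert(C):
    """Gauss-Jordan inverse of a small square matrix (list of lists)."""
    m = len(C)
    M = [row[:] + [1.0 if i == j else 0.0 for j in range(m)]
         for i, row in enumerate(C)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(m):
            if r != col:
                factor = M[r][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [row[m:] for row in M]


def fit_with_sigmas(basis, xs, ys):
    """Return (A, sigmas, Z) for y = sum_k a_k f_k(x)."""
    m, n = len(basis), len(xs)
    F = [[f(x) for f in basis] for x in xs]
    C = [[sum(row[j] * row[k] for row in F) for k in range(m)] for j in range(m)]
    D = [sum(y * row[k] for row, y in zip(F, ys)) for k in range(m)]
    Cinv = invert(C)
    A = [sum(Cinv[l][k] * D[k] for k in range(m)) for l in range(m)]
    Z = sum((y - sum(a * f for a, f in zip(A, row))) ** 2
            for row, y in zip(F, ys))
    var_y = Z / (n - m)                           # sigma_y^2
    sigmas = [math.sqrt(Cinv[l][l] * var_y) for l in range(m)]
    return A, sigmas, Z


basis = [lambda x: 1.0, lambda x: x]
xs = [1.00, 2.00, 3.00, 4.00]
ys = [3.0, 6.0, 7.0, 10.0]
A, sigmas, Z = fit_with_sigmas(basis, xs, ys)
print(A, sigmas, Z)
```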
Nonlinear Least-Squares Regression
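Data: $\{x_i, y_i\}_{i=1}^{n}$

$$y_{i,\mathrm{calc}} = f(x_i; \beta_1, \beta_2, \ldots, \beta_m)$$

$$Z = \sum_{i=1}^{n} \left(y_{i,\mathrm{obs}} - y_{i,\mathrm{calc}}\right)^2$$

At the min:

$$\frac{\partial Z}{\partial \beta_j} = 0$$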
This results in a system of nonlinear equations
Linearize & solve iteratively
Need initial estimate of parameters
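In matrix form

$$C\,\Delta = D$$

$$C_{jk} = C_{kj} = \sum_{i=1}^{n} \left.\frac{\partial f}{\partial \beta_j}\right|_{x_i;\,\beta^{r}} \left.\frac{\partial f}{\partial \beta_k}\right|_{x_i;\,\beta^{r}}$$

$$D_k = \sum_{i=1}^{n} \left(y_{i,\mathrm{obs}} - y_{i,\mathrm{calc}}^{\,r}\right) \left.\frac{\partial f}{\partial \beta_k}\right|_{x_i;\,\beta^{r}}$$

$$\Delta = C^{-1} D, \qquad \beta_k^{\,r+1} = \beta_k^{\,r} + \Delta_k^{\,r}$$

Here $r$ is the iteration number and $\beta^{r}$ is the current estimate of the parameters.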
Uncertainty in parameters

$$\sigma_{\beta_k}^2 = m\, F_{m,\,n-m}\, C^{-1}_{kk}\, \frac{Z}{n - m}$$
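Nonlinear Least-Squares Regression - Example: Van der Waals parameters for nitrogen

$$p = \frac{RT}{V_m - b} - \frac{a}{V_m^2}$$

$$\frac{\partial p}{\partial a} = -\frac{1}{V_m^2}, \qquad \frac{\partial p}{\partial b} = \frac{RT}{\left(V_m - b\right)^2}$$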
p/atm   T/K      Vm/(L mol^-1)      p/atm   T/K      Vm/(L mol^-1)
1       223.15   18.28340           5       373.15   6.13064
5       223.15   3.63436            20      373.15   1.53844
10      223.15   1.80389            50      373.15   0.621118
20      223.15   0.889748           5       473.15   7.77970
1       273.15   22.4046            10      473.15   3.89744
10      273.15   2.23174            20      473.15   1.95651
20      273.15   1.11189            50      473.15   0.792572
50      273.15   0.44191
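The linearize-and-iterate scheme can be sketched for this two-parameter problem as follows. This is an illustrative Python version, not the course's Maple worksheet: the value of R, the starting guess, the convergence tolerance, and the names `p_calc`, `sum_sq` and `gauss_newton` are all assumptions made for the example. The 2×2 system CΔ = D is solved directly with the explicit inverse.

```python
R = 0.082057  # L atm mol^-1 K^-1 (assumed value of the gas constant)

# (p/atm, T/K, Vm/(L mol^-1)) from the table above
DATA = [
    (1, 223.15, 18.28340), (5, 223.15, 3.63436), (10, 223.15, 1.80389),
    (20, 223.15, 0.889748), (1, 273.15, 22.4046), (10, 273.15, 2.23174),
    (20, 273.15, 1.11189), (50, 273.15, 0.44191), (5, 373.15, 6.13064),
    (20, 373.15, 1.53844), (50, 373.15, 0.621118), (5, 473.15, 7.77970),
    (10, 473.15, 3.89744), (20, 473.15, 1.95651), (50, 473.15, 0.792572),
]


def p_calc(a, b, T, Vm):
    """Van der Waals pressure."""
    return R * T / (Vm - b) - a / Vm ** 2


def sum_sq(a, b):
    """Z = sum of squared pressure residuals."""
    return sum((p - p_calc(a, b, T, Vm)) ** 2 for p, T, Vm in DATA)


def gauss_newton(a, b, max_iter=50, tol=1e-10):
    """Iterate beta^(r+1) = beta^r + C^-1 D until the corrections are tiny."""
    for _ in range(max_iter):
        C_aa = C_ab = C_bb = D_a = D_b = 0.0
        for p, T, Vm in DATA:
            dp_da = -1.0 / Vm ** 2               # dp/da
            dp_db = R * T / (Vm - b) ** 2        # dp/db
            resid = p - p_calc(a, b, T, Vm)      # y_obs - y_calc
            C_aa += dp_da * dp_da
            C_ab += dp_da * dp_db
            C_bb += dp_db * dp_db
            D_a += resid * dp_da
            D_b += resid * dp_db
        det = C_aa * C_bb - C_ab ** 2
        da = (C_bb * D_a - C_ab * D_b) / det     # explicit 2x2 inverse
        db = (C_aa * D_b - C_ab * D_a) / det
        a, b = a + da, b + db
        if abs(da) < tol and abs(db) < tol:
            break
    return a, b


a_fit, b_fit = gauss_newton(1.3, 0.05)           # initial guess is an assumption
print(a_fit, b_fit, sum_sq(a_fit, b_fit))
```

The method needs the derivatives at every iteration and a reasonable starting point; a poor initial guess can make the iteration diverge.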
Weighted Least-Squares Regression
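$$Z = \sum_{i=1}^{n} w_i \left(y_{i,\mathrm{obs}} - y_{i,\mathrm{calc}}\right)^2$$

May not always want to give equal weight to each point
Applies to linear and nonlinear case

Nonlinear case:

$$C_{jk} = C_{kj} = \sum_{i=1}^{n} w_i \left.\frac{\partial f}{\partial \beta_j}\right|_{x_i;\,\beta^{r}} \left.\frac{\partial f}{\partial \beta_k}\right|_{x_i;\,\beta^{r}}$$

$$D_k = \sum_{i=1}^{n} w_i \left(y_{i,\mathrm{obs}} - y_{i,\mathrm{calc}}^{\,r}\right) \left.\frac{\partial f}{\partial \beta_k}\right|_{x_i;\,\beta^{r}}$$

Linear case:

$$C_{jk} = C_{kj} = \sum_{i=1}^{n} w_i f_k(x_i) f_j(x_i)$$

$$D_k = \sum_{i=1}^{n} w_i\, y_{i,\mathrm{obs}} f_k(x_i)$$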
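For the linear case with $f_1(x) = 1$ and $f_2(x) = x$, the weighted C and D reduce to weighted sums, which the Python sketch below evaluates directly. The function name and the second set of weights are hypothetical, chosen only to show the effect: unit weights reproduce the unweighted fit, and a zero weight removes a point from the fit.

```python
def weighted_line_fit(xs, ys, ws):
    """Weighted fit of y = a0 + a1*x with basis f1 = 1, f2 = x."""
    C11 = sum(ws)
    C12 = sum(w * x for w, x in zip(ws, xs))
    C22 = sum(w * x * x for w, x in zip(ws, xs))
    D1 = sum(w * y for w, y in zip(ws, ys))
    D2 = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    det = C11 * C22 - C12 ** 2
    a0 = (D1 * C22 - D2 * C12) / det             # A = C^-1 D for the 2x2 case
    a1 = (C11 * D2 - C12 * D1) / det
    return a0, a1


xs = [1.00, 2.00, 3.00, 4.00]
ys = [3.0, 6.0, 7.0, 10.0]

# Equal weights: same result as the unweighted example (a0 = 1.0, a1 = 2.2)
equal = weighted_line_fit(xs, ys, [1.0, 1.0, 1.0, 1.0])

# Hypothetical weights: the last point is given zero weight
dropped = weighted_line_fit(xs, ys, [1.0, 1.0, 1.0, 0.0])

print(equal, dropped)
```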
Drawbacks of Iterative Matrix Method
• Local minima can cause problems
• Can be sensitive to initial guess
• Derivatives must be evaluated for each iteration
Simplex Method
• Simplex has one more vertex than dimension of space
  – 2D – Triangle
• m parameters – m+1 vertices
• Simplex Method used to optimize a set of parameters
  – Find optimal set of β's such that Z is minimum
• More robust than previous iterative procedure
  – Often slower
Simplex Method
1. Evaluate Z at m+1 unique sets of parameters
2. Identify ZB (best, smallest) and ZW (worst, largest)
3. Calculate Centroid of all but worst (average of different sets of parameters ignoring worst set)
   $$C_k = \frac{1}{m} \sum_{i \neq W} \beta_{k,i}$$
4. Reflect worst point through Centroid
   $$R^{1}_k = 2 C_k - \beta_{k,W}$$
Simplex Method
5. Replace Worst point:
   a. If ZR1 < ZB (reflected point is better than previous best) calculate a second reflected point
      $$R^{2}_k = 3 C_k - 2 \beta_{k,W}$$
      i. If ZR2 < ZR1 replace W with R2
      ii. Otherwise replace W with R1
   b. If ZB < ZR1 < ZW replace W with R1
   c. If ZR1 > ZW a contracted point is calculated
      $$R^{3}_k = 0.5\,C_k + 0.5\,\beta_{k,W}$$
      i. If ZR3 < ZW replace W with R3
      ii. Otherwise move all points closer to the best point
6. Repeat until converged or the maximum number of iterations has been performed
Simplex Regression - Example: Van der Waals parameters for nitrogen

$$p = \frac{RT}{V_m - b} - \frac{a}{V_m^2}$$

p/atm   T/K      Vm/(L mol^-1)      p/atm   T/K      Vm/(L mol^-1)
1       223.15   18.28340           5       373.15   6.13064
5       223.15   3.63436            20      373.15   1.53844
10      223.15   1.80389            50      373.15   0.621118
20      223.15   0.889748           5       473.15   7.77970
1       273.15   22.4046            10      473.15   3.89744
10      273.15   2.23174            20      473.15   1.95651
20      273.15   1.11189            50      473.15   0.792572
50      273.15   0.44191
Nonlinear Regression Nitrogen Gas Optimization.mws
Simplex program
Simplex - Example

Iteration 1: Response 0.344652
  beta                   Response
  1.300000   0.050000    0.425437
  1.326000   0.050500    0.344652   Best
  1.313000   0.051000    0.579697   Worst
  1.313000   0.050250               Centroid
  1.313000   0.049500    0.229741   First reflected point
  1.313000   0.048750    0.116962   Second reflected point

Iteration 2: Response 0.116962
  beta                   Response
  1.300000   0.050000    0.425437   Worst
  1.326000   0.050500    0.344652
  1.313000   0.048750    0.116962   Best
  1.319500   0.049625               Centroid
  1.339000   0.049250    0.076378   First reflected point
  1.358500   0.048875    0.011665   Second reflected point

Iteration 3: Response 0.0116649
  beta                   Response
  1.358500   0.048875    0.011665   Best
  1.326000   0.050500    0.344652   Worst
  1.313000   0.048750    0.116962
  1.335750   0.048812               Centroid
  1.345500   0.047125    0.041013   First reflected point

Iteration 4: Response 0.0116649
  beta                   Response
  1.358500   0.048875    0.011665   Best
  1.345500   0.047125    0.041013
  1.313000   0.048750    0.116962   Worst
  1.352000   0.048000               Centroid
  1.391000   0.047250    0.195042   First reflected point
  1.332500   0.048375    0.027212   Contracted point

Iteration 31: Response 0.00543252
  beta                   Response
  1.393487   0.049624    0.005433
  1.393340   0.049619    0.005433   Best
  1.393220   0.049616    0.005433   Worst
  1.393413   0.049621               Centroid
  1.393607   0.049627    0.005433   First reflected point
  1.393317   0.049619    0.005433   Contracted point

Iteration 32: Response 0.00543252
  beta                   Response
  1.393487   0.049624    0.005433   Worst
  1.393340   0.049619    0.005433
  1.393317   0.049619    0.005433   Best
  1.393328   0.049619               Centroid
  1.393170   0.049613    0.005433   First reflected point
  1.393408   0.049621    0.005433   Contracted point

Iterations converged. R^2 0.999999
Final Converged Parameters
  k   beta
  0   1.393321   0.0496186
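The replacement rules of the Simplex Method can be sketched in Python as follows, using the nitrogen data and the three starting vertices listed under Iteration 1. Several details are assumptions made for this sketch rather than taken from the slides: the value of R, the stopping test on the spread of Z, halving the distance to the best point in the shrink step, and sending the tie ZR1 = ZW to the contraction branch.

```python
R = 0.082057  # L atm mol^-1 K^-1 (assumed value of the gas constant)

# (p/atm, T/K, Vm/(L mol^-1)) for nitrogen, from the data table
DATA = [
    (1, 223.15, 18.28340), (5, 223.15, 3.63436), (10, 223.15, 1.80389),
    (20, 223.15, 0.889748), (1, 273.15, 22.4046), (10, 273.15, 2.23174),
    (20, 273.15, 1.11189), (50, 273.15, 0.44191), (5, 373.15, 6.13064),
    (20, 373.15, 1.53844), (50, 373.15, 0.621118), (5, 473.15, 7.77970),
    (10, 473.15, 3.89744), (20, 473.15, 1.95651), (50, 473.15, 0.792572),
]


def Z(beta):
    """Sum of squared pressure residuals for beta = [a, b]."""
    a, b = beta
    return sum((p - (R * T / (Vm - b) - a / Vm ** 2)) ** 2 for p, T, Vm in DATA)


def simplex_step(points):
    """Apply steps 2 to 5 once and return the new list of vertices."""
    m = len(points) - 1
    pts = sorted(points, key=Z)                  # pts[0] = best, pts[-1] = worst
    best, worst = pts[0], pts[-1]
    z_b, z_w = Z(best), Z(worst)
    cen = [sum(p[k] for p in pts[:-1]) / m for k in range(m)]

    def along(t):
        # cen + t*(cen - worst): t = 1 gives R1, t = 2 gives R2, t = -0.5 gives R3
        return [c + t * (c - w) for c, w in zip(cen, worst)]

    r1 = along(1.0)
    z1 = Z(r1)
    if z1 < z_b:                                 # case a: try the longer step
        r2 = along(2.0)
        new = r2 if Z(r2) < z1 else r1
    elif z1 < z_w:                               # case b
        new = r1
    else:                                        # case c (ties included here)
        r3 = along(-0.5)
        if Z(r3) < z_w:
            new = r3
        else:
            # assumed shrink rule: halve every vertex's distance to the best
            return [best] + [[(x + bx) / 2 for x, bx in zip(p, best)]
                             for p in pts[1:]]
    return pts[:-1] + [new]


def simplex(points, max_iter=500, tol=1e-12):
    """Repeat until the Z values at all vertices agree to within tol."""
    for _ in range(max_iter):
        zs = [Z(p) for p in points]
        if max(zs) - min(zs) < tol:
            break
        points = simplex_step(points)
    return min(points, key=Z)


start = [[1.300, 0.0500], [1.326, 0.0505], [1.313, 0.0510]]
first = simplex_step(start)                      # vertices after one iteration
a_fit, b_fit = simplex(start)
print(first, a_fit, b_fit)
```

No derivatives are needed, only evaluations of Z, which is why the method tolerates poor starting points better than the iterative matrix method at the cost of more iterations.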
[Figure: Simplex – Example (Iteration 1). Simplex in the (a, b) plane, a on the horizontal axis and b on the vertical axis; points B, W, C, R1 and R2 marked.]
[Figure: Simplex – Example (Iteration 2). Same axes; points B, W, C, R1 and R2 marked.]
[Figure: Simplex – Example (Iteration 3). Same axes; points B, W, C and R1 marked.]
[Figure: Simplex – Example (Iteration 4). Same axes; points B, W, C, R1 and the contracted point marked.]
[Figure: Simplex – Example (Iteration 32). Enlarged view near the converged parameters; points B, W, C, R1 and the contracted point marked.]
Comparing Models
• Often have more than 1 equation that can be used to represent the data
• If two equations (models) have the same number of parameters, the one with smaller Z is a better representation (fit)
• If two models have different numbers of parameters, a direct comparison cannot be made
  – Need to use F distribution & confidence level
  – Model A – fewer number of parameters
  – Model B – larger number of parameters
Comparing Models
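Model B is a better model if (and only if)

$$\frac{\left(Z_A - Z_B\right) / \left(m_B - m_A\right)}{Z_B / \left(n - m_B\right)} = \frac{\left(Z_A - Z_B\right)\left(n - m_B\right)}{Z_B \left(m_B - m_A\right)} > F_{m_B - m_A,\; n - m_B,\; 1 - \alpha}$$

where $1 - \alpha$ is the confidence level.

Usually look up F in a table and compare ratios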
With Maple can calculate confidence level for which B is a better model than A
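The comparison can be illustrated with the four-point data set used earlier. In the Python sketch below, model A is a one-parameter constant (an illustrative choice, not from the slides; its least-squares value is the mean of y) and model B is the fitted straight line with $a_0 = 1.0$, $a_1 = 2.20$. The critical value 18.51 is the tabulated F for (1, 2) degrees of freedom at the 95% level.

```python
def f_ratio(Z_A, m_A, Z_B, m_B, n):
    """F statistic for comparing model A (fewer parameters) with model B."""
    return ((Z_A - Z_B) / (m_B - m_A)) / (Z_B / (n - m_B))


xs = [1.00, 2.00, 3.00, 4.00]
ys = [3.0, 6.0, 7.0, 10.0]
n = len(xs)

# Model A (illustrative): y = a1, one parameter, best value is the mean of y
mean_y = sum(ys) / n
Z_A = sum((y - mean_y) ** 2 for y in ys)

# Model B: straight line from the earlier example, two parameters
Z_B = sum((y - 1.0 - 2.2 * x) ** 2 for x, y in zip(xs, ys))

F = f_ratio(Z_A, 1, Z_B, 2, n)
F_crit = 18.51            # tabulated F(1, 2) at the 95% confidence level

print(f"Z_A = {Z_A:.2f}, Z_B = {Z_B:.2f}, F = {F:.1f}")
print("Model B is better at 95%" if F > F_crit else "No significant improvement")
```

Here F is about 60.5, well above the critical value, so the extra parameter of the straight line is justified at the 95% level.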
Linear Regression Heat capacity of CO2.mws
Linear Regression Cd VP 2.mws