Runge kutta essay

Talking about Runge-Kutta

An essay on an algorithm (extended version)

David Coulson, 2015

Straight up, I’ll confess that I’m not a historian. What I know for sure about the lives of Carl Runge and Martin Kutta could be written on the back of a postcard and still leave room for an address. But I like understanding, and I think maybe you are the same. I want to know why a man 120 years ago creates a mathematical technique so heavy on calculation that only a computer can churn through it, yet the inventor creates it fifty years before the first computers are even invented. That to me makes very little sense. The textbooks aren’t much help either. They explain the technique as a succession of theorems to be proved, as if Mr Runge already knew what he was looking for right from the first pen-scratching and went straight to it. But no-one is that logical and clairvoyant. No-one starts on an adventure by thinking “We seek a function of the form $y_{n+1} = y_n + h\sum_i w_i F(x_n + a_i h,\; y_n + a_i h F_n)$”.

Carl Runge was an astrophysicist as much as he was a mathematician, according to Wikipedia. He had a special interest in the spectra of stars, it seems, and mathematics was his toolkit; in particular, marching procedures for the solution of differential equations. Whether he was more interested in the toolkit or the projects he applied it to is anybody’s guess. In those days it probably didn’t matter. Science was still small enough that you could be a contributor to many branches of it. So with very little info to guide me, I’m going to try to reconstruct his world and imagine the unintended steps that led him to discover the algorithm that now bears his name. This is not a scientific paper. It’s not even likely to be correct. It is simply a story intended to make the learning of Runge’s methods a bit more intuitive.

Carl Runge wrote the first paper on his new mathematical method in 1895. Think about 1895. The world is in an industrialisation frenzy. Technology is expensive and its inventions are huge. Projects are so colossal and so costly, and they require the collaboration of so many people that the requisite technology needs to be proven right on paper before construction can begin. Hence mathematics. Mathematics comes to the fore as never before, as insurance against shoddy workmanship and crappy ideas. In particular, this means the calculus of differential equations, and many of the equations representing real processes are unsolvable by purely algebraic means. Solution therefore requires chipping away at the problem with brute force arithmetic, numerical approximations that approach the answer an inch at a time. There are no computers and spreadsheets. Arithmetic is hard, just as it was when we were all at primary school. It is done by hand; fountain pens, ink wells, notebooks, numbers in columns, reference tables for square roots and the like. It’s neat and precise until it goes wrong, upon which the page is torn out of the notebook and the writing starts again.

To solve a first order differential equation, one usually needs a numerical method, marching arithmetically forward through time in steps of a tenth of a second, from a known starting point on the left side of a graph to an uncertain future on its right.

The quality of your approximation depends on how many decimal places you choose to use. Four decimal places is common, if old-fashioned log tables are any indication, which means that multiplying any two numbers requires the multiplication of 16 pairs of digits, and the adding up of 16 subtotals; 31 operations just to move forward a single time step.

The most popular method is Euler’s method, shown here. This method has been around for some 120 years by the time of Carl Runge’s investigations and its limitations are well known. Yet despite this, it continues to be the most commonly used method, simply because of its simplicity. No method is as rough as this, but no method is as fast either.

$$y_{n+1} = y_n + hF_n$$

Euler’s method is famously inaccurate. In a field where lines bend monotonically downwards or upwards, Euler’s approximation careens across lanes like a car out of control. Beyond about a dozen time-steps, its trace is so off-track it is essentially useless. This is what comes of pretending that a bent line is best approximated by a tangent at one end. By definition, Euler’s method must go off course.
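To put a number on that drift, here is a minimal Python sketch of Euler’s method. The function name `euler`, the test equation $y' = y$ and the ten steps of 0.1 are my own choices for illustration; nothing here comes from Runge.

```python
import math

def euler(F, x0, y0, h, steps):
    """March y' = F(x, y) forward from (x0, y0), one tangent per step."""
    x, y = x0, y0
    for _ in range(steps):
        y += h * F(x, y)   # follow the tangent taken at the left edge
        x += h
    return y

# Illustrative test problem (an assumption): y' = y, y(0) = 1, so y(1) = e.
growth = lambda x, y: y
euler_estimate = euler(growth, 0.0, 1.0, 0.1, 10)
euler_error = abs(euler_estimate - math.e)
print(euler_estimate, euler_error)
```

With these settings the march ends near 2.594 where the true curve reaches e ≈ 2.718, so the estimate is already about 0.12 short after ten steps.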

Practitioners try to combat this trend by making each step-size very small. This keeps the migration upwards or downwards at any step very slight, but this only increases the number of time-steps you have to do and the number of errors you introduce. Which is better? Many small errors or two or three big ones?

Well, there are alternatives. One way would be to apply a gradient that is the average of the gradients at both ends of the time-step. Having crossed the time-step using Euler’s method, you can then calculate the gradient at the other side and average this with the gradient you used to get there.

Because this modification is so easy to imagine, and the maths behind it so easy to apply, I’m going to guess that this was a commonly used procedure in 1895. I would be very disappointed in the scientists of the time if it wasn’t. Given that the tangents at each end of a uniformly bent line necessarily point in opposite directions, the average has to be better than the tangent offered from either end.
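The averaging idea can be sketched in a few lines of Python. The names `averaged_step` and `march`, and the test equation $y' = y$, are assumptions of mine for the sake of a runnable example.

```python
import math

def averaged_step(F, x, y, h):
    """One step using the mean of the gradients at both ends of the step."""
    slope_left = F(x, y)
    y_euler = y + h * slope_left          # provisional crossing by Euler
    slope_right = F(x + h, y_euler)       # gradient found on the far side
    return y + h * 0.5 * (slope_left + slope_right)

def march(step, F, x0, y0, h, steps):
    """Apply a one-step rule repeatedly."""
    x, y = x0, y0
    for _ in range(steps):
        y = step(F, x, y, h)
        x += h
    return y

# Illustrative test problem (an assumption): y' = y, y(0) = 1, so y(1) = e.
growth = lambda x, y: y
averaged_estimate = march(averaged_step, growth, 0.0, 1.0, 0.1, 10)
averaged_error = abs(averaged_estimate - math.e)
print(averaged_estimate, averaged_error)
```

On this test equation the ten-step estimate lands within about 0.004 of e, at the cost of two gradient evaluations per step instead of one.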

A different way of improving accuracy would be to take half a step out to the middle of the time interval and see what the gradient looks like there. The gradient at the middle would surely be more representative of the line as a whole, so if you used that gradient across the entire interval you would probably arrive at the right edge very close to where you should be.

To approximate the y-value at the midpoint, you would use Euler’s method starting from the left edge of the interval. That means there are two levels of approximation: the approximation required to get to the midpoint and the approximation to get to the right edge.
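Here is the same kind of sketch for the midpoint idea. Again the function name `midpoint_step` and the test equation $y' = y$ are my own assumptions, chosen only to make the two levels of approximation visible.

```python
import math

def midpoint_step(F, x, y, h):
    """One step that crosses the whole interval on the gradient found halfway."""
    y_half = y + (h / 2) * F(x, y)      # first approximation: Euler out to the middle
    slope_mid = F(x + h / 2, y_half)    # gradient sampled at the midpoint
    return y + h * slope_mid            # second approximation: cross the full step

# Illustrative test problem (an assumption): y' = y, y(0) = 1, so y(1) = e.
x, y, h = 0.0, 1.0, 0.1
for _ in range(10):
    y = midpoint_step(lambda x, y: y, x, y, h)
    x += h
midpoint_error = abs(y - math.e)
print(y, midpoint_error)
```

On this particular equation the averaging method would give the same number, because the gradient depends on $y$ alone and does so linearly. On other equations the two methods part company.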

Is this method better than the one I described previously? I don’t know. Maybe it is and maybe it isn’t. But I think it is this question, or one like it, that drew Carl Runge into an investigation of the error-generating properties of numerical methods in general. Faced with several reasonable options, he just wanted to know which was best, in terms of accuracy versus effort.


This is what the first method, which I’ve been calling the averaging method but which is really called the Trapezoidal rule, looks like mathematically:

$$y_{n+1} = y_n + \frac{h}{2}\left(F_n + F_{n+1}\right)$$

where $\frac{1}{2}\left(F_n + F_{n+1}\right)$ is the average slope emerging from $y_n$.

We can relate $F_{n+1}$ to $F_n$ using a Taylor series expansion about $F_n$.

$$F_{n+1} = F_n + hF'_n + \frac{h^2}{2}F''_n + \dots$$

$$y_{n+1} = y_n + \frac{h}{2}\left(F_n + F_n + hF'_n + \frac{h^2}{2}F''_n + \dots\right)$$

This can be reduced to a tidier form which looks very much like a Taylor series expansion for $y(x_{n+1})$.

$$y_{n+1} = y_n + hF_n + \frac{h^2}{2}F'_n + \frac{h^3}{4}F''_n + \dots$$

Underneath it, I’m going to write the real Taylor series expansion for $y(x_{n+1})$. Remember that $y' = F$, so $y'' = F'$ and $y''' = F''$.

$$y_{n+1} = y_n + hy'_n + \frac{h^2}{2!}y''_n + \frac{h^3}{3!}y'''_n + \dots$$

Can you see that these series differ at the $h^3$ term? This means that averaging two gradients across a time step is as effective as interpolating a parabola through the time-step, without all the computational work required to do that.

Why is this good? Well, assume that we are looking at a well-behaved graph-line that never climbs or dips by an angle steeper than 45 degrees. That means that the magnitude of the gradient function never exceeds 1. That’s a well-behaved graph-line. Assume also that the line is pretty straight, so that its curvature is also a small number. Assume thirdly that we are using a time-step of one-tenth of a second.

Under those conditions (which are not at all unrealistic) we get four decimal places of accuracy from each time-step.

If we did the same job using Euler’s method alone, with the lopsided gradient from one end, then the error introduced at each step would be ten times bigger than what we get from the Trapezoidal method, which means that a single application of the Trapezoidal method is worth ten steps of the Euler method.

The amount of work done in each time step has increased three-fold, but we’re still winning in terms of effort saved.

Let’s see how the other method, the midpoint method, performs under the same sort of analysis. This is the method in which the gradient at the middle of the time-step is obtained and used to cross the entire time-step.

$$y_{n+1} = y_n + hF_{n+\frac12}$$

where

$$F_{n+\frac12} = F_n + \frac{h}{2}F'_n + \frac{1}{2}\left(\frac{h}{2}\right)^2F''_n + \dots$$

Substituting one into the other gives

$$y_{n+1} = y_n + hF_n + \frac{h^2}{2}F'_n + \frac{h^3}{8}F''_n + \dots$$

Compare this to the Taylor series and to the series for the averaging method.

Taylor series:

$$y_{n+1} = y_n + hy'_n + \frac{h^2}{2!}y''_n + \frac{h^3}{3!}y'''_n + \dots$$

Trapezoidal method:

$$y_{n+1} = y_n + hF_n + \frac{h^2}{2}F'_n + \frac{h^3}{4}F''_n + \dots$$

Midpoint method:

$$y_{n+1} = y_n + hF_n + \frac{h^2}{2}F'_n + \frac{h^3}{8}F''_n + \dots$$

You can see that the Trapezoidal method and the Midpoint method are both in error at the $h^3$ term, but not by much. One method underestimates that term by $\frac{1}{24}h^3$, the other overestimates it by $\frac{1}{12}h^3$. Remember that for a step-size set at 0.1, the $h^3$ term is 0.001. At this level the difference in accuracy between the methods is irrelevant.

Is there a way to do even better?

Well, yes. Maybe I could average the results of the two methods and get one uber-method that cancels out the errors from each.

Here’s what it would look like.

$$y_{n+1} = y_n + \underbrace{\frac12\left(\frac{h}{2}F_n + \frac{h}{2}F_{n+1}\right)}_{\text{Trapezoidal method}} + \underbrace{\frac12\,hF_{n+\frac12}}_{\text{Midpoint method}}$$

$$y_{n+1} = y_n + \frac{h}{4}F_n + \frac{h}{2}F_{n+\frac12} + \frac{h}{4}F_{n+1}$$

where

$$F_{n+\frac12} = F_n + \frac{h}{2}F'_n + \frac{1}{2}\left(\frac{h}{2}\right)^2F''_n + \dots$$

$$F_{n+1} = F_n + hF'_n + \frac{h^2}{2}F''_n + \dots$$

Substituting these in:

$$y_{n+1} = y_n + \frac{h}{4}F_n + \frac{h}{2}\left(F_n + \frac{h}{2}F'_n + \frac{h^2}{8}F''_n\right) + \frac{h}{4}\left(F_n + hF'_n + \frac{h^2}{2}F''_n\right)$$

Compare the errors with what you would get from a Taylor series expansion.

Dave’s uber method:

$$y_{n+1} = y_n + hF_n + \frac{1}{2}h^2F'_n + \frac{3}{16}h^3F''_n$$

True Taylor series expansion:

$$y_{n+1} = y_n + hy'_n + \frac{1}{2}h^2y''_n + \frac{1}{6}h^3y'''_n$$

See the error? It’s still order-$h^3$.

What’s happening here is that the combined method looks in principle to be a better method but is proved not to be so when we examine the errors up close.

Dave’s uber method is still inaccurate at the $h^3$ level, meaning that it is only superficially better than the two methods it was made from. So it’s a lot of extra work for essentially no benefit. Also known as a hiding to hell.

But see how Runge’s way of thinking allows us to analyse these things? Maybe there’s a way of combining the gradients at the two ends of the time step and the one in the middle so that the three of them completely knock out the $h^3$ error term.

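For example, maybe I should be giving each of the three gradient estimates equal weight instead of biasing towards the middle. What do you think?

Biased towards the middle (the uber method, which didn’t work):

$$y_{n+1} = y_n + \frac{h}{4}F_n + \frac{h}{2}F_{n+\frac12} + \frac{h}{4}F_{n+1}$$

Equal weights:

$$y_{n+1} = y_n + \frac{h}{3}F_n + \frac{h}{3}F_{n+\frac12} + \frac{h}{3}F_{n+1} \quad ?$$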

Well, I could go through that error analysis just like before, comparing the outcome to a Taylor series, but to save time I will tell you now that that method will also be wrong by a small amount. It’s a dead end.

So is there a way of combining these three values properly so that the $h^3$ term is knocked out? Of course there is. Otherwise I wouldn’t have asked the question. The trick, however, is to STOP GUESSING and work it out algebraically!

$$y_{n+1} = y_n + h\left(w_1F_n + w_2F_{n+\frac12} + w_3F_{n+1}\right)$$

$w_1$, $w_2$ and $w_3$ are fractions (weights) that add up to 1.

$$y_{n+1} = y_n + hw_1F_n + hw_2\left(F_n + \frac{h}{2}F'_n + \frac{h^2}{8}F''_n\right) + hw_3\left(F_n + hF'_n + \frac{h^2}{2}F''_n\right)$$

$$y_{n+1} = y_n + hF_n\left(w_1 + w_2 + w_3\right) + h^2F'_n\left(\frac12w_2 + w_3\right) + h^3F''_n\left(\frac18w_2 + \frac12w_3\right)$$

Compare this with the Taylor series so that you can determine the weights.

$$y_{n+1} = y_n + hy'_n + \frac{1}{2}h^2y''_n + \frac{1}{6}h^3y'''_n + \dots$$

Matching the $h$, $h^2$ and $h^3$ terms in turn:

$$w_1 + w_2 + w_3 = 1$$

$$\frac12w_2 + w_3 = \frac12$$

$$\frac18w_2 + \frac12w_3 = \frac16$$

$$w_1 = \frac16, \qquad w_2 = \frac23, \qquad w_3 = \frac16$$

$$y_{n+1} = y_n + h\left(\frac16F_n + \frac23F_{n+\frac12} + \frac16F_{n+1}\right)$$

where

$$F_{n+\frac12} = F\left(x_n + \frac{h}{2},\; y_n + \frac{h}{2}F_n\right)$$

and

$$F_{n+1} = F\left(x_n + h,\; y_n + hF_n\right)$$

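This is the third-order Runge-Kutta method, created by Martin Kutta in 1901, based on Carl Runge’s 1895 ideas. You can look it up in a book. It’s there. But here you have seen it obtained in a more intuitive fashion, using a bit more hand-waving and a bit less mumbo-jumbo. One caveat: in Kutta’s scheme as it is usually tabulated, the end-point gradient is evaluated at $y_n + h\left(2F_{n+\frac12} - F_n\right)$ rather than at the plain Euler estimate $y_n + hF_n$. The weights are the same, and the difference only shows up when $F$ depends on $y$.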
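The three matching conditions are just a small linear system, so the weights can be checked mechanically. The sketch below does the elimination with exact fractions; the helper name `solve_linear` and the matrix layout are my own assumptions.

```python
from fractions import Fraction as Fr

def solve_linear(A, b):
    """Gauss-Jordan elimination over exact fractions (small systems only)."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]   # scale pivot row to 1
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

# Rows match the h, h^2 and h^3 terms of the Taylor series respectively.
A = [[Fr(1), Fr(1),    Fr(1)],
     [Fr(0), Fr(1, 2), Fr(1)],
     [Fr(0), Fr(1, 8), Fr(1, 2)]]
b = [Fr(1), Fr(1, 2), Fr(1, 6)]

w1, w2, w3 = solve_linear(A, b)
print(w1, w2, w3)   # prints 1/6 2/3 1/6
```

Working in fractions rather than decimals keeps the answer in the same form a pen-and-paper mathematician would have written it.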


This is a true order-$h^3$ method, which means (under normal conditions) that each time-step of this method is worth a hundred steps of the Euler method or ten steps of the Trapezoidal method. Each step is maybe four times as laborious, but we are still winning in terms of how much pen-pushing we have to do.
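A short Python sketch of the three-point rule follows. It comes with a caveat worth flagging: textbook statements of Kutta’s third-order scheme take the end-point gradient at $y_n + h(2F_{mid} - F_n)$, not at the Euler estimate used in this essay’s version, so the sketch offers both through a flag. The names `three_point_step`, `march` and `kutta_endpoint`, and the test equation $y' = y$, are my own assumptions.

```python
import math

def three_point_step(F, x, y, h, kutta_endpoint=False):
    """One step with weights 1/6, 2/3, 1/6 on the left, middle and right gradients."""
    F_left = F(x, y)
    F_mid = F(x + h / 2, y + (h / 2) * F_left)
    if kutta_endpoint:
        # End-point estimate as usually tabulated for Kutta's third-order scheme.
        y_right = y + h * (2 * F_mid - F_left)
    else:
        # End-point estimate as written in this essay: a plain Euler crossing.
        y_right = y + h * F_left
    F_right = F(x + h, y_right)
    return y + h * (F_left / 6 + 2 * F_mid / 3 + F_right / 6)

def march(F, x0, y0, h, steps, **options):
    x, y = x0, y0
    for _ in range(steps):
        y = three_point_step(F, x, y, h, **options)
        x += h
    return y

# Illustrative test problem (an assumption): y' = y, y(0) = 1, so y(1) = e.
growth = lambda x, y: y
essay_error = abs(march(growth, 0.0, 1.0, 0.1, 10) - math.e)
kutta_error = abs(march(growth, 0.0, 1.0, 0.1, 10, kutta_endpoint=True) - math.e)
print(essay_error, kutta_error)
```

On this test equation the Euler end-point version is off by roughly 0.004 after ten steps, while the tabulated version is off by roughly 0.0001. When the gradient depends on $x$ alone the two versions coincide, since the $y$ estimate is never used.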

We have arrived at a point in the story where we can create as many Runge-Kutta methods as we want, based on the models of the two-point method and the three-point method. The principles should be the same if we proceed to four-point or five-point or even six-point methods: select a number of points within the time interval, find the slopes at those points and attach weights to them to create an average slope across the whole region. However be warned that the task of identifying weights becomes increasingly onerous as we bring more points into consideration. A four-point method requires identifying four weights from four linear equations, and a five-point method requires identifying five weights from five linear equations. That gets tedious pretty fast without automation. But in principle at least, there is no limit to how accurate I can make a Runge-Kutta method, if I have the time and perseverance.

Just to prove the point, here’s a four-point method that I created myself, based on this idea. It uses four equally-spaced points across the time-step, and when combined properly they eliminate all error terms up to and including the $h^4$ term. That means it is ten times as accurate again as Kutta’s three-point method (under normal conditions).

1161

132

31 321 7 nnnnnn FFFFhyy

Obtaining the method took a very long amount of time and I would not want to repeat the exercise, let alone proceed onwards to a five-point method.

In that formula each gradient is sampled using an Euler estimate: $F_{n+a} = F\left(x_n + ah,\; y_n + ahF_n\right)$.

So how much accuracy in a numerical method is enough? Theoretically, a five-point method is worth 10,000 steps of the simple Euler method, so we may think that we could replace 10,000 time-steps of a tenth of a second with a single step of 1000 seconds. But no-one would do this. The whole purpose of drawing a graph is to see how it meanders between two points, not simply to see where it ends. This means we have reached a point where accuracy is no longer the primary consideration. Other factors come into play, like ease of use and programmability. We like the weights to be simple fractions, and we like methods with fewer stepping stones. We are thinking more in terms of bang-for-buck than sheer bang. So instead of proceeding into five-point and six-point methods, I want to show you a clever thing that Martin Kutta did with the three-point method.

Kutta appears to have wondered if it was possible to improve the estimate of the gradient at the midpoint before using it to calculate the gradient at the end point. Not only would he get a better estimate of the midpoint gradient, he would end up with a better estimate of the end point because less error is being passed forward. A picture will illustrate the idea best.

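Martin Kutta very cleverly puts the Trapezoidal method inside the Midpoint method. It goes like this:

(1) Use Euler’s method to estimate the gradient at the midpoint.

$$y^{(1)}_{n+\frac12} = y_n + \frac{h}{2}F_n, \qquad F^{(1)}_{n+\frac12} = F\left(x_{n+\frac12},\; y^{(1)}_{n+\frac12}\right)$$

(2) Average this midpoint gradient with the starting gradient to get a better estimate of the gradient at the midpoint.

$$y^{(2)}_{n+\frac12} = y_n + \frac{h}{2}\cdot\frac12\left(F_n + F^{(1)}_{n+\frac12}\right), \qquad F^{(2)}_{n+\frac12} = F\left(x_{n+\frac12},\; y^{(2)}_{n+\frac12}\right)$$

(3) Use this improved midpoint estimate to cross the entire time-step.

$$y^{(3)}_{n+1} = y_n + hF^{(2)}_{n+\frac12}$$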

This should be the end of the process, but Kutta goes one step further. He takes the four gradients that have just been calculated and averages them all together, Runge-Kutta style, in a way that matches terms with the Taylor series, down to the $h^4$ level.

$$y_{n+1} = y_n + h\left(\frac16F_n + \frac13\underbrace{F^{(1)}_{n+\frac12}}_{\text{initial midpoint}} + \frac13\underbrace{F^{(2)}_{n+\frac12}}_{\text{improved midpoint}} + \frac16F_{n+1}\right)$$

If you write the three-point method underneath it you can see the similarities between the two methods. It’s as if Kutta has cut the middle term into two equal-sized pieces, one of them the first estimate and the second one the improved estimate.

$$y_{n+1} = y_n + h\left(\frac16F_n + \frac23F_{n+\frac12} + \frac16F_{n+1}\right)$$

This is Kutta’s classic four-step method, which is really a three-step method with a correction term in the middle. For this improvement he gets an extra order of accuracy out of the estimation; in simple terms, four decimal places instead of three.

I know that you have seen this method before in your university maths classes, and perhaps you have even poured numbers through it a few times while doing a research project. This is THE Runge-Kutta method, the one most engineers turn to as if no other Runge-Kutta methods had ever been invented. It would not surprise me if this particular method has been used more often than all the other Runge-Kutta methods put together.

Whenever you see it written in books, the method always looks like this...

$$y_{n+1} = y_n + \frac{h}{6}\left(k_1 + 2k_2 + 2k_3 + k_4\right)$$

$$k_1 = F\left(x_n,\; y_n\right)$$

$$k_2 = F\left(x_n + \frac{h}{2},\; y_n + \frac{h}{2}k_1\right)$$

$$k_3 = F\left(x_n + \frac{h}{2},\; y_n + \frac{h}{2}k_2\right)$$

$$k_4 = F\left(x_n + h,\; y_n + hk_3\right)$$

... which is fine if you are a computer programmer and all you want to do is turn these equations into source code. But I think that stating it like this actually hides how the method works and reduces it to black magic.
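For what it’s worth, the book form does translate almost line for line into code. This is a minimal sketch; the names `rk4_step` and `rk4`, and the test equation $y' = y$, are my own assumptions.

```python
import math

def rk4_step(F, x, y, h):
    """One step of the classic four-gradient method in its textbook form."""
    k1 = F(x, y)
    k2 = F(x + h / 2, y + (h / 2) * k1)
    k3 = F(x + h / 2, y + (h / 2) * k2)
    k4 = F(x + h, y + h * k3)
    return y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

def rk4(F, x0, y0, h, steps):
    x, y = x0, y0
    for _ in range(steps):
        y = rk4_step(F, x, y, h)
        x += h
    return y

# Illustrative test problem (an assumption): y' = y, y(0) = 1, so y(1) = e.
growth = lambda x, y: y
rk4_error = abs(rk4(growth, 0.0, 1.0, 0.1, 10) - math.e)
print(rk4_error)
```

Ten steps of 0.1 on this test equation leave an error of about two parts in a million, for four gradient evaluations per step.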

Would you believe that Martin Kutta included this method in his 1901 paper alongside the three-point method that I described earlier? It looks for all the world like a method especially written for the computer age, yet it was published some 43 years before the invention of the first electronic computer, long before scientists in general imagined that computation could be automated.

It amazes me that a method as convoluted as this could be considered practical at a time when the only way to do maths was with pen and paper. And yet there it is. Mathematicians were tough in those days.

Kutta’s clever idea leads to a family of offspring methods that similarly recycle numbers generated earlier in the process. Here’s a six-point method developed a quarter of a century after Kutta’s 1901 paper that achieves order-$h^5$ accuracy. The method was developed by a fellow named Evert Johannes Nyström, who appears to be the third great figure in the evolution of Runge-Kutta methods.

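$$k_1 = F\left(x,\; y\right)$$

$$k_2 = F\left(x + \tfrac15h,\; y + \tfrac15hk_1\right)$$

$$k_3 = F\left(x + \tfrac25h,\; y + 0\,hk_1 + \tfrac25hk_2\right)$$

$$k_4 = F\left(x + h,\; y + \tfrac94hk_1 - 5hk_2 + \tfrac{15}{4}hk_3\right)$$

$$k_5 = F\left(x + \tfrac35h,\; y - \tfrac{63}{100}hk_1 + \tfrac95hk_2 - \tfrac{13}{20}hk_3 + \tfrac{2}{25}hk_4\right)$$

$$k_6 = F\left(x + \tfrac45h,\; y - \tfrac{6}{25}hk_1 + \tfrac45hk_2 + \tfrac{2}{15}hk_3 + \tfrac{8}{75}hk_4 + 0\,hk_5\right)$$

$$y_{n+1} = y_n + \tfrac{1}{144}h\left(17k_1 + 100k_3 + 2k_4 - 50k_5 + 75k_6\right)$$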

See how its k-values (meaning gradients) are recycled, row-by-row to make better gradients? It’s as if several Runge-Kutta processes have been nestled inside one another, telescope fashion, so that each gradient is a kind of average of the gradients that came before.
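The recycling is easy to see once the coefficients are laid out as a triangle of numbers. The sketch below uses the coefficients as I read them from the formulas above, with the minus signs chosen so that each row adds up to the fraction of the step at which its gradient is sampled. That choice is an assumption on my part, and the code checks it explicitly. The names `NODES`, `STAGES`, `WEIGHTS` and `six_gradient_step` are mine, as is the test equation $y' = y$.

```python
from fractions import Fraction as Fr
import math

NODES = [Fr(0), Fr(1, 5), Fr(2, 5), Fr(1), Fr(3, 5), Fr(4, 5)]
STAGES = [                      # row i lists the multipliers of k1..k(i-1)
    [],
    [Fr(1, 5)],
    [Fr(0), Fr(2, 5)],
    [Fr(9, 4), Fr(-5), Fr(15, 4)],
    [Fr(-63, 100), Fr(9, 5), Fr(-13, 20), Fr(2, 25)],
    [Fr(-6, 25), Fr(4, 5), Fr(2, 15), Fr(8, 75), Fr(0)],
]
WEIGHTS = [Fr(n, 144) for n in (17, 0, 100, 2, -50, 75)]

# Sanity checks on the sign choices: rows sum to their nodes, weights sum to 1.
rows_consistent = all(sum(row, Fr(0)) == c for row, c in zip(STAGES, NODES))
weights_consistent = sum(WEIGHTS) == 1

def six_gradient_step(F, x, y, h):
    """One step: each new gradient is built from the gradients already in hand."""
    k = []
    for c, row in zip(NODES, STAGES):
        y_stage = y + h * sum(float(a) * kj for a, kj in zip(row, k))
        k.append(F(x + float(c) * h, y_stage))
    return y + h * sum(float(w) * kj for w, kj in zip(WEIGHTS, k))

# Illustrative test problem (an assumption): y' = y, y(0) = 1, so y(1) = e.
x, y, h = 0.0, 1.0, 0.1
for _ in range(10):
    y = six_gradient_step(lambda x, y: y, x, y, h)
    x += h
six_gradient_error = abs(y - math.e)
print(rows_consistent, weights_consistent, six_gradient_error)
```

Storing the coefficients as a lower triangle means the same loop would run any explicit method of this family; only the three tables change.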


The expansion of the formulae makes a kind of triangular structure reminiscent of a lower triangular matrix. This has inspired people to consider forms of recycling that need not occupy just the lower triangular part of the matrix but may fill the entire rectangle.

$$y_{n+1} = y_n + h\left(c_1k_1 + c_2k_2 + c_3k_3 + c_4k_4 + c_5k_5 + c_6k_6\right)$$

$$k_1 = F\left(x,\; y\right)$$

$$k_2 = F\left(x + c_2h,\; y + a_{21}hk_1 + a_{22}hk_2 + a_{23}hk_3 + a_{24}hk_4 + a_{25}hk_5 + a_{26}hk_6\right)$$

$$\vdots$$

These are called implicit methods because on any row (after the first one) they assume you already know the values of all the other gradients, something which seems at first to be a logical paradox. However, it IS possible to determine the values of all these gradients simultaneously using iterative procedures, where you guess suitable values for these gradients and let a process of trial-and-error guide you towards more accurate values. The process is very tedious, even if you are setting it up on a computer.

Why would anyone want to do this? Implicit methods are not necessarily more accurate than explicit methods, and they definitely are a lot more work. Where’s the payoff? Well, if you think of RK formulae as cars, some of them are better suited to rough terrain than others. By rough terrain I mean graphs with wildly changing characteristics and divergent lines. In functional terrain like this you want to have a numeric method that is not easily thrown off course, and that means a method that samples many gradients in the neighbourhood and does so accurately. A method that starts with one gradient and uses that to generate a second gradient and then uses the pair of them to create a third is passing errors down the line, so that the fourth and fifth gradients are nowhere near as accurate as the first. On a wildly changing function field, small but finite errors may be enough to throw you onto the wrong track.

Methods that force you to work out all the gradients simultaneously are forcing you to produce gradients that are equally accurate. It’s that which halts the transmission of error from equation to equation. But as you can imagine, there is a tremendous price to pay for this extra stability, which is time and effort. It takes time to link a nonlinear-equation solver into your ODE program, more time perhaps than you would use on a more primitive method that produced less reliable results. For this reason engineers use implicit methods sparingly. Even when they need to, they will use an in-between family of Runge-Kutta methods known as semi-implicit methods. As the name suggests, they are only half as onerous.

These come in two flavours, from what I have seen. Diagonally-implicit RK methods require you to identify each gradient in terms of itself, one gradient at a time.


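$$k_1 = F\left(x,\; y\right)$$

$$k_2 = F\left(x + c_2h,\; y + a_{22}hk_2\right)$$

$$k_3 = F\left(x + c_3h,\; y + a_{33}hk_3\right)$$

$$k_4 = F\left(x + c_4h,\; y + a_{44}hk_4\right)$$

$$k_5 = F\left(x + c_5h,\; y + a_{55}hk_5\right)$$

$$k_6 = F\left(x + c_6h,\; y + a_{66}hk_6\right)$$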

Another kind acts like a lower triangular matrix with nonzero diagonal elements. In this case, previously identified gradients are used to determine a single new gradient, so that at any stage only one gradient is being sought implicitly.


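$$k_1 = F\left(x,\; y\right)$$

$$k_2 = F\left(x + c_2h,\; y + a_{21}hk_1 + a_{22}hk_2\right)$$

$$k_3 = F\left(x + c_3h,\; y + a_{31}hk_1 + a_{32}hk_2 + a_{33}hk_3\right)$$

$$k_4 = F\left(x + c_4h,\; y + a_{41}hk_1 + a_{42}hk_2 + a_{43}hk_3 + a_{44}hk_4\right)$$

$$k_5 = F\left(x + c_5h,\; y + a_{51}hk_1 + a_{52}hk_2 + a_{53}hk_3 + a_{54}hk_4 + a_{55}hk_5\right)$$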

These procedures are all huge and work like gigantic number-crunching factories. If they seem economical and efficient, it’s only because the problems they are applied to are so much bigger again. Speaking for myself, I prefer a method that is small and easily coded into my spreadsheet and operates in a way that suggests cleverness instead of brute computational force. So to finish this essay I am going to return to where I started, a discussion of two-point methods and how to get the best out of them.

Two-point methods work by finding a weighted average of two gradients that accurately mimics the gradient across the entire time step. Usually the first gradient is taken at the left edge of the time step and the second is taken either at the midpoint of the time step or at the right edge. But there is no reason that we have to restrict ourselves to these two points. In fact, if I let go of these traditional points then I should be able to find two other points inside the region where the gradients average together perfectly and produce an errorless bridge across the time step. Such methods are called Gauss-Legendre methods.

Mathematically the process looks like this:

$$y_{n+1} = y_n + h\left(w_aF_a + w_bF_b\right)$$

where

$$F_a = F_n + ahF'_n + \frac{(ah)^2}{2}F''_n + \frac{(ah)^3}{6}F'''_n + \dots$$

$$F_b = F_n + bhF'_n + \frac{(bh)^2}{2}F''_n + \frac{(bh)^3}{6}F'''_n + \dots$$

Substituting and collecting terms:

$$y_{n+1} = y_n + hw_a\left(F_n + ahF'_n + \frac{(ah)^2}{2}F''_n + \frac{(ah)^3}{6}F'''_n\right) + hw_b\left(F_n + bhF'_n + \frac{(bh)^2}{2}F''_n + \frac{(bh)^3}{6}F'''_n\right)$$

$$y_{n+1} = y_n + hF_n\left(w_a + w_b\right) + h^2F'_n\left(aw_a + bw_b\right) + \frac12h^3F''_n\left(a^2w_a + b^2w_b\right) + \frac16h^4F'''_n\left(a^3w_a + b^3w_b\right)$$

Compare terms with the Taylor series for $y(x)$.

$$y_{n+1} = y_n + hF_n + \frac12h^2F'_n + \frac16h^3F''_n + \frac{1}{24}h^4F'''_n + \dots$$

$$w_a + w_b = 1$$

$$2aw_a + 2bw_b = 1$$

$$3a^2w_a + 3b^2w_b = 1$$

$$4a^3w_a + 4b^3w_b = 1$$

The first two conditions give the weights in terms of $a$ and $b$:

$$w_a = \frac12\cdot\frac{2b - 1}{b - a}, \qquad w_b = \frac12\cdot\frac{1 - 2a}{b - a}$$

Introduce that $w_a = w_b = \frac12$, and that $a = \frac12 - q$ and $b = \frac12 + q$ for some $q$ yet to be determined. This is because the location and weights of $F_a$ and $F_b$ should be symmetric if they are to be applicable to any arbitrary $F(x,y)$. Doing so reduces the problem to finding a single unknown.

$$w_a = \frac12\cdot\frac{2\left(\frac12 + q\right) - 1}{\left(\frac12 + q\right) - \left(\frac12 - q\right)} = \frac12, \qquad w_b = \frac12\cdot\frac{1 - 2\left(\frac12 - q\right)}{\left(\frac12 + q\right) - \left(\frac12 - q\right)} = \frac12$$

The third condition then reads $3\left(\frac14 + q^2\right) = 1$, which gives

$$q = \frac{1}{2\sqrt3}$$

$$a = \frac12 - \frac{\sqrt3}{6}, \qquad b = \frac12 + \frac{\sqrt3}{6}$$

With these values the fourth condition is satisfied automatically. Therefore

$$y_{n+1} = y_n + \frac{h}{2}\left(F_a + F_b\right)$$

where

$$F_a = F\left(x_n + ah,\; y(x_n + ah)\right)$$

$$F_b = F\left(x_n + bh,\; y(x_n + bh)\right)$$
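The four matching conditions can be checked numerically in a few lines. This is only a sanity check under the symmetric choice described above; the variable names are my own.

```python
import math

q = 1 / (2 * math.sqrt(3))
a, b = 0.5 - q, 0.5 + q
w_a = w_b = 0.5

# Left-hand sides of the four conditions; each should come out as 1.
conditions = [
    w_a + w_b,
    2 * (a * w_a + b * w_b),
    3 * (a**2 * w_a + b**2 * w_b),
    4 * (a**3 * w_a + b**3 * w_b),
]

# Matching one more Taylor term would need this to equal 1 as well.
fifth = 5 * (a**4 * w_a + b**4 * w_b)
print(conditions, fifth)
```

The first four come out as 1 to within rounding, while the fifth comes out as 35/36, which is where two points run out of steam.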

There is a problem with this, however. We don’t know what y is at any point except for at the left edge of the region. We have to approximate it in some way or other, which will introduce errors. However you have seen several ways of doing this now, and it is fitting that I should review them now, in this final stage of the essay.

One easy way would be to apply Euler’s method. But this is going to undermine the accuracy provided by the points $x_a$ and $x_b$, so it’s not a very good idea.

$$y_a = y_n + ahF_n, \qquad y_b = y_n + bhF_n$$

A better way would be to feed these initial estimates for $y_a$ and $y_b$ into the Trapezoidal rule to make better estimates.

$$y_a = y_n + \frac12ah\left(F_n + F_a\right), \qquad y_b = y_n + \frac12bh\left(F_n + F_b\right)$$

A better way still would have you repeating this correction stage two or three times to get really good estimates of $y_a$ and $y_b$.

But the method that maintains the highest level of accuracy is a fully implicit method developed by two theoreticians in 1955 (P C Hammer and J W Hollingsworth).

$$F_a = F\left(x_n + ah,\; y_n + h\left[\tfrac14F_a + \left(a - \tfrac14\right)F_b\right]\right)$$

$$F_b = F\left(x_n + bh,\; y_n + h\left[\left(b - \tfrac14\right)F_a + \tfrac14F_b\right]\right)$$

$$y_{n+1} = y_n + \frac{h}{2}\left(F_a + F_b\right), \qquad a = \frac12 - \frac{\sqrt3}{6}, \qquad b = \frac12 + \frac{\sqrt3}{6}$$
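To show that the apparent paradox of an implicit method can be lived with, here is a minimal sketch of the two-point scheme in which the two gradients are found by repeated substitution. Plain substitution is my own choice of solver and only behaves for small steps on gentle equations; a serious implementation would use something sturdier. The names `gauss_legendre_step` and `sweeps`, and the test equation $y' = y$, are assumptions.

```python
import math

SQRT3_6 = math.sqrt(3) / 6
A_NODE, B_NODE = 0.5 - SQRT3_6, 0.5 + SQRT3_6

def gauss_legendre_step(F, x, y, h, sweeps=20):
    """One implicit two-point step; Fa and Fb are refined together by substitution."""
    Fa = Fb = F(x, y)                    # initial guess: the left-edge gradient
    for _ in range(sweeps):
        Fa_new = F(x + A_NODE * h, y + h * (0.25 * Fa + (A_NODE - 0.25) * Fb))
        Fb_new = F(x + B_NODE * h, y + h * ((B_NODE - 0.25) * Fa + 0.25 * Fb))
        Fa, Fb = Fa_new, Fb_new          # update both at once, then go again
    return y + (h / 2) * (Fa + Fb)

# Illustrative test problem (an assumption): y' = y, y(0) = 1, so y(1) = e.
x, y, h = 0.0, 1.0, 0.1
for _ in range(10):
    y = gauss_legendre_step(lambda x, y: y, x, y, h)
    x += h
gl_error = abs(y - math.e)
print(gl_error)
```

On this test equation the error after ten steps is a few parts in ten million, from only two gradient locations per step, though each step pays for its twenty sweeps.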

As often happens, the best results summarise the ideas and creations of many people across many different times in history, doing seemingly unrelated things. Gauss and Legendre lived in the early 1800s. Runge and Kutta made their contributions circa 1900. Computerisation in the 1940s enabled the conception of implicit Runge-Kutta methods in the 1960s that ultimately brought us back to Gauss and Legendre. Developments in science rarely proceed in straight lines and so I find it fitting that a mathematical procedure that attempts to compensate for the curvature in a line segment itself has a history that is curved. More so, the development of RK methods branches into several paths through the middle of the twentieth century, including many that I haven’t even named (let alone investigated) in this essay. The subject is very deep and very broad and continues to grow in bulk volume as time goes by. I’m stopping here because 76 pages is enough and my own curiosity in the subject is satisfied. -DC

* Some references

Generally when I write these mathematical essays I speak ‘off-the-cuff’. However in this case I needed to refer to a couple of online resources, which I have referenced below. Wikipedia has a very good article on Runge-Kutta methods too. -DC

J C Butcher, ‘A history of Runge-Kutta methods’ (a PDF that is easily Googled)

M K Kadalbajoo, IIT Kanpur, lecture notes (part of a series): http://nptel.ac.in/courses/111104030/pdf_lectures/lecture17.pdf

[END]
