
Chapter 3

Parallel Computing

As we have discussed in the Processor module, over the past few decades there has been great progress in computer speed: indeed, a 20-million-fold increase over a fifty-year period.

This is mainly because more and more transistors have been integrated into a silicon chip: from a few to tens (SSI), to hundreds (MSI), to thousands (LSI), and to billions (VLSI).


Moore’s law

This phenomenon is nicely summarized by Moore's law: the number of transistors placed on a chip doubles every eighteen months.

For example, the Intel 8086, a processor chip made by Intel in 1978, contained 29,000 transistors and ran at 5 MHz; the Intel Core 2 Duo, introduced in 2006, contained 291 million transistors and ran at 2.93 GHz.

Thus, during those 28 years, the number of transistors went up by about 10,034 times, i.e., it doubled once every 24 months, or two years.
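
As a quick sanity check on this arithmetic, here is a small Python computation using the transistor counts quoted above:

import math

growth = 291_000_000 / 29_000  # 8086 to Core 2 Duo: about a 10,034-fold increase
doublings = math.log2(growth)  # about 13.3 doublings over the 28 years
print(28 / doublings)          # about 2.1 years per doubling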


A picture is worth how many words?

More importantly, this increase in transistors directly leads to an increase in computer speed. In this case, the speed went up by 586 times during this period.

[Chart: computer speed versus transistor count over this period.]


Not just the speed...

Moreover, besides processing speed, some of the other capabilities of many digital electronic devices are also strongly connected to Moore's law: memory capacity, sensors, and even the number and size of pixels in digital cameras. As a result, all of these technologies have been improving at this same stunning exponential rate.

Since Moore's law precisely describes a driving force of technological and social change over the past thirty or so years, it has been used to guide long-term planning and to set targets for research and development.


A dead end?

Unfortunately, this era of steady and rapid growth of single-processor performance over 30 years is essentially over, because

• By "doubling every eighteen months," we have to make the wires √2 times thinner every eighteen months. This has to come to an end at some point, since we can't make the wires infinitely thin.

• Although every transistor produces only a tiny bit of heat, when you put billions of them into a tiny space, the amounts do add up, approaching the heat density at the surface of the Sun.

• We have also essentially done our best to dig out all the benefits of a complicated single-processor architecture.


What to do?

Fortunately, Moore's law is not completely out the window yet: it is predicted to continue for another five years or so.

These transistors will no longer be used to construct a single processor, but to increase the number of independent processors on a single chip. We will then try to speed up the whole computation by letting those independent processors work on the data in parallel.

An analogy: in ancient times, we could only cook one thing at a time on our old-fashioned stoves.


Nowadays, with a contemporary stove, we can cook many different dishes in parallel, i.e., at the same time, which certainly saves time.

Similarly, we could cut up a big problem into many smaller ones, and run them in parallel on multiple processors. Could we?
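
To make this concrete, here is a minimal sketch in Python of cutting one big problem (summing a long list of numbers) into smaller pieces and running the pieces on multiple processors; the task and the chunk count here are illustrative choices, not part of these notes:

from multiprocessing import Pool

def partial_sum(chunk):
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(8_000_000))
    n = 4                                      # cut the problem into four pieces
    size = len(data) // n
    chunks = [data[i * size:(i + 1) * size] for i in range(n)]
    with Pool(n) as pool:                      # the four pieces run in parallel
        total = sum(pool.map(partial_sum, chunks))
    print(total == sum(data))                  # True: same answer as the sequential sum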


They are happening everywhere...

Indeed, we can find many examples of "parallel computing" in our work and/or life: multiple galaxies running in the Universe, multiple lanes on I-93, multiple gas pumps at most gas stations, etc.


It is difficult...

It all sounds good, but it is not that easy.

In the cooking example, a good chef knows that she will not always cook everything at the same time. To cook the dish of, e.g., Pepper, Onions and Pork, she has to fry the pepper and the pork first, which can be done at the same time; then fry the onion, which is mixed with the partially fried pepper and pork.

In the multiple-lane case, although the cars in different lanes can go forward in parallel, the cars in the same lane have to go forward in turn.

It is the same idea when computing in parallel: you have to figure out which parts can be done in parallel, and which have to be done in sequence, as in the sketch below.
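
As a small illustration, here is a hedged Python sketch of the chef's schedule above; the task names follow the cooking example, and the timings are made-up stand-ins for the real work:

from concurrent.futures import ThreadPoolExecutor
import time

def fry(item, seconds):
    time.sleep(seconds)                  # stand-in for the actual frying
    return "fried " + item

with ThreadPoolExecutor() as pool:
    # The pepper and the pork have no dependency, so they fry in parallel.
    pepper = pool.submit(fry, "pepper", 2)
    pork = pool.submit(fry, "pork", 2)
    # The onion step needs both results, so it has to run afterwards.
    base = [pepper.result(), pork.result()]
    print(fry("onion", 1), "mixed with", base)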


An example

We have been using computers to do course registration for quite a few years now. When adding somebody to a class, a program has to make sure, among other things, that the total number of students added to the class is no more than the cap of that class, 25 for ours.

If we run the course adds sequentially, i.e., one by one, this is what the program will do to add another student into this class:

if the current number < 25
then add this student

Thus, before we add in another student, we always check the cap.


The parallel case

Since the above add consists of two steps, one check and one add, when we try to process multiple requests at the same time, we might get into trouble, since we don't know in what order the steps will get mixed up.

For example, suppose there are 24 students signed up for this course, and two more students come to add the course.

What is to happen?


This will.

If we do the adds in parallel, and it happens that the arrangement of the two steps for the two adds looks like the following:

Request 1             time   Request 2
Check the number      t1
(still 24)
                      t2     Check the number
                             (still 24)
Add in student        t3
(now 25)
                      t4     Add in student
                             (now 26)

Thus, as the above chart shows, we will add more students than the cap allows.
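
Here is a minimal runnable sketch of this race in Python, together with the standard lock-based fix; the names (CAP, enrolled) and the sleep that widens the window between the two steps are illustrative assumptions, not part of any real registration system:

import threading
import time

CAP = 25
enrolled = 24
lock = threading.Lock()

def add_student_unsafe():
    global enrolled
    if enrolled < CAP:        # step 1: check the number
        time.sleep(0.01)      # the other request can check here and also see 24
        enrolled += 1         # step 2: add in the student

def add_student_safe():
    global enrolled
    with lock:                # the check and the add become one atomic action
        if enrolled < CAP:
            enrolled += 1

t1 = threading.Thread(target=add_student_unsafe)
t2 = threading.Thread(target=add_student_unsafe)
t1.start(); t2.start()
t1.join(); t2.join()
print(enrolled)               # typically 26: both requests saw "still 24"

With add_student_safe in place of add_student_unsafe, the count stops at 25, because only one request at a time can hold the lock while it checks and adds.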


Software is really hard

Although we have been working with parallel computer hardware for a long time, since the late 1960s, programming it is really difficult, as we have to take care of the communication and coordination issues between the multiple processors; just as when we do conference calls, we want to make sure that only one person speaks at a time.

In other words, the difficulty lies in the software part, even though we can come up with lots of cheap hardware parts.


How fast could it be?

The natural expectation for the speed-up from parallelization would be linear: if you put in a two-lane highway, then two cars can go through the toll booth at the same time, and if you put in a four-lane one, then four cars can pay tolls in parallel.

That is why we often put in multiple toll booths, e.g., at Exit 11 on I-93. This does not happen with parallel computing, however: very few parallel algorithms achieve linear speed-up.

Most of them show a near-linear speed-up for small numbers of processing elements, which degrades to a constant value for large numbers of processing elements.


Here is the limit

The potential speed-up of a parallel algorithm on a parallel computer is given by Amdahl's law, established in the 1960s by Gene Amdahl.

When a big problem is cut into a bunch of smaller ones, some of them can run in parallel, while the others have to run in sequence; it is then the latter that decides the overall speed-up available from parallelization.

This relationship is given by the equation:

S = 1 / (1 − P),

where S is the speed-up of the program, as a factor of its original sequential runtime, and P is the fraction that can be run in parallel.


An example

If we cut the problem into ten pieces, nine of which can run in parallel while one piece can't, we have

P = 90%, so 1 − P = 10%.

Amdahl's law then tells us that

S = 1 / (1 − 0.9) = 1 / 0.1 = 10.

In other words, at most we can speed it up 10 times, no matter how many processors we throw in.

This result thus puts an upper limit on the usefulness of adding more parallel execution units.
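
To see the limit concretely, here is a small sketch using the more general, finite-processor form of Amdahl's law, S(N) = 1 / ((1 − P) + P/N), which approaches the 1 / (1 − P) bound above as N grows:

def speedup(p, n):
    # Amdahl's law with parallel fraction p and n processors
    return 1.0 / ((1.0 - p) + p / n)

for n in (1, 2, 4, 16, 64, 1024):
    print(n, round(speedup(0.9, n), 2))
# prints 1.0, 1.82, 3.08, 6.4, 8.77, 9.91: capped near 10 no matter how large n gets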

One way to put it: "The bearing of a child takes nine months, no matter how many women are assigned."


Discussion topics

• Do some further research on Amdahl's law, and share your findings with us in layman's language.

• What are some of the successful applications of this multi-processing idea in parallel computing? Give some details: What is it? Why do we do it in parallel? What are the benefits, compared with sequential computing?

• In your life, study and/or work, have you ever applied the multi-processing strategy, i.e., done multiple things at one time? If yes, give us some examples: What is the problem? How did you cut it into smaller problems? Can all these smaller ones be run in parallel? If not, how do you coordinate them?
