BRANCHING IN FORTRAN
J. G. Sullivan
Abstract
The execution time of Fortran programs can be improved by properly considering both the method used to cause a program branch and the position of the branch point. The discussion points out some general rules which can be applied to Fortran programs which will improve execution time.
LEGAL NOTICE
This report was prepared as an account of Government sponsored work. Neither the United States,
nor the Commission, nor any person acting on behalf of the Commission:
A. Makes any warranty or representation, expressed or implied, with respect to the accuracy,
completeness, or usefulness of the information contained in this report, or that the use of
any information, apparatus, method, or process disclosed in this report may not infringe privately owned rights; or
B. Assumes any liabilities with respect to the use of, or for damages resulting from the use of
any information, apparatus, method, or process disclosed in this report.
As used in the above, "person acting on behalf of the Commission" includes any employee or
contractor of the Commission, or employee of such contractor, to the extent that such employee
or contractor of the Commission, or employee of such contractor prepares, disseminates, or
provides access to, any information pursuant to his employment or contract with the Commission,
or his employment with such contractor.
Branching in Fortran
There are five methods available to the Fortran programmer for
choosing among paths for program execution. Four of them are, essentially,
performing a test of some variable and, depending on its value, branching to
alternate portions of the code. One, the GO TO n type statement where n is
a statement number (e.g., GO TO 273), is a branch with no test. The burden
of this discussion will be the implications of statements which test and
branch and the considerations necessary for the Fortran programmer to
efficiently choose among the abundance of possibilities.
The four methods are:
1) "Computed" GO TO (n1, n2, ..., nm), I; e.g., GO TO (1,2,3,43,7), I
2) "Assigned" GO TO; e.g., GO TO I, where I has been previously
defined by an ASSIGN statement.
3) Arithmetic IF; e.g., IF(A-B) 1, 2, 3
4) Logical IF; e.g., IF(A.GT.B) GO TO 100
As methods 1 and 2 are undoubtedly rarer in occurrence than methods
3 and 4, a few remarks will be made about them before devoting the bulk of the
discussion to methods 3 and 4.
GO TO STATEMENTS
First of all, the "assigned" GO TO does not differ logically from
the "computed" GO TO statement.
Example 1a:
      ASSIGN 200 TO ITRA
      IF (X-5.) 10, 20, 30
   10 CONTINUE
      - FORTRAN STATEMENTS -
      ASSIGN 50 TO ITRA
      GO TO 40
   20 CONTINUE
      - FORTRAN STATEMENTS -
      ASSIGN 73 TO ITRA
      GO TO 40
   30 IF (X-6.) 31, 40, 40
   31 CONTINUE
      - FORTRAN STATEMENTS -
      ASSIGN 371 TO ITRA
   40 GO TO ITRA
where the execution of statement 40 will cause a transfer of execution to
one of the following statements: 200, 50, 73, or 371.
The same coding, but using the "computed" GO TO, is illustrated
in the following coding:
Example 1b:
      ITRA = 1
      IF (X-5.) 10, 20, 30
   10 CONTINUE
      - FORTRAN STATEMENTS -
      ITRA = 2
      GO TO 40
   20 CONTINUE
      - FORTRAN STATEMENTS -
      ITRA = 3
      GO TO 40
   30 IF (X-6.) 31, 40, 40
   31 CONTINUE
      - FORTRAN STATEMENTS -
      ITRA = 4
   40 GO TO (200,50,73,371), ITRA
There is no logical difference in this usage, but the use of the "computed"
GO TO probably makes the coding a little easier to follow. One advantage
for choosing the "computed" form over the "assigned" form is that, during the
debugging phase, the "assigned" GO TO can give peculiar (and nonreproducible)
errors if there is a GO TO ITRA and this statement is executed prior to an
ASSIGN TO ITRA. Also, some "programmers" are prone to write arithmetic
assignment statements (ITRA = 200) when they should have written ASSIGN
200 TO ITRA. One disadvantage with using the "computed" GO TO during
debugging is that while the action taken at statement 40 in Example 1b
is clearly defined for ITRA = 1, 2, 3, or 4, various compilers react in
differing ways for ITRA < 1 and ITRA > 4. One's predilections must be the
determining factor here. So far as speed of execution is concerned, the use
of the "assigned" GO TO is definitely faster. This is evident from considering
that the "computed" GO TO must either do IF-like tests on ITRA
or else treat the addresses of the branch points as if they were an array
subscripted by ITRA. There also may be some "mickey-mouse" in taking care of
values of ITRA outside the range of transfers. The "assigned" GO TO is never
more than a load of ITRA and a branch to the address specified by the value of
ITRA. One might also note that if there are only three branch points, GO TO
(200,50,73), ITRA, then the IF(ITRA-2) 200, 50, 73 way of writing it is probably a
shade faster in execution. All in all, use of the "computed" GO TO is
possibly better during the debugging phase and the "assigned" GO TO is better
in the execution phase. (This, of course, only applies when methods 1 and 2
are used interchangeably as in Example 1.) If one, say, starts off a program
with:
      READ 1, ITRA
      GO TO (1, 2, 3, 4), ITRA
then use of the "assigned" GO TO is, of course, not applicable.
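One way to keep the out-of-range behavior of a "computed" GO TO out of the compiler's hands is to test the index before the branch. The following is only a minimal sketch and is not taken from the report: the program name GUARD, the statement numbers, and the loop that drives ITRA from 0 to 5 are illustrative assumptions. It is laid out so that it should be acceptable as either fixed-form or free-form source.

```fortran
      PROGRAM GUARD
!     Sketch: range-check ITRA before a computed GO TO so that
!     out-of-range values take a defined path (statement 80).
!     The driver loop and all statement numbers are assumptions.
      INTEGER ITRA, NBAD
      NBAD = 0
      DO 90 ITRA = 0, 5
      IF (ITRA .LT. 1 .OR. ITRA .GT. 4) GO TO 80
      GO TO (10, 20, 30, 40), ITRA
   10 PRINT *, ITRA, ' first branch point'
      GO TO 90
   20 PRINT *, ITRA, ' second branch point'
      GO TO 90
   30 PRINT *, ITRA, ' third branch point'
      GO TO 90
   40 PRINT *, ITRA, ' fourth branch point'
      GO TO 90
   80 NBAD = NBAD + 1
      PRINT *, ITRA, ' out of range, branch not taken'
   90 CONTINUE
      PRINT *, 'out-of-range values seen:', NBAD
      END PROGRAM GUARD
```

The explicit test costs one more comparison per pass, which is the kind of overhead the discussion above attributes to the "computed" form, so it is probably best kept to the debugging phase.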
IF STATEMENTS
The logical IF and the arithmetic IF can, in many instances, be
rather freely interchanged:
    1 CONTINUE                         1 CONTINUE
However, because of the greater complexity of the logical IF each method will
be taken up separately.
Method 3. ARITHMETIC IF
In this method there are, of course, two items for consideration:
1) the expression to be evaluated and tested and 2) the possible branch points.
Whenever any arithmetic operations are called for in the evaluation of an
expression (e.g., A-B) the action taken is essentially the same as that
performed on the right-hand side of the arithmetic replacement statement except
that no storing of the result occurs and the result is tested in whatever way
the computer hardware permits. When no arithmetic operation appears, then the
manner in which the test of the variable in storage is made is a function of
computer hardware. (Some computers can test the variable in memory directly;
others may have to load the variable into an accumulator or register of some
kind and then perform the test.) For illustrative purposes the 360 hardware
operations are used to describe the action taken and the mnemonics to be
used are defined as follows:
MNEMONIC,Variable   Action
LOAD,X              The variable, X, is loaded into the register.
ADD,X               The variable, X, is added to the register.
SUB,X               The variable, X, is subtracted from the register.
B,1                 Branch to statement 1.
BZ,1                Branch if the register is zero to statement 1; otherwise, proceed to the next instruction.
BP,1                Branch if the register is greater than 0 to statement 1; otherwise proceed.
BM,1                Branch on minus, etc.
BPZ,1               Branch on greater than or equal to zero, etc.
BMZ,1               Branch on minus or zero, etc.
BPM,1               Branch on greater than zero or less than zero, etc.
TEST                Test the register.
It is necessary to note that after an addition or subtraction the register
can be tested and the branches made without the necessity for a TEST instruction.
After a LOAD, a TEST instruction must be given before the branch can be made.
(Statement numbers will appear at the left of the mnemonic code.)
EXAMPLE:  LOAD,X     Explanation: Y is subtracted from X
          SUB,Y      and if the result is zero a
          BZ,1       branch to statement 1 is taken.
EXAMPLE:  LOAD,X     Explanation: X is loaded, tested, and
          TEST       if X is less than zero a branch to
          BM,1       statement 1 is taken.
EXAMPLE:  LOAD,X     Explanation: WRONG. The TEST must be
          BM,1       given after the LOAD as in the
                     preceding example.
Example 2
    9 IF(A-B) 1, 2, 3
                     Timings in μsec for 360/75
    9: LOAD,A            .7
       SUB,B             .7
       BM,1             1.0
       BZ,2             1.0
       BP,3             1.0
Finding the amount of computer time to execute statement 9 is as follows:
    Case       Time (μsec)
    A < B         2.4
    A = B         3.4
    A > B         4.4
Thus, considerable differences in execution time occur. In fact, the
execution time for the case where A > B is almost double that for the case
where A < B. Compilers have no information upon which to base the order in
which the tests should be "most optimally" performed and will usually translate
this IF in a mechanical way (BM, BZ, BP) or (BP, BZ, BM), etc. Even if the
programmer knows that A is usually greater than B, he must also know
the order in which the compiler will set up the tests. There is no logical
difference in the following rearrangement of statement 9:
    9 IF(B-A) 3, 2, 1
yet this arrangement might take about half the time of the statement as
originally written!
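The arithmetic behind these figures can be laid out as a short calculation. The sketch below assumes the Example 2 timings (0.7 μsec each for the load and the subtract, 1.0 μsec for each branch instruction tried) and a compiler that tests minus, zero, plus in that order. The outcome frequencies in P are purely hypothetical and are there only to show how an average would be formed; none of the names come from the report.

```fortran
      PROGRAM TIMES
!     Sketch: time per outcome of a three-way test whose branches
!     are tried one after another, and the average time for two
!     ways of writing the IF. P holds ASSUMED frequencies.
      REAL T(3), P(3), AVG1, AVG2
      INTEGER K
!     T(K): load (0.7) + subtract (0.7) + K branches at 1.0 each
      DO 10 K = 1, 3
      T(K) = 0.7 + 0.7 + 1.0 * K
   10 CONTINUE
!     ASSUMED frequencies of A.LT.B, A.EQ.B, A.GT.B
      P(1) = 0.1
      P(2) = 0.1
      P(3) = 0.8
      AVG1 = 0.0
      AVG2 = 0.0
      DO 20 K = 1, 3
!     IF(A-B) 1,2,3: outcome K is found on the K-th branch
      AVG1 = AVG1 + P(K) * T(K)
!     IF(B-A) 3,2,1: outcome K is found on branch number 4-K
      AVG2 = AVG2 + P(K) * T(4 - K)
   20 CONTINUE
      PRINT *, 'time per outcome     :', T
      PRINT *, 'average, IF(A-B)1,2,3:', AVG1
      PRINT *, 'average, IF(B-A)3,2,1:', AVG2
      END PROGRAM TIMES
```

Under these assumed frequencies the averages come to about 4.1 and 2.7 μsec, which is the sense in which reversing the test can pay off when A is usually greater than B.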
Take the most extreme example:
Example 3
    9 IF(A) 1, 2, 3
                     Timing (μsec)
    9: LOAD,A            .7
       TEST              .4
       BM,1             1.0
       BZ,2             1.0
       BP,3             1.0
Here the time taken for A < 0. is 2.1 μsec and the time taken
for A > 0 is 4.1 μsec, a considerable difference.
A complication (either helpful or harmful so far as the total
execution time) appears when one of the branch points immediately follows
the IF. For example:
Example 4
    9 IF(A-B) 1, 2, 3
    2 CONTINUE
This could be translated:
Example 4a.
       SUB,B             .7
       BM,1             1.0
       BZ,2             1.0
       BP,3
    2: etc.
       SUB,B
    2: etc.
Timings (μsec)
Example:    4a.    4b.
In most cases, compilers make some use of the fact that the following
statement may be one of the IF statement branch points, so a gain in
execution time (or, at least, no loss) is possible if a branch point of an
IF immediately follows the IF statement.
In order to see how difficult life can be, make the following
assumption: the A > B branch is most likely. If we write:
Example 5
    9 IF(B-A) 3, 2, 1
    3 CONTINUE
which, translated, might be:
       LOAD,B            .7
       SUB,A             .7
       BZ,2             1.0
       BP,1             1.0
    3: etc.
The time to reach statement 3 is 3.4 μsec.
If we were to move statement 3 away from the IF test:
Example 5a
    9 IF(B-A) 3, 2, 1
    4 CONTINUE
and, if the compiler makes the minus test first, the most frequently
followed path is performed in 2.4 μsec.
In many cases two of the branch points of the arithmetic
IF may be identical:
      IF(A-B) 1, 2, 1
This is generally translated as:
       LOAD,A
       SUB,B
       BPM,1
       BZ,2
The same sort of consideration holds as in the discussion above where the
branch points were different. One test has still to be performed first and
a proper arrangement of statement locations (either following the IF or not
following the IF) must be undertaken for maximum efficiency.
Summary: In order to determine how to minimize execution time
the following must be considered:
1) Should the test be reversed?
2) Should one of the branch points follow the IF?
3) If the answer to 2) is yes, which branch point should
be chosen?
4) Look at the machine language coding actually produced and
discover whether the compiler has "outsmarted" you!
Method 4. LOGICAL IF
The general form of this IF allows several logical expressions to
appear in what looks like one great test but the programmer should bear
carefully in mind that the order in which the expressions appear may be quite
important:
EXAMPLE: IF(A.GT.B.OR.C.LT.D) GO TO 100
This statement implies that if either A > B or C < D the branch to statement
100 should be performed. It is a bit simpler for discussion purposes to break
this IF into two IF statements performing exactly the same logical function:
      IF(A.GT.B) GO TO 100
      IF(C.LT.D) GO TO 100
This is essentially the way in which most compilers will break up the original
IF before translating to machine language. There is no strict Fortran rule
that the tests are performed in order from left to right so some compilers,
in the translating process, could just as well reverse the order:
      IF(C.LT.D) GO TO 100
      IF(A.GT.B) GO TO 100
Naturally, if the result of one of the two tests, say A > B, is expected to be
true a large percentage of the time, it behooves the programmer to have this
test performed first. Unless the characteristics of the compiler are known
(a process of compiling a few sample IF's and looking at the machine language
result), it is generally wiser to break complicated IF's into their simpler
form to control the order of testing.
For example:
      IF(A.GT.B.AND.C.LT.D.OR.I.GE.0) GO TO 100
This is, logically, really three IF statements:
      IF(A.LE.B) GO TO 1
      IF(C.LT.D) GO TO 100
    1 IF(I.GE.0) GO TO 100
Obviously, if I is going to be zero or positive a large percentage of the
time the test as written could take on the order of three times as long as
it needs to take in actual situations. While the simplicity of stringing
the various logical expressions has some advantage in instantaneous recognition
of what the code is doing, the application of a little knowledge concerning
the probable outcome of the tests could provide overriding advantages in the
reduction of execution time.
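The "three times as long" estimate can be checked by counting comparisons. The sketch below is an illustration only: it assumes a compiler that makes the tests strictly in the order written, and it uses made-up data in which A > B holds, C < D fails, and I is negative on one pass in ten. The counters N1 and N2 and the program name are hypothetical.

```fortran
      PROGRAM ORDER
!     Sketch: count the comparisons made by two orderings of the
!     tests behind IF(A.GT.B.AND.C.LT.D.OR.I.GE.0) GO TO 100.
!     ASSUMED data: A.GT.B holds, C.LT.D fails, I rarely negative
      REAL A, B, C, D
      INTEGER I, N1, N2
      A = 2.0
      B = 1.0
      C = 4.0
      D = 3.0
      N1 = 0
      N2 = 0
      DO 50 I = -1, 8
!     Ordering 1: A.LE.B, then C.LT.D, then I.GE.0
      N1 = N1 + 1
      IF (A .LE. B) GO TO 10
      N1 = N1 + 1
      IF (C .LT. D) GO TO 20
   10 N1 = N1 + 1
   20 CONTINUE
!     Ordering 2: the likely test, I.GE.0, is made first
      N2 = N2 + 1
      IF (I .GE. 0) GO TO 50
      N2 = N2 + 1
      IF (A .LE. B) GO TO 50
      N2 = N2 + 1
   50 CONTINUE
      PRINT *, 'comparisons, tests as written :', N1
      PRINT *, 'comparisons, likely test first:', N2
      END PROGRAM ORDER
```

With this data the counts are 30 comparisons for the tests as written and 12 with the likely test first. Other data would give other ratios, so the figures show the direction of the effect and nothing more.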
The remarks made in the discussion of Method 3 concerning either
simplifying or removing IF tests from innermost DO loops apply exactly the
same as for Method 4.
One question which a programmer should ask is:
How do the following sets of coding compare?
a)    IF(A-B) 1, 100, 100
    1 CONTINUE
b)    IF(A.GE.B) GO TO 100
    1 CONTINUE
In most cases, the resulting code would be comparable but a look at a
few examples of the compiler output would point out any special cases
where a time advantage could be achieved.
One detail concerns either type of IF statement imbedded inside
nested DO loops: the LOAD and TEST will take 1.1 μsec and the LOAD and SUB
will take 1.4 μsec. The following example illustrates possible gains:
Example 6
BEFORE:
      DO 1 I=1,100
      DO 1 J=1,100
      DO 1 K=1,100
      IF(FLAG-1.) 10,20,10
   10 CONTINUE
      - FORTRAN STATEMENTS -
      GO TO 1
   20 CONTINUE
      - FORTRAN STATEMENTS -
    1 CONTINUE
AFTER:
Example 6a
      FLAGA=FLAG-1.
      DO 1 I=1,100
      DO 1 J=1,100
      DO 1 K=1,100
      IF(FLAGA) 10, 20, 10
      etc.
In fact, depending on the complexity of the loop and on the relative
amount of time spent in performing the test as contrasted to the
remainder of the calculations in the innermost loop, it is entirely
possible that the following can be much faster:
Example 6b
      IF(FLAG-1.) 100, 200, 100
  100 DO 101 I=1,100
      DO 101 J=1,100
      DO 101 K=1,100
      CONTINUE
      - FORTRAN STATEMENTS -
  101 CONTINUE
      GO TO 7
  200 DO 201 I=1,100
      DO 201 J=1,100
      DO 201 K=1,100
      CONTINUE
      - FORTRAN STATEMENTS -
  201 CONTINUE
    7 CONTINUE
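A runnable version of this rearrangement makes the saving in tests concrete. The sketch below is not the report's code: the loop bodies are stand-ins (adding 1 or 2 to a counter), the statement numbers are arbitrary, and a logical IF is used in place of the arithmetic IF only so that current compilers accept it without complaint. NT1 and NT2 count how often FLAG is tested in each form.

```fortran
      PROGRAM HOIST
!     Sketch: test FLAG once outside the loop nest instead of on
!     every innermost pass. Loop bodies are ASSUMED placeholders.
      REAL FLAG
      INTEGER I, J, K, IS1, IS2, NT1, NT2
      FLAG = 1.0
      IS1 = 0
      IS2 = 0
      NT1 = 0
      NT2 = 0
!     Form 1: FLAG is tested on every innermost pass
      DO 13 I = 1, 100
      DO 12 J = 1, 100
      DO 11 K = 1, 100
      NT1 = NT1 + 1
      IF (FLAG .EQ. 1.0) GO TO 10
      IS1 = IS1 + 1
      GO TO 11
   10 IS1 = IS1 + 2
   11 CONTINUE
   12 CONTINUE
   13 CONTINUE
!     Form 2: FLAG is tested once and selects one of two nests
      NT2 = NT2 + 1
      IF (FLAG .EQ. 1.0) GO TO 200
      DO 103 I = 1, 100
      DO 102 J = 1, 100
      DO 101 K = 1, 100
      IS2 = IS2 + 1
  101 CONTINUE
  102 CONTINUE
  103 CONTINUE
      GO TO 300
  200 DO 203 I = 1, 100
      DO 202 J = 1, 100
      DO 201 K = 1, 100
      IS2 = IS2 + 2
  201 CONTINUE
  202 CONTINUE
  203 CONTINUE
  300 PRINT *, 'results:', IS1, IS2
      PRINT *, 'tests  :', NT1, NT2
      END PROGRAM HOIST
```

Both forms produce the same result, while the number of tests drops from 1,000,000 to 1. The price is that the loop nest is written out twice, which is presumably why the text ties the gain to how much other work the innermost loop does.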
Just as constant sub-expressions in arithmetic replacement
statements should be removed from inner loops, so also should IF
expressions be simplified.
      DO 1 I=1,10
      DO 1 J=1,20
      Z = A(I,J)+B(I,J)
      DO 1 K=1,100
      IF(Z-X(I,J,K)+1.) 2, 3, 4
      etc.
should be rewritten as:
      DO 1 I=1,10
      DO 1 J=1,20
      Z = A(I,J)+B(I,J)
      TESTZ = Z + 1.
      DO 1 K=1,100
      IF(TESTZ-X(I,J,K)) 2, 3, 4
      etc.
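As a small check on this kind of rewrite, the two forms can be run side by side to confirm that they select the same branch. The sketch below uses a single loop and a one-dimensional array X filled with small whole numbers, all of which are assumptions made so that the arithmetic is exact; with general floating-point data the two expressions could round differently. M1 and M2 record which of the three branches each form would take.

```fortran
      PROGRAM INVAR
!     Sketch: move the loop-invariant "+1." out of the tested
!     expression and confirm both forms pick the same branch.
!     The data in X and Z are ASSUMED, chosen to be exact.
      REAL X(100), Z, TESTZ, V
      INTEGER K, M1, M2, NDIFF
      Z = 10.0
      NDIFF = 0
      DO 5 K = 1, 100
      X(K) = REAL(K)
    5 CONTINUE
!     The invariant part is formed once, before the loop
      TESTZ = Z + 1.0
      DO 20 K = 1, 100
!     Original form: two arithmetic operations per pass
      V = Z - X(K) + 1.0
      M1 = 2
      IF (V .LT. 0.0) M1 = 1
      IF (V .GT. 0.0) M1 = 3
!     Rewritten form: one arithmetic operation per pass
      V = TESTZ - X(K)
      M2 = 2
      IF (V .LT. 0.0) M2 = 1
      IF (V .GT. 0.0) M2 = 3
      IF (M1 .NE. M2) NDIFF = NDIFF + 1
   20 CONTINUE
      PRINT *, 'passes where the two forms disagree:', NDIFF
      END PROGRAM INVAR
```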
All timings given here are approximate and depend on various conditions
and assumptions as to average use of the computer. The memory timings used
here are from IBM SYSTEM/360 MODEL 75 FUNCTIONAL CHARACTERISTICS, Form
A22-6889-0. Repeated attempts to obtain more detailed information from
IBM concerning the assumptions used have met with failure. No timing
given here may be taken as correct but the general considerations are correct.