The Centre for Australian Weather and Climate Research A partnership between CSIRO and the Bureau of Meteorology Basic Intel software requirements for Bureau applications specialists for development and optimisation on SUN HPC platform Ilia Bermous Senior ITO, CAWCR 22 January 2010




Stages in Optimisation Process

Compilation and building an executable

Execution

Optimisation or code restructuring for better performance


Compilation and Building Executable (1)

Are any static analysis tools similar in flavour to Cray cflint available?

The user should be able to identify easily what has been, and how it has been,

optimised

vectorised

parallelised

by the compiler. An example is the transformation listing and the formatted listing provided with the NEC SX compilers and cross-compilers.


Example

      subroutine vec_par(a,b,c,n,m)
      real, dimension (n,n,n) :: a,b,c
!cdir concur
      do i=1,n
        do k=1,m
          do j=1,n
            a(k,j,i)=b(k,j,i) + c(k,j,i)
          end do
        end do
      end do
      end

The "!cdir concur" line is a parallelisation directive.


Transformation Listing

.      do i = 1, n
.       if (n .gt. 0) then
.        J1 = and(n,3)
.        do j = 1, J1
. !CDIR NODEP
.         do k = 1, m
.          a(k,j,i) = b(k,j,i) + c(k,j,i)
.         end do
.        end do
.        do j = 1, n/4
. !CDIR NODEP
.         do k = 1, m
.          a(k,(j-1)*4+J1+1,i) = b(k,(j-1)*4+J1+1,i) + c(k,(j-1)*
.     1    4+J1+1,i)
.          a(k,(j-1)*4+2+J1,i) = b(k,(j-1)*4+2+J1,i) + c(k,(j-1)*
.     1    4+2+J1,i)
.          a(k,(j-1)*4+3+J1,i) = b(k,(j-1)*4+3+J1,i) + c(k,(j-1)*
.     1    4+3+J1,i)
.          a(k,j*4+J1,i) = b(k,j*4+J1,i) + c(k,j*4+J1,i)
.         end do
.        end do
.       endif
.      end do
.      end do

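Two transformations are visible in the listing: loop interchange (the k loop is now the innermost loop) and loop unrolling (the j loop is unrolled by four, preceded by a remainder loop of and(n,3) iterations).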
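The transformation shown in the listing can be checked on small cases. The following is a minimal sketch in Python rather than Fortran; the function names `original`, `transformed` and `make_input` are hypothetical, and the arrays are modelled as dictionaries keyed by the 1-based indices (k, j, i). It mirrors the interchanged and unrolled loop nest from the listing and confirms that it fills the same elements with the same values as the source loop nest.

```python
import random


def original(b, c, n, m):
    """Loop nest as written in the source: i outermost, then k, then j."""
    a = {}
    for i in range(1, n + 1):
        for k in range(1, m + 1):
            for j in range(1, n + 1):
                a[k, j, i] = b[k, j, i] + c[k, j, i]
    return a


def transformed(b, c, n, m):
    """Loop nest as in the transformation listing: j and k interchanged,
    j unrolled by four, with a remainder loop of J1 = and(n,3) iterations."""
    a = {}
    for i in range(1, n + 1):
        if n > 0:
            j1 = n & 3                          # and(n,3), i.e. n mod 4
            for j in range(1, j1 + 1):          # remainder loop
                for k in range(1, m + 1):
                    a[k, j, i] = b[k, j, i] + c[k, j, i]
            for j in range(1, n // 4 + 1):      # unrolled loop
                for k in range(1, m + 1):
                    # The four unrolled statements, one per column index.
                    for col in ((j - 1) * 4 + j1 + 1,
                                (j - 1) * 4 + 2 + j1,
                                (j - 1) * 4 + 3 + j1,
                                j * 4 + j1):
                        a[k, col, i] = b[k, col, i] + c[k, col, i]
    return a


def make_input(n, m, seed):
    """Random test array with indices k = 1..m, j = 1..n, i = 1..n."""
    rng = random.Random(seed)
    return {(k, j, i): rng.random()
            for i in range(1, n + 1)
            for j in range(1, n + 1)
            for k in range(1, m + 1)}


if __name__ == "__main__":
    m = 3
    for n in range(0, 10):
        b, c = make_input(n, m, 1), make_input(n, m, 2)
        assert original(b, c, n, m) == transformed(b, c, n, m)
    print("transformed loop nest matches the original for n = 0..9")
```

Running the check over n = 0..9 exercises every remainder count (0 to 3) of the unrolled loop.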


Formatted Listing

        .
        .
     4: P------>       do i=1,n
     5: |X----->         do k=1,m
     6: ||+---->           do j=1,n
     7: |||                  a(k,j,i)=b(k,j,i) + c(k,j,i)
     8: ||+----            end do
     9: |X-----          end do
    10: P------        end do
        .
        .

Loops 5-9 are interchanged and vectorised.

Loop 4-10 is parallelised.


Compilation and Building Executable (2)

Requirement for more robust Fortran/C compilers. We have been affected by compiler bugs: as soon as we started to use the Fortran compiler for our applications, an optimisation bug was detected, and a number of other compiler problems have also been reported to Intel.

Current Intel compiler versions still have too many bugs

According to a recent report (*), 169 bugs were fixed in the latest (11th) version of the Fortran compiler. In my view, some of them are very dangerous.

(*) http://software.intel.com/en-us/articles/intel-professional-edition-compilers-111-fixes-list/

Release notes for each compiler revision should include a section stating what has actually been implemented in that revision.


Execution

At the end of execution, the important performance characteristics should be readily available, so that the user can identify whether the application has run efficiently or not.

On NEC SX, for any Fortran application, the following characteristics are printed out when an environment variable is set, without any impact on the application performance:

MFLOPS

Vector Operation Ratio and Average Vector Length

Instruction/Operand Cache miss time

Bank Conflict time

The NEC ftrace tool provides similar performance characteristics for any program unit compiled with the special “-ftrace” option.

With code directives, this information can be obtained for any code section, starting and ending anywhere in any program unit.
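The idea of reporting figures for an arbitrary code section can be illustrated independently of ftrace. The following is a minimal sketch in Python, not the NEC mechanism: `region` and `report` are hypothetical names, and the floating-point operation count is assumed to be supplied by hand, whereas a tool such as ftrace obtains it from the hardware.

```python
import time
from contextlib import contextmanager


@contextmanager
def region(name, flop_count, report):
    """Time the enclosed code section and store elapsed seconds and MFLOPS.

    flop_count is an estimate supplied by the caller (an assumption of this
    sketch); report is a plain dict that collects one entry per section.
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        mflops = flop_count / elapsed / 1e6 if elapsed > 0 else float("inf")
        report[name] = {"seconds": elapsed, "mflops": mflops}


if __name__ == "__main__":
    report = {}
    n = 200_000
    x = [1.0] * n
    y = [2.0] * n
    # The section may start and end anywhere; here it wraps one update loop.
    with region("axpy", 2 * n, report):      # one multiply and one add per element
        y = [2.5 * xi + yi for xi, yi in zip(x, y)]
    for name, stats in report.items():
        print(f"{name}: {stats['seconds']:.6f} s, {stats['mflops']:.1f} MFLOPS")
```

The point of the sketch is the user experience being asked for: a section is marked at its start and end, and the figures are reported without any further post-processing.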


Program Information Output

Global Data of 3 processes  :            Min [U,R]             Max [U,R]         Average
==========================

Real Time (sec)             :        544.678 [0,1]         554.766 [0,2]         549.728
User Time (sec)             :       3383.378 [0,0]        3598.739 [0,2]        3479.353
System Time (sec)           :         14.129 [0,1]          14.478 [0,2]          14.305
Vector Time (sec)           :        334.675 [0,1]         346.198 [0,0]         340.617
Instruction Count           :    38739990085 [0,1]     40868170971 [0,0]     40022002145
Vector Instruction Count    :     7456076063 [0,1]      7942498968 [0,2]      7725412162
Vector Element Count        :   997371328007 [0,1]   1069853462653 [0,2]   1031688872790
FLOP Count                  :   337475560162 [0,0]    342575608843 [0,2]    339235679815
MOPS                        :        297.648 [0,1]         313.572 [0,0]         305.847
MFLOPS                      :         95.193 [0,2]          99.745 [0,0]          97.547
Average Vector Length       :        132.153 [0,0]         134.700 [0,2]         133.540
Vector Operation Ratio (%)  :         96.881 [0,0]          97.050 [0,2]          96.963
Memory size used (MB)       :      13040.000 [0,0]       13056.000 [0,1]       13050.667
MIPS                        :         11.210 [0,1]          12.079 [0,0]          11.510
Instruction Cache miss (sec):         23.864 [0,1]          24.546 [0,2]          24.153
Operand Cache miss (sec)    :         25.692 [0,0]          26.588 [0,2]          26.193
Bank Conflict Time (sec)    :          8.762 [0,0]          11.791 [0,2]           9.997
Max. Concurrent Processes   :              8 [0,0]               8 [0,0]               8
MOPS (concurrent)           :       2090.405 [0,0]        2141.803 [0,2]        2114.935
MFLOPS (concurrent)         :        664.944 [0,0]         693.459 [0,1]         674.666
MIPS (concurrent)           :         78.606 [0,2]          80.524 [0,0]          79.564
Event Busy Count            :              0 [0,0]               0 [0,0]               0
Event Wait (sec)            :          0.000 [0,0]           0.000 [0,0]           0.000
Lock Busy Count             :          35636 [0,2]           39030 [0,0]           36770
Lock Wait (sec)             :          2.106 [0,0]           2.487 [0,1]           2.331
Barrier Busy Count          :              0 [0,0]               0 [0,0]               0
Barrier Wait (sec)          :          0.000 [0,0]           0.000 [0,0]           0.000
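Several of the rates in the program information output above follow from the raw counts. The sketch below, in Python, recomputes them from the Average column. The definitions used are assumptions, not taken from the slides: an "operation" is counted as a scalar instruction or a vector element, and the rates are taken per second of user time. The names `AVERAGE`, `REPORTED` and `derived_metrics` are hypothetical.

```python
# Average column of the program information output.
AVERAGE = {
    "user_time_s": 3479.353,
    "instruction_count": 40022002145,
    "vector_instruction_count": 7725412162,
    "vector_element_count": 1031688872790,
    "flop_count": 339235679815,
}

# Rates as printed in the same column, for comparison.
REPORTED = {
    "MOPS": 305.847,
    "MFLOPS": 97.547,
    "MIPS": 11.510,
    "Average Vector Length": 133.540,
    "Vector Operation Ratio (%)": 96.963,
}


def derived_metrics(d):
    """Recompute the rates under the assumed definitions described above."""
    # Scalar instructions plus vector elements (assumed meaning of "operation").
    operations = (d["instruction_count"]
                  - d["vector_instruction_count"]
                  + d["vector_element_count"])
    return {
        "MOPS": operations / d["user_time_s"] / 1e6,
        "MFLOPS": d["flop_count"] / d["user_time_s"] / 1e6,
        "MIPS": d["instruction_count"] / d["user_time_s"] / 1e6,
        "Average Vector Length":
            d["vector_element_count"] / d["vector_instruction_count"],
        "Vector Operation Ratio (%)":
            100.0 * d["vector_element_count"] / operations,
    }


if __name__ == "__main__":
    for name, value in derived_metrics(AVERAGE).items():
        print(f"{name:28s} derived {value:10.3f}   reported {REPORTED[name]:10.3f}")
```

The derived values agree with the printed ones only approximately, since the printed averages are averages over the three processes, while the sketch divides one averaged count by another.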


I/O Information

****** File Information ******
Unit No.                          : 20
File Name                         : BX2005092518
Named                             : YES
Current Directory                 : /bm/flush3/iliab/gasp/test/2005092612

I/O Exec. Count                   :  READ  WRITE  OPEN  CLOSE  INQUIRE
                                      178      0     1      0        0
                                     FIND  DEFINE FILE
                                        0            0

Format                            : UNFORMATTED
Blank                             : ----
Access                            : DIRECT
Recl (Byte)                       : 45056
Max Record No.                    : 3911
File Size (Byte)                  : 179818496
File Descriptor                   : 3
File System Type                  : NFS
Open Mode                         : READWRITE
Terminal Assignment               : NO

I/O Buffer Size (KByte,F_SETBUF)  : 1024

                                    Total(In/Out)         Input        Output
Total Data Size (Byte)            :       8010552,      8010552,            0
Max Data Size (Byte)              :                       45056,            0
Min Data Size (Byte)              :                       35640,            0
Ave Data Size (Byte)              :         45003,        45003,            0
Transfer Rate (KByte/sec)         :      5746.793,     5746.793,        0.000

                                    Total(In/Out/Aux)     Input        Output
RTP-call Count                    :           535,          534,            0
System-call Count (read/write)
  Exec. Count                     :                          68,            0
  Ave Data Size (Byte)            :                     1048576,            0

Real Time (sec)                   :      1.367772,     1.361247,     0.000000
User Time (sec)                   :      0.007263,     0.007138,     0.000000

F_INPUT Option                    : NO
F_OUTPUT Option                   : NO
F_NORCW Option                    : NO
F_PARTRCW Option                  : NO
F_EXPRCW Option                   : NO
F_UFMTFLOAT1 Option               : NO
F_UFMTFLOAT2 Option               : NO
F_UFMTIEEE Option                 : NO
F_UFMTENDIAN Option               : NO
F_UFMTADJUST Option               : NO
F_HSDIR Option                    : NO
F_VSPACING Option                 : NO
F_PROMOTE Option                  : NO
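The averaged figures in the I/O information above can be cross-checked against the raw counts. The following Python sketch does this for the input side, assuming that 1 KByte is 1024 bytes and that the transfer rate is computed from the input real time; both are assumptions inferred from the numbers, not stated in the output. `io_summary` is a hypothetical helper name.

```python
# Input-side figures from the I/O information for unit 20.
TOTAL_INPUT_BYTES = 8010552
READ_COUNT = 178
INPUT_REAL_TIME_S = 1.361247


def io_summary(total_bytes, n_reads, real_time_s):
    """Average record size and transfer rate for one I/O direction."""
    return {
        "ave_data_size_bytes": total_bytes / n_reads,
        # Assumes 1 KByte = 1024 bytes.
        "transfer_rate_kbyte_s": total_bytes / real_time_s / 1024.0,
    }


if __name__ == "__main__":
    summary = io_summary(TOTAL_INPUT_BYTES, READ_COUNT, INPUT_REAL_TIME_S)
    print(f"Ave Data Size (Byte)      : {summary['ave_data_size_bytes']:.0f}")
    print(f"Transfer Rate (KByte/sec) : {summary['transfer_rate_kbyte_s']:.3f}")
```

Under these assumptions, the results agree to within rounding with the printed average data size of 45003 bytes and transfer rate of 5746.793 KByte/sec.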


Optimisation (1)

Need to know

What are the primary performance characteristics to consider when improving the performance of an application?

How should these primary performance characteristics be measured?

How should these primary performance characteristics be addressed?


Optimisation (2)

Optimisation manuals and documentation:

Manuals need to include a description of each technique, illustrated by simple examples.

Significant improvement is required in the existing manuals, for example “Intel(R) Fortran Compiler Optimizing Applications Document Number: 307781-003US”

The document "Consistency of Floating-Point Results using the Intel Compiler" was very useful for understanding how to get reproducible results, but its content should be included in the main manuals.

Are there Intel websites available where further related information can be found?

Manuals and release notes should be in one place, with good indexing and search facilities.


Summary

We need a user-friendly software environment at each stage of the performance tuning procedure.