SSCA #3 Sensor Processing Knowledge Formation and Data I/O, Serial v1.0

HPCS Productivity Benchmarks Working Group

MIT Lincoln Laboratory
January 4, 2007


Outline

• Scalable Synthetic Compact Applications

• SSCA #3

– Overview

– Quick Recipe Data I/O Mode

• Implementation and Results

Scalable Synthetic Compact Applications Goals

• Identify which dimensions must be examined at full complexity and which can be examined at reduced scale, while providing understanding of both today's full applications and future applications

• Building on a motivation slide from Fred Johnson (15 January 2004)

[Figure: Micro benchmarks, HPCS compact applications, full applications, and next-generation applications placed on axes of application size/complexity versus system size/complexity]

HPCS Benchmark Spectrum

[Figure: the HPCS benchmark spectrum, with SSCA #3 highlighted. Labels recoverable from the figure:]

• Micro & Kernel Benchmarks (HPCchallenge Benchmarks)
– Local: DGEMM, STREAM, RandomAccess, 1D FFT
– Global: Linpack, PTRANS, RandomAccess, 1D FFT

• HPCS Spanning Set of Kernels
– Discrete Math, Graph Analysis, Linear Solvers, Signal Processing, Simulation, I/O

• Scalable Synthetic Compact Applications, each drawn as a Data Generator followed by four kernels
– 1. Optimal Pattern Matching
– 2. Graph Analysis
– 3. Simulation NWCHEM
– 4. Simulation NAS PB AU
– 5. Simulation Multi-Physics
– 6. Signal Processing Knowledge Formation
– SSCA #3: Data Generator, 1. Image Formation, 2. Image Storage, 3. Image Retrieval, 4. Target ID

• Mission Partner Application Benchmarks
– Existing, emerging, and future applications; simulation, intelligence, reconnaissance
– Current: UM2000, GAMESS, OVERFLOW, LBMHD, RFCTH, HYCOM
– Near-Future: NWChem, ALEGRA, CCSM

• Commercial Applications: Medical Imaging, Astronomical Image Processing, Environmental Monitoring

• Other labels: Execution Performance Bounds, Execution Performance Indicators, Execution and Development Performance Indicators, System Bounds


Outline

• The Vision

• SSCA #3

– Overview

– Quick Recipe Data I/O Mode

• Implementation and Results

Overview

• SSCA #3 focuses on two stages:
– Front-end image processing and storage (Stage 1)
– Back-end image retrieval and knowledge formation (Stage 2)

• It is representative of many areas:
– Medical imaging (e.g., tumor growth): image many patients daily; later compare images of the same patient over time
– Astronomical image processing (e.g., monitoring supernovae): image many regions of the sky daily; later compare images of a region over time
– Reconnaissance monitoring (e.g., enemy movement): image many areas daily; later compare images of a given region over time

Overview

• Benchmark stresses computation, communication, and data I/O

• Can be run in 3 modes:
– System Mode: a combination of Compute and Data I/O Modes
– Compute Mode (data I/O minimized)
– Data I/O Mode (computation minimized)

• Principal performance goal is throughput
– Maximize the rate at which answers are generated
– May overlap operation of data I/O and compute kernels
– Data I/O and compute kernels may run on different systems
– Some data is required to be contiguous

SSCA #3 – System Mode

Community has traditionally focused on Computation … but Data I/O performance is increasingly important

[Figure: System Mode data-flow diagram, split into Computation and Data I/O.
• Stage 1, Front-End Sensor Processing: Scalable Data and Template Generator; Kernel #1 Data Read and Image Formation; Template Insertion; Kernel #2 Image Storage
• Stage 2, Back-End Knowledge Formation: Kernel #3 Image Retrieval; Kernel #4 Detection; Validation
• Data items shown: raw complex data, coefficients, templates and groups of templates, template indices and template positional indices, SAR image, grid of images, image pair, detections, detection sub-images]

SSCA #3 – Compute Mode

[Figure: Compute Mode data-flow diagram.
• Sensor Processing: Scalable Data and Template Generator; Kernel #1 Image Formation; Template Insertion; Kernel #2 Image Storage
• Knowledge Formation: Kernel #3 Image Retrieval; Kernel #4 Detection; Validation
• Data items shown: raw SAR data and raw SAR files, templates and groups of template files, SAR image and SAR image files, SAR image pair, detections, sub-image detection files]

SSCA #3: Compute Mode Challenges

• Scalable synthetic data generation

• Front-End Sensor Processing
– Pulse compression
– Polar interpolation
– FFT, IFFT (corner turn)

• Data I/O
– Sequential store
– Non-sequential retrieve
– Large & small I/O

• Back-End Knowledge Formation
– Large image difference & threshold
– Many small correlations on selected pieces of a large image

[Figure: Compute Mode data flow from the Scalable Data and Template Generator through Kernel #1 Image Formation, Template Insertion, and Kernel #4 Detection to Validation]

SSCA #3 – Data I/O Mode

[Figure: Data I/O Mode data-flow diagram.
• Stage 1, Front-End: Scalable Data and Template Generator; Kernel #1 Data Read and Image Formation; Kernel #2 Image Storage
• Stage 2, Back-End: Kernel #3 Image Retrieval; Kernel #4
• Data items shown: large complex data, groups of small data, image, grid of images, image pair, sub-images]

Outline

• The Vision

• SSCA #3

– Overview

– Quick Recipe Data I/O Mode

• Implementation and Results

Ingredients

To run Data I/O Mode, the user only needs to set:

1) SCALE, 2) N_SDG_GROUPS, and 3) the grid dimensions

Where:

• SCALE = a parameter that sets the size of the raw input data and of the image. It should be set so that these are a significant fraction of a single processor’s memory.

• N_SDG_GROUPS = the number of groups of raw input data and templates. It should be set large enough to avoid disk cache effects.

• The number of images in the grid is GRID_SIDE_SIZE x GRID_SIDE_SIZE x AV_GRID_DEPTH

[Figure: the image grid drawn as a box, GRID_SIDE_SIZE by GRID_SIDE_SIZE across and AV_GRID_DEPTH deep]

Ingredients

Parameters to Code:

• PICTURE_SIZE = GRID_SIDE_SIZE^2 is the number of images in a picture

• EST_TOT_GRID_SIZE = PICTURE_SIZE x AV_GRID_DEPTH is the total number of times that the input data will be retrieved, and the total number of images stored to the grid

• mc x n is the size of the raw complex valued input data
– mc = 2 x ceil(80 x SCALE)
– n = 2 x ceil(158.496 x SCALE + 60)

• ROTATION_STEP is the templates’ rotation angle increment in degrees

• nDistinctLetters x nDistinctRotations is the total number of pixelated templates
– nDistinctLetters = number of least correlated letters in the alphabet (21)
– nDistinctRotations = number of ROTATION_STEP angles between 0 and 360 degrees

• FONT_SIZE x FONT_SIZE = size of a single template in pixels
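The parameter formulas above can be evaluated directly. The following is a minimal Python sketch, not the released MATLAB code. The function name, the dictionary keys, and the reading of nDistinctRotations as the count of angles 0, ROTATION_STEP, 2 x ROTATION_STEP, ... strictly below 360 degrees are assumptions made for illustration.

```python
import math


def data_io_parameters(scale, grid_side_size, av_grid_depth,
                       rotation_step_deg, n_distinct_letters=21):
    """Derive the Data I/O Mode sizes from the user-set values (sketch)."""
    picture_size = grid_side_size ** 2                # images in one picture
    est_tot_grid_size = picture_size * av_grid_depth  # images stored to the grid
    mc = 2 * math.ceil(80 * scale)                    # rows of raw complex input
    n = 2 * math.ceil(158.496 * scale + 60)           # columns of raw complex input
    # Assumption: angles are 0, STEP, 2*STEP, ... strictly below 360 degrees.
    n_distinct_rotations = math.ceil(360 / rotation_step_deg)
    return {
        "PICTURE_SIZE": picture_size,
        "EST_TOT_GRID_SIZE": est_tot_grid_size,
        "mc": mc,
        "n": n,
        "nTemplates": n_distinct_letters * n_distinct_rotations,
    }


if __name__ == "__main__":
    # Hypothetical settings chosen only to keep the numbers small.
    print(data_io_parameters(scale=1, grid_side_size=2, av_grid_depth=2,
                             rotation_step_deg=30))
```

With SCALE of 1 the raw input is 160 x 438 complex values, and a 2 x 2 grid of average depth 2 holds 8 images.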

Ingredients

Parameters to Code (Cont.):

• m x nx = size of an image
– m = 2 x ceil(mc / 0.8405246)
– k1n = 8.3776 x (1.5 - 1/n)
– kxmin = sqrt(70.1841812 - 6.3165469 x (m/mc)^2)
– kxmax = sqrt((4 x k1n^2) - 25.2661877 x (1/mc)^2)
– nx = 2 x ceil(20 x SCALE x (kxmax - kxmin) / pi) + 20

• nSubImages = floor( pOccupancy x p2ndNot1st x (m / (SARLOBE_DISTANCE x FONT_SIZE)) x (nx / (SARLOBE_DISTANCE x FONT_SIZE)) ) = number of smaller images to be stored (by the last kernel), where:
– pOccupancy = 0.5 is the probability of template occupancy, and
– p2ndNot1st = 0.5 is the probability that a template appears in the second image but not in the first

Total memory required, in bytes =
N_SDG_GROUPS x (8 x mc x n + 4 x nDistinctLetters x nDistinctRotations x FONT_SIZE^2)
+ EST_TOT_GRID_SIZE x (4 x m x nx + 4 x nSubImages x (4 x FONT_SIZE)^2)
+ (coefficients, support and verification parameters; stored once)

• Grows with SCALE^2
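To get a feel for how the storage requirement scales, the image-size and memory formulas can be coded as written. This is a hedged sketch: the function names are hypothetical, FONT_SIZE and SARLOBE_DISTANCE are passed in because the slides do not give their values, and the once-only term for coefficients and verification parameters is left out.

```python
import math


def image_size(scale):
    """Return (mc, n, m, nx) from SCALE using the formulas on the slides."""
    mc = 2 * math.ceil(80 * scale)
    n = 2 * math.ceil(158.496 * scale + 60)
    m = 2 * math.ceil(mc / 0.8405246)
    k1n = 8.3776 * (1.5 - 1 / n)
    kxmin = math.sqrt(70.1841812 - 6.3165469 * (m / mc) ** 2)
    kxmax = math.sqrt(4 * k1n ** 2 - 25.2661877 * (1 / mc) ** 2)
    nx = 2 * math.ceil(20 * scale * (kxmax - kxmin) / math.pi) + 20
    return mc, n, m, nx


def storage_estimate(scale, n_sdg_groups, est_tot_grid_size, n_templates,
                     font_size, sarlobe_distance,
                     p_occupancy=0.5, p_2nd_not_1st=0.5):
    """Return (nSubImages, bytes), omitting the once-only coefficient term."""
    mc, n, m, nx = image_size(scale)
    cell = sarlobe_distance * font_size  # spacing between template slots
    n_sub_images = math.floor(
        p_occupancy * p_2nd_not_1st * (m / cell) * (nx / cell))
    sdg_bytes = n_sdg_groups * (8 * mc * n + 4 * n_templates * font_size ** 2)
    grid_bytes = est_tot_grid_size * (
        4 * m * nx + 4 * n_sub_images * (4 * font_size) ** 2)
    return n_sub_images, sdg_bytes + grid_bytes


if __name__ == "__main__":
    # font_size and sarlobe_distance below are placeholder values.
    for scale in (1, 2, 4):
        print(scale, storage_estimate(scale, 2, 8, 252,
                                      font_size=10, sarlobe_distance=5))
```

Doubling SCALE roughly quadruples the estimate, which is the SCALE^2 growth noted above; the fixed-size template term keeps the ratio slightly below 4 at small scales.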

Directions

SDG

• Create a group
– Create a random single precision complex valued (large) mc x n matrix
– Store the data
– Create a random real valued (small) FONT_SIZE x FONT_SIZE matrix
– Store the small matrix nDistinctLetters x nDistinctRotations times

• Copy the above group N_SDG_GROUPS times

STAGE 1

for iImage = 1 to EST_TOT_GRID_SIZE
  KERNEL 1
  – Randomly pick and retrieve one of the N_SDG_GROUPS groups
  – Create a random single precision real valued m x nx matrix
  KERNEL 2
  – Randomly select i and j values in the range [1, GRID_SIDE_SIZE] and use these to create a filename
  – Store the image matrix
end
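The SDG and Stage 1 recipe above can be sketched in a few lines of Python using only the standard library. This is an illustration rather than the released MATLAB implementation. The file names, the directory layout, the interleaved real/imaginary storage of complex data, and the per-grid-point depth counter appended to each image filename are all assumptions.

```python
import os
import random
import tempfile
from array import array


def random_floats(count, rng):
    """Return `count` random values as an array of 32-bit floats."""
    return array("f", (rng.random() for _ in range(count)))


def write_floats(path, values):
    with open(path, "wb") as f:
        values.tofile(f)


def read_floats(path):
    values = array("f")
    with open(path, "rb") as f:
        values.frombytes(f.read())
    return values


def run_sdg(root, n_sdg_groups, mc, n, n_templates, font_size, rng):
    """SDG: create one group, then store a copy of it for every group."""
    large = random_floats(2 * mc * n, rng)  # complex: real/imag interleaved
    small = random_floats(font_size * font_size, rng)
    for g in range(n_sdg_groups):
        gdir = os.path.join(root, f"group_{g}")
        os.makedirs(gdir, exist_ok=True)
        write_floats(os.path.join(gdir, "raw.bin"), large)
        for t in range(n_templates):
            write_floats(os.path.join(gdir, f"template_{t}.bin"), small)


def run_stage1(root, n_sdg_groups, est_tot_grid_size, grid_side_size,
               m, nx, n_templates, rng):
    """Stage 1: return a dict mapping (i, j) to the depth stored there."""
    depth = {}
    for _ in range(est_tot_grid_size):
        # Kernel 1: retrieve a random group, then create a random image.
        gdir = os.path.join(root, f"group_{rng.randrange(n_sdg_groups)}")
        read_floats(os.path.join(gdir, "raw.bin"))
        for t in range(n_templates):
            read_floats(os.path.join(gdir, f"template_{t}.bin"))
        image = random_floats(m * nx, rng)
        # Kernel 2: pick a random grid point and store the image there.
        i = rng.randint(1, grid_side_size)
        j = rng.randint(1, grid_side_size)
        depth[(i, j)] = depth.get((i, j), 0) + 1
        name = f"image_{i}_{j}_{depth[(i, j)]}.bin"  # assumed naming scheme
        write_floats(os.path.join(root, name), image)
    return depth


if __name__ == "__main__":
    rng = random.Random(0)
    with tempfile.TemporaryDirectory() as root:
        run_sdg(root, 2, mc=4, n=6, n_templates=3, font_size=2, rng=rng)
        print(run_stage1(root, 2, 5, 2, m=3, nx=4, n_templates=3, rng=rng))
```

Because grid points are chosen at random, the depth at each point varies from run to run, which is why the grid is described by an average depth.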

Directions

STAGE 2

for iImageSeq = 1 to PICTURE_SIZE
  – Randomly select i and j values in the range [1, GRID_SIDE_SIZE]
  – Find the grid depth at this particular point
  for k = 1 to gridPointDepth-2
    KERNEL 3
    – Retrieve a pair of images, and an SDG group of templates
    KERNEL 4
    for l = 1 to nSubImages
      – Create a random (4 x FONT_SIZE) x (4 x FONT_SIZE) matrix
      – Store the sub image
    end
  end
end
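The Stage 2 loop can be sketched the same way. In this Python illustration the file naming, the `depth` dictionary that records how many images were stored at each grid point, the choice of images k and k+1 as the retrieved pair, and the single `templates.bin` file standing in for a group of templates are assumptions; the slides do not specify them.

```python
import os
import random
import tempfile
from array import array


def write_floats(path, values):
    with open(path, "wb") as f:
        values.tofile(f)


def read_floats(path):
    values = array("f")
    with open(path, "rb") as f:
        values.frombytes(f.read())
    return values


def image_path(root, i, j, k):
    """Assumed naming scheme: grid point (i, j), depth index k."""
    return os.path.join(root, f"image_{i}_{j}_{k}.bin")


def run_stage2(root, depth, grid_side_size, n_sub_images, font_size, rng):
    """Stage 2: return the number of sub-images written."""
    picture_size = grid_side_size ** 2
    side = 4 * font_size
    n_written = 0
    for _ in range(picture_size):
        i = rng.randint(1, grid_side_size)
        j = rng.randint(1, grid_side_size)
        grid_point_depth = depth.get((i, j), 0)
        for k in range(1, grid_point_depth - 2 + 1):  # k = 1 .. depth-2
            # Kernel 3: retrieve an image pair and a group of templates.
            read_floats(image_path(root, i, j, k))
            read_floats(image_path(root, i, j, k + 1))
            read_floats(os.path.join(root, "templates.bin"))
            # Kernel 4: store nSubImages random sub-images.
            for _ in range(n_sub_images):
                sub = array("f", (rng.random() for _ in range(side * side)))
                write_floats(os.path.join(root, f"sub_{n_written}.bin"), sub)
                n_written += 1
    return n_written


if __name__ == "__main__":
    rng = random.Random(1)
    with tempfile.TemporaryDirectory() as root:
        for k in range(1, 5):
            write_floats(image_path(root, 1, 1, k), array("f", [0.0] * 6))
        write_floats(os.path.join(root, "templates.bin"), array("f", [0.0] * 4))
        print(run_stage2(root, {(1, 1): 4}, 1, n_sub_images=3,
                         font_size=2, rng=rng))
```

Kernel 3 performs a few large reads per pass while Kernel 4 performs many small writes, which is the mix of large and small I/O the benchmark is meant to stress.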


Outline

• The Vision

• SSCA #3

– Overview

– Quick Recipe Data I/O Mode

• Implementation and Results

SSCA #3 Serial Release v1.0

Types of Data I/O Implemented:

• FWRITE: binary, IEEE floating point with appropriate big- or little-endian byte ordering and 32-bit data type

• HDF5: HDF5 32-bit float format

Modes:

• System Mode
– Includes both Compute (SAR Processing) and Data I/O Modes.

• Compute Mode
– Dials the smallest possible Grid of 2 images, thus minimizing data I/O.

• Data I/O Mode
– Generates random data, thus forgoing SAR processing.

Outputs metrics at each level in the system’s hierarchy (Kernels, Stages, and Overall SSCA #3):
– Bytes, seconds, bandwidth (bytes/sec)
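As an illustration of the FWRITE-style option and of the reported metrics, the Python sketch below writes 32-bit IEEE floats with an explicit byte order and returns bytes and seconds, from which bandwidth follows. The function names are hypothetical, and the timing boundaries (including the `fsync`) are a design choice of this sketch, not something the release is known to do.

```python
import os
import struct
import tempfile
import time


def fwrite_float32(path, values, byte_order="<"):
    """Write raw IEEE 754 32-bit floats; '<' is little-endian, '>' is big-endian.

    Returns (bytes written, seconds spent in the write).
    """
    payload = struct.pack(f"{byte_order}{len(values)}f", *values)
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # push data past the OS cache before stopping the clock
    return len(payload), time.perf_counter() - start


def fread_float32(path, byte_order="<"):
    """Read a file of raw 32-bit floats; returns (values, bytes, seconds)."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        payload = f.read()
    seconds = time.perf_counter() - start
    values = struct.unpack(f"{byte_order}{len(payload) // 4}f", payload)
    return list(values), len(payload), seconds


def bandwidth(n_bytes, seconds):
    """Bandwidth in bytes/sec; infinite if the timer resolution was too coarse."""
    return n_bytes / seconds if seconds > 0 else float("inf")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "image.bin")
        n_bytes, seconds = fwrite_float32(path, [0.25] * 100000, ">")
        print(n_bytes, seconds, bandwidth(n_bytes, seconds))
```

Summing bytes and seconds per kernel, then per stage, then overall gives the three levels of metrics listed above.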

SSCA #3 Serial Release v1.0

• One of many possible implementations

• Over 2200 lines of well-commented MATLAB code, with carefully chosen functional breakdown, data structures, variable names, and comments

• Coding standard: Modified “Programming in C++, Rules and Recommendations” by Mats Henricson and Erik Nyquist of Ellemtel Telecommunication System Laboratories, 1990-1992

• Development tools used
– MATLAB Version 7.1.0.246 (R14) Service Pack 3 (version required)
– Octave Version 2.9.5
– Pentium® 4 2.66 GHz CPU with 1.00 GB of RAM and 2.5 GB of virtual RAM, running MS Windows XP Professional Version 2002 Service Pack 1
– A dedicated dual processor hyperthreaded P4 Xeon, 2.8 GHz, ½ MB cache, GNU/Linux 2.4.20-28.9 (Redhat 9)

• Accompanying documentation:
– Written Specification, and these slides
– MANIFEST.txt: list of files with brief description
– README.txt: installation and run time instructions; code overview
– RELEASE_NOTES.txt: known outstanding issues in current release

SSCA #3 Release v1.0a

[Figure: four plots of FILE_IO_MODE results, each with SCALE of 16, N_SDG_GROUPS of 2, and an 8 image grid]
• Serial, Stage 1: Kernel 1 Read and Kernel 2 Write versus Stage 1 Pass Number
• Serial, Stage 2: Kernel 3 Read and Kernel 4 Write versus Stage 2 Pass Number
• 2 Processor Parallel, Stage 1: Kernel 1 Read and Kernel 2 Write versus Stage 1 Pass Number
• 2 Processor Parallel, Stage 2: Kernel 3 Read and Kernel 4 Write versus Stage 2 Pass Number

Summary

Challenges:

• Large scale parallel two-dimensional (2D) Inverse Fast Fourier Transform (IFFT); may require a ‘corner turn’ or a ‘gather scatter’ (depending on architecture), with large quantities of data. Polar interpolation is known to be even more computationally intense than IFFT (Kernel 1).

• Streaming image data storage to a data I/O device (write) may involve large block data transfers, storing one large image after another (Kernel 2).

• Random location image sequence retrieval from a data I/O device (read) also involving large quantities of data, with possibly stressful spatial or temporal memory access patterns, and locality issues (Kernel 3).

• Small data I/O in all four kernels. Large data I/O in three of the four kernels.

• Many small convolutions on random pieces of a large image (Kernel 4).

Status:

• Written and MATLAB Executable Specification v1.0 released June 22, 2006

• Architecture of Data I/O Mode: Martha Bancroft of Shomo Tech Systems, and Jeremy Kepner

• Works with Octave 2.9.5

• Written Specification, SAR Editor: Glenn Schrader, MIT Lincoln Laboratory

• C version based on release v1.0a (unofficial): Meng-Ju of UMD, and Janice Onanian McMahon of USC/ISI


SSCA #3

Backup Slides

SSCA #3 Specification

• Intent

• Overview

• Compute Mode Main Components
– Synthetic Scalable Data Generator
– Kernel 1: SAR Image Formation
– Template Insertion
– Kernel 4: Detection
– Validation

• Data I/O Mode Main Components
– Kernel 1: Large & Small Data Retrieval
– Image Grid
– Kernel 2: Image Storage
– Kernel 3: Image Retrieval
– Kernel 4: Small Image Storage

The Vision: Scalable Synthetic Compact Applications

• Bridge the gap between scalable synthetic kernel benchmarks and (non-scalable) real applications, and become an important benchmarking tool

• Is representative of real application workloads while not being numerically rigorous
– memory access characteristics
– communications characteristics
– I/O characteristics

• Multi-processor compact application, designed to be easily scalable and verifiable

• No limits on the distribution to vendors and universities

• SSCAs represent a wide spectrum of potential HPCS Mission Partner applications

Executable Specification

What is an Executable Specification:

• It implements the Written Specification, illustrating all specified properties; it is just one of many possible implementations
• It provides developers further insight into the corresponding Written Specification
• It is a tool for developers with which to validate their own work
• It includes a serial version, and may include one or more approaches to a parallel version
• It must be easily readable and intelligible, through its choice of functional structure, variable names, comments, and supporting documentation

Structure:

• Scalable Data Generator
– Creates synthetic data that can be scaled to stress any computer from a single workstation to a petascale multiprocessor
• Kernels: timed computational algorithms
• Verification: checks the correctness of select results
• Validation: validates the resulting solution

SSCA #3 Specification

• Intent

• Overview

• Compute Mode Main Components
– Synthetic Scalable Data Generator
– Kernel 1: SAR Image Formation
– Template Insertion
– Kernel 4: Detection
– Validation

• Data I/O Mode Main Components
– Kernel 1: Large & Small Data Retrieval
– Image Grid
– Kernel 2: Image Storage
– Kernel 3: Image Retrieval
– Kernel 4: Small Image Storage

SSCA #3 – Compute Only Mode

[Figure: Compute Only Mode data-flow diagram.
• Sensor Processing: Scalable Data and Template Generator; Kernel #1 Image Formation; Template Insertion; Kernel #2 Image Storage
• Knowledge Formation: Kernel #3 Image Retrieval; Kernel #4 Detection; Validation
• Data items shown: raw SAR data and raw SAR files, templates and groups of template files, SAR image and SAR image files, SAR image pair, detections, sub-image detection files]


Spotlight SAR

Compute Mode - SAR Overview

• Radar captures echo returns from a ‘swath’ on the ground

• Notional linear FM chirp pulse train, plus two ideally non-overlapping echoes returned from different positions on the swath

• Summation and scaling of echo returns realizes a challengingly long antenna aperture along the flight path

[Equation: the received ‘raw’ SAR signal s(t, u), written in terms of the delayed transmitted SAR waveform p and a reflection coefficient scale factor that is different for each return from the swath]

[Figure: spotlight SAR geometry showing the pulses and the swath, with labels Range, X = 2X0; Cross-Range, Y = 2Y0; Synthetic Aperture, L; Fixed to Broadside]

Scalable Synthetic Data Generator

• Generates synthetic raw SAR complex data

• Data size is scalable to enable rigorous testing of high performance computing systems
– User defined scale factor determines the size of images generated

• Generates ‘templates’ that consist of rotated and pixelated capitalized letters

[Figure: Spotlight SAR Returns, plotted as Range versus Cross-Range]

Kernel 1: SAR Image Formation

Spotlight SAR Reconstruction

• Received samples $s(t,u)$ fit a polar swath
• Fourier Transform: $(t,u) \to (\omega,k_u)$, giving $s(\omega,k_u)$
• Matched Filtering with $s_0^*(\omega,k_u)$
• Spatial Frequency Domain Interpolation: $k_x = \sqrt{4k^2 - k_u^2}$, $k_y = k_u$, giving $F(k_x,k_y)$
• Inverse Fourier Transform: $(k_x,k_y) \to (x,y)$, giving $f(x,y)$
• Processed samples fit a rectangular swath

[Figure: reconstructed image, Range in pixels versus Cross-Range in pixels]

Template Insertion (not timed)

• Inserts rotated pixelated capital letter templates into each SAR image
– Non-overlapping locations and rotations
– Randomly selects 50%
– Used as ideal detection targets in Kernel 4

[Figure: two images, X Pixels versus Y Pixels: a hypothetical 100% insertion of templates, and an image inserted with only a random 50% of the templates]

Kernel 4: Detection

• Detects targets in SAR images
1. Image difference
2. Threshold
3. Sub-regions
4. Correlate with every template; max is target ID

• Computationally difficult
– Many small correlations over random pieces of a large image

• Requires 100% recognition and no false alarms, including objects that cross distributed memory boundaries

[Figure: panels labeled Image A, Image B, Image Difference, Thresholded, Sub-region, Correlated]
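The four detection steps can be illustrated on toy data. The sketch below is plain Python on nested lists and is not the specification's algorithm: it assumes sub-regions are non-overlapping template-sized tiles aligned to the image origin, and it uses a correlation score normalized by template energy so that larger templates are not favored. The function names and the threshold value are hypothetical.

```python
import math


def correlate(region, template):
    """Correlation of two equal-sized 2D lists, normalized by template energy."""
    dot = sum(r * t for row_r, row_t in zip(region, template)
              for r, t in zip(row_r, row_t))
    norm = math.sqrt(sum(t * t for row in template for t in row))
    return dot / norm if norm > 0 else 0.0


def detect(image_a, image_b, templates, threshold):
    """Return a list of (row, col, template index) detections."""
    rows, cols = len(image_a), len(image_a[0])
    size = len(templates[0])
    # 1. Image difference.
    diff = [[image_b[r][c] - image_a[r][c] for c in range(cols)]
            for r in range(rows)]
    # 2. Threshold.
    mask = [[abs(v) > threshold for v in row] for row in diff]
    detections = []
    # 3. Sub-regions (assumed: aligned, non-overlapping size x size tiles).
    for r0 in range(0, rows - size + 1, size):
        for c0 in range(0, cols - size + 1, size):
            if not any(mask[r0 + r][c0 + c]
                       for r in range(size) for c in range(size)):
                continue
            region = [diff[r0 + r][c0:c0 + size] for r in range(size)]
            # 4. Correlate with every template; the maximum is the target ID.
            scores = [correlate(region, t) for t in templates]
            target_id = max(range(len(templates)), key=scores.__getitem__)
            detections.append((r0, c0, target_id))
    return detections


if __name__ == "__main__":
    templates = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]
    image_a = [[0] * 4 for _ in range(4)]
    image_b = [row[:] for row in image_a]
    image_b[2][1] = 1  # insert template 1 at tile (2, 0)
    image_b[3][0] = 1
    print(detect(image_a, image_b, templates, threshold=0.5))
```

Each tile needs one small correlation per template, so the work is many tiny, scattered operations rather than one large transform, which is the difficulty the slide describes.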

Computational Challenges

• Scalable synthetic data generation

• Front-End Sensor Processing
– Pulse compression
– Polar interpolation
– FFT, IFFT (corner turn)

• Data I/O
– Sequential store
– Non-sequential retrieve
– Large & small I/O

• Back-End Knowledge Formation
– Large image difference & threshold
– Many small correlations on selected pieces of a large image

[Figure: Compute Mode data flow from the Scalable Data and Template Generator through Kernel #1 Image Formation, Template Insertion, and Kernel #4 Detection to Validation]

SSCA #3 Specification

• Intent

• Overview

• Compute Mode Main Components
– Synthetic Scalable Data Generator
– Kernel 1: SAR Image Formation
– Template Insertion
– Kernel 4: Detection
– Validation

• Data I/O Mode Main Components
– Kernel 1: Large & Small Data Retrieval
– Image Grid
– Kernel 2: Image Storage
– Kernel 3: Image Retrieval
– Kernel 4: Small Image Storage

SSCA #3 – Data I/O Mode

[Figure: Data I/O Mode data-flow diagram.
• Stage 1, Front-End: Scalable Data and Template Generator; Kernel #1 Data Read and Image Formation; Kernel #2 Image Storage
• Stage 2, Back-End: Kernel #3 Image Retrieval; Kernel #4
• Data items shown: large complex data, groups of small data, image, grid of images, image pair, sub-images]

Scalable Synthetic Data Generator

• Generates large complex data, and groups of small data.

• Writes a ‘dialed’ number of large complex data sets to external memory.

• For each large data set, it writes a group of small data to external memory.

• Single precision

• Not timed

[Figure: the Scalable Data Generator produces large complex data and the associated groups of small data, which feed Kernel #1]

Kernel 1: Data Retrieval

• Randomly reads one large complex data set from external memory at each Stage 1 pass.

• Also reads the associated group of small data from external memory at each Stage 1 pass.

• Generates a single precision random image (of the size dialed by SCALE).

• I/O is timed

[Figure: Stage 1: Front-End. Large complex data and the associated groups of small data feed Kernel #1 Data Read, which outputs an image]

Image Grid

• The external memory image Grid is accessed by Kernels 2 & 3.

• It is scalable by image size and by number of images.

• Image size requires a non-trivial amount of memory.

• Intended for dealing with an enormous quantity of data, with simultaneous reads and writes.

[Figure: image grid, shown scaled to 80 images; the grid is GRID_SIDE_SIZE by GRID_SIDE_SIZE images across and AV_GRID_DEPTH images deep]

Kernel 2: Image Storage

• Writes a different image to a random location in the external memory on the Grid at each Stage 1 pass.

• Images may be stored together, or in separate pieces (to allow simultaneous reading/writing of the same image).

• Computes filenames and addresses, and writes streaming data to random locations on the Grid at each Stage 1 Front-End processing pass.

• I/O is timed

[Figure: Stage 1: Front-End. An image enters Kernel #2 Image Storage and is written among the images in the Grid]

Kernel 3: Image Retrieval

• At each Stage 2 pass, it computes the address of an image sequence at a random location in the Grid and reads the sequence's images a pair at a time until it reaches the full depth.

• An image sequence is read through the Grid’s entire depth.

• Also reads a group of small data at each Stage 2 pass.

• I/O is timed

[Figure: Stage 2: Back-End. Kernel #3 Image Retrieval reads images from the Image Grid (N_grid x N_grid images, each N_image x N_image) and a group of small data (templates), and outputs an image pair]

Kernels 2 and 3

Additional notes:

• If an optimal scheme is picked for data storage, it may not be optimal for data retrieval, and vice versa.

• “Read behind Write” is allowed.

[Figure: Kernel 2 image output and Kernel 3 image pair input]

Kernel 4: Small Image

• Writes labeled sub-images. This is repeated for each image pair, at each grid point, at each Stage 2 pass.

• I/O is timed

[Figure: Stage 2: Back-End. An image pair enters Kernel #4 Small Image Output, which writes sub-images]
