MIT Lincoln Laboratory 999999-1 XYZ 04/10/23
SSCA #3: Sensor Processing, Knowledge Formation, and Data I/O
Serial v1.0
HPCS Productivity Benchmarks Working Group
MIT Lincoln Laboratory
January 4, 2007
Outline
• Scalable Synthetic Compact Applications
• SSCA #3
– Overview
– Quick Recipe Data I/O Mode
• Implementation and Results
Scalable Synthetic Compact Applications Goals
• Identify which dimensions must be examined at full complexity and which can be examined at reduced scale, while providing understanding of both today's full applications and future applications
• Building on a motivation slide from Fred Johnson (15 January 2004)
[Figure: application size/complexity versus system size/complexity, locating micro-benchmarks, HPCS compact applications, full applications, and next-generation applications]
HPCS Benchmark Spectrum
[Figure: the HPCS benchmark spectrum, running from micro and kernel benchmarks to mission partner application benchmarks. Recoverable labels:
HPCchallenge Benchmarks: Local (DGEMM, STREAM, RandomAccess, 1D FFT); Global (Linpack, PTRANS, RandomAccess, 1D FFT)
HPCS Spanning Set of Kernels: Discrete Math, Graph Analysis, Linear Solvers, Signal Processing, Simulation, I/O
Scalable Synthetic Compact Applications: 1. Optimal Pattern Matching; 2. Graph Analysis; 3. Simulation NWCHEM; 4. Simulation NAS PB AU; 5. Simulation Multi-Physics; 6. Signal Processing Knowledge Formation. Each is drawn as a Data Generator followed by four kernels
SSCA #3: Data Generator; 1. Image Formation; 2. Image Storage; 3. Image Retrieval; 4. Target ID
Existing, emerging, and future applications: Simulation, Intelligence, Reconnaissance
Current: UM2000, GAMESS, OVERFLOW, LBMHD, RFCTH, HYCOM. Near-future: NWChem, ALEGRA, CCSM
Commercial applications: medical imaging, astronomical image processing, environmental monitoring
Other labels: Execution Performance Bounds, Execution Performance Indicators, Execution and Development Performance Indicators, System Bounds]
Outline
• The Vision
• SSCA #3
– Overview
– Quick Recipe Data I/O Mode
• Implementation and Results
Overview
• SSCA #3 focuses on two stages:
– Front-end image processing and storage (Stage 1)
– Back-end image retrieval and knowledge formation (Stage 2)
• It is representative of many areas:
– Medical imaging (e.g., tumor growth): image many patients daily; later compare images of the same patient over time
– Astronomical image processing (e.g., monitoring supernovae): image many regions of the sky daily; later compare images of a region over time
– Reconnaissance monitoring (e.g., enemy movement): image many areas daily; later compare images of a given region over time
Overview
• Benchmark stresses computation, communication, and data I/O
• Can be run in 3 modes:
– System Mode: a combination of Compute and Data I/O Modes
– Compute Mode (data I/O minimized)
– Data I/O Mode (compute minimized)
• Principal performance goal is throughput
– Maximize the rate at which answers are generated
– May overlap operation of data I/O and compute kernels
– Data I/O and compute kernels may run on different systems
– Some data is required to be contiguous
SSCA #3 – System Mode
The community has traditionally focused on computation, but data I/O performance is increasingly important.
[Figure: System Mode data flow, split into computation and data I/O.
Scalable Data and Template Generator: produces raw complex data, coefficients, and groups of templates
Stage 1, Front-End Sensor Processing: Kernel #1 (Data Read and Image Formation) turns raw data into a SAR image; Template Insertion; Kernel #2 (Image Storage) writes images to the grid of images
Stage 2, Back-End Knowledge Formation: Kernel #3 (Image Retrieval) reads an image pair with templates and indices; Kernel #4 (Detection) produces detections, template indices, and detection sub-images; Validation]
SSCA #3 – Compute Mode
[Figure: Compute Mode data flow.
Sensor Processing: Scalable Data and Template Generator (raw SAR, templates); Kernel #1 Image Formation (SAR image); Template Insertion
Knowledge Formation: SAR image pair and templates; Kernel #4 Detection (detections); Validation
File-based elements also drawn: raw SAR data files, template files and groups of template files, Kernel #2 Image Storage, SAR image files, Kernel #3 Image Retrieval, sub-image detection files]
SSCA #3: Compute Mode Challenges
• Scalable synthetic data generation
• Pulse compression
• Polar interpolation
• FFT, IFFT (corner turn)
• Sequential store
• Non-sequential retrieve
• Large and small I/O
• Large image difference and threshold
• Many small correlations on selected pieces of a large image
[Figure: the challenges annotate the Compute Mode data flow. Front-End Sensor Processing: Scalable Data and Template Generator, Kernel #1 Image Formation, Template Insertion. Back-End Knowledge Formation: Kernel #4 Detection, Validation]
SSCA #3 – Data I/O Mode
[Figure: Data I/O Mode data flow.
Scalable Data and Template Generator: produces large complex data and groups of small data
Stage 1, Front-End: Kernel #1 (Data Read and Image Formation) reads large data and a group of small data and produces an image; Kernel #2 (Image Storage) writes the image to the grid of images
Stage 2, Back-End: Kernel #3 (Image Retrieval) reads an image pair and groups of small data; Kernel #4 writes sub-images]
Ingredients
To run Data I/O Mode, the user needs to set only:
1) SCALE, 2) N_SDG_GROUPS, and 3) the grid
Where:
• SCALE = a parameter that sets the size of the raw input data and of the image. It should be set so that these are a significant fraction of a single processor's memory.
• N_SDG_GROUPS = number of groups of raw input data and templates. It should be set large enough to avoid disk cache effects.
• The number of images in the grid is GRID_SIDE_SIZE x GRID_SIDE_SIZE x AV_GRID_DEPTH.
[Figure: image grid with sides GRID_SIDE_SIZE x GRID_SIDE_SIZE and average depth AV_GRID_DEPTH]
Ingredients
Parameters to Code:
• PICTURE_SIZE = GRID_SIDE_SIZE^2 is the number of images in a picture
• EST_TOT_GRID_SIZE = PICTURE_SIZE x AV_GRID_DEPTH is the total number of times that the input data will be retrieved, and the total number of images stored to the grid
• mc x n is the size of the raw complex-valued input data
  mc = 2 x ceil(80 x SCALE)
  n = 2 x ceil(158.496 x SCALE + 60)
• ROTATION_STEP is the templates' rotation angle increment in degrees
• nDistinctLetters x nDistinctRotations is the total number of pixelated templates
  nDistinctLetters = number of least correlated letters in the alphabet (21)
  nDistinctRotations = number of ROTATION_STEP angles between 0 and 360 degrees
• FONT_SIZE x FONT_SIZE = size of a single template in pixels
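The parameter definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and dictionary keys are hypothetical, and nDistinctRotations is assumed to be the number of whole ROTATION_STEP increments in [0, 360), which the slide does not spell out.

```python
import math

def data_io_sizes(scale, grid_side_size, av_grid_depth, rotation_step_deg,
                  n_distinct_letters=21):
    """Derive the Data I/O Mode sizes from the user-set parameters."""
    picture_size = grid_side_size ** 2                 # images in a picture
    est_tot_grid_size = picture_size * av_grid_depth   # images stored to the grid
    mc = 2 * math.ceil(80 * scale)                     # rows of raw complex data
    n = 2 * math.ceil(158.496 * scale + 60)            # columns of raw complex data
    # Assumption: count whole ROTATION_STEP increments in [0, 360).
    n_distinct_rotations = int(360 // rotation_step_deg)
    return {
        "PICTURE_SIZE": picture_size,
        "EST_TOT_GRID_SIZE": est_tot_grid_size,
        "mc": mc,
        "n": n,
        "nTemplates": n_distinct_letters * n_distinct_rotations,
    }

sizes = data_io_sizes(scale=1, grid_side_size=2, av_grid_depth=2,
                      rotation_step_deg=30)
print(sizes)
```

A 2 x 2 grid with an average depth of 2 is one way to obtain an 8-image grid; the 30-degree rotation step is an arbitrary illustrative value.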
Ingredients
Parameters to Code (Cont.):
• m x nx = size of an image
  m = 2 x ceil(mc/0.8405246)
  k1n = 8.3776 x (1.5 - 1/n)
  kxmin = sqrt(70.1841812 - 6.3165469 x (m/mc)^2)
  kxmax = sqrt(4 x k1n^2 - 25.2661877 x (1/mc)^2)
  nx = 2 x ceil(20 x SCALE x (kxmax - kxmin)/pi) + 20
• nSubImages = floor( pOccupancy x p2ndNot1st x (m/(SARLOBE_DISTANCE x FONT_SIZE)) x (nx/(SARLOBE_DISTANCE x FONT_SIZE)) )
  = number of smaller images to be stored (by the last kernel), where:
  pOccupancy = 0.5 is the probability of template occupancy, and
  p2ndNot1st = 0.5 is the probability that a template appears in the second image but not in the first
Total memory required, in bytes =
  N_SDG_GROUPS x (8 x mc x n + 4 x nDistinctLetters x nDistinctRotations x FONT_SIZE^2)
  + EST_TOT_GRID_SIZE x (4 x m x nx + 4 x nSubImages x (4 x FONT_SIZE)^2)
  + (coefficients, support and verification parameters; stored once)
• Grows with SCALE^2
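As a rough check on the memory estimate, the formulas above can be evaluated directly. The sketch below is not the released code: the function names are hypothetical, the FONT_SIZE and SARLOBE_DISTANCE values in the usage lines are made-up illustrative numbers, and the "stored once" term is left out because the slide gives no size for it.

```python
import math

def image_size(scale):
    """Return (mc, n, m, nx) from the formulas on the slide."""
    mc = 2 * math.ceil(80 * scale)
    n = 2 * math.ceil(158.496 * scale + 60)
    m = 2 * math.ceil(mc / 0.8405246)
    k1n = 8.3776 * (1.5 - 1 / n)
    kxmin = math.sqrt(70.1841812 - 6.3165469 * (m / mc) ** 2)
    kxmax = math.sqrt(4 * k1n ** 2 - 25.2661877 * (1 / mc) ** 2)
    nx = 2 * math.ceil(20 * scale * (kxmax - kxmin) / math.pi) + 20
    return mc, n, m, nx

def total_bytes(scale, n_sdg_groups, est_tot_grid_size, n_templates,
                font_size, sarlobe_distance,
                p_occupancy=0.5, p_2nd_not_1st=0.5):
    """Estimate storage in bytes, excluding the parameters stored once."""
    mc, n, m, nx = image_size(scale)
    cell = sarlobe_distance * font_size
    n_sub_images = math.floor(p_occupancy * p_2nd_not_1st * (m / cell) * (nx / cell))
    # 8 bytes per single-precision complex value, 4 bytes per real value.
    sdg = n_sdg_groups * (8 * mc * n + 4 * n_templates * font_size ** 2)
    grid = est_tot_grid_size * (4 * m * nx
                                + 4 * n_sub_images * (4 * font_size) ** 2)
    return sdg + grid

# Hypothetical values for FONT_SIZE and SARLOBE_DISTANCE, for illustration only.
for scale in (1, 2, 4):
    print(scale, total_bytes(scale, n_sdg_groups=2, est_tot_grid_size=8,
                             n_templates=252, font_size=16, sarlobe_distance=8))
```

Doubling SCALE roughly doubles each of mc, n, m, and nx, which is why the dominant terms grow with SCALE^2.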
Directions
SDG
• Create a group
– Create a random single-precision complex-valued (large) mc x n matrix
– Store the data
– Create a random real-valued (small) FONT_SIZE x FONT_SIZE matrix
– Store the small matrix nDistinctLetters x nDistinctRotations times
• Copy the above group N_SDG_GROUPS times
STAGE 1
for iImage = 1 to EST_TOT_GRID_SIZE
  KERNEL 1
  – Randomly pick and retrieve one of the N_SDG_GROUPS groups
  – Create a random single-precision real-valued m x nx matrix
  KERNEL 2
  – Randomly select i and j values in the range [1, GRID_SIDE_SIZE] and use these to create a filename
  – Store the image matrix
end
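The SDG and Stage 1 steps above can be sketched with the Python standard library. This is a minimal illustration, not the MATLAB release: the file naming scheme, the helper names, and the tiny matrix sizes are all assumptions, and complex data is stored here as interleaved real and imaginary single-precision values.

```python
import os
import random
import tempfile
from array import array

def random_floats(count, rng):
    """Return `count` random single-precision values."""
    return array("f", (rng.random() for _ in range(count)))

def write_matrix(path, values):
    """Store a flat single-precision matrix; return the bytes written."""
    with open(path, "wb") as f:
        values.tofile(f)
    return os.path.getsize(path)

def read_matrix(path, count):
    """Retrieve `count` single-precision values from a file."""
    values = array("f")
    with open(path, "rb") as f:
        values.fromfile(f, count)
    return values

def run_sdg_and_stage1(root, mc, n, m, nx, font_size, n_templates,
                       n_sdg_groups, grid_side_size, est_tot_grid_size, seed=0):
    rng = random.Random(seed)
    # SDG: create one group, then store a copy of it N_SDG_GROUPS times.
    large = random_floats(2 * mc * n, rng)      # interleaved real/imag parts
    small = random_floats(font_size * font_size, rng)
    for g in range(n_sdg_groups):
        write_matrix(os.path.join(root, f"raw_{g}.bin"), large)
        for t in range(n_templates):
            write_matrix(os.path.join(root, f"tmpl_{g}_{t}.bin"), small)
    depth = {}           # (i, j) -> number of images stored at that grid point
    bytes_written = 0
    for _ in range(est_tot_grid_size):
        # Kernel 1: retrieve a random group, then create a random image.
        g = rng.randrange(n_sdg_groups)
        read_matrix(os.path.join(root, f"raw_{g}.bin"), 2 * mc * n)
        for t in range(n_templates):
            read_matrix(os.path.join(root, f"tmpl_{g}_{t}.bin"), font_size ** 2)
        image = random_floats(m * nx, rng)
        # Kernel 2: pick a random grid point and store the image there.
        i = rng.randint(1, grid_side_size)
        j = rng.randint(1, grid_side_size)
        depth[(i, j)] = depth.get((i, j), 0) + 1
        name = f"img_{i}_{j}_{depth[(i, j)]}.bin"   # hypothetical naming scheme
        bytes_written += write_matrix(os.path.join(root, name), image)
    return depth, bytes_written

with tempfile.TemporaryDirectory() as root:
    depth, bytes_written = run_sdg_and_stage1(
        root, mc=4, n=6, m=8, nx=10, font_size=2, n_templates=3,
        n_sdg_groups=2, grid_side_size=2, est_tot_grid_size=8)
print(depth, bytes_written)
```

Because grid points are chosen at random, the depth at each point varies from run to run, which is why the grid depth is described as an average (AV_GRID_DEPTH).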
Directions
STAGE 2
for iImageSeq = 1 to PICTURE_SIZE
  – Randomly select i and j values in the range [1, GRID_SIDE_SIZE]
  – Find the grid depth at this particular point
  for k = 1 to gridPointDepth-2
    KERNEL 3
    – Retrieve a pair of images, and an SDG group of templates
    KERNEL 4
    for l = 1 to nSubImages
      – Create a random (4 x FONT_SIZE) x (4 x FONT_SIZE) matrix
      – Store the sub-image
    end
  end
end
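The Stage 2 loop nest can be made concrete by listing the I/O operations it would issue. The sketch below only plans the operations rather than touching files. Its assumptions are that the loop bound is taken literally from the slide, that a pair means images k and k+1 at a grid point, and that the template group is chosen at random; the function and operation names are hypothetical.

```python
import random

def stage2_plan(depth, grid_side_size, n_sdg_groups, n_sub_images, seed=0):
    """List the Stage 2 reads (Kernel 3) and writes (Kernel 4) in order.

    `depth` maps a grid point (i, j) to the number of images stored there.
    """
    rng = random.Random(seed)
    picture_size = grid_side_size ** 2
    ops = []
    for _ in range(picture_size):
        i = rng.randint(1, grid_side_size)
        j = rng.randint(1, grid_side_size)
        grid_point_depth = depth.get((i, j), 0)
        # Loop bound as written on the slide: k = 1 to gridPointDepth-2.
        for k in range(1, grid_point_depth - 2 + 1):
            # Kernel 3: one image pair plus one group of templates.
            ops.append(("read_pair", i, j, k, k + 1))
            ops.append(("read_templates", rng.randrange(n_sdg_groups)))
            # Kernel 4: store nSubImages small random matrices.
            for l in range(1, n_sub_images + 1):
                ops.append(("write_sub_image", i, j, k, l))
    return ops

plan = stage2_plan({(1, 1): 4}, grid_side_size=1, n_sdg_groups=2, n_sub_images=3)
for op in plan:
    print(op)
```

Separating the plan from the I/O makes it easy to count how many large reads and small writes a given grid will generate before running the benchmark.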
SSCA #3 Serial Release v1.0
Types of Data I/O Implemented:
• FWRITE: binary, IEEE floating point with appropriate big- or little-endian byte ordering and 32-bit data type
• HDF5: HDF5 32-bit float format
Modes:
• System Mode
– Includes both Compute (SAR processing) and Data I/O Modes.
• Compute Mode
– Dials the smallest possible grid of 2 images, thus minimizing data I/O.
• Data I/O Mode
– Generates random data, thus forgoing SAR processing.
Outputs metrics at each level in the system's hierarchy (Kernels, Stages, and overall SSCA #3):
– Bytes, seconds, bandwidth (bytes/sec)
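To illustrate the raw binary format and the reported metrics, here is a small Python sketch that writes IEEE 754 32-bit floats with an explicit byte order and reports bytes, seconds, and bandwidth. It is not the release's MATLAB code; the function names are hypothetical and only the standard `struct` module is used.

```python
import os
import struct
import tempfile
import time

def write_float32(path, values, byte_order="<"):
    """Write values as raw 32-bit floats ('<' little-endian, '>' big-endian).

    Returns (bytes written, seconds spent in the write).
    """
    payload = struct.pack(f"{byte_order}{len(values)}f", *values)
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(payload)
    seconds = time.perf_counter() - start
    return len(payload), seconds

def read_float32(path, count, byte_order="<"):
    """Read `count` raw 32-bit floats written with the same byte order."""
    with open(path, "rb") as f:
        return list(struct.unpack(f"{byte_order}{count}f", f.read(4 * count)))

def bandwidth(nbytes, seconds):
    """Bandwidth in bytes/sec; guards against a zero-length timing."""
    return nbytes / seconds if seconds > 0 else float("inf")

with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "image.bin")
    nbytes, seconds = write_float32(path, [1.0, 2.5, -3.0], byte_order=">")
    with open(path, "rb") as f:
        first_word = f.read(4)
    restored = read_float32(path, 3, byte_order=">")
print(nbytes, seconds, bandwidth(nbytes, seconds))
```

Metrics for a stage or for the whole run can then be formed by summing the bytes and seconds of the kernels beneath it and dividing once, rather than averaging per-kernel bandwidths.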
SSCA #3 Serial Release v1.0
• One of many possible implementations
• Over 2200 lines of well-commented MATLAB code. Carefully picked functional breakdown, data structures, variable names, and comments
• Coding standard: Modified “Programming in C++, Rules and Recommendations” by Mats Henricson and Erik Nyquist of Ellemtel Telecommunication System Laboratories, 1990-1992
• Development tools used
– MATLAB Version 7.1.0.246 (R14) Service Pack 3 (version required)
– Octave Version 2.9.5
– Pentium® 4 2.66 GHz CPU with 1.00 GB of RAM and 2.5 GB of virtual RAM, running MS Windows XP Professional Version 2002 Service Pack 1
– A dedicated dual-processor hyperthreaded P4 Xeon, 2.8 GHz, ½ MB cache, GNU/Linux 2.4.20-28.9 (Red Hat 9)
• Accompanying documentation:
– Written Specification, and these slides
– MANIFEST.txt: list of files with brief description
– README.txt: installation and run-time instructions; code overview
– RELEASE_NOTES.txt: known outstanding issues in current release
SSCA #3 Release v1.0a
[Figure: four plots for FILE_IO_MODE with SCALE of 16, N_SDG_GROUPS of 2, and an 8-image grid.
Serial, Stage 1: Kernel 1 Read and Kernel 2 Write versus Stage 1 pass number (0 to 9)
Serial, Stage 2: Kernel 3 Read and Kernel 4 Write versus Stage 2 pass number (0 to 4)
2-processor parallel, Stage 1: Kernel 1 Read and Kernel 2 Write versus Stage 1 pass number (0 to 9)
2-processor parallel, Stage 2: Kernel 3 Read and Kernel 4 Write versus Stage 2 pass number (0 to 4)]
Summary
Challenges:
• Large-scale parallel two-dimensional (2D) Inverse Fast Fourier Transform (IFFT); may require a ‘corner turn’ or a ‘gather scatter’ (depending on architecture), with large quantities of data. Polar interpolation is known to be even more computationally intensive than the IFFT (Kernel 1).
• Streaming image data storage to a data I/O device (write) may involve large block data transfers, storing one large image after another (Kernel 2).
• Random-location image sequence retrieval from a data I/O device (read) also involves large quantities of data, with possibly stressful spatial or temporal memory access patterns, and locality issues (Kernel 3).
• Small data I/O in all four kernels. Large data I/O in three of the four kernels.
• Many small convolutions on random pieces of a large image (Kernel 4).
Status:
• Written and MATLAB Executable Specification v1.0 released June 22, 2006
• Architecture of Data I/O Mode – Martha Bancroft of Shomo Tech Systems, and Jeremy Kepner
• Works with Octave 2.9.5
• Written Specification – SAR Editor – Glenn Schrader, MIT Lincoln Laboratory
• C version based on release v1.0a (unofficial) – Meng-Ju of UMD, and Janice Onanian McMahon of USC/ISI
SSCA #3
Backup Slides
SSCA #3 Specification
• Intent
• Overview
• Compute Mode Main Components
– Synthetic Scalable Data Generator
– Kernel 1: SAR Image Formation
– Template Insertion
– Kernel 4: Detection
– Validation
• Data I/O Mode Main Components
– Kernel 1: Large & Small Data Retrieval
– Image Grid
– Kernel 2: Image Storage
– Kernel 3: Image Retrieval
– Kernel 4: Small Image Storage
The Vision ― Scalable Synthetic Compact Applications
• Bridge the gap between scalable synthetic kernel benchmarks and (non-scalable) real applications, and become an important benchmarking tool
• Is representative of real application workloads while not being numerically rigorous
– memory access characteristics
– communications characteristics
– I/O characteristics
• Multi-processor compact application, designed to be easily scalable and verifiable
• No limits on the distribution to vendors and universities
• SSCAs represent a wide spectrum of potential HPCS Mission Partner applications
Executable Specification
What is an Executable Specification:
• It implements the Written Specification, illustrating all specified properties; it is just one of many possible implementations
• It provides developers further insight into the corresponding Written Specification
• It is a tool with which developers can validate their own work
• It includes a serial version, and may include one or more approaches to a parallel version
• It must be easily readable and intelligible, through its choice of functional structure, variable names, comments, and supporting documentation
Structure:
• Scalable Data Generator
– Creates synthetic data that can be scaled to stress any computer from a single workstation to a petascale multiprocessor
• Kernels: timed computational algorithms
• Verification: checks the correctness of select results
• Validation: validates the resulting solution
SSCA #3 – Compute Only Mode
[Figure: Compute Only Mode data flow.
Sensor Processing: Scalable Data and Template Generator (raw SAR, templates); Kernel #1 Image Formation (SAR image); Template Insertion
Knowledge Formation: SAR image pair and templates; Kernel #4 Detection (detections); Validation
File-based elements also drawn: raw SAR data files, template files and groups of template files, Kernel #2 Image Storage, SAR image files, Kernel #3 Image Retrieval, sub-image detection files]
Spotlight SAR
Compute Mode - SAR Overview
• Radar captures echo returns from a ‘swath’ on the ground
• Notional linear FM chirp pulse train, plus two ideally non-overlapping echoes returned from different positions on the swath
• Summation and scaling of echo returns realizes a challengingly long antenna aperture along the flight path
• The received ‘raw’ SAR signal s(t,u) is a sum, over positions (m,n) on the swath, of delayed copies of the transmitted SAR waveform p, each scaled by a reflection coefficient that differs for each return from the swath
[Figure: spotlight SAR geometry, showing the pulses and the swath, with range X = 2X0, cross-range Y = 2Y0, synthetic aperture L, fixed to broadside]
Scalable Synthetic Data Generator
• Generates synthetic raw SAR complex data
• Data size is scalable to enable rigorous testing of high performance computing systems
– User-defined scale factor determines the size of images generated
• Generates ‘templates’ that consist of rotated and pixelated capitalized letters
[Figure: spotlight SAR returns, range versus cross-range]
Kernel 1: SAR Image Formation
Spotlight SAR reconstruction, processing chain:
1. Fourier transform, (t,u) to (ω,ku): received samples s(t,u) become s(ω,ku)
2. Matched filtering with s0*(ω,ku)
3. Spatial frequency domain interpolation, kx = sqrt(4k^2 - ku^2), ky = ku: gives F(kx,ky)
4. Inverse Fourier transform, (kx,ky) to (x,y): gives the image f(x,y)
[Figure: received samples fit a polar swath; processed samples fit a rectangular swath in the (kx,ky) plane. Reconstructed image axes: range (pixels) versus cross-range (pixels)]
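The coordinate change used by the interpolation step, kx = sqrt(4k^2 - ku^2) and ky = ku, is easy to evaluate for a single sample. The helper below is a hypothetical illustration of that mapping only; resampling the mapped samples onto a uniform (kx,ky) grid, which is the expensive part of Kernel 1, is not shown.

```python
import math

def polar_to_rectangular(k, ku):
    """Map a sample at (k, ku) to (kx, ky) with kx = sqrt(4k^2 - ku^2), ky = ku."""
    radicand = 4.0 * k * k - ku * ku
    if radicand < 0:
        # Outside the region where kx is real; such samples carry no image support.
        raise ValueError("4k^2 - ku^2 is negative for this sample")
    return math.sqrt(radicand), ku

# Samples at a fixed k trace an arc in the (kx, ky) plane as ku varies.
for ku in (-6.0, 0.0, 6.0):
    print(ku, polar_to_rectangular(5.0, ku))
```

Because samples at a fixed k land on an arc rather than on a straight line, the mapped data does not fall on a rectangular grid, and that is why an interpolation precedes the inverse Fourier transform.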
Template Insertion (not timed)
• Inserts rotated, pixelated capital-letter templates into each SAR image
– Non-overlapping locations and rotations
– Randomly selects 50%
– Used as ideal detection targets in Kernel 4
[Figure: two images, X pixels versus Y pixels: a hypothetical 100% insertion of templates, and an image with only a random 50% of the templates inserted]
Kernel 4: Detection
• Detects targets in SAR images
  1. Image difference
  2. Threshold
  3. Sub-regions
  4. Correlate with every template; max is target ID
• Computationally difficult
– Many small correlations over random pieces of a large image
• Requires 100% recognition and no false alarms, including objects that cross distributed memory boundaries
[Figure: Image A, Image B, image difference, thresholded, sub-region, correlated]
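The four detection steps can be sketched on toy data in pure Python. This is a simplified illustration and not the specified algorithm: it assumes sub-regions are non-overlapping tiles of the template size, uses the threshold only to decide which tiles changed, and scores each template with an unnormalized zero-lag correlation. All names are hypothetical.

```python
def detect(image_a, image_b, templates, threshold, size):
    """Return (row, col, template_id) for each size x size tile that changed."""
    rows, cols = len(image_a), len(image_a[0])
    # 1. Image difference
    diff = [[image_b[r][c] - image_a[r][c] for c in range(cols)]
            for r in range(rows)]
    hits = []
    # 3. Sub-regions: scan non-overlapping tiles (an assumption of this sketch)
    for r0 in range(0, rows - size + 1, size):
        for c0 in range(0, cols - size + 1, size):
            tile = [row[c0:c0 + size] for row in diff[r0:r0 + size]]
            # 2. Threshold: skip tiles where nothing changed enough
            if max(abs(v) for row in tile for v in row) <= threshold:
                continue
            # 4. Correlate with every template; the max score is the target ID
            scores = {
                name: sum(t[r][c] * tile[r][c]
                          for r in range(size) for c in range(size))
                for name, t in templates.items()
            }
            hits.append((r0, c0, max(scores, key=scores.get)))
    return hits

templates = {"L": [[1, 0], [1, 1]], "T": [[1, 1], [0, 1]]}
image_a = [[0] * 4 for _ in range(4)]
image_b = [[0] * 4 for _ in range(4)]
image_b[2][0], image_b[3][0], image_b[3][1] = 1, 1, 1   # insert an "L" at (2, 0)
hits = detect(image_a, image_b, templates, threshold=0.5, size=2)
print(hits)
```

In a distributed implementation a tile may straddle the boundary between two processors' portions of the image, which is the case the slide singles out as requiring care.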
Computational Challenges
• Scalable synthetic data generation
• Pulse compression
• Polar interpolation
• FFT, IFFT (corner turn)
• Sequential store
• Non-sequential retrieve
• Large and small I/O
• Large image difference and threshold
• Many small correlations on selected pieces of a large image
[Figure: the challenges annotate the Compute Mode data flow. Front-End Sensor Processing: Scalable Data and Template Generator, Kernel #1 Image Formation, Template Insertion. Back-End Knowledge Formation: Kernel #4 Detection, Validation]
Scalable Synthetic Data Generator
• Generates large complex data, and groups of small data.
• Writes a ‘dialed’ number of large complex data to external memory.
• For each large data, it writes a group of small data to external memory.
• Single precision
• Not timed
[Figure: the Scalable Data Generator produces large complex data and associated groups of small data, which feed Kernel #1]
Kernel 1: Data Retrieval
• Randomly reads one large complex data from external memory, at each Stage 1 pass.
• Also reads the associated group of small data from external memory, at each Stage 1 pass.
• Generates a single-precision random image (of the size dialed by SCALE).
• I/O is timed
[Figure: Stage 1, Front-End. Kernel #1 Data Read takes large complex data and the associated groups of small data and outputs an image]
Image Grid
• The external memory image Grid is accessed by Kernels 2 and 3.
• It is scalable by image size and by number of images.
• Image size requires a non-trivial amount of memory.
• Intended for dealing with an enormous quantity of data, with simultaneous reads and writes.
[Figure: image grid, shown scaled to 80 images, with sides GRID_SIDE_SIZE x GRID_SIDE_SIZE and average depth AV_GRID_DEPTH]
Kernel 2: Image Storage
• Writes a different image to a random location in the external memory on the Grid at each Stage 1 pass.
• Images may be stored together, or in separate pieces (to allow simultaneous reading/writing of the same image).
• Computes filenames and addresses, and writes streaming data to random locations on the Grid at each Stage 1 Front-End processing pass.
• I/O is timed
[Figure: Stage 1, Front-End. Kernel #2 Image Storage writes each image into the images in the Grid]
Kernel 3: Image Retrieval
• At each Stage 2 pass, computes the address of an image sequence at a random location in the Grid and reads its images a pair at a time until the full depth is reached.
• An image sequence is read through the entire depth of the Grid.
• Also reads a group of small data at each Stage 2 pass.
• I/O is timed
[Figure: Stage 2, Back-End. Kernel #3 Image Retrieval reads an image pair from the images in the Grid, together with a group of small data (templates). Each image is N_image x N_image; the image grid is N_grid x N_grid]
Kernels 2 and 3
Additional notes:
• If an optimal scheme is picked for data storage, it may not be optimal for data retrieval, and vice versa.
• “Read behind Write” is allowed.
[Figure: Kernel 2 image output alongside Kernel 3 image pair input]
Kernel 4: Small Image Storage
• Writes labeled sub-images. This is repeated for each image pair, at each grid point, at each Stage 2 pass.
• I/O is timed
[Figure: Stage 2, Back-End. Kernel #4 Small Image Output takes an image pair and writes sub-images]