CONFERENCE PROCEEDINGS

IEEE 19th International Conference on Application-Specific Systems, Architectures and Processors

Leuven, Belgium, July 2 - 4, 2008

IEEE Catalog number: CFP08063
ISBN: 978-1-4244-1898-5
Library of Congress: 2007908829

© 2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
2008 International Conference on Application-specific Systems, Architectures and Processors (ASAP) Copyright © 2008 by the Institute of Electrical and Electronics Engineers, Inc. All rights reserved. Copyright and Reprint Permission : Abstracting is permitted with credit to the source. Libraries are permitted to photocopy beyond the limit of U.S. copyright law for private use of patrons those articles in this volume that carry a code at the bottom of the first page, provided the per-copy fee indicated in the code is paid through Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923. For other copying, reprint or republication permission, write to IEEE Copyrights Manager, IEEE Operations Center, 445 Hoes Lane, P.O. Box, 1331, Piscataway, NJ 08855-1331. All rights reserved. IEEE Catalog number: CFP08063 ISBN: 978-1-4244-1898-5 Library of Congress: 2007908829 Additional copies of this publication are available from: IMEC vzw, tav Fred Loosen, Kapeldreef 75, B-3001 Leuven, Belgium. Phone: +32 16 281498. E-mail: [email protected]
Message from the ASAP ’08 General and Technical Program Chairs

Welcome to ASAP 2008: the 19th IEEE International Conference on Application-specific Systems, Architectures and Processors. This year’s event takes place in the historic university town of Leuven, Belgium. The history of the event traces back to the International Workshop on Systolic Arrays, organized in 1986 in Oxford, UK. It later developed into the International Conference on Application Specific Array Processors. With its current title, it was organized for the first time in Chicago, USA in 1996. Since then it has alternated between Europe and North America.

The ASAP conferences provide a forum for researchers and practitioners from academia and industry interested in both the fundamental principles of application-specific computing systems and architectures, as well as their practical adoption in a wide variety of application domains. Results presented at these conferences have had a significant impact on systems architecture, reconfigurable computing, design automation, automatic parallelization, and on application domains such as embedded systems, computer arithmetic, signal and image processing, cryptographic hardware and systems.

After more than 20 years, ASAP is still a thriving conference. In response to the call for papers, 113 submissions were received. These submitted papers came from 25 countries in Asia, Europe, North America, and South America. A rigorous review process resulted in 36 papers for lecture presentations and 17 papers for interactive presentation at the conference. These are complemented by keynote talks by Ruby Lee (Princeton University) and Gert Goossens (Target Compiler Technologies).

The session titles summarize the variety of the topics covered:

• Application-Specific Processor Instruction-Sets
• System-Level Interconnect and Mapping in SoCs
• Advances in Cryptography
• New Computational Methods
• Novel Applications
• New Directions
• Acceleration of Scientific and DSP Applications
• Advanced Communications Applications
• Arithmetic
• Interconnect and Mapping
• Novel Processor and Memory System Techniques
• Image and Video Processing

We thank the keynote speakers and all the authors who responded to the call for papers. We also thank members of the Program Committee and the External Referees who generously offered their expertise and completed the reviews on time. We are grateful to our primary sponsors: the IEEE Computer Society and IMEC. We particularly thank the great efforts of Murali Jayapala, Andy Lambrechts, Fred Loosen, Praveen Raghavan, Annemie Stas, Myriam Janowski, Daniella Van Ravesteijn and Marcel Gort. The conference would not have run so smoothly without the selfless contributions of these individuals.

We are proud of the history of ASAP, which has provided opportunities for collaboration, the development of new talents, and a platform for dissemination of new ideas. We look forward to your participation in ASAP 2008 and in future years.

Ingrid Verbauwhede, Diederik Verkest and Steve Wilton
TABLE OF CONTENTS

ASAP Organizing and Steering Committees ........................... viii
ASAP Technical Program Committee ........................... ix
Keynote 1: Security and Ubiquity Opportunities for Application-Specific Processors, Ruby B. Lee ........................... xi
Keynote 2: The Art of Application-Specific Processor Design: Great Artists use Good Tools, Gert Goossens ........................... xiii
Session 1: Application-Specific Processor Instruction Sets ........................... 1
· Fast Custom Instruction Identification by Convex Subgraph Enumeration ........................... 1
· Bit Matrix Multiplication in Commodity Processors ........................... 7
Interactive Session 1 ........................... 13
· Synthesis of Application Accelerators on Runtime Reconfigurable Hardware ........................... 13
· Floating Point Multiplication Rounding Schemes for Interval Arithmetic ........................... 19
· Fast Multivariate Signature Generation in Hardware: The Case of Rainbow ........................... 25
· Fault-Tolerant Dynamically Reconfigurable NoC-based SoC ........................... 31
· Security Processor with Quantum Key Distribution ........................... 37
· Fully-Pipelined Efficient Architectures for FPGA Realization of Discrete Hadamard Transform ........................... 43
· Reconfigurable Viterbi Decoder on Mesh Connected Multiprocessor Architecture ........................... 49
· Run-time Thread Sorting to Expose Data-level Parallelism ........................... 55
Session 2: System-level Interconnect and Mapping in SoCs ........................... 61
· A New High-Performance Scalable Dynamic Interconnection for FPGA-based Reconfigurable Systems .................................................................................. 61
· Extending the SIMPPL SoC Architectural Framework to Support Application-Specific Architectures on Multi-FPGA Platforms ................................................... 67
· PERMAP: A Performance-Aware Mapping for Application-Specific SoCs .................. 73
Session 3: Advances in Cryptography ........................................................ 79
· Low-cost Implementations of NTRU for Pervasive Security .................................. 79
· On the High-Throughput Implementation of RIPEMD-160 Hash Algorithm ............. 85
· Zodiac: System Architecture Implementation for a High-Performance Network Security Processor ........................................................................................ 91
Session 4: New Computational Methods ..................................................... 97
· Efficient Systolization of Cyclic Convolution for Systolic Implementation of Sinusoidal Transforms .................................................................................... 97
· Resource Efficient Generators for the Floating-point Uniform and Exponential Distributions ............................................................................................... 102
· Low Discrepancy Sequences for Monte Carlo Simulations on Reconfigurable Platforms .................................................................................................... 108
Session 5: Novel Applications ................................................................. 114
· A Subsampling Pulsed UWB Demodulator Based on a Flexible Complex SVD ......... 114
· Dynamically Reconfigurable Regular Expression Matching Architecture ................ 120
· An MPSoC Architecture for the Multiple Target Tracking Application in Driver Assistant System ......................................................................................... 126
Session 6: New Directions in Application-Specific Design ............................ 132
· Managing Multi-Core Soft-Error Reliability Through Utility-driven Cross Domain Optimization ............................................................................................... 132
Interactive Session 2 ............................................................................. 138
· An Efficient Implementation Of A Phase Unwrapping Kernel On Reconfigurable Hardware ................................................................................................... 138
· A Parallel Hardware Architecture for Connected Component Labeling Based on Fast Label Merging ....................................................................................... 144
· Operation Shuffling over Cycle Boundaries for Low Energy L0 Clustering ............. 150
· An Efficient Digital Circuit for Implementing Sequence Alignment Algorithm in an Extended Processor ...................................................................................... 156
· Concurrent Systolic Architecture for High-Throughput Implementation of 3-Dimensional DWT ........................................................................................ 162
· Hierarchical Design Space Exploration of a Cooperative MIMO Receiver for Reconfigurable Architectures ......................................................................... 167
· A Dynamic Holographic Reconfiguration on a Four-Context ODRGA ..................... 173
· FPGA-based Hardware Accelerator of the Heat Equation with Applications on Infrared Thermography ................................................................................ 179
· FPGA Based Singular Value Decomposition for Image Processing Applications ....... 185
Session 7: Acceleration of Scientific and DSP Applications ........................... 191
· Accelerating Nussinov RNA secondary structure prediction with systolic arrays on FPGAs .................................................................................................... 191
· A Multi-FPGA Application-Specific Architecture for Accelerating a Floating Point Fourier Integral Operator .............................................................................. 197
· Reconfigurable Acceleration of Microphone Array Algorithms for Speech Enhancement .............................................................................................. 203
Session 8: Advanced Communications Applications .................................... 209
· Configurable and Scalable High Throughput Turbo Decoder Architecture for Multiple 4G Wireless Standards ...................................................................... 209
· Architecture and VLSI Realization of a High-Speed Programmable Decoder for LDPC Convolutional Codes ............................................................................ 215
· Buffer allocation for Advanced Packet Segmentation in Network Processors .......... 221
Session 9: Arithmetic ............................................................................ 227
· New Insights on Ling Adders ......................................................................... 227
· Integer and Floating-Point Constant Multipliers for FPGAs .................................. 233
· An Efficient Method for Evaluating Polynomial and Rational Function Approximations ........................................................................................... 239
Session 10: Interconnect and Mapping ..................................................... 245
· Mapping of the AES Cryptographic Algorithm on a Coarse-Grain Reconfigurable Array Processor ........................................................................................... 245
· RECONNECT: A NoC for polymorphic ASICs using a Low Overhead Single Cycle Router ....................................................................................................... 251
· Loop-Oriented Metrics for Exploring an Application-Specific Architecture Design-Space .............................................................................................. 257
Session 11: Novel Processor and Memory System Techniques ..................... 263
· Rapid Estimation of Instruction Cache Hit Rates Using Loop Profiling ................... 263
· Reducing Power Consumption of Embedded Processors through Register File Partitioning and Compiler Support .................................................................. 269
· Lightweight DMA Management Mechanisms for Multiprocessors on FPGA .............. 275
· Memory Copies in Multi-Level Memory Systems ............................................... 281
Session 12: Image and Video Processing .................................................. 287
· Architecture of a Polymorphic ASIC for interoperability across multi-mode H.264 decoders .................................................................................................... 287
· An FPGA Architecture for CABAC Decoding in Many-core Systems ....................... 293
· Novel Approach on Lifting-Based DWT and IDWT Processor with Multi-Context Configuration to Support Different Wavelet Filters ............................................ 299
· Throughput-Scalable Hybrid-Pipeline Architecture for Multilevel Lifting 2-D DWT of JPEG 2000 Coder ..................................................................................... 305
Author Index ........................................................................................ 310
ASAP ’08 Organizing Committee

General Chair
Diederik Verkest, IMEC, Belgium

Technical Program Chairs
Ingrid Verbauwhede, K.U.Leuven, Belgium
Steve Wilton, University of British Columbia, Canada

Publication Chair
Andy Lambrechts, IMEC, Belgium

Web Management Chair
Praveen Raghavan, IMEC, Belgium

Publicity Chair
Murali Jayapala, IMEC, Belgium

Local Arrangements
Annemie Stas, IMEC, Belgium
Fred Loosen, IMEC, Belgium

ASAP Steering Committee

Jose Fortes, University of Florida
S-Y Kung, Princeton University
Wayne Luk, Imperial College
Michael Schulte, University of Wisconsin
Earl Swartzlander, University of Texas
ASAP Technical Program Committee

El Mostapha Aboulhamid, U. de Montréal
Amirali Baniasadi, U. of Victoria
Jürgen Becker, U. Karlsruhe
Koen Bertels, Delft U. of Technology
Shuvra Bhattacharyya, U. of Maryland
Gordon Brebner, Xilinx Inc.
Geoffrey Brown, Indiana U.
Peter Cappello, U. of California at Santa Barbara
Joseph Cavallaro, Rice U.
Chaitali Chakrabarti, Arizona State U.
Karam Chatha, Arizona State U.
Liang-Gee Chen, National Taiwan U.
George Constantinides, Imperial College
Jean-Pierre David, École Polytechnique de Montréal
Gerhard Fettweis, Dresden U. of Technology
Georgi N. Gaydadjiev, Delft U. of Technology
Peter Hallschmid, U. of British Columbia
Paolo Ienne, École Polytechnique Fédérale de Lausanne
Tom Kean, Algotronix
Israel Koren, U. of Massachusetts at Amherst
Georgi Kuzmanov, Delft U. of Technology
Pierre Langlois, École Polytechnique de Montréal
Ruby B. Lee, Princeton U.
Miriam Leeser, Northeastern U.
Philip Leong, Chinese U. of Hong Kong
Dake Liu, Linköping University
Wayne Luk, Imperial College
Liam Marnane, U. College Cork
Grant Martin, Tensilica
Oskar Mencer, Imperial College
Heinrich Meyr, Rheinisch-Westfälische Technische Hochschule Aachen
Tulika Mitra, National U. of Singapore
Jean-Michel Muller, ÉNS de Lyon
Alex Nicolau, U. of California, Irvine
Gabriela Nicolescu, École Polytechnique de Montréal
Tobias Noll, Rheinisch-Westfälische Technische Hochschule Aachen
Jari Nurmi, Tampere University of Technology
Peter Pirsch, U. of Hannover
Gang Qu, U. of Maryland
Patrice Quinton, IRISA
Sanjay Rajopadhye, Colorado State U.
Daler Rakhmatov, U. of Victoria
Tanguy Risset, ÉNS de Lyon
Frédéric Rousseau, TIMA
Kentaro Sano, Tohoku U.
Yvon Savaria, École Polytechnique de Montréal
Lesley Shannon, Simon Fraser University
Dirk Stroobandt, Ghent University
Henry Styles, Xilinx Inc.
Juergen Teich, Erlangen U.
Alexander Tenca, Synopsys Inc.
Lothar Thiele, ETH
Tom Vander Aa, IMEC
Ingrid Verbauwhede, K.U.Leuven
Doran Wilde, Brigham Young U.
Steve Wilton, U. of British Columbia
Roger Woods, Queen's U. of Belfast
Pen-Chung Yew, U. of Minnesota at Twin Cities
Cedric Yiu, Hong Kong Polytechnic University
Clifford Young, D. E. Shaw group
Keynote 1:
Security and Ubiquity Opportunities for Application-Specific Processors, Ruby B. Lee

Abstract

Application-specific processors have been designed for improving the performance of a given important application or class of applications. They have also enabled improvements in power consumption, cost, size and efficiency. With the escalating number of attacks on computing, communications and entertainment devices, the issue of security is now of paramount importance. How can application-specific instruction processors (ASIPs) be designed to significantly improve application security? Can the hardware provide fundamental security enablers for an application, beyond cryptographic acceleration? At the same time, is there a way to make ASIPs more ubiquitous? Can insights gained from application-driven design be generalized? This talk will discuss some of the opportunities for application-specific processor architectures to innovate and lead the way in building security features into processors. It will also discuss how ASIP designs can be “generalized”, leading to more ubiquitous deployment of novel and efficient processing techniques gleaned from application-driven designs.

Speaker’s Bio

Ruby B. Lee is the Forrest G. Hamrick Professor of Engineering and Professor of Electrical Engineering at Princeton University, with an affiliated appointment in the Computer Science Department. She is the director of the Princeton Architecture Laboratory for Multimedia and Security (PALMS). Her current research is in designing security-aware processors, memories and systems, protecting critical information, and designing innovative instruction-set architecture for emerging computing paradigms. She is a Fellow of the ACM, Fellow of the IEEE and Associate Editor-in-Chief of IEEE Micro.

Prior to joining the Princeton faculty in 1998, Dr. Lee served as chief architect at Hewlett-Packard, responsible at different times for processor architecture, multimedia architecture and security architecture. She was a key architect of the PA-RISC architecture used for HP workstations and servers. She pioneered adding multimedia instructions to microprocessors, facilitating ubiquitous and pervasive multimedia. She co-led an Intel-HP architecture team designing a new ISA for multimedia and data parallelism for 64-bit Intel microprocessors. Simultaneous with her full-time HP tenure, she was also Consulting Professor of Electrical Engineering at Stanford University. She has a Ph.D. in Electrical Engineering and an M.S. in Computer Science, both from Stanford University, and an A.B. with distinction from Cornell University, where she was a College Scholar. She has been granted over 120 United States and international patents, and has authored numerous conference and journal papers on computer architecture, processor design, multimedia and security topics.
Keynote 2:
The Art of Application-Specific Processor Design: Great Artists use Good Tools, Gert Goossens

Abstract

Application-Specific Processors (ASIPs) are clearly becoming accepted as key building blocks of multi-core systems-on-chip that power today's electronic systems. In this presentation, we will review methodologies for the design of ASIPs.

We believe that retargetable software tools are an absolute prerequisite for contemporary ASIP design. A central element of such a tool flow is a retargetable C compiler that drives the architectural exploration process.

Retargetable software tools introduce formalism and correctness, eliminate guesswork, and extend the designer's capabilities beyond the architectural limitations of configurable processor templates as offered by intellectual property vendors. They enable the creation of differentiating intellectual property, fully supported by C-based software development tools and automatically generated RTL hardware implementations.

Speaker’s Bio

Gert Goossens is the CEO and a co-founder of Target Compiler Technologies, a provider of retargetable tools for the design of application-specific processors. Before founding Target in 1996, Gert Goossens was affiliated with the IMEC research centre, where he headed research groups on behavioural synthesis and software compilation. Gert Goossens holds several patents in the area of processor modelling and design, and has authored or co-authored around 40 papers in electronic design automation. He received a master's and a Ph.D. degree in electrical engineering from K.U. Leuven, in 1984 and 1989 respectively.