Parallel Optimization Tools for High Performance Design of Integrated Circuits
WISCAD VLSI Design Automation Lab http://wiscad.ece.wisc.edu
Azadeh Davoodi, Assistant Professor
(joint work with my student Tai-Hsuan Wu)
Department of Electrical and Computer Engineering
Thanks to: Jeff Linderoth
Research: Optimality in IC Design
Optimality:
– required to assess the quality of existing design techniques
– current techniques use heuristics to solve large-scale, non-linear, and discrete optimization problems
– there is no way to tell how far a heuristic result might be from the optimal solution
[Figure: Profit (-20% to 120%) versus Months Late (0 to 15). Source: MIPS Technologies]
“Optimality matters to shorten the design cycle of Integrated Circuits and meet stringent time-to-market requirements.”
Azadeh Davoodi--WISCAD
Optimization for High Performance Design
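• Discrete optimization problem
• Typically the relaxed continuous version is solved as a convex program and the result is discretized

[Figure: search tree with 2^n leaves; each level branches on sL and sH; labels k1, k2, k3, ..., kn-1, kn]

\[
\begin{aligned}
\min_{x}\quad & \mathrm{Cost}(x) \\
\text{s.t.}\quad & \sum_{j \in P_i} d_j(x) \le T_{cons} \quad \forall P_i \\
& x_i \in \{Q_i^{(1)}, Q_i^{(2)}, \ldots, Q_i^{(k_i)}\}
\end{aligned}
\]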
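The relax-then-discretize step mentioned on this slide can be illustrated with a small sketch. It assumes the continuous solution is already available (the convex solve is not shown) and simply snaps each variable to its nearest allowed option. The function name `discretize` and the sample values are hypothetical, not taken from the talk.

```python
def discretize(relaxed, options):
    """Snap each relaxed (continuous) value to the closest allowed option.

    relaxed: list of floats, one per variable (assumed output of a convex solve)
    options: list of lists, options[i] holds the discrete choices for variable i
    """
    return [min(opts, key=lambda q: abs(q - x))
            for x, opts in zip(relaxed, options)]


# Hypothetical example: three variables with small option sets.
relaxed = [1.3, 2.6, 0.9]
options = [[1.0, 2.0, 4.0], [1.0, 2.0, 4.0], [1.0, 2.0]]
snapped = discretize(relaxed, options)
print(snapped)
```

Nearest-option rounding carries no guarantee that the rounded point still meets the timing constraint or stays close to the discrete optimum, which is one reason an exact search is of interest.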
Examples of Optimization Complexity
Bench    # of Variables   Exhaustive Search Size   Reduced Search Size   Level in Search Tree
c5315    705              > E+230                  E+10                  35.11
c7552    822              > E+230                  E+08                  26.93
c6288    1256             > E+230                  E+11                  33.98
s1488    307              E+230                    E+11                  32.19
s1494    309              E+227                    E+09                  30.23
s9234    740              > E+230                  E+07                  18.77
s5378    930              > E+230                  E+09                  29.39
s38584   6950             > E+230                  E+09                  47.94
s35932   7260             > E+230                  E+10                  59.17
b20      24484            > E+230                  E+12                  68.34
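To give a feel for why the exhaustive search sizes are so large, here is a minimal sketch that computes the order of magnitude of a discrete search space as the product of the per-variable option counts. Working in log10 avoids handling enormous integers. The option counts below are made up for illustration and are not the benchmark data from the table.

```python
import math


def log10_search_size(option_counts):
    """Return log10 of the number of complete assignments.

    option_counts[i] is the number of discrete options of variable i;
    the exhaustive search size is the product of all counts.
    """
    return sum(math.log10(k) for k in option_counts)


# Hypothetical circuit: 300 variables with 4 options each.
exponent = log10_search_size([4] * 300)
print(f"about 1E+{int(exponent)} assignments")
```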
Using Master-Worker Framework of Condor for Grid Optimization
C++ APIs which facilitate:
– dynamic and opportunistic resource utilization
– fault-tolerant implementation via checkpointing and job migration

[Figure: Master with queues of Unprocessed Tasks, Tasks in process, and Finished Tasks (tasks T1 to T9), and Workers W1 to W4]

http://www.cs.wisc.edu/condor/mw
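The master-worker pattern on this slide can be sketched in a few lines. This is not the MW C++ API; it is a stand-in that uses Python's standard `concurrent.futures` thread pool, with a hypothetical `run_master` helper, to show the flow of tasks from unprocessed, to in process, to finished.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_master(tasks, work, num_workers=4):
    """Hand each task to a pool of workers and collect the results.

    tasks: iterable of hashable task descriptions (the unprocessed tasks)
    work:  function executed by a worker on one task
    """
    finished = {}
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Tasks in process: map each pending future back to its task.
        in_process = {pool.submit(work, task): task for task in tasks}
        for future in as_completed(in_process):
            finished[in_process[future]] = future.result()
    return finished


# Nine toy tasks (T1 to T9) and a trivial worker function.
results = run_master(range(1, 10), lambda t: t * t)
print(sorted(results.items()))
```

A grid framework additionally has to cope with workers that appear and disappear, which is where checkpointing and job migration come in; the sketch leaves that out.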
Master-Worker Implementation for High Performance IC Design
Master:
– imposes variable ordering in the branch-and-bound search tree
– applies pruning of sub-optimal branches
– checkpoints after every 5000 tasks completed by workers

[Figure: branch-and-bound search tree with 2^n leaves; each level branches on sL and sH]

Worker:
– each worker sequentially computes upper and lower bounds for K nodes in the search tree and communicates the bounds to the Master
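The interplay of variable ordering, bounds, and pruning can be sketched on a toy problem. The sketch below is single-process and assumes a simplified model that is not from the talk: each variable picks one (cost, delay) option, delays add up along a single path, and the lower bound is the cost so far plus the cheapest remaining options. The authors' actual bounds and the split of work between master and workers are not reproduced here.

```python
def branch_and_bound(options, t_cons):
    """Minimize total cost subject to total delay <= t_cons.

    options[i] is a list of (cost, delay) pairs for variable i.
    Variables are branched on in index order (the imposed variable ordering).
    Returns (best_cost, chosen option indices) or (inf, None) if infeasible.
    """
    n = len(options)
    # Suffix sums: cheapest cost and smallest delay achievable from level i on.
    min_cost = [0] * (n + 1)
    min_delay = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        min_cost[i] = min_cost[i + 1] + min(c for c, _ in options[i])
        min_delay[i] = min_delay[i + 1] + min(d for _, d in options[i])

    best_cost, best_choice = float("inf"), None  # incumbent (upper bound)
    stack = [(0, 0, 0, [])]  # (depth, cost so far, delay so far, choices)
    while stack:
        depth, cost, delay, choice = stack.pop()
        if cost + min_cost[depth] >= best_cost:
            continue  # prune: lower bound cannot beat the incumbent
        if delay + min_delay[depth] > t_cons:
            continue  # prune: no completion can meet the timing constraint
        if depth == n:
            best_cost, best_choice = cost, choice  # new incumbent
            continue
        for k, (c, d) in enumerate(options[depth]):
            stack.append((depth + 1, cost + c, delay + d, choice + [k]))
    return best_cost, best_choice


# Hypothetical instance: three variables, two (cost, delay) options each.
options = [[(1, 5), (2, 2)], [(2, 4), (4, 1)], [(1, 3), (2, 2)]]
print(branch_and_bound(options, t_cons=8))
```

In a distributed version, the nodes popped from the stack would be handed to workers in batches, and the incumbent would be shared through the master so that every worker prunes against the best known solution.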
Dealing with Communication Overhead
• 3 types of data exchange between the Master and each Worker:
– scalar upper and lower bounds
– circuit information (optimization problem description)
– partial variable assignment
• Send the above only once, when the worker is allocated, and reuse each worker for future tasks as much as possible
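One possible reading of this slide is sketched below: the bulky problem description is shipped to a worker a single time, and later tasks carry only a small partial assignment. The slide does not spell out this split, so treat it as an assumption. The `Master` and `Worker` classes, the dictionary-based problem, and the toy bound are all hypothetical.

```python
class Worker:
    def __init__(self):
        self.problem = None      # cached problem description
        self.setup_messages = 0  # how many times the description was received

    def setup(self, problem):
        self.problem = problem
        self.setup_messages += 1

    def run_task(self, partial_assignment):
        # Toy stand-in for a bound computation: cost of the fixed variables.
        return sum(self.problem[var][choice]
                   for var, choice in partial_assignment.items())


class Master:
    def __init__(self, problem):
        self.problem = problem
        self.initialized = set()  # workers that already hold the problem

    def dispatch(self, worker, partial_assignment):
        # Send the large description only on first contact with this worker.
        if id(worker) not in self.initialized:
            worker.setup(self.problem)
            self.initialized.add(id(worker))
        return worker.run_task(partial_assignment)


problem = {"g1": [3, 5], "g2": [2, 7]}  # hypothetical per-option costs
master, worker = Master(problem), Worker()
bounds = [master.dispatch(worker, pa)
          for pa in ({"g1": 0}, {"g1": 1, "g2": 0})]
print(bounds, worker.setup_messages)
```

Reusing an already initialized worker avoids paying the setup transfer again, which matters when the description is large relative to a single task.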
MW Implementation in Condor
MASTER SUBMIT FILE
Universe = Scheduler
Executable = master_DGS_socket
Image_Size = 100000
+MemoryRequirements = 100
Input = in_master.socket
Output = out_master.socket
Error = out_worker.socket
Log = _DGS.log
Requirements = (Arch == "INTEL" && OPSYS=="LINUX")
getenv = True
Queue
WORKER SUBMIT FILE
Universe = Vanilla
#Worker 1
Executable = exec0.$$(Opsys).$$(Arch).exe
arguments = 0 8997 8997 144.92.240.35
Log = log_file
Output = output_file.0
Error = error_file.0
Requirements = ( Arch=="INTEL" && OPSYS=="LINUX")
should_transfer_files = Yes
when_to_transfer_output = ON_EXIT
rank = Mips
on_exit_remove = false
Queue
#Worker 2 …
Resource Information:
• 179 CAE machines (Intel/Linux)
• If all CAE machines are in use, the job “flocks” to the queue of Intel/Linux machines in CS
Results
Bench    # Variables   Runtime    Max # Workers   Average # Workers
c5315    705           36 min     118             105.74
c6288    1256          66 min     126             114.94
c7552    822           31 min     113             101.97
s5378    930           39 min     129             95.57
s9234    740           52 min     119             95.8
s15850   617           48 min     139             112.67
s35932   7260          36 min     163             108.15
s38584   6950          62 min     133             113.86
b18      47191         52 hours   192             189.82
b20      15699         28 hours   187             167.29
b22      24484         38 hours   190             173.73
On average, each variable had 4.5 discrete options to choose from.
Future Plans
• Install and work with personalized Condor
• Work with larger circuits and a larger number of sites in addition to CAE and CS
• Study possibilities for optimization on a grid of multi-core machines
• Better understand and work around the priority scheduling of jobs at Condor