Parallel Optimization Tools for High Performance Design of Integrated Circuits WISCAD VLSI Design Automation Lab http://wiscad.ece.wisc.edu Azadeh Davoodi Assistant Professor (joint work with my student Tai-Hsuan Wu) Department of Electrical and Computer Engineering Thanks to: Jeff Linderoth


Page 1: Parallel Optimization Tools for High Performance Design of Integrated Circuits

Parallel Optimization Tools for High Performance Design of Integrated Circuits

WISCAD VLSI Design Automation Lab
http://wiscad.ece.wisc.edu

Azadeh Davoodi, Assistant Professor
(joint work with my student Tai-Hsuan Wu)

Department of Electrical and Computer Engineering

Thanks to: Jeff Linderoth

Page 2: Parallel Optimization Tools for High Performance Design of Integrated Circuits

Research: Optimality in IC Design

Optimality:
– required to assess the quality of existing design techniques
– heuristics are currently used to solve large-scale, non-linear and discrete optimization problems
– there is no way to know how far a heuristic result might be from the optimal solution

[Figure: Profit versus Months Late; x-axis from 0 to 15 months, y-axis from -20% to 120%. Source: MIPS Technologies]

“Optimality matters to shorten the design cycle of Integrated Circuits and meet stringent time-to-market requirements.”

Azadeh Davoodi--WISCAD

Page 3: Parallel Optimization Tools for High Performance Design of Integrated Circuits

Optimization for High Performance Design

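• Discrete optimization problem
• Typically the relaxed continuous version is solved as a convex program and the result is discretized

[Figure: search tree over the discrete variables; levels annotated k1, k2, k3, ..., kn-1, kn, branches labeled sL and sH, 2^n marked at the leaf level]

\[
\begin{aligned}
\min_{x} \quad & \mathrm{Cost}(x) \\
\text{subject to} \quad & \sum_{j \in P} d_j(x) \le T_{cons} \quad \forall P \\
& x_i \in \{Q_i^{(1)}, Q_i^{(2)}, \ldots, Q_i^{(k_i)}\} \quad \forall i
\end{aligned}
\]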
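The relax-then-discretize step mentioned on this slide can be illustrated with a minimal sketch. It assumes nearest-option rounding, which is only one possible discretization rule; the function name `discretize` and the sample data are hypothetical and do not come from the slides.

```python
from typing import List, Sequence


def discretize(relaxed: Sequence[float],
               options: Sequence[Sequence[float]]) -> List[float]:
    """Round each relaxed value x_i to the closest allowed option in Q_i."""
    if len(relaxed) != len(options):
        raise ValueError("one option set is required per variable")
    return [min(q_i, key=lambda q: abs(q - x))
            for x, q_i in zip(relaxed, options)]


# Hypothetical data: three variables, each with its own discrete option set.
options = [[1.0, 2.0, 4.0], [1.0, 3.0], [0.5, 1.5, 2.5, 3.5]]
relaxed = [2.7, 1.4, 3.1]  # stand-in for the output of a convex solver
rounded = discretize(relaxed, options)
```

Rounding of this kind gives no guarantee that the timing constraint still holds or that the result is close to the discrete optimum, which is the gap an exact search is meant to measure.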


Page 4: Parallel Optimization Tools for High Performance Design of Integrated Circuits

Examples of Optimization Complexity

Bench    # of Variables   Exhaustive Search Size   Reduced Search Size   Level in Search Tree
c5315    705              > E+230                  E+10                  35.11
c7552    822              > E+230                  E+08                  26.93
c6288    1256             > E+230                  E+11                  33.98
s1488    307              E+230                    E+11                  32.19
s1494    309              E+227                    E+09                  30.23
s9234    740              > E+230                  E+07                  18.77
s5378    930              > E+230                  E+09                  29.39
s38584   6950             > E+230                  E+09                  47.94
s35932   7260             > E+230                  E+10                  59.17
b20      24484            > E+230                  E+12                  68.34
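To get a feel for the exhaustive search sizes in this table, here is a small sketch. It assumes the exhaustive size is the product of the number of discrete options of each variable, an assumption the slides do not state explicitly. The helper `log10_search_size` and the option counts are hypothetical; the sum of logarithms is used because the product itself overflows a floating-point number for circuits of this size.

```python
import math
from typing import Iterable


def log10_search_size(option_counts: Iterable[int]) -> float:
    """Return log10 of the product of the per-variable option counts."""
    return sum(math.log10(k) for k in option_counts)


# Hypothetical instance: 307 variables (as many as s1488) with 5 options each.
exponent = log10_search_size([5] * 307)
print(f"exhaustive search size is about 10^{exponent:.1f}")
```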


Page 5: Parallel Optimization Tools for High Performance Design of Integrated Circuits

Using Master-Worker Framework of Condor for Grid Optimization

C++ APIs which facilitate:
– dynamic and opportunistic resource utilization
– fault tolerant implementation via checkpointing and job migration

[Figure: the Master keeps lists of unprocessed tasks, tasks in process and finished tasks (T1 to T9) and dispatches them to Workers W1 to W4]

http://www.cs.wisc.edu/condor/mw
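The master-worker pattern on this slide can be sketched in a few lines. This is not the MW C++ API: it uses Python's standard `concurrent.futures` thread pool as a stand-in for the worker pool, and the names `run_master` and `work` are hypothetical. Checkpointing and job migration are not modeled.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List


def run_master(tasks: List[int], work: Callable[[int], int],
               n_workers: int = 4) -> Dict[int, int]:
    """Hand every task to a pool of workers and collect the results."""
    finished: Dict[int, int] = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Tasks in process: each future is mapped back to its task id.
        in_process = {pool.submit(work, t): t for t in tasks}
        for future in as_completed(in_process):
            finished[in_process[future]] = future.result()
    return finished


# Nine tasks T1..T9 and four workers, mirroring the figure on this slide.
results = run_master(list(range(1, 10)), lambda t: t * t)
```

Tasks complete in whatever order the workers finish them, so the master keys results by task id instead of relying on arrival order.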


Page 6: Parallel Optimization Tools for High Performance Design of Integrated Circuits

Master-Worker Implementation for High Performance IC Design

Master:
– imposes variable ordering in the branch-and-bound search tree
– applies pruning of sub-optimal branches
– checkpoints after every 5000 tasks completed by workers

[Figure: branch-and-bound search tree with sL and sH branches at each level and 2^n marked at the leaf level]

Worker:
– each worker sequentially computes upper and lower bounds for K nodes in the search tree and communicates the bounds to the Master
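The division of labor described above can be sketched as a sequential toy example. Everything specific here is an assumption for illustration: the problem model (each option is a `(cost, delay)` pair and a single sum of delays must stay within `t_cons`), the bounding rules, and the names `bounds` and `branch_and_bound`. In the setup on this slide the inner loop over a batch of K nodes would run on a remote worker.

```python
import math
from typing import List, Sequence, Tuple

Option = Tuple[float, float]  # hypothetical model: (cost, delay) of one choice


def bounds(node: Tuple[int, ...], options: Sequence[Sequence[Option]],
           t_cons: float) -> Tuple[float, float]:
    """Worker step: (lower, upper) cost bounds for one partial assignment."""
    cost = sum(options[i][c][0] for i, c in enumerate(node))
    delay = sum(options[i][c][1] for i, c in enumerate(node))
    free = options[len(node):]
    fastest = [min(q, key=lambda o: o[1]) for q in free]
    if delay + sum(o[1] for o in fastest) > t_cons:
        return math.inf, math.inf  # no completion can meet the constraint
    lower = cost + sum(min(o[0] for o in q) for q in free)  # ignores timing
    upper = cost + sum(o[0] for o in fastest)  # a feasible completion
    return lower, upper


def branch_and_bound(options: Sequence[Sequence[Option]], t_cons: float,
                     k: int = 4) -> float:
    """Master loop: hand out batches of up to k nodes, prune, branch."""
    best = math.inf
    stack: List[Tuple[int, ...]] = [()]  # root: nothing assigned yet
    while stack:
        batch = [stack.pop() for _ in range(min(k, len(stack)))]
        for node in batch:  # one worker would process this batch
            lower, upper = bounds(node, options, t_cons)
            best = min(best, upper)
            if lower >= best:
                continue  # prune: this subtree cannot beat the incumbent
            var = len(node)  # fixed variable ordering: next unassigned index
            stack.extend(node + (c,) for c in range(len(options[var])))
    return best


opts = [[(1, 5), (3, 2)], [(2, 4), (4, 1)], [(1, 3), (5, 1)]]
best_cost = branch_and_bound(opts, t_cons=8)
```

Only scalar bounds come back from each node, so the master needs nothing beyond the incumbent cost and the stack of open nodes to decide what to prune.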

Page 7: Parallel Optimization Tools for High Performance Design of Integrated Circuits

Dealing with Communication Overhead

• 3 types of data exchange between the Master and each Worker:
– scalar upper and lower bounds
– circuit information (optimization problem description)
– partial variable assignment
• Send the above only once, when the worker is allocated, and reuse each worker for future tasks as much as possible
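A minimal sketch of the reuse idea follows, under the assumption that the large item sent once is the circuit description while each task carries only a partial assignment. The `Worker` class, the dictionary layout of `circuit`, and the placeholder bound computation are hypothetical; `pickle` is used only to compare message sizes.

```python
import pickle
from typing import Dict, List, Tuple


class Worker:
    """Hypothetical worker that keeps the problem description between tasks."""

    def __init__(self, circuit: Dict[str, List[float]]) -> None:
        self.circuit = circuit  # received once, when the worker is allocated
        self.tasks_done = 0

    def run(self, partial: Tuple[int, ...]) -> Tuple[float, float]:
        """Process one task; only the partial assignment arrives with it."""
        self.tasks_done += 1
        fixed = sum(self.circuit["cost"][i] * c for i, c in enumerate(partial))
        return fixed, fixed  # placeholder for the scalar (lower, upper) bounds


circuit = {"cost": [float(i) for i in range(1000)]}
worker = Worker(circuit)  # one-time transfer of the large description
for task in [(0, 1, 1), (1, 0, 1), (1, 1, 0)]:
    worker.run(task)  # per-task message is just a short tuple

setup_bytes = len(pickle.dumps(circuit))
task_bytes = len(pickle.dumps((0, 1, 1)))
```

Comparing `setup_bytes` with `task_bytes` shows why paying for the description once per worker, and not once per task, matters when a worker handles many tasks.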


Page 8: Parallel Optimization Tools for High Performance Design of Integrated Circuits

MW Implementation in Condor

MASTER SUBMIT FILE

Universe = Scheduler
Executable = master_DGS_socket
Image_Size = 100000
+MemoryRequirements = 100
Input = in_master.socket
Output = out_master.socket
Error = out_worker.socket
Log = _DGS.log
Requirements = (Arch == "INTEL" && OPSYS == "LINUX")
getenv = True
Queue

WORKER SUBMIT FILE

Universe = Vanilla
#Worker 1
Executable = exec0.$$(Opsys).$$(Arch).exe
arguments = 0 8997 8997 144.92.240.35
Log = log_file
Output = output_file.0
Error = error_file.0
Requirements = (Arch == "INTEL" && OPSYS == "LINUX")
should_transfer_files = Yes
when_to_transfer_output = ON_EXIT
rank = Mips
on_exit_remove = false
Queue
#Worker 2 …


Resource Information:
• 179 CAE machines (Intel/Linux)
• If all CAE machines are in use, the job “flocks” to the queue of Intel/Linux machines in CS

Page 9: Parallel Optimization Tools for High Performance Design of Integrated Circuits

Results

Bench    # variables   Runtime    Max # Workers   Average # Workers
c5315    705           36 min     118             105.74
c6288    1256          66 min     126             114.94
c7552    822           31 min     113             101.97
s5378    930           39 min     129             95.57
s9234    740           52 min     119             95.8
s15850   617           48 min     139             112.67
s35932   7260          36 min     163             108.15
s38584   6950          62 min     133             113.86
b18      47191         52 hours   192             189.82
b20      15699         28 hours   187             167.29
b22      24484         38 hours   190             173.73

On average, each variable had 4.5 discrete options to choose from.

Page 10: Parallel Optimization Tools for High Performance Design of Integrated Circuits

Future Plans
• Install and work with personalized Condor
• Work with larger circuits and a larger number of sites in addition to CAE and CS
• Study possibilities for optimization on a grid of multi-core machines
• Better understand and work around the priority scheduling of jobs in Condor
