APLACE: A General and Extensible Large-Scale Placer
Andrew B. Kahng Sherief Reda Qinke Wang
VLSICAD lab
University of CA, San Diego
Goals and Plan
Goals:
• Scalable and robust implementation
• Leave no stone unturned / no QoR on the table
• Leave nothing for competitors
Plan and Schedule:
• Use APlace as an initial framework
• One month for coding + one month for tuning
Implementation Framework
APlace weaknesses:
• Weak clustering
• Poor legalization / detailed placement
[Flow diagram] Global Phase: Clustering → Adaptive APlace engine → Unclustering → Legalization. Detailed Phase: Global moving → WS arrangement → Cell order polishing.
New APlace Flow
New APlace:
1. New clustering
2. Adaptive parameter setting for scalability
3. New legalization + iterative detailed placement
Clustering/Unclustering
• A multi-level paradigm with clustering ratio = 10
• 2000 top-level clusters
• Similar in spirit to [HuM04] and [AlpertKNRV05]
Algorithm Sketch (for each clustering level):
1. Calculate each node's clustering score to its neighbors, based on the number of connections.
2. Sort all scores and process nodes in order, as long as cluster size upper bounds are not violated.
3. If a node's score needs updating, update the score and re-insert it in order.
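The clustering pass above can be sketched as a greedy best-pair loop with lazy score invalidation; the data layout and all names below are assumptions for illustration, not APlace's actual code:

```python
import heapq
from collections import defaultdict

def cluster_level(edges, area, max_area):
    # One clustering level (sketch).  edges: {(u, v): #connections},
    # area: {node: area} (updated in place), max_area: cluster size bound.
    # Returns {node: cluster representative}.
    parent = {u: u for u in area}
    def find(u):                        # union-find with path halving
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    nbr = {u: defaultdict(int) for u in area}
    for (u, v), w in edges.items():
        nbr[u][v] += w
        nbr[v][u] += w

    # Score each node against its neighbors; process pairs in sorted order.
    heap = [(-w, u, v) for u in area for v, w in nbr[u].items() if u < v]
    heapq.heapify(heap)
    while heap:
        negw, u, v = heapq.heappop(heap)
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        if nbr[ru].get(rv, 0) != -negw:
            continue                    # stale score: a fresher entry exists
        if area[ru] + area[rv] > max_area:
            continue                    # respect the cluster size upper bound
        parent[rv] = ru                 # merge rv into ru
        area[ru] += area[rv]
        merged = defaultdict(int)       # recompute scores to all neighbors
        for x, w in list(nbr[ru].items()) + list(nbr[rv].items()):
            rx = find(x)
            if rx != ru:
                merged[rx] += w
        nbr[ru] = merged
        for rx, w in merged.items():    # rekey neighbor tables, re-insert scores
            nbr[rx].pop(rv, None)
            nbr[rx][ru] = w
            a, b = (ru, rx) if ru < rx else (rx, ru)
            heapq.heappush(heap, (-w, a, b))
    return {u: find(u) for u in area}
```

Stale heap entries are skipped rather than deleted, which matches the "update score and insert in order" step without an explicit decrease-key.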
Adaptive Tuning / Legalization
Adaptive Parameterization:
1. Automatically decide the initial weight for the wirelength objective according to the gradients.
2. Decrease the wirelength weight based on the current placement progress.

Legalization:
1. Sort all cells from left to right; move each cell (or group of cells) in order to the closest legal position(s).
2. Sort all cells from right to left; move each cell (or group of cells) in order to the closest legal position(s).
3. Pick the best of (1) and (2).
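The three-step legalization (left-to-right sweep, right-to-left sweep, keep the better result) can be sketched for a single row; cell grouping and multi-row handling are omitted and all names are illustrative:

```python
def sweep(cells, row_width):
    # Left-to-right pass: visit cells in x order and push each to the
    # closest non-overlapping position at or right of its current x.
    order = sorted(range(len(cells)), key=lambda i: cells[i][0])
    pos = [0.0] * len(cells)
    cursor = 0.0
    for i in order:
        x, w = cells[i]
        pos[i] = max(x, cursor)
        cursor = pos[i] + w
    if cursor > row_width:            # ran past the row end: push tail back left
        limit = row_width
        for i in reversed(order):
            pos[i] = min(pos[i], limit - cells[i][1])
            limit = pos[i]
    return pos

def legalize_row(cells, row_width):
    # cells: list of (x, width).  Step 1: left-to-right sweep.
    lr = sweep(cells, row_width)
    # Step 2: right-to-left sweep, realized by mirroring the row.
    mirrored = [(row_width - x - w, w) for x, w in cells]
    rl = [row_width - p - w
          for p, (_, w) in zip(sweep(mirrored, row_width), cells)]
    # Step 3: pick the sweep with the smaller total displacement.
    disp = lambda p: sum(abs(a - x) for a, (x, _) in zip(p, cells))
    return lr if disp(lr) <= disp(rl) else rl
```

Total displacement stands in here for the real wirelength objective when comparing the two sweeps.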
Whitespace Compaction: for each layout row, optimally arrange whitespace to minimize wirelength while maintaining relative cell order [KahngTZ99], [KahngRM04].
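One way to realize per-row whitespace arrangement is a dynamic program over discretized positions. This is only a sketch under two assumptions: an integer position grid, and a per-cell |x - target| cost standing in for the true wirelength (the cited papers use more efficient piecewise-linear formulations):

```python
def compact_row(widths, targets, row_width):
    # Fixed cell order; choose non-overlapping integer positions in [0, row_width]
    # minimizing sum_i |x_i - targets[i]|.  Exact on the grid.
    n = len(widths)
    INF = float('inf')
    prevmin = [0.0] * (row_width + 1)   # min cost of earlier cells within [0, r]
    arg = []                            # arg[i][r]: best left edge of cell i, budget r
    for i in range(n):
        w, t = widths[i], targets[i]
        newmin = [INF] * (row_width + 1)
        ai = [-1] * (row_width + 1)
        best, bestx = INF, -1
        for x in range(row_width - w + 1):
            c = prevmin[x] + abs(x - t)
            if c < best:                # running prefix-min over left edges
                best, bestx = c, x
            newmin[x + w], ai[x + w] = best, bestx
        arg.append(ai)
        prevmin = newmin
    pos, r = [0] * n, row_width         # backtrack the optimal positions
    for i in range(n - 1, -1, -1):
        pos[i] = arg[i][r]
        r = pos[i]
    return pos
```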
Cell Order Polishing: for a window of neighboring cells, optimally arrange cell order and whitespace to minimize wirelength.
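Window-based order polishing can be sketched as exhaustive enumeration over a small window; here slot positions are kept fixed and a per-cell target x stands in for the real wirelength objective (both are simplifying assumptions):

```python
from itertools import permutations

def polish_window(slots, targets):
    # slots: fixed x positions in the window; targets: each cell's ideal x.
    # Try every assignment of cells to slots; keep the cheapest.
    best_order, best_cost = None, float('inf')
    for order in permutations(range(len(targets))):
        cost = sum(abs(slots[s] - targets[c]) for s, c in enumerate(order))
        if cost < best_cost:
            best_cost, best_order = cost, order
    return list(best_order), best_cost
```

Enumeration is affordable only because the window is small (k! orders for k cells), which is why polishing works on a sliding window of neighboring cells rather than the whole row.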
Detailed Placement
Global Moving:
Optimally move a cell to a better available position to minimize wirelength
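Global moving can be sketched as trying each available position for a cell and keeping the one with the lowest HPWL; the netlist representation below is an assumption for illustration:

```python
def hpwl(nets, pos):
    # Half-perimeter wirelength: nets are tuples of cell names,
    # pos maps cell name -> (x, y).
    total = 0.0
    for net in nets:
        xs = [pos[c][0] for c in net]
        ys = [pos[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def global_move(cell, free_sites, nets, pos):
    # Evaluate every available site for `cell`; commit the best one.
    best_site, best = pos[cell], hpwl(nets, pos)
    for site in free_sites:
        pos[cell] = site
        cost = hpwl(nets, pos)
        if cost < best:
            best, best_site = cost, site
    pos[cell] = best_site
    return best
```

A production placer would evaluate only the nets incident to the moved cell rather than recomputing total HPWL, but the greedy structure is the same.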
Parameterization and Parallelization

Tuning Knobs:
• Clustering ratio, # top-level clusters, cluster area constraints
• Initial wirelength weight, wirelength weight reduction ratio
• Max # CG iterations for each wirelength weight
• Target placement discrepancy
• Detailed placement parameters, etc.

Resources:
• SDSC ROCKS Cluster: 8 Xeon CPUs at 2.8 GHz
• Michigan (Prof. Sylvester's group): 8 assorted CPUs
• UCSD FWGrid: 60 Opteron CPUs at 1.6 GHz
• UCSD VLSICAD group: 8 Xeon CPUs at 2.4 GHz
Wirelength improvement after tuning: 2-3%
Artificial Benchmark Synthesis
• Created a number of artificial benchmarks to test code scalability and performance
• Used statistics of the given benchmarks to create synthesized versions of bigblue3 and bigblue4
• Mimicked the fixed-block layouts of the originals in the synthesized benchmarks
• Proved useful: exposed a clustering problem when many fixed blocks are present
Results
(GP = global placement, Leg = after legalization, DP = after detailed placement)

Circuit     GP HPWL   Leg HPWL   DP HPWL   CPU (h)
adaptec1      80.20      81.80     79.50       3
adaptec2      84.70      92.18     87.31       3
adaptec3     218.00     230.00    218.00      10
adaptec4     182.90     194.75    187.71      13
bigblue1      93.67      97.85     94.64       5
bigblue2     140.68     147.85    143.80      12
bigblue3     357.28     407.09    357.89      22
bigblue4     813.91     868.07    833.21      50