Placement Feedback: A Concept and Method for
Better Min-Cut Placements
Andrew B. Kahng, CSE & ECE Departments, University of CA, San Diego, La Jolla, CA, [email protected]
Sherief Reda, CSE Department, University of CA, San Diego, La Jolla, CA, [email protected]
VLSI CAD Laboratory at UCSD, http://vlsicad.ucsd.edu
Outline
Min-cut Placement and Terminal Propagation
Ambiguous Terminal Propagation
Placement Feedback
Iterated Controlled Feedback
Accelerated Feedback
Experimental Results
Conclusions
Min-Cut Placement: Objective
A Steiner tree represents the minimum wirelength needed to connect a number of cells
Total wirelength is the sum of the lengths of the Steiner trees
Routed wirelength is typically larger than total wirelength due to detours arising from contention on routing resources
Half-Perimeter Wirelength (HPWL) correlates well with routed wirelength, represents a lower bound on the net length, and is fast to calculate
Min-cut Placement Objective: Total wirelength minimization
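To illustrate why HPWL is fast to calculate, here is a minimal sketch that computes the half-perimeter of the bounding box of a net's pins and sums it over nets. The function names and the pin coordinates are made up for illustration and are not taken from Capo.

```python
from typing import Iterable, Sequence, Tuple

Point = Tuple[float, float]


def hpwl(pins: Sequence[Point]) -> float:
    """Half-perimeter of the bounding box that encloses all pins of one net."""
    xs, ys = zip(*pins)
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets: Iterable[Sequence[Point]]) -> float:
    """HPWL estimate of a placement: the sum of the per-net half-perimeters."""
    return sum(hpwl(net) for net in nets)


# Hypothetical pin locations, for illustration only.
net_a = [(0, 0), (3, 4), (1, 2)]   # bounding box is 3 wide and 4 tall
net_b = [(2, 2), (5, 2)]           # bounding box is 3 wide and 0 tall

print(hpwl(net_a))                 # 7
print(total_hpwl([net_a, net_b]))  # 10
```

Each net needs only one pass over its pins, which is why the estimate is cheap compared with constructing a Steiner tree.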
Min-Cut Placement: Method
Min-Cut Placement Method: Sequential min-cut partitioning
Figure: the input netlist (hypergraph) is partitioned into blocks at Level 1, and each block is partitioned again at Level 2.
Key Issues:
How to partition a hypergraph?
• Multilevel hypergraph partitioning using the Fiduccia/Mattheyses heuristic
How to propagate net connectivity information from one block to another?
Terminal Propagation
Figure: a simple hypergraph with cells A, B, C, D. After the first placement level the cells are assigned to blocks 1 and 2; two outcomes of the next level (Case I and Case II) are shown.
Case I: Blocks are partitioned in isolation → optimal local partitioning results but far from optimal global results
Case II: Information about cells in one block is accounted for in the other block → local partitioning results are translated to global wirelength results
Well-studied problem: Terminal propagation (Dunlop/Kernighan85); Global objectives/cycling (HuangK97, Zheng/Dutt00, Yildiz/Madden01)
Terminal Propagation Mechanism
Figure: blocks B1 and B2 with cells u and v and the propagated fixed vertex uf.
B1 has been partitioned; B2 is to be partitioned
u is propagated as a fixed vertex uf to the subblock that is closer
uf biases the partitioner to move v upward
Ambiguous Terminal Propagation
Ambiguous propagation occurs when terminals, e.g. Y4, are equally close to the two subblocks of a block under partitioning
Traditional solution: either propagate to both subblocks or do not propagate at all
Figure: terminals Y1 to Y4 and fixed vertices f1 to f3 around a block under partitioning, with the partition fuzziness region marked.
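The propagation rule and its ambiguous case can be sketched as follows. This is only an illustration under stated assumptions: distances are Manhattan distances to subblock centers, and `tolerance` is a hypothetical stand-in for the partition fuzziness; neither is confirmed as what Capo uses.

```python
from typing import Tuple

Point = Tuple[float, float]


def manhattan(p: Point, q: Point) -> float:
    """Manhattan distance between two points (assumed distance measure)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def propagate_terminal(terminal: Point, center_top: Point,
                       center_bottom: Point, tolerance: float = 0.0) -> str:
    """Return the subblock a terminal is propagated to, or 'ambiguous'.

    A terminal is ambiguous when its distances to the two subblock
    centers differ by no more than `tolerance` (hypothetical parameter).
    """
    d_top = manhattan(terminal, center_top)
    d_bottom = manhattan(terminal, center_bottom)
    if abs(d_top - d_bottom) <= tolerance:
        return "ambiguous"
    return "top" if d_top < d_bottom else "bottom"


# Hypothetical block split by a horizontal cutline at y = 5.
top, bottom = (5.0, 7.5), (5.0, 2.5)
print(propagate_terminal((12.0, 9.0), top, bottom))  # top
print(propagate_terminal((12.0, 5.0), top, bottom))  # ambiguous
```

The second terminal sits exactly on the extension of the cutline, so neither subblock is closer and the traditional flow has to fall back on propagating to both subblocks or to neither.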
Effect of Ambiguous Terminal Propagations
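Figure: a block under partitioning with subblocks L and R
Given an edge e with a set of cells I, each cell in I falls into one of three classes:
• cells closer to L than R
• cells closer to R than L
• cells equally proximate to both L and R (ambiguous)
Terminal propagation decisions (without ambiguous cells):
1. Only L-closer cells → L
2. Only R-closer cells → R
3. L-closer and R-closer cells → neither
Terminal propagation decisions (with ambiguous cells):
1. L-closer and ambiguous cells → L or neither
2. R-closer and ambiguous cells → R or neither
3. L-closer, R-closer, and ambiguous cells → neither
4. Only ambiguous cells → neither or L or R
Conclusion: Ambiguous propagations lead to indeterminism in propagation decisions → wirelength increase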
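The two decision lists can be encoded as a small function that returns the set of possible propagation outcomes for an edge. This sketch assumes the three cell classes are "closer to L", "closer to R", and "ambiguous"; the class labels and the handling of an edge with no external cells are assumptions made for illustration.

```python
from typing import Iterable, Set


def possible_decisions(cell_classes: Iterable[str]) -> Set[str]:
    """Possible propagation outcomes for one edge.

    `cell_classes` holds one label per external cell of the edge:
    'L', 'R', or 'ambiguous' (hypothetical labels).
    """
    classes = set(cell_classes)
    has_l, has_r = "L" in classes, "R" in classes
    has_amb = "ambiguous" in classes
    if has_l and has_r:
        return {"neither"}                      # pulls cancel out
    if has_l:
        return {"L", "neither"} if has_amb else {"L"}
    if has_r:
        return {"R", "neither"} if has_amb else {"R"}
    if has_amb:
        return {"L", "R", "neither"}            # fully indeterminate
    return {"neither"}                          # assumption: nothing to propagate


print(possible_decisions(["L", "L"]))               # {'L'}
print(sorted(possible_decisions(["L", "ambiguous"])))   # ['L', 'neither']
print(sorted(possible_decisions(["ambiguous"])))        # ['L', 'R', 'neither']
```

Any result with more than one element marks an edge whose propagation decision is indeterminate, which is the situation the conclusion on this slide ties to increased wirelength.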
Min-Cut Placement Flow
Flow: Level 1 Partitioning → Terminal Propagation → Level 2 Partitioning → Terminal Propagation → ... → Level m Partitioning → Terminal Propagation
The input to the flow is the I/O pad locations and the circuit netlist, with all cells initially collapsed at the center of the chip
The output of the flow is a global placement, where groups of cells are assigned portions of the chip’s rows
A detailed placer determines the exact locations of all cells
Mitigating Ambiguous Terminal Propagation
Two hyperedges: {A, B, C}, {X, A, B}. B1 is partitioned before B2
C is ambiguously propagated
Further partitioning: Cuts = 3, Wirelength = 6
Undo
Repartition: C is propagated to the top
Further partitioning: Cuts = 2, Wirelength = 5
Figure: positions of cells A, B, C, X in blocks B1 and B2 at each step.
Placement Feedback
Traditional Placement Flow: Level 1 Partitioning → Terminal Propagation → Level 2 Partitioning → Terminal Propagation → ... → Level m Partitioning → Terminal Propagation
Placement Flow with Feedback
For each placement level:
- Undo all partitioning/block bisecting results, but retain the new cell locations for terminal propagations
- Use the new cell locations to re-do the level’s placement
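The per-level feedback step can be sketched as a loop that partitions a level, discards the resulting blocks while keeping the new cell locations, and then partitions the same level again. This is a minimal sketch, not Capo's code: `partition_level` is a hypothetical callback, and the toy version below only records its calls so the control flow can be observed.

```python
def place_with_feedback(num_levels, partition_level, initial_locations,
                        feedback_iterations=1):
    """Run a min-cut placement flow with feedback at every level.

    `partition_level(level, blocks, locations)` is a hypothetical callback
    returning (new_blocks, new_locations) for one placement level.
    """
    blocks, locations = None, dict(initial_locations)
    for level in range(num_levels):
        # First pass over the level.
        new_blocks, new_locations = partition_level(level, blocks, locations)
        for _ in range(feedback_iterations):
            # Undo the partitioning (new_blocks is discarded), but retain
            # the new cell locations and re-do the level with them.
            new_blocks, new_locations = partition_level(
                level, blocks, new_locations)
        blocks, locations = new_blocks, new_locations
    return blocks, locations


calls = []


def toy_partition(level, blocks, locations):
    """Toy stand-in: logs its inputs and shifts every cell by one unit."""
    calls.append((level, blocks, dict(locations)))
    return level + 1, {cell: x + 1 for cell, x in locations.items()}


blocks, locations = place_with_feedback(2, toy_partition, {"a": 0})
print(len(calls))   # 4: two passes for each of the two levels
print(calls[1])     # (0, None, {'a': 1})
```

The second logged call shows the key property: the level is re-partitioned from the same starting blocks (`None` here), but terminal propagation sees the cell locations produced by the first pass.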
Placement Feedback Assessment
Metrics:
Reduction in ambiguous terminal propagations
Associated reduction in HPWL
Experimental Setup
We implement feedback in Capo (version 8.7)
For each placement level:
- Measure the number of ambiguous terminal propagations before and after feedback
- Measure the HPWL estimate before and after feedback (assuming all previous placement levels had feedback)
Feedback Effects
Reductions in ambiguous terminals and HPWL per level are strongly correlated
Figure: percentage reduction in ambiguous propagations (axis 0 to 100) versus placement level (0 to 15).
Figure: percentage reduction in HPWL (axis 0 to 1.5) versus placement level (0 to 15).
Iterative Placement Feedback
Since the feedback loop produces new outputs → iterate over the feedback loop a number of times
If the feedback response is not desirable → insert a feedback controller to enhance the response
Feedback controller should:
• Evaluate and optimize some placement quality or objective
• Decide when to terminate feedback iterating
Placement Flow with Feedback Controllers: Level 1 Partitioning → Terminal Propagation → Level 2 Partitioning → Terminal Propagation → ... → Level m Partitioning → Terminal Propagation, with a feedback controller in the loop
Feedback Controller Objectives
Two possible objectives (placement qualities) to optimize:
Cut partitioning objective: QP = c1 + c2
HPWL objective: QH = c1 × d1 + c2 × d2
Figure: blocks B1 and B2, labeled with c1, d1 and c2, d2.
QP and QH are not correlated!
• Example: Assume d1 = 6 and d2 = 8
  c1 = c2 = 100 → QP = 200 and QH = 1400
  c1 = 85, c2 = 112 → QP = 197 and QH = 1406
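The example on this slide can be checked with a few lines of arithmetic. The sketch below simply evaluates the two objective formulas for the slide's numbers; the function names are made up for illustration.

```python
def cut_objective(cuts):
    """QP: sum of the per-block cut values c1 + c2."""
    return sum(cuts)


def hpwl_objective(cuts, dists):
    """QH: per-block cut values weighted by d1, d2 (c1*d1 + c2*d2)."""
    return sum(c * d for c, d in zip(cuts, dists))


dists = (6, 8)  # d1, d2 from the slide's example
for cuts in [(100, 100), (85, 112)]:
    print(cuts, cut_objective(cuts), hpwl_objective(cuts, dists))
# (100, 100) 200 1400
# (85, 112) 197 1406
```

The second solution has the smaller QP (197 versus 200) but the larger QH (1406 versus 1400), so a controller optimizing one objective can make the other worse.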
Feedback Controller Stopping Criteria
A. Monotonic Improvement Criterion: Iterate per placement level until there is no further improvement in QP (or QH)
B. Best Improvement Criterion: Iterate per placement level a fixed number of times, but pass on the best result seen in terms of QP (or QH)
C. Unconstrained Criterion: Iterate per placement level a fixed number of times and pass on the last result
Figure (one per criterion): QP or QH versus feedback iteration (0 to 5).
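The three stopping criteria can be compared on a recorded trace of quality values, one per iteration, where lower is better. This is only a sketch: the trace is invented, and a real monotonic controller would stop iterating at the first non-improving iteration rather than inspect a precomputed list.

```python
def monotonic_improvement(quality):
    """Index of the result passed on when iterating stops at the first
    iteration that does not improve on the previous one."""
    chosen = 0
    for i in range(1, len(quality)):
        if quality[i] >= quality[chosen]:
            break
        chosen = i
    return chosen


def best_improvement(quality):
    """Index of the best (lowest) value seen over a fixed number of iterations."""
    return min(range(len(quality)), key=quality.__getitem__)


def unconstrained(quality):
    """Index of the last iteration, regardless of its quality."""
    return len(quality) - 1


# Hypothetical QP (or QH) trace for iterations 0 to 5 of one placement level.
trace = [200, 190, 193, 185, 188, 187]
print(monotonic_improvement(trace))  # 1
print(best_improvement(trace))       # 3
print(unconstrained(trace))          # 5
```

On this trace the monotonic criterion stops early and misses the better result at iteration 3, which the best improvement criterion keeps; the unconstrained criterion passes on iteration 5 even though it is not the best seen.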
Controller Type Comparison
3 Stopping Criteria      2 Objectives
Monotonic Improvement    Total Cut (QP)
Monotonic Improvement    HPWL Estimate (QH)
Best Improvement         Total Cut (QP)
Best Improvement         HPWL Estimate (QH)
Unconstrained            -
Combinations of the 3 stopping criteria and 2 objectives yield 5 controllers
We study the aggregate impact of the different controllers on the final HPWL
QP (based on partitioning) controllers dominate QH (based on HPWL) controllers
Best Improvement controllers outperform monotonic improvement controllers
Best Improvement QP controller slightly outperforms the unconstrained controller
Effect of Controller on Final Wirelength
Figure: Final HPWL versus number of iterations (0 to 6) for different controllers: Monotonic QH, Best QH, Monotonic QP, Best QP, Unconstrained. HPWL axis spans 14,000,000 to 16,000,000.
Asymptotic Controller Behavior
Results are the average of 6 seeds for up to 12 iterations using the best improvement QP controller
The final value oscillates slightly around a fixed value, with an 8-9% improvement in HPWL in comparison to the traditional placement flow
Figure: Final HPWL versus number of iterations (0 to 12) for the Best QP controller. HPWL axis spans 14,000,000 to 16,000,000.
Accelerated Feedback
Typically, placers call the multilevel partitioner a number of times and use the best cluster-tree partitioning results
In iterated feedback, only the last feedback iteration determines the partitioning results; the other iterations serve to make terminal propagation accurate
Feedback runtime ∝ number of feedback iterations
To speed up our feedback implementation:
→ Call the multilevel partitioner once (1 V-cycle) for each feedback loop
→ Restore the default placer settings (2 V-cycles) for the last feedback iteration
Figure: V-cycle (coarsening followed by uncoarsening).
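The acceleration amounts to a V-cycle schedule over the partitioning passes of one placement level. The sketch below assumes the schedule covers `passes` partitioning passes and that only the last pass uses the default of 2 V-cycles; the function name and the way passes are counted are assumptions for illustration, not Capo's interface.

```python
def v_cycle_schedule(passes, default_v_cycles=2, accelerated=True):
    """Number of V-cycles used in each partitioning pass of one level.

    With acceleration, every pass but the last uses a single V-cycle;
    the last pass, which determines the partitioning result, uses the default.
    """
    if passes < 1:
        raise ValueError("at least one partitioning pass is required")
    if not accelerated:
        return [default_v_cycles] * passes
    return [1] * (passes - 1) + [default_v_cycles]


plain = v_cycle_schedule(3, accelerated=False)
fast = v_cycle_schedule(3)
print(plain, sum(plain))  # [2, 2, 2] 6
print(fast, sum(fast))    # [1, 1, 2] 4
```

Since the earlier passes only need to supply cell locations for terminal propagation, spending fewer V-cycles on them reduces the partitioning work per level while leaving the final pass at the default setting.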
Experimental Setup
We test our methodology in Capo version 8.7
Placement results are the average of 6 seeds
Cadence’s WarpRoute is used for routed wirelength evaluation
All experiments were conducted on a 2.4 GHz Xeon Linux workstation with 2 GB RAM
The implementation took 130 lines of C++ code
We evaluate feedback on the IBM version 1, IBM version 2, and PEKO benchmarks
We use 3 feedback iterations with the best improvement QP feedback controller
HPWL Results (IBM Version 1)
Figure: Percentage improvement in HPWL (Half-Perimeter Wirelength) in comparison to Capo for feedback (FB) and accelerated feedback (AFB) on benchmarks ibm01 to ibm17 (axis 0 to 14%).
Feedback: Max improvement 13.73% and average improvement 5.43% with 4.10x the original Capo runtime
Accelerated Feedback: Max improvement 13.43% and average improvement 4.70% with 2.43x the original Capo runtime
PEKO benchmarks: Max improvement 10% and average improvement 5% for feedback at the expense of a 2-3x increase in Capo runtime
Routed Wirelength Results (IBM Version 2 - Hard)
Figure: Percentage improvement in routed wirelength in comparison to Capo on ibm01, ibm02, ibm07, ibm08 (axis 0 to 10%).
Number of routing violations:
Benchmark   Capo   Feedback
ibm01       601    103
ibm02       0      0
ibm07       450    0
ibm08       59     0
Routed Wirelength Results (IBM Version 2 - Easy)
Figure: Percentage improvement in routed wirelength in comparison to Capo on ibm01, ibm02, ibm07, ibm08 (axis 0 to 8%).
Number of routing violations:
Benchmark   Capo   Feedback
ibm01       1238   0
ibm02       0      0
ibm07       0      0
ibm08       0      0
Conclusions
• New understanding of how ambiguous terminal propagation leads to indeterminism in propagation results and degraded placer performance
• Idea: reduce indeterminism by undoing placement results, but still using them to guide future partitioning.
• Flavors of this approach have been proposed before, but in different contexts
• Our approach is captured as feedback, which we tune using controllers
• Detailed study of variant objectives that can be optimized by the controllers, as well as iterating criteria
• Accelerated feedback: efficient implementations to reduce runtime impact
• IBMv1 HPWL results: up to 14% (best) and 6% (avg) improvement over Capo
• IBMv2 routed WL results: up to 10% improvement over Capo, with improved routability and reduced via count
• Accelerated feedback is now the default mode in Capo
Acknowledgments
We thank Igor Markov (University of Michigan) for helpful discussions.
Thank You
Block Ordering
Orderings compared: regular, random, and alternate
Results are inconclusive!