Partitioning
• 1st homework: Deadline: 13 Esfand
Partitioning
• PROBLEM: Large, complex systems
− Break them up into smaller subsystems
− This is called partitioning
• Requirements for a good partition:
− Original functionality of the system is not changed
− Minimize the interconnection between subsystems
− Process should be simple and efficient
Partitioning Issues
Each subsystem can be designed independently
− → speeding up the design process
[Figure: one system of 60 IPs: 2^60 ≈ 32,000 years; six partitions of 10 IPs each: 6 × 2^10 ≈ 5 ms.]
Partitioning Issues
Decomposition scheme has to minimize the interconnections between subsystems.
Decomposition may be carried out hierarchically until each subsystem is of manageable size.
[Figure: the same circuit, with components numbered 1 to 8, split into Block A and Block B in two ways.]
• Cut ca: four external connections
• Cut cb: two external connections
Introduction
Circuit Example
• Start with a circuit of 48 logic gates:
Partition 1 – 4 lines 15 gates
Partition 2 – 4 lines 16 gates
Partition 3 – 17 gates
System Hierarchy
Levels of Partitioning
• System level partitioning: System → PCBs
• Board level partitioning: PCBs → Chips
• Chip level partitioning: Chips → Subcircuits / Blocks
System Level Partitioning
• Objectives:
− Optimize the system performance
− Minimize the no. of boards
• Constraints:
− Board size is fixed
− Terminal count on each board is fixed
Board Level Partitioning
• Objectives:
− Minimize the interchip connections
− Optimize the performance and reliability by minimizing the no. of chips
• Constraints:
− Chip size
− Terminal count on a chip
• Example: multi-FPGA design
Chip Level Partitioning
• Objectives:
− Minimize the no. of nets cut by the partitioning
− Avoid cutting critical nets
• Constraints:
− Terminal counts
Graph Partitioning: Terminology
[Figure: a six-node graph split into two blocks.]
• Block (partition): Graph G1 with nodes 3, 4, 5; Graph G2 with nodes 1, 2, 6
• Cells: the nodes inside a block
• Cut set: the collection of cut edges: (1,3), (2,3), (5,6)
Problem Formulation
• A hypergraph G = (V, E) consisting of nodes V = {v1, …, vn} and hyperedges E = {e1, …, em}, where each $e_i \subseteq V$
• Partition V such that:
− $V_i \cap V_j = \emptyset$ for $i \ne j$
− $\bigcup_i V_i = V$
• Cut: the boundary between the parts
• Cut size: the number of edges that cross the cut
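The cut-size definition above can be sketched in a few lines of Python. This is only an illustration: the hyperedges and the block assignment below are made-up sample data, and `cut_size` is a hypothetical helper name, not something from the slides.

```python
def cut_size(hyperedges, block_of):
    """Count the hyperedges whose nodes fall in more than one block."""
    cut = 0
    for edge in hyperedges:
        blocks_touched = {block_of[v] for v in edge}
        if len(blocks_touched) > 1:
            cut += 1
    return cut

# Hypothetical example: six nodes, four hyperedges, two blocks.
hyperedges = [{"v1", "v2", "v3"}, {"v3", "v4"}, {"v4", "v5", "v6"}, {"v1", "v2"}]
block_of = {"v1": 0, "v2": 0, "v3": 0, "v4": 1, "v5": 1, "v6": 1}

print(cut_size(hyperedges, block_of))  # 1 (only {v3, v4} crosses the cut)
```

A hyperedge is counted once no matter how many blocks it spans, which matches the "number of edges that cross the cut" definition.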
Objective Functions
pi
Constraints
TSV (Through Silicon Vias) in 3D chips
Ababei, Mogal and Kia Bazargan, “Three-dimensional Place and Route for FPGAs,” TCAD 2005.
www.micromagic.com
Constraints: Area
• Bounded-Size Partitioning:
$A_i^{\min} \le \mathrm{Area}(V_i) \le A_i^{\max}$
• Balanced Partitioning:
Scope of the Problem
[©Rutenbar]
Multi-Terminal Nets
• Modeling in terms of two-terminal nets:
1) Complete graph:
− Disadvantage: the large number of edges is unrealistic
− Advantage: symmetry of the net
Multi-Terminal Nets
2) Spanning tree:
− Advantage: in the end, one tree is enough for the net
− Disadvantage: we do not know in advance which tree is best
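The two ways of modeling a multi-terminal net with two-terminal edges can be compared with a small sketch. The function names and the four-pin net are hypothetical, and the chain used here is just one arbitrary spanning tree (the slides note that the best tree is not known in advance).

```python
from itertools import combinations

def clique_model(net):
    """Complete-graph model: one edge for every pair of pins on the net."""
    return list(combinations(sorted(net), 2))

def chain_model(net):
    """One possible spanning tree: a chain through the pins in sorted order."""
    pins = sorted(net)
    return list(zip(pins, pins[1:]))

net = {"a", "b", "c", "d"}  # hypothetical four-pin net
print(len(clique_model(net)))  # 6 edges: k(k-1)/2 for k = 4 pins
print(chain_model(net))        # [('a', 'b'), ('b', 'c'), ('c', 'd')], k-1 edges
```

The clique treats all pins symmetrically but grows quadratically with the pin count, while any spanning tree needs only k − 1 edges at the cost of an arbitrary choice of tree.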
Issues Specific to Each Design Style
• Full-custom:
− Constraint on cut size: $T_i \le \frac{P_i}{d}$, where $P_i$ is the final perimeter of partition i and d is the pitch size
− Can be hierarchical
− Total area = sum of component areas + sum of routing areas + wasted area
  − Component areas: constant
  − Routing areas: if we reduce the cut size, these most likely decrease
  − Wasted area: determined by the quality of the placement
Issues Specific to Each Design Style
• Standard Cell, Gate Array, and FPGA:
− Objective: partition the circuit so that the cut size is minimized
Classification of Algorithms
• Partitioning algorithms:
− Constructive
− Iterative improvement
Classification of Algorithms
• Partitioning algorithms:
− Deterministic
− Stochastic
Classification of Algorithms
• Partitioning algorithms:
− Group migration
− Simulation-based
− Multi-level
− ...
Example
• Given this set of components
• Divide it into 3 parts
− 4 logic cells per segment
− Lots of solutions but only 1 optimal
[Figure: 12 logic cells, A to L, arranged in a 3 × 4 grid and connected by nets numbered 1 to 12.]
Solution
• Minimum number of external connections:
[Figure: the optimal partition into {A, B, C, L}, {D, F, H, I}, and {E, G, J, K}, with the nets redrawn between the three blocks.]
PROBLEM: How to find this solution?
Constructive Partitioning
• Constructive partitioning:
− Uses a set of rules to find a solution
− Most straightforward: seed growth (cluster growth)
• Steps:
1. Start a new partition with a seed cell
2. Consider, one at a time, every cell not yet in a partition
3. Calculate a gain function, g(m), for each such cell m:
− the benefit of adding cell m to the current partition
− (perhaps the number of connections between m and the cells in the current partition)
4. Add the cell with the maximum gain to the current partition
5. If the partition has not reached the size limit, repeat from step 2; otherwise start a new partition from step 1
EXAMPLE
• Given the 12-logic-cell structure (A to L) shown earlier, find a partition into 3 sets with 4 cells per set
• Use the gain function g(m) defined as:
− let P(m) be the number of nets (not connections) from cell m to the current partition
− let N(m) be the number of nets from m to cells not yet in partitions
− then g(m) = P(m) - N(m)
• Select cell C as the seed
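Seed growth with the gain g(m) = P(m) - N(m) might be sketched as follows. The netlist of the 12-cell example is not recoverable from the slides, so the six-cell netlist below is purely hypothetical; the function names, the tie-breaking rule (first cell in list order), and the choice of the next seed (first unassigned cell) are also assumptions made for illustration.

```python
def gain(cell, current, unassigned, nets):
    """g(m) = P(m) - N(m), counting nets rather than individual connections."""
    # P(m): nets joining the cell to the partition being grown
    p = sum(1 for net in nets if cell in net and net & current)
    # N(m): nets joining the cell to other cells that are still unassigned
    n = sum(1 for net in nets if cell in net and (net & unassigned) - {cell})
    return p - n

def seed_growth(cells, nets, size, first_seed):
    """Grow partitions of `size` cells each, one seed at a time."""
    unassigned = list(cells)  # list order gives deterministic tie-breaking
    partitions = []
    seed = first_seed
    while unassigned:
        current = {seed}
        unassigned.remove(seed)
        while len(current) < size and unassigned:
            free = set(unassigned)
            # max() returns the first cell with the highest gain
            best = max(unassigned, key=lambda m: gain(m, current, free, nets))
            current.add(best)
            unassigned.remove(best)
        partitions.append(current)
        if unassigned:
            seed = unassigned[0]  # assumption: next seed is the first free cell
    return partitions

# Hypothetical netlist: two triangles joined by the net {c, d}.
cells = ["a", "b", "c", "d", "e", "f"]
nets = [{"a", "b"}, {"b", "c"}, {"a", "c"}, {"c", "d"},
        {"d", "e"}, {"e", "f"}, {"d", "f"}]

print(seed_growth(cells, nets, size=3, first_seed="a"))
```

On this sample netlist the growth from seed a collects b and then c, leaving d, e, f for the second partition. As the slides point out, the outcome depends on the seed, so a different `first_seed` can give a different (possibly worse) partition.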
Partition Table 1
• Set up a table for bookkeeping to keep track of the gain at each step
• Label the final partitions X, Y, and Z
• Partition X, seed C (marked X). Sample calculations: A: 0-2 = -2, B: 1-2 = -1, D: 0-2 = -2

Pass   A    B    C    D    E    F    G    H    I    J    K    L    Added
1      -2   -1   X    -2   -3   -5   -2   -2   -3   -3   -2   -1   L
Partition Table 2
• Partition X so far: C, L. Each entry is P(m)-N(m) = g(m)

Pass   A         B         D         E         F         G         H         I         J         K         Added
2      0-2=-2    1-2=-1    0-2=-2    2-4=-2    0-5=-5    1-2=-1    0-2=-2    0-3=-3    0-3=-3    0-2=-2    G
3      0-2=-2    1-2=-1    0-2=-2    2-3=-1    0-5=-5              0-2=-2    0-3=-3    1-3=-2    1-2=-1    E
Partition Table 3
• Partition X is complete: C, L, G, E. Partition Y starts from seed F; X marks the cell added most recently

Pass   A        B         D        F    H         I        J         K         Added
1      1-1=0    0-2=-2    1-1=0    X    0-2=-2    2-2=0    0-3=-3    0-2=-2    I
2      1-1=0    0-2=-2    1-1=0         1-1=0     X        1-3=-2    1-2=-1    A
3      X        1-2=-1    1-1=0         1-1=0              1-3=-2    1-2=-1    D
RESULT
• The final partition has 8 external connections (1, 2, 5, 7, 8, 9, 11, 12)
− Worse than the other solution (5 connections)
− Perhaps starting with a different seed cell would improve the result
[Figure: the resulting partitions {A, D, I, F}, {B, H, J, K}, and {C, G, E, L}.]
Iterative Partition
• Constructive partitioning can produce suboptimal results
− An iterative partitioning algorithm is often used to improve the final partition
− E.g., swapping logic cells (or groups of logic cells)
• Group migration methods:
− Mostly based on the Kernighan-Lin algorithm (K-L algorithm)
− Divide a graph into two segments (min-cut problem)
Iterative Improvement
[Flowchart: Problem Instance → Constructive Algorithm → Initial Solution → Iterative Improvement → Termination criterion met? If no, go back to Iterative Improvement; if yes, return the best-seen solution.]
Kernighan-Lin Algorithm
“An Efficient Heuristic Procedure for Partitioning Graphs”, Kernighan and Lin, The Bell System Technical Journal, 49(2):291-307, 1970.
K-L Algorithm
• Start with a network already split into two partitions, A and B, each with m nodes
− Define $c_{ab}$ to be the weight of the connection between nodes a and b
− $c_{ab} = 1$ if there is a connection and 0 if there is not
− The cut cost is W, where:
$$W = \sum_{a \in A,\, b \in B} c_{ab}$$
[Figure: example graph with nodes 1 to 10.]
Definitions
• For any node a in partition A, the external edge cost measures the connections from node a to B:
$$E_a = \sum_{y \in B} c_{ay}$$
− In the example, $E_1 = 1$ and $E_3 = 0$
• The internal edge cost measures the internal connections to a:
$$I_a = \sum_{z \in A} c_{az}$$
− In the example, $I_1 = 0$ and $I_3 = 2$
[Figure: example graph with nodes 1 to 10.]
Definitions
• The cost difference is the difference between external and internal edge costs:
$$D_a = E_a - I_a$$
[Figure: example graph with nodes 1 to 10.]
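The cut cost W and the per-node costs E, I, and D can be checked numerically with a short sketch. The graph figure is not available, so the edge list below is reconstructed from the E/I/D table and gain calculations in the worked example later in the deck; the three edges among nodes 7 to 10 are an assumption (one choice consistent with their internal costs), and the function names are hypothetical.

```python
# Reconstructed edge list (assumption: edges among nodes 7-10 are one
# choice that matches the internal costs in the worked example's table).
EDGES = [(1, 7), (2, 6), (2, 8), (4, 8), (3, 4), (3, 5), (7, 9), (9, 10), (8, 10)]
EDGE_SET = {frozenset(e) for e in EDGES}

A = {1, 2, 3, 4, 5}
B = {6, 7, 8, 9, 10}

def c(u, v):
    """Connection weight c_uv: 1 if u and v are connected, else 0."""
    return 1 if frozenset((u, v)) in EDGE_SET else 0

def cut_cost(A, B):
    """W = sum of c_ab over all a in A and b in B."""
    return sum(c(a, b) for a in A for b in B)

def edge_costs(node, own, other):
    """Return (E, I, D) for a node, given its own block and the other block."""
    external = sum(c(node, y) for y in other)
    internal = sum(c(node, z) for z in own if z != node)
    return external, internal, external - internal

print(cut_cost(A, B))       # 4
print(edge_costs(1, A, B))  # (1, 0, 1)
print(edge_costs(3, A, B))  # (0, 2, -2)
```

The printed values agree with the slides: a cut of 4 external edges, E_1 = 1, I_1 = 0, E_3 = 0, and I_3 = 2.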
Gain Function
• Select any node a in A and any node b in B
− If we swap these nodes, we need to measure the reduction in the cut weight, which is the gain g:
$$g = D_a + D_b - 2c_{ab}$$
− The term $2c_{ab}$ is subtracted because a and b may be connected
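As a rough check of the gain formula, the sketch below evaluates g for all 25 pairs of the worked example. The edge list is reconstructed from the tables later in the deck (the edges among nodes 7 to 10 are an assumption), and `d_value` and `swap_gain` are hypothetical helper names.

```python
# Reconstructed edge list for the 10-node example (edges among 7-10 assumed).
EDGES = [(1, 7), (2, 6), (2, 8), (4, 8), (3, 4), (3, 5), (7, 9), (9, 10), (8, 10)]
ADJ = {}
for u, v in EDGES:
    ADJ.setdefault(u, set()).add(v)
    ADJ.setdefault(v, set()).add(u)

A = {1, 2, 3, 4, 5}
B = {6, 7, 8, 9, 10}

def d_value(node, own):
    """D = E - I for a node whose own block is `own`."""
    internal = len(ADJ[node] & own)
    external = len(ADJ[node]) - internal
    return external - internal

def swap_gain(a, b, A, B):
    """g = D_a + D_b - 2 c_ab for swapping a (in A) with b (in B)."""
    c_ab = 1 if b in ADJ[a] else 0
    return d_value(a, A) + d_value(b, B) - 2 * c_ab

gains = {(a, b): swap_gain(a, b, A, B) for a in sorted(A) for b in sorted(B)}
best_pair = max(gains, key=gains.get)  # first pair with the highest gain

print(gains[(1, 6)], gains[(1, 7)], gains[(2, 6)])  # 2 -1 1
print(best_pair)                                    # (1, 6)
```

Several pairs tie at the maximum gain of 2 on this graph; `max` simply returns the first one in scan order, which happens to be the pair (1, 6) chosen in the slides.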
K-L Algorithm
• The K-L algorithm finds a group of node pairs to swap that increases the gain
• STEPS
1) Find two nodes, $a_i$ from A and $b_i$ from B, so that the gain from swapping them is maximum
2) Pretend to swap $a_i$ and $b_i$ (even if the gain is 0 or negative)
3) Lock them
4) Repeat steps 1 to 3 until all the nodes of A and B have been tentatively swapped
5) Now select which nodes to actually swap. Suppose we only swap the first n pairs of nodes found in the above steps. The total gain would be $G_n$, so select the n that maximizes $G_n$:
$$G_n = \sum_{i=1}^{n} g_i$$
Iterative Process
• If the maximum value of Gn > 0, then the n node pairs are swapped
− K-L is then applied to this new partition
• If the maximum value of Gn ≤ 0 then we stop
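Putting the steps and the stopping rule together, a minimal K-L sketch for unit edge weights might look like this. It is an illustration under stated assumptions rather than the reference implementation: the function names are hypothetical, ties are broken by scan order, and the example graph is the reconstruction used for the worked example (the edges among nodes 7 to 10 are assumed).

```python
def kl_pass(adj, A, B):
    """One pass: tentatively swap every pair; return the pairs and their gains."""
    side = {v: 0 for v in A}
    side.update({v: 1 for v in B})

    def d(v):  # D = E - I under the current partition
        external = sum(1 for u in adj[v] if side[u] != side[v])
        return external - (len(adj[v]) - external)

    D = {v: d(v) for v in side}
    free_a, free_b = sorted(A), sorted(B)
    pairs, gains = [], []
    while free_a and free_b:
        best = None
        for a in free_a:
            for b in free_b:
                g = D[a] + D[b] - 2 * (b in adj[a])
                if best is None or g > best[0]:
                    best = (g, a, b)
        g, a, b = best
        free_a.remove(a)  # lock the pair
        free_b.remove(b)
        # update D of the free nodes as if a and b had been swapped
        for x in free_a:
            D[x] += 2 * (a in adj[x]) - 2 * (b in adj[x])
        for y in free_b:
            D[y] += 2 * (b in adj[y]) - 2 * (a in adj[y])
        pairs.append((a, b))
        gains.append(g)
    return pairs, gains

def kernighan_lin(adj, A, B):
    """Repeat passes until the best prefix gain G_n is no longer positive."""
    A, B = set(A), set(B)
    while True:
        pairs, gains = kl_pass(adj, A, B)
        best_n, best_G, running = 0, 0, 0
        for n, g in enumerate(gains, start=1):
            running += g
            if running > best_G:
                best_n, best_G = n, running
        if best_G <= 0:
            return A, B
        for a, b in pairs[:best_n]:  # actually swap the first best_n pairs
            A.remove(a); B.add(a)
            B.remove(b); A.add(b)

# Reconstructed example graph (edges among nodes 7-10 are an assumption).
EDGES = [(1, 7), (2, 6), (2, 8), (4, 8), (3, 4), (3, 5), (7, 9), (9, 10), (8, 10)]
adj = {v: set() for v in range(1, 11)}
for u, v in EDGES:
    adj[u].add(v)
    adj[v].add(u)

print(kl_pass(adj, {1, 2, 3, 4, 5}, {6, 7, 8, 9, 10})[1])  # [2, 0, 0, 0, -2]
print(kernighan_lin(adj, {1, 2, 3, 4, 5}, {6, 7, 8, 9, 10}))
```

On this graph the first pass yields gains 2, 0, 0, 0, -2, so only the swap of 1 and 6 is kept; the second pass finds no positive G_n and the algorithm stops with a cut of 2.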
Example
• Given the graph below, its initial partition has 4 external edges
− Calculate the gain from swapping all pairs of nodes (there are 25 pairs)
[Figure: example graph with nodes 1 to 5 in A and nodes 6 to 10 in B.]

Node   E   I   D
1      1   0   1
2      2   0   2
3      0   2   -2
4      1   1   0
5      0   1   -1
6      1   0   1
7      1   1   0
8      2   1   1
9      0   2   -2
10     0   2   -2

Pair      g
(1, 6)    1 + 1 - 0 = 2    (this is the best)
(1, 7)    1 + 0 - 2 = -1
(1, 8)    1 + 1 - 0 = 2
(1, 9)    1 - 2 - 0 = -1
(1, 10)   1 - 2 - 0 = -1
(2, 6)    2 + 1 - 2 = 1
. . .
First Swap
• Swap 1 and 6 (tentatively) for a gain of 2
− Redo the process with the new graph
[Figure: the graph after tentatively swapping 1 and 6.]

Node   E   I   D
2      1   1   0
3      0   2   -2
4      1   1   0
5      0   1   -1
7      0   2   -2
8      2   1   1
9      0   2   -2
10     0   2   -2

Pair      g
(2, 7)    0 - 2 - 0 = -2
(2, 8)    0 + 1 - 2 = -1
(2, 9)    0 - 2 - 0 = -2
(2, 10)   0 - 2 - 0 = -2
(3, 7)    -2 - 2 - 0 = -4
. . .
(5, 8)    -1 + 1 - 0 = 0    (this is the best)
. . .
Second Swap
• Swap 5 and 8 for a gain of 0
• Continue this process: the third and fourth swaps also have a gain of 0, and the final swap has a gain of -2
• The summed gains reach their maximum at the first swap, so keep only the swap of 1 and 6 and repeat the process on the new partition
[Figure: the graph after tentatively swapping 5 and 8.]
End of First Iteration
i    Node pair    g_i    Σ g_i    Cut size
0    -            -      -        4
1    (1, 6)       2      2        2
2    (5, 8)       0      2        2
3                 0      2        2
4                 -2     0        4

• Find the maximum of Σ g_i and actually exchange the pairs up to that point.
Another Case
i    Node pair    g_i    Σ g_i    Cut size
0    -            -      -        4
1    (1, 6)       2      2        2
2    (5, 8)       -1     1        3
3                 3      4        0
4                 -2     2        2
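Choosing how many tentative swaps to keep amounts to finding the largest running sum of the gains. The sketch below applies that to the g_i columns of the two tables ("End of First Iteration" and "Another Case"); `best_prefix` is a hypothetical helper name, and taking the earliest maximum on ties is an assumption.

```python
from itertools import accumulate

def best_prefix(gains):
    """Return (n, G_n) for the prefix of swaps with the largest total gain."""
    sums = list(accumulate(gains))
    best_gain = max(sums)
    n = sums.index(best_gain) + 1  # earliest prefix reaching the maximum
    return n, best_gain

print(best_prefix([2, 0, 0, -2]))   # (1, 2): keep only the first swap
print(best_prefix([2, -1, 3, -2]))  # (3, 4): accept a loss at i = 2 to reach i = 3
```

The second case shows why K-L keeps going after a negative gain: the swap at i = 2 makes the cut worse, yet it enables the swap at i = 3 that brings the total gain to 4.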
Kernighan & Lin Partitioning
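• If $\sum_{i=1}^{m} g_i > 0$, there is some gain, so repeat until $\sum_{i=1}^{m} g_i = 0$
[Figure: plot of the running sum $\sum_{i=1}^{k} g_i$ against k = 1, 2, 3, ..., m, ..., n, with its maximum at k = m.]
[©Newton]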
KL Algorithm
[Figure: K-L algorithm listing; the only legible fragments are two $-2c_{ab}$ terms.]
Computational Complexity
For 2n nodes:
• Initial computation of the D's takes $O(n^2)$
• For each pass, for i = 1 to n:
− Computing the gain for all free pairs takes $O(n^2)$
− After swapping $a_i$ and $b_i$ in move i, at most (2n - 2i) gains of free nodes are updated
− For n moves: $\sum_{i=1}^{n} (2n - 2i) = O(n^2)$
− Pair comparison in move i: $(n - i + 1)^2 = O(n^2)$ pairs to choose from
− For n pair comparisons: $\sum_{i=1}^{n} (n - i + 1)^2 = O(n^3)$
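The two sums above have simple closed forms, which makes the $O(n^2)$ and $O(n^3)$ bounds explicit. The derivation below uses only standard summation identities; the substitution $k = n - i + 1$ is introduced here for convenience.

```latex
\begin{align*}
\sum_{i=1}^{n} (2n - 2i)
  &= 2n^2 - 2\cdot\frac{n(n+1)}{2}
   = n(n-1) = O(n^2), \\
\sum_{i=1}^{n} (n - i + 1)^2
  &= \sum_{k=1}^{n} k^2
   = \frac{n(n+1)(2n+1)}{6} = O(n^3).
\end{align*}
```

The pair comparison therefore dominates a pass, which is where the overall $O(n^3)$ per-pass cost comes from.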
Computational Complexity
• Tentative exchange, lock, and log each take O(1)
• Total time = (# of passes) × $O(n^3)$
− # of passes: usually constant (independent of n)
− In practice: almost no improvement after 4 passes
• To speed up pair comparison:
− Node pairs can be sorted ahead of time
− $O(n^2 \log n)$
KL Algorithm
• Drawbacks:
− Unable to handle hyperedges
− Partition sizes must be given in advance (but it can handle unequal sizes)
− Unable to handle vertex weights
  − Can assign unit area = gcd(cell_areas), so each cell becomes several unit-area nodes
  − The nodes of one cell are tied together with edge weights = infinity
− Time complexity is high: $O(n^3)$
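The unit-area workaround for vertex weights can be sketched as follows. The cell areas are made-up sample data, `split_into_unit_cells` is a hypothetical helper, and chaining the pieces of one cell (rather than fully connecting them) is an assumption; the slides only say that the edge weights are set to infinity.

```python
from functools import reduce
from math import gcd

def split_into_unit_cells(cell_areas, tie_weight=float("inf")):
    """Replace each weighted cell by area/unit unit-area nodes tied together."""
    unit = reduce(gcd, cell_areas.values())
    nodes, edges = [], []
    for cell, area in cell_areas.items():
        pieces = [f"{cell}_{k}" for k in range(area // unit)]
        nodes.extend(pieces)
        # chain the pieces with very heavy edges so they are never separated
        edges.extend((u, v, tie_weight) for u, v in zip(pieces, pieces[1:]))
    return unit, nodes, edges

# Hypothetical cell areas (arbitrary integer units).
unit, nodes, edges = split_into_unit_cells({"x": 4, "y": 6, "z": 2})
print(unit)        # 2
print(nodes)       # ['x_0', 'x_1', 'y_0', 'y_1', 'y_2', 'z_0']
print(len(edges))  # 3
```

The node count grows with the total area divided by the gcd, so this trick makes the already cubic running time worse when the cell areas share only a small common divisor.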
KL Algorithm
• Advantages:
− Simple
− Performs well for up to ~10^2 nodes
− Can handle constraints such as keeping some blocks together (e.g., the components of an adder)
KL Software Package
• A software package is available from the class site. It will allow you to:
− enter a connection matrix
− set up an initial partition
− run KL
Main Screen
• Make or load an adjacency matrix
• Then select GIP
• Then run
Enter a Matrix
• Enter the size and then the connection matrix