Lecture 6: Mapping to Embedded Memory and PLAs September 27, 2004
ECE 697F
Reconfigurable Computing
Lecture 6
Mapping to Embedded Memory and PLAs
Outline
° Overview
• Targeted to existing hybrid PLA/LUT FPGAs
• Area- and timing-constrained mapping
° HybridMap
• Graph-based approach
• Post-map product term estimation
° Results
• Comparisons with existing approaches
• Mapping to APEX20KE
° Acknowledgement: Srini Krishnamoorthy
Hybrid FPGAs with LUTs and PLAs
° LUTs
• Dense logic, small number of inputs
• Special circuitry for arithmetic logic
° PLAs
• For wide-fanin, low-logic-density structures
• Control logic (e.g., finite state machines)
° Device containing both resources
• Move maximum logic to PLAs
• Implement remaining logic in LUTs
[Figure labels: Inputs, Reset, Output]
Hybrid FPGAs
° Resources
• LUT blocks and Pterm macrocells
° Objective
• Minimize 4-LUT area subject to design performance constraints
• Potential use: smaller devices
° Key factor
• Efficient partitioning of design components
Hybrid FPGA (Similar to APEX20KE MegaLAB)
[Figure: global interconnect, two groups of LUT-based logic elements (LE 1 to LE 8 each), a PLA, and local interconnects]
HybridMap Flow
[Figure: flowchart. Boxes: Design Entry; SIS Technology-Independent Optimization; Node Partitioning; LUT Identification; Subgraph Generation (with Pterm Estimation and Area Estimation); Subgraph Merging (with Pterm Estimation and Area Estimation); PLA Subgraphs; PLA Mapping; LUT Partition; LUT Mapping; Mapped Design; Place and Route (Vendor-Specific CAD)]
LUT Identification
[Figure: example network with inputs I1 to I7, outputs O1 to O3, and a labeled LUT cluster]
Design initially reduced to 2-input gates
Topological traversal starting from primary inputs
Input count to any LUT cluster is at most K
A node always belongs to the cluster that minimizes delay
LUT clusters are preliminary; they are used to estimate LUT count
Similar to DAG-Map (Lawler's algorithm)
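The traversal described on this slide can be sketched in a few lines of Python. This is a simplified greedy stand-in, not DAG-Map or Lawler's labeling algorithm: it visits gates in topological order and grows each cluster toward the primary inputs while the cluster still has at most K inputs, absorbing the deepest boundary gate first. The function names, the dictionary-based netlist, and the tie-breaking rule are all assumptions made for illustration, and the sketch assumes every gate has at most K fanins (true for 2-input gates with K >= 2).

```python
def greedy_clusters(fanins, order, k):
    """Assign each gate a preliminary cluster with at most k inputs.

    fanins: dict mapping node -> list of driver nodes (primary inputs -> []).
    order:  all nodes in topological order, primary inputs first.
    Returns (label, cut): label[n] is the estimated LUT depth of n and
    cut[n] is the set of nodes feeding the cluster rooted at n.
    """
    label, cut = {}, {}
    for n in order:
        if not fanins[n]:                      # primary input
            label[n], cut[n] = 0, frozenset()
            continue
        boundary = set(fanins[n])
        while True:
            # Candidate gates on the boundary, deepest first (ties by name).
            gates = sorted((g for g in boundary if fanins[g]),
                           key=lambda g: (-label[g], str(g)))
            for g in gates:
                trial = (boundary - {g}) | set(fanins[g])
                if len(trial) <= k:            # still fits one K-input LUT
                    boundary = trial
                    break
            else:
                break                          # no gate can be absorbed
        cut[n] = frozenset(boundary)
        label[n] = 1 + max(label[b] for b in boundary)
    return label, cut


def count_luts(cut, fanins, outputs):
    """Estimate LUT count: cluster roots reachable from the outputs."""
    needed, stack = set(), list(outputs)
    while stack:
        n = stack.pop()
        if n in needed or not fanins[n]:
            continue
        needed.add(n)
        stack.extend(cut[n])
    return len(needed)


# Hypothetical 2-input-gate network: g3 = f(g1(i1, i2), g2(i3, i4)).
EXAMPLE = {"i1": [], "i2": [], "i3": [], "i4": [],
           "g1": ["i1", "i2"], "g2": ["i3", "i4"], "g3": ["g1", "g2"]}
ORDER = ["i1", "i2", "i3", "i4", "g1", "g2", "g3"]
```

With K = 4 the whole example fits in one cluster; with K = 3 only one of g1 or g2 can be absorbed into the cluster of g3, so the estimate rises to two LUTs and two levels.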
Subgraph Generation
Maximum Fanout Free Cone (MFFC)
Maximum Fanout Free Subgraph (MFFS)
- All edges except output edge(s) stay within the cone
[Figure: an MFFC with single output O and an MFFS with outputs O1 and O2]
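The fanout-free condition can be made concrete with a short sketch. The code below computes the cone for a single root by working backward and admitting a driver only when every edge leaving it stays inside the cone. The netlist encoding (a dictionary from node to its drivers), the function name, and the choice to leave primary inputs out of the cone are assumptions for illustration.

```python
from collections import deque


def mffc(fanins, root):
    """Return the gates in the maximum fanout-free cone rooted at `root`.

    fanins maps each node to the list of nodes that drive it;
    primary inputs map to an empty list and are not added to the cone.
    """
    # Build the reverse (fanout) relation once.
    fanouts = {n: set() for n in fanins}
    for node, drivers in fanins.items():
        for d in drivers:
            fanouts[d].add(node)

    cone = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for d in fanins[node]:
            # A gate joins the cone only if all of its fanout edges
            # end at nodes that are already in the cone.
            if d not in cone and fanins[d] and fanouts[d] <= cone:
                cone.add(d)
                queue.append(d)
    return cone


# Hypothetical network: gate b also drives d, so b is outside the cone of o.
MFFC_EXAMPLE = {
    "i1": [], "i2": [], "i3": [],
    "a": ["i1", "i2"],
    "b": ["i2", "i3"],
    "c": ["a", "b"],
    "d": ["b"],
    "o": ["c"],
}
```

In the example, `mffc(MFFC_EXAMPLE, "o")` contains o, c, and a. Gate b is excluded because one of its fanout edges leaves the cone toward d.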
Subgraph Generation
Root-set determination (forward BFS)
Condition: nodes at each level <= Opla
Extract a multi-input, multi-output subgraph
• Root-set determination: identify subgraph outputs (forward traversal)
• Subgraph identification: identify subgraph inputs (backward traversal)
[Figure: forward search from node V]
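One possible reading of the root-set step is sketched below: starting from a seed node, the search advances one fanout level at a time and stops as soon as the next level would contain more nodes than the PLA has outputs. The last level that satisfies the condition is returned as the root set. The function name, the dictionary encoding of fanouts, and the stopping rule are assumptions; the slide does not spell out these details.

```python
def root_set(fanouts, seed, o_pla):
    """Forward BFS from `seed`; return the deepest level with <= o_pla nodes.

    fanouts maps a node to the list of nodes it drives; nodes that drive
    nothing may be omitted from the dictionary.
    """
    level = [seed]
    while True:
        nxt, seen = [], set()
        for n in level:
            for s in fanouts.get(n, []):
                if s not in seen:              # keep each node once per level
                    seen.add(s)
                    nxt.append(s)
        # Stop when the frontier is exhausted or would exceed the PLA outputs.
        if not nxt or len(nxt) > o_pla:
            return level
        level = nxt


# Hypothetical fanout structure: v -> {a, b}, a -> {c, d}, b -> {d, e}.
FANOUTS = {"v": ["a", "b"], "a": ["c", "d"], "b": ["d", "e"]}
```

With two PLA outputs the search stops at the level containing a and b, because the following level has three nodes.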
Subgraph Identification
Subgraph determination (backward BFS)
Condition: inputs <= Ipla; backward breadth-first search until the input constraint is met
[Figure: backward search from node V, with inputs I0 to I3 and outputs O1 to O5]
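A minimal sketch of the backward step follows, under the assumption that the search grows the subgraph one fanin level at a time and keeps the deepest version whose external input count fits the PLA. The names `subgraph_inputs` and `backward_subgraph` and the netlist encoding are hypothetical.

```python
def subgraph_inputs(fanins, nodes):
    """Signals that enter the subgraph from outside it."""
    return {d for n in nodes for d in fanins[n] if d not in nodes}


def backward_subgraph(fanins, roots, i_pla):
    """Backward BFS from the root set.

    Returns the deepest node set whose input count is <= i_pla,
    or None if no level satisfies the constraint.
    fanins maps node -> list of drivers; primary inputs map to [].
    """
    nodes = set(roots)
    frontier = set(roots)
    best = None
    while True:
        if len(subgraph_inputs(fanins, nodes)) <= i_pla:
            best = set(nodes)
        # Next level: gates (not primary inputs) driving the current frontier.
        frontier = {d for n in frontier for d in fanins[n]
                    if d not in nodes and fanins[d]}
        if not frontier:
            return best
        nodes |= frontier


# Hypothetical network with two roots sharing gate b.
BACK_EXAMPLE = {
    "i1": [], "i2": [], "i3": [], "i4": [],
    "a": ["i1", "i2"], "b": ["i2", "i3"], "c": ["i3", "i4"],
    "o1": ["a", "b"], "o2": ["b", "c"],
}
```

With `i_pla=3` only the two root gates fit (their inputs are a, b, and c). With `i_pla=4` the whole network fits, since it depends on four primary inputs.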
Reconvergent Paths
Hill-climbing approach
• Start the basic subgraph search algorithm from V
• If nodes encountered > Opla: neglect the PLA output constraint and continue the search
• When nodes encountered < Opla: revert to the basic search algorithm
[Figure: search from node V with inputs I1 to I4 and outputs O1, O2; regions labeled basic search, hill climbing, basic search]
Subgraph Pruning
° Hill-climbing subgraph may violate PLA I/O constraints
• Need to prune
° Pruning steps
• Collapse subgraph to two-level form
• Remove outputs requiring least inputs
- Outputs requiring K inputs (single LUT removal)
- Minimal multi-LUT removal
[Figure: PLA]
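The pruning rule can be illustrated with a small sketch. It assumes the subgraph has already been collapsed to two-level form, so each output is described only by the set of inputs it depends on, and it removes the output with the smallest input set until the PLA input and output limits are met. The function name, the data layout, and the tie-breaking by output name are assumptions; removed outputs are presumed to go back to the LUT partition.

```python
def prune_outputs(support, i_pla, o_pla):
    """Drop outputs with the fewest inputs until the PLA limits are met.

    support maps each output name to the set of inputs it depends on
    (its support after collapsing to two-level form).
    Returns (kept, removed): the surviving outputs and the removal order.
    """
    kept = dict(support)
    removed = []

    def total_inputs():
        return set().union(*kept.values())

    while kept and (len(total_inputs()) > i_pla or len(kept) > o_pla):
        # Least-input output first; ties broken by name for determinism.
        victim = min(kept, key=lambda o: (len(kept[o]), o))
        removed.append(victim)
        del kept[victim]
    return kept, removed


# Hypothetical collapsed subgraph: five distinct inputs, three outputs.
SUPPORT = {
    "o1": {"a", "e"},
    "o2": {"a", "b", "c", "d"},
    "o3": {"b", "c", "d"},
}
```

With a 4-input, 2-output PLA, removing o1 (two inputs, so it would fit a single 4-LUT) brings the subgraph down to four inputs and two outputs.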
Subgraph Combining
• Smaller subgraphs can be packed onto a single PLA
• Combine based on cost function
- LUT savings due to merging
- Input sharing
- Pterm count
• Choose 2 subgraphs for merging based on combined feasibility in terms of PLA inputs, outputs and Pterms
- Invoke Pterm estimator
[Figure: subgraphs SG 1 and SG 2]
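The pairwise selection above can be sketched as a feasibility check followed by a score. The default limits of 32 inputs, 16 outputs, and 32 Pterms come from the resource description in the results slide; everything else is an assumption for illustration. In particular, summing the two Pterm counts is only a conservative stand-in for the Pterm estimator, and the score (LUT savings plus number of shared inputs) is a hypothetical cost function, not the one used by HybridMap.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Subgraph:
    name: str
    inputs: frozenset
    outputs: frozenset
    pterms: int        # estimated product terms
    lut_savings: int   # LUTs freed if this subgraph moves to a PLA


def feasible(a, b, i_pla=32, o_pla=16, p_pla=32):
    """Check whether two subgraphs fit together on one PLA."""
    return (len(a.inputs | b.inputs) <= i_pla
            and len(a.outputs | b.outputs) <= o_pla
            and a.pterms + b.pterms <= p_pla)   # stand-in for the estimator


def merge_score(a, b):
    """Hypothetical cost function: reward LUT savings and shared inputs."""
    return a.lut_savings + b.lut_savings + len(a.inputs & b.inputs)


def best_pair(subgraphs, **limits):
    """Return (score, name_a, name_b) for the best feasible pair, or None."""
    candidates = [(merge_score(a, b), a.name, b.name)
                  for a, b in combinations(subgraphs, 2)
                  if feasible(a, b, **limits)]
    return max(candidates, default=None)


# Hypothetical subgraphs, checked against a deliberately small PLA.
SG = [
    Subgraph("sg1", frozenset({"x1", "x2", "x3"}), frozenset({"y1"}), 4, 3),
    Subgraph("sg2", frozenset({"x2", "x3", "x4"}), frozenset({"y2"}), 5, 2),
    Subgraph("sg3", frozenset({"x5", "x6", "x7", "x8"}), frozenset({"y3"}), 6, 4),
]
```

For a PLA limited to 4 inputs, 2 outputs, and 10 Pterms, only sg1 and sg2 can share a PLA, because they have two inputs in common; any pair involving sg3 needs seven inputs.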
Results: Effect of Initial Graph Node Size
Resources: 4-LUTs; 32-input, 16-output, 32-Pterm macrocells
• 2-input case: identify PLA partitions, then map the rest to LUTs using FlowMap
• 4-input case: map initially to LUTs, then extract PLA partitions
13 MCNC benchmarks (8939 LUTs initially)
Total LUT savings (R = number of PLAs)
2-input nodes: R=10: 2357 | R=5: 1367 | R=5: 364
4-input nodes: R=10: 2358 | R=5: 1255 | R=1: 316
Mapping to APEX20KE-1
Altera software: Quartus v2000.02
Timing-constrained case: 8% LUT reduction
Unconstrained case: 14% LUT reduction
Summary
• Hybrid FPGAs: A challenging problem
• Subgraph-based approach to technology mapping
• Pterm-based macrocells, Pterm estimation
• Unconstrained and timing-constrained mapping
- Delay and area estimation are an important part of the mapping process
• Results
- Pterm estimation improves LUT coverage by about 12%
- APEX20KE devices, unconstrained: 14% 4-LUT savings
- APEX20KE devices, timing-constrained: 8% 4-LUT savings
Notes for Transcription Assignment
° Focus on one main point
° Write an outline
° Check English and spelling
Transcription
• Introduction
• Lecture summary
• Summary of papers
• Contrast of papers
• Conclusion
• References
Overview
• Paper should be focused on a primary goal
- May not be obvious until after reading
- What will the reader take away from this work?
- Is there a common theme?
• Introduction
- Should summarize the entire paper.
- Each paragraph presents a main idea
- Ideas should be detailed but sufficiently high-level.
- Results paragraph should present a result to support the theme.
Tackling the Body
• Lecture summary
- Brief review of the lecture
- Focus on main point
- Should flow seamlessly into/with papers
- I can make the PowerPoint slides available
• Summary of papers
- Discussion of main approaches
- Important new techniques
- Main point made by author
- Main result/results
Contrasting the Papers
• Pick categories for comparison
- Should be three to four
• Each section should contrast a different category.
- Try to focus on specific details of each paper
• Consider using a table to supplement text.
- Helps clarify detailed comparison
• Focus on comparison rather than restating the details
Other Tips
• Have a friend review your work.
• Try to keep sentences focused
- Subject – verb – description
• Avoid using first person.
• Write a detailed outline before you write any text.
• Give yourself plenty of time
• You should have at least four or five references.