32
A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK TREE SYNTHESIS IN VERY LARGE SCALE INTEGRATION DESIGN CHEW EIK WEE UNIVERSITI TEKNOLOGI MALAYSIA

A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

A COMPUTER-AIDED DESIGN SOFTWARE MODULE

FOR CLOCK TREE SYNTHESIS

IN VERY LARGE SCALE INTEGRATION DESIGN

CHEW EIK WEE

UNIVERSITI TEKNOLOGI MALAYSIA

Page 2: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

A COMPUTER-AIDED DESIGN SOFTWARE MODULE

FOR CLOCK TREE SYNTHESIS

IN VERY LARGE SCALE INTEGRATION DESIGN

CHEW EIK WEE

A thesis submitted in fulfilment of the

requirements for the award of the degree of

Master of Engineering (Electrical)

Faculty of Electrical Engineering

Universiti Teknologi Malaysia

FEBRUARY 2008

Page 3: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

iii

Specially dedicated to

my beloved family

Page 4: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

iv

ACKNOWLEDGEMENTS

First and foremost, I would like to express my deepest gratitude and

appreciation to my supervisor, Prof. Dr. Mohamed Khalil Mohd Hani, for guiding

me during my master’s study and helping me in discovering the wonderful field in

the computer-aided design of electronic systems. This study would not be possible

without his invaluable suggestions and support. Besides the technical knowledge, he

is also my teacher of life.

Special thanks to Encik Nasir Shaikh Husin for his mentoring, support,

sharing his knowledge and consistent help to my work over the last two years. I also

deeply thank to my friends and colleagues, Yuan Wen, Heng Sun, Illiasaak,

Norashikin, Arif, and En Zulkifli Yusof for their best friendship, help, and support.

They made my life in the UTM colorful and enjoyable. Many thanks also to the

technicians in VLSI-ECAD Lab, En. Zulkffli bin Che Embong and En.

Khomarudden bin Mohd Khair Juhari, in creating a conducive learning and research

environment in the lab.

I would like to thank my parents and my family members for their love

during my years in graduate school. Their care always provides the warmest support

in my life and work, wherever I am.

Most importantly, I would like to special thank to my friend, Bee Leng, who

has provided her meaningful and precious suggestions and consistent support during

the last few years, to the completion of this research and thesis.

Page 5: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

v

ABSTRACT

With the evolution of Very Large Scale Integration (VLSI) fabrication

technology, circuit size has grown and line width has decreased. In effect, the

transistor transit time and the time to drive signal lines across chips have also

decreased. Thus, interconnections have become the dominating factor in

determining circuit performance and reliability in the design of a VLSI circuit.

Clock distribution network, which is one of the biggest and most important nets in

any synchronous VLSI chip, is sensitive to these variations. The increased line

resistance is one of the primary reasons for the increasing significance of clock

distribution networks on synchronous performance. Furthermore, failing to control

the clock skew can also severely limit the maximum performance of the entire

system and create catastrophic race conditions in which an incorrect data signal may

be latched by register. This thesis proposes a Computer Aided-Design (CAD)

software module for useful-skew tree synthesis in deep-sub-micron VLSI design.

Building the proposed CAD software module involves the implementations of the

abstract topology generation algorithm, skew constraints scheduling algorithm and

clock tree construction algorithm. Due to the lack of availability of circuit data and

necessary software tools, we introduced a different test methodology to test the

reliability of the proposed CAD software module. Two different test data have been

used to verify the functionality of the CAD software module. Tests on the random

input data show that our CAD software module successfully synthesizes clock trees

that satisfy the entire clock skew constraints, and at the same time, achieve a shorter

wire-length. Tests on benchmark circuit’s data show that our CAD software module

successfully synthesizes clock trees that not only satisfy the skew constraint value,

but also further reduced the computed applicable clock period of the circuit.

Page 6: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

vi

ABSTRAK

Evolusi perkembangan teknologi fabrikasi “Very Large Scale Integraion”

(VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir mengecil.

Keadaan ini menyebabkan masa melintasi transistor telah berkurangan dan seterusnya

masa memandu isyarat melintasi cip turut berkurangan. Oleh itu, penyambung-

penyambung dalaman cip menjadi faktor utama menentukan prestasi dan

kebolehpercayaan rekabentuk litar VLSI sekarang. Rangkaian pengagihan isyarat jam,

merupakan salah satu rangkaian terluas dan terpenting bagi cip segerak VLSI, amatlah

sensitif kepada perubahan perkembangan ini. Penambahan rintangan akibat daripada

pengurangan saiz lebar pengalir merupakan salah satu sebab utama penambahan

kepentingan oleh rangkaian isyarat jam dalam menentukan prestasi kesegerakan litar.

Tambahan pula, kegagalan mengawal pencongan jam menghadkan lagi maksima

prestasi keseluruhan sistem dan boleh mewujudkan keadaan pelarian data, iaitu data-

data yang salah akan diselak oleh pendaftar. Tesis ini mencadangkan satu modul

perisian Rekabentuk Bantuan Komputer (CAD) yang dapat mensintesiskan “useful-skew

tree” untuk rekabentuk VLSI sub-mikro-dalam. Pembangunan modul perisisian yang

dicadangan tersebut melibatkan implementasi algoritma penjanaan topologi abstrak,

penjadualan pencongan jam dan pembinaan pepohon jam. Disebabkan kekurangan data-

data litar dan perisian-perisian yang penting, kami memperkenalkan metodologi ujian

yang berbeza untuk menguji kebolehpercayaan modul perisian CAD yang dicadangkan.

Terdapat dua data ujian yang berbeza telah digunakan untuk mengenalpasti kewibawaan

modul perisisan CAD tersebut. Keputusan ujian pada data-data rawak masukan

menunjukkan pepohon jam yang disintesiskan oleh modul perisisan CAD dapat

memuaskan semua kekangan pencongan jam, dan pada masa yang sama dapat mencapai

panjangan wayar yang lebih pendek. Keputusan ujian pada data-data litar tanda aras

menunjukkan modul perisian CAD kami berjaya mensintesiskan pepohon jam yang

bukan sahaja dapat memuaskan semua kekangan pencongan jam, bahkan juga dapat

mengurangkan tempoh jam litar.

Page 7: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

vii

TABLE OF CONTENTS

CHAPTER TITLE PAGE

DECLARATION ii

DEDICATION iii

ACKNOWLEDGEMENTS iv

ABSTRACT v

ABSTRAK vi

TABLE OF CONTENTS vii

LIST OF TABLES xi

LIST OF FIGURES xii

LIST OF SYMBOLS xvi

LIST OF APPENDICES xvii

1 INTRODUCTION 1

1.1 Background

1.2 Problem Statement

1.3 Objective

1.4 Scope of Work

1.5 Project Approach and Tools

1.6 Research Contribution

1.7 Thesis Organization

1

4

5

5

6

8

8

2 LITERATURE REVIEW, THEORY AND CONCEPTS 10

2.1 Previous Related Work

2.2.1 Clock Scheduling

2.2.2 Clock Tree Construction

2.2 Clock Skew – Definitions and Concepts

10

10

12

16

Page 8: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

viii

2.2.1 Double-Clocking

2.2.2 Zero-Clocking

2.3 Concept of Useful-Skew

2.4 Elmore Delay Model

2.2.1 Numerical Example of Elmore Delay

Computation

17

19

20

25

26

3 ABSTRACT TOPOLOGY GENERATION AND CLOCK

SKEW SCHEDULING −−−− DEFINITIONS AND

ALGORITHMS

28

3.1 Introduction

3.2 Abstract Topology Generation

3.2.1 The Octagon of S

3.2.2 Reference Set, REFi

3.2.3 Bi-partitioning the Set S

3.3 Clock Skew Scheduling

3.3.1 Bellman-Ford Algorithm

28

29

38

41

41

45

52

4 USEFUL-SKEW TREE CONSTRUCTION −

ALGORITHM AND DESIGN

54

4.1 Overview

4.2 UST/DME Algorithm

4.2.1 Skew Constraint Matrix

4.2.2 Deferred-Merge Embedding (DME) Approach

4.3 Construction of Merging Region

4.3.1 Joining Segment Identification

4.3.2 Skew Commitment

4.3.3 Shortest Distance Region (SDR) Determination

4.3.4 Exact Merging Region Construction

4.4 Internal Node Embedding

54

55

58

59

60

66

67

70

70

77

5 CLOCK TREE SYNTHESIS PROGRAM DESIGN 81

5.1 Useful-Skew Tree (UST) Construction sub-module 81

Page 9: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

ix

5.1.1 Supporting Unit

5.1.2 Main Program Unit

5.1.3 Skew Constraint Matrix, D Computation Unit

5.1.4 Joining Segment Identification Unit

5.1.5 Incremental Skew Scheduling Unit

5.1.6 Merging Region Construction Unit

5.1.7 Internal Node Embedding Unit

5.2 Abstract Topology Generation sub-module

5.2.1 The Main Program

5.2.2 Sorting Subunit

5.2.3 Octagon Sinks Computation Unit

5.2.4 Partition Unit

5.2.5 Diameter Determination Unit

5.3 Clock Skew Scheduling sub-module

5.3.1 Constraint Graph Updating Unit

5.3.2 Graph Relaxation Unit

82

84

86

87

87

88

94

94

95

96

97

97

98

98

99

100

6 TESTING METHODOLOGIES 101

6.1 Overview

6.2 The Function of benchParse.pl

6.3 The Function of TPRParse.pl

6.4 The Function of CapacitanceParse.pl

6.5 The Function of delayTrace.pl

6.6 Source-Sink Delay Computation for Output

101

103

104

110

110

111

7 TEST AND RESULT 116

7.1 Random Data Test

7.1.1 Experimental Result for TEST_DATA_1

7.1.2 Experimental Result for TEST_DATA_2

7.1.3 Experimental Result for TEST_DATA_3

7.2 Real Circuit Data Test

7.2.1 Experiment Result of Circuit s208

7.2.2 Experiment Result of Circuit s298

116

117

120

122

124

125

126

Page 10: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

x

7.2.3 Experiment Result of Circuit s382

7.3 Experimental Result of Previous Relevant Researches

127

128

8 CONCLUSIONS 132

8.1 Concluding Remarks

8.2 Recommendations for Future Work

132

133

REFERENCES 137

Appendices A - H 143 – 236

Page 11: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

xi

LIST OF TABLES

TABLE NO. TITLE PAGE

3.1

3.2

3.3

7.1

7.2

7.3

7.4

7.5

Computation of partition cost for example in Figure 3.6

Coordination computation result of set S in Figure 3.7

The weight(s) of S based on REF1 = {s10, s17, s9}

Summary of skew evaluation result for TEST_DATA_1

Wire-length (µm) comparison between UST-BP and ZST

and BST [(Xi and Dai, 1997)]

Comparison of wire-lengths among ZST, BST/DME, UST-

BP, and UST/DME (with BST/DME topology) [(Tsao and

Koh, 2000)]

Comparison of wire-lengths among ZST, UST/DME, and

PV-UST/DME [(Chaturvedi and Hu, 2003)]

Results of PVR-UST/DME with buffer insertion

[(Chaturvedi and Hu 2003)]

37

40

42

120

129

129

130

131

Page 12: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

xii

LIST OF FIGURES

FIGURE NO. TITLE PAGE

1.1

1.2

2.1

2.2

2.3

2.4

2.5

2.6

2.7

2.8

2.9

2.10

2.11

2.12

3.1

3.2

3.3

3.4

3.5

3.6

3.7

3.8

Physical design cycle

Project workflow of CAD software module development

Symmetric H-tree clock distribution network

Definition of clock skew

Scenario of double-clocking a) initial stage b) when arrival

of first clock pulse c) wrong data latch-in by FFj

Scenario of zero-clocking a) initial stage b) when arrival of

first clock pulse c) when clock pulse arrives at si d) FFj still

latch-in the old data

Zero skew clock distribution design

Non-zero skew clock distribution design

Compensation of negative delay value

A synchronous circuit modeling

Permissible range of skewij

Skew constraint graph

Equivalent π-model of a wire segment

Example of interconnect wire for delay computation a) wire

b) wire modeling based on the π-Model

Example of clock tree routing by (Wu et al., 2003)

An example of sink nodes and its coordination

Overview of bi-partitioning process

Abstract topology generation

An example of set S with 3 sink nodes

An example of dividing a set of S into SL and SR

Example of an octagon of S for a 10 sink nodes set, S

Octagon of S’s boundary construction

2

7

13

17

18

20

21

21

22

22

24

24

25

27

29

30

31

32

36

37

38

39

Page 13: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

xiii

3.9

3.10

3.11

3.12

3.13

3.14

3.15

3.16

3.17

3.18

3.19

3.20

3.21

4.1

4.2

4.3

4.4

4.5

4.6

4.7

4.8

4.9

4.10

4.11

4.12

4.13

Partitioning result based on the set REF1

Balanced bipartition result that based on reference set REF2

Balanced bipartition result that based on reference set REF3

Balanced bipartition result that based on reference set REF4

Balanced bipartition result that based on reference set REF5

Balanced bipartition result that based on reference set REF6

(a) Example circuit and (b) its constraint graph

(a) Example circuit and (b) its constraint graph

Example circuit

Skew constraint graph (with p unknown), GC for circuit in

Figure 3.17

Applying Deoker’s algorithm

Skew constraint graph, GC (with p = 7.43) for the example

of Figure 3.17

Implementation of the optimized clock period

Design structure of proposed CAD software module

The DME merging process

Example input data to UST/DME algorithm a) clock pin

floorplan b) abstract topology c) skew constraint value

Merging Region Construction – Overview

Example for merging region construction process

Example of joining segment JS

Example of delay and skew computation based on joining

segment JS. (a) point of view in binary-tree structured. (b)

point of view in Manhattan plane

Example of shortest distance region SDR (u,w)

Start node visitation on Lu and the skew23 evaluation

Node visitation on arrow pointed node and the skew23

evaluation

Node visitation on arrow pointed node and the skew23

evaluation

Merging region, mr(v) is determined

Another set of merging region, mr(v) is determined

42

43

43

43

44

44

45

46

47

48

48

50

51

55

59

61

61

65

66

68

70

71

72

73

73

74

Page 14: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

xiv

4.14

4.15

4.16

4.17

4.18

4.19

4.20

4.21

5.1

5.2

5.3

5.4

5.5

5.6

5.7

5.8

5.9

5.10

5.11

5.12

5.13

5.14

6.1

6.2

6.3

Node visitation on arrow pointed node and the skew23

evaluation

Node visitation on arrow pointed node and the skew23

evaluation

Node visitation on arrow pointed node and the skew23

evaluation

Last node visitation on Lw and the skew23 evaluation

Final result of merging region, mr(v) construction

Merging region mr(w) re-computation (by comparing to the

Figure 4.6)

Internal node embedding process

Completed useful-skew clock tree

Design structure of useful-skew tree construction sub-

module

Binary tree-like structures two-dimensional array

Programming flowchart of UST/DME algorithm

C Code for Skew Constraint Matrix, D Computation Unit

C Code for Incremental Skew Scheduling Unit

Example of SDR between joining segments

Flowchart of merging region construction within SDR

Detour-wiring when skew value > upper skew constraint

bound

Delay evaluation when skew value < lower skew constraint

bound

Design structure of abstract topology generation sub-

module

Programming flowchart for sink nodes partitioning process

C Code fragment of bubble sort procedure

Design structure of clock skew scheduling sub-module

Programming flowchart of clock skew scheduling

A sample of sink nodes netlist file

A sample of delay netlist file

A sample of .bench netlist

74

75

75

76

76

77

79

80

82

83

85

85

87

90

91

92

93

94

95

96

98

99

102

103

104

Page 15: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

xv

6.4

6.5

6.6

6.7

6.8

6.9

6.10

6.11

6.12

6.13

6.14

6.15

6.16

6.17

6.18

6.19

7.1

7.2

7.3

7.4

7.5

7.6

7.7

7.8

7.9

7.10

7.11

7.12

8.1

8.2

A sample of .bch netlist

A sample of .bch netlist

A sample of .tpr netlist

Invoke standard place and route

Select a TPR netlist

SPR core setup

Run standard place and route (SPR)

Generated layout design through SPR

A sample of spice netlist

A sample of .pcp file

A sample of .cap File

A sample of pathDELAY.txt

Example of synthesized clock tree

C-code implementation of segmental delay computation

C-code implementation of source-to-sink delay

computation

Example of source-sink delay computation result

Synthesized useful-skew tree of TEST_DATA_1

Computation result of propagation delay of each sink node

Useful-skew tree in binary-tree structure view

Synthesized useful-skew tree of TEST_DATA_2

Computation result of propagation delay of each sink node

Useful-skew tree in binary-tree structure view

Synthesized useful-skew tree of TEST_DATA_3

Computation result of propagation delay of each sink node

Process flow of real circuit data testing

Synthesized useful-skew tree for circuit s208

Synthesized useful-skew tree for circuit s298

Synthesized useful-skew tree for circuit s382

Top-level block diagram of the proposed CAD software

module

Permissible skew range with safety margin ddition

104

105

105

106

106

107

107

108

109

109

110

111

112

113

114

114

118

118

119

120

121

121

123

123

125

126

127

128

133

134

Page 16: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

xvi

LIST OF SYMBOLS

VLSI - Very Large Scale Integration

CAD - Computer Aided-Design

EDA - Electronic Design Automation

IC - Integrated Circuit

PC - Personal Computer

GHz - Giga Hertz

UST - Useful-Skew Tree

ZST - Zero-Skew Tree

DME - Deferred-Merge Embedding

BB - Balanced Bipartition

SDR - Shortest Distance Region

FSR - Feasible Skew Range

ISCAS - International Symposium on Circuit and System

FF - Flip-Flop

Perl - Practical Extraction and Report Language

GUI - Graphics User Interface

Page 17: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

xvii

LIST OF APPENDICES

APPENDIX TITLE PAGE

A

B

C

D

E

F

G

H

Summary of CAD Software Module

Example of Balanced Bipartitioning Process

Numerical Example of Floyd-Warshall’s

Algorithm

Numerical Example of Incremental Scheduling

Method

Skew Evaluation Result for TEST_DATA_2

Skew Evaluation Result for TEST_DATA_3

C Code For Cad Software Module

Perl Script For myNetlistParser

143

145

150

156

160

162

165

217

Page 18: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

CHAPTER 1

INTRODUCTION

The thesis proposes a Computer Aided-Design (CAD) software module for

useful-skew tree (UST) synthesis in deep-sub-micron VLSI integrated circuit design.

The CAD software module is coded in C and Perl, and runs in a Microsoft Windows

2000 PC environment. In this chapter the project background, problem statement,

research objectives, scope of work, and thesis organization are presented.

1.1 Background

The last few decades brought explosive growth in the electronics industry due

to the rapid advances in integration technologies and the different benefits of large

scale system design. The manufacturing capability and complexity, combined with

the economic benefits of large electronic systems, are forcing a revolution in the

design of these systems and challenging system designers who are involved in the

design of integrated circuits. Integrated circuits today consist of hundreds of millions

of transistors. Due to the tremendous increase in complexity, automating the design

process has become a crucial issue.

The phase associated with the task of automatically designing a circuit using

Computer Aided Design (CAD) tools is called Electronic Design Automation (EDA).

The objective of the EDA research field is to fully automate the tasks for nearly

every aspect of the development cycle, from the circuit specification and design entry

Page 19: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

2

to the performance analysis, layout generation and verification. A large subset of

problems in VLSI EDA is computationally intensive, and future EDA tools will

require even more accuracy and computational capabilities. For a complicated

problem in the modern VLSI design, an appropriate approach is to use a “divide-and-

conquer” strategy, in which the whole design task is broken down into several sub-

tasks (Sherwani, 1996). These sub-tasks are then more manageable to be solved

using mathematical and heuristic techniques.

Due to the large number of components and the details required by the

fabrication process, the physical design needs the help of computers. As a result,

almost all phases of physical design extensively use CAD tools and many phases

have already been partially or fully automated. The design is carried out in stages.

The final circuit of such an IC can have up to a billion of components; it is delivered

in a step-by-step manner. This is accomplished in several stages such as partitioning,

floorplanning, placement, routing, and compaction. The different stages of physical

design cycle are shown in Figure 1.1. A fully tested and error-free design at the

switch level can be the starting point for a physical design. It is to be realized as the

final circuit using (typically) a million components in the foundry’s library.

Figure 1.1: Physical design cycle

Circuit Design

Floorplaning

&

Placement

Routing

Compaction

Extraction and

Verification

Physical Design Cycle

Partitioning

Fabrication

Page 20: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

3

Among all the stages, routing is a difficult problem, and much research has

been performed on this subject. This is because almost all problems in routing are

computationally complex. Essentially, the task of routing stage is to complete the

interconnections between blocks according to a specified netlist. The goal of a router

is to complete all these circuit connections using the shortest possible wire length,

routing with least number of layers and achieving timing optimization. This is

usually performed in three phases, referred to as the Global Routing, Detailed

Routing and Specialized Routing.

In physical design of VLSI system, there are two types of nets that need

special attention in routing: Clock nets and Power/Ground nets. Clock nets need to

be routed with great precision, since the actual length of the path of a net from its

entry point to its terminals determines the maximum clock frequency on which a chip

may operate. In a synchronous system, chip performance is directly proportional to

its clock frequency. A clock router needs to take several factors into account,

including the resistance and capacitance of the metal layers, noise and cross talk in

wires, and the type of load to be driven. In addition, the clock signal must arrive

simultaneously at all functional units with little or no waveform distortion. The clock

is also known as major power-consuming net. The clock net is only about 4% of the

total routing length of the entire interconnects, but power consumption of clock net is

not minor (Magen et al., 2004). In addition, compared to power and ground routing,

clock routing is relatively more complex.

As a result, designing a clock distribution network has thus become critical

not only for correct synchronization of data, but also can improve circuit

performance and reduces power dissipation. Consequently, one of the challenges that

EDA engineers have to face is to develop a new modeling capability and synthesis

techniques that help to control the clock distribution network routing effectively.

Page 21: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

4

1.2 Problem Statement

When the VLSI feature size becomes progressively smaller, moving into the

Deep Submicron or Ultra-Deep Submicron Technology (sometimes, referred as

nanometer VLSI) era, previously negligible variation effects start to affect circuit

performance and yield significantly. Clock skew, is one of the variation effects that is

always eliminated by previous clock distribution network design, where conventional

clock designs have placed emphasis on seeking zero clock skew. Due to deep-sub-

micron below 0.25 µm process, the long global interconnect lines become highly

resistive as line dimensions are decreased. This increased line resistance is one of the

primary reasons for the increasing significance of interconnection delay and now is

of greater importance than the device delays. The interconnection delays eventually

contribute to a large part of the clock signal delay. This leads to new problem to

arise, as achieving a near-zero skew design is more and more difficult. Thus, the

clock skew cannot longer be ignored. The control of clock skew can also severely

limit the maximum performance of the entire system and create catastrophic race

conditions in which an incorrect data signal may latch within a register.

As a result, in the modern high speed VLSI design era, clock design plays a

crucial role in determining chip performance and facilitating timing and design

convergence. Clock routing is important in the layout design of a synchronous digital

system as it influences correctness, area, speed, power dissipation, power/ground

supply noise, process variations and thermal issues of the synthesized system.

Consequently, the automatic synthesis of the clock tree network has gained

considerable attention in the design community.

The clock net is usually one of the first nets to be routed, and consequently

block own routing areas for nets routed subsequently. As it is one of the largest nets,

the area occupied is also a concern. Also, the clock net accounts for a significant

fraction of the system power dissipation as it switches most frequently and is a large

net. Carrying the heaviest load and switching at a high frequency, the clock typically

dissipates about 40% of the total interconnect power in a synchronous digital system

(Magen et al., 2004).

Page 22: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

5

In short, drastically increased requirements for high performance and high

speed VLSI circuits have posed challenges to the clock routing design, where clock

distribution network with elaborated timing characteristics and minimal wire-length

is crucial for multi-GHz VLSI designs. However, today's CAD systems do not help

much with the problems of time and often leave the final analysis to the designer.

High performance clock design is indeed and area of active research.

Responding to the problem stated above, a CAD software module specifically

for clock tree synthesis is proposed. By providing as inputs, the post-placement

layout data, together with the data paths propagation delay and the clock pins’

coordination and loading capacitance, a Useful-skew Tree (UST) is synthesized. The

output clock tree data is written in a netlist file, which will contain the information

on how the entire clock sinks are routed toward the clock source.

1.3 Objective

The objective of this thesis is to propose the development of a Computer

Aided-Design (CAD) software module for Useful-skew Tree (UST) synthesis in

deep-sub-micron VLSI integrated circuit design. The design of this CAD software

module applies several routing algorithms, including the UST/DME algorithm for

useful-skew tree generation, Balanced Bipartition (BB) method for abstract topology

generation, and Deoker’s graph-theoretic approach for clock skew scheduling.

1.4 Scope of Work

(i) A demonstration application prototype is developed to be used in the system

validation of the proposed CAD software module.

(ii) Ideally, the clock tree synthesis should be performed on the post-placement

layout data. Due to lack of availability of actual circuit data and necessary

Page 23: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

6

software tools, the performance evaluation to our CAD software module is

restricted to:-

(a) Actual circuit data, is applied with International Symposium on

Circuits and Systems (ISCAS89) benchmark circuits which are re-

created manually with Tanner Tools.

(b) The required data such as data path delays, coordination and loading

capacitance of each clock pin is extracted, and then transformed

manually to our CAD software module readable format.

(c) The reliability of synthesized clock tree is verified by manually

comparing the computed skew value with the prescribed skew

constraints.(as applying the synthesized clock tree to actual circuit

layout is impossible to be performed in current stage)

1.5 Project Approach and Tools

Figure 1.2 illustrates the overview of project workflow to the development of

the proposed CAD software module. On completing of the software sub-modules, the

sub-modules are integrated and tested. The tests are conducted using (i) 3

randomized input data and (ii) actual circuit data of ISCAS89 benchmark circuit. A

front-end GUI program is developed using Perl for use in demonstration of the

results. Satisfactory outcomes from the performance evaluation successfully

conclude the research.

Page 24: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

7

Figure 1.2: Project workflow of CAD software module development

The following software tools are used in this work.

i. Microsoft Visual C++ 6.0 – used to develop the entire core program

of proposed CAD software module.

ii. Perl scripting – used to develop the netlist parsers and the front-end

prototyping program.

iii. Tanner Tools – used to re-produce ISCAS89 benchmark circuit in

layout drawing design.

START

Problem Formulation

Work Scope Determination

Literature

Review

Documentation

&

Thesis

Writing

NO

CAD Subsystem Development

Abstract

Topology

Generator

Development

Useful-Skew Tree Synthesizer

Development

Clock

Skew

Scheduler

Development

Functionality Test

Random

.Data

Test

Actual

Circuit

Data

Test

Meet

Functionality?

END

YES

Page 25: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

8

1.6 Research Contribution

i. Investigation on the clock distribution network routing issue in deep

submicron technology VLSI design, and systematic study on the clock

skew scheduling and clock tree routing technique for an efficient

implementation.

ii. Delivering a CAD software module for useful-skew tree synthesis where

the synthesized clock tree layout is not only minimized in term of wire-

length, but also fulfills the prescribed skew constraint values.

1.7 Thesis Organization

This thesis is organized into seven chapters. First chapter is the introduction

chapter. It covers the background of the research, the problem statement, research

objectives, scope of work, project approach and tools, and the significance of the

research.

The second chapter reviews the fundamental concept of clock distribution

network design research. It consists of some previous related research work and

background theory.

Chapter three describes the algorithm of the abstract topology generation and

clock skew scheduling algorithm. Both of the algorithms are used to pre-process the

post-placement layout data so as to transform it to the required input data of

UST/DME algorithm.

Chapter four focuses on the description of our main clock tree construction

algorithm, UST/DME algorithm. The architecture design of the proposed CAD

Page 26: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

9

software module is presented where the UST/DME algorithm is applied as the main

engine of the useful-skew tree synthesis.

Chapter five delivers the detailed implementation work of the proposed CAD

software module which is integrated from the useful-skew tree construction sub-

module, abstract topology generation sub-module and clock skew scheduling sub-

module.

In Chapter Six, the testing methodology is presented. We discuss on how the

test platform is set-up and how the skew values of synthesized clock tree can be

computed.

Chapter Seven shows the experimental results of the functionality test to the

developed CAD software module. Part of ISCAS89 benchmark circuits is used in the

experiments.

The final chapter summarizes the research findings and suggests potential

future work.

Page 27: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

REFERENCES

Bakoglu, H., Walker, J.T. and Meindl, J.D. (1986). A symmetric clock-distribution

tree and optimal high-speed interconnection for reduced clock skew in ULSI and

WSI circuit. Proceedings of IEEE International Conference on Computer Design,

118-122.

Brglez, F., Bryan, D., and Kozminski, K. (1989) Combinational profiles of sequential

benchmark circuits. Proceedings of IEEE International Symposium on Circuits and

Systems. 1929–1934.

Ch’ng, H. S., Chew, E. W., Nasir, S. H., and Mohamed Khalil, H. (2006)

Accelerating graph algorithms with priority queue processor. In Proceedings of

Regional Postgraduate Conference on Engineering and Science (RPCES 2006) July

26-27 JB, Malaysia: UTM 257-262.

Chao, T. H., Hsu, Y. C., and Ho, J. M. (1992a) Zero skew clock net routing. In

proceedings of ACM/IEEE Design Automation Conferences. 518-523.

Chao, T. H., Hsu, Y. C. H., Ho, J. M., Boese, K. D., and Kahng, A. B. (1992b) Zero

skew clock routing with minimum wirelength. IEEE Transactions on Circuit

System,November 1992, 39(11):799–814.

Chaturvedi, R. and Hu, J. (2003) A simple yet effective merging scheme for

prescribed-skew clock routing. In Proceedings of the 21st IEEE International

Conference on Computer Design (ICCD’03) 282–287

Page 28: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

138

Chen, Y. P. and Wong, D. F. (1996) An Algorithm for Zero-Skew Clock Tree

Routing with Buffer Insertion. In Proceedings of European Design and Test

Conferences (ED&TC). 230-236

Cho, J. D., and Sarrafzadeh, M. (1995) A buffer distribution algorithm for high-

performance clock net optimization. IEEE Transactions on VLSI Systems Vol.3,

No.1, pp.84–97.

Cong, J., and Koh, C. K. (1995) Minimum-cost bounded-skew clock routing.

Proceeding of International Symposium Circuits and Systems. 322–327

Cong, J., Kahng, A. B., Koh, C.K., and Tsao. C. W. A. (1998) Bounded-skew clock

and Steiner routing. In ACM Transactions on Design Automation of Electronics

Systems 3(3):341–388.

Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. (2001) Introduction to

Algorithms. 2nd

Edition. The MIT Press, McGraw-Hill Book Company.

Deokar, R. B., and Sapatnekar, S. S. (1995). A graph-theoretic approach to clock

skew optimization. Proceedings of the 1994 IEEE International Symposium on

Circuits and System volume 1, pages 407–410.

Desai, U., Tam, S., Kim, R., Zhang, J., and Rusu, S. (2000) Itanium processor clock

design. In Proceedings of the 2000 international symposium on Physical design 94–

98.

Dhar, S., Franklin, M. A. and Wann, D. F. (1984). Reduction of clock delays in VLSI

structure. Proceedings of IEEE International Conference on Computer Design 778-

783.

Donno, M., Macii, E, and Mazzoni, L. (2004) Power-aware. clock tree planning In

Proceedings International Symposium on Physical Design 138-147

Page 29: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

139

Edahiro, M. (1993) A clustering-based optimization algorithm in zero skew routings.

Proceedings of the ACM/IEEE Design Automation Conference (DAC), 612–616.

Friedman, E. G. (1989). Performance limitations in synchronous digital systems.

University California, Irvine: Ph.D. dissertation.

Farrahi, A. H., Chen, C., Srivastava, A. (2001) Activity-driven clock design. In IEEE

Transactions nn Computer-Aided Design Of Integrated Circuits and Systems June

2001. Vol. 20, No. 6, pp. 705 – 714.

Fishburn, J. P. (1999). Clock skew optimization. IEEE Transactions on Computer

39(7):945–951.

Fisher, A. L., and Kung, H. T. (1982). Synchronizing large systolic arrays.

Proceedings of SPIE, 44-52.

Guthaus, M. R., Sylvester, D., and Brown, R. B. (2006) Clock buffer and wire sizing

using sequential programming. In Design Automation Conferences (DAC’06) July

24-28. San Francisco, California: ACM, 1041-1046.

Hatamian, M. (1988) Understanding clock skew in synchronous systems. Concurrent

Computations (Algorithms, Architecture and Technology). New York: Plenum, 87–

96.

Held, S., Korte, B., Maßberg, J., Ringe, Matthias., and, Vygen, J. (2003) Clock

scheduling and clock-tree construction for high performance ASICs In Proceedings

of the International Conference on Computer Aided Design (ICCAD’03). 232 – 239.

Hu, T. C. and Kuh, E. (1985) Theory and Concepts of Circuit Layout. VLSI Circuit

Layout:Theory and Design. IEEE Press, New York.

Huang, D. J. H., Kahng, A. B., and Tsao, C. W. A. (1995). On the bounded-skew

clock and steiner routing problems. Proceeding of 32nd Design Automation

Conferences 508–513.

Page 30: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

140

Jackson, M. A. B. Srinivasan, A. and Kuih, E. S. (1990). Clock routing for high

performance IC’s. In Proceedings of the ACM/IEEE Design Automation Conference

322-327.

Kahng, A. B., Cong, J. and Robins, G. (1993). Matching based Models for high

performance clock routing. IEEE Transaction on CAD of Integrated Circuit and

System, August 1993, 12:1157-1169.

Kourtev, I. S., and Friedman, E. G. (1999). Clock skew scheduling for improved

reliability via quadratic programming. Proceedings of the 1999 IEEE/ACM

International Conference on Computer-Aided Design, 239–243.

Lam, W. C. D. and Koh, C. K. (2006) Process variation robust clock tree routing. In

Proceedings of the 2005 Conference on Asia South Pacific Design Automation 606 –

611

Magen, N., Kolodny, A., Weiser, U., and Shamir, N. (2004) Interconnect-Power

Dissipation in a Microprocessor. In Proceedings of the 2004 International Workshop

On System Level Interconnect Prediction, 7–13.

Neves, J. L., and Friedman, E. G. (1996). Optimal clock skew scheduling tolerant to

process variations. Proceedings of the 33rd

Annual Conference on Design

Automation, 623–628.

Pullela, S., Menezes, N., Omar, J., and Pillage, L. T. (1993) Skew and delay

optimization for reliable buffered clock trees. IEEE International Conferences on

Computer Aided Design. 556–562.

Sakallah, K. A., Mudge, T. N. and Olukoton, O. A. (1990). checkTc and minTc:

Timing verification and optimal clocking of synchronous digital circuits.

Proceedings of IEEE International. Conferences. on Computer-Aided Design,

November 1990, 552–555.

Page 31: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

141

Shenoy, N. Brayton, R. K., and Sangiovanni-Vincentelli, A. L. (1992). Graph

algorithms for clock schedule optimization. Proceedings of IEEE International.

Conferences. on Computer-Aided Design, 132–136.

Sherwani, N. (1996). Algorithms for VLSI Physical Design Automation. 2nd

Edition.

Dordrecht, The Netherlands: Kluwer Academic Publishers.

Stinson, J. and Rusu, S. (2003) A 1.5GHz third generation Itanium-2 processor. In

Proceedings of the 40th conference on Design automation 706–709

Szymanski, T. G., and Shenoy, N. (1992). Verifying clock schedules. Proceedings of

IEEE International. Conferences. on Computer-Aided Design, November 1992, 124–

131.

Takahashi, A., Inoue, K., and Kajitani, Y. (1997) Clock-Tree Routing Realizing a

Clock-Schedule for Semi-Synchronous Circuits. In Proceedings of IEEE/ACM

International Conference on Computer Aided Design (ICCAD) '97. 260-265.

Tsai, J. L., Chen, T. H., and Chen, C. C. P. (2004) Zero-skew clock tree optimization

with buffer insertion/sizing and wire sizing. In IEEE Transactions on Computer-

Aided Design of Integrated Circuit and System. Vol. 23, No.4, pp. 565-527.

Tsai, Jeng-Liang (2005) Clock Tree Synthesis for Timing Convergence and Timing

Yield Improvement in Nanometer Technologies. University of Wisconsin-Madison:

Ph.D. Thesis.

Tsao, C. W. A., and Koh, C. K. (2000) UST/DME: a clock tree router for general

skew constraints. Proceedings of the IEEE/ACM ICCAD. 400–405.

Tsay, R. S. (1993). An exact zero-skew clock routing algorithm. IEEE Transactions

on Computer-Aided Design of Integration Circuit System, CAD-12, 242–249.

Page 32: A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK …eprints.utm.my/id/eprint/17986/1/ChewEikWeeMFKE2008.pdf · (VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir

142

Wang, K., and Marek-Sadowska, M. (2004) Buffer sizing for. clock power

minimization subject to general skew. constraints. In Design Automation

Conferences (DAC’04). June 7-11. San Diego, California: ACM, 159–164.

Xi, J. G., and Dai, W. W. M. (1995) Buffer insertion and sizing under process

variations for low power clock distribution,” In Proceedings of 32nd

Design

Automation Conferences. June.

Xi, J. G., and Dai, W. W. M. (1997) Useful-skew clock routing with gate sizing for

low power design. Journal of VLSI Signal Processing. 16(2/3):163–179,

Zanella, S., Nardi, A., Neviani, A., Quarantelli, M., Saxena, S., and Guardiani, C.

(2000) Analysis of the impact of process variations on clock skew. IEEE

Transactions on Semiconductor Manufacturing, vol. 13, no. 4, pp. 401—407.

Zhu, Q., Dai, W. W. M., and Xi, J. G. (1993) Optimal sizing of high speed clock

networks based on distributed rc and transmission line models. IEEE International

Conferences on Computer Aided Design. November 1993, 628–633.