CLOCK DISTRIBUTION USING VHDL

ABSTRACT

Very-large-scale integration (VLSI) is the process of creating integrated

circuits by combining thousands of transistor-based circuits into a single chip. VLSI

began in the 1970s when complex semiconductor and communication technologies were

being developed. The microprocessor is a VLSI device. The term is no longer as common

as it once was, as chips have increased in complexity into the hundreds of millions of

transistors.

VHDL stands for VHSIC Hardware Description Language. VHSIC is

an abbreviation for Very High Speed Integrated Circuit, a project sponsored by the US

Government and Air Force begun in 1980 to advance techniques for designing VLSI

silicon chips. VHDL is an IEEE standard.

The quality of the clock signals is the most important factor for

ensuring a chip’s successful operation. In a design net list, there are hundreds of

thousands or millions of cells. Those cells can be classified as two types: combinational

cells and sequential cells (including memories). The sequential cells are used for storing

information and they must operate on clocks. After the placement stage of the design

implementation process, all of the cells, including the sequential cells are spread around

the entire chip. The task of clock distribution is to distribute the clock signals to all of

these sequential cells. This work is commonly called clock tree synthesis. This report
outlines the principal idea of how a clock tree is constructed.

Our project deals with how the clock is distributed.

DEPT OF ECE Page 1 of 66 AIET

Chapter 1

INTRODUCTION

In the past few decades, Integrated Circuit technology has been advancing

rapidly. In synchronous integrated circuits, clock is used to synchronize the data transfer.

The performance and functionality of the entire system depends on the clock

characteristics. Of late, clock distribution has become an exigent task for the VLSI

designers, as it consumes a large portion of resources like wiring, design time and power.

In the worst case, functional errors can also be caused due to the uncertainties in

clock network delays. These uncertainties also result in performance degradation.

Therefore it is imperative to model clock distributions accurately for their performance.

Clock distribution in high-speed digital systems is a demanding problem that consumes
a growing share of resources such as design time, power and wiring. Skew is the key
parameter of interest, so we first consider clock skew and its estimation in a digital
circuit or network. Clock skew is the difference in clock arrival time between different
components across a chip. Because of this difference, the output of each circuit is
delayed by a different amount, which makes the speed of the digital system irregular. [1]
In this paper, we introduce a synchronous counter circuit that uses four D flip-flops
driven by the same clock, and we estimate the clock skew in this circuit using VHDL
(Very High Speed Integrated Circuit Hardware Description Language). This paper is
organized as follows. Section II describes the circuit design. Section III covers the
implementation of the circuit in VHDL. Section IV shows the analysis and simulation
results of the circuit. Section V presents the conclusion of our analysis.
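The definition of clock skew given above can be illustrated with a short Python sketch. It is not the VHDL model described in this report. The flip-flop names and the arrival times are hypothetical values chosen only for illustration.

```python
from itertools import combinations

def pairwise_skew(arrival_ns):
    """Return {(a, b): t_b - t_a} for every pair of clocked elements."""
    return {
        (a, b): arrival_ns[b] - arrival_ns[a]
        for a, b in combinations(sorted(arrival_ns), 2)
    }

def max_skew(arrival_ns):
    """Largest difference in clock arrival time across all elements."""
    times = arrival_ns.values()
    return max(times) - min(times)

# Hypothetical clock arrival times (ns) at the four D flip-flops.
arrivals = {"FF0": 1.00, "FF1": 1.25, "FF2": 1.50, "FF3": 2.00}
print(max_skew(arrivals))                       # 1.0
print(pairwise_skew(arrivals)[("FF1", "FF2")])  # 0.25
```

With four flip-flops there are six pairs to compare. The largest of these differences is the figure that a skew budget would be checked against.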

Clock signals are typically loaded with the greatest fan-out and operate at the

highest speeds of any signal, either control or data, within the entire synchronous system.

Since the data signals are provided with a temporal reference by the clock signals, the

clock waveforms must be particularly clean and sharp. Furthermore, these clock signals

are particularly affected by technology scaling (Moore's law), in that long global

interconnect lines become significantly more resistive as line dimensions are decreased.

This increased line resistance is one of the primary reasons for the increasing significance


of clock distribution on synchronous performance. Finally, the control of any differences

and uncertainty in the arrival times of the clock signals can severely limit the maximum

performance of the entire system and create catastrophic race conditions in which an

incorrect data signal may latch within a register. The clock distribution network often

takes a significant fraction of the power consumed by a chip. Furthermore, significant

power can be wasted in transitions within blocks, even when their output is not needed.

These observations have led to a power saving technique called clock gating, which

involves adding logic gates to the clock distribution tree, so portions of the tree can be

turned off when not needed (when a clock can be safely gated may be determined either

through automatic analysis of the circuit, or specified by the designer). The exact savings
are very design dependent, but around 20-30% is often achievable.

Most synchronous digital systems consist of cascaded banks of sequential registers with
combinational logic between each set of registers.

The functional requirements of the digital system are satisfied by the logic stages. Each

logic stage introduces delay that affects timing performance, and the timing performance

of the digital design can be evaluated relative to the timing requirements by a timing

analysis. Often special consideration must be made to meet the timing requirements. For

example, the global performance and local timing requirements may be satisfied by the

careful insertion of pipeline registers into equally spaced time windows to satisfy critical

worst-case timing constraints.

The proper design of the clock distribution network helps ensure that critical

timing requirements are satisfied and that no race conditions exist (clock skew). The

delay components that make up a general synchronous system are composed of the

following three individual subsystems: the memory storage elements, the logic elements,

and the clocking circuitry and distribution network. Interrelationships among these three

subsystems of a synchronous digital system are critical to achieving maximum levels of

performance and reliability.

Clock gating is a popular technique used in many synchronous circuits for

reducing dynamic power dissipation. Clock gating saves power by adding more logic to a

circuit to prune the clock tree. Pruning the clock disables portions of the circuitry so that

the flip-flops in them do not have to switch states. Switching states consumes power.


When not being switched, the switching power consumption goes to zero, and

only leakage currents are incurred.

Clock gating works by taking the enable conditions attached to registers and using
them to gate the clocks. A design must therefore contain these enable conditions in
order to use and benefit from clock gating. Clock gating can also save significant die
area as well as power, since it removes large numbers of muxes and replaces them with
clock gating logic. This clock gating logic is generally in the form of "Integrated clock
gating" (ICG) cells. Note, however, that the clock gating logic changes the clock tree
structure, since it sits in the clock tree.
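The effect of gating on switching power can be estimated with the standard relation P = C * Vdd^2 * f, scaled by the fraction of the clock capacitance that is still toggling. The Python sketch below is only a rough estimate under that assumption. The capacitance, supply voltage, frequency and enabled fraction are hypothetical numbers, and leakage power is not included.

```python
def clock_dynamic_power(c_farads, vdd, freq_hz, enabled_fraction=1.0):
    """Switching power C * Vdd^2 * f, scaled by the fraction of the
    clock capacitance that is actually toggling (1.0 = no gating)."""
    if not 0.0 <= enabled_fraction <= 1.0:
        raise ValueError("enabled_fraction must be within [0, 1]")
    return enabled_fraction * c_farads * vdd ** 2 * freq_hz

# Hypothetical numbers: 1 nF of clock load, 1.0 V supply, 500 MHz clock.
ungated = clock_dynamic_power(1e-9, 1.0, 500e6)
# Assume gating keeps 75% of the clock capacitance switching on average.
gated = clock_dynamic_power(1e-9, 1.0, 500e6, enabled_fraction=0.75)
saving = 1.0 - gated / ungated   # about 0.25 for these numbers
print(ungated, gated, saving)
```

An enabled fraction of 0.75 corresponds to a saving of about 25%, which lies inside the 20-30% range quoted earlier. The real fraction depends on how often the enable conditions are inactive.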

1.1 Motivation

Clock distribution networks synchronize the data transfer between the data

paths. In Integrated Circuits, proper design of a clock distribution network is necessary as

it has a direct impact on the system performance. Significant research has been done both

in the industrial and academic communities in the area of design and the optimizations of

clock distribution networks. To the best of our knowledge, there are no previous works

on modeling clock distribution networks using Hardware Description Languages. The

need for accurate modeling of general characteristics of clock distribution networks has

motivated this research. Providing accurate models to identify any uncertainty in the

clock signal arrival times at different significant points in the clock distribution network

will be a great aid to the circuit designers. The models developed as part of this research

will be used to build the model library in Distributed Processing Laboratory (DPL) at

University of Cincinnati.

1.2 Objective

This thesis deals with the following research questions.

1. Is it possible to accurately model clock distributions using VHDL? If so, what
do the models look like?

2. Is VHDL versatile enough to model different clock distribution networks?


3. What are the characteristics of clock distribution that can be modeled by

VHDL?

1.3 Approach

The approach taken in this research to model any clock distribution network is as
follows. First, a library of components that are essential to any clock distribution
network is built. The components considered in this research are as follows:

• Interconnects

• Buffers

• Phase Locked Loop

In addition to the above listed components, simple models for the oscillator and

the load for the clock distribution network are modeled. Using the library of components,

a generic model for a clock distribution network can be generated. This satisfies the first

research goal. To validate the second research goal, two case studies have been

considered in this research. An H-tree clock distribution network and a regular pattern

clock distribution network have been chosen for their versatility. By modeling these two

different types of clock distribution networks, the versatility of the VHDL-AMS is

proven. The important characteristic of a clock distribution network considered in this

research is the skew of the clock signal.
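The geometry of an H-tree explains why it is a natural choice for low skew: every sink is reached through the same wire length. The Python sketch below generates the sink positions recursively and checks this property. It is a geometric illustration only, not the VHDL model of this research, and the function name, span and number of levels are hypothetical.

```python
def h_tree_sinks(x, y, half_span, levels, path_len=0.0):
    """Return [(sink_x, sink_y, wire_length_from_root)] for an H-tree.

    Each level draws one 'H': a horizontal bar of half-width half_span,
    then vertical bars of half-height half_span at both ends. The four
    tips become the centres of the next, half-sized level.
    """
    if levels == 0:
        return [(x, y, path_len)]
    sinks = []
    for dx in (-half_span, half_span):
        for dy in (-half_span, half_span):
            sinks += h_tree_sinks(x + dx, y + dy, half_span / 2,
                                  levels - 1, path_len + 2 * half_span)
    return sinks

sinks = h_tree_sinks(0.0, 0.0, half_span=4.0, levels=3)
print(len(sinks))                            # 64
print({length for _, _, length in sinks})    # {14.0}
```

Since all 64 sinks see the same wire length, any skew in an ideal H-tree comes from buffer, load and process differences rather than from the routing itself.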

1.4 Overview of Results

Experiments have been set up to validate the goals of the research. It is shown how
the models developed for this research can be used to model clock distribution networks.
The simulation and CPU times of the various models are reported to validate the speed of
the VHDL language. Two case studies were considered to prove the versatility of the clock

distribution network. The skew is an important performance limiting factor in any clock

distribution. The skew variation with varying levels of the H-tree, interconnect lengths,

load capacitance and the number of stages in a regular pattern clock distribution network

is analyzed.


CHAPTER 2

BACKGROUND

This chapter discusses the need for modeling clock distribution networks in Section 2.1.
A brief description of the related work is provided in Section 2.2. The VHDL language
used in this research, with its language constructs and its techniques for modeling a
mixed-signal system, is also introduced. Background information on clock distribution
networks is presented in Section 2.3, and the components of clock distribution networks
(interconnects, buffers and phase locked loops) are discussed in Section 2.4. Finally,
the characteristics of clock distribution considered in this research are discussed in
Section 2.5.

2.1 Need for Modeling Clock Distribution Networks

Clock distribution over an entire chip is a very complex problem and is one of the

main challenges in the design of today’s high-performance processors. Clock distribution

has a significant impact on the performance of the entire system and heavily contributes

to the total power dissipation of the chip. Any inaccuracies of clock timing may be

critical to the circuit operations resulting in functionality errors. An accurate model of the

clock distribution network for any VLSI circuit is helpful for accurate performance

evaluation [1]. It will be of great help to the circuit designers to model uncertainties in the

clock signal arrival times between key points in a clock distribution network.

2.2 Related Work

In the literature, there are some works reported to model the impact of process

variations on Clock skew [4], and the effect of technology scaling on clock skew and

interconnect delay [5]. Research has also been done to model the effects of systematic

within-die interconnect parameter variation like metal thickness, metal line width, and


inter-layer dielectric thickness variations on circuit performance [6]. Buffer Insertion and

the effect of process variations on its sizing have also been studied [7]. Clock skew

analysis has also been reported in many research papers [8]. But most of the above works

use statistical analysis and Monte-Carlo simulations to find the effects of different

variations on skew. To the best of our knowledge, no work has been reported that models
a clock distribution network using a hardware description language (HDL) while taking
parameters like skew and process variations into account. Research efforts have been
made to model the Phase Locked Loop and its components for mixed-mode simulation
[9, 10, 11]. But there are no existing works which link the phase locked loop with the
clock distribution network and model its effect on the performance of the clock
distribution. The novel feature of this research is that a clock distribution network is
modeled using an HDL by modeling its components (interconnects, buffers, phase locked
loop, source and load) and their impact on clock skew and on the rise and fall times of
the clock. Use of an HDL combines the flexibility of the language in modeling with the
accuracy and speed of simulation. The HDL used for this research is VHDL.

2.3 Clock Distribution Networks

Designing a clock distribution network is a very complex task for the circuit

designers. The difficulties in clock distribution design are increased by device
technology improvements that lead to smaller feature sizes, larger chip areas and
increased component density, since these result in higher interconnect resistance and
higher clock loads [14]. Some of the clock distribution networks used in certain

microprocessors are discussed below. The clock distribution network used in Intel IA-64

microprocessor is shown in Figure 2.1. The significant segments are discussed as follows.

The global clock distribution connects the PLL clock generator to the de-skewing buffers.

The regional clock distribution connects the clock from de-skewing buffers to the local

clock regions using clock grids. The local clock distribution connects the clock from the

regional clock grid to the clocked elements using local clock buffers and local

interconnections.

Figure 2.1: Clock distribution network in Intel IA-64 microprocessor [8]

The clock distribution network of the 600MHz Alpha microprocessor is depicted in
Figure 2.2. The clock is generated on-chip using a PLL which multiplies an 80-200MHz
reference clock to generate a frequency of 600MHz. The feedback loop of the PLL includes
the clock distribution network up to and including the global clock (GCLK) to control
phase alignment. A high-gain buffer network is used to route GCLK to a central point on
the die. From there the clock is distributed by buffered X and H trees, as shown in
Figure 2.3, to the GCLK drivers located in a windowpane pattern across the chip.

Figure 1: Clock distribution network in 600MHz Alpha Microprocessor


Figure 2: Global Clock Network [15]

To summarize, any clock distribution network will usually contain the following

components: Phase-locked loop for on-chip clock generation, clock buffers to drive the

large capacitive load in the network and the local interconnects to connect the clock to

the clock driving points. Other components of the clock distribution networks include

De-skewing buffers, Delay locked loops, etc. In this research, interconnects, buffers and

PLLs will be studied in detail.

2.4 Components of a Clock Distribution Network

2.4.1 Buffers

Another important component of any clock-distribution network is a buffer.

Typically the clock is connected to a large number of components in the circuit resulting

in large loads. The characteristic load consists of the clock distribution wires, and the end

points which drive the logic blocks in the circuit. Buffers are inserted between the clock

source and the load, to ensure that the clock signals at different end points have smaller

rise and fall times. The clock buffer design involves the following steps: deciding on the

buffer delay and the output slopes, and choosing the buffer sizes based on the load

capacitance to be driven. The clock buffer delays affect the clock skew of the network

and hence are chosen such that the clock skew budget is met. The effect of clock buffer

delays on skew is shown in Figure 2.5.
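The buffer selection step described above can be sketched in Python with a simple linear delay model, in which the delay is an intrinsic term plus drive resistance times load capacitance. This model and the three-entry buffer library are assumptions made for illustration. They are not taken from this research or from a real cell library.

```python
# Hypothetical buffer library, weakest first:
# name -> (intrinsic delay in ps, drive resistance in ohms).
BUFFERS = {
    "BUF_X1": (20.0, 4000.0),
    "BUF_X2": (22.0, 2000.0),
    "BUF_X4": (25.0, 1000.0),
}

def buffer_delay_ps(name, c_load_ff):
    """Linear delay model: intrinsic delay + R_drive * C_load."""
    intrinsic_ps, r_drive_ohm = BUFFERS[name]
    return intrinsic_ps + r_drive_ohm * c_load_ff * 1e-3  # ohm * fF = 1e-3 ps

def pick_buffer(c_load_ff, max_delay_ps):
    """Return the first (weakest) buffer that meets the delay target, or None."""
    for name in BUFFERS:
        if buffer_delay_ps(name, c_load_ff) <= max_delay_ps:
            return name
    return None

print(pick_buffer(50.0, 130.0))   # BUF_X2
# The same buffer driving two different loads gives two different delays,
# and the difference (about 20 ps here) appears as clock skew.
print(buffer_delay_ps("BUF_X2", 50.0) - buffer_delay_ps("BUF_X2", 40.0))
```

Under this model a stronger buffer reduces the load-dependent part of the delay, so it also reduces the skew caused by unequal loads.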


Figure 3: Clock skew variation with buffer delay

2.4.2 Phase Locked Loop

A Phase Locked Loop (PLL) is used as a frequency synthesizer to multiply a

reference frequency generated by a crystal oscillator to much higher frequencies needed

for today’s higher end microprocessors. In principle, a PLL synchronizes the frequency

of the output signal generated by an oscillator with the frequency of a reference signal

and eliminates any phase misalignment. Some of the other applications of PLLs in

communications are carrier and clock recovery, frequency and phase modulation and

demodulation [17]. Thus, the main purpose of a PLL in integrated circuits is to generate

clock and to obtain accurate phase synchronizations between the off-chip reference clock

and the internal clock signals.

A rudimentary PLL (shown in Figure 2.6) typically has the following three basic

blocks in a feedback loop

• Phase Detector (PD)

• Low Pass Filter (LPF)

• Voltage Controlled Oscillator (VCO)

Figure 4: Block diagram of a basic PLL (reference signal, PD, LPF, VCO)

The phase detector may be a simple analog multiplier. Based on the

requirements of the application, more complicated phase detectors are used in practice.

The loop filters are optional and are used to increase the bandwidth of the PLL and to

reduce the phase noise. The VCO is an oscillator producing an output frequency

proportional to its control voltage [18]. The main drawback of a rudimentary PLL using

an analog multiplier as a phase detector is that it has a finite phase error that is a function

of input frequency. The PLLs in use today are charge-pump PLLs, which can track the
phase accurately, resulting in practically zero nominal phase error regardless of the
input frequency. A charge-pump PLL, shown in Figure 2.7, has the following blocks.

• Phase detector (PD)

• Charge pump (CP)

• Low Pass Filter (LPF)

• Voltage Controlled Oscillator (VCO)

Figure 5: Block diagram of charge-pump PLL (blocks: PD, CP, LPF, VCO and frequency divider)

A brief discussion about the different blocks of a PLL is given below.

2.4.3 Phase Detector


A phase detector compares the phase of the reference signal with that of the signal
fed back from the VCO and outputs the phase error. There will be no output from the PD
if the two signals have the same phase and frequency. Otherwise, the phase error is used
to generate a control voltage for the voltage controlled oscillator such that the phase
error is driven to zero.

2.4.4 The charge pump

The charge pump converts the phase error detected by the PD into a current or
voltage that controls the frequency generated by the VCO. The charge pump can be used to
set the phase detector gain, KD. The charge pump either charges or discharges the
filter’s capacitors based on the output of the PD. If the reference signal leads the VCO
output, the output of the PD signals the charge pump to pump more charge into the
capacitors. If it lags, an equivalent amount of charge is removed from the filter’s
capacitors.
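The charge and discharge behaviour described above can be sketched as one time step of an idealised charge pump, where the filter capacitor voltage changes by I * dt / C. The Python below is a simplified illustration. The function name, the sign convention for the phase error and the default component values are assumptions.

```python
def charge_pump_step(v_ctrl, phase_error, i_cp=1e-4, c_filter=1e-9, dt=1e-8):
    """Advance the filter-capacitor voltage by one time step.

    phase_error > 0 means the reference leads the VCO output, so charge
    is pumped in; phase_error < 0 removes the same amount of charge.
    """
    if phase_error > 0:
        direction = 1
    elif phase_error < 0:
        direction = -1
    else:
        direction = 0
    # Delta V = I * dt / C for an ideal current source and capacitor.
    return v_ctrl + direction * i_cp * dt / c_filter

v = 0.5
v_up = charge_pump_step(v, phase_error=0.1)     # reference leads: voltage rises
v_down = charge_pump_step(v, phase_error=-0.1)  # reference lags: voltage falls
print(v_up, v_down)
```

With the default values each step changes the control voltage by about 1 mV. In a full PLL model this voltage would set the VCO frequency.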

2.4.5 Low Pass Filter

The main purpose of a low pass filter is to modify the bandwidth of the PLL and

to reduce the phase error. The filter converts the charge of the charge-pump into voltage,

to control the frequency of the VCO. A passive RC filter can be used as a simple low

pass filter. The higher the order of the filter, the better is the noise rejection in the PLL.

The filter can be a passive RC filter or an active filter using op-amps. If the control
voltage of the VCO is less than the voltage generated by the charge pump, a passive RC
filter will suffice; if not, an active filter is used.

2.4.6 Voltage Controlled Oscillator

A Voltage Controlled Oscillator is the heart of the PLL as it dominates the phase

noise performance of the entire PLL. It produces the required frequency for the PLL. The

frequency of the VCO can be controlled by a control voltage. The output of the VCO is
fed back to the phase detector, and the phase difference between the reference signal and
the output of the VCO is changed into a DC output voltage. This DC voltage controls the
frequency of the VCO in such a way that the output matches the reference signal closely.
This process is called “acquiring lock”.


2.4.7 Frequency Divider

Typically the output frequency of a PLL is a multiple of the reference signal
frequency. Hence, a frequency divider is used as part of the feedback loop to divide the
frequency generated by the VCO down to a value that is comparable to the reference
signal. Frequency dividers have become indispensable in PLL circuits, as the output

clock frequencies of today’s microprocessors are much higher than their input clock

frequencies.
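The divide ratio follows directly from the requirement that the divided VCO output equals the reference frequency, that is N = f_out / f_ref. The Python sketch below uses the 600MHz output and the 80-200MHz reference range mentioned in Section 2.3. The particular reference values tried are picked for illustration, and frequencies are assumed to be whole numbers of hertz.

```python
from fractions import Fraction

def divider_ratio(f_out_hz, f_ref_hz):
    """Feedback divide ratio N such that f_out / N equals f_ref."""
    return Fraction(f_out_hz, f_ref_hz)

for f_ref in (100_000_000, 150_000_000, 200_000_000):
    print(f_ref, divider_ratio(600_000_000, f_ref))

# A ratio whose denominator is not 1 cannot be built with a plain
# integer divider in the feedback path.
ratio = divider_ratio(600_000_000, 80_000_000)
print(ratio, ratio.denominator == 1)
```

Using exact fractions makes it easy to see which reference frequencies lead to an integer divider and which do not.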

2.5 Characteristics of Clock Distribution Network

Clock distribution is very crucial for any digital system. Ideally, clock signals

should have zero skew, zero jitter, negligible rise and fall times and specified duty cycles.

But in reality, clock signals have non-zero skews, non-zero jitter, considerable rise and

fall times and varying duty cycles. Power consumption is another important performance

metric of any clock distribution network, as it may take a large portion of the total power

consumption of the entire chip. As the clock frequency increases, clock inaccuracy is

occupying a considerable percentage of the clock period. In present-day microprocessors,

clock skews take up as much as 10% of the available clock cycle time [22]. Thus, clock

distribution networks can be modeled for various characteristics like clock skew, clock

jitter, rise and fall times and variations in duty cycles, at different driving points in the

distribution network. The characteristic of clock distribution considered in this research is

clock skew.

2.5.1 Clock Skew

An ideal clock is defined as a signal which arrives at different register inputs at

the same time. But due to static mismatches in the clock paths and the clock load

variations, clocks are not ideal. The absolute delay of any clock path is not that important

compared to the relative arrival of the clock at different points in the circuit. Clock skew

is defined as the spatial variation in arrival time of the clock at different clock terminals

in the circuit. Clock skew results in a phase shift. Clock skew is one of the important

performance limiting factors in the system performance. Usually, the circuit designers


have a clock skew budget to meet, beyond which the system will not have correct

functionality at a desired frequency.

The clock distributions usually target zero or minimum skew for efficient

performance. Zero skew is obtained when the phase delays of all the clock terminals from

the clock source calculated with a delay model, like the Elmore delay model, are equal

under ideal process condition. Clock skew results mostly from the different delays

associated with the clock buffers present on chip. The common design technique used to

reduce the skew is to equalize the capacitive load of clock signal as seen by each clock

buffer.
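The Elmore delay mentioned above can be computed for a simple RC ladder, where each capacitor is charged through all the resistance between it and the driver. The Python sketch below covers only a chain without side branches, and the segment values are hypothetical.

```python
def elmore_delay(segments):
    """Elmore delay at the far end of an RC ladder.

    segments: list of (R_ohm, C_farad) from the driver towards the sink.
    Each capacitor is charged through all the resistance upstream of it,
    so the delay is sum_k(C_k * sum_{j<=k} R_j).
    """
    delay = 0.0
    upstream_r = 0.0
    for r, c in segments:
        upstream_r += r
        delay += upstream_r * c
    return delay

# Hypothetical wire segments: 100 ohm and 0.1 pF each.
branch_a = [(100.0, 1e-13)] * 2   # shorter path to one clock terminal
branch_b = [(100.0, 1e-13)] * 3   # longer path to another terminal
skew = elmore_delay(branch_b) - elmore_delay(branch_a)
print(elmore_delay(branch_a), elmore_delay(branch_b), skew)
```

For these values the two branches come out at about 30 ps and 60 ps, so the skew under this model is about 30 ps. Zero skew requires the two results to be equal.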

Skew can be caused either by a systematic effect which is predictable or a random

effect which is not predictable. Some of the reasons for skew include variations in

effective channel lengths of devices, Inter-layer dielectric (ILD) thickness variations,

process variations, threshold voltage variations, power supply voltage variations and

temperature variations across the die, as well as design errors and capacitive coupling
in the circuit.



CHAPTER 3

3.1 CLOCK DISTRIBUTION

The quality of the clock signals is the most important factor for ensuring a chip’s

successful operation. In a design net list, there are hundreds of thousands or millions of

cells. Those cells can be classified as two types: combinational cells and sequential cells

(including memories). The sequential cells are used for storing information and they must

operate on clocks. After the placement stage of the design implementation process, all of

the cells, including the sequential cells, are spread around the entire chip. The task of

clock distribution is to distribute the clock signals to all of these sequential cells. This

work is commonly called clock tree synthesis. Figure 6 shows the principal idea of how
a clock tree is constructed. As depicted, a clock network may be constructed

in tree fashion. Starting from the clock source, the first level of clock buffers are laid out,

then the second level, then the third level, and so on. In most designs, there are many

clock domains, and each domain has hundreds or thousands of sequential cells attached to

it. This many cells cannot be driven by a single buffer from the clock source, even with

the strongest buffer in the library.

A tree structure is used to deal with this problem by letting each buffer drive only

the number of loads that it is allowed to drive. As a result, the quality of the clock
signal, in terms of slew (the rise and fall times of the clock edges), is not
significantly degraded when it reaches the leaf sequential cells. Figure 7 shows the
commonly used clock tree structures in clock distribution networks: trunk, branch-tree,
mesh, X-tree and H-tree. Figure 8 is an example of how a real clock tree looks in a
design block. In

this simple example, there is one level of clock buffers between the clock root and the

leaves. Another type of clock distribution network is the clock grid. In this approach, a

grid of metal structure, which covers the entire chip, is dedicated to the distribution of

clock signals, as graphically shown in Figure 3.1.
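The level-by-level construction of a buffered clock tree can be illustrated by counting how many buffer levels are needed when no driver may exceed a given fanout. The Python sketch below assumes a uniform fanout limit for the clock source and for every buffer, which is a simplification. The function name and the numbers are hypothetical.

```python
def buffer_levels(num_sinks, max_fanout):
    """Number of buffer levels needed so that no buffer (and not the
    clock source) drives more than max_fanout loads."""
    if num_sinks < 1 or max_fanout < 2:
        raise ValueError("need num_sinks >= 1 and max_fanout >= 2")
    levels = 0
    loads = num_sinks                      # loads that still need a driver
    while loads > max_fanout:
        loads = -(-loads // max_fanout)    # ceiling division: buffers at this level
        levels += 1
    return levels

print(buffer_levels(1000, 10))   # 2
print(buffer_levels(8, 10))      # 0
```

For 1000 sequential cells and a fanout limit of 10, the source drives 10 buffers, each of which drives 10 buffers, each of which drives 10 cells. A real clock tree synthesis tool also accounts for wire capacitance and placement, which this count ignores.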


Figure 6 A basic clock tree.


Figure 7 Commonly used tree structures in clock distribution networks.


Figure 8 An example of a clock tree in chip design.

A tree structure usually consumes less wiring and thus less capacitance and less

routing resources, which results in lower power and less latency. However, a tree

structure must be carefully tuned and it is very load (placement) dependent. In contrast, a

grid structure uses significantly more routing resources and thus has large capacitance

and large latency, but it tends to be less load dependent as any leaf cell can always find a

nearby tapping point to connect to directly.

As a result, a grid structure clock distribution network is typically used only for

high-end applications, such as microprocessors, whereas a tree structure is widely used

for ASIC-based designs. The clock distribution network consumes more than 10% of the

total power used by the chip in large designs. During each clock cycle, the capacitance

associated with the entire clock structure must be charged to the supply voltage and

subsequently dumped to ground, with the stored energy lost as heat. To ease this


problem, resonant clock distribution has been actively studied by some groups. In this

method, the traditional tree- or grid-driven clock structure is augmented with on-chip

inductors to resonate with the clock capacitance at the clock’s fundamental frequency.

The energy of the fundamental frequency resonates back and forth between its electric

and magnetic form rather than being dissipated as heat. The clock driver is only used for

adding the energy lost during the operation. This idea is depicted in Figure 3.3.
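The inductance needed for resonance follows from the standard LC relation f = 1 / (2 * pi * sqrt(L * C)). The Python sketch below solves it for L and checks the result by converting back to a frequency. The clock capacitance and frequency are hypothetical, and losses in the inductor and wiring are ignored.

```python
import math

def resonant_inductance_h(c_farads, freq_hz):
    """Inductance that resonates with c_farads at freq_hz:
    f = 1 / (2 * pi * sqrt(L * C))  =>  L = 1 / ((2 * pi * f)^2 * C)."""
    return 1.0 / ((2.0 * math.pi * freq_hz) ** 2 * c_farads)

def resonant_frequency_hz(l_henries, c_farads):
    """Resonant frequency of an ideal LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henries * c_farads))

# Hypothetical clock network: 1 nF of total capacitance at 1 GHz.
inductance = resonant_inductance_h(1e-9, 1e9)
print(inductance)                                # roughly 2.5e-11 H (about 25 pH)
print(resonant_frequency_hz(inductance, 1e-9))   # close to 1e9 Hz
```

The small inductance value gives an idea of why on-chip inductors are a practical option at high clock frequencies.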

3.2 The Key Requirements For Constructing A Clock Tree.

The key requirements for constructing a clock tree are clock skew and insertion
delay. Clock skew is the maximum difference among the arrival times at the leaf cells in
a clock domain. Figure 4.41 shows the result of a SPICE analysis of a clock tree. A clock
pulse is injected into the clock tree at time 0 ns with a rise time of 1 ns. After
traveling through the tree, the clock signal arrives at the leaves (also called clock
sinks) at approximately 3.4 ns. However, the arrival times at the leaves are not the
same, because the leaf cells are at different physical locations. They spread over a
range of approximately 1 ns, which is defined as the clock skew. In other words, the
existence of skew means that not all of the sequential cells in a particular clock
domain receive their clock signals at exactly the same moment, as desired. Clock skew is
significant because it eats into the time budget assigned for logic operations. If skew
exceeds the desired budget, the chip might not function correctly at its designed speed
(a setup violation), or might not function at all (a hold violation).

Clock tree insertion delay is the time difference between the clock signal leaving the
source and the clock signal being received at the leaf cells. The concept of insertion
delay is also depicted in Figure 4.41. Insertion delay is important because the designer
might need to balance clock tree delays between different clock domains for cross-domain
information exchange. Insertion delay also impacts I/O timing constraints. These
scenarios are shown in Figure 4.42, where the insertion delays of CLK1_TREE and
CLK2_TREE must be balanced for the proper exchange of data between the logic cells of
the two domains. For the CLK2 domain, the value of the insertion delay must be known so
that the communication between I/O cells (DATA_IN, DATA_OUT) and logic cells can be
carried out safely.
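The way skew leads to setup and hold violations can be shown with the usual slack checks between a launching and a capturing sequential cell. The Python sketch below is a simplified single-path check. All timing values are hypothetical and given in picoseconds, loosely based on the 3.4 ns insertion delay and 1 ns skew of the example above.

```python
def setup_slack(period, t_clk_launch, t_clk_capture, t_cq, t_logic_max, t_setup):
    """Positive slack: data launched at one edge is stable t_setup before
    the next edge arrives at the capturing cell."""
    data_arrival = t_clk_launch + t_cq + t_logic_max
    required = period + t_clk_capture - t_setup
    return required - data_arrival

def hold_slack(t_clk_launch, t_clk_capture, t_cq, t_logic_min, t_hold):
    """Positive slack: new data does not overwrite the old value until
    t_hold after the same edge arrives at the capturing cell."""
    data_arrival = t_clk_launch + t_cq + t_logic_min
    required = t_clk_capture + t_hold
    return data_arrival - required

# Balanced tree: both cells receive the clock 3400 ps after the source.
print(setup_slack(5000, 3400, 3400, 300, 4000, 200))   # 500
print(hold_slack(3400, 3400, 300, 500, 100))           # 700
# 1000 ps of skew: capture clock early hurts setup, capture clock late hurts hold.
print(setup_slack(5000, 3900, 2900, 300, 4000, 200))   # -500
print(hold_slack(2900, 3900, 300, 500, 100))           # -300
```

The hold check does not contain the clock period, so a hold violation caused by skew cannot be fixed by slowing the clock. This matches the observation that the chip might not function at all.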


3.3 Difference between Time skew and length skew in a clock Tree

Clock tree synthesis is a crucial step in a chip’s physical design. The quality of the clock tree has a great impact on timing closure. One of the critical metrics of clock tree quality is the time skew, which is the maximum difference in arrival time among the clock sinks. Physically, the time skew is caused by the different locations of the clock sinks on the chip. Figure 3.5 is an abstract view of the physical locations of a clock tree’s leaf cells. Figure 3.6 presents the same information in a real layout. As seen, the clock sinks are spread within a certain region, so the physical distances from the clock source to the various clock sinks differ. Hence, when the connections are completed by metal routing, the wire lengths are not the same. The maximum wire length difference is referred to as length skew.

Physically, the clock tree is composed of clock buffers and routing wires. Therefore, the time delay from the clock source to any clock sink is affected by two factors: gate delay and wire delay. Since these two types of delay scale differently across process, temperature, and voltage (PTV) conditions, a clock tree that is time-balanced in one PTV corner might experience significant time skew in another PTV corner if it is constructed with a considerable amount of length skew. This scenario worsens as the process geometry becomes smaller, because wire delay carries more weight in the total delay. Ideally, among the different branches of a clock tree, gate delay should be matched with gate delay and wire delay with wire delay. In other words, time skew should be minimized by minimizing the length skew, so that the time skew is preserved across different PTV conditions. This is especially helpful for on-chip variation (OCV) optimization.

Figure 3.7 depicts the relationship between time skew and length skew for the same clock tree as in Figure 3.5. As shown in this space-time plot, the clock tree has six levels. Any vertical line in the plot represents a gate delay, since a gate adds time delay but no length. Wire delays appear as nearly horizontal lines, which have a large length difference but a small time difference. At Level 4 and Level 5, the clock tree starts to grow different branches; consequently, length skew appears at these levels. The time skew for this tree is ~30 ps, whereas the length skew is approximately 250 µm. Figure 3.7 is the same space-time relationship in a three-dimensional (3D) world. Figure 3.8 is the 3D plot of a very large clock tree with 23,942 sinks.

The time skew discussed above is called global skew, which is usually pessimistic. A more specific term, local skew, is defined as the difference in arrival time of the clock signal at sinks that exchange data with one another. Local skew is more precise and more useful for circuit analysis, but extracting the information needed to compute it is beyond the capability of current tools.
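The effect of length skew across corners can be illustrated with a small simulation-only VHDL sketch. It assumes a hypothetical tree with only two branches, one buffer-heavy and one wire-heavy, and made-up delay values and corner scale factors; none of these numbers come from the clock tree in the figures. The two branches are balanced in the nominal corner, but skew appears as soon as gate delay and wire delay scale differently.

```vhdl
-- Simulation-only sketch. All delays and scale factors are illustrative
-- assumptions. The simulator resolution must be 1 ps or finer.
entity skew_corners_tb is
end entity skew_corners_tb;

architecture sim of skew_corners_tb is
  -- Branch A is buffer-heavy, branch B is wire-heavy (large length skew).
  constant GATE_A : time := 200 ps;
  constant WIRE_A : time := 100 ps;
  constant GATE_B : time := 100 ps;
  constant WIRE_B : time := 200 ps;

  -- Insertion delay of one branch for given gate and wire scale factors.
  function insertion_delay (gate, wire             : time;
                            gate_scale, wire_scale : real) return time is
  begin
    return gate * gate_scale + wire * wire_scale;
  end function insertion_delay;

  -- Time skew between the two branches in one corner.
  function time_skew (gate_scale, wire_scale : real) return time is
  begin
    return abs (insertion_delay(GATE_A, WIRE_A, gate_scale, wire_scale)
              - insertion_delay(GATE_B, WIRE_B, gate_scale, wire_scale));
  end function time_skew;
begin
  report_skew : process
  begin
    -- Nominal corner: both branches take 300 ps, so the tree looks balanced.
    report "nominal corner skew: " & time'image(time_skew(1.0, 1.0));
    -- Assumed slow corner: gates slow down more than wires, so skew appears.
    report "slow corner skew: " & time'image(time_skew(1.5, 1.1));
    wait;  -- report once, then stop
  end process report_skew;
end architecture sim;
```

If both branches had the same split between gate delay and wire delay, the two corners would report the same skew, which is the argument for minimizing length skew.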

Figure 9 Cell based ASIC design methodology.


Figure 10 Abstract view of the physical distribution of a clock sink.


Figure 11 Layout view of the physical distribution of a clock sink.


Figure 12 The clock tree in a three-dimensional space–time plot.


Figure 13 A large clock tree of 23,942 sinks.


CHAPTER 4

4.1 VLSI

4.1.1 Introduction

Very-large-scale integration (VLSI) is the process of creating integrated

circuits by combining thousands of transistor-based circuits into a single chip. VLSI began in

the 1970s when complex semiconductor and communication technologies were being

developed. The microprocessor is a VLSI device. The term is no longer as common as it once

was, as chips have increased in complexity into the hundreds of millions of transistors.

4.1.2 Overview

The first semiconductor chips held one transistor each. Subsequent advances

added more and more transistors, and, as a consequence, more individual functions or

systems were integrated over time. The first integrated circuits held only a few devices,

perhaps as many as ten diodes, transistors, resistors and capacitors, making it possible to

fabricate one or more logic gates on a single device. This level is now known retrospectively as "small-scale integration" (SSI). Improvements in technique led to devices with hundreds of logic gates, known as medium-scale integration (MSI), and further improvements led to large-scale integration (LSI), i.e. systems with at least a thousand logic gates.

Current technology has moved far past this mark and today's microprocessors have many

millions of gates and hundreds of millions of individual transistors.

At one time, there was an effort to name and calibrate various levels of large-scale

integration above VLSI. Terms like Ultra-large-scale Integration (ULSI) were used. But the

huge number of gates and transistors available on common devices has rendered such fine

distinctions moot. Terms suggesting greater than VLSI levels of integration are no longer in

widespread use. Even VLSI is now somewhat quaint, given the common assumption that all

microprocessors are VLSI or better.

As of early 2008, billion-transistor processors are commercially available, an example of which is Intel’s Montecito Itanium chip. Such devices are expected to become more commonplace as semiconductor fabrication moves from the current generation of 65 nm processes to the next 45 nm generation (while experiencing new challenges such as increased variation across process corners). Another notable example is NVIDIA’s 280 series GPU. This processor is unusual in that its 1.4 billion transistors, capable of a teraflop of performance, are almost entirely dedicated to logic (Itanium’s transistor count is largely due to its 24 MB L3 cache).

Current designs, as opposed to the earliest devices, use extensive design automation and automated logic synthesis to lay out the transistors, enabling higher levels of complexity in the resulting logic functionality. Certain high-performance logic blocks like the SRAM cell, however, are still designed by hand to ensure the highest efficiency (sometimes by bending or breaking established design rules, trading stability for the last bit of performance).

4.1.3 What is VLSI?

VLSI stands for "Very Large Scale Integration". This is the field which involves

packing more and more logic devices into smaller and smaller areas.

1. Simply put, an integrated circuit is many transistors on one chip.

2. VLSI is the design and manufacturing of extremely small, complex circuitry using modified semiconductor material.

3. An integrated circuit (IC) may contain millions of transistors, each a few µm in size.

4. Applications are wide ranging: most electronic logic devices.

4.1.4 History of Scale Integration

late 1940s     Transistor invented at Bell Labs
late 1950s     First IC (JK-FF by Jack Kilby at TI)
early 1960s    Small Scale Integration (SSI): 10s of transistors on a chip
late 1960s     Medium Scale Integration (MSI): 100s of transistors on a chip
early 1970s    Large Scale Integration (LSI): 1000s of transistors on a chip
early 1980s    VLSI: 10,000s of transistors on a chip (later 100,000s and now 1,000,000s)

Ultra LSI is sometimes used for 1,000,000s.

SSI  - Small-Scale Integration (0-10^2)
MSI  - Medium-Scale Integration (10^2-10^3)
LSI  - Large-Scale Integration (10^3-10^5)
VLSI - Very Large-Scale Integration (10^5-10^7)
ULSI - Ultra Large-Scale Integration (>=10^7)

4.1.5 Advantages of ICs over discrete components

While we will concentrate on integrated circuits, the properties of integrated circuits (what we can and cannot efficiently put in an integrated circuit) largely determine the architecture of the entire system. Integrated circuits improve system characteristics in several critical ways. ICs have three key advantages over digital circuits built from discrete components:

Size. Integrated circuits are much smaller: both transistors and wires are shrunk to micrometer sizes, compared to the millimetre or centimetre scales of discrete components. Small size leads to advantages in speed and power consumption, since smaller components have smaller parasitic resistances, capacitances, and inductances.

Speed. Signals can be switched between logic 0 and logic 1 much more quickly within a chip than between chips. Communication within a chip can occur hundreds of times faster than communication between chips on a printed circuit board. The high speed of on-chip circuits is due to their small size: smaller components and wires have smaller parasitic capacitances to slow down the signal.

Power consumption. Logic operations within a chip also take much less power. Once again, lower power consumption is largely due to the small size of circuits on the chip: smaller parasitic capacitances and resistances require less power to drive them.

VLSI and systems

These advantages of integrated circuits translate into advantages at the system level:


Smaller physical size. Smallness is often an advantage in itself; consider portable televisions or handheld cellular telephones.

Lower power consumption. Replacing a handful of standard parts with a single chip reduces total power consumption. Reducing power consumption has a ripple effect on the rest of the system: a smaller, cheaper power supply can be used; since less power consumption means less heat, a fan may no longer be necessary; a simpler cabinet with less electromagnetic shielding may be feasible, too.

Reduced cost. Reducing the number of components, the power supply requirements, cabinet

costs, and so on, will inevitably reduce system cost. The ripple effect of integration is such

that the cost of a system built from custom ICs can be less, even though the individual ICs

cost more than the standard parts they replace.

Understanding why integrated circuit technology has such profound influence on the

design of digital systems requires understanding both the technology of IC manufacturing and

the economics of ICs and digital systems.

Applications

Electronic systems in cars

Digital electronics that control VCRs

Transaction processing systems such as ATMs

Personal computers and workstations

Medical electronic systems

4.1.6 Applications of VLSI

Electronic systems now perform a wide variety of tasks in daily life. Electronic

systems in some cases have replaced mechanisms that operated mechanically, hydraulically,

or by other means; electronics are usually smaller, more flexible, and easier to service. In

other cases electronic systems have created totally new applications. Electronic systems

perform a variety of tasks, some of them visible, some more hidden:

• Personal entertainment systems such as portable MP3 players and DVD players perform sophisticated algorithms with remarkably little energy.

• Electronic systems in cars operate stereo systems and displays; they also control fuel injection systems, adjust suspensions to varying terrain, and perform the control functions required for anti-lock braking (ABS) systems.

• Digital electronics compress and decompress video, even at high-definition data rates, on the fly in consumer electronics.

• Low-cost terminals for Web browsing still require sophisticated electronics, despite their dedicated function.

• Personal computers and workstations provide word-processing, financial analysis, and games. Computers include both central processing units (CPUs) and special-purpose hardware for disk access, faster screen display, etc.

• Medical electronic systems measure bodily functions and perform complex processing algorithms to warn about unusual conditions.

The availability of these complex systems, far from overwhelming consumers, only creates demand for even more complex systems.

The growing sophistication of applications continually pushes the design and manufacturing of integrated circuits and electronic systems to new levels of complexity. Perhaps the most amazing characteristic of this collection of systems is its variety: as systems become more complex, we build not a few general-purpose computers but an ever wider range of special-purpose systems. Our ability to do so is a testament to our growing mastery of both integrated circuit manufacturing and design, but the increasing demands of customers continue to test the limits of design and manufacturing.



4.2 VHDL

4.2.1 Introduction

VHDL is an acronym for Very High Speed Integrated Circuit (VHSIC) Hardware Description Language. The language can be used to model a digital system at many levels of abstraction, ranging from the algorithmic level to the gate level. The complexity of the digital system being modeled can vary from that of a simple gate to a complete digital electronic system. The VHDL language can be regarded as an integrated amalgamation of sequential, concurrent, netlist and waveform generation languages and timing specifications.

4.2.2 History of VHDL

VHDL stands for VHSIC (Very High Speed Integrated Circuit) Hardware Description Language. It was developed in the 1980s as a spin-off of a high-speed integrated circuit research project funded by the US Department of Defense. During the VHSIC program, researchers were confronted with the daunting task of describing circuits of enormous scale (for their time) and of managing very large circuit design problems that involved multiple teams of engineers. With only gate-level tools available, it soon became clear that more structured design methods and tools would be needed.

To meet this challenge, teams of engineers from three companies (IBM, Texas Instruments and Intermetrics) were contracted by the Department of Defense to complete the specification and implementation of a new language-based design description method. The first publicly available version of VHDL, version 7.2, was released in 1985. In 1986, the IEEE was presented with a proposal to standardize the language, which it did in 1987 after enhancements and modifications made by a team of commercial, government and academic representatives. The resulting standard, IEEE 1076-1987, is the basis for virtually every simulation and synthesis product sold today. An enhanced and updated version of the language, IEEE 1076-1993, was released in 1994, and VHDL tool vendors have been responding by adding these new language features to their products.

Although IEEE standard 1076 defines the complete VHDL language, there are aspects of the language that make it difficult to write completely portable design descriptions (descriptions that can be simulated identically using different vendors’ tools). The problem stems from the fact that VHDL supports many abstract data types, but it does not address the simple problem of characterizing different signal strengths or commonly used simulation conditions such as unknowns and high impedances. Soon after IEEE 1076-1987 [3] was adopted, simulator companies began enhancing VHDL with new non-standard types to allow their customers to accurately simulate complex electronic circuits. This caused problems because design descriptions entered into one simulator were often incompatible with other environments. VHDL was quickly becoming a non-standard.

To get around the problem of non-standard data types, an IEEE committee adopted another standard. This standard, numbered 1164, defines a standard package (a VHDL feature that allows commonly used declarations to be collected into an external library) containing definitions for a standard nine-value data type. This standard data type is called standard logic (std_logic), and the IEEE 1164 package is often referred to as the standard logic package.
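As a minimal sketch of why the nine-value type matters, the tri-state buffer below drives the high-impedance value 'Z', one of the conditions that the original abstract types could not express. The entity and port names are illustrative assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- the IEEE 1164 standard logic package

-- std_logic takes one of nine values:
-- 'U' uninitialized, 'X' forcing unknown, '0', '1', 'Z' high impedance,
-- 'W' weak unknown, 'L' weak 0, 'H' weak 1, '-' don't care.
entity tristate_buffer is
  port (
    data_in  : in  std_logic;
    enable   : in  std_logic;
    data_out : out std_logic
  );
end entity tristate_buffer;

architecture dataflow of tristate_buffer is
begin
  -- Drive the input through when enabled, otherwise release the line.
  data_out <= data_in when enable = '1' else 'Z';
end architecture dataflow;
```

Because std_logic is a resolved type, several such buffers can share one signal, and the simulator reports 'X' if two of them drive conflicting values at the same time.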


The IEEE 1076-1987 and IEEE 1164 standards together form the complete VHDL standard in widest use today (IEEE 1076-1993 is slowly working its way into the VHDL mainstream, but it does not add a significant number of features for synthesis users).

In the search for a standard design and documentation tool for the Very High Speed Integrated Circuits (VHSIC) program, the United States Department of Defense (DoD) sponsored a workshop on HDLs at Woods Hole, Massachusetts, in the summer of 1981. The conclusion of the workshop was the need for a standard language, together with the features that such a standard might require. In 1983, DoD established requirements for a standard VHSIC hardware description language (VHDL), based on the recommendations of the Woods Hole workshop. A contract for the development of the VHDL language, its environment, and its software was awarded to IBM, Texas Instruments and Intermetrics. VHDL 2.0 was released only six months after the project began. The language was significantly improved thereafter and other shortcomings were corrected, leading to the release of VHDL 6.0. In 1985, these developments led to the release of the VHDL 7.2 language reference manual. This was later developed into the IEEE 1076/A VHDL language reference manual.

Efforts to define the new version of VHDL started in 1990 with a team of volunteers working under the IEEE DASC (Design Automation Standards Committee). In October of 1992, the new VHDL’93 was completed and released for review. After minor modifications, this new version was approved by the VHDL balloting group members and became the new VHDL language standard. The present VHDL standard is formally referred to as VHDL 1076-1993.



4.2.3 Levels of abstraction (Styles)

VHDL supports many possible styles of design description. These styles differ

primarily in how closely they relate to the underlying hardware. When we speak of the

different styles of VHDL, then, we are really talking about the differing levels of abstraction

possible using the language. To give an example, it is possible to describe a counter circuit in

a number of ways. At the lowest level of abstraction, you could use VHDL's hierarchy

features to connect a sequence of predefined logic gates and flip-flops to form a counter circuit.

Figure 14 Levels of abstraction

The highest level of abstraction supported in VHDL is called the behavioural level of abstraction. When creating a behavioural description of a circuit, you describe the circuit in terms of its operation over time. The concept of time is the critical distinction between behavioural descriptions of circuits and lower-level descriptions.

In a behavioural description, the concept of time may be expressed precisely, with actual delays between related events, or may simply be an ordering of operations that are expressed sequentially. When you are writing VHDL for input to synthesis tools, you may use behavioural statements in VHDL to imply that there are registers in your circuit. It is unlikely, however, that your synthesis tool will be capable of creating precisely the same behaviour in actual circuitry as you have defined in the language.


If you are familiar with event-driven software programming languages then writing

behaviour level VHDL will not seem like anything new. Just like a programming language,

you will be writing one or more small programs that operate sequentially and communicate

with one another through their interfaces. The only difference between behaviour-level

VHDL and a software programming language such as Visual Basic is the underlying

execution platform: in the case of Visual Basic, it is the Windows operating system; in the

case of VHDL, it is a simulator.
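A behavioural description of the counter mentioned earlier might look like the following sketch. The entity name, port names and the 4-bit width are assumptions made for illustration. The process reads like a small sequential program, and the clocked "if" implies the registers.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity counter4 is
  port (
    clk   : in  std_logic;
    reset : in  std_logic;  -- synchronous, active high
    count : out std_logic_vector(3 downto 0)
  );
end entity counter4;

architecture behavioural of counter4 is
  signal count_reg : unsigned(3 downto 0) := (others => '0');
begin
  -- The statements inside the process execute in order on each clock edge.
  count_proc : process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        count_reg <= (others => '0');
      else
        count_reg <= count_reg + 1;  -- wraps from 15 back to 0
      end if;
    end if;
  end process count_proc;

  count <= std_logic_vector(count_reg);
end architecture behavioural;
```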

An alternate design method, in which a circuit design problem is segmented into

registers and combinational input logic, is what is often called the dataflow level of

abstraction. Dataflow is an intermediate level of abstraction that allows the drudgery of

combinational logic to be hidden while the more important parts of the circuit, the registers,

are more completely specified.

There are some drawbacks to using a purely dataflow method of design in VHDL.

First, there are no built-in registers in VHDL; the language was designed to be general-purpose, and VHDL’s designers placed the emphasis on its behavioural aspects. If you are

going to write VHDL at the dataflow level of abstraction, then you must first create

behavioural descriptions of the register elements that you will be using in your design. These

elements must be provided in the form of components or in the form of subprograms.

But for hardware designers, for whom it can be difficult to relate the sequential

descriptions and operation of behavioural VHDL with the hardware that is being described,

using the dataflow level of abstraction can make quite a lot of sense. Using dataflow, it can

be easier to relate a design description to actual hardware devices.
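The dataflow approach can be sketched as follows, assuming a hypothetical 2-bit counter. The register element dff is first described behaviourally, as the text requires, and the counter itself is then written as registers plus concurrent assignments for the combinational next-state logic. All names are illustrative.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Register element, described behaviourally once and then reused.
entity dff is
  port (
    clk   : in  std_logic;
    reset : in  std_logic;  -- synchronous, active high
    d     : in  std_logic;
    q     : out std_logic
  );
end entity dff;

architecture behavioural of dff is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        q <= '0';
      else
        q <= d;
      end if;
    end if;
  end process;
end architecture behavioural;

library ieee;
use ieee.std_logic_1164.all;

entity counter2 is
  port (
    clk, reset, enable : in  std_logic;
    count              : out std_logic_vector(1 downto 0)
  );
end entity counter2;

architecture dataflow of counter2 is
  signal q, d : std_logic_vector(1 downto 0);
begin
  -- Combinational next-state logic as concurrent signal assignments.
  d(0) <= q(0) xor enable;
  d(1) <= q(1) xor (q(0) and enable);

  -- The registers are explicit, so the mapping to hardware is direct.
  reg0 : entity work.dff port map (clk => clk, reset => reset, d => d(0), q => q(0));
  reg1 : entity work.dff port map (clk => clk, reset => reset, d => d(1), q => q(1));

  count <= q;
end architecture dataflow;
```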



The dataflow and behaviour levels of abstraction are used to describe circuits in terms

of their logical function. There is a third style of VHDL that is used to combine such

descriptions together into a larger, hierarchical circuit description.

Structural VHDL allows you to encapsulate one part of a design description as a re-

usable component. Structural VHDL can be thought of as being analogous to a textual

schematic, or as a textual block diagram for higher-level design.

4.2.4 Need for VHDL

The complex and laborious manual procedures for hardware design have paved the way for the development of languages for high-level description of digital systems. Such a high-level description can serve as documentation for the part as well as an entry point into the design process. Using synthesis tools, the high-level description can be processed into various implementations, such as boards or gate arrays. A hardware description language is such a language. VHDL was designed as a solution that provides integrated design and documentation, and that communicates design data between various levels of abstraction.

4.2.5 Advantages of VHDL

VHDL allows quick description and synthesis of circuits of 5, 10 or 20 thousand gates. The following are the major advantages of VHDL over other hardware description languages:

• Power and flexibility: VHDL has powerful language constructs that allow complex control logic to be described in code.

• Device-independent design: VHDL creates designs that fit into many device architectures, and it also permits multiple styles of design description.

• Portability: VHDL’s portability permits the design description to be used on different simulators and synthesis tools. Thus VHDL design descriptions can be used in multiple projects.

• ASIC migration: The efficiency of VHDL allows a design to be synthesized on a CPLD or an FPGA. Sometimes the same code can also be used for an ASIC.

• Quick time to market and low cost: VHDL and programmable logic together facilitate a speedy design process. VHDL permits designs to be described quickly, while programmable logic eliminates expenses and facilitates quick design iterations.

• The language can be used as a communication medium between different Computer Aided Design (CAD) and Computer Aided Engineering (CAE) tools.

• The language supports hierarchy, i.e., a digital system can be modelled as a set of interconnected components; each component, in turn, can be modelled as a set of interconnected subcomponents.

• The language supports flexible design methodologies: top-down, bottom-up, or mixed.

• The language is technology independent, and hence the same behaviour model can be synthesized into different vendor libraries.

• Various digital modelling techniques, such as finite-state machine descriptions, algorithmic descriptions and Boolean equations, can be modelled using the language.

• It supports both synchronous and asynchronous timing models.

• It is an IEEE and ANSI standard, and therefore models described using this language are portable.

• The language imposes no limitations on the size of the design.

• The language has elements that make large-scale design modelling easier, for example components, functions, procedures and packages.

• Test benches can be written in the same language to test other VHDL models.

• Nominal propagation delays, min-max delays, setup and hold timing, timing constraints, and spike detection can all be described very naturally in this language.

• Behavioural models that conform to a certain synthesis description style are capable of being synthesized to a gate-level description.

• The capability of defining new data types provides the power to describe and simulate a new design technique at a very high level of abstraction without any concern about implementation details.
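Two of the points above, propagation delays and test benches written in the same language, can be sketched together. The gate name, the 2 ns delay and the test times below are assumptions chosen only for illustration; the "after" clause is meaningful in simulation and is normally ignored by synthesis tools.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- AND gate with a nominal propagation delay passed in as a generic.
entity and2_delay is
  generic (T_PD : time := 2 ns);  -- assumed delay value
  port (a, b : in std_logic; y : out std_logic);
end entity and2_delay;

architecture dataflow of and2_delay is
begin
  y <= a and b after T_PD;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: an entity without ports that drives and checks the gate.
entity and2_delay_tb is
end entity and2_delay_tb;

architecture test of and2_delay_tb is
  signal a, b : std_logic := '0';
  signal y    : std_logic;
begin
  dut : entity work.and2_delay
    generic map (T_PD => 2 ns)
    port map (a => a, b => b, y => y);

  stimulus : process
  begin
    wait for 5 ns;             -- let the output settle to '0'
    a <= '1';
    b <= '1';
    wait for 1 ns;             -- 1 ns after the inputs changed
    assert y = '0'
      report "output changed before the propagation delay" severity error;
    wait for 2 ns;             -- 3 ns after the inputs changed
    assert y = '1'
      report "output did not follow the inputs" severity error;
    wait;                      -- end of test
  end process stimulus;
end architecture test;
```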

4.2.6 Design methodology using VHDL


There are three design methodologies, namely bottom-up, top-down and flat.

• The bottom-up approach involves defining and designing the individual components, then bringing them together to form the overall design.

• In a flat design, the functional components are defined at the same level as the interconnection of those functional components.

• A top-down design process uses a divide-and-conquer approach to implement the design of a large system. Top-down design is the recursive partitioning of a system into its sub-components until all sub-components become manageable design parts. The design of a component is manageable if the component is available as part of a library, if it can be implemented by modifying an already available part, or if it can be described for a synthesis program or an automatic hardware generator.

4.2.7 Elements of VHDL

Constructs of the VHDL language are designed for describing hardware components, for packaging parts and utilities, and for specifying design libraries and parameters. In its simplest form, the description of a component in VHDL consists of an interface specification and an architectural specification. The interface description begins with the entity keyword and contains the input-output ports of the component. An architectural specification begins with the architecture keyword and describes the functionality of the component.

This functionality depends on the input-output signals and other parameters that are specified in the interface description. Several architectural specifications with different identifiers can exist for one component with a given interface description. VHDL allows an architecture to be configured for a specific technology environment.

In a hardware design environment, it becomes necessary to group components or utilities used for the description of components. Components and such utilities can be grouped by the use of packages. A package declaration contains components and utilities that become visible to entities and architectures. VHDL allows the use of libraries and the binding of sub-components of a design to elements of various libraries. Constructs for such applications include library statements and configurations.
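As a minimal sketch of how a package groups utilities and makes them visible to an entity and its architecture, the example below defines a hypothetical package word_utils and uses it from a small design. The package name, its contents and the parity_gen entity are all assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the utilities that become visible to its users.
package word_utils is
  constant DATA_WIDTH : natural := 8;
  subtype data_word is std_logic_vector(DATA_WIDTH - 1 downto 0);
  function parity (word : data_word) return std_logic;
end package word_utils;

-- Package body: the implementation of the declared function.
package body word_utils is
  function parity (word : data_word) return std_logic is
    variable result : std_logic := '0';
  begin
    for i in word'range loop
      result := result xor word(i);
    end loop;
    return result;  -- '1' when the word holds an odd number of ones
  end function parity;
end package body word_utils;

library ieee;
use ieee.std_logic_1164.all;
use work.word_utils.all;  -- make the package contents visible

entity parity_gen is
  port (
    data : in  data_word;
    p    : out std_logic
  );
end entity parity_gen;

architecture dataflow of parity_gen is
begin
  p <= parity(data);
end architecture dataflow;
```

The library and use clauses play the role described above: "work" names the library the package was compiled into, and the use clause selects which of its declarations are visible.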


4.2.8 VHDL language features

The various building blocks and constructs in VHDL which have been used are:

4.2.8.1 Entity

Every VHDL design description consists of at least one entity. In VHDL, an entity

declaration describes the circuit as it appears from the "outside", from the perspective of its

input and output interfaces.

An entity declaration in VHDL provides the complete interface for a circuit. Using the

information provided in an entity declaration (the port names and the data type and direction

of each port), you have all the information you need to connect that portion of a circuit into

other, higher-level circuits.

The entity declaration includes a name (for example, compare) and a port statement defining all the inputs and outputs of the entity. Each of the ports is given a direction (typically in, out or inout).

• Formal Definition

It is the hardware abstraction of a digital system. Entity declaration describes the

external view of the entity to the outside world.

Simplified syntax:

entity entity-name is
    [generic (generic-list);]
    port (port-list);
end entity-name;

• Description

All designs are expressed in terms of entities. Entity is the most basic building block

in a design. The uppermost level of the design is the top-level entity. If the design is

hierarchical, then the top-level description will have lower-level descriptions contained in

it. These lower-level descriptions will be lower-level entities contained in the top-level

entity description.
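The text refers to a comparator named compare, but its declaration is not reproduced here. A plausible sketch of such an entity is given below; the generic WIDTH, the port names and the use of std_logic are assumptions, not the original example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- External view only: name, generics and ports. No behaviour is given here.
entity compare is
  generic (WIDTH : positive := 8);  -- number of bits compared (assumed)
  port (
    a, b : in  std_logic_vector(WIDTH - 1 downto 0);
    eq   : out std_logic            -- '1' when a equals b
  );
end entity compare;
```

Everything a higher-level circuit needs in order to connect this block (port names, types and directions) is contained in these few lines.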


4.2.8.2 Architecture

Every entity in a VHDL design description must be bound with a corresponding

architecture. The architecture describes the actual function of the entity to which it is bound.

Using the schematic as a metaphor, you can think of the architecture as being roughly

analogous to a lower-level schematic pointed to by the higher-level functional block symbol.

The second part of a minimal VHDL source file is the architecture declaration. Every

entity declaration you write must be accompanied by at least one corresponding architecture.

The architecture declaration begins with a unique name, followed by the name of the

entity to which the architecture is bound. Within the architecture declaration is found the

actual functional description of our comparator. There are many ways to describe

combinational logic functions in VHDL.


• Formal Definition

An architecture is a body associated with an entity declaration to describe the internal organization or operation of a design entity. An architecture body is used to describe the behavior, data flow or structure of a design entity.

• Simplified syntax

architecture architecture-name of entity-name is
    architecture-declarations
begin
    concurrent-statements
end [architecture] [architecture-name];

• Description

Architecture assigned to an entity describes internal relationship between input

and output ports of the entity. It contains of two parts: declarations and concurrent

statements. First (declarative) part of architecture may contain declarations of types, signals,

constants, subprograms (functions and procedures), components and groups.

Concurrent statements in the architecture body define the relationship between inputs

and outputs. This relationship can be specified using different types of statements:


Concurrent signal assignment, process statement, component instantiation, and concurrent

procedure call, generate statement, concurrent assertion statement, and block statement. It

can be writing in different styles: structural, dataflow, behavioral (functional) or mixed.

The description of a structural body is based on component instantiation and
generate statements. It allows creating hierarchical projects, from simple gates to very
complex components, describing entire subsystems. The connections among components are
realized through ports.

The dataflow description is built with concurrent signal assignment statements. Each
of the statements can be activated when any of its input signals changes its value.

In a behavioral description, the architecture body describes only the expected
functionality (behavior) of the circuit, without any direct indication as to the hardware
implementation. Such a description consists only of one or more processes, each of which
contains sequential statements.

The architecture body may also contain statements that define both the behavior and the
structure of the circuit at the same time. Such an architecture description is called mixed.
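The difference between the dataflow and behavioral styles can be illustrated with a minimal sketch. The entity name compare8 and its port names are assumptions made for this illustration and are not taken from the project code. Both architectures are intended to describe the same 8-bit equality comparator.

```vhdl
-- Hypothetical 8-bit equality comparator, used only for illustration.
entity compare8 is
    port (a, b : in  bit_vector(7 downto 0);
          eq   : out bit);
end entity compare8;

-- Dataflow style: a single concurrent signal assignment.
architecture dataflow of compare8 is
begin
    eq <= '1' when a = b else '0';
end architecture dataflow;

-- Behavioral style: a process containing sequential statements.
architecture behavioral of compare8 is
begin
    process (a, b)
    begin
        if a = b then
            eq <= '1';
        else
            eq <= '0';
        end if;
    end process;
end architecture behavioral;
```

Because both architectures are bound to the same entity, either one can be selected for simulation or synthesis without changing the entity interface.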

4.2.8.3 Component declaration


A component declaration declares a virtual design entity interface that may be used in
a component instantiation statement.

Simplified syntax:

component component-name
    [generic (generic-list);]
    port (port-list);
end component [component-name];


Component instantiation


A component instantiation statement defines a subcomponent of the design entity in
which it appears, associates signals or values with the ports of that subcomponent, and
associates values with generics of that subcomponent.

Simplified syntax:

label: [component] component-name
    [generic map (generic-association-list)]
    [port map (port-association-list)];
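As a hedged illustration of component declaration and instantiation, the sketch below fans one clock out through two buffers. The names clk_buf and clk_fanout, and all port names, are hypothetical and chosen only for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical leaf cell: a simple non-inverting buffer.
entity clk_buf is
    port (i : in  std_logic;
          o : out std_logic);
end entity clk_buf;

architecture rtl of clk_buf is
begin
    o <= i;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical parent: drives two loads from one clock through two buffers.
entity clk_fanout is
    port (clk_in : in  std_logic;
          clk_a  : out std_logic;
          clk_b  : out std_logic);
end entity clk_fanout;

architecture structural of clk_fanout is
    -- Component declaration: mirrors the interface of clk_buf.
    component clk_buf
        port (i : in  std_logic;
              o : out std_logic);
    end component clk_buf;
begin
    -- Component instantiations using named port association.
    buf_a : clk_buf port map (i => clk_in, o => clk_a);
    -- The optional keyword "component" may precede the component name.
    buf_b : component clk_buf port map (i => clk_in, o => clk_b);
end architecture structural;
```

Since no configuration is given here, the example relies on default binding, which associates the component with the entity of the same name in the working library.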

4.2.8.4 Configuration declaration


A configuration is a construct that defines how component instances in a given block are
bound to design entities, in order to describe how design entities are put together to form a
complete design.

Simplified syntax:

configuration configuration-name of entity-name is
    configuration-declarations
    for architecture-name
        for instance-label: component-name
            use entity library-name.entity-name(arch-name);
        end for;
    end for;
end configuration-name;
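A configuration declaration is easiest to follow on a small case. In the sketch below, the entity inv has two architectures and the configuration top_slow selects one of them for the instance u1. All names (inv, fast, slow, top, top_slow) and the 2 ns delay are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical inverter with two alternative architectures.
entity inv is
    port (a : in  std_logic;
          y : out std_logic);
end entity inv;

architecture fast of inv is
begin
    y <= not a;
end architecture fast;

architecture slow of inv is
begin
    y <= not a after 2 ns;  -- assumed delay, for simulation only
end architecture slow;

library ieee;
use ieee.std_logic_1164.all;

entity top is
    port (x : in  std_logic;
          z : out std_logic);
end entity top;

architecture structural of top is
    component inv
        port (a : in  std_logic;
              y : out std_logic);
    end component;
begin
    u1 : inv port map (a => x, y => z);
end architecture structural;

-- Bind instance u1 in architecture "structural" to work.inv(slow).
configuration top_slow of top is
    for structural
        for u1 : inv
            use entity work.inv(slow);
        end for;
    end for;
end configuration top_slow;
```

Simulating top_slow instead of top uses the delayed inverter, while the structural architecture itself stays unchanged.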


Configuration instantiation

A component instantiation statement can also name a configuration directly. As with any
component instantiation, it defines a subcomponent of the design entity in which it appears,
associates signals or values with the ports of that subcomponent, and associates values with
generics of that subcomponent.

Simplified syntax:

label: configuration configuration-name
    [generic map (generic-association-list)]
    [port map (port-association-list)];

4.2.8.5 Package


A package declaration defines the interface to a package.

Simplified syntax:

package package-name is
    package-declarations
end [package] package-name;

Package body

A package body defines the bodies of subprograms and the values of deferred
constants declared in the interface to the package.

Simplified syntax:

package body package-name is
    package-body-declarations
    subprogram-body-declarations
end [package body] package-name;


4.2.8.6 Attributes

Attributes are of two types: user-defined and predefined.

User defined

A user-defined attribute is a value, function, type, range, signal, or constant that may be
associated with one or more named entities in a description.

Simplified syntax:

attribute attribute-name : type; -- attribute declaration
attribute attribute-name of item : item-class is expression; -- attribute specification

• Description

Attributes allow retrieving information about named entities: types, objects,
subprograms, etc. Users can define new attributes and then assign them to named entities by
specifying the entity and the attribute value for it.

Predefined

A predefined attribute is likewise a value, function, type, range, signal, or constant
associated with a named entity, but it is defined by the language rather than by the user
(for example 'event, 'range and 'length).

Simplified syntax: object'attribute-name
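Both kinds of attribute can appear in one small design. The sketch below uses the predefined attributes 'event, 'range, 'high and 'low, and declares one user-defined attribute. The entity name attr_demo, the attribute name design_note and its string value are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical register that stores a bit-reversed copy of its input.
entity attr_demo is
    port (clk      : in  std_logic;
          data     : in  std_logic_vector(7 downto 0);
          reversed : out std_logic_vector(7 downto 0));
end entity attr_demo;

architecture rtl of attr_demo is
    -- User-defined attribute: declaration first, then specification.
    attribute design_note : string;
    signal rev_reg : std_logic_vector(7 downto 0) := (others => '0');
    attribute design_note of rev_reg : signal is "bit-reversed copy of data";
begin
    process (clk)
    begin
        -- clk'event is a predefined signal attribute.
        if clk'event and clk = '1' then
            -- data'range, data'high and data'low are predefined array attributes.
            for i in data'range loop
                rev_reg(i) <= data(data'high - i + data'low);
            end loop;
        end if;
    end process;

    reversed <= rev_reg;
end architecture rtl;
```

Writing the loop with data'range rather than fixed bounds keeps the loop body valid if the width of data is changed later.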

4.2.8.7 Process statement


A process statement defines an independent sequential process representing the
behaviour of some portion of the design.

Simplified syntax:

[process-label:] process [(sensitivity-list)] [is]
    process-declarations
begin
    sequential-statements
end process [process-label];
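A typical use of the process statement is a flip-flop with asynchronous reset, sketched below. The entity name dff_ar and its port names are assumptions for illustration. The sensitivity list contains only the clock and the reset, since those are the only signals whose changes can alter the output.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical D flip-flop with active-low asynchronous reset.
entity dff_ar is
    port (clk    : in  std_logic;
          nreset : in  std_logic;
          d      : in  std_logic;
          q      : out std_logic);
end entity dff_ar;

architecture behavioral of dff_ar is
begin
    reg : process (clk, nreset)
    begin
        if nreset = '0' then
            q <= '0';               -- reset takes priority over the clock
        elsif rising_edge(clk) then
            q <= d;                 -- capture d on the rising clock edge
        end if;
    end process reg;
end architecture behavioral;
```

The statements inside the process execute in order each time clk or nreset changes, and the process then suspends until the next such change.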

4.2.8.8 Function


A function is a subprogram that returns a value. A function call is a form of expression.

Simplified syntax:

function function-name (parameters) return type; -- function declaration

function function-name (parameters) return type is -- function definition
begin
    sequential-statements
end [function] function-name;
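The following sketch shows a function defined in an architecture declarative part and then called from a concurrent assignment. The names parity_gen and xor_reduce are hypothetical and used only for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical parity generator built around a local function.
entity parity_gen is
    port (data   : in  std_logic_vector(7 downto 0);
          parity : out std_logic);
end entity parity_gen;

architecture rtl of parity_gen is
    -- Function definition: XOR of all elements of the vector.
    function xor_reduce (v : std_logic_vector) return std_logic is
        variable result : std_logic := '0';
    begin
        for i in v'range loop
            result := result xor v(i);
        end loop;
        return result;
    end function xor_reduce;
begin
    -- A function call is used as an expression.
    parity <= xor_reduce(data);
end architecture rtl;
```

Because the parameter v is an unconstrained std_logic_vector, the same function can be called with vectors of any width.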

4.2.8.9 Port


A port is a channel for dynamic communication between a block and its environment.

Simplified syntax:

port (port-declaration; port-declaration; ...);

-- port declarations:
port-signal-name : in port-signal-type [:= initial-value]
port-signal-name : out port-signal-type [:= initial-value]
port-signal-name : inout port-signal-type [:= initial-value]
port-signal-name : buffer port-signal-type [:= initial-value]
port-signal-name : linkage port-signal-type [:= initial-value]

4.2.8.10 Sensitivity list


A list of signals a process is sensitive to.

Simplified syntax:

(signal-name, signal-name, ...)

4.2.8.11 Standard logic


A nine-value resolved logic type.

std_logic is not built into the VHDL language itself. It is defined in IEEE Std 1164
(package std_logic_1164).

Simplified syntax:

type std_ulogic is ('U',  -- Uninitialized
                    'X',  -- Forcing Unknown
                    '0',  -- Forcing 0
                    '1',  -- Forcing 1
                    'Z',  -- High Impedance
                    'W',  -- Weak Unknown
                    'L',  -- Weak 0
                    'H',  -- Weak 1
                    '-'); -- Don't Care

type std_ulogic_vector is array (natural range <>) of std_ulogic;

function resolved (s : std_ulogic_vector) return std_ulogic;

subtype std_logic is resolved std_ulogic;
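The effect of the resolution function can be seen when two drivers share one std_logic signal. The simulation-only sketch below is an illustration under assumed names (resolution_demo, bus_line); it is not part of the project code. One driver is in high impedance and the other drives '1', so the resolved value is '1'.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical simulation-only entity showing std_logic resolution.
entity resolution_demo is
end entity resolution_demo;

architecture sim of resolution_demo is
    signal bus_line : std_logic;
begin
    -- Two concurrent assignments create two drivers for bus_line.
    bus_line <= 'Z';  -- driver A: high impedance
    bus_line <= '1';  -- driver B: forcing 1

    check : process
    begin
        wait for 1 ns;
        -- 'Z' resolved against '1' yields '1'.
        assert bus_line = '1'
            report "unexpected resolved value on bus_line"
            severity error;
        wait;  -- suspend forever
    end process check;
end architecture sim;
```

The same two assignments on a signal of the unresolved type std_ulogic would be rejected, because that type does not permit more than one driver.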

4.2.9 Data Types

VHDL provides many data types and allows data to be represented in terms of high-level
data types. These data types can represent individual wires in a circuit, or collections of
wires using a concept called an array.

The preceding description of the comparator circuit used the data types bit and
bit_vector for its inputs and outputs. The only possible values of the bit data type are '1'
and '0'; a bit_vector is simply an array of bits. Every data type in VHDL has a defined set of
values and a defined set of valid operations. Type checking is strict, so it is not possible,
for example, to directly assign the value of an integer data type to a bit_vector data type.
(There are ways to get around this restriction, using what are called type conversion
functions.) VHDL is a rich language with many different data types.

The most common data types are listed below:

Bit: a 1-bit value representing a wire. (Note: IEEE Std 1164 defines a 9-valued
replacement for bit called std_logic.)

Bit_vector: an array of bits. (Replaced by std_logic_vector in IEEE Std 1164.)

Boolean: a True/False value.

Integer: a signed integer value, typically implemented as a 32-bit data type.

Real: a floating-point value.

Enumerated: used to create custom data types.

Record: used to group multiple elements, possibly of different data types, as a collection.

Array: can be used to create single or multiple dimension arrays.

Access: similar to pointers in C or Pascal.

File: used to read and write disk files. Useful for simulation.

Physical: used to represent values such as time, voltage, etc. using symbolic units of
measure (such as 'ns' or 'mA').
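Several of these data types, together with a type conversion, are shown in the sketch below. All names (types_demo, state_t, byte_array_t, packet_t, clk_period, start_value) are hypothetical. The conversion uses to_unsigned from the IEEE package numeric_std to turn an integer into a std_logic_vector.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity used only to show type declarations and a conversion.
entity types_demo is
    port (count_out : out std_logic_vector(7 downto 0));
end entity types_demo;

architecture rtl of types_demo is
    -- Enumerated type
    type state_t is (idle, run, done);

    -- Array type
    type byte_array_t is array (0 to 3) of std_logic_vector(7 downto 0);

    -- Record type grouping elements of different types
    type packet_t is record
        valid   : boolean;
        size    : integer range 0 to 255;
        payload : byte_array_t;
    end record;

    -- Physical type: time is predefined, with units such as ns
    constant clk_period : time := 10 ns;

    constant start_value : integer := 42;
    signal state : state_t := idle;
begin
    -- Strict typing: integer -> unsigned -> std_logic_vector.
    -- The value 42 becomes "00101010".
    count_out <= std_logic_vector(to_unsigned(start_value, 8));
end architecture rtl;
```

A direct assignment of start_value to count_out would be rejected by the compiler, which is the strict type checking described above.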

4.2.10 Packages and Package Bodies


A VHDL package declaration is identified by the package keyword, and is used to

collect commonly used declarations for use globally among different design units. You can

think of a package as being a common storage area, one used to store such things as type

declarations, constants, and global subprograms.

A package can consist of two basic parts: a package declaration and an optional

package body. Package declarations can contain the following types of statements:

Type and subtype declarations

Constant declarations

Global signal declarations

Function and procedure declarations

Attribute specifications

File declarations

Component declarations

Alias declarations

Disconnect specifications

Use clauses

Items appearing within a package declaration can be made visible to other design
units by means of a use clause.

If the package contains declarations of subprograms (functions or procedures) or

defines one or more deferred constants (constants whose value is not given), then a package

body is required in addition to the package declaration. A package body must have the same

name as its corresponding package declaration, but can be located anywhere in the design.

The relationship between a package and package body is somewhat akin to the

relationship between an entity and its corresponding architecture. While the package

declaration provides the information needed to use the items defined within it (the parameter

list for a global procedure, or the name of a defined type or subtype), the actual behavior of

such things as procedures and functions must be specified within package bodies.
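The split between a package declaration and its body can be sketched as follows. The package name clk_pkg, the deferred constant num_stages, the function all_low and the entity reset_detect are assumptions made for this illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the interface visible to other design units.
package clk_pkg is
    constant num_stages : positive;  -- deferred constant, value given in the body
    function all_low (v : std_logic_vector) return boolean;
end package clk_pkg;

-- Package body: the deferred constant value and the function body.
package body clk_pkg is
    constant num_stages : positive := 15;

    function all_low (v : std_logic_vector) return boolean is
    begin
        for i in v'range loop
            if v(i) /= '0' then
                return false;
            end if;
        end loop;
        return true;
    end function all_low;
end package body clk_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.clk_pkg.all;  -- use clause makes the package items visible

-- Hypothetical user of the package.
entity reset_detect is
    port (nreset   : in  std_logic_vector(3 downto 0);
          in_reset : out std_logic);
end entity reset_detect;

architecture rtl of reset_detect is
begin
    in_reset <= '1' when all_low(nreset) else '0';
end architecture rtl;
```

Only the package body needs to be recompiled when the implementation of all_low changes, because design units that use the package depend only on its declaration.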


4.3 SOFTWARE USED:

4.3.1. Xilinx

Xilinx software is used by VHDL designers to perform synthesis. Any simulated code can
be synthesized and configured on an FPGA. Synthesis is the transformation of VHDL code
into a gate-level netlist. It is an integral part of current design flows.

4.3.2. Algorithm

Start the ISE software by clicking the Xilinx ISE icon.

Create a new project and set the project properties that are displayed.

Create a VHDL source, declaring all inputs, outputs and buffers as required. This opens
a window in which the VHDL code to be synthesized is written.

Check the syntax of the VHDL source after editing and correct any errors.

Simulate the design after compilation.

Create timing constraints and synthesize the design.

Implement the design and verify the constraints.

Assign pin location constraints according to the requirements of the FPGA board.

Generate the .bit file and download the design to the Spartan FPGA board by clicking
‘Configure device’; the message “Program Succeeded” indicates that the download is complete.


4.4 VERILOG HDL

Verilog HDL is a hardware description language that can be used to model a

digital system at many levels of abstraction ranging from the algorithmic-level to the gate-

level to the switch-level. The complexity of the digital system being modeled could vary

from that of a simple gate to a complete electronic digital system, or anything in between.

The digital system can be described hierarchically and timing can be explicitly modeled

within the same description.

The Verilog HDL language includes capabilities to describe the behavioral

nature of a design, the dataflow nature of a design, a design's structural composition, delays

and a waveform generation mechanism including aspects of response monitoring and

verification, all modeled using one single language. In addition, the language provides a

programming language interface through which the internals of a design can be accessed

during simulation including the control of a simulation run.

The language not only defines the syntax but also defines very clear

simulation semantics for each language construct. Therefore, models written in this

language can be verified using a Verilog simulator. The language inherits many of its

operator symbols and constructs from the C programming language. Verilog HDL provides

an extensive range of modeling capabilities, some of which are quite difficult to comprehend

initially. However, a core subset of the language is quite easy to learn and use. This is

sufficient to model most applications.


4.4.1 History:

Verilog HDL was first developed by Gateway Design Automation in 1983 as a hardware
modeling language for their simulator product. At that time, it was a proprietary language.
Because of the popularity of the simulator product, Verilog HDL gained acceptance as a
usable and practical language among a number of designers. In an effort to increase the
popularity of the language, it was placed in the public domain in 1990. Open Verilog
International (OVI) was formed to promote Verilog. In 1992, OVI decided to pursue
standardization of Verilog HDL as an IEEE standard. This effort was successful, and the
language became an IEEE standard in 1995. The complete standard is described in the
Verilog Hardware Description Language Reference Manual. The standard is called
IEEE Std 1364-1995.

4.4.2 Major Capabilities:

Listed below are the major capabilities of the Verilog hardware description language:

Primitive logic gates, such as and, or and nand, are built into the language.

Flexibility of creating a user-defined primitive (UDP). Such a primitive could be either
a combinational logic primitive or a sequential logic primitive.

Switch-level modeling primitive gates, such as pmos and nmos, are also built into
the language.

Explicit language constructs are provided for specifying pin-to-pin delays, path delays
and timing checks of a design.

A design can be modeled in three different styles or in a mixed style. These styles are:
behavioral style, modeled using procedural constructs; dataflow style, modeled
using continuous assignments; and structural style, modeled using gate and module
instantiations.

There are two data types in Verilog HDL: the net data type and the register data type.
The net type represents a physical connection between structural elements, while a
register type represents an abstract data storage element.

Figure 15 shows the mixed-level modeling capability of Verilog HDL, that is, in one
design, each module may be modeled at a different level.

Verilog HDL also has built-in logic functions such as & (bitwise-and) and | (bitwise-or).

High-level programming language constructs such as conditionals, case statements,
and loops are available in the language.

The notions of concurrency and time can be explicitly modeled.

Powerful file read and write capabilities are provided.

The language is non-deterministic under certain situations, that is, a model may
produce different results on different simulators; for example, the ordering of events
on an event queue is not defined by the standard.

4.4.3 SYNTHESIS:

Synthesis is the process of constructing a gate-level netlist from a register-transfer
level model of a circuit described in Verilog HDL. Figure 16 shows such a process. A
synthesis system may, as an intermediate step, generate a netlist that is comprised of
register-transfer level blocks such as flip-flops, arithmetic-logic units, and multiplexers,
interconnected by wires. In such a case, a second program called the RTL module builder is
necessary. The purpose of this builder is to build, or acquire from a library of predefined
components, each of the required RTL blocks in the user-specified target technology.


Figure:15 Mixed level

Having produced a gate level netlist, a logic optimizer reads in the netlist and

optimizes the circuit for the user-specified area and timing constraints. These area and timing

constraints may also be used by the module builder for appropriate selection or generation of

RTL blocks. In this book, we assume that the target netlist is at the gate level. The logic gates

used in the synthesized netlists are described in Appendix B. The module building and logic

optimization phases are not described in this book.

The above figure shows the basic elements of Verilog HDL and the elements used in
hardware. A mapping mechanism or a construction mechanism has to be provided that
translates the Verilog HDL elements into their corresponding hardware elements.

Figure:16 synthesis process

Fig.2-3 Typical design process

CHAPTER 5

SIMULATION MODEL

5.1 PROGRAM

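library ieee;
use ieee.std_logic_1164.all;

entity clk_div is
    port ( nreset    : in  std_logic_vector(3 downto 0); -- Reset (active when "0000")
           clk_in    : in  std_logic;  -- Clock Input
           clk_out1  : out std_logic;  -- Clock Output1
           clk_out2  : out std_logic;  -- Clock Output2
           clk_out3  : out std_logic;  -- Clock Output3
           clk_out4  : out std_logic;  -- Clock Output4
           clk_out5  : out std_logic;  -- Clock Output5
           clk_out6  : out std_logic;  -- Clock Output6
           clk_out7  : out std_logic;  -- Clock Output7
           clk_out8  : out std_logic;  -- Clock Output8
           clk_out9  : out std_logic;  -- Clock Output9
           clk_out10 : out std_logic;  -- Clock Output10
           clk_out11 : out std_logic;  -- Clock Output11
           clk_out12 : out std_logic;  -- Clock Output12
           clk_out13 : out std_logic;  -- Clock Output13
           clk_out14 : out std_logic;  -- Clock Output14
           clk_out15 : out std_logic); -- Clock Output15
end entity clk_div;

architecture clk_div of clk_div is
    signal div_2     : std_logic; -- Divide By 2^1
    signal div_4     : std_logic; -- Divide By 2^2
    signal div_8     : std_logic; -- Divide By 2^3
    signal div_16    : std_logic; -- Divide By 2^4
    signal div_32    : std_logic; -- Divide By 2^5
    signal div_64    : std_logic; -- Divide By 2^6
    signal div_128   : std_logic; -- Divide By 2^7
    signal div_256   : std_logic; -- Divide By 2^8
    signal div_512   : std_logic; -- Divide By 2^9
    signal div_1024  : std_logic; -- Divide By 2^10
    signal div_2048  : std_logic; -- Divide By 2^11
    signal div_4096  : std_logic; -- Divide By 2^12
    signal div_8192  : std_logic; -- Divide By 2^13
    signal div_16384 : std_logic; -- Divide By 2^14
    signal div_32768 : std_logic; -- Divide By 2^15
begin
    -- Each stage toggles on the rising edge of the previous stage,
    -- so every stage halves the frequency of the one before it.

    process (nreset, clk_in, div_2) is -- Divide by 2^1
    begin
        if (nreset = "0000") then
            div_2 <= '0';
        elsif (clk_in = '1' and clk_in'event) then
            div_2 <= not div_2;
        end if;
    end process;

    process (div_2, div_4, nreset) is -- Divide by 2^2
    begin
        if (nreset = "0000") then
            div_4 <= '0';
        elsif (div_2 = '1' and div_2'event) then
            div_4 <= not div_4;
        end if;
    end process;

    process (div_4, div_8, nreset) is -- Divide by 2^3
    begin
        if (nreset = "0000") then
            div_8 <= '0';
        elsif (div_4 = '1' and div_4'event) then
            div_8 <= not div_8;
        end if;
    end process;

    process (div_8, div_16, nreset) is -- Divide by 2^4
    begin
        if (nreset = "0000") then
            div_16 <= '0';
        elsif (div_8 = '1' and div_8'event) then
            div_16 <= not div_16;
        end if;
    end process;

    process (div_16, div_32, nreset) is -- Divide by 2^5
    begin
        if (nreset = "0000") then
            div_32 <= '0';
        elsif (div_16 = '1' and div_16'event) then
            div_32 <= not div_32;
        end if;
    end process;

    process (div_32, div_64, nreset) is -- Divide by 2^6
    begin
        if (nreset = "0000") then
            div_64 <= '0';
        elsif (div_32 = '1' and div_32'event) then
            div_64 <= not div_64;
        end if;
    end process;

    process (div_64, div_128, nreset) is -- Divide by 2^7
    begin
        if (nreset = "0000") then
            div_128 <= '0';
        elsif (div_64 = '1' and div_64'event) then
            div_128 <= not div_128;
        end if;
    end process;

    process (div_128, div_256, nreset) is -- Divide by 2^8
    begin
        if (nreset = "0000") then
            div_256 <= '0';
        elsif (div_128 = '1' and div_128'event) then
            div_256 <= not div_256;
        end if;
    end process;

    process (div_256, div_512, nreset) is -- Divide by 2^9
    begin
        if (nreset = "0000") then
            div_512 <= '0';
        elsif (div_256 = '1' and div_256'event) then
            div_512 <= not div_512;
        end if;
    end process;

    process (div_512, div_1024, nreset) is -- Divide by 2^10
    begin
        if (nreset = "0000") then
            div_1024 <= '0';
        elsif (div_512 = '1' and div_512'event) then
            div_1024 <= not div_1024;
        end if;
    end process;

    process (div_1024, div_2048, nreset) is -- Divide by 2^11
    begin
        if (nreset = "0000") then
            div_2048 <= '0';
        elsif (div_1024 = '1' and div_1024'event) then
            div_2048 <= not div_2048;
        end if;
    end process;

    process (div_2048, div_4096, nreset) is -- Divide by 2^12
    begin
        if (nreset = "0000") then
            div_4096 <= '0';
        elsif (div_2048 = '1' and div_2048'event) then
            div_4096 <= not div_4096;
        end if;
    end process;

    process (div_4096, div_8192, nreset) is -- Divide by 2^13
    begin
        if (nreset = "0000") then
            div_8192 <= '0';
        elsif (div_4096 = '1' and div_4096'event) then
            div_8192 <= not div_8192;
        end if;
    end process;

    process (div_8192, div_16384, nreset) is -- Divide by 2^14
    begin
        if (nreset = "0000") then
            div_16384 <= '0';
        elsif (div_8192 = '1' and div_8192'event) then
            div_16384 <= not div_16384;
        end if;
    end process;

    process (div_16384, div_32768, nreset) is -- Divide by 2^15
    begin
        if (nreset = "0000") then
            div_32768 <= '0';
        elsif (div_16384 = '1' and div_16384'event) then
            div_32768 <= not div_32768;
        end if;
    end process;

    -- Drive the output ports from the internal divider signals.
    clk_out1  <= div_2;
    clk_out2  <= div_4;
    clk_out3  <= div_8;
    clk_out4  <= div_16;
    clk_out5  <= div_32;
    clk_out6  <= div_64;
    clk_out7  <= div_128;
    clk_out8  <= div_256;
    clk_out9  <= div_512;
    clk_out10 <= div_1024;
    clk_out11 <= div_2048;
    clk_out12 <= div_4096;
    clk_out13 <= div_8192;
    clk_out14 <= div_16384;
    clk_out15 <= div_32768;
end architecture clk_div;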
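The fifteen divider stages in the program above all follow one pattern, so the same ripple structure can also be written once with a generate statement. The sketch below is an alternative formulation under assumed names (clk_div_gen, stages, clk_out as a vector), not the code used in this project. A small testbench with an assumed 10 ns input clock period is included so that the outputs can be viewed in a simulator.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical generic version of the ripple clock divider.
entity clk_div_gen is
    generic (stages : positive := 15);
    port (nreset  : in  std_logic_vector(3 downto 0);  -- reset when "0000"
          clk_in  : in  std_logic;
          clk_out : out std_logic_vector(stages downto 1));
end entity clk_div_gen;

architecture rtl of clk_div_gen is
    -- div(0) is the input clock; div(k) is the input clock divided by 2**k.
    signal div : std_logic_vector(stages downto 0);
begin
    div(0) <= clk_in;

    gen_stage : for k in 1 to stages generate
        -- Stage k toggles on each rising edge of stage k-1.
        process (nreset, div(k - 1))
        begin
            if nreset = "0000" then
                div(k) <= '0';
            elsif div(k - 1)'event and div(k - 1) = '1' then
                div(k) <= not div(k);
            end if;
        end process;
    end generate gen_stage;

    clk_out <= div(stages downto 1);
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Minimal testbench: the clock runs freely, so the simulation has to be
-- stopped by the simulator's run-time limit.
entity clk_div_gen_tb is
end entity clk_div_gen_tb;

architecture sim of clk_div_gen_tb is
    signal nreset  : std_logic_vector(3 downto 0) := "0000";
    signal clk_in  : std_logic := '0';
    signal clk_out : std_logic_vector(15 downto 1);
begin
    clk_in <= not clk_in after 5 ns;  -- assumed 10 ns clock period
    nreset <= "1111" after 20 ns;     -- release reset

    dut : entity work.clk_div_gen
        generic map (stages => 15)
        port map (nreset => nreset, clk_in => clk_in, clk_out => clk_out);
end architecture sim;
```

With this formulation, changing the number of divided clocks only requires a different value for the generic stages; clk_out(k) corresponds to clk_outk in the program above.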


5.2 RESULTANT WAVEFORM

Figure:18 Output figure of clock distribution


CHAPTER 6

CONCLUSIONS AND FUTURE WORK

6.1 Conclusions

In this thesis we presented a novel approach to model a clock distribution network
using VHDL-AMS. For this purpose, a set of models was developed for the clock
distribution network, including components such as interconnects, buffers, a phase-locked
loop and a source oscillator. The models were simulated using the Cadence LDV 5.1 AMS
simulator and were checked for functionality. Modeling of a generic clock distribution
network was demonstrated using the VHDL-AMS models. This satisfied the first objective of
this project.

Two case studies were considered in this research to demonstrate the versatility of
VHDL-AMS in modeling a clock distribution network. In the first case, a balanced H-Tree
based clock distribution network was modeled, and in the second case a regular pattern clock
distribution network was modeled. This addressed the second objective of this project.

The clock characteristic studied in this research was clock skew. Its variation with
the number of levels of the H-Tree, interconnect lengths, load capacitance and the number of
stages in a regular pattern clock distribution network was studied. This satisfied the
third objective of this research (Section 1.2). Compared to equivalent SPICE models, the
VHDL-AMS models developed in this research had an average error ranging between 3.60%
and 8.21%. This validated the accuracy of the VHDL-AMS models developed in this research.


6.2 Future Work

Suggested future work for this research is listed as follows:

A model generator can be developed to automatically output a VHDL-AMS
model based on the user requirements for the clock distribution network.

In this research, only behavioral and structural levels of abstraction were
considered for the selected components. To achieve high fidelity, the component level
of abstraction can be considered.

The buffer models can be made more exhaustive by including process
variation effects.

For high fidelity, transmission models can be generated for interconnects, which
involves frequency domain modeling.

Higher order filters can be implemented in PLLs to make the model more
accurate.

Effects of jitter can be included to improve the accuracy of a clock distribution
network model.



