ABSTRACT
Very-large-scale integration (VLSI) is the process of creating integrated
circuits by combining thousands of transistor-based circuits into a single chip. VLSI
began in the 1970s when complex semiconductor and communication technologies were
being developed. The microprocessor is a VLSI device. The term is no longer as common
as it once was, as chips have increased in complexity into the hundreds of millions of
transistors.
VHDL stands for VHSIC Hardware Description Language. VHSIC is
an abbreviation for Very High Speed Integrated Circuit, a project sponsored by the US
Government and Air Force begun in 1980 to advance techniques for designing VLSI
silicon chips. VHDL is an IEEE standard.
The quality of the clock signals is the most important factor for
ensuring a chip’s successful operation. In a design net list, there are hundreds of
thousands or millions of cells. Those cells can be classified as two types: combinational
cells and sequential cells (including memories). The sequential cells are used for storing
information and they must operate on clocks. After the placement stage of the design
implementation process, all of the cells, including the sequential cells are spread around
the entire chip. The task of clock distribution is to distribute the clock signals to all of
these sequential cells. This work is commonly called clock tree synthesis. Our project
deals with how the clock is distributed and with the principal idea of how a clock tree is
constructed.
DEPT OF ECE Page 1 of 66 AIET
Chapter 1
INTRODUCTION
In the past few decades, Integrated Circuit technology has been advancing
rapidly. In synchronous integrated circuits, clock is used to synchronize the data transfer.
The performance and functionality of the entire system depends on the clock
characteristics. Of late, clock distribution has become an exigent task for the VLSI
designers, as it consumes a large portion of resources like wiring, design time and power.
In the worst case, functional errors can also be caused due to the uncertainties in
clock network delays. These uncertainties also result in performance degradation.
Therefore it is imperative to model clock distributions accurately for their performance.
Clock distribution in high-speed digital systems is a demanding problem that
consumes a growing fraction of resources such as design time, power, and wiring.
Skew is the key parameter of interest, so the first issue to consider is clock skew
and its estimation in a digital circuit or network. Clock skew is the
difference in clock arrival time between different components across a chip. Because of this
difference, the outputs of the individual circuits are delayed by different amounts,
which results in timing irregularity in the digital system [1]. In this paper, we
introduce a synchronous counter circuit in which four D flip-flops are driven by
the same clock, and we estimate the clock skew in this circuit using
VHDL (Very High Speed Integrated Circuit Hardware Description Language). This
paper is organized as follows. Section II describes the circuit design. Section III presents
the implementation of the circuit in VHDL. Section IV shows the analysis and simulation
results of the circuit. Section V presents the conclusion of our analysis.
Clock signals are typically loaded with the greatest fan-out and operate at the
highest speeds of any signal, either control or data, within the entire synchronous system.
Since the data signals are provided with a temporal reference by the clock signals, the
clock waveforms must be particularly clean and sharp. Furthermore, these clock signals
are particularly affected by technology scaling (Moore's law), in that long global
interconnect lines become significantly more resistive as line dimensions are decreased.
This increased line resistance is one of the primary reasons for the increasing significance
of clock distribution on synchronous performance. Finally, the control of any differences
and uncertainty in the arrival times of the clock signals can severely limit the maximum
performance of the entire system and create catastrophic race conditions in which an
incorrect data signal may latch within a register. The clock distribution network often
takes a significant fraction of the power consumed by a chip. Furthermore, significant
power can be wasted in transitions within blocks, even when their output is not needed.
These observations have led to a power saving technique called clock gating, which
involves adding logic gates to the clock distribution tree, so portions of the tree can be
turned off when not needed (when a clock can be safely gated may be determined either
through automatic analysis of the circuit, or specified by the designer). The exact savings
are very design dependent, but around 20-30% is often achievable.
Most synchronous digital systems consist of cascaded banks of sequential registers
with combinational logic between each set of registers.
The functional requirements of the digital system are satisfied by the logic stages. Each
logic stage introduces delay that affects timing performance, and the timing performance
of the digital design can be evaluated relative to the timing requirements by a timing
analysis. Often special consideration must be made to meet the timing requirements. For
example, the global performance and local timing requirements may be satisfied by the
careful insertion of pipeline registers into equally spaced time windows to satisfy critical
worst-case timing constraints.
The proper design of the clock distribution network helps ensure that critical
timing requirements are satisfied and that no race conditions exist (clock skew). The
delay components that make up a general synchronous system are composed of the
following three individual subsystems: the memory storage elements, the logic elements,
and the clocking circuitry and distribution network. Interrelationships among these three
subsystems of a synchronous digital system are critical to achieving maximum levels of
performance and reliability.
Clock gating is a popular technique used in many synchronous circuits for
reducing dynamic power dissipation. Clock gating saves power by adding more logic to a
circuit to prune the clock tree. Pruning the clock disables portions of the circuitry so that
the flip-flops in them do not have to switch states. Switching states consumes power.
When not being switched, the switching power consumption goes to zero, and
only leakage currents are incurred.
Clock gating works by taking the enable conditions attached to registers and using
them to gate the clocks. A design must therefore contain these enable
conditions in order to use and benefit from clock gating. The clock gating process can
also save significant die area as well as power, since it removes large numbers
of muxes and replaces them with clock gating logic. This clock gating logic is generally in
the form of "integrated clock gating" (ICG) cells. Note, however, that the clock gating
logic changes the clock tree structure, since it sits in the clock
tree.
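The power saving from clock gating can be illustrated with the standard switching power relation P = a * C * Vdd^2 * f. The following Python sketch is only an illustration under stated assumptions: the clock load, supply voltage, frequency, gated fraction and enable duty cycle are hypothetical values, and leakage is ignored.

```python
def dynamic_power(c_farads, vdd_volts, freq_hz, activity=1.0):
    """Switching power P = activity * C * Vdd^2 * f (leakage is ignored)."""
    return activity * c_farads * vdd_volts ** 2 * freq_hz


def gated_clock_power(c_total, vdd, freq, gated_fraction, enable_duty):
    """Clock power when `gated_fraction` of the clock load sits behind clock
    gates that are enabled only `enable_duty` of the time (both assumed)."""
    always_on = dynamic_power(c_total * (1.0 - gated_fraction), vdd, freq)
    gated = dynamic_power(c_total * gated_fraction, vdd, freq, enable_duty)
    return always_on + gated


if __name__ == "__main__":
    # Hypothetical numbers: 1 nF clock load, 1.0 V supply, 500 MHz clock,
    # half of the load gated and enabled half of the time.
    base = dynamic_power(1e-9, 1.0, 500e6)
    gated = gated_clock_power(1e-9, 1.0, 500e6, gated_fraction=0.5, enable_duty=0.5)
    print(f"ungated: {base:.3f} W, gated: {gated:.3f} W")
    print(f"saving: {100.0 * (1.0 - gated / base):.0f}%")
```

The actual saving depends on how much of the clock load has usable enable conditions, which is why the figure is so design dependent.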
1.1 Motivation
Clock distribution networks synchronize the data transfer between the data
paths. In Integrated Circuits, proper design of a clock distribution network is necessary as
it has a direct impact on the system performance. Significant research has been done both
in the industrial and academic communities in the area of design and the optimizations of
clock distribution networks. To the best of our knowledge, there are no previous works
on modeling clock distribution networks using Hardware Description Languages. The
need for accurate modeling of general characteristics of clock distribution networks has
motivated this research. Providing accurate models to identify any uncertainty in the
clock signal arrival times at different significant points in the clock distribution network
will be a great aid to the circuit designers. The models developed as part of this research
will be used to build the model library in Distributed Processing Laboratory (DPL) at
University of Cincinnati.
1.2 Objective
This thesis deals with the following research questions.
1. Is it possible to accurately model clock distributions using VHDL? If so, what
do the models look like?
2. Is VHDL versatile enough to model different clock distribution networks?
3. What are the characteristics of clock distribution that can be modeled by
VHDL?
1.3 Approach
The approach taken in this research to model any clock distribution network is as
follows. First, a library of the essential components of any clock
distribution network is built. The components considered in this research are as follows:
• Interconnects
• Buffers
• Phase Locked Loop
In addition to the above listed components, simple models for the oscillator and
the load for the clock distribution network are modeled. Using the library of components,
a generic model for a clock distribution network can be generated. This satisfies the first
research goal. To validate the second research goal, two case studies have been
considered in this research. An H-tree clock distribution network and a regular pattern
clock distribution network have been chosen for their versatility. By modeling these two
different types of clock distribution networks, the versatility of the VHDL-AMS is
proven. The important characteristic of a clock distribution network considered in this
research is the skew of the clock signal.
1.4 Overview of Results
Experiments have been set up to validate the goals of the research. It is shown how
the models developed for this research can be used to model clock distribution networks.
The simulation and CPU times of the various models are reported to assess the simulation speed of
the VHDL models. Two case studies were considered to demonstrate the versatility of the modeling
approach. Skew is an important performance-limiting factor in any clock
distribution network. The variation of skew with the number of levels of the H-tree, the interconnect lengths,
the load capacitance, and the number of stages in a regular pattern clock distribution network
is analyzed.
CHAPTER 2
BACKGROUND
This chapter discusses the need for modeling clock distribution networks in Section 2.1. A
brief description of the related work is provided in Section 2.2. The chapter also gives an overview of the VHDL
language used in this research, including the language constructs and the modeling techniques used to
model a mixed-signal system. Background information on
clock distribution networks is presented in Section 2.3, and the components of clock
distribution networks (interconnects, buffers, and phase locked loops) are discussed in
Section 2.4. Finally, the characteristics of clock distribution considered in this research are
discussed in Section 2.5.
2.1 Need for Modeling Clock Distribution Networks
Clock distribution over an entire chip is a very complex problem and is one of the
main challenges in the design of today’s high-performance processors. Clock distribution
has a significant impact on the performance of the entire system and heavily contributes
to the total power dissipation of the chip. Any inaccuracies of clock timing may be
critical to the circuit operations resulting in functionality errors. An accurate model of the
clock distribution network for any VLSI circuit is helpful for accurate performance
evaluation [1]. It will be of great help to the circuit designers to model uncertainties in the
clock signal arrival times between key points in a clock distribution network.
2.2 Related Work
In the literature, there are some works reported to model the impact of process
variations on Clock skew [4], and the effect of technology scaling on clock skew and
interconnect delay [5]. Research has also been done to model the effects of systematic
within-die interconnect parameter variation like metal thickness, metal line width, and
inter-layer dielectric thickness variations on circuit performance [6]. Buffer Insertion and
the effect of process variations on its sizing have also been studied [7]. Clock skew
analysis has also been reported in many research papers [8]. However, most of the above works
use statistical analysis and Monte Carlo simulations to find the effects of different
variations on skew. To the best of our knowledge, no work has been reported that models a
clock distribution network using a hardware description language (HDL) while taking
parameters like skew and process variations into account. Research efforts have been
made to model the phase locked loop and its components for mixed-mode simulation
[9, 10, 11], but there are no existing works that link the phase locked loop with the
clock distribution network and model its effect on the performance of the clock
distribution. The novel feature of this research is that a clock distribution network is
modeled using an HDL by modeling its components (interconnects, buffers, phase
locked loop, source, and load) and their impact on clock skew and on the rise and fall times of
the clock. Using an HDL combines the flexibility of the language in
modeling with the accuracy and speed of simulation. The HDL used for
this research is VHDL.
2.3 Clock Distribution Networks
Designing a clock distribution network is a very complex task for the circuit
designers. The difficulties in clock distribution design are compounded by
device technology improvements that lead to smaller feature sizes, larger chip areas, and
increased component density, since these result in higher interconnect resistance and
higher clock loads [14]. Some of the clock distribution networks used in certain
microprocessors are discussed below. The clock distribution network used in Intel IA-64
microprocessor is shown in Figure 2.1. The significant segments are discussed as follows.
The global clock distribution connects the PLL clock generator to the de-skewing buffers.
The regional clock distribution connects the clock from de-skewing buffers to the local
clock regions using clock grids. The local clock distribution connects the clock from the
regional clock grid to the clocked elements using local clock buffers and local
interconnections.
Figure 2.1: Clock distribution network in Intel IA-64 microprocessor [8]
The clock distribution network of the 600MHz Alpha microprocessor is depicted in Figure 1. The
clock is generated on-chip using a PLL which multiplies an 80-200MHz reference clock
to generate a frequency of 600MHz. The feedback loop of the PLL includes the clock
distribution network up to and including the global clock (GCLK) to control phase
alignment. A high-gain buffer network is used to route GCLK to a central point on the
die. From there the clock is distributed by buffered
X and H trees, as shown in Figure 2, to the GCLK drivers located in a windowpane
pattern across the chip.
Figure 1: Clock distribution network in 600MHz Alpha Microprocessor
Figure 2: Global Clock Network [15]
To summarize, any clock distribution network will usually contain the following
components: Phase-locked loop for on-chip clock generation, clock buffers to drive the
large capacitive load in the network and the local interconnects to connect the clock to
the clock driving points. Other components of the clock distribution networks include
De-skewing buffers, Delay locked loops, etc. In this research, interconnects, buffers and
PLLs will be studied in detail.
2.4 Components of a Clock Distribution Network
2.4.1 Buffers
An important component of any clock distribution network is the buffer.
Typically the clock is connected to a large number of components in the circuit, resulting
in large loads. The characteristic load consists of the clock distribution wires and the end
points that drive the logic blocks in the circuit. Buffers are inserted between the clock
source and the load to ensure that the clock signals at the different end points have small
rise and fall times. Clock buffer design involves the following steps: deciding on the
buffer delay and the output slopes, and choosing the buffer sizes based on the load
capacitance to be driven. The clock buffer delays affect the clock skew of the network
and hence are chosen such that the clock skew budget is met. The effect of clock buffer
delays on skew is shown in Figure 3.
Figure 3: Clock skew variation with buffer delay
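The step of choosing buffer sizes from the load capacitance can be sketched with a common textbook heuristic, the tapered buffer chain, in which each stage drives a load a fixed ratio larger than its own input capacitance. This is not the sizing procedure of this work. The function name, the target ratio of 4 and the capacitance values are assumptions made only for illustration.

```python
import math


def size_buffer_chain(c_in, c_load, target_stage_ratio=4.0):
    """Return the input capacitance of each buffer in a tapered chain.

    The number of stages is chosen so that every stage sees roughly
    `target_stage_ratio` times its own input capacitance as load.
    """
    if c_in <= 0 or c_load <= 0:
        raise ValueError("capacitances must be positive")
    total_ratio = c_load / c_in
    if total_ratio <= 1.0:
        return [c_in]  # a single minimum-size buffer is enough
    n_stages = max(1, round(math.log(total_ratio) / math.log(target_stage_ratio)))
    stage_ratio = total_ratio ** (1.0 / n_stages)
    return [c_in * stage_ratio ** i for i in range(n_stages)]


if __name__ == "__main__":
    # Hypothetical values: 2 fF first-stage input, 512 fF clock load.
    chain = size_buffer_chain(2e-15, 512e-15)
    for i, cap in enumerate(chain, start=1):
        print(f"stage {i}: input capacitance {cap * 1e15:.1f} fF")
```

A larger ratio gives fewer stages and less buffer delay variation to balance, at the cost of slower edges at each stage output.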
2.4.2 Phase Locked Loop
A Phase Locked Loop (PLL) is used as a frequency synthesizer to multiply a
reference frequency generated by a crystal oscillator to much higher frequencies needed
for today’s higher end microprocessors. In principle, a PLL synchronizes the frequency
of the output signal generated by an oscillator with the frequency of a reference signal
and eliminates any phase misalignment. Some of the other applications of PLLs in
communications are carrier and clock recovery, frequency and phase modulation and
demodulation [17]. Thus, the main purpose of a PLL in integrated circuits is to generate
clock and to obtain accurate phase synchronizations between the off-chip reference clock
and the internal clock signals.
A rudimentary PLL (shown in Figure 4) typically has the following three basic
blocks in a feedback loop:
• Phase Detector (PD)
• Low Pass Filter (LPF)
• Voltage Controlled Oscillator (VCO)
[Block diagram: Ref. signal, PD, LPF and VCO connected in a feedback loop]
Figure 4: Block diagram of a basic PLL
The phase detector may be a simple analog multiplier. Based on the
requirements of the application, more complicated phase detectors are used in practice.
The loop filters are optional and are used to increase the bandwidth of the PLL and to
reduce the phase noise. The VCO is an oscillator producing an output frequency
proportional to its control voltage [18]. The main drawback of a rudimentary PLL using
an analog multiplier as a phase detector is that it has a finite phase error that is a function
of the input frequency. Contemporary PLLs are typically charge-pump PLLs, which
can track the phase accurately, resulting in practically zero nominal phase
error regardless of the input frequency. A charge-pump PLL, shown in Figure 5, has the following blocks:
• Phase detector (PD)
• Charge pump (CP)
• Low Pass Filter (LPF)
• Voltage Controlled Oscillator (VCO)
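[Block diagram: Reference signal, PD, CP, LPF and VCO, with a frequency divider in the feedback path; OUTPUT taken from the VCO]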
Figure 5: Block diagram of charge-pump PLL
A brief discussion about the different blocks of a PLL is given below
2.4.3 Phase Detector
The phase detector compares the phase of the reference signal with that of the signal
fed back from the VCO and outputs the phase error. There will be no output
from the PD if the two signals have the same phase and frequency. Otherwise, the phase error is
used to generate a control voltage for the voltage controlled oscillator such that the phase
error is driven to zero.
2.4.4 The charge pump
The charge pump converts the phase error detected by the PD into a current or
voltage that controls the frequency generated by the VCO. The charge pump can be used to
set the phase detector gain, KD. The charge pump either charges or discharges the
filter’s capacitors based on the output of the PD. If the reference signal leads the VCO
output, the output of the PD signals the charge pump to pump more charge into the
capacitors. If it lags, an equivalent amount of charge is removed from the filter’s
capacitors.
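The charging and discharging behavior described above can be sketched with the usual averaged model of a charge pump, assuming a three-state phase-frequency detector whose output is linear over a phase error of plus or minus 2*pi. Under that assumption the phase detector gain is KD = Icp / (2*pi). The pump current, filter capacitance and time interval below are hypothetical.

```python
import math


def phase_detector_gain(i_cp_amps):
    """Averaged gain KD in amperes per radian for pump current Icp."""
    return i_cp_amps / (2.0 * math.pi)


def average_pump_current(i_cp_amps, phase_error_rad):
    """Average current into the filter; positive when the reference leads."""
    return phase_detector_gain(i_cp_amps) * phase_error_rad


def control_voltage_change(i_cp_amps, phase_error_rad, c_filter_farads, duration_s):
    """Voltage change on an ideal filter capacitor over `duration_s`."""
    return average_pump_current(i_cp_amps, phase_error_rad) * duration_s / c_filter_farads


if __name__ == "__main__":
    # Hypothetical values: 100 uA pump current, 100 pF filter capacitor.
    for error in (math.pi / 2.0, -math.pi / 2.0):
        dv = control_voltage_change(100e-6, error, 100e-12, 1e-6)
        print(f"phase error {error:+.3f} rad -> control voltage change {dv:+.3f} V")
```

A positive phase error (reference leading) adds charge and raises the control voltage, and a negative one removes the same amount.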
2.4.5 Low Pass Filter
The main purpose of a low pass filter is to modify the bandwidth of the PLL and
to reduce the phase error. The filter converts the charge of the charge-pump into voltage,
to control the frequency of the VCO. A passive RC filter can be used as a simple low
pass filter. The higher the order of the filter, the better is the noise rejection in the PLL.
The filters can be passive RC filters or active filters using op-amps. If the control
voltage of the VCO is less than the voltage generated by the charge pump, a passive RC
filter will suffice; if not, an active filter is used.
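For the simple passive case, a first-order RC low pass filter has a cutoff frequency of 1 / (2*pi*R*C). The sketch below evaluates the cutoff and the attenuation at a given frequency. The component values are hypothetical and the filter is assumed ideal and unloaded.

```python
import math


def rc_cutoff_hz(r_ohms, c_farads):
    """Cutoff (-3 dB) frequency of a first-order RC low pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


def rc_gain(freq_hz, r_ohms, c_farads):
    """Magnitude of the filter transfer function at `freq_hz`."""
    ratio = freq_hz / rc_cutoff_hz(r_ohms, c_farads)
    return 1.0 / math.sqrt(1.0 + ratio ** 2)


if __name__ == "__main__":
    # Hypothetical loop filter: 10 kOhm and 1 nF.
    r, c = 10e3, 1e-9
    fc = rc_cutoff_hz(r, c)
    print(f"cutoff: {fc / 1e3:.2f} kHz")
    for f in (fc / 10.0, fc, fc * 10.0):
        print(f"gain at {f / 1e3:8.2f} kHz: {rc_gain(f, r, c):.3f}")
```

Lowering the cutoff rejects more noise on the control voltage but narrows the loop bandwidth, which is the trade-off the text refers to.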
2.4.6 Voltage Controlled Oscillator
A Voltage Controlled Oscillator is the heart of the PLL as it dominates the phase
noise performance of the entire PLL. It produces the required frequency for the PLL. The
frequency of the VCO can be controlled by a control voltage. The output of the VCO is
fed back to the phase detector and the phase difference between the reference signal and
the output of the VCO is changed into a DC output voltage. This DC voltage
controls the frequency of the VCO in such a way that the output matches the reference signal closely. This process
is called “acquiring lock”.
2.4.7 Frequency Divider
Typically the output frequency of a PLL is a multiple of the reference signal
frequency. Hence, a frequency divider is used as part of the feedback loop to divide the
frequency generated by the VCO down to a value that can be compared with the reference
signal. Frequency dividers have become indispensable in PLL circuits, as the output
clock frequencies of today’s microprocessors are much higher than their input clock
frequencies.
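With a divide-by-N block in the feedback path, a locked PLL produces an output frequency of N times the reference. The sketch below lists the integer ratios that would give a 600 MHz output from a reference between 80 MHz and 200 MHz, the figures quoted earlier for the Alpha clock generator. Restricting N to integers is an assumption of this sketch, not a statement about that design.

```python
def pll_output_hz(f_ref_hz, divide_by_n):
    """Output frequency of a locked PLL with a divide-by-N feedback divider."""
    return f_ref_hz * divide_by_n


def integer_divider_ratios(f_out_hz, f_ref_min_hz, f_ref_max_hz):
    """Integer N for which f_out / N falls inside the reference range."""
    ratios = []
    n = 1
    while f_out_hz / n >= f_ref_min_hz:
        if f_out_hz / n <= f_ref_max_hz:
            ratios.append(n)
        n += 1
    return ratios


if __name__ == "__main__":
    for n in integer_divider_ratios(600e6, 80e6, 200e6):
        print(f"N = {n}: reference = {600e6 / n / 1e6:.2f} MHz")
```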
2.5 Characteristics of Clock Distribution Network
Clock distribution is very crucial for any digital system. Ideally, clock signals
should have zero skew, zero jitter, negligible rise and fall times and specified duty cycles.
But in reality, clock signals have non-zero skews, non-zero jitter, considerable rise and
fall times and varying duty cycles. Power consumption is another important performance
metric of any clock distribution network, as it may take a large portion of the total power
consumption of the entire chip. As the clock frequency increases, clock inaccuracy is
occupying a considerable percentage of the clock period. In present-day microprocessors,
clock skews take up as much as 10% of the available clock cycle time [22]. Thus, clock
distribution networks can be modeled for various characteristics like clock skew, clock
jitter, rise and fall times and variations in duty cycles, at different driving points in the
distribution network. The characteristic of clock distribution considered in this research is
clock skew.
2.5.1 Clock Skew
An ideal clock is defined as a signal which arrives at different register inputs at
the same time. But due to static mismatches in the clock paths and the clock load
variations, clocks are not ideal. The absolute delay of any clock path is not that important
compared to the relative arrival of the clock at different points in the circuit. Clock skew
is defined as the spatial variation in the arrival time of the clock at different clock terminals
in the circuit. Clock skew results in a phase shift and is one of the important
performance-limiting factors of the system. Usually, the circuit designers
have a clock skew budget to meet, beyond which the system will not have correct
functionality at a desired frequency.
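The definition of skew and the notion of a skew budget can be made concrete with a short sketch. Skew is taken here as the latest clock arrival time minus the earliest. The arrival times and the use of 10% of the clock period as the budget are assumptions chosen for illustration only.

```python
def clock_skew(arrival_times):
    """Skew = latest arrival minus earliest arrival (same unit as the inputs)."""
    if not arrival_times:
        raise ValueError("need at least one arrival time")
    return max(arrival_times) - min(arrival_times)


def within_skew_budget(arrival_times, clock_period, budget_fraction=0.10):
    """True if the skew does not exceed `budget_fraction` of the clock period.

    The 10% default is an illustrative assumption, not a design rule.
    """
    return clock_skew(arrival_times) <= budget_fraction * clock_period


if __name__ == "__main__":
    # Hypothetical clock arrival times at four flip-flops, in nanoseconds.
    arrivals_ns = [3.10, 3.25, 3.40, 3.18]
    print("skew (ns):", round(clock_skew(arrivals_ns), 3))
    print("within budget at 5 ns period:", within_skew_budget(arrivals_ns, 5.0))
    print("within budget at 2 ns period:", within_skew_budget(arrivals_ns, 2.0))
```

The same set of arrival times can meet the budget at a slow clock and miss it at a fast one, which is why skew limits the achievable frequency.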
The clock distributions usually target zero or minimum skew for efficient
performance. Zero skew is obtained when the phase delays of all the clock terminals from
the clock source calculated with a delay model, like the Elmore delay model, are equal
under ideal process condition. Clock skew results mostly from the different delays
associated with the clock buffers present on chip. The common design technique used to
reduce the skew is to equalize the capacitive load of clock signal as seen by each clock
buffer.
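The Elmore delay mentioned above can be sketched for the simplest case, a clock path modeled as an RC ladder. Each node capacitance is multiplied by the total resistance between the source and that node, and the products are summed. The segment values below are hypothetical, and modeling a clock branch as a uniform ladder is an assumption of this sketch.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay to the far end of an RC ladder.

    resistances[i] is the series resistance of segment i (ohms) and
    capacitances[i] is the capacitance at the node after segment i (farads).
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    delay = 0.0
    upstream_r = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_r += r          # total resistance from the source to this node
        delay += upstream_r * c  # this node's contribution to the delay
    return delay


if __name__ == "__main__":
    # Two hypothetical branches built from 100 ohm / 10 fF segments.
    short_branch = elmore_delay([100.0] * 2, [10e-15] * 2)
    long_branch = elmore_delay([100.0] * 3, [10e-15] * 3)
    print(f"short branch: {short_branch * 1e12:.1f} ps")
    print(f"long branch:  {long_branch * 1e12:.1f} ps")
    print(f"skew:         {(long_branch - short_branch) * 1e12:.1f} ps")
```

Under this model, zero skew means that every source-to-terminal path evaluates to the same delay, which is what equalizing the capacitive load seen by each buffer aims for.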
Skew can be caused either by a systematic effect which is predictable or a random
effect which is not predictable. Some of the reasons for skew include variations in
the effective channel lengths of devices, inter-layer dielectric (ILD) thickness variations,
process variations, threshold voltage variations, power supply voltage variations,
temperature variations across the die, design errors, and capacitive coupling in the
circuit.
CHAPTER 3
3.1 CLOCK DISTRIBUTION
The quality of the clock signals is the most important factor for ensuring a chip’s
successful operation. In a design net list, there are hundreds of thousands or millions of
cells. Those cells can be classified as two types: combinational cells and sequential cells
(including memories). The sequential cells are used for storing information and they must
operate on clocks. After the placement stage of the design implementation process, all of
the cells, including the sequential cells, are spread around the entire chip. The task of
clock distribution is to distribute the clock signals to all of these sequential cells. This
work is commonly called clock tree synthesis. Figure 6 shows the principal
idea of how a clock tree is constructed. As depicted, a clock network may be constructed
in tree fashion. Starting from the clock source, the first level of clock buffers is laid out,
then the second level, then the third level, and so on. In most designs, there are many
clock domains, and each domain has hundreds or thousands of sequential cells attached to
it. This many cells cannot be driven by a single buffer from the clock source, even with
the strongest buffer in the library.
A tree structure is used to deal with this problem by letting each buffer drive only
the number of loads that it is allowed to drive. As a result, the quality of the clock signal,
in terms of slew rate (the rise and fall times of the clock edges), is not significantly
degraded when it reaches the leaf sequential cells. Figure 7 shows the commonly used
clock tree structures in clock distribution networks: trunk, branch-tree, mesh, X-tree,
and H-tree. Figure 8 is an example of how a real clock tree looks in a design block. In
this simple example, there is one level of clock buffers between the clock root and the
leaves. Another type of clock distribution network is the clock grid. In this approach, a
metal grid structure, which covers the entire chip, is dedicated to the distribution of
clock signals, as graphically shown in Figure 3.1.
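The idea of letting each buffer drive only an allowed number of loads can be sketched by counting how many buffer levels a tree needs. The sketch assumes a single fan-out limit for every buffer and ignores placement and wire load, so it gives a lower bound rather than a real clock tree synthesis result. The sink count and fan-out limit are hypothetical.

```python
def clock_tree_levels(num_sinks, max_fanout):
    """Return (levels, buffers_per_level) for a tree with limited fan-out.

    buffers_per_level is ordered from the clock root towards the leaves.
    """
    if num_sinks < 1 or max_fanout < 2:
        raise ValueError("need at least one sink and a fan-out of 2 or more")
    buffers_per_level = []
    loads = num_sinks
    while loads > 1:
        loads = -(-loads // max_fanout)  # ceiling division: buffers needed
        buffers_per_level.append(loads)
    buffers_per_level.reverse()
    return len(buffers_per_level), buffers_per_level


if __name__ == "__main__":
    # Hypothetical clock domain: 1000 sequential cells, at most 16 loads per buffer.
    levels, per_level = clock_tree_levels(1000, 16)
    print("levels:", levels)
    print("buffers per level (root first):", per_level)
```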
Figure 6: A basic clock tree.
Figure 7: Commonly used tree structures in clock distribution networks
Figure 8: An example of a clock tree in chip design.
A tree structure usually consumes less wiring and thus less capacitance and less
routing resources, which results in lower power and less latency. However, a tree
structure must be carefully tuned and it is very load (placement) dependent. In contrast, a
grid structure uses significantly more routing resources and thus has large capacitance
and large latency, but it tends to be less load dependent as any leaf cell can always find a
nearby tapping point to connect to directly.
As a result, a grid structure clock distribution network is typically used only for
high-end applications, such as microprocessors, whereas a tree structure is widely used
for ASIC-based designs. The clock distribution network consumes more than 10% of the
total power used by the chip in large designs. During each clock cycle, the capacitance
associated with the entire clock structure must be charged to the supply voltage and
subsequently dumped to ground, with the stored energy lost as heat. To ease this
problem, resonant clock distribution has been actively studied by some groups. In this
method, the traditional tree- or grid-driven clock structure is augmented with on-chip
inductors to resonate with the clock capacitance at the clock’s fundamental frequency.
The energy of the fundamental frequency resonates back and forth between its electric
and magnetic forms rather than being dissipated as heat. The clock driver is only needed to
replenish the energy lost during operation. This idea is depicted in Figure 3.3.
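The resonance condition for such a scheme follows from the ideal LC relation f = 1 / (2*pi*sqrt(L*C)). The sketch below computes the inductance that would resonate with a given clock capacitance at the clock frequency. The capacitance and frequency are hypothetical, and losses and parasitics are ignored.

```python
import math


def resonant_inductance(c_clock_farads, f_clock_hz):
    """Inductance that resonates with the clock capacitance at f_clock."""
    return 1.0 / ((2.0 * math.pi * f_clock_hz) ** 2 * c_clock_farads)


def resonant_frequency(l_henries, c_farads):
    """Resonant frequency of an ideal LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henries * c_farads))


if __name__ == "__main__":
    # Hypothetical values: 1 nF total clock capacitance, 1 GHz clock.
    l_needed = resonant_inductance(1e-9, 1e9)
    print(f"required inductance: {l_needed * 1e12:.2f} pH")
    print(f"check: {resonant_frequency(l_needed, 1e-9) / 1e9:.3f} GHz")
```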
3.2 The Key Requirements For Constructing A Clock Tree.
The key requirements for constructing a clock tree concern clock skew and insertion
delay. Clock skew is the maximum difference among the arrival times of the clock at the leaf
cells in a clock domain. Figure 4.41 shows the result of a SPICE analysis of a clock tree.
A clock pulse is injected into the clock tree at time 0 ns with a rise time of
1 ns. After traveling through the tree, the clock signal arrives at the leaves (also called
clock sinks) at approximately 3.4 ns. However, the arrival times at the
leaves are not the same because of the different physical locations of the leaf cells. They
are spread over a range of approximately 1 ns, which is the clock skew. In other
words, the existence of skew means that not all of the sequential cells in a particular
clock domain receive their clock signals at exactly the same moment, as desired. Clock
skew is significant because it eats into the time budget assigned for logic operations. If
skew exceeds the desired budget, the chip might not function correctly at its designed
speed (a setup violation), or might not function at all (a hold violation).
Clock tree insertion delay is the time difference between the clock signal leaving the
source and the clock signal arriving at the leaf cells. The concept of insertion delay is
also depicted in Figure 4.41. Insertion delay is important because the designer might need
to balance clock tree delays between different clock domains for cross-domain
information exchange. Insertion delay also impacts I/O timing constraints. These
scenarios are graphically demonstrated in Figure 4.42, where the insertion delays of
CLK1_TREE and CLK2_TREE must be balanced for the proper exchange of data
between the logic cells of the two domains. For the CLK2 domain, the value of the
insertion delay must be known so that the communication between I/O cells (DATA_IN,
DATA_OUT) and logic cells can be carried out safely.
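The way skew consumes the timing budget can be sketched with the standard setup and hold checks for a register-to-register path. Skew is defined here as the capture clock arrival time minus the launch clock arrival time, and a negative slack means a violation. All delay values are hypothetical and given in nanoseconds.

```python
def setup_slack(clock_period, skew, t_clk_to_q, t_logic_max, t_setup):
    """Setup slack: data must arrive t_setup before the next capture edge."""
    return clock_period + skew - (t_clk_to_q + t_logic_max + t_setup)


def hold_slack(skew, t_clk_to_q, t_logic_min, t_hold):
    """Hold slack: data must stay stable t_hold after the capture edge."""
    return (t_clk_to_q + t_logic_min) - (t_hold + skew)


if __name__ == "__main__":
    # Hypothetical path: 2 ns clock, 0.2 ns clock-to-Q, 1.5 ns worst-case logic,
    # 0.05 ns best-case logic, 0.1 ns setup time, 0.1 ns hold time.
    for skew in (0.0, -0.3, 0.3):
        s = setup_slack(2.0, skew, 0.2, 1.5, 0.1)
        h = hold_slack(skew, 0.2, 0.05, 0.1)
        print(f"skew {skew:+.1f} ns: setup slack {s:+.2f} ns, hold slack {h:+.2f} ns")
```

A setup violation can be removed by lowering the clock frequency, whereas the hold check does not depend on the clock period, which is why a hold violation can stop the chip from working at any speed.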
3.3 Difference Between Time Skew and Length Skew in a Clock Tree
Clock tree synthesis is a crucial step in a chip’s physical design. The quality of the
clock tree has a great impact on the status of timing closure. One of the critical metrics in
measuring the clock tree quality is the time skew, which is the maximum arrive time
difference among the clock sinks. Physically, the time skew is caused by the different
locations of the clock sinks on chip. Figure 3.5 is an abstract view of the physical
locations of a clock tree’s leaf cells. Figure 3.6 presents the same information in a real
layout .As seen, the clock sinks are spread within a certain region. From the clock source
to various clock sinks, the physical distances are different. Hence, when connections are
completed by metal routing, the wire lengths are not the same. The maximum wire length
difference is referred to as length skew .Physically, the clock tree is composed of clock
buffers and routing wires. Therefore, the time delay from the clock source to any clock
sink is affected by two factors: the gate’s delay and the wire’s delay. Since these two
types of delay scale differently among different process, temperature, and voltage
(PTV) conditions, a time-balanced clock tree in one PTV corner might experience
significant time skew in another PTV corner if the clock tree is constructed with a
considerable amount of length skew. This scenario is worsened when the process
geometry becomes smaller because wire delay carries more weight in the total delay
equation. Ideally, among different branches of a clock tree, it is desired to match gate
delay with gate delay and wire delay with wire delay. In other words, time skew should
be minimized by using the approach of minimizing the length skew such that the amount
of time skew is preserved over different PTV conditions. This is especially helpful for the
on-chip variation (OCV) optimization. Figure 3.7 depicts the relationship between time
skew and length skew for the same clock tree in Figure 3.5. As shown in this space–time
plot, this clock tree has six levels. Any vertical line in this plot represents a gate
delay, since a gate contributes time delay but no length. Wire delays are expressed by nearly
horizontal lines, which have a large length difference but small time difference. At Level
4 and Level 5, the clock tree starts to grow different branches. Consequently, the length
skew is seen at these levels. The time skew for this tree is ~30 ps, whereas the length
skew is approximately 250 µm. Figure 3.7 is the same space–time relationship in a
three-dimensional (3D) world. Figure 3.8 is the 3D plot of a very large clock tree with 23,942
sinks. The time skew discussed above is called global skew, which is usually pessimistic.
A more specific term, local skew, is defined as the time difference for the clock signals to
reach the sinks that have data exchange activities among them. Local skew is more
precise and useful for circuit analysis but the extraction of necessary information for
processing is beyond the capability of current tools.
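The three skew measures defined above can be written compactly. The following is only a sketch under assumed notation that does not appear in the original text: S is the set of clock sinks, t_i is the clock arrival time at sink i measured from the clock source, l_i is the routed wire length from the source to sink i, and E is the set of sink pairs that exchange data. The last two lines use a deliberately simplified model in which every gate delay g_i scales by a factor alpha_c and every wire delay w_i by a factor beta_c at PTV corner c; real delays do not scale this uniformly.

```latex
\begin{align*}
\text{global (time) skew} &= \max_{i \in S} t_i - \min_{i \in S} t_i \\
\text{length skew}        &= \max_{i \in S} \ell_i - \min_{i \in S} \ell_i \\
\text{local skew}         &= \max_{(i,j) \in E} \lvert t_i - t_j \rvert
                             \;\le\; \text{global skew} \\[4pt]
% Assumed linear corner model: gate delay g_i and wire delay w_i scale separately
t_i^{(c)}                 &= \alpha_c\, g_i + \beta_c\, w_i \\
t_i^{(c)} - t_j^{(c)}     &= \alpha_c\,(g_i - g_j) + \beta_c\,(w_i - w_j)
\end{align*}
```

Under this model, two branches with g_i = g_j and w_i = w_j stay balanced at every corner. Two branches that are balanced only in total at one corner, with a gate-delay difference cancelling a wire-delay difference, become unbalanced as soon as the ratio of alpha_c to beta_c changes. This is the reason the text recommends matching gate delay with gate delay and wire delay with wire delay.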
Figure 9 Cell based ASIC design methodology.
Figure 10 Abstract view of the physical distribution of a clock sink.
Figure 11 Layout view of the physical distribution of a clock sink.
Figure 12 The clock tree in a three-dimensional space–time plot.
Figure 13 A large clock tree of 23,942 sinks.
CHAPTER 4
4.1 VLSI
4.1.1 INTRODUCTION
Very-large-scale integration (VLSI) is the process of creating integrated
circuits by combining thousands of transistor-based circuits into a single chip. VLSI began in
the 1970s when complex semiconductor and communication technologies were being
developed. The microprocessor is a VLSI device. The term is no longer as common as it once
was, as chips have increased in complexity into the hundreds of millions of transistors.
4.1.2 Overview
The first semiconductor chips held one transistor each. Subsequent advances
added more and more transistors, and, as a consequence, more individual functions or
systems were integrated over time. The first integrated circuits held only a few devices,
perhaps as many as ten diodes, transistors, resistors and capacitors, making it possible to
fabricate one or more logic gates on a single device. This stage is now known retrospectively as
"small-scale integration" (SSI). Improvements in technique led to devices with hundreds of logic
gates, known as medium-scale integration (MSI), and then to large-scale integration (LSI), i.e.
systems with at least a thousand logic gates.
Current technology has moved far past this mark and today's microprocessors have many
millions of gates and hundreds of millions of individual transistors.
At one time, there was an effort to name and calibrate various levels of large-scale
integration above VLSI. Terms like Ultra-large-scale Integration (ULSI) were used. But the
huge number of gates and transistors available on common devices has rendered such fine
distinctions moot. Terms suggesting greater than VLSI levels of integration are no longer in
widespread use. Even VLSI is now somewhat quaint, given the common assumption that all
microprocessors are VLSI or better.
As of early 2008, billion-transistor processors are commercially available, an example
of which is Intel's Montecito Itanium chip. This is expected to become more commonplace as
semiconductor fabrication moves from the current generation of 65 nm processes to the next
45 nm generation (while experiencing new challenges such as increased variation across
process corners). Another notable example is NVIDIA’s 280 series GPU.
This processor is unusual in that its 1.4 billion transistor count, capable
of a teraflop of performance, is almost entirely dedicated to logic (Itanium's transistor count
is largely due to the 24 MB L3 cache). Current designs, as opposed to the earliest devices, use
extensive design automation and automated logic synthesis to lay out the transistors, enabling
higher levels of complexity in the resulting logic functionality. Certain high-performance
logic blocks like the SRAM cell, however, are still designed by hand to ensure the highest
efficiency (sometimes by bending or breaking established design rules to obtain the last bit of
performance by trading stability).
4.1.3 What is VLSI?
VLSI stands for "Very Large Scale Integration". This is the field which involves
packing more and more logic devices into smaller and smaller areas.
1. Simply put, an integrated circuit is many transistors on one chip.
2. Design and manufacturing of extremely small, complex circuitry using modified
semiconductor material.
3. An integrated circuit (IC) may contain millions of transistors, each a few µm in size.
4. Applications are wide ranging: most electronic logic devices.
4.1.4 History of Scale Integration
late 1940s   Transistor invented at Bell Labs
late 1950s   First IC (Jack Kilby at TI)
early 1960s  Small Scale Integration (SSI): 10s of transistors on a chip
late 1960s   Medium Scale Integration (MSI): 100s of transistors on a chip
early 1970s  Large Scale Integration (LSI): 1000s of transistors on a chip
early 1980s  VLSI: 10,000s of transistors on a chip (later 100,000s and now 1,000,000s)
Ultra LSI (ULSI) is sometimes used for 1,000,000s
SSI - Small-Scale Integration (0 to 10^2)
MSI - Medium-Scale Integration (10^2 to 10^3)
LSI - Large-Scale Integration (10^3 to 10^5)
VLSI - Very Large-Scale Integration (10^5 to 10^7)
ULSI - Ultra Large-Scale Integration (>= 10^7)
4.1.5 Advantages of ICs over discrete components
While we will concentrate on integrated circuits, the properties of integrated circuits
(what we can and cannot efficiently put in an integrated circuit) largely determine the
architecture of the entire system. Integrated circuits improve system characteristics in several
critical ways. ICs have three key advantages over digital circuits built from discrete
components:
Size. Integrated circuits are much smaller: both transistors and wires are shrunk to micrometer
sizes, compared to the millimetre or centimetre scales of discrete components. Small size
leads to advantages in speed and power consumption, since smaller components have smaller
parasitic resistances, capacitances, and inductances.
Speed. Signals can be switched between logic 0 and logic 1 much quicker within a chip than
they can between chips. Communication within a chip can occur hundreds of times faster
than communication between chips on a printed circuit board. The high speed of on-chip
circuits is due to their small size: smaller components and wires have smaller parasitic
capacitances to slow down the signal.
Power consumption. Logic operations within a chip also take much less power. Once again,
lower power consumption is largely due to the small size of circuits on the chip: smaller
parasitic capacitances and resistances require less power to drive them.
VLSI and systems
These advantages of integrated circuits translate into advantages at the system level:
Smaller physical size. Smallness is often an advantage in itself; consider portable televisions
or handheld cellular telephones.
Lower power consumption. Replacing a handful of standard parts with a single chip reduces
total power consumption. Reducing power consumption has a ripple effect on the rest of the
system: a smaller, cheaper power supply can be used; since less power consumption means
less heat, a fan may no longer be necessary; a simpler cabinet with less
electromagnetic shielding may be feasible, too.
Reduced cost. Reducing the number of components, the power supply requirements, cabinet
costs, and so on, will inevitably reduce system cost. The ripple effect of integration is such
that the cost of a system built from custom ICs can be less, even though the individual ICs
cost more than the standard parts they replace.
Understanding why integrated circuit technology has such profound influence on the
design of digital systems requires understanding both the technology of IC manufacturing and
the economics of ICs and digital systems.
Applications
Electronic systems in cars
Digital electronics that control VCRs
Transaction processing systems, ATMs
Personal computers and workstations
Medical electronic systems
Etc.
4.1.6 Applications of VLSI
Electronic systems now perform a wide variety of tasks in daily life. Electronic
systems in some cases have replaced mechanisms that operated mechanically, hydraulically,
or by other means; electronics are usually smaller, more flexible, and easier to service. In
other cases electronic systems have created totally new applications. Electronic systems
perform a variety of tasks, some of them visible, some more hidden:
Personal entertainment systems such as portable MP3 players and DVD players perform
sophisticated algorithms with remarkably little energy.
Electronic systems in cars operate stereo systems and displays; they also control fuel
injection systems, adjust suspensions to varying terrain, and perform the control functions
required for anti-lock braking (ABS) systems.
Digital electronics compress and decompress video, even at high-definition data rates, on-the-
fly in consumer electronics.
Low-cost terminals for Web browsing still require sophisticated electronics, despite their
dedicated function.
Personal computers and workstations provide word-processing, financial analysis, and games.
Computers include both central processing units (CPUs) and special-purpose hardware for
disk access, faster screen display, etc.
Medical electronic systems measure bodily functions and perform complex processing
algorithms to warn about unusual conditions. The availability of these complex systems, far
from overwhelming consumers, only creates demand for even more complex systems.
The growing sophistication of applications continually pushes the design and
manufacturing of integrated circuits and electronic systems to new levels of complexity. And
perhaps the most amazing characteristic of this collection of systems is its variety: as systems
become more complex, we build not a few general-purpose computers but an ever wider
range of special-purpose systems. Our ability to do so is a testament to our growing mastery
of both integrated circuit manufacturing and design, but the increasing demands of customers
continue to test the limits of design and manufacturing.
4.2 VHDL
4.2.1 Introduction
VHDL is an acronym for Very High Speed Integrated Circuit Hardware
Description Language. The language can be used to model a digital system at many levels
of abstraction, ranging from the algorithmic level to the gate level. The complexity of the
digital system being modeled could vary from that of a simple gate to a complete digital
electronic system. The VHDL language can be regarded as an integrated amalgamation
of sequential, concurrent, netlist and waveform generation languages and timing
specifications.
4.2.2 History of VHDL
VHDL stands for VHSIC (Very High Speed Integrated Circuit) Hardware Description
Language. It was developed in the 1980s as a spin-off of a high-speed integrated circuit
research project funded by the US Department of Defense. During the VHSIC program,
researchers were confronted with the daunting task of describing circuits of enormous scale
(for their time) and of managing very large circuit design problems that involved multiple
teams of engineers. With only gate-level tools available, it soon became clear that more
structured design methods and tools would be needed.
To meet this challenge, teams of engineers from three companies (IBM, Texas
Instruments and Intermetrics) were contracted by the Department of Defense to complete
the specification and implementation of a new language-based design description method.
The first publicly available version of VHDL, version 7.2, was released in 1985. In 1986, the
IEEE was presented with a proposal to standardize the language, which it did in 1987 after
substantial enhancements and modifications were made by a team of commercial, government and
academic representatives. The resulting standard, IEEE 1076-1987, is the basis for virtually
every simulation and synthesis product sold today. An enhanced and updated version of the
language, IEEE 1076-1993, was released in 1994, and VHDL tool vendors have been
responding by adding these new language features to their products.
Although IEEE standard 1076 defines the complete VHDL language, there are aspects
of the language that make it difficult to write completely portable design descriptions
(descriptions that can be simulated identically using different vendors’ tools). The problem
stems from the fact that VHDL supports many abstract data types, but it does not address the
simple problem of characterizing different signal strengths or commonly used simulation
conditions such as unknowns and high impedances. Soon after IEEE 1076-1987 [3] was
adopted, simulator companies began enhancing VHDL with new non-standard types to allow
their customers to accurately simulate complex electronic circuits. This caused problems
because design descriptions entered into one simulator were often incompatible
with other environments. VHDL was quickly becoming a non-standard.
To get around the problem of non-standard data types, an IEEE committee adopted
another standard. This standard, numbered 1164, defines a standard package (a VHDL feature
that allows commonly used declarations to be collected into an external library) containing
definitions for a standard nine-value data type. This standard data type is called standard logic,
and the IEEE 1164 package is often referred to as the standard logic package.
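As an illustration of how the standard logic package is typically used, the sketch below declares a small tri-state driver with std_logic ports. The entity name tristate_buf and its port names are hypothetical and are not taken from this project. The nine values of std_logic are listed in the comment.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- The nine std_logic values defined by IEEE 1164:
--   'U' uninitialized   'X' forcing unknown   '0' forcing 0
--   '1' forcing 1       'Z' high impedance    'W' weak unknown
--   'L' weak 0          'H' weak 1            '-' don't care

-- Hypothetical tri-state driver (names are assumptions for illustration).
entity tristate_buf is
  port (
    data_in : in  std_logic;
    enable  : in  std_logic;
    bus_out : out std_logic
  );
end entity tristate_buf;

architecture dataflow of tristate_buf is
begin
  -- Drive the bus when enabled; otherwise release it to high impedance.
  bus_out <= data_in when enable = '1' else 'Z';
end architecture dataflow;
```

Because std_logic is a resolved type, several such drivers can share one signal, and a simulator that follows the standard will report 'X' when two enabled drivers disagree. The bit type has no value for high impedance or for an unknown, which is the gap the standard logic package fills.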
The IEEE 1076-1987 and IEEE 1164 standards together form the complete VHDL
standard in widest use today (IEEE 1076-1993 is slowly working its way into the VHDL
mainstream, but it does not add a significant number of features for synthesis users).
In the search for a standard design and documentation tool for the Very High Speed
Integrated Circuits (VHSIC) program, the United States Department of Defense (DoD)
sponsored a workshop on HDLs at Woods Hole, Massachusetts, in the summer of 1981. The
conclusion of the workshop was the need for a standard language, and the features that might
be required by such a standard. In 1983, the DoD established requirements for a standard VHSIC
hardware description language (VHDL), based on the recommendations of the “Woods Hole”
workshop. A contract for the development of the VHDL language, its environment, and its
software was awarded to IBM, Texas Instruments and Intermetrics. VHDL 2.0 was released
only six months after the project began. The language was significantly improved thereafter
and other shortcomings were corrected, leading to the release of VHDL 6.0. In 1985, these
significant developments led to the release of the VHDL 7.2 language reference manual. This was
later developed into the IEEE 1076/A VHDL language reference manual.
Efforts to define a new version of VHDL were started in 1990 by a team of volunteers
working under the IEEE DASC (Design Automation Standards Committee). In October of
1992, a new VHDL’93 was completed and was released for review. After minor
modifications, this new version was approved by the VHDL balloting group members and
became the new VHDL language standard. The present VHDL standard is formally referred
to as VHDL 1076-1993.
4.2.3 Levels of abstraction (Styles)
VHDL supports many possible styles of design description. These styles differ
primarily in how closely they relate to the underlying hardware. When we speak of the
different styles of VHDL, then, we are really talking about the differing levels of abstraction
possible using the language. To give an example, it is possible to describe a counter circuit in
a number of ways. At the lowest level of abstraction, you could use VHDL's hierarchy
features to connect a sequence of predefined logic gates and flip-flops to form a counter
circuit.
Figure 14 Levels of abstraction
The highest level of abstraction supported in VHDL is called the behavioural level of
abstraction. When creating a behavioural description of a circuit, you will describe your
circuit in terms of its operation over time. The concept of time is the critical distinction
between behavioural descriptions of circuits and lower-level descriptions.
In a behavioural description, the concept of time may be expressed precisely, with
actual delays between related events, or may simply be an ordering of operations that are
expressed sequentially. When you are writing VHDL for input to synthesis tools, you may
use behavioural statements in VHDL to imply that there are registers in your circuit. It is
unlikely, however, that your synthesis tool will be capable of creating precisely the same
behaviour in actual circuitry as you have defined in the language.
If you are familiar with event-driven software programming languages, then writing
behaviour-level VHDL will not seem like anything new. Just like a programming language,
you will be writing one or more small programs that operate sequentially and communicate
with one another through their interfaces. The only difference between behaviour-level
VHDL and a software programming language such as Visual Basic is the underlying
execution platform: in the case of Visual Basic, it is the Windows operating system; in the
case of VHDL, it is a simulator.
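A behavioural description of the counter mentioned earlier might look like the following. This is a minimal sketch and not the design used in this project: the entity name counter4, the 4-bit width and the synchronous reset are all assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 4-bit counter (name, width and reset style are assumptions).
entity counter4 is
  port (
    clk   : in  std_logic;
    reset : in  std_logic;
    count : out std_logic_vector(3 downto 0)
  );
end entity counter4;

architecture behavioural of counter4 is
begin
  -- One small sequential "program" that runs on every clock event.
  count_proc : process (clk)
    variable cnt : unsigned(3 downto 0) := (others => '0');
  begin
    if rising_edge(clk) then
      if reset = '1' then
        cnt := (others => '0');   -- synchronous reset
      else
        cnt := cnt + 1;           -- wraps from 15 back to 0
      end if;
      count <= std_logic_vector(cnt);
    end if;
  end process count_proc;
end architecture behavioural;
```

Nothing in the process names a flip-flop. The registers are implied by the fact that cnt is only updated on a rising clock edge, which is how behavioural statements imply registers to a synthesis tool.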
An alternate design method, in which a circuit design problem is segmented into
registers and combinational input logic, is what is often called the dataflow level of
abstraction. Dataflow is an intermediate level of abstraction that allows the drudgery of
combinational logic to be hidden while the more important parts of the circuit, the registers,
are more completely specified.
There are some drawbacks to using a purely dataflow method of design in VHDL.
First, there are no built-in registers in VHDL; the language was designed to be general-purpose,
and VHDL’s designers placed the emphasis on its behavioural aspects. If you are
going to write VHDL at the dataflow level of abstraction, then you must first create
behavioural descriptions of the register elements that you will be using in your design. These
elements must be provided in the form of components or in the form of subprograms.
But for hardware designers, for whom it can be difficult to relate the sequential
descriptions and operation of behavioural VHDL with the hardware that is being described,
using the dataflow level of abstraction can make quite a lot of sense. Using dataflow, it can
be easier to relate a design description to actual hardware devices.
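The dataflow approach can be sketched with the same counter split into a register element and its combinational input logic. Since VHDL has no built-in register, the register is first described behaviourally as its own entity. The names reg4 and counter4_df are hypothetical, and the sketch assumes both units are compiled into the library work.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Register element, described behaviourally because VHDL has no built-in register.
entity reg4 is
  port (
    clk : in  std_logic;
    d   : in  std_logic_vector(3 downto 0);
    q   : out std_logic_vector(3 downto 0)
  );
end entity reg4;

architecture behavioural of reg4 is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end process;
end architecture behavioural;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical dataflow-style counter built around the register above.
entity counter4_df is
  port (
    clk   : in  std_logic;
    reset : in  std_logic;
    count : out std_logic_vector(3 downto 0)
  );
end entity counter4_df;

architecture dataflow of counter4_df is
  signal current_count : std_logic_vector(3 downto 0);
  signal next_count    : std_logic_vector(3 downto 0);
begin
  -- Combinational input logic as a concurrent signal assignment.
  next_count <= "0000" when reset = '1'
                else std_logic_vector(unsigned(current_count) + 1);

  -- The register is stated explicitly (VHDL-93 direct entity instantiation).
  state_reg : entity work.reg4
    port map (clk => clk, d => next_count, q => current_count);

  count <= current_count;
end architecture dataflow;
```

Here the register is visible as a named instance and the next-state logic is a single equation, so the description maps more directly onto hardware than the behavioural style does. In simulation the count stays unknown until reset has been asserted for one clock edge.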
The dataflow and behaviour levels of abstraction are used to describe circuits in terms
of their logical function. There is a third style of VHDL that is used to combine such
descriptions together into a larger, hierarchical circuit description.
Structural VHDL allows you to encapsulate one part of a design description as a re-
usable component. Structural VHDL can be thought of as being analogous to a textual
schematic, or as a textual block diagram for higher-level design.
4.2.4 Need for VHDL
The complex and laborious manual procedures for the design of hardware
have paved the way for the development of languages for high-level description of
digital systems. This high-level description can serve as documentation for the part as well
as an entry point into the design process. The high-level description can be processed
by synthesis tools to target various boards or gate arrays. A hardware description
language is such a language. VHDL was designed as a solution to provide an integrated
design and documentation language to communicate design data between various levels of
abstraction.
4.2.5 Advantages of VHDL
VHDL allows quick description and synthesis of circuits of 5, 10 or 20 thousand
gates. The following are the major advantages of VHDL over other hardware description
languages:
• Power and flexibility: VHDL has powerful language constructs which allow code descriptions
of complex control logic.
• Device-independent design: VHDL creates designs that fit into many device architectures, and it
also permits multiple styles of design description.
• Portability: VHDL’s portability permits the design description to be used on different
simulators and synthesis tools. Thus VHDL design descriptions can be used in multiple
projects.
• ASIC migration: The efficiency of VHDL allows a design to be synthesized on a CPLD or an
FPGA. Sometimes the code can be used with the ASIC.
• Quick time to market and low cost: VHDL and programmable logic together facilitate a
speedy design process. VHDL permits designs to be described quickly.
Programmable logic eliminates expenses and facilitates quick design iterations.
• The language can be used as a communication medium between different Computer Aided
Design (CAD) and Computer Aided Engineering (CAE) tools.
• The language supports hierarchy, i.e., a digital system can be modelled as a set of
interconnected components; each component, in turn, can be modelled as a set of
interconnected subcomponents.
• The language supports flexible design methodologies: top-down, bottom-up, or mixed.
• The language is technology independent and hence the same behaviour model can be
synthesized into different vendor libraries.
• Various digital modelling techniques such as finite-state machine descriptions, algorithmic
descriptions and Boolean equations can be modelled using the language.
• It supports both synchronous and asynchronous timing models.
• It is an IEEE and ANSI standard, and therefore, models described using this language are
portable.
• There are no limitations that are imposed by the language on the size of the design.
• The language has elements that make large-scale design modelling easier, e.g.
components, functions, procedures and packages.
• Test benches can be written using the same language to test other VHDL models.
• Nominal propagation delays, min-max delays, setup and hold timing, timing constraints, and
spike detection can all be described very naturally in this language.
• Behavioural models that conform to a certain synthesis description style are capable of being
synthesized to gate-level description.
• The capability of defining new data types provides the power to describe and simulate a new
design technique at a very high level of abstraction without any concern about
implementation details.
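Two of the points in the list above, test benches written in VHDL itself and propagation delays described directly in the language, can be sketched together. The and2 gate, its 2 ns delay and the test bench and2_tb are hypothetical examples that are not part of this project. The sketch assumes both units are compiled into the library work.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical gate model with a nominal propagation delay.
entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end entity and2;

architecture timed of and2 is
begin
  y <= a and b after 2 ns;   -- 2 ns is an assumed figure
end architecture timed;

library ieee;
use ieee.std_logic_1164.all;

-- A test bench has no ports; it drives and checks the model under test.
entity and2_tb is
end entity and2_tb;

architecture test of and2_tb is
  signal a, b : std_logic := '0';
  signal y    : std_logic;
begin
  dut : entity work.and2
    port map (a => a, b => b, y => y);

  stimulus : process
  begin
    wait for 5 ns;                 -- let the output settle to '0'
    a <= '1'; b <= '1';
    wait for 1 ns;                 -- 1 ns after the change: still the old value
    assert y = '0' report "y changed before the propagation delay" severity error;
    wait for 2 ns;                 -- 3 ns after the change: new value visible
    assert y = '1' report "y should be '1' when a and b are '1'" severity error;
    a <= '0';
    wait for 3 ns;
    assert y = '0' report "y should be '0' when a is '0'" severity error;
    wait;                          -- suspend forever; the stimulus is finished
  end process stimulus;
end architecture test;
```

The assert statements make the test bench self-checking: a simulator reports a message only when the model's output departs from the expected timing or value.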
4.2.6 Design methodology using VHDL
There are three design methodologies, namely bottom-up, top-down and flat:
• The bottom-up approach involves defining and designing the individual components, then
bringing the individual components together to form the overall design.
• In a flat design the functional components are defined at the same level as the interconnection
of those functional components.
• A top-down design process involves a divide-and-conquer approach to implement the design of a
large system. Top-down design is referred to as recursive partitioning of a system into its
sub-components until all sub-components become manageable design parts. Design of a
component is manageable if the component is available as part of a library, it can be
implemented by modifying an already available part, or it can be described for a synthesis
program or an automatic hardware generator.
4.2.7 Elements of VHDL
Constructs of the VHDL language are designed for describing hardware components,
packaging parts and utilities for use in libraries, and for specifying design libraries and
parameters. In its simplest form, the description of a component in VHDL consists of an
interface specification and an architectural specification. The interface description begins
with the entity keyword and contains the input-output ports of the component. An architectural
specification begins with the architecture keyword and describes the functionality of the
component.
This functionality depends on input-output signals and other parameters that are
specified in the interface description. Several architectural specifications with different
identifiers can exist for one component with a given interface description. VHDL allows
architecture to be configured for a specific technology environment.
In a hardware design environment it becomes necessary to group components
or utilities used for description of components. Components and such utilities can be
grouped by use of packages. A package declaration contains components and utilities to be
made visible to entities and architectures. VHDL allows the use of libraries and binding of
sub-components of a design to elements of various libraries. Constructs for such applications
include a library statement and configurations.
4.2.8 VHDL language features
The various building blocks and constructs in VHDL which have been used are:
4.2.8.1 Entity
Every VHDL design description consists of at least one entity. In VHDL, an entity
declaration describes the circuit as it appears from the "outside", from the perspective of its
input and output interfaces.
An entity declaration in VHDL provides the complete interface for a circuit. Using the
information provided in an entity declaration (the port names and the data type and direction
of each port), you have all the information you need to connect that portion of a circuit into
other, higher-level circuits.
The entity declaration includes a name (compare, for a comparator) and a port statement defining all the
inputs and outputs of the entity. Each of the ports is given a direction (in, out or inout).
• Formal Definition
It is the hardware abstraction of a digital system. Entity declaration describes the
external view of the entity to the outside world.
Simplified syntax:
entity entity-name is
[generic (generic-list);]
port (port-list);
end entity-name;
• Description
All designs are expressed in terms of entities. Entity is the most basic building block
in a design. The uppermost level of the design is the top-level entity. If the design is
hierarchical, then the top-level description will have lower-level descriptions contained in
it. These lower-level descriptions will be lower-level entities contained in the top-level
entity description.
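As a sketch of the syntax above, an entity declaration for the comparator named compare could be written as follows. The port names a, b and eq, the generic WIDTH and its default of 8 bits are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical comparator interface (generic and port names are assumptions).
entity compare is
  generic (
    WIDTH : positive := 8          -- operand width in bits
  );
  port (
    a  : in  std_logic_vector(WIDTH - 1 downto 0);
    b  : in  std_logic_vector(WIDTH - 1 downto 0);
    eq : out std_logic             -- intended to be '1' when a equals b
  );
end entity compare;
```

Note that the generic clause comes before the port clause, so that the port widths can depend on the generic. The declaration says nothing about how eq is computed; that belongs to the architecture.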
4.2.8.2 Architecture
Every entity in a VHDL design description must be bound with a corresponding
architecture. The architecture describes the actual function of the entity to which it is bound.
Using the schematic as a metaphor, you can think of the architecture as being roughly
analogous to a lower-level schematic pointed to by the higher-level functional block symbol.
The second part of a minimal VHDL source file is the architecture declaration. Every
entity declaration you write must be accompanied by at least one corresponding architecture.
The architecture declaration begins with a unique name, followed by the name of the
entity to which the architecture is bound. Within the architecture declaration is found the
actual functional description of our comparator. There are many ways to describe
combinational logic functions in VHDL.
• Formal Definition
A body associated with an entity declaration to describe the internal organization or
operation of a design entity. An architecture body is used to describe the behavior, dataflow
or structure of a design entity.
• Simplified syntax
Architecture architecture-name of entity-name is
Architecture-declarations
Begin
Concurrent-statements
End [architecture] [architecture-name];
• Description
An architecture assigned to an entity describes the internal relationship between the input
and output ports of the entity. It consists of two parts: declarations and concurrent
statements. The first (declarative) part of an architecture may contain declarations of types, signals,
constants, subprograms (functions and procedures), components and groups.
Concurrent statements in the architecture body define the relationship between inputs
and outputs. This relationship can be specified using different types of statements:
concurrent signal assignment, process statement, component instantiation, concurrent
procedure call, generate statement, concurrent assertion statement, and block statement. An
architecture can be written in different styles: structural, dataflow, behavioral (functional), or mixed.
The description of a structural body is based on component instantiation and
generate statements. It allows creating hierarchical projects, from simple gates to very
complex components, describing entire subsystems. The connections among components are
realized through ports.
The dataflow description is built with concurrent signal assignment statements. Each
of the statements can be activated when any of its input signals changes its value.
A behavioral architecture body describes only the expected functionality (behavior) of
the circuit, without any direct indication as to the hardware implementation. Such a
description consists only of one or more processes, each of which contains sequential
statements. An architecture body may also contain statements that define both behavior and
structure of the circuit at the same time. Such an architecture description is called mixed.
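As an illustration of the styles described above, the following minimal sketch gives one entity two interchangeable architectures, one in dataflow style and one in behavioral style. The entity name mux2 and its ports are hypothetical and chosen only for this example; they are not part of the project code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 2-to-1 multiplexer, used only to contrast description styles.
entity mux2 is
  port (a, b, sel : in  std_logic;
        y         : out std_logic);
end entity mux2;

-- Dataflow style: a single concurrent signal assignment.
-- It is re-evaluated whenever a, b or sel changes.
architecture dataflow of mux2 is
begin
  y <= a when sel = '0' else b;
end architecture dataflow;

-- Behavioral style: a process containing sequential statements.
architecture behavioral of mux2 is
begin
  process (a, b, sel) is
  begin
    if sel = '0' then
      y <= a;
    else
      y <= b;
    end if;
  end process;
end architecture behavioral;
```

Both architectures are bound to the same entity declaration, which is why an entity may have more than one architecture.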
4.2.8.3 Component declaration
• Formal Definition
A component declaration declares a virtual design entity interface that may be used in
component instantiation statement.
• Simplified syntax
component component-name [is]
[generic (generic-list);]
[port (port-list);]
end component [component-name];
Component instantiation
• Formal Definition
A component instantiation statement defines a subcomponent of the design entity in
which it appears, associates signals or values with the ports of that subcomponent, and
associates values with generics of that subcomponent.
• Simplified syntax
label: [component] component-name
[generic map (generic-association-list)]
port map (port-association-list);
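The following sketch shows component declaration and component instantiation together in a structural architecture. The entities and2, xor2 and half_adder are hypothetical names introduced only for this illustration, and no generics are used, so the generic map is omitted.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical leaf cell: 2-input AND gate.
entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end entity and2;

architecture dataflow of and2 is
begin
  y <= a and b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical leaf cell: 2-input XOR gate.
entity xor2 is
  port (a, b : in std_logic; y : out std_logic);
end entity xor2;

architecture dataflow of xor2 is
begin
  y <= a xor b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

entity half_adder is
  port (a, b       : in  std_logic;
        sum, carry : out std_logic);
end entity half_adder;

architecture structural of half_adder is
  -- Component declarations: the interfaces the instances below refer to.
  component and2 is
    port (a, b : in std_logic; y : out std_logic);
  end component and2;

  component xor2 is
    port (a, b : in std_logic; y : out std_logic);
  end component xor2;
begin
  -- Component instantiations: label, component name, port map.
  u_sum   : xor2 port map (a => a, b => b, y => sum);
  u_carry : component and2 port map (a => a, b => b, y => carry);
end architecture structural;
```

With no explicit configuration, each instance is bound by default to the entity in the working library whose name matches the component name.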
4.2.8.4 Configuration declaration
• Formal Definition
A configuration is a construct that defines how component instances in a given block are
bound to design entities in order to describe how design entities are put together to form a
complete design.
• Simplified syntax
configuration configuration-name of entity-name is
configuration-declarations
for architecture-name
for instance-label: component-name
use entity library-name.entity-name(arch-name);
end for;
end for;
end [configuration] [configuration-name];
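A minimal sketch of a configuration declaration follows. It binds two instances of the same component to different architectures of one entity. All names (inv, two_inv, ideal, delayed, two_inv_cfg) are hypothetical and serve only to show the syntax in use.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical inverter with two alternative architectures.
entity inv is
  port (a : in std_logic; y : out std_logic);
end entity inv;

architecture ideal of inv is
begin
  y <= not a;               -- zero-delay model
end architecture ideal;

architecture delayed of inv is
begin
  y <= not a after 1 ns;    -- model with an assumed 1 ns delay
end architecture delayed;

library ieee;
use ieee.std_logic_1164.all;

-- Two inverters in series.
entity two_inv is
  port (a : in std_logic; y : out std_logic);
end entity two_inv;

architecture structural of two_inv is
  component inv is
    port (a : in std_logic; y : out std_logic);
  end component inv;
  signal mid : std_logic;
begin
  u1 : inv port map (a => a,   y => mid);
  u2 : inv port map (a => mid, y => y);
end architecture structural;

-- The configuration selects an architecture for each instance.
configuration two_inv_cfg of two_inv is
  for structural
    for u1 : inv
      use entity work.inv(ideal);
    end for;
    for u2 : inv
      use entity work.inv(delayed);
    end for;
  end for;
end configuration two_inv_cfg;
```

Simulating two_inv_cfg rather than two_inv directly applies these bindings.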
Configuration instantiation
• Formal Definition
A component instantiation statement defines a subcomponent of the design entity in
which it appears, associates signals or values with the ports of that subcomponent, and
associates values with generics of that subcomponent. In this form the subcomponent is
named through a configuration.
• Simplified syntax
label: configuration configuration-name
[generic map (generic-association-list)]
[port map (port-association-list)];
4.2.8.5 Package
• Formal Definition
A package declaration defines the interface to a package.
• Simplified syntax
package package-name is
package-declarations
end [package] [package-name];
Package body
• Formal Definition
A package body defines the bodies of subprograms and the values of deferred
constants declared in the interface to the package.
• Simplified syntax
package body package-name is
package-body-declarations
subprogram-bodies
end [package body] [package-name];
4.2.8.6 Attributes
Attributes are of two types: user defined and predefined.
User defined
• Formal Definition
A value, function, type, range, signal, or constant that may be associated with one or
more named entities in a description.
• Simplified syntax
attribute attribute-name: type; -- attribute declaration
attribute attribute-name of item: item-class is expression; -- attribute specification
• Description
Attributes allow retrieving information about named entities: types, objects,
subprograms, etc. Users can define new attributes and then assign them to named entities by
specifying the entity and the attribute value for it.
Predefined
• Formal Definition
A value, function, type, range, signal, or constant that may be associated with one or
more named entities in a description.
• Simplified syntax
object'attribute-name
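The sketch below exercises a few predefined attributes ('length, 'high, 'low, 'left) and one user-defined attribute on a signal. The entity attr_demo and the attribute name max_fanout are hypothetical and exist only for this illustration; the assertions are meant to be checked in a simulator.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity attr_demo is
end entity attr_demo;

architecture sim of attr_demo is
  signal data : std_logic_vector(7 downto 0) := (others => '0');

  -- User-defined attribute: declaration followed by specification.
  attribute max_fanout : integer;               -- hypothetical attribute
  attribute max_fanout of data : signal is 16;
begin
  check : process is
  begin
    -- Predefined attributes of an array object.
    assert data'length = 8 report "length should be 8" severity failure;
    assert data'high = 7   report "high should be 7"   severity failure;
    assert data'low = 0    report "low should be 0"    severity failure;
    assert data'left = 7   report "left should be 7"   severity failure;

    -- The user-defined attribute is read with the same tick notation.
    assert data'max_fanout = 16 report "fanout should be 16" severity failure;
    wait;  -- stop the process
  end process check;
end architecture sim;
```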
4.2.8.7 Process statement
• Formal Definition
A process statement defines an independent sequential process representing the
behavior of some portion of the design.
• Simplified syntax
[process-label:] process [(sensitivity-list)] [is]
process-declarations
begin
sequential-statements
end process [process-label];
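As a minimal sketch of a labeled process with a sensitivity list, the following describes a D flip-flop with an asynchronous active-low reset. The entity name dff and its port names are assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical D flip-flop with asynchronous, active-low reset.
entity dff is
  port (clk, rst_n, d : in  std_logic;
        q             : out std_logic);
end entity dff;

architecture rtl of dff is
begin
  -- The process resumes only when clk or rst_n changes.
  reg : process (clk, rst_n) is
  begin
    if rst_n = '0' then
      q <= '0';                 -- reset has priority over the clock
    elsif rising_edge(clk) then
      q <= d;                   -- capture d on the rising clock edge
    end if;
  end process reg;
end architecture rtl;
```

The data input d is deliberately left out of the sensitivity list, because the output may change only on a reset or a clock event.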
4.2.8.8 Function
• Formal Definition
A function call is a subprogram of the form of an expression that returns a value.
• Simplified syntax
function function-name (parameters) return type; -- function declaration
function function-name (parameters) return type is -- function definition
begin
sequential-statements
end [function] [function-name];
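A short sketch of a function definition and a function call is given below. The function even_parity and the entity parity_gen are hypothetical; the function is declared in the architecture's declarative part and called from a concurrent signal assignment.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity parity_gen is
  port (data   : in  std_logic_vector(7 downto 0);
        parity : out std_logic);
end entity parity_gen;

architecture rtl of parity_gen is
  -- Function definition: XOR of all bits of the argument.
  function even_parity (v : std_logic_vector) return std_logic is
    variable result : std_logic := '0';
  begin
    for i in v'range loop
      result := result xor v(i);
    end loop;
    return result;
  end function even_parity;
begin
  -- Function call used as an expression.
  parity <= even_parity(data);
end architecture rtl;
```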
4.2.8.9 Port
• Formal Definition
A channel for dynamic communication between a block and its environment.
• Simplified syntax
port (port-declaration; port-declaration; ...);
-- port declarations:
port-signal-name: in port-signal-type [:= initial-value]
port-signal-name: out port-signal-type [:= initial-value]
port-signal-name: inout port-signal-type [:= initial-value]
port-signal-name: buffer port-signal-type [:= initial-value]
port-signal-name: linkage port-signal-type [:= initial-value]
4.2.8.10 Sensitivity list
• Formal Definition
A list of signals a process is sensitive to.
• Simplified syntax
(signal-name, signal-name, ...)
4.2.8.11 Standard logic
• Formal Definition
A nine-value resolved logic type.
std_logic is not a part of the VHDL standard. It is defined in IEEE Std 1164.
• Simplified syntax
type std_ulogic is ('U', -- Uninitialized
'X', -- Forcing Unknown
'0', -- Forcing 0
'1', -- Forcing 1
'Z', -- High Impedance
'W', -- Weak Unknown
'L', -- Weak 0
'H', -- Weak 1
'-'); -- Don't Care
type std_ulogic_vector is array (natural range <>) of std_ulogic;
function resolved (s: std_ulogic_vector) return std_ulogic;
subtype std_logic is resolved std_ulogic;
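To illustrate what "resolved" means in practice, the sketch below drives one std_logic signal from two concurrent assignments and checks the resolved value in simulation. The entity resolve_demo and the signal names are hypothetical; the expected values follow the resolution table of IEEE Std 1164.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity resolve_demo is
end entity resolve_demo;

architecture sim of resolve_demo is
  signal bus_line   : std_logic;
  signal en_a, en_b : std_logic := '0';
begin
  -- Two drivers on the same signal; each releases the line with 'Z'.
  bus_line <= '1' when en_a = '1' else 'Z';
  bus_line <= '0' when en_b = '1' else 'Z';

  check : process is
  begin
    wait for 1 ns;
    assert bus_line = 'Z' report "both released: expected Z" severity failure;

    en_a <= '1';
    wait for 1 ns;
    assert bus_line = '1' report "only A drives: expected 1" severity failure;

    en_b <= '1';
    wait for 1 ns;
    assert bus_line = 'X' report "conflict: expected X" severity failure;

    en_a <= '0';
    wait for 1 ns;
    assert bus_line = '0' report "only B drives: expected 0" severity failure;
    wait;
  end process check;
end architecture sim;
```

The same two assignments on a signal of the unresolved type std_ulogic would be rejected, since that type allows only one driver.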
4.2.9 Data Types
There are many data types available in VHDL. VHDL allows data to be represented
in terms of high-level data types. These data types can represent individual wires in a circuit,
or can represent collections of wires using a concept called an array.
The preceding description of the comparator circuit used the data types bit and
bit_vector for its inputs and outputs. The values '1' and '0' are the only possible values for
the bit data type, and a bit_vector is simply an array of bits. Every data type in
VHDL has a defined set of values and a defined set of valid operations. Type checking is
strict, so it is not possible, for example, to directly assign the value of an integer data type to a
bit_vector data type. (There are ways to get around this restriction, using what are called type
conversion functions.) VHDL is a rich language with many different data types.
The most common data types are listed below:
Bit: a 1-bit value representing a wire. (Note: IEEE standard 1164 defines a 9-valued
replacement for bit called std_logic.)
Bit vector: an array of bits. (Replaced by std_logic_vector in IEEE 1164.)
Boolean: a True/False value.
Integer: a signed integer value, typically implemented as a 32-bit data type.
Real: a floating-point value.
Enumerated: used to create custom data types.
Record: used to group elements of multiple data types as a collection.
Array: can be used to create single or multiple dimension arrays.
Access: similar to pointers in C or Pascal.
File: used to read and write disk files. Useful for simulation.
Physical: used to represent values such as time, voltage, etc. using symbolic units of
measure (such as 'ns' or 'ma').
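The following sketch touches several of the types listed above and shows one common way around strict type checking, namely the conversion functions of the ieee.numeric_std package. The entity types_demo and all type and object names are hypothetical and used only for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity types_demo is
end entity types_demo;

architecture sim of types_demo is
  type state_t is (idle, run, done);                 -- enumerated type
  type packet_t is record                            -- record type
    id      : integer range 0 to 255;
    payload : std_logic_vector(7 downto 0);
  end record packet_t;
  type mem_t is array (0 to 3) of std_logic_vector(7 downto 0);  -- array type
  constant SETUP : time := 5 ns;                     -- physical type
begin
  demo : process is
    variable count : integer := 10;
    variable vec   : std_logic_vector(7 downto 0);
    variable pkt   : packet_t;
    variable mem   : mem_t := (others => (others => '0'));
  begin
    -- vec := count;  -- would be rejected: integer is not a vector type
    vec := std_logic_vector(to_unsigned(count, vec'length));
    assert vec = "00001010" report "conversion failed" severity failure;
    assert to_integer(unsigned(vec)) = count report "round trip failed" severity failure;

    pkt    := (id => 1, payload => vec);
    mem(2) := pkt.payload;
    assert mem(2)(1) = '1' report "bit 1 of 10 should be set" severity failure;

    assert state_t'pos(run) = 1 report "run is the second literal" severity failure;
    assert SETUP = 5000 ps report "unit conversion failed" severity failure;
    wait;
  end process demo;
end architecture sim;
```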
4.2.10 Packages and Package Bodies.
A VHDL package declaration is identified by the package keyword, and is used to
collect commonly used declarations for use globally among different design units. You can
think of a package as being a common storage area, one used to store such things as type
declarations, constants, and global subprograms.
A package can consist of two basic parts: a package declaration and an optional
package body. Package declarations can contain the following types of statements:
Type and subtype declarations
Constant declarations
Global signal declarations
Function and procedure declarations
Attribute specifications
File declarations
Component declarations
Alias declarations
Disconnect specifications
Use clauses
Items appearing within a package declaration can be made visible to other design
units through the use of a use statement.
If the package contains declarations of subprograms (functions or procedures) or
defines one or more deferred constants (constants whose value is not given), then a package
body is required in addition to the package declaration. A package body must have the same
name as its corresponding package declaration, but can be located anywhere in the design.
The relationship between a package and package body is somewhat akin to the
relationship between an entity and its corresponding architecture. While the package
declaration provides the information needed to use the items defined within it (the parameter
list for a global procedure, or the name of a defined type or subtype), the actual behavior of
such things as procedures and functions must be specified within package bodies.
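A minimal sketch of a package, its package body, and a design unit that uses them is shown below. The names util_pkg, WIDTH, word_t, invert and inverter8 are hypothetical. The package declaration exposes a deferred constant and a function declaration, so a package body is required to supply the constant's value and the function body.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package util_pkg is
  constant WIDTH : natural;                       -- deferred constant
  subtype word_t is std_logic_vector(7 downto 0);
  function invert (w : word_t) return word_t;     -- declaration only
end package util_pkg;

package body util_pkg is
  constant WIDTH : natural := 8;                  -- value supplied here

  function invert (w : word_t) return word_t is
  begin
    return not w;
  end function invert;
end package body util_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.util_pkg.all;   -- makes the package contents visible

entity inverter8 is
  port (a : in  word_t;
        y : out word_t);
end entity inverter8;

architecture rtl of inverter8 is
begin
  y <= invert(a);
end architecture rtl;
```

A user of the package needs to see only the package declaration, so the function body can change without changing the interface.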
4.3 SOFTWARE USED:
4.3.1. Xilinx
Xilinx software is used by VHDL designers to perform synthesis.
Any simulated code can be synthesized and configured on an FPGA. Synthesis is the
transformation of VHDL code into a gate-level netlist. It is an integral part of current design
flows.
4.3.2. Algorithm
Start the ISE software by clicking the Xilinx ISE icon.
Create a New Project and review the project properties that are displayed.
Create a VHDL Source, declaring all inputs, outputs, and buffers as required. This
opens a window in which to write the VHDL code to be synthesized.
Check Syntax after editing the VHDL source to find any errors.
Run Design Simulation after compilation.
Begin synthesis by creating Timing Constraints.
Implement Design and Verify Constraints.
Assign Pin Location Constraints according to the requirements of the FPGA board.
Download Design to the Spartan FPGA board by clicking ‘Configure device’ once
a .bit file has been generated; the message “Program Succeeded” indicates success.
4.4 VERILOG HDL
Verilog HDL is a hardware description language that can be used to model a
digital system at many levels of abstraction, ranging from the algorithmic level to the gate
level to the switch level. The complexity of the digital system being modeled could vary
from that of a simple gate to a complete electronic digital system, or anything in between.
The digital system can be described hierarchically, and timing can be explicitly modeled
within the same description.
The Verilog HDL language includes capabilities to describe the behavioral
nature of a design, the dataflow nature of a design, a design's structural composition, delays,
and a waveform generation mechanism, including aspects of response monitoring and
verification, all modeled using one single language. In addition, the language provides a
programming language interface through which the internals of a design can be accessed
during simulation, including the control of a simulation run.
The language not only defines the syntax but also defines very clear
simulation semantics for each language construct. Therefore, models written in this
language can be verified using a Verilog simulator. The language inherits many of its
operator symbols and constructs from the C programming language. Verilog HDL provides
an extensive range of modeling capabilities, some of which are quite difficult to comprehend
initially. However, a core subset of the language is quite easy to learn and use. This is
sufficient to model most applications.
4.4.1 History:
The Verilog HDL language was first developed by Gateway Design Automation in
1983 as a hardware modeling language for their simulator product. At that time, it was a
proprietary language. Because of the popularity of the simulator product, Verilog HDL gained
acceptance as a usable and practical language by a number of designers. In an effort to
increase the popularity of the language, the language was placed in the public domain in
1990. Open Verilog International (OVI) was formed to promote Verilog. In 1992 OVI
decided to pursue standardization of Verilog HDL as an IEEE standard. This effort was
successful and the language became an IEEE standard in 1995. The complete standard is
described in the Verilog Hardware Description Language Reference Manual. The standard is
called IEEE Std 1364-1995.
4.4.2 Major Capabilities:
Listed below are the major capabilities of the Verilog hardware description language:
Primitive logic gates, such as and, or, and nand, are built into the language.
Flexibility of creating a user-defined primitive (UDP). Such a primitive could either
be a combinational logic primitive or a sequential logic primitive.
Switch-level modeling primitive gates, such as pmos and nmos, are also built into
the language.
Explicit language constructs are provided for specifying pin-to-pin delays, path delays,
and timing checks of a design.
A design can be modeled in three different styles or in a mixed style. These styles are:
behavioral style, modeled using procedural constructs; dataflow style, modeled
using continuous assignments; and structural style, modeled using gate and module
instantiations.
There are two data types in Verilog HDL: the net data type and the register data type.
The net type represents a physical connection between structural elements, while a
register type represents an abstract data storage element.
Figure 2-1 shows the mixed-level modeling capability of Verilog HDL, that is, in one
design, each module may be modeled at a different level.
Verilog HDL also has built-in logic functions such as & (bitwise-and) and | (bitwise-or).
High-level programming language constructs such as conditionals, case statements,
and loops are available in the language.
Notions of concurrency and time can be explicitly modeled.
Powerful file read and write capabilities are provided.
The language is non-deterministic under certain situations, that is, a model may
produce different results on different simulators; for example, the ordering of events
on an event queue is not defined by the standard.
4.4.3 SYNTHESIS:
Synthesis is the process of constructing a gate-level netlist from a register-transfer
level model of a circuit described in Verilog HDL. Figure 2-2 shows such a process. A
synthesis system may, as an intermediate step, generate a netlist that is comprised of
register-transfer level blocks such as flip-flops, arithmetic-logic units, and multiplexers,
interconnected by wires. In such a case, a second program called the RTL module builder is
necessary. The purpose of this builder is to build, or acquire from a library of predefined
components, each of the required RTL blocks in the user-specified target technology.
Figure:15 Mixed level
Having produced a gate level netlist, a logic optimizer reads in the netlist and
optimizes the circuit for the user-specified area and timing constraints. These area and timing
constraints may also be used by the module builder for appropriate selection or generation of
RTL blocks. In this book, we assume that the target netlist is at the gate level. The logic gates
used in the synthesized netlists are described in Appendix B. The module building and logic
optimization phases are not described in this book.
The above figure shows the basic elements of Verilog HDL and the elements used in
hardware. A mapping mechanism or a construction mechanism has to be provided that
translates the Verilog HDL elements into their corresponding hardware elements, as shown in
the following figure.
Figure:16 synthesis process
Fig.2-3 Typical design process
CHAPTER 5
SIMULATION MODEL
5.1 PROGRAM
library ieee;
use ieee.std_logic_1164.all;
entity clk_div is
port ( nreset : in std_logic_vector(3 downto 0); -- Reset
clk_in : in std_logic; -- Clock Input
clk_out1 : out std_logic;-- Clock Output1
clk_out2 : out std_logic;-- Clock Output2
clk_out3 : out std_logic;-- Clock Output3
clk_out4 : out std_logic;-- Clock Output4
clk_out5 : out std_logic;-- Clock Output5
clk_out6 : out std_logic;-- Clock Output6
clk_out7 : out std_logic;-- Clock Output7
clk_out8 : out std_logic;-- Clock Output8
clk_out9 : out std_logic;-- Clock Output9
clk_out10 : out std_logic;-- Clock Output10
clk_out11 : out std_logic;-- Clock Output11
clk_out12 : out std_logic;-- Clock Output12
clk_out13 : out std_logic;-- Clock Output13
clk_out14 : out std_logic;-- Clock Output14
clk_out15 : out std_logic);-- Clock Output15
end entity clk_div;
architecture clk_div of clk_div is
signal div_2 : std_logic; -- Divide By 2^1
signal div_4 : std_logic; -- Divide By 2^2
signal div_8 : std_logic; -- Divide By 2^3
signal div_16 : std_logic; -- Divide By 2^4
signal div_32 : std_logic; -- Divide By 2^5
signal div_64 : std_logic; -- Divide By 2^6
signal div_128 : std_logic; -- Divide By 2^7
signal div_256 : std_logic; -- Divide By 2^8
signal div_512 : std_logic; -- Divide By 2^9
signal div_1024 : std_logic; -- Divide By 2^10
signal div_2048 : std_logic; -- Divide By 2^11
signal div_4096 : std_logic; -- Divide By 2^12
signal div_8192 : std_logic; -- Divide By 2^13
signal div_16384 : std_logic; -- Divide By 2^14
signal div_32768 : std_logic; -- Divide By 2^15
begin
Process(nreset,clk_in,Div_2) is -- Divide by 2^1
begin
if (nreset="0000") then
div_2 <= '0';
elsif (clk_in = '1' and clk_in'event) then
div_2 <= not div_2;
end if;
end process;
Process(div_2,div_4,nreset) is -- Divide by 2^2
begin
if (nreset ="0000") then
div_4 <= '0';
elsif(div_2 ='1' and div_2'event) then
div_4 <= not div_4;
end if;
end process;
Process(div_4,div_8,nreset) is -- Divide by 2^3
begin
if (nreset ="0000") then
div_8 <= '0';
elsif(div_4 ='1' and div_4'event) then
div_8 <= not div_8;
end if;
end process;
Process(div_8,div_16,nreset) is -- Divide by 2^4
begin
if (nreset ="0000") then
div_16 <= '0';
elsif(div_8 ='1' and div_8'event) then
div_16 <= not div_16;
end if;
end process;
Process(div_16,div_32,nreset) is -- Divide by 2^5
begin
if (nreset ="0000") then
div_32 <= '0';
elsif(div_16 ='1' and div_16'event) then
div_32 <= not div_32;
end if;
end process;
Process(div_32,div_64,nreset) is -- Divide by 2^6
begin
if (nreset ="0000") then
div_64 <= '0';
elsif(div_32 ='1' and div_32'event) then
div_64 <= not div_64;
end if;
end process;
Process(div_64,div_128,nreset) is -- Divide by 2^7
begin
if (nreset ="0000") then
div_128 <= '0';
elsif(div_64 ='1' and div_64'event) then
div_128 <= not div_128;
end if;
end process;
Process(div_128,div_256,nreset) is -- Divide by 2^8
begin
if (nreset ="0000") then
div_256 <= '0';
elsif(div_128 ='1' and div_128'event) then
div_256 <= not div_256;
end if;
end process;
Process(div_256,div_512,nreset) is -- Divide by 2^9
begin
if (nreset ="0000") then
div_512 <= '0';
elsif(div_256 ='1' and div_256'event) then
div_512 <= not div_512;
end if;
end process;
Process(div_512,div_1024,nreset) is -- Divide by 2^10
begin
if (nreset ="0000") then
div_1024 <= '0';
elsif(div_512 ='1' and div_512'event) then
div_1024 <= not div_1024;
end if;
end process;
Process(div_1024,div_2048,nreset) is -- Divide by 2^11
begin
if (nreset ="0000") then
div_2048 <= '0';
elsif(div_1024 ='1' and div_1024'event) then
div_2048 <= not div_2048;
end if;
end process;
Process(div_2048,div_4096,nreset) is -- Divide by 2^12
begin
if (nreset ="0000") then
div_4096 <= '0';
elsif(div_2048 ='1' and div_2048'event) then
div_4096 <= not div_4096;
end if;
end process;
Process(div_4096,div_8192,nreset) is -- Divide by 2^13
begin
if (nreset ="0000") then
div_8192 <= '0';
elsif(div_4096 ='1' and div_4096'event) then
div_8192 <= not div_8192;
end if;
end process;
Process(div_8192,div_16384,nreset) is -- Divide by 2^14
begin
if (nreset ="0000") then
div_16384 <= '0';
elsif(div_8192 ='1' and div_8192'event) then
div_16384 <= not div_16384;
end if;
end process;
Process(div_16384,div_32768,nreset) is -- Divide by 2^15
begin
if (nreset ="0000") then
div_32768 <= '0';
elsif(div_16384 ='1' and div_16384'event) then
div_32768 <= not div_32768;
end if;
end process;
clk_out1 <= div_2;
clk_out2 <= div_4;
clk_out3 <= div_8;
clk_out4 <= div_16;
clk_out5 <= div_32;
clk_out6 <= div_64;
clk_out7<= div_128;
clk_out8 <= div_256;
clk_out9 <= div_512;
clk_out10 <= div_1024;
clk_out11 <= div_2048;
clk_out12 <= div_4096;
clk_out13 <= div_8192;
clk_out14 <= div_16384;
clk_out15 <= div_32768;
end architecture clk_div;
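The divider above can be exercised with a small testbench. The following is a sketch only, not the testbench used to produce the results in this report: the 10 ns input clock period, the testbench name clk_div_tb, and the choice to check only the first two outputs are assumptions. It relies on the clk_div entity listed above being compiled into the working library, and it leaves the remaining output ports unconnected.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity clk_div_tb is
end entity clk_div_tb;

architecture sim of clk_div_tb is
  constant CLK_PERIOD : time := 10 ns;   -- assumed input clock period
  signal nreset   : std_logic_vector(3 downto 0) := "0000";  -- start in reset
  signal clk_in   : std_logic := '0';
  signal clk_out1 : std_logic;
  signal clk_out2 : std_logic;
  signal finished : boolean := false;
begin
  -- Device under test; unlisted output ports are left open.
  dut : entity work.clk_div
    port map (nreset   => nreset,
              clk_in   => clk_in,
              clk_out1 => clk_out1,
              clk_out2 => clk_out2);

  -- Free-running input clock, stopped when the checks are done.
  clk_gen : process is
  begin
    while not finished loop
      clk_in <= '0';
      wait for CLK_PERIOD / 2;
      clk_in <= '1';
      wait for CLK_PERIOD / 2;
    end loop;
    wait;
  end process clk_gen;

  stimulus : process is
    variable t0 : time;
  begin
    wait for 2 * CLK_PERIOD;    -- hold reset ("0000") for two cycles
    nreset <= "1111";           -- any value other than "0000" releases reset

    -- clk_out1 toggles on every rising edge of clk_in: period is doubled.
    wait until rising_edge(clk_out1);
    t0 := now;
    wait until rising_edge(clk_out1);
    assert now - t0 = 2 * CLK_PERIOD
      report "clk_out1 is not clk_in divided by 2" severity failure;

    -- clk_out2 toggles on every rising edge of clk_out1: period is x4.
    wait until rising_edge(clk_out2);
    t0 := now;
    wait until rising_edge(clk_out2);
    assert now - t0 = 4 * CLK_PERIOD
      report "clk_out2 is not clk_in divided by 4" severity failure;

    finished <= true;
    wait;
  end process stimulus;
end architecture sim;
```

Because each stage is clocked by the output of the previous stage, the design is a ripple divider; in simulation the stages switch one delta cycle apart, which does not affect the period measurements above.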
5.2 RESULTANT WAVEFORM
Figure:18 Output figure of clock distribution
CHAPTER 6
CONCLUSIONS AND FUTURE WORK
6.1 Conclusions
In this thesis we presented a novel approach to model a clock distribution network
using VHDL-AMS. For this purpose, a set of models was developed for the clock
distribution network, including components such as interconnects, buffers, the phase-locked
loop, and the source oscillator. The models were simulated using the Cadence LDV 5.1 AMS
simulator and were checked for functionality. Modeling of a generic clock distribution
network was demonstrated using the VHDL-AMS models. This satisfied the first objective of
this project.
Two case studies were considered in this research to demonstrate the versatility of
VHDL-AMS in modeling a clock distribution network. In the first case, a balanced H-Tree
based clock distribution network was modeled, and in the second case a regular pattern clock
distribution network was modeled. This addressed the second objective of this project.
The characteristic of the clock studied in this research was clock skew. Its variation with
varying levels of the H-Tree, interconnect lengths, load capacitance, and the number of stages
in a regular pattern clock distribution network was studied. This satisfied the
third objective of this research (Section 1.2). Compared to equivalent SPICE models, the
VHDL-AMS models developed in this research had an average error ranging between 3.60% and 8.21%. This
validated the accuracy of the VHDL-AMS models developed in this research.
6.2 Future Work
Suggested directions for future work are listed as follows:
A model generator can be developed to automatically output a VHDL-AMS
model based on the user requirements for the clock distribution network.
In this research, only behavioral and structural levels of abstraction were
considered for the selected components. To achieve high fidelity, a component level
of abstraction can be considered.
The buffer models can be made more exhaustive by including process
variation effects.
For high fidelity, transmission models can be generated for interconnects, which
involves frequency domain modeling.
Higher order filters can be implemented in PLLs to make the model more
accurate.
Effects of jitter can be included to improve the accuracy of a clock distribution
network model.
Bibliography1. FRIEDMAN, E. G., AND POWELL, S. Design and Analysis of a Hierarchical Clock
Distribution System for Synchronous Standard Cell/Macro Cell. IEEE Journal of Solid-State
Circuits (April 1986), Vol. SC-21, No. 2.
2. RESTLE, P. J., AND DEUTSCH, A. Designing the Best Clock Distribution Network.
1998 Symposium on VLSI Circuits Digest of Technical Papers.
3. FRIEDMAN, E. G. Clock Distribution Design in VlSI Circuits – an Overview.
Proceedings of IEEE International Symposium on Circuits and Systems (May 1993),
pp. 1475-1478.
4. ZANELLA, S., NARDI, A., NEVIANI, A., QUARANTELLI, M., SAXENA, S.,
AND
GUARDIANI, C. Analysis of the Impact of Process Variations on Clock Skew.IEEE
Transactions on Semiconductor Manufacturing (Nov 2000), Vol 13, No. 4.
5. MEHROTRA, V., AND BONING, D Technology Scaling Impact of Variation on
Clock Skew and Interconnect Delay. International Interconnect Technology
Conference (IITC) (June 2001), San Francisco, CA.
6. MEHROTRA, V., SAM, S. L., BONING, D., CHANDRAKASAN, A.,
VALLISHAYEE, R.,
AND NASSIF, S. A Methodology for M odeling the Effects of Systematic Within-
Die Interconnect and Device Variation on Circuit Performance. 37th Conference
on Design Automation (DAC 2000), pp. 172-175.
7. XI, J. G., AND DAI, W. Buffer Insertion and Sizing Under Process Variations for
Low Power Clock Distribution. Proceedings of the 32nd ACM/IEEE Conference
ation (1995), pp. 491-496.
8. WOLAVER, D.H. Phase-Locked Loop Circuit Design. Prentice Hall, 1991.
9. STENSBY, J. L. Phase-Locked Loops: Theory and Applications. CRC Press, 1997.
10. http://lsiwww.epfl.ch/LSI2001/teaching/webcourse/ch05/ch05.html.
11. http://www-ensps.u-strasbg.fr/coursen/Option3A/ams_part1.html
12. http://hyperphysics.phy-astr.gsu.edu/hbase/electric/restmp.html
13. http://www.mosis.org/cgi-bin/cgiwrap/umosis/swp/params/ami-c5/t54gparams.
Txt
14. http://www.ece.cmu.[28] edu/~ee762/hspice-ocs/html/hspice_and_qrg/hspice_2001_2-
72.html
DEPT OF ECE Page 66 of 66 AIET