
DESIGN OF OPEN CORE PROTOCOL

CHAPTER 1

INTRODUCTION

1.1 INTRODUCTION

An SOC chip usually contains a large number of IP cores that communicate with each other through on-chip buses. As VLSI process technology continuously advances, the frequency and the amount of data communication between IP cores increase substantially. As a result, the ability of on-chip buses to deal with this large amount of data traffic becomes a dominant factor for the overall performance. The design of on-chip buses can be divided into two parts: the bus interface and the bus architecture. The bus interface involves a set of interface signals and their corresponding timing relationship, while the bus architecture refers to the internal components of buses and the interconnections among the IP cores.

The widely accepted on-chip bus, AMBA AHB, defines a set of bus interface signals to facilitate basic (single) and burst read/write transactions. AHB also defines the internal bus architecture, which is mainly a shared bus composed of multiplexers. The multiplexer-based bus architecture works well for a design with a small number of IP cores. When the number of integrated IP cores increases, the communication between IP cores also increases, and it becomes quite frequent that two or more master IPs request data from different slaves at the same time. The shared bus architecture often cannot provide efficient communication since only one bus transaction can be supported at a time. To solve this problem, two bus protocols have been proposed recently. One is the Advanced eXtensible Interface (AXI) protocol proposed by ARM.

AXI defines five independent channels (write address, write data, write response, read address, and read data channels). Each channel involves a set of signals. AXI does not restrict the internal bus architecture and leaves it to designers. Thus designers are allowed to integrate two IP cores with AXI either by connecting the wires directly or by inserting an in-house bus between them. The other bus interface protocol is proposed by a non-profit organization, the Open Core Protocol International Partnership (OCP-IP). OCP is an interface (or socket) aiming to standardize and thus simplify system integration. It facilitates system integration by defining a concrete interface (I/O signals and the handshaking protocol) which is independent of the bus architecture. Based on this interface, IP core designers can concentrate on designing the internal functionality of IP cores, bus designers can concentrate on the internal bus architecture, and system integrators can focus on system issues such as the bandwidth requirement and the whole system architecture. In this way, system integration becomes much more efficient.

Most of the bus functionalities defined in AXI and OCP are quite similar. The most conspicuous difference between them is that AXI divides the address channel into independent write address and read address channels so that read and write transactions can be processed simultaneously. However, the additional area of the separated address channels is the penalty. Some previous work has investigated on-chip buses from various aspects. One line of work develops high-level AMBA bus models with fast simulation speed and high timing accuracy. Another proposes an automatic approach to generate high-level bus models from a formal channel model of OCP. In both of the above works, the authors concentrate on fast and accurate simulation models at a high level but do not provide real hardware implementation details. In another work, the authors implement the AXI interface on a shared-bus architecture. Even though it costs less in area, the benefit of AXI in communication efficiency may be limited by the shared-bus architecture.

In this paper we propose a high-performance on-chip bus design with OCP as the bus interface. We choose OCP because it is open to the public and OCP-IP has provided free tools to verify this protocol. Nevertheless, most bus design techniques developed in this paper can also be applied to the AXI bus. Our proposed bus architecture features a crossbar/partial-crossbar based interconnect and realizes most transactions defined in OCP, including 1) single transactions, 2) burst transactions, 3) lock transactions, 4) pipelined transactions, and 5) out-of-order transactions. In addition, the proposed bus is flexible, so one can adjust the bus architecture according to the system requirements. One key issue of advanced buses is how to order transactions such that requests from masters and responses from slaves can be carried out with the best efficiency without violating any ordering constraint. In this work we have developed a key bus component called the scheduler to handle the ordering issues of out-of-order transactions. We will show that the proposed crossbar/partial-crossbar bus architecture together with the scheduler can significantly enhance the communication efficiency of a complex SOC.


1.2 BASIC IDEA

The basic idea is to perform proper and lossless communication between IP cores that use the same protocol on a System on Chip (SOC). Basically, an SOC is a system that consists of a set of components and the interconnects among them. Data flows through the system in order to achieve a successful process, and for this various interfaces are required. If these interfaces have issues, the process will fail, which leads to failure of the whole application.

Generally, in an SOC system, protocols can be used as interfaces, and the choice depends on the application and on the designer. Each interface has its own properties which suit the corresponding application.

1.3 NEED FOR PROJECT

This project is chosen because issues in industry are currently increasing due to the lack of proper data transfer between the IP cores on a System on Chip (SOC).

In recent days, the development of SOC chips and reusable IP cores has been given higher priority because of their lower cost and the reduction in time-to-market. This makes the interfacing of these IP cores a major and very sensitive issue. These interfaces play a vital role in an SOC and must be handled carefully because the communication between IP cores depends on them. The communication between different IP cores should have lossless data flow and should also be flexible for the designer.

Hence, to resolve this issue, standard protocol buses are used in order to interface two IP cores. Here the loss of data depends on the standard of the protocol used. Most of the IP cores use OCP (Open Core Protocol), which is basically a core-based protocol with its own advantages and flexibility.

1.3.1 OVERVIEW

The Open Core Protocol™ (OCP) defines a high-performance, bus-independent interface between IP cores that reduces design time, design risk, and manufacturing costs for SOC designs. An IP core can be a simple peripheral core, a high-performance microprocessor, or an on-chip communication subsystem such as a wrapped on-chip bus. The Open Core Protocol:

Achieves the goal of IP design reuse. The OCP transforms IP cores, making them independent of the architecture and design of the systems in which they are used.

Optimizes die area by configuring into the OCP only those features needed by the communicating cores.

Simplifies system verification and testing by providing a firm boundary around each IP core that can be observed, controlled, and validated.

The approach adopted by the Virtual Socket Interface Alliance’s (VSIA) Design Working

Group on On-Chip Buses (DWGOCB) is to specify a bus wrapper to provide a bus-independent

Transaction Protocol-level interface to IP cores. The OCP is equivalent to VSIA’s Virtual

Component Interface (VCI). While the VCI addresses only data flow aspects of core

communications, the OCP is a superset of VCI additionally supporting configurable sideband

control signaling and test harness signals. The OCP is the only standard that defines protocols to

unify all of the inter-core communication.

The Open Core Protocol (OCP) delivers the only non-proprietary, openly licensed, core-

centric protocol that comprehensively describes the system-level integration requirements of

intellectual property (IP) cores. While other bus and component interfaces address only the data

flow aspects of core communications, the OCP unifies all inter-core communications, including

sideband control and test harness signals. OCP's synchronous unidirectional signaling produces

simplified core implementation, integration, and timing analysis.

OCP eliminates the task of repeatedly defining, verifying, documenting and supporting

proprietary interface protocols. The OCP readily adapts to support new core capabilities while

limiting test suite modifications for core upgrades. Clearly delineated design boundaries enable

cores to be designed independently of other system cores yielding definitive, reusable IP cores

with reusable verification and test suites.

Any on-chip interconnect can be interfaced to the OCP, rendering it appropriate for many forms of on-chip communications:

Dedicated peer-to-peer communications, as in many pipelined signal processing

applications such as MPEG2 decoding.

Simple slave-only applications such as slow peripheral interfaces.

High-performance, latency-sensitive, multi-threaded applications, such as multi-bank

DRAM architectures.


The OCP supports very high performance data transfer models ranging from simple

request-grants through pipelined and multi-threaded objects. Higher complexity SOC

communication models are supported using thread identifiers to manage out-of-order completion

of multiple concurrent transfer sequences.

The Open Core Protocol interface addresses communications between the functional units

(or IP cores) that comprise a system on a chip. The OCP provides independence from bus

protocols without having to sacrifice high-performance access to on-chip interconnects. By

designing to the interface boundary defined by the OCP, you can develop reusable IP cores

without regard for the ultimate target system.

Given the wide range of IP core functionality, performance and interface requirements, a

fixed definition interface protocol cannot address the full spectrum of requirements. The need to

support verification and test requirements adds an even higher level of complexity to the

interface. To address this spectrum of interface definitions, the OCP defines a highly configurable

interface. The OCP’s structured methodology includes all of the signals required to describe an IP core’s communications, including data flow, control, and verification and test signals.

Here the importance of the project comes into the picture: OCP (Open Core Protocol) plays a vital role in carrying out transactions between two different IP cores, and the application will fail if it does not work properly.

1.3.2 APPLICATION

Since it is an IP block, it can be used in any kind of SOC application. The applications can be listed as follows.

SRAM

Processor


CHAPTER 2

LITERATURE REVIEW

2.1 INTRODUCTION

With the rapid progress of system-on-a-chip (SOC) and massive data movement requirements, the on-chip system bus plays the central role in determining the performance of an SOC. Two types of on-chip bus have been widely used in current designs: the pipeline-based bus and the packet-based bus.

For pipeline-based buses, such as ARM’s AMBA 2.0 AHB, IBM’s CoreConnect and OpenCores’ Wishbone, the cost and complexity of bridging the communications among on-chip designs are low. However, a pipeline-based bus suffers from bus contention and an inherent blocking characteristic due to the protocol. The contention issue can be alleviated by adopting a multi-layer bus structure or using proper arbitration policies. However, the blocking characteristic, which allows a transfer to complete only if the previous transfer has completed, cannot be altered without changing the bus protocol. This blocking characteristic reduces the bus bandwidth utilization when accessing long-latency devices, such as an external memory controller.

To cope with the issues of pipeline-based buses, packet-based buses such as ARM’s AMBA 3.0 AXI, OCP-IP’s Open Core Protocol (OCP), and STMicroelectronics’ STBus have been proposed to support outstanding transfers and out-of-order transfer completion. We will focus on AXI here because of its popularity. The AXI bus possesses multiple independent channels to support multiple simultaneous address and data streams. Besides, AXI also supports improved burst operation, register slicing with registered inputs, and secured transfers.

Despite the above features, AXI requires high cost and has a long transaction handshaking latency. However, a shared-link AXI interconnect can provide good performance while requiring less than half of the hardware required by a crossbar AXI implementation; that work focused on the performance analysis of a shared-link AXI. The handshaking latency is at least two cycles if the interface or interconnect is designed with registered inputs, which would limit the bandwidth utilization to less than 50%. To reduce the handshaking latency, we propose a hybrid data locked transfer mode. Unlike the locked transfer, which requires an arbitration lock over consecutive transactions, our data locked mode is based on a transfer-level arbitration scheme and allows bus ownership to change between transactions. This gives more flexibility to arbitration policy selection.

With the additional features of AXI, new factors that affect the bus performance are also

introduced. The first factor is the arbitration combination. The multi-channel architecture allows

different and independent arbitration policies to be adopted by each channel. However, existing

AXI-related works often assumed a unified arbitration policy where each channel adopts the

same arbitration policy. Another key factor is the interface buffer size. A larger interface buffer

usually implies that more out-of-order transactions can be handled. The third factor is the task access setting, which defines how the transfer modes should be used by the devices within a system.

2.2 TRANSFER MODES

2.2.1 NORMAL

This mode is the basic transfer mode in an AXI bus with a registered interface. In the first cycle of a transfer using normal mode, the initiator sets the valid signal high and sends it to the target. In the second cycle, the target receives the high valid signal and sets the ready signal high for one cycle in response. Once the initiator receives the high ready signal, the initiator resets the valid signal low and the transfer is completed. As a result, at least two cycles are needed to complete a transfer in an AXI bus with a registered interface.

Fig 2.1 Normal mode transfer example


Fig 2.1 shows two normal transactions, each with a data burst length of four. It takes 16 bus cycles to complete the eight data transfers of the two transactions, which means 50% of the available bus bandwidth is wasted.
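To make the two-cycle handshake concrete, the following VHDL sketch models a hypothetical registered-input target whose ready output can only be asserted one cycle after valid is sampled high; the entity and port names are illustrative and are not taken from the AXI specification.

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical registered-input target: READY is asserted one cycle after
-- VALID is sampled high, so each transfer needs at least two bus cycles.
entity reg_handshake_target is
  port (
    clk     : in  std_logic;
    rst_n   : in  std_logic;
    s_valid : in  std_logic;   -- initiator holds valid until ready is seen
    s_ready : out std_logic    -- one-cycle acknowledge from the target
  );
end entity reg_handshake_target;

architecture rtl of reg_handshake_target is
  signal valid_q : std_logic;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst_n = '0' then
        valid_q <= '0';
        s_ready <= '0';
      else
        valid_q <= s_valid;                  -- registered input: one cycle delay
        s_ready <= s_valid and not valid_q;  -- pulse ready for a single cycle
      end if;
    end if;
  end process;
end architecture rtl;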

2.2.2 INTERLEAVED MODE

The interleaved mode hides transfer latency by allowing two transactions from different

initiators to be transferred in an interleaved manner. Fig 2.2 illustrates the transfer of the two

transactions mentioned earlier using interleaved transfer mode. The one cycle latency introduced

in the normal mode for request B is hidden by the transfer of request A. Similarly, the interleaved

transfer mode can also be applied to data channels. As a result, transferring the data of the two

transactions only takes nine cycles.

To support the interleaved mode, only the bus interconnect needs additional hardware. No additional hardware in the device interface and no modification of the bus protocol are required. Hence, an AXI interconnect that supports the interleaved mode can be used with standard AXI devices.

2.2.3 PROPOSED DATA LOCKED MODE

Although the interleaved mode can increase bandwidth utilization when more than one initiator is using the bus, it cannot be enabled when only one standalone initiator is using the bus. To handle this, we propose the data locked mode. In contrast to the locked transfer, which can only be performed when the bus ownership is locked across consecutive transactions, the proposed data locked mode locks the ownership of the bus only within the period of burst data transfers. During the burst data transfer period, the ready signal is tied high and hence the handshaking process is bypassed.

Fig 2.3 illustrates an example of two transactions using data locked mode to transfer data.

Device M0 sends a data locked request A and device M1 sends a data locked request B. Once the

bus interconnect accepts request A, the bus interconnect records the transaction ID of request A.

When a data transfer with the matched ID appears in the data channel, the bus interconnect uses

data locked mode to transfer the data continuously. For a transaction with a data burst of n, the

data transfer latency is (n + 1) cycles.

There are two approaches to signal the bus interconnect to use the data locked mode for a transaction. One uses the ARLOCK/AWLOCK signals in the address channels to signal the bus of an incoming transaction using data locked transfer. However, doing so requires modifying the protocol definition of these signals and the bus interface. To avoid modifying the protocol, the other approach assigns in advance the devices that can use the data locked mode. The overhead of this approach is that the bus interconnect must provide mechanisms to configure the device-to-transfer-mode mapping. Note that these two approaches can be used together without conflict.

To support the proposed data locked mode, the bus interconnect needs an additional

buffer, called data locked mode buffer, to keep record of the transactions using the data locked

mode. Each entry in the buffer stores one transaction ID.

If all the entries in the data locked mode buffer are in use, no more transactions can be

transferred using the data locked mode.
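A minimal sketch of such a buffer is given below, assuming a simple count-based full flag and a fixed number of entries; the entity name, depth, ID width and the push/pop handshake are illustrative assumptions rather than details taken from the referenced design, and the ID matching against the data channel is omitted.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical data locked mode buffer: each entry records the ID of one
-- transaction granted data locked transfer; when full, no further locked
-- grants are issued until an entry is released.
entity dlock_id_buffer is
  generic (
    ID_WIDTH : natural := 4;
    DEPTH    : natural := 4
  );
  port (
    clk, rst_n : in  std_logic;
    push       : in  std_logic;                                 -- locked request accepted
    push_id    : in  std_logic_vector(ID_WIDTH-1 downto 0);
    pop        : in  std_logic;                                 -- locked burst completed
    full       : out std_logic                                  -- blocks further locked grants
  );
end entity dlock_id_buffer;

architecture rtl of dlock_id_buffer is
  type id_array_t is array (0 to DEPTH-1) of std_logic_vector(ID_WIDTH-1 downto 0);
  signal ids   : id_array_t;                                    -- stored transaction IDs
  signal count : unsigned(7 downto 0) := (others => '0');
begin
  full <= '1' when to_integer(count) = DEPTH else '0';

  process (clk)
  begin
    if rising_edge(clk) then
      if rst_n = '0' then
        count <= (others => '0');
      elsif push = '1' and to_integer(count) < DEPTH then
        ids(to_integer(count)) <= push_id;                      -- record the transaction ID
        count <= count + 1;
      elsif pop = '1' and to_integer(count) > 0 then
        count <= count - 1;                                     -- entry released after the burst
      end if;
    end if;
  end process;
end architecture rtl;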

Fig 2.2 Interleaved mode transfer example


Fig 2.3 Data Locked Mode Transfer Example

2.2.4 PROPOSED HYBRID DATA LOCKED MODE

The hybrid data locked mode is proposed to allow additional data locked mode

transaction requests to be transferred using the normal or interleaved mode when the data locked

mode buffer is full. This allows more transactions to be available to the scheduler of the devices

that support transaction scheduling. With the additional transactions, the scheduler of such

devices may achieve better scheduling results.

However, only a limited number of additional data locked mode transactions can be transferred using the normal or interleaved mode. This prevents bandwidth-hungry devices from occupying the bus with too many transactions. A hybrid mode counter is included to count the number of additional transactions transferred. If the counter value reaches the preset threshold, no more data locked mode transactions can be transferred using the normal or interleaved mode until the data locked mode buffer is no longer full. Once the data locked mode buffer is no longer full, the hybrid mode counter is reset.
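The counter can be sketched as below; the generic threshold, the port names and the exact condition that permits a fallback transfer are assumptions made for illustration.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical hybrid mode counter: while the data locked mode buffer is
-- full, up to THRESHOLD extra locked-mode requests may fall back to the
-- normal/interleaved mode; the count clears once the buffer has space again.
entity hybrid_mode_counter is
  generic ( THRESHOLD : natural := 2 );
  port (
    clk, rst_n  : in  std_logic;
    buffer_full : in  std_logic;   -- from the data locked mode buffer
    fallback_tx : in  std_logic;   -- a locked request sent via normal/interleaved mode
    allow_more  : out std_logic    -- '1' while further fallbacks are permitted
  );
end entity hybrid_mode_counter;

architecture rtl of hybrid_mode_counter is
  signal count : unsigned(7 downto 0) := (others => '0');
begin
  allow_more <= '1' when (buffer_full = '1' and to_integer(count) < THRESHOLD) else '0';

  process (clk)
  begin
    if rising_edge(clk) then
      if rst_n = '0' then
        count <= (others => '0');
      elsif buffer_full = '0' then
        count <= (others => '0');                  -- buffer not full: reset the counter
      elsif fallback_tx = '1' and to_integer(count) < THRESHOLD then
        count <= count + 1;                        -- one more fallback transferred
      end if;
    end if;
  end process;
end architecture rtl;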

2.3 AMBA-AHB PROTOCOL

2.3.1 INTRODUCTION

The AHB (Advanced High-performance Bus) is a high-performance bus in the AMBA (Advanced Microcontroller Bus Architecture) family. AHB can be used in high clock frequency system modules and acts as the high-performance system backbone bus. AHB supports the efficient connection of processors, on-chip memories and off-chip external memory interfaces with low-power peripheral macro cell functions. AHB is also specified to ensure ease of use in an efficient design flow using automated test techniques. AHB is technology-independent and ensures that highly reusable peripheral and system macro cells can be migrated across a diverse range of IC processes and are appropriate for full-custom, standard cell and gate array technologies.

2.3.2 BASIC IDEA

The basic idea is to perform proper and lossless communication between IP cores that use the same protocol on a System on Chip (SOC). Basically, an SOC is a system that consists of a set of components and the interconnects among them. Data flows through the system in order to achieve a successful process, and for this various interfaces are required. If these interfaces have issues, the process will fail, which leads to failure of the whole application.

Generally, in an SOC system, protocols can be used as interfaces, and the choice depends on the application and on the designer. Each interface has its own properties which suit the corresponding application.

2.3.3 NEED FOR PROJECT

This project is chosen because issues in industry are currently increasing due to the lack of proper data transfer between the IP cores on a System on Chip (SOC).

In recent days, the development of SOC chips and reusable IP cores has been given higher priority because of their lower cost and the reduction in time-to-market. This makes the interfacing of these IP cores a major and very sensitive issue. These interfaces play a vital role in an SOC and must be handled carefully because the communication between IP cores depends on them. The communication between different IP cores should have lossless data flow and should also be flexible for the designer.

Hence, to resolve this issue, standard protocol buses are used in order to interface two IP cores. Here the loss of data depends on the standard of the protocol used. Most of the IP cores from ARM use the AMBA (Advanced Microcontroller Bus Architecture), which has the AHB (Advanced High-Performance Bus). This bus has its own advantages and flexibility. A full AHB interface is used for the following.

Bus masters

On-chip memory blocks

External memory interfaces

High-bandwidth peripherals with FIFO interfaces

DMA slave peripherals

2.3.4 OBJECTIVES OF THE AMBA SPECIFICATION

The AMBA specification has been derived to satisfy four key requirements:


To facilitate the right-first-time development of embedded microcontroller products with

one or more CPUs or signal processors.

To be technology-independent and ensure that highly reusable peripheral and system

macro cells can be migrated across a diverse range of IC processes and be appropriate for

full-custom, standard cell and gate array technologies.

To encourage modular system design to improve processor independence, providing a

development road-map for advanced cached CPU cores and the development of

peripheral libraries.

To minimize the silicon infrastructure required to support efficient on-chip and off-chip

communication for both operation and manufacturing test.

2.3.5 TYPICAL AMBA-BASED MICROCONTROLLER

An AMBA-based microcontroller typically consists of a high-performance system

backbone bus (AMBA AHB or AMBA ASB), able to sustain the external memory bandwidth, on

which the CPU, on-chip memory and other Direct Memory Access (DMA) devices reside. This

bus provides a high-bandwidth interface between the elements that are involved in the majority of

transfers.

Also located on the high-performance bus is a bridge to the lower bandwidth APB, where

most of the peripheral devices in the system are located.

Fig 2.4 Typical AMBA Systems


The key advantages of a typical AMBA System are listed as follows

High performance

Pipelined operation

Multiple bus masters

Burst transfers

Split transactions

AMBA APB provides the basic peripheral macro cell communications infrastructure as a secondary bus from the higher-bandwidth pipelined main system bus. Such peripherals typically:

Have interfaces which are memory-mapped registers

Have no high-bandwidth interfaces

Are accessed under programmed control.

The external memory interface is application-specific and may only have a narrow data path, but may also support a test access mode which allows the internal AMBA AHB, ASB and APB modules to be tested in isolation with system-independent test sets.

2.4 TERMINOLOGY

The following terms are used throughout this specification.

2.4.1 BUS CYCLE

A bus cycle is a basic unit of one bus clock period and for the purpose of AMBA AHB or

APB protocol descriptions is defined from rising-edge to rising-edge transitions. An ASB bus

cycle is defined from falling-edge to falling-edge transitions. Bus signal timing is referenced to

the bus cycle clock.

2.4.2 BUS TRANSFER

An AMBA AHB bus transfer is a read or write operation of a data object, which may take

one or more bus cycles. The bus transfer is terminated by a completion response from the

addressed slave. The transfer sizes supported by AMBA AHB include byte (8-bit), half word (16-

bit) and word (32-bit).

2.4.3 BURST OPERATION


A burst operation is defined as one or more data transactions, initiated by a bus master,

which have a consistent width of transaction to an incremental region of address space. The

increment step per transaction is determined by the width of transfer (byte, half word and word).
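A small VHDL sketch of this address stepping is shown below; the entity, the 13-bit address width and the size encoding are illustrative assumptions rather than part of the AMBA specification.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Illustrative incrementing-burst address generator: the step per beat
-- follows the transfer width (byte, half word or word).
entity burst_addr_incr is
  port (
    clk, rst_n : in  std_logic;
    load       : in  std_logic;                     -- start of a burst
    start_addr : in  unsigned(12 downto 0);
    size       : in  std_logic_vector(1 downto 0);  -- "00" byte, "01" half word, "10" word
    next_beat  : in  std_logic;                     -- advance to the next beat
    addr       : out unsigned(12 downto 0)
  );
end entity burst_addr_incr;

architecture rtl of burst_addr_incr is
  signal cur  : unsigned(12 downto 0) := (others => '0');
  signal step : unsigned(12 downto 0);
begin
  -- increment step of 1, 2 or 4 bytes depending on the transfer width
  step <= to_unsigned(1, 13) when size = "00" else
          to_unsigned(2, 13) when size = "01" else
          to_unsigned(4, 13);
  addr <= cur;

  process (clk)
  begin
    if rising_edge(clk) then
      if rst_n = '0' then
        cur <= (others => '0');
      elsif load = '1' then
        cur <= start_addr;                          -- latch the burst start address
      elsif next_beat = '1' then
        cur <= cur + step;                          -- step to the next beat address
      end if;
    end if;
  end process;
end architecture rtl;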

2.5 APPLICATIONS

AMBA-AHB can be used in different applications and it is also technology-independent.

ARM controllers are designed according to the specifications of AMBA.

In present technology, high performance and speed are required, which are convincingly met by AMBA-AHB.

Compared to other architectures, AMBA-AHB is far more advanced and efficient.

It minimizes the silicon infrastructure needed to support on-chip and off-chip communications.

Any embedded project involving ARM processors or microcontrollers typically makes use of AMBA-AHB as the common bus throughout the project.

2.6 FEATURES

AMBA Advanced High-performance Bus (AHB) supports the following features:

High performance

Burst transfers

Split transactions

Single edge clock operation

SEQ, NONSEQ, BUSY, and IDLE Transfer Types

Programmable number of idle cycles

Large Data bus-widths - 32, 64, 128 and 256 bits wide

Address Decoding with Configurable Memory Map

2.7 MERITS

Since AHB is one of the most commonly used bus protocols, it has many advantages from the designer’s point of view, which are mentioned below.


AHB offers a fairly low cost (in area), low power (based on I/O) bus with a moderate

amount of complexity and it can achieve higher frequencies when compared to others

because this protocol separates the address and data phases.

AHB can use the higher frequency along with separate data buses that can be defined to

128-bit and above to achieve the bandwidth required for high-performance bus

applications.

AHB can access other protocols through the proper bridging converter. Hence it supports

the bridge configuration for data transfer.

AHB allows slaves with significant latency to respond to read with an HRESP of

“SPLIT”. The slave will then request the bus on behalf of the master when the read data is

available. This enables better bus utilization.

AHB offers burst capability by defining incrementing bursts of specified length and it

supports both incrementing and wrapping. Although AHB requires that an address phase

be provided for each beat of data, the slave can still use the burst information to make the

proper request on the other side. This helps to mask the latency of the slave.

AHB is defined with a choice of several bus widths, from 8-bit to 1024-bit. The most

common implementation has been 32-bit, but higher bandwidth requirements may be

satisfied by using 64 or 128-bit buses.

AHB uses the HRESP signal, driven by the slaves, to indicate when an error has occurred.

AHB also offers a large selection of verification IP from several different suppliers. The

solutions offered support several different languages and run in a choice of environments.

Access to the target device is controlled through a MUX, thereby admitting bus-access to

one bus-master at a time.

AHB Masters, Slaves and Arbiters support Early Burst Termination. Bursts can be early

terminated either as a result of the Arbiter removing the HGRANT to a master part way

through a burst or after a slave returns a non-OKAY response to any beat of a burst.

Note, however, that a master cannot decide to terminate a defined-length burst unless prompted to do so by the Arbiter or Slave responses.

Any slave which does not use SPLIT responses can be connected directly to an AHB

master. If the slave does use SPLIT responses then a simplified version of the arbiter is

also required.


Thus the strengths of the AHB protocol are listed above, which clearly explain the reason for the wide use of this protocol.

CHAPTER 3

DESIGN OF OPEN CORE PROTOCOL

3.1 INTRODUCTION

The Open Core Protocol (OCP) is a core-centric protocol which defines a high-performance, bus-independent interface between IP cores that reduces design time, design risk, and manufacturing costs for SOC designs. The main property of OCP is that it can be configured according to the application. The OCP is chosen because of its advanced supporting features, such as configurable sideband control signaling and test harness signals, compared to other core protocols.

While other bus and component interfaces address only the data flow aspects of core communications, the OCP unifies all inter-core communications, including sideband control and test harness signals. The OCP’s synchronous unidirectional signaling produces simplified core implementation, integration, and timing analysis. The OCP readily adapts to support new core capabilities while limiting test suite modifications for core upgrades.

3.2 MERITS

The OCP has many advantages that make it convenient for designers; they are listed below.

OCP is a point-to-point protocol which can be directly interfaced between two IP cores.

The most important advantage is that OCP can be configured according to the application, due to its configurable nature.

This configurability leads to a reduction of the die area and the design time too. Hence the optimization of die area is attained.


OCP is a bus-independent protocol, i.e. it can be interfaced to any bus protocol such as AHB.

It supports pipelined operation and multi-threaded applications such as multi-bank DRAM architectures.

It supports burst operation, which generates a sequence of addresses according to the burst length.

OCP gives more flexibility to the designer who uses it and also gives high performance through improved core maintenance.

IP cores can be reused easily with OCP, since the issue that arises when reusing an IP core for another application is that the interfaces already used in the system would otherwise have to be modified for that application.

It supports sideband signals, which carry information such as interrupts, flags, errors and status, and which are said to be non-dataflow signals.

It also supports test signals such as the scan interface, clock control interface, and debug and test interface. This ensures that the OCP can also be used to interface a Device Under Test (DUT) so that test signals can be passed.

OCP also provides threads and tags, which enable independent concurrent transfer sequences.

OCP doubles the peak bandwidth at a given frequency by using separate buses for read and write data. These buses are used in conjunction with pipelining of command to data phases to increase performance.

It simplifies the circuitry needed to bridge an OCP-based core to another communication interface standard.

Thus the advantages of the OCP are listed above, which clearly explain the basic reason for choosing this protocol over others.

3.3 BASIC BLOCK DIAGRAM

The block diagram which explains the basic operation and characteristics of OCP is

shown in Figure 3.1.

The OCP defines a point-to-point interface between two communicating entities such as

IP cores and bus interface modules. One entity acts as the master of the OCP instance, and the

other as the slave. Only the master can present commands and is the controlling entity.


The slave responds to commands presented to it, either by accepting data from the master or by presenting data to the master. For two entities to communicate, there need to be two instances of the OCP connecting them: one where the first entity is a master, and one where the first entity is a slave.

Fig 3.1 Basic block diagram of OCP instance

Figure 3.1 shows a simple system containing a wrapped bus and three IP core entities

such as one that is a system target, one that is a system initiator, and an entity that is both. The

characteristics of the IP core determine whether the core needs master, slave, or both sides of the

OCP and the wrapper interface modules must act as the complementary side of the OCP for each

connected entity. A transfer across this system occurs as follows.

A system initiator (as the OCP master) presents command, control, and possibly data to

its connected slave (a bus wrapper interface module). The interface module plays the request

across the on-chip bus system. The OCP does not specify the embedded bus functionality.

Instead, the interface designer converts the OCP request into an embedded bus transfer. The

receiving bus wrapper interface module (as the OCP master) converts the embedded bus

operation into a legal OCP command. The system target (OCP slave) receives the command and

takes the requested action.

Each instance of the OCP is configured (by choosing signals or bit widths of a particular

signal) based on the requirements of the connected entities and is independent of the others. For

instance, system initiators may require more address bits in their OCP instances than do the


system targets; the extra address bits might be used by the embedded bus to select which bus

target is addressed by the system initiator.

The OCP is flexible. There are several useful models for how existing IP cores

communicate with one another. Some employ pipelining to improve bandwidth and latency

characteristics. Others use multiple-cycle access models, where signals are held static for several

clock cycles to simplify timing analysis and reduce implementation area. Support for this wide

range of behavior is possible through the use of synchronous handshaking signals that allow both

the master and slave to control when signals are allowed to change.

3.4 THEORY OF OPERATION

The various operations involved in the Open Core Protocol are discussed as follows.

Point-to-Point Synchronous Interface

To simplify timing analysis, physical design, and general comprehension, the OCP is

composed of unidirectional signals driven with respect to, and sampled by the rising edge of the

OCP clock. The OCP is fully synchronous (with the exception of reset) and contains no multi-

cycle timing paths with respect to the OCP clock. All signals other than the clock signal are

strictly point-to-point.

Bus Independence

A core utilizing the OCP can be interfaced to any bus. A test of any bus-independent

interface is to connect a master to a slave without an intervening on-chip bus. This test not only

drives the specification towards a fully symmetric interface but helps to clarify other issues. For

instance, device selection techniques vary greatly among on-chip buses. Some use address

decoders. Others generate independent device select signals (analogous to a board level chip

select). This complexity should be hidden from IP cores, especially since in the directly-

connected case there is no decode/selection logic. OCP-compliant slaves receive device selection

information integrated into the basic command field.

Arbitration schemes vary widely. Since there is virtually no arbitration in the directly-

connected case, arbitration for any shared resource is the sole responsibility of the logic on the

bus side of the OCP. This permits OCP-compliant masters to pass a command field across the

OCP that the bus interface logic converts into an arbitration request sequence.

Address/Data


Wide widths, characteristic of shared on-chip address and data buses, make tuning the

OCP address and data widths essential for area-efficient implementation. Only those address bits

that are significant to the IP core should cross the OCP to the slave. The OCP address space is

flat and composed of 8-bit bytes (octets).

To increase transfer efficiencies, many IP cores have data field widths significantly

greater than an octet. The OCP supports a configurable data width to allow multiple bytes to be

transferred simultaneously. The OCP refers to the chosen data field width as the word size of the

OCP. The term word is used in the traditional computer system context; that is, a word is the

natural transfer unit of the block. OCP supports word sizes of power-of-two and non-power-of-

two as would be needed for a 12-bit DSP core. The OCP address is a byte address that is word

aligned.

Transfers of less than a full word of data are supported by providing byte enable

information that specifies which octets are to be transferred. Byte enables are linked to specific

data bits (byte lanes). Byte lanes are not associated with particular byte addresses.
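As a small illustration of byte lanes, the sketch below merges write data into a 32-bit word under control of a 4-bit byte enable, where enable bit i gates the lane covering data bits 8*i+7 downto 8*i; the OCP byte enable field itself (MByteEn) is not configured in this design, so the entity and port names here are purely illustrative.

library ieee;
use ieee.std_logic_1164.all;

-- Illustrative partial-word write: each byte enable bit gates one byte lane
-- of a 32-bit word, so only the enabled octets are updated.
entity byteen_write is
  port (
    clk     : in  std_logic;
    wr      : in  std_logic;
    byte_en : in  std_logic_vector(3 downto 0);
    wdata   : in  std_logic_vector(31 downto 0);
    word    : out std_logic_vector(31 downto 0)
  );
end entity byteen_write;

architecture rtl of byteen_write is
  signal word_q : std_logic_vector(31 downto 0) := (others => '0');
begin
  word <= word_q;

  process (clk)
  begin
    if rising_edge(clk) then
      if wr = '1' then
        for i in 0 to 3 loop
          if byte_en(i) = '1' then
            word_q(8*i+7 downto 8*i) <= wdata(8*i+7 downto 8*i);  -- enabled lane only
          end if;
        end loop;
      end if;
    end if;
  end process;
end architecture rtl;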

Pipelining

The OCP allows pipelining of transfers. To support this feature, the return of read data

and the provision of write data may be delayed after the presentation of the associated request.

Response

The OCP separates requests from responses. A slave can accept a command request from

a master on one cycle and respond in a later cycle. The division of request from response permits

pipelining. The OCP provides the option of having responses for Write commands, or completing

them immediately without an explicit response.

Burst

To provide high transfer efficiency, burst support is essential for many IP cores. The

extended OCP supports annotation of transfers with burst information. Bursts can either include

addressing information for each successive command (which simplifies the requirements for

address sequencing/burst count processing in the slave), or include addressing information only

once for the entire burst.

3.5 ON-CHIP BUS FUNCTIONALITIES


The on-chip bus functionalities are classified into four types: 1) burst, 2) lock, 3) pipelined, and 4) out-of-order transactions.

3.5.1 BURST TRANSACTIONS

The burst transactions allow the grouping of multiple transactions that have a certain

address relationship, and can be classified into multi-request burst and single-request burst

according to how many times the addresses are issued. Fig 3.2 shows the two types of burst read

transactions. The multi-request burst as defined in AHB is illustrated in Fig 3.2(a) where the

address information must be issued for each command of a burst transaction (e.g., A11, A12, A13

and A14).

This may cause some unnecessary overhead. In the more advanced bus architecture, the

single-request burst transaction is supported. As shown in Fig 3.2(b), which is the burst type

defined in AXI, the address information is issued only once for each burst transaction. In our

proposed bus design we support both types of burst transactions such that IP cores with various burst

types can use the proposed on-chip bus without changing their original burst behavior.

Fig 3.2 Burst Transactions

3.5.2 LOCK TRANSACTIONS

Lock is a protection mechanism for masters that have low bus priorities. Without this

mechanism the read/write transactions of masters with lower priority would be interrupted

whenever a higher-priority master issues a request. Lock transactions prevent an arbiter from

performing arbitration and ensure that a low-priority master can complete its granted transaction without being interrupted.

3.5.3 PIPELINED TRANSACTIONS (OUTSTANDING TRANSACTIONS)


Fig 3.3(a) and 3.3(b) show the difference between non-pipelined and pipelined (also called outstanding in AXI) read transactions. In Fig 3.3(a), for a non-pipelined transaction, the read data must be returned after its corresponding address is issued plus a period of latency. For example, D21 is sent only after A21 is issued plus a latency t. For a pipelined transaction, as shown in Fig 3.3(b), this hard link is not required. Thus A21 can be issued right after A11 without waiting for the return of the data requested by A11 (i.e., D11-D14).

Fig 3.3 Pipelined Transactions

3.5.4 OUT-OF-ORDER TRANSACTIONS

The out-of-order transactions allow the return order of responses to be different from the order of their requests. These transactions can significantly improve the communication efficiency of an SOC system containing IP cores with various access latencies, as illustrated in Fig 3.4. In Fig 3.4(a), which does not allow out-of-order transactions, the corresponding responses of A21 and A31 must be returned after the response of A11. With the support of out-of-order transactions, as shown in Fig 3.4(b), the responses with shorter access latency (D21, D22 and D31) can be returned before those with longer latency (D11-D14), and thus the transactions can be completed in far fewer cycles.


Fig 3.4 Out-of-Order Transactions

3.6 HARDWARE DESIGN OF THE ON-CHIP BUS

Arbiter

In traditional shared bus architecture, resource contention happens whenever more than one

master requests the bus at the same time. For a crossbar or partial crossbar architecture, resource

contention occurs when more than one master is to access the same slave simultaneously. In the

proposed design each slave IP is associated with an arbiter that determines which master can

access the slave.
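The arbitration policy itself is not specified here, so the sketch below assumes a simple fixed-priority scheme (master 0 highest); the entity and port names are illustrative.

library ieee;
use ieee.std_logic_1164.all;

-- Sketch of a per-slave arbiter: grants the slave to at most one of the
-- requesting masters each cycle, with the lowest-numbered master winning.
entity slave_arbiter is
  generic ( N_MASTERS : natural := 3 );
  port (
    clk, rst_n : in  std_logic;
    request    : in  std_logic_vector(N_MASTERS-1 downto 0);
    grant      : out std_logic_vector(N_MASTERS-1 downto 0)
  );
end entity slave_arbiter;

architecture rtl of slave_arbiter is
begin
  process (clk)
    variable granted : boolean;
  begin
    if rising_edge(clk) then
      if rst_n = '0' then
        grant <= (others => '0');
      else
        granted := false;
        grant   <= (others => '0');
        for i in 0 to N_MASTERS-1 loop             -- fixed priority: lowest index wins
          if request(i) = '1' and not granted then
            grant(i) <= '1';
            granted  := true;
          end if;
        end loop;
      end if;
    end if;
  end process;
end architecture rtl;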


Fig 3.5 OCP Block Diagram

Decoder

Since more than one slave exists in the system, the decoder decodes the address and decides which slave should respond to the requesting master. In addition, the proposed decoder also checks whether the transaction address is illegal or nonexistent and responds with an error message if necessary.
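A minimal decoder sketch is given below, assuming a hypothetical two-slave memory map selected by the upper address bits; the actual memory map of the proposed design is not stated in the text.

library ieee;
use ieee.std_logic_1164.all;

-- Sketch of the decoder: the upper address bits select one of two slaves,
-- and any unmapped address raises an error indication so the bus can return
-- an error response to the master.
entity addr_decoder is
  port (
    maddr      : in  std_logic_vector(12 downto 0);
    sel_slave0 : out std_logic;
    sel_slave1 : out std_logic;
    addr_error : out std_logic    -- address is illegal or nonexistent
  );
end entity addr_decoder;

architecture rtl of addr_decoder is
begin
  process (maddr)
  begin
    sel_slave0 <= '0';
    sel_slave1 <= '0';
    addr_error <= '0';
    case maddr(12 downto 11) is
      when "00"   => sel_slave0 <= '1';   -- hypothetical range for slave 0
      when "01"   => sel_slave1 <= '1';   -- hypothetical range for slave 1
      when others => addr_error <= '1';   -- unmapped region: error response
    end case;
  end process;
end architecture rtl;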

FSM-M & FSM-S


Depending on whether a transaction is a read or a write operation, the request and

response processes are different. For a write transaction, the data to be written is sent out together

with the address of the target slave, and the transaction is complete when the target slave accepts

the data and acknowledges the reception of the data. For a read operation, the address of the

target slave is first sent out and the target slave will issue an accept signal when it receives the

message.

The slave then generates the required data and sends it to the bus where the data will be

properly directed to the master requesting the data. The read transaction finally completes when

the master accepts the response and issues an acknowledge signal. In the proposed bus

architecture, we employ two types of finite state machines, namely FSM-M and FSM-S to control

the flow of each transaction. FSM-M acts as a master and generates the OCP signals of a master,

while FSM-S acts as a slave and generates those of a slave. These finite state machines are

designed in a way that burst, pipelined, and out-of-order read/write transactions can all be

properly controlled.
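As a sketch of how FSM-M can implement the single (non-burst) read and write flow just described, one possible VHDL outline is shown below, using the signal encodings listed later in Tables 4.1 to 4.3; the state names and the simplified wait handling are illustrative, and burst, pipelined and out-of-order transfers are omitted.

library ieee;
use ieee.std_logic_1164.all;

-- Simplified FSM-M for one single transfer at a time: a write completes when
-- the slave accepts the command, a read waits for a DVA response after the
-- command has been accepted.
entity fsm_m_simple is
  port (
    clk, rst_n : in  std_logic;
    control    : in  std_logic_vector(2 downto 0);   -- "001" write, "010" read
    addr_in    : in  std_logic_vector(12 downto 0);
    data_in    : in  std_logic_vector(7 downto 0);
    SCmdAccept : in  std_logic;
    SResp      : in  std_logic_vector(1 downto 0);   -- "01" = DVA
    SData      : in  std_logic_vector(7 downto 0);
    MCmd       : out std_logic_vector(2 downto 0);   -- "000" idle, "001" WR, "010" RD
    MAddr      : out std_logic_vector(12 downto 0);
    MData      : out std_logic_vector(7 downto 0);
    data_out   : out std_logic_vector(7 downto 0)
  );
end entity fsm_m_simple;

architecture rtl of fsm_m_simple is
  type state_t is (ST_IDLE, ST_WRITE, ST_READ, ST_WAIT);
  signal state : state_t := ST_IDLE;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst_n = '0' then
        state <= ST_IDLE;
        MCmd  <= "000";
      else
        case state is
          when ST_IDLE =>                         -- wait for a system request
            if control = "001" then
              MCmd <= "001"; MAddr <= addr_in; MData <= data_in; state <= ST_WRITE;
            elsif control = "010" then
              MCmd <= "010"; MAddr <= addr_in; state <= ST_READ;
            end if;
          when ST_WRITE =>                        -- hold the request until accepted
            if SCmdAccept = '1' then
              MCmd <= "000"; state <= ST_IDLE;
            end if;
          when ST_READ =>                         -- command accepted, await response
            if SCmdAccept = '1' then
              MCmd <= "000"; state <= ST_WAIT;
            end if;
          when ST_WAIT =>                         -- response phase ends with DVA
            if SResp = "01" then
              data_out <= SData; state <= ST_IDLE;
            end if;
        end case;
      end if;
    end if;
  end process;
end architecture rtl;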

3.7 OCP DATAFLOW SIGNALS

The OCP interface has the dataflow signals which are divided into basic signals, burst

extensions, tag extensions, and thread extensions. A small set of the signals from the basic

dataflow group is required in all OCP configurations. Optional signals can be configured to

support additional core communication requirements. The OCP is a synchronous interface with a

single clock signal. All OCP signals are driven with respect to the rising edge of the OCP clock

and also sampled by the rising edge of the OCP clock. Except for clock, OCP signals are strictly

point-to-point and unidirectional.

The basic signals between the two cores are identified and are shown in Fig 3.6, which is the dataflow signal diagram. Here Core 1 acts as the master that gives the command to the slave, and Core 2 acts as the slave that accepts the command given by the master in order to perform an operation.

[Figure: OCP dataflow signals between Core 1 (master) and Core 2 (slave). System-side inputs: Input Addr, Control, Input Data, Burst Length; output: Output Data. Request group: MCmd, MAddr, MData, MBurstSeq, MBurstLength, MBurstPrecise, MDataLast. Response group: SResp, SData, SRespLast. Handshake: CLK, SCmdAccept.]

Fig 3.6 OCP dataflow signals

Fig 3.6 shows the OCP dataflow signals, which include the request, response and data handshake groups. The signals in the request phase are the ones used for requesting a particular operation from the slave; the request phase is ended by the SCmdAccept signal. Similarly, the signals in the response phase are the ones used for sending the proper response to the corresponding request; the response phase is ended by the SResp signal. The data handshake signals are the ones that deal with the data transfer from either the master or the slave.

The basic signals are the ones used in the simple read and write operation of the OCP master and slave. This simple operation can also support pipelining. The basic signals are extended for the burst operation, in which one request is associated with multiple data transfers. In other words, the burst extensions allow the grouping of multiple transfers that have a defined address relationship. The burst extensions are enabled only when MBurstLength is included in the interface.

The burst length represents how many write or read operations should be carried out in a burst. The burst length is given by the system to the master, which in turn gives it to the slave through the MBurstLength signal. Thus the burst length acts as an input to the master only when burst mode is enabled; in simple write and read operations, the burst length input is not needed.

From Fig 3.6, the inputs and outputs of the OCP can be identified; they are discussed as follows.

3.8 INPUTS AND OUTPUTS

Basically, the OCP here has a 13-bit address, 8-bit data, a 3-bit control signal, and a burst length of integer type. An 8K-deep memory (2^13 = 8192 addresses) is used on the slave side in order to verify the protocol functionality. The system gives the inputs to the OCP master during a write operation and receives signals from the OCP slave during a read operation, as listed below; an entity-level port sketch follows the list.

Master System

Control

o The control signal acts as an input which indicates whether a WRITE or READ operation is to be performed by the master; it is given by the processor through the “control” pin.

Input address

o The system gives the address to the master through the “addr” pin; the write or read operation is carried out at this address.

Input data

o This acts as an input on which data is given by the system to the master through the “data_in” pin; the data must be stored at the corresponding address during a write operation.

Burst Length

o This input is used only when the burst profile is used; it is of integer type and indicates the number of operations to be carried out in a burst.

Output data

o In a read operation, the master gives the address and the slave receives it. The slave then fetches the corresponding data from the addressed location, and that data is given out through the “data_out” pin.
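Putting the system-side pins together with the OCP dataflow signals of Fig 3.6 and the widths stated above, a port-level sketch of the master could look as follows; this is a declaration only, and the entity name and port grouping are illustrative.

library ieee;
use ieee.std_logic_1164.all;

-- Port-level sketch of the OCP master: system-side pins of Section 3.8 and
-- the OCP request/response signals of Tables 3.1 and 3.2.
entity ocp_master is
  port (
    -- system side
    clk           : in  std_logic;
    control       : in  std_logic_vector(2 downto 0);
    addr          : in  std_logic_vector(12 downto 0);
    data_in       : in  std_logic_vector(7 downto 0);
    burst_length  : in  integer range 0 to 8191;
    data_out      : out std_logic_vector(7 downto 0);
    -- OCP request phase
    MCmd          : out std_logic_vector(2 downto 0);
    MAddr         : out std_logic_vector(12 downto 0);
    MData         : out std_logic_vector(7 downto 0);
    MBurstLength  : out std_logic_vector(12 downto 0);
    MBurstPrecise : out std_logic;
    MBurstSeq     : out std_logic_vector(2 downto 0);
    MDataLast     : out std_logic;
    -- OCP response and handshake
    SCmdAccept    : in  std_logic;
    SResp         : in  std_logic_vector(1 downto 0);
    SData         : in  std_logic_vector(7 downto 0);
    SRespLast     : in  std_logic
  );
end entity ocp_master;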

3.9 OCP SPECIFICATION

The specifications for the Open Core Protocol are identified for the simple write and read operation, which also supports pipelining, and for the burst operation. The identified specifications are represented in tabular format.

3.9.1 SIMPLE WRITE AND READ

The basic and mandatory signals required for the simple write and read operation are tabulated in Table 3.1.

S.No. NAME WIDTH DRIVER FUNCTION

1 Clk 1 Varies OCP Clock

2 MCmd 3 Master Transfer Command

3 MAddr 13 (Configurable) Master Transfer address

4 MData 8 (Configurable) Master Write data

5 SCmdAccept 1 Slave Slave accepts transfer

6 SData 8 (Configurable) Slave Read data

7 SResp 2 Slave Transfer response

Table 3.1 Basic OCP Signal Specification


The request issued by the system is given to the slave by the MCmd signal. Similarly, in a write operation, the input address and data provided by the system are given to the slave through the MAddr and MData signals; when this information is accepted, the slave gives the SCmdAccept signal, which ensures that the system can issue the next request. During a read operation, the system issues the request and address to the slave, which sets SResp and fetches the corresponding data, which is given to the output through SData.

Clk

Input clock signal for the OCP clock. The rising edge of the OCP clock is defined as a

rising edge of Clk that samples the asserted EnableClk. Falling edges of Clk and any rising edge

of Clk that does not sample EnableClk asserted do not constitute rising edges of the OCP clock.

EnableClk

EnableClk indicates which rising edges of Clk are the rising edges of the OCP clock, that

is, which rising edges of Clk should sample and advance interface state. Use the enableclk

parameter to configure this signal. EnableClk is driven by a third entity and serves as an input to

both the master and the slave.

When enableclk is set to 0 (the default), the signal is not present and the OCP behaves as

if EnableClk is constantly asserted. In that case all rising edges of Clk are rising edges of the

OCP clock.

MAddr

The Transfer address, MAddr specifies the slave-dependent address of the resource

targeted by the current transfer. To configure this field into the OCP, use the addr parameter. To

configure the width of this field, use the addr_wdth parameter.

MCmd

Transfer command. This signal indicates the type of OCP transfer the master is

requesting. Each non-idle command is either a read or write type request, depending on the

direction of data flow.

MData

Write data. This field carries the write data from the master to the slave. The field is

configured into the OCP using the mdata parameter and its width is configured using the

data_wdth parameter. The width is not restricted to multiples of 8.


SCmdAccept

Slave accepts transfer. A value of 1 on the SCmdAccept signal indicates that the slave

accepts the master’s transfer request. To configure this field into the OCP, use the cmdaccept

parameter.

SData

This field carries the requested read data from the slave to the master. The field is

configured into the OCP using the sdata parameter and its width is configured using the

data_wdth parameter. The width is not restricted to multiples of 8.

SResp

Response field is given from the slave to a transfer request from the master. The field is

configured into the OCP using the resp parameter.

3.9.2 BURST EXTENSION

The required signals for the burst operation are identified and tabulated in Table 3.2. Burst Length indicates the number of transfers in a burst. For precise bursts, the value indicates the total number of transfers in the burst and is constant throughout the burst. Here the burst length can be configured, and it represents how many read or write operations are performed in sequence.

The Burst Precise field indicates whether the precise length of a burst is known at the start of the burst or not. The Burst Sequence field indicates the sequence of addresses for requests in a burst. The burst sequence can be incrementing, which increments the address sequentially.

S.No. NAME WIDTH DRIVER FUNCTION

1 MBurstLength 13(Configurable) Master Burst Length

2 MBurstPrecise 1 Master Given burst length is Precise

3 MBurstSeq 3 Master Address seq of burst

4 MDataLast 1 Master Last write data in burst

5 SRespLast 1 Slave Last response in burst

Table 3.2 OCP Burst Signal Specification


Each type is indicated by its corresponding encoding; for example, the incrementing sequence is indicated by setting the Burst Sequence signal to “000”. Data Last represents the last write data in a burst; this field indicates whether the current write data transfer is the last in the burst. Last Response represents the last response in a burst; this field indicates whether the current response is the last in this burst.

MBurstLength

This field indicates the number of transfers in the burst; for precise bursts it stays constant throughout the burst. For imprecise bursts, the value indicates the best guess of the

number of transfers remaining (including the current request), and may change with every

request. To configure this field into the OCP, use the burstlength parameter.

MBurstPrecise

This field indicates whether the precise length of a burst is known at the start of the burst

or not. When set to 1, MBurstLength indicates the precise length of the burst during the first request of the burst; when set to 0, MBurstLength for each request is only a hint of the remaining burst length. To configure this field into the OCP, use the burstprecise parameter.

MBurstSeq

This field indicates the sequence of addresses for requests in a burst. To configure this

field into the OCP, use the burstseq parameter.

MDataLast

Last write data in a burst. This field indicates whether the current write data transfer is the

last in a burst. To configure this field into the OCP, use the datalast parameter. When this field is

set to 0, more write data transfers are coming for the burst; when set to 1, the current write data

transfer is the last in the burst.

SRespLast

Last response in a burst. This field indicates whether the current response is the last in this

burst. To configure this field into the OCP, use the resplast parameter.

When the field is set to 0, more responses are coming for this burst; when set to 1, the current

response is the last in the burst.
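For reference, the burst-extension fields of Table 3.2 can be added to such an interface as extra ports in addition to the basic dataflow signals. The sketch below shows the master-side view; the entity name is illustrative and the widths follow Table 3.2 (MBurstLength is configurable).

  library ieee;
  use ieee.std_logic_1164.all;

  entity ocp_burst_ext_if is
    port (
      MBurstLength  : out std_logic_vector(12 downto 0); -- number of transfers in the burst
      MBurstPrecise : out std_logic;                      -- '1': burst length known up front
      MBurstSeq     : out std_logic_vector(2 downto 0);   -- "000": incrementing addresses
      MDataLast     : out std_logic;                      -- last write data of the burst
      SRespLast     : in  std_logic                       -- last response of the burst
    );
  end entity ocp_burst_ext_if;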

Thus the OCP basic block diagram, the dataflow signal diagram and the signal specifications have been presented and tabulated, giving a clear basis for designing the Open Core Protocol bus.


CHAPTER 4

IMPLEMENTATION OF OPEN CORE PROTOCOL

4.1 INTRODUCTION

The design of the Open Core Protocol starts with an initial study, based on which FSMs (Finite State Machines) are developed for the various supported operations; the FSMs are then described in VHDL. The development of the FSMs is the basic step on which the design is modelled. Each FSM explains the operation of the OCP step by step, and this development therefore acts as the foundation of the design.

The notations used while designing the OCP are listed in the Table 4.1, Table 4.2 and

Table 4.3 which are as follows.

Control Notations Used Command

000 IDL Idle

001 WR Write

010 RD Read

011 INCR_WR Burst_Write

100 INCR_RD Burst_Read

Table 4.1 Input Control Values

MCmd Notations Used Command

000 IDL Idle

001 WR Write

010 RD Read

Table 4.2 OCP Master Command (MCmd) Values

SResp Notations Used Response

00 NUL No Response

01 DVA Data Valid / Accept

Table 4.3 OCP Slave Response (SResp) Values
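The encodings of Tables 4.1, 4.2 and 4.3 can be collected in a small VHDL package, as sketched below; the package and constant names are only illustrative and may differ from the project source.

  library ieee;
  use ieee.std_logic_1164.all;

  package ocp_codes is
    -- input control values (Table 4.1)
    constant CTRL_IDL     : std_logic_vector(2 downto 0) := "000";  -- Idle
    constant CTRL_WR      : std_logic_vector(2 downto 0) := "001";  -- Write
    constant CTRL_RD      : std_logic_vector(2 downto 0) := "010";  -- Read
    constant CTRL_INCR_WR : std_logic_vector(2 downto 0) := "011";  -- Burst write
    constant CTRL_INCR_RD : std_logic_vector(2 downto 0) := "100";  -- Burst read
    -- master command values (Table 4.2)
    constant MCMD_IDL : std_logic_vector(2 downto 0) := "000";
    constant MCMD_WR  : std_logic_vector(2 downto 0) := "001";
    constant MCMD_RD  : std_logic_vector(2 downto 0) := "010";
    -- slave response values (Table 4.3)
    constant SRESP_NUL : std_logic_vector(1 downto 0) := "00";      -- No Response
    constant SRESP_DVA : std_logic_vector(1 downto 0) := "01";      -- Data Valid / Accept
  end package ocp_codes;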

[Figure 4.1 (caption below): master FSM diagram with states IDLE, WRITE, READ and WAIT; transitions depend on Control (WrReq/RdReq), SCmdAccept and SResp = DVA, with MAddr, MCmd and MData driven in the request phase and Data_out taken from SData.]

4.1.1 SIMPLE WRITE AND READ OPERATION

FSM for OCP master

The Finite State Machine (FSM) is developed for the simple write and read operation of

OCP Master. The simple write and read operation indicates that the control goes to IDLE state

after every operation. The FSM for the OCP Master – Simple Write and Read is developed and is shown in Figure 4.1. There are four states in this FSM: IDLE, WRITE, READ and WAIT.

Basically, the operation in the OCP is carried out in two phases: 1. Request phase, 2. Response phase.

Initially the control is in the IDLE state (Control = “000”), in which all the outputs such as MCmd, MAddr and MData are set to “don’t care”. When the system issues a write request to the master, the control moves to the WRITE state (Control = “001”). In this state, the address and the data to be written are presented to the slave, and the transfer completes only when SCmdAccept is asserted high. If SCmdAccept is not set, the write operation is still in progress and the control remains in the WRITE state. Once the write operation is over, the control returns to the IDLE state and then checks for the next request.

Fig 4.1 FSM for OCP Master - Simple Write and Read

[Figure 4.2 (caption below): slave FSM diagram with states IDLE, WRITE and READ; transitions depend on MCmd (WrReq/RdReq), with Store_Mem written from MData on writes and SData driven from Store_Mem with SResp = DVA on reads.]

When a read request is made, the control goes to the READ state (Control = “010”) and the address is sent to the slave, which in turn asserts the SCmdAccept signal that ends the request phase. If SCmdAccept is set but SResp is not yet Data Valid (DVA), the control goes to the WAIT state and waits for the SResp signal. When the read operation is over, SResp is set to DVA and the data for the corresponding address is captured. The SResp signal thus ends the response phase, and the control returns to the IDLE state and checks for the next request.
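A condensed VHDL sketch of this master FSM is given below. It follows the IDLE/WRITE/READ/WAIT structure of Figure 4.1, but it is only an illustration: the entity and signal names, the 8-bit widths and the exact handshake timing are assumptions rather than the project's actual code.

  library ieee;
  use ieee.std_logic_1164.all;

  entity ocp_master_sketch is
    port (
      Clk        : in  std_logic;
      control    : in  std_logic_vector(2 downto 0);  -- request code (Table 4.1)
      addr_in    : in  std_logic_vector(7 downto 0);
      data_in    : in  std_logic_vector(7 downto 0);
      SCmdAccept : in  std_logic;
      SResp      : in  std_logic_vector(1 downto 0);
      SData      : in  std_logic_vector(7 downto 0);
      MCmd       : out std_logic_vector(2 downto 0);
      MAddr      : out std_logic_vector(7 downto 0);
      MData      : out std_logic_vector(7 downto 0);
      data_out   : out std_logic_vector(7 downto 0)
    );
  end entity ocp_master_sketch;

  architecture rtl of ocp_master_sketch is
    type state_t is (S_IDLE, S_WRITE, S_READ, S_WAIT);
    signal state : state_t := S_IDLE;
  begin
    process (Clk)
    begin
      if rising_edge(Clk) then
        case state is
          when S_IDLE =>                         -- wait for a request from the system
            MCmd <= "000";
            if control = "001" then              -- write request
              MCmd  <= "001";
              MAddr <= addr_in;
              MData <= data_in;
              state <= S_WRITE;
            elsif control = "010" then           -- read request
              MCmd  <= "010";
              MAddr <= addr_in;
              state <= S_READ;
            end if;
          when S_WRITE =>                        -- hold the request until the slave accepts
            if SCmdAccept = '1' then
              MCmd  <= "000";
              state <= S_IDLE;
            end if;
          when S_READ =>                         -- SCmdAccept ends the request phase
            if SCmdAccept = '1' then
              MCmd <= "000";
              if SResp = "01" then               -- DVA already present
                data_out <= SData;
                state    <= S_IDLE;
              else
                state <= S_WAIT;                 -- wait for the response phase
              end if;
            end if;
          when S_WAIT =>                         -- response phase: wait for DVA
            if SResp = "01" then
              data_out <= SData;
              state    <= S_IDLE;
            end if;
        end case;
      end if;
    end process;
  end architecture rtl;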

FSM for OCP slave

The FSM for the OCP Slave, which supports the simple write and read operation, is developed and is shown in Figure 4.2.

Fig 4.2 FSM for OCP slave - simple write and read

The slave enters the respective state based on the MCmd issued by the master, and its outputs are SCmdAccept and SResp. Initially the control is in the IDLE state; when the master issues a write request, the control goes to the WRITE state, in which the data is written to the memory address location sent by the master. Once the write operation is finished, the SCmdAccept signal is set high and returned to the master. When MCmd is given as a read request, the control moves to the READ state, in which the data is read from the memory address location given by the master. SCmdAccept is then set high and SResp is set to DVA, which indicates that the read operation is over, and the control goes back to the IDLE state.
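A matching sketch of the slave side, with a small internal memory, is given below. Again this is illustrative only: the memory depth, the widths and the one-cycle handshake are assumptions, and the project's actual slave may register its outputs differently.

  library ieee;
  use ieee.std_logic_1164.all;
  use ieee.numeric_std.all;

  entity ocp_slave_sketch is
    port (
      Clk        : in  std_logic;
      MCmd       : in  std_logic_vector(2 downto 0);
      MAddr      : in  std_logic_vector(7 downto 0);
      MData      : in  std_logic_vector(7 downto 0);
      SCmdAccept : out std_logic;
      SResp      : out std_logic_vector(1 downto 0);
      SData      : out std_logic_vector(7 downto 0)
    );
  end entity ocp_slave_sketch;

  architecture rtl of ocp_slave_sketch is
    type mem_t is array (0 to 255) of std_logic_vector(7 downto 0);
    signal store_mem : mem_t;
  begin
    process (Clk)
    begin
      if rising_edge(Clk) then
        SCmdAccept <= '0';
        SResp      <= "00";                                    -- NUL by default
        case MCmd is
          when "001" =>                                        -- write request
            store_mem(to_integer(unsigned(MAddr))) <= MData;   -- store data at MAddr
            SCmdAccept <= '1';                                 -- write accepted
          when "010" =>                                        -- read request
            SData      <= store_mem(to_integer(unsigned(MAddr)));
            SCmdAccept <= '1';
            SResp      <= "01";                                -- DVA: read data valid
          when others =>
            null;                                              -- idle: no transfer
        end case;
      end if;
    end process;
  end architecture rtl;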


Simulation result for simple write and read

The FSMs developed above for the OCP master and slave, which support the simple write and read operation, are described in VHDL and simulated. The designed OCP master and slave are integrated into a single design, and the simulated waveform represents the complete transaction of the simple write and read operation from master to slave and vice-versa, as shown in Figure 4.3.

Fig 4.3 Waveform for OCP master and slave - simple write and read

The integrated OCP master and slave is simulated, which clearly demonstrates the operation of the FSM developed for the simple write and read. The input data is written to the 0th and 3rd memory address locations during the write operation and is read out by giving the corresponding addresses during the read operation.

[Figure 4.4 (caption below): master burst FSM diagram with states IDLE, WRITE, READ and WAIT; transitions depend on Control, SCmdAccept, SResp = DVA and whether Count has reached BurstLength; MAddr, MCmd, MBurstLength and MData are driven in the request phase and Data_out is taken from SData.]

FSM for OCP master

The FSM for the OCP master, which supports the burst extension, is developed with respect to its functionality and is shown in Figure 4.4.

Fig 4.4 FSM for OCP master – burst operation

Note

In the FSM with burst extension shown in Figure 4.4, the transitions taken when (Count = BurstLength) are the same as those taken when (Count != BurstLength). The only difference lies in the address generation: when (Count = BurstLength), the address is generated from the starting location, and when (Count != BurstLength), the address is generated from the previous location.

The basic operation of this burst extension remains the same as in the previously developed FSMs. Initially the control is in the IDLE state and goes to the WRITE state when a burst write request is given. In the burst extension, the mandatory signal is the burst length, which gives the number of transfers in a burst. A counter is implemented for this operation: it starts counting, and the address generation is started along with it. When SCmdAccept is set high, the control checks whether the count value has reached the burst length. If not, the address generation continues; if the count has reached the burst length, the count is reset to zero and the address generation restarts from the initial location.

Similarly, for a burst read request the control is in the READ state, in which the count is processed; when SCmdAccept is set high, the control goes to the WAIT state. In the WAIT state the counting is paused, and the address generation is therefore stopped as well. Once SResp is set to DVA, the counting resumes and the address generation continues. The data corresponding to each generated address is read from the memory and sent to the master through the SData signal. Thus, after every burst operation, either write or read, the control returns to the IDLE state, where the next request is checked and performed accordingly.
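The counter-based address generation described above can be sketched in VHDL as follows. The port names, the 13-bit count (to match MBurstLength) and the 8-bit address are assumptions made for this illustration, not the project's actual code.

  library ieee;
  use ieee.std_logic_1164.all;
  use ieee.numeric_std.all;

  entity burst_addr_gen is
    port (
      Clk        : in  std_logic;
      load       : in  std_logic;               -- pulse at the start of a burst
      advance    : in  std_logic;               -- e.g. SCmdAccept (writes) or SResp = DVA (reads)
      burst_len  : in  unsigned(12 downto 0);   -- MBurstLength
      start_addr : in  unsigned(7 downto 0);    -- initial location of the burst
      addr       : out unsigned(7 downto 0)     -- current transfer address
    );
  end entity burst_addr_gen;

  architecture rtl of burst_addr_gen is
    signal count    : unsigned(12 downto 0) := (others => '0');
    signal cur_addr : unsigned(7 downto 0)  := (others => '0');
  begin
    addr <= cur_addr;

    process (Clk)
    begin
      if rising_edge(Clk) then
        if load = '1' then                      -- new burst: start from the initial address
          count    <= (others => '0');
          cur_addr <= start_addr;
        elsif advance = '1' then
          if count = burst_len - 1 then         -- last transfer of the burst
            count    <= (others => '0');        -- reset the count
            cur_addr <= start_addr;             -- address restarts from the initial location
          else
            count    <= count + 1;              -- incrementing burst sequence
            cur_addr <= cur_addr + 1;
          end if;
        end if;
      end if;
    end process;
  end architecture rtl;

Holding advance low (as in the WAIT state) freezes both the count and the address, matching the behaviour described above.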

FSM for OCP slave

The FSM for the OCP slave with the burst extension is developed and is shown in

Figure 4.5.

Note

As in Figure 4.4, the transitions taken when (Count = BurstLength) are the same as those taken when (Count != BurstLength). The only difference is the address generation: when (Count = BurstLength), the address is generated from the starting location, and when (Count != BurstLength), the address is generated from the previous location or value.

The initial state is IDLE; when MCmd is set to a write request, the control goes to the WRITE state. Here the burst length and the count are maintained in the slave as well, because the slave may not know in advance whether the burst extension is enabled. The generated address and the input data are given to the slave, which stores the data at the corresponding address and asserts the SCmdAccept signal high. The control then checks the count and the next MCmd request and proceeds accordingly.

[Figure 4.5 (caption below): slave burst FSM diagram with states IDLE, WRITE and READ; transitions depend on MCmd (WrReq/RdReq) and whether Count has reached MBurstLength; Store_Mem is written from MData on writes and SData is driven from Store_Mem on reads.]

Fig 4.5 FSM for OCP slave – burst operation

When MCmd carries a read request, the control goes to the READ state and the data corresponding to the generated address is read, during which SCmdAccept is set high. Once the read process is over, SResp is set to DVA and the control checks both the count and the next request.
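The same per-burst count can also be used to derive the MDataLast and SRespLast flags described in Section 3.9.2. The sketch below shows one possible way of doing this; the entity and port names and the widths are illustrative assumptions, not the project source.

  library ieee;
  use ieee.std_logic_1164.all;
  use ieee.numeric_std.all;

  entity burst_last_flag is
    port (
      Clk          : in  std_logic;
      xfer_done    : in  std_logic;             -- one transfer of the burst completed this cycle
      MBurstLength : in  unsigned(12 downto 0); -- number of transfers in the burst
      last_flag    : out std_logic              -- could drive MDataLast or SRespLast
    );
  end entity burst_last_flag;

  architecture rtl of burst_last_flag is
    signal count : unsigned(12 downto 0) := (others => '0');
  begin
    -- the flag is high while the transfer being served is the final one of the burst
    last_flag <= '1' when count = MBurstLength - 1 else '0';

    process (Clk)
    begin
      if rising_edge(Clk) then
        if xfer_done = '1' then
          if count = MBurstLength - 1 then
            count <= (others => '0');           -- burst complete: restart the count
          else
            count <= count + 1;
          end if;
        end if;
      end if;
    end process;
  end architecture rtl;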


CHAPTER 5

RESULTS

5.1 SIMULATION RESULTS FOR BURST OPERATION

The developed FSMs are described in VHDL and the master and slave are integrated. The simulation result of the integrated design gives a clear view of the OCP burst write operation, which is shown in Figure 5.1.

Here the burst length is given as 8, so addresses are generated for a number of memory locations equal to the burst length and the corresponding input data is supplied. The sequential address generation and the writing of the input data to the corresponding memory locations are clearly represented in the waveform. The increment of the count value, on which the address sequence is based, is also shown in the waveform.

Fig 5.1 Waveform for OCP master and slave – burst write operation


The simulated waveform for the burst read operation is shown in Figure 5.2: as the addresses are generated in sequence, the corresponding data stored in the memory are read out. Here also the count is implemented and is incremented up to the burst length. The master and slave go to the IDLE state when the burst operation is over, which is indicated by the count; when the count reaches the given burst length it is reset, and the address generation restarts from the initial value.

Fig 5.2 Waveform for OCP master and slave – burst read operation

Thus the simulation results demonstrate the operation of the developed FSMs for the master and slave, which support the simple write and read operation, the pipelining operation and finally the burst operation.


5.2 SYNTHESIS RESULTS

5.3 BLOCK LEVEL DIAGRAM (INTERNAL DIAGRAM)


5.3.1 ADVANCED HDL SYNTHESIS REPORT

Macro Statistics

# Adders/Subtractors : 13

13-bit adder : 4

16-bit adder : 1

4-bit adder : 8

# Registers : 2376

Flip-Flops : 2376

# Comparators : 19

13-bit comparator less : 3

8-bit comparator equal : 8

8-bit comparator not equal : 8

# Multiplexers : 13

13-bit 4-to-1 multiplexer : 1

2-bit 4-to-1 multiplexer : 1

3-bit 4-to-1 multiplexer : 1

8-bit 4-to-1 multiplexer : 2

8-bit 64-to-1 multiplexer : 8

5.3.2 FINAL REPORT

Final Results

RTL Top Level Output File Name : ocp_master_slave.ngr

Top Level Output File Name : ocp_master_slave

Output Format : NGC

Optimization Goal : Speed

Keep Hierarchy : NO

Design Statistics

# IOs : 42

Cell Usage

# BELS : 7818


# BUF : 7

# GND : 1

# INV : 8

# LUT1 : 52

# LUT2 : 100

# LUT2_D : 5

# LUT2_L : 6

# LUT3 : 2314

# LUT3_D : 48

# LUT3_L : 7

# LUT4 : 2783

# LUT4_D : 270

# LUT4_L : 21

# MUXCY : 71

# MUXF5 : 1165

# MUXF6 : 512

# MUXF7 : 256

# MUXF8 : 128

# VCC : 1

# XORCY : 63

# FlipFlops/Latches : 2368

# FD : 2123

# FDE : 190

# FDR : 21

# FDRS : 1

# FDS : 33

# Clock Buffers : 1

# BUFGP : 1

# IO Buffers : 41

# IBUF : 33

# OBUF : 8


5.3.3 DEVICE UTILIZATION SUMMARY

Selected Device: 3s500efg320-5

Number of Slices: 2947 out of 4656 63%

Number of Slice Flip Flops: 2368 out of 9312 25%

Number of 4 input LUTs: 5614 out of 9312 60%

Number of IOs: 42

Number of bonded IOBs: 42 out of 232 18%

Number of GCLKs: 1 out of 24 4%

5.3.4 TIMING SUMMARY

Speed Grade: -5

Minimum period: 13.413ns (Maximum Frequency: 74.555MHz)

Minimum input arrival time before clock: 8.467ns

Maximum output required time after clock: 5.184ns

Maximum combinational path delay: No path found
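As a quick consistency check (not part of the tool report itself), the maximum frequency follows directly from the minimum period: f_max = 1 / T_min = 1 / 13.413 ns ≈ 74.555 MHz, which matches the value reported above.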

5.4 CONCLUSION

Based on the literature review, the working of OCP masters and slaves is made clear, and the design is made according to the identified specifications.

Initially, FSMs are developed separately for both the master and the slave of the OCP, covering the simple write and read operation and the burst operation.

The developed FSMs of the OCP are modelled using VHDL.

Finally, the OCP is designed in such a way that the transaction between master and slave is carried out with proper delay and timing.

Screenshots of the simulated waveform results are displayed and explained with respect to the design behaviour.


CHAPTER 6

CONCLUSION AND FUTURE WORK

6.1 CONCLUSION

Cores with OCP interfaces and OCP interconnect systems enable true modular, plug-and-play integration, allowing system integrators to choose the most suitable cores and the interconnect system best matched to the application. This allows the designers of the cores and of the system to work in parallel and shortens design times. In addition, keeping system logic out of the cores allows the cores to be reused without additional effort to re-create them. These intellectual properties can then be used according to the demands of the real-time application.

The basic aim of our project is to model the master and slave of the OCP, and we have successfully modelled both the master and the slave, along with the internal memory, using VHDL. The simulation results show that the communication between different IP cores using the OCP is correct.

All commands and data are successfully transferred from one IP core to the other using the OCP, with no loss of data or control information. The OCP supports the simple write and read operation and the burst extension. Based on the results obtained, the burst extension is seen to automate the address generation: only the initial address is provided to the protocol. The various scenarios for each component in the OCP design are verified during simulation with respect to its expected behaviour.

6.2 FUTURE WORK

The design can be further extended by developing a complete system around it. For example, this protocol can be used to interface an ARM processor with another device (such as an SRAM), provided both IP cores are OCP compliant.

The burst mode can be extended further to include the other supported burst types. The thread and tag extensions can also be included in this developed protocol with the corresponding supporting signals. This project work provides an ideal platform for enhancement or further development of the OCP.
