110
1 Memory & Programmable Logic Logic and Digital System Design - CS 303 Erkay Savaş Sabancı University 06/20/22 CS 303

1 Memory & Programmable Logic Logic and Digital System Design - CS 303 Erkay Savaş Sabancı University 10/25/2015CS 303

Embed Size (px)

Citation preview

  • *Memory & Programmable LogicLogic and Digital System Design - CS 303Erkay SavaSabanc University*CS 303

    CS 303

  • *Memory UnitA device to which binary information is transferred for storage and from which information is available when needed for processingWhen data processing takes placeinformation from the memory is transferred to selected register in the processing unitIntermediate and final results obtained in the processing unit are transferred back to the memory for storage*CS 303

    CS 303

  • *Memory UnitUsed to communicate with an input/output devicebinary information received from an input device is stored in memoryinformation transferred to an output device is taken from memoryA collection of cells capable of storing a large quantity of binary informationTwo typesRAM (random-access memory)ROM (read-only memory)*CS 303

    CS 303

  • *ClassificationRAMWe can read stored information (read operation)Accepts new information for storage (write operation)Perform both read and write operationROMonly performs read operationexisting information cannot be modifiedProgramming the device specifying the binary information and storing it within the programmable devicea.k.a. programmable logic device*CS 303

    CS 303

  • *Programmable Logic Devices PLDROM is one exampleProgrammable Logic Array (PLA)Programmable Array Logic (PAL)Field Programmable Gate Array (FPGA)PLD isan integrated circuit (IC) with internal logic gatesInterconnection between the gates can be programmed through fusesAt the beginning they are all intactBy programming we remove some of them, while keeping the others*CS 303

    CS 303

  • *Random Access Memory (RAM)RAMThe reason for the nameThe time it takes to transfer information to or from any desired random location is always the same.Wordgroups of bits in which a memory unit stores informationAt one time, memory move in and out of the storage a word of information8 bit byte16 bit 32 bit Capacity is usually given in number of bytes *CS 303

    CS 303

  • *Memory UnitBlock diagramMemory Unit2k wordsn bit per word*CS 303

    CS 303

  • *SpecificationA memory unit is specified bythe number of words it containsnumber of bits in each wordEach word (or location for a word) in memory is assigned an identification numberaddress0 to 2k-1Selectionselection process to read and write a word is done by applying k-bit address to the address linesA decoder accepts the address and selects the specified word in the memory*CS 303

    CS 303

  • *Memory Map and Address Selection1 K = 2101 M = 2201 G = 2301 T = 2401 K x 16 Memory*CS 303

    CS 303

  • *Write and Read OperationsWritetransfer inReadtransfer outSteps for write operationApply the binary address of the desired word to the address linesApply the data word that is be stored in memory to the (data) input linesActivate the write input*CS 303

    CS 303

  • *Read OperationStepsApply the binary address of the desired word to the address linesActivate the read inputThe desired word will appear on the (data) output linesreading does no affect the content of the word *CS 303

    CS 303

  • *Control Inputs to Memory ChipCommercial memory components usually provide a memory enable (or chip select) control inputmemory enable is used to activate a particular memory chip in a multi-chip implementation of a large memory*CS 303

    Memory EnableRead/WriteMemory Operation0XNone10write11read

    CS 303

  • *TimingMemory does not have to use an internal clockIt only reacts to the control inputs, e.g., read and writeoperation of a memory unit is controlled by an external device (e.g. CPU) that has its own clockAccess timethe time required to select a word and read itCycle timethe time required to complete a write operation*CS 303

    CS 303

  • *Write CycleCPU clock: 2 GHzCycle time: 1.25 nsCPU must devote three clock cycles for each write operation*CS 303

    CS 303

  • *Read CycleCPU clock: 2 GHzAccess time: 1.25 nsThe desired word appears at output, 1.25 ns aftermemory enable is activated*CS 303

    CS 303

  • *Types of Memory 1/2RAMaccess time is always the same no matter where the desired data is actually locatedSequential-access memoryAccess time is variablee.g., magnetic disks, tapesRAMSRAM (static RAM)latches, stores information as long as power is onDRAM (dynamic RAM)information is stored as charge on a capacitorrefreshing is necessary due to discharge*CS 303

    CS 303

  • *Types of Memory 2/2Volatile memoryWhen the power is turned off, the stored information is lostRAM (SRAM or DRAM)Nonvolatile memoryretains the stored information after removal of powermagnetic disksdata is represented as the direction of magnetizationROMprograms needed to start a computer are kept in ROM *CS 303

    CS 303

  • *Memory CellEquivalent logic of a memory cell for storing one bit of information*CS 303Q

    SRQ(t+1)00Q01010111X

    CS 303

  • *4 x 4 RAM*CS 303

    CS 303

  • *4 x 4 RAM01x3x2x1x0000000000000x3x2x1x0*CS 303

    CS 303

  • *Commercial RAMsPhysical constructionCapacity of thousands of wordseach word may range from 1 to 64 bitsExample:We have memory chips of 10244Logical construction: 10248*CS 303

    CS 303

  • *Combining Memories20484 RAMXX.X0XX.X1*CS 303

    CS 303

  • *Coincident DecodingA memory with 2k words requires a k 2k decoderk 2k decoder requires 2k AND gates with k inputs per gateThere are ways to reduce the total number of gates and number of inputs per gateTwo dimensional selection schemeArrange the memory words in an array that is as close as possible to squareUse two k/2-input decoders instead of one k-input decoder.One decoder performs the row selectionThe other does the column selection*CS 303

    CS 303

  • *Example: Coincident Decoding1024 word memory*CS 303

    CS 303

  • *DRAMsSRAM memory is expensiveOne cell typically contains four to six transistorsUsually used for on-chip cache memories and embedded systems (cameras, smart phones, etc.)DRAM is much less expensiveOne MOS transistor and a capacitorFour times the density of SRAM in a given chip areacost per bit storage is three to four times less than SRAMlow power requirementPerfect technology for large memories such as main memoryMost DRAMs have short word sizes *CS 303

    CS 303

  • *DRAMs and Address MultiplexingIn order to reduce number of pins on a memory chip, the same pins are used for both row and column addressesExample: 1 GB DRAM15-bitaddressRASCAS*CS 303

    CS 303

  • *32768 x 32768memoryarrayread/writeDRAMs and Address Multiplexing15-bitaddressRASCAS15-bit column register15 x 32768 decoder15 x 32768 decoder15-bit row register*CS 303

    CS 303

  • *Read-Only MemoryROMmemory device in which permanent binary information is storedBinary information must be specified by the designerIt then is embedded in the unit to form the required interconnection patternnonvolatile Block diagramno data inputsenable inputsthree-state outputsProgramming input*CS 303

    CS 303

  • *Example: ROM32 x 8 ROMeach OR gatesare considered ashaving 32 inputs*CS 303

    CS 303

  • *Example: ROMNumber of connections32 8 ROM has 32 8 = 256 internal connectionsIn general2k n ROM will have a k 2k decoder and n OR gatesEach OR gate has 2k inputsinputs of every OR gate are initially connected to each output of the decoderThese intersections are programmablethey are initially closed (connected to the input of OR gate)A fuse is used to connect two wiresDuring programming, some of these fuses are blown by applying high voltage.*CS 303

    CS 303

  • *Programming ROMInternal storage specified by a tableExample: 32 8 ROM*CS 303

    InputsOutputsI4I3I2I1I0A7A6A5A4A3A2A1A000000101101100000100011101000101100010100011101100101110000001001111011110001011110010010101111100110011

    CS 303

  • *Programming ROM.*CS 303

    InputsOutputsI4I3I2I1I0A7A6A5A4A3A2A1A00000010110110

    CS 303

  • *Combinational Circuit Design with ROMFormerly,we have shown that a k 2k decoder generates 2k minterms of k input variablesFurthermore,by inserting OR gates to sum these minterms, we were able to realize any desired combinational circuit.A ROM is essentially a device that includes both the decoder and the OR gates within a single device.first interpretation: a memory unit that stores words*CS 303

    CS 303

  • *Combinational Circuit Design with ROMROM (cont.)Second interpretation: a programmable device that can realize any combinational circuit*CS 303

    CS 303

  • *Combinational Circuit Design with ROMExample: Truth table*CS 303

    InputsOutputsA2A1A0B5B4B3B2B1B0000000000001000011010000100011001010100010000101011001110100111111110001

    CS 303

  • *Example: Design with ROM8 x 6 ROM would sufficeROM Truth Table*CS 303

    InputsOutputsA2A1A0B5B4B3B2B1B0000000000001000011010000100011001010100010000101011001110100111111110001

    CS 303

  • Example: Design with ROM**CS 303

    InputsOutputsA2A1A0B5B4B3B2B1B0000000000001000011010000100011001010100010000101011001110100111111110001

    CS 303

  • *Types of ROM 1/2Programming can be done in different waysMask programming: customer provides the truth tablemanufacturer generates the mask for the truth tablecan be costly, since generating a custom mask is charged to the customer.economical only if a large quantity of the same ROM configuration is to be ordered.Field ProgrammableProgrammable ROM (PROM):Customer can program the ROM by blowing fuses by applying high voltage through a special pin Special instrument called PROM programmer is needed.*CS 303

    CS 303

  • *Types of ROM 2/2Programming ROM and PROMs is irreversible.Erasable PROM (EPROM)can be programmed repeatedly. EPROM is placed under a special ultra-violet light for a given period of timeAt the end, the former program is erasedAfter erasure, EPROM becomes ready for another programmingElectronically erasable PROM (EEPROM or E2PROM) *CS 303

    CS 303

  • *Programmable Logic DevicesEPROM is an example of combinational programmable logic device (PLD)Configuration of EPROM*CS 303

    CS 303

  • *Other PLDsTwo other types*CS 303

    CS 303

  • *Programmable Logic Array (PLA)Similar to PROMHowever, PLA does not generate all the mintermsDecoder is replaced by an array of AND gatescan be programmed to generate any product term of input variablesThe product terms are then connected to OR gatesthat provide the sum of productsInputsprogrammable AND arrayprogrammableOR arrayOutputs*CS 303

    CS 303

  • *PLA: ExamplePLA: 3 inputs, 5 product terms and 4 outputsF1F2F1 = (A + BC)F2 = AC + ABF3 = BC + ABF4 = BC + AF1 = AB + AC*CS 303

    CS 303

  • *PLA Programming TableF1 = (A + BC)F2 = AC + ABF3 = BC + ABF4 = BC + Ahttp://tams-www.informatik.uni-hamburg.de/applets/hades/webdemos/42-programmable/10-pla/pla.html*CS 303

    OutputsInputsCTProduct TermABCF1F2F3F411--12-00131-0-45

    CS 303

  • *Size of PLASpecified bynumber of inputs, number of product terms, number of outputsA typical IC PLA (F100)16 inputs, 48 product terms, and 8 outputsn input, k product terms, m output PLA hask AND gates, m OR gates, m XOR gates2n k connections between input and the AND arrayk m connections between the AND and OR arrays2m connections associated with XOR gates(2n k + k m + 2m) connections to program*CS 303

    CS 303

  • *PLA: ExamplePLA: 3 inputs, 5 product terms and 4 outputsF1F2F1 = (A + BC)F2 = AC + ABF3 = BC + ABF4 = BC + A*CS 303

    CS 303

  • *Programming PLAOptimizationnumber of literals in a product term is not importantWhen implementing more than one function, functions must be optimized together in order to share more product terms multiple output optimization (espresso)both the true and complement of each function should be simplified to see which one requires fewer number of product terms http://diamond.gem.valpo.edu/~dhart/ece110/espresso/tutorial.html*CS 303

    CS 303

  • *Example: Programming PLATwo functionsF1 (A, B, C) = (0, 1, 2, 4)F2 (A, B, C) = (0, 5, 6, 7)F1 = AB + AC + BCF1 = (AB + AC + BC)F2 = AB + AC + ABCF2 = (AC + AB + ABC)*CS 303

    BCA000111100110111000

    BCA000111100100010111

    CS 303

  • *Example: Programming PLAPLA programming tableF1 = (AB + AC + BC)F2 = AB + AC + ABC*CS 303

    OutputsInputsProduct TermABCF1F2AB111-AC21-1BC3-11ABC4000

    CS 303

  • *Programmable Array Logic (PAL)Easier to program than PLABut, not as flexibleA typical PAL8 inputs, 8 outputs, 8-wide AND-OR array*CS 303

    CS 303

  • *Example: PALF1F2F3F4I1I2I3I4AND gates inputs123456789101112productterm*CS 303

    CS 303

  • *Design with PALEach Boolean function must be simplified to fit into each section.Product terms cannot be shared among OR gatesEach function can be simplified by itself without regard to common product termsThe number of product terms in each section is fixedIf the number of product terms is too many, we may have to use two sections to implement the function.*CS 303

    CS 303

  • *Example: Design with PALFour functionsA(x, y, z, t) = (2, 12, 13)B(x, y, z, t) = (7, 8, 9, 10, 11, 12, 13, 14, 15)C(x, y, z, t) = (0, 2, 3, 4, 5, 6, 7, 8, 10, 11, 15)D(x, y, z, t) = (1, 2, 8, 12, 13)First step is to simplify four functions separatelyA = xyz + xyztB = x + yztC = xy + zt + ytD = xyz + xyzt + xyt + xyzt*CS 303

    CS 303

  • Example:Design withPAL*A = xyz + xyztB = x + yztC = xy + zt + ytD = xyz + xyzt + xyt + xyzt= A + xyt + xyzt*CS 303t

    CS 303

  • *Example: Design with PALD = A + xyt + xyzt*CS 303

    CS 303

  • *Example: Design with PALABCDxyztAND gates inputsxxyyzzttAA123456789101112A = xyz + xyztD = A + xyt + xyzt*CS 303

    CS 303

  • *PAL: BCD to 7-Segment-Display Decoder(ABCD)10 (a b c d e f g)a = A + BD + C + BDb = A + CD + CD + Bc = A + B + C d = BD + CD + BCD + BCe = BD + CDf = A + CD + BD + BCg = A + CD + BC + BCwe need 4 inputs, 7 outputs, 4 product terms per outputP16H8: 10 inputs, 8 outputs, 7 product terms per output P14H8: 14 inputs, 8 outputs (2 have four product terms, 6 have 2 product terms) *CS 303

    CS 303

  • *7-Segment-Display DecoderDifferent way to optimizemultiple output optimization espresso supports thisa = BCD + CD + BD + A + BCDb = BD + CD + CD + BDc = BD + BCD + CD + CD + BCDd = BC + BCD + BD + BCD e = BD + BCDf = BCD + CD + A + BCDg = BC + BC + A + BCD9 product terms in total (previous one has 14)PLA can also be usedF100: 16 inputs, 48 product terms, and 8 outputsa = A + BD + C + BDb = A + CD + CD + Bc = A + B + C d = BD + CD + BCD + BCe = BD + CDf = A + CD + BD + BCg = A + CD + BC + BC*CS 303

    CS 303

  • *Sequential PLDsSo far, we have seen PLD that can realize only combinational circuitsHowever, digital systems are designed using both combinational circuits (gates) and flip-flops.With PLDs, we need to use external flip-flops to realize sequential circuit functions.Different typesSequential (or simple) programmable logic device (SPLD)Complex programmable logic device (CPLD)Field programmable gate array (FPGA)*CS 303

    CS 303

  • *SPLDAdditional programmable connections are available to include flip-flop outputs in the product terms formed with AND array.Flip-flops may be of D or JK typeExample: AMD 22V1024 pin device, 10 output logic macrocellsThe number of product term allocated to an output varied from 8 to 16*CS 303

    CS 303

  • AMD 22V10*CS 303*

    CS 303

  • *SPLD MacrocellSPLD is usually PAL + D flip-flopsEach section in SPLD is called macrocell.A macrocellsum-of-products combinational logic + optional flip-flop8-10 macrocells in one IC package *CS 303

    CS 303

  • *Additional SPLD Functionalities Additional SPLD FunctionalitiesBypass circuitry for the output (bypassing) flip-flopselection of clock edge polarityXOR gate for selection of true or complement of output*CS 303

    CS 303

  • Example: serial adderOutput equation:S = x y Q = xyQ + xyQ + xyQ + xyQ Flip-flop input equation:D = Q(t+1) = xy + xQ + yQ****CS 303

    CS 303

  • *Example: Serial Adder with SPLDQ(t+1) = xy + xQ + yQ*CS 303

    CS 303

  • *Example: Serial Adder with SPLDS = xyQ + xyQ + xyQ + xyQ*CS 303

    CS 303

  • Generic Array Logic (GAL)Similar to SPLD with PALPAL uses fuses while GAL uses electrically erasable CMOS (E2CMOS) cell at each intersection**CS 303

    CS 303

  • *Complex Programmable Logic DevicePLDPLDPLDPLDPLDPLDPLDPLDprogrammable switch matrixIOblockIOblockExample: Altera MAX 7000-series CPLD with 2500 gates*CS 303

    CS 303

  • *FPGAField Programmable Gate ArrayFPGA is a VLSI circuitField programmable means user can program it in his own locationGate array consists of a pattern of gates fabricated in an area of siliconpattern of gates are repeated many (thousand) timesone thousand to millions of gates are fabricated within a single IC chip*CS 303

    CS 303

  • Basics of FPGA**CS 303

    CS 303

  • *Basics of FPGACLBCLBPSMPSMPSMPSMCLBCLBCLBCLBCLBCLBCLBCLB*CS 303

    CS 303

  • *Basics of FPGAA typical FPGA consists of an array of hundreds or thousands of configurable logic blocks (CLB)CLBs are connected to each other via programmable interconnection structureCLBs are surrounded by I/O blocks for basic communication with outside world.CLBs consist of look-up tables, multiplexers, gates, and flip flopslook-up table is a truth table stored in an SRAMprovides the combinational circuit functions for the logic block. It is like a ROM implemented as SRAM*CS 303

    CS 303

  • *Xilinx FPGA CLB (Partial View)*CS 303

    CS 303

  • *Inside CLBLookup Table16 2 ROM (implemented as SRAM)Implements two four-variable Boolean functionsCan also be configured as memory (RAM)Multiplexers2k input multiplexerscontrolled by k SRAM cellsFlip-flopprovides operation as a sequential systemcan be configured as a latch as well.*CS 303

    CS 303

  • Xilinx XC 4000 CLB*

    G4G3G2G1F4F3F2F1LogicFunction of G1 G4 LogicFunction of F, G, H1 LogicFunction of F1 F4GFDINF G H DINF G H HH G K (clock)ECDQSDRD1S/R ControlS/R ControlECDQSDRD1H1DIN / H2SR / H0ECH F 4C1 C4bypassXYXQYQ*CS 303

    CS 303

  • Lookup Tables (LUT)Combinational circuits are implemented using LUT

    *G4G3G2G1LogicFunction of G1 G4 G*CS 303

    G4(A)G3(B)G2(C)G1(D)G(y)00000000110010100111010000101101101011111000010011101011011111000110101110011110

    CS 303

  • LUT Using three function generators F, G, and HA single CLB can implement any Boolean function of five variables General form of Boolean function of five variables: H = F(x, y, z, t)w+ G(x, y, z, t )w Some functions of up to nine variablesNine logic inputs: F1, F2, F3, F4, G1, G2, G3, G4, and H1.**CS 303

    CS 303

  • LUT as MemoryLUTs can be configured as memory blockOne 16x2 memory moduleOne 32x1 memory moduleDual-ported 16x1 memory moduleSynchronous-edge-triggered and asynchronous-memory interfaces are supported**CS 303

    G4G3G2G1F4F3F2F1LogicFunction of G1 G4 LogicFunction of F1 F4GFFG

    CS 303

  • LUT as Memory*

    G4G3G2G1F4F3F2F1GF16x2 configurationAddressOne write*CS 303

    CS 303

  • LUT as Memory*

    G4G3G2G1F4F3F2F1GF16x2 configurationAddress

    G4G3G2G1F4F3F2F1GF16x1 dual-ported configurationAddress 1Address 2Two reads at the same time**CS 303

    CS 303

  • LUT as Memory*32x1 configuration

    G4G3G2G1F4F3F2F1GF01A4A3A2A1A0Address*CS 303

    CS 303

  • Distributed RAM**CS 303

    CS 303

  • Spartan Dual-Port RAM**CS 303

    CS 303

  • A Simple Memory on FPGAmodule simple_memory(clk, reset, dat_in, wr_adr, wr_en, dat_out, rd_adr);input clk, reset;input [15:0] dat_in;input [7:0] wr_adr;input wr_en;output [15:0] dat_out;input [7:0] rd_adr;

    // synthesis attribute ram_style of my_memory is blockreg [15:0] my_memory[0:255];reg [15:0] dat_out;

    always @(posedge clk)begin if(wr_en) my_memory[wr_adr]

  • *Switch MatrixDifferent levels of interconnections Direct interconnects between CLBs. Short wires to connect adjacent CLBs in both directions.Switch matrixCLBCLBCLBCLBCLBCLBCLBCLBCLBPSMPSMPSMPSMPSM: Programmable Switch MatrixPSM can be configured to connect a horizontal wire to a vertical one.One wire can be connected to multiple wiresThis way output of a CLB can be routed through multiple PSMs to the input of another CLB. *CS 303

    CS 303

  • *PSMCLB1244221264848433QuadSingleDoublelongDirect connectlonglonglongDoubleQuadDirect connectCarrychainSingleGlobal clockGlobal clock*CS 303

    CS 303

  • *Types of Lines Single-length: connects adjacent PSMsDouble-length: connects every other PSMQuad-length: traverse four CLBs before passing through a PSM.Long: runs entire chip. Using tri-state buffers within the CLBs, long lines can be configured as buses.Local connections use direct interconnect or single length lines in order to avoid to many switching pointsGlobal Nets: Low skew signal paths used to distribute high fan-out signals such as clock and reset signals. *CS 303

    CS 303

  • *Double Length LinesCLBCLBCLBCLBCLBCLBCLBCLBCLBPSMPSMPSMPSMPSMPSM*CS 303

    CS 303

  • *Example InterconnectsDirect interconnects between adjacent CLBsCLBCLBCLBCLBCLBCLBCLBCLBCLBPSMPSMPSMPSMPSMPSMCLBCLBCLBCLBGeneral-purpose interconnects*CS 303

    CS 303

  • * 4-bit Adder with CLBs 1/2*CS 303

    CS 303

  • *4-bit Adder with CLBs 2/2CLBZ0ABCDxyC1C2Z1Z2*CS 303

    CS 303

  • Spartan II Architecture**CS 303

    CS 303

  • Xilinx Spartan II CharacteristicsDensity: up to 200,000 system gatesUp to 5292 logic cellsEach cell contains a LUTOperating voltage: 2.5 VOperating frequency: 200 MHzOn-chip block memorynot made up of look-up tablesDoes not reduce the amount of logicImprove performance by reducing the need to access off-chip storage.0.22/0.18-m CMOS technologySix layers of metal for interconnect**CS 303

    CS 303

  • Xilinx Spartan II CharacteristicsReliable clock distributionClock synchronization through delay-locked loops (DLLs)DLLs eliminate clock distribution delayDLLs provide frequency multipliers, frequency dividersDifferent architectureFour quadrantsEach quadrant is associated with 4096-bit block RAMThere are FPGAs with up to 14 blocks of block RAM (56K bits total block memory)

    **CS 303

    CS 303

  • Slice* A logic cell contains a four-input lookup table, logic for carry and control and a D type flip-flop Each slice contains contains two cells Each CLB contains two slices.*CS 303

    CS 303

  • Carry LogicLookup tables can be used to generate the sum bits.Each slice can be programmed to implement both carry and sum for two bits.The carry lines between cells are hardwired (not programmed) to provide for fast propagation of carriesThe carry logic can be programmed to implement subtracter, incrementer/decrementers, 2s complementers, and counters **CS 303

    CS 303

  • A Slice as Two-Bit Full Adder*G2: Ai+1G1: Bi+1CiG FunctionF FunctionCarry LogicSi+1SiCi+2Ci+1F2: AiF1: Bihardwired*CS 303

    CS 303

  • Connections for 4-bit Adder**SliceSliceA1B1A0B0C0C2C3A3B3A2B2Hard-wiredIf we want to detect a possible overflow, we add the 4th Slice.The 3rd Slice outputs C3 instead of C4 (How?)In the 4th Slice, C4 can be re-computed using the carry logic.Overflow is computed using the G function generator in the 4th Slice Overflow: V = C3 C44-bit adders can easily be expanded 8 or 16-bit addersAdder modules are available in Xilinx library*CS 303

    CS 303

  • Core GeneratorsMany vendors typically supply implementation of common building blocks that are optimized to the structure of their hardware components.Xilinx, in fact, has a core generator utility that can create common building blocks from parameterized descriptions provided by the userAdders, subtractors, multipliers, memories, etc. are such building blocksFPGA as a sea of LUTs and flip-flopsA gate-level design can be placed on the array by mapping combinational components to LUTs, sequential to flip-flops **CS 303

    CS 303

  • Adder/Subtracter with Core Generator*CS 303*

    CS 303

  • Divider with Core Generator*CS 303*

    CS 303

  • Multiplier with Core Generator*CS 303*

    CS 303

  • Xilinx Virtex FPGAsLeading edge of Xilinx technology65 nm technology1 V operating voltageUp to 330,000 logic cellsOver 200,000 internal flip-flops10 Mb of block RAMHardwired units: multipliers, DSP units, microprocessors (powerPC)550-MHz clock technology**CS 303

    CS 303

  • Xilinx Virtex FPGAs**CS 303

    CS 303

  • *Design with Programmable DevicesRequires CAD toolsEntry tools: entering a design schematic entry packageFSM (finite state machine)Hardware description languages (HDL)VHDL, Verilog, ABEL, Synthesis toolsallocatemap configureconnect logic blocks*CS 303

    CS 303

  • *FPGA Design Flow 1/4HDL editorState machine editorSchematic captureSynthesisPlace and RouteProgrammingXilinx ToolsModel Development*CS 303

    CS 303

  • *FPGA Design Flow 2/4Model development: VHDL codeState-machines may be described in a graphical manner and translated into VHDL code.Traditional schematic capture can be translated into VHDL source.Behavioral SimulationBefore synthesis; for testing functional correctnessSynthesisThe design is synthesized to a library of primitive components such as gates, flip-flops, and latchesFunctional SimulationTo find out preliminary performance estimatesFor example, timing information can be obtained from known properties of FPGA componentsStill not too accurate*CS 303

    CS 303

  • *FPGA Design Flow 3/4Place and Route:The design is mapped to the primitives in the target chipIn FPGA, there are function generators (LUTs), flip-flops, and latchesEach primitive must be assigned to a specific CLB (Placement)Connections between CLBs that implement the primitives must be established (routing)Accurate timing can be obtained in Verification step (Post-placement and routing simulation) The configuration bits are generated.*CS 303

    CS 303

  • *FPGA Design Flow 4/4Programming:The configuration data (bit stream) is finally loaded into the target FPGA chip.

    These steps are fairly generic although the terminology used here is adopted from Xilinx.*CS 303

    CS 303

  • *Xilinx Tools: Design Flow*CS 303

    CS 303