FYP Final Report

HARDWARE IMPLEMENTATION OF OFDM TRANSMITTER AND RECEIVER USING FPGA

BY
SHAHBAZ ABBASI s051.04
SHAZER BAIG s303.04

INTERNAL ADVISOR: DR. IMRAN TASADDUQ
EXTERNAL ADVISOR: ENGR. MUSTAFA IMRAN

Report submitted in partial fulfilment of the requirements for the degree of Bachelor of Science in Telecommunication / Computer Engineering

DEPARTMENT OF TELECOM AND COMPUTER ENGINEERING
NATIONAL UNIVERSITY OF COMPUTER AND EMERGING SCIENCES - FAST
JUNE 2008

ACKNOWLEDGEMENT

First of all we would like to thank Almighty Allah. It is only because of the blessings of Allah that we have been able to complete our project successfully.

We take this special occasion to thank our parents. We dedicate this work to them.

We express our collective gratitude to our internal advisor, Dr. Imran Tasadduq, for all his help, invaluable guidance, critique and generous support throughout our final year project. We really appreciate the way he mentored us throughout our brief encounters with the world of Digital Communications.

We would also like to thank our external advisor, Mr. Mustafa Imran, for his enlightening suggestions and advice. His professionalism, guidance, thoroughness, dedication and inspiration will always serve as an example to us in our professional life.

Special acknowledgements go to Ms. Samreen Amir and Mr. Wasif Shams. Their interest in this project was very beneficial and helped in the design of many vital parts of the project.

Finally, we would like to thank DIGITEK Engineering for providing us with ModelSim 6.1e and the IP cores of the Viterbi decoder and Reed Solomon decoder, which made the difficult task of implementing the OFDM receiver much easier.
Shahbaz Abbasi s051.04
Shazer Baig s303.04
June 2008

TABLE OF CONTENTS

ACKNOWLEDGEMENT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER 1: INTRODUCTION
1.1 Introduction
1.2 Digital communication system architecture
1.3 Orthogonal Frequency Division Multiplexing
1.4 A Typical OFDM system
  1.4.1 Scrambler / Descrambler
  1.4.2 Reed Solomon Encoder / Decoder
  1.4.3 Convolutional Encoder / Decoder
  1.4.4 Interleaver / De-interleaver
  1.4.5 Constellation Mapper / De-mapper
  1.4.6 FFT / IFFT
  1.4.7 Cyclic Prefix Adder / Remover
1.5 Field Programmable Gate Array
1.6 Project Objective
1.7 Project Specifications
  1.7.1 Transmitter Specifications
  1.7.2 Receiver Specifications
1.8 Project design flow
1.9 Project scope

CHAPTER 2: LITERATURE SURVEY
2.1 Evolution of OFDM
  2.1.1 History of OFDM
2.2 The OFDM system
2.3 Advantages and disadvantages of OFDM
2.4 Applications of OFDM
2.5 Verilog Hardware Description Language
2.6 Synthesis process in Verilog HDL

CHAPTER 3: TRANSMITTER DESIGN AND IMPLEMENTATION
3.1 Introduction
3.2 OFDM system hardware architecture
3.3 The Transmitter
3.4 FIFO
3.5 Scrambler
  3.5.1 Design of Scrambler
3.6 Reed Solomon Encoder
  3.6.1 Description of the Reed Solomon code
  3.6.2 Galois field arithmetic
  3.6.3 Encoder design
3.7 Convolutional Encoder
  3.7.1 Encoder design
3.8 Interleaver
  3.8.1 Interleaver design
3.9 Constellation mapper
  3.9.1 Design of Constellation mapper
3.10 Inverse Fast Fourier Transform
  3.10.1 Radix-2^2 algorithm
  3.10.2 IFFT design
3.11 Cyclic Prefix Adder
  3.11.1 Design of Cyclic Prefix Adder

CHAPTER 4: RECEIVER DESIGN AND IMPLEMENTATION
4.1 Introduction
4.2 The Receiver
4.3 Cyclic Prefix Remover
4.4 Fast Fourier Transform
4.5 Constellation De-mapper
  4.5.1 Design of Constellation De-mapper
4.6 De-interleaver
4.7 Viterbi Decoder
4.8 Reed Solomon Decoder
4.9 De-scrambler
  4.9.1 De-scrambler design

CHAPTER 5: SIMULATION, SYNTHESIS AND RESULTS
5.1 Introduction
5.2 Simulation of OFDM Transmitter
  5.2.1 Scrambler
  5.2.2 Reed Solomon Encoder
  5.2.3 Convolutional Encoder
  5.2.4 Interleaver
  5.2.5 Constellation mapper
  5.2.6 IFFT
  5.2.7 Cyclic Prefix Adder
5.3 Synthesis of OFDM Transmitter
5.4 Simulation of OFDM Receiver
  5.4.1 Constellation De-Mapper
  5.4.2 De-Interleaver
  5.4.3 De-Scrambler
5.5 Synthesis of OFDM Receiver

REFERENCES
APPENDIX A: RTL CODE IN VERILOG FOR OFDM TRANSMITTER
APPENDIX B: RTL CODE IN VERILOG FOR OFDM RECEIVER

LIST OF TABLES

2.1 A Brief History of OFDM
3.1 OFDM system signal descriptions
3.2 Transmitter signal descriptions
3.3 Scrambler signal descriptions
3.4 Elements of GF(2^4) and their binary equivalents
3.5 Signal descriptions for Reed Solomon Encoder
3.6 Signal descriptions for Convolutional Encoder
3.7 Signal descriptions for Interleaver
3.8 Contents of Address ROM (in Interleaver)
3.9 Mapping of bits to constellation points
3.10 Contents of the ROM (in Constellation Mapper)
3.11 Signal descriptions for Constellation Mapper
3.12 Signal descriptions for IFFT
3.13 Signal descriptions for Constellation Mapper
4.1 OFDM Receiver signal descriptions
4.2 Data points mapped to constellation points
4.3 Signal descriptions for Constellation De-mapper
4.4 Contents of Address ROM (in De-Interleaver)
4.5 De-scrambler signal descriptions
5.1 Important Synthesis results for OFDM Transmitter
5.2 Important Synthesis results for OFDM Receiver

LIST OF FIGURES

1.1 A typical digital communication system
1.2 Spectrum overlap in OFDM
1.3 Complete OFDM system
1.4 FPGA design flow
1.5 Top level architecture of the proposed OFDM system
1.6 OFDM transmitter's top-level architecture
1.7 OFDM receiver's top-level architecture
1.8 Project design flow
2.1 Synthesis Process in Verilog Environment
3.1 Serial communication format (8 bit data + start bit + stop bit)
3.2 Complete Architecture of the proposed OFDM system (transmitter highlighted)
3.3 I/O view of the OFDM system
3.4 I/O diagram of the transmitter
3.5 Scrambler I/O diagram
3.6 Scrambler logic diagram
3.7 Circuit diagram of Scrambler
3.8 RS (n, k) code
3.9 Top-level structure of the Reed Solomon Encoder
3.10 Detailed architecture of Reed Solomon Encoder
3.11 Galois Field multiplier and adder
3.12 Convolutional Encoder I/O Diagram
3.13 Convolutional Encoder: Circuit Diagram
3.14 Interleaving concept
3.15 Interleaver I/O diagram (A top-level architecture)
3.16 Circuit diagram of Interleaver
3.17 QPSK constellation diagram
3.18 Constellation Mapper
3.19 Radix-4 FFT butterfly
3.20 Radix-2 FFT Butterfly
3.21 IFFT I/O diagram
3.22 Architecture of 64-point radix-2^2 FFT
3.23 bf2i and bf2ii radix-2 butterflies
3.24 Top level architecture of cyclic prefix adder
4.1 I/O diagram of the OFDM receiver
4.2 Complete Architecture of the proposed OFDM system (receiver highlighted)
4.3 FFT
4.4 QPSK constellation diagram
4.5 I/O diagram of constellation demapper
4.6 Verilog code showing the logic behind implementation of constellation demapper
4.7 De-scrambler I/O diagram
4.8 De-scrambler logic diagram
5.1 Scrambler simulation results
5.2 Reed Solomon Encoder simulation results
5.3 Simulation Waveform of the Convolutional Encoder
5.4 Constellation Mapper simulation results
5.5 IFFT simulation results
5.6 Cyclic Prefix Adder simulation result

ABSTRACT

Orthogonal Frequency Division Multiplexing (OFDM) is a multi-carrier modulation technique. It provides high bandwidth efficiency because the carriers are orthogonal to each other and multiple carriers share the data among themselves. The main advantage of this transmission technique is its robustness to channel fading in wireless communication environments. The main objective of this project is to design and implement a baseband OFDM transmitter and receiver.
The implementation has been carried out in hardware using a Field Programmable Gate Array (FPGA). Both the transmitter and the receiver are implemented on a single FPGA board, with a wired channel between them. The FPGA board used is Altera's Cyclone III starter board, which contains 24,600 logic elements. The design has been written in Verilog HDL and simulated using ModelSim 6.1e. Input to the system is given through a computer's serial port, and NI LabVIEW has been used for the serial port interfacing. The output of the transmitter has been compared with the output of a MATLAB model of the same OFDM system. The data obtained at the output of the transmitter is fed to the PC over the serial port and converted to complex numbers, because MATLAB produces its output as complex numbers. Although error correction schemes have been employed in the transmitter and the receiver, the channel is a wired one, so there is no ISI or other channel impairment and errors do not occur. The aim has therefore been only to verify correct operation of the OFDM system.

CHAPTER 1
INTRODUCTION

1.1 INTRODUCTION

Demand for broadband access is growing rapidly and is not limited to areas that already have high-quality infrastructure. For instance, developing countries and rural areas may not have the telecom infrastructure or the existing connections, typically over copper, needed to meet the requirements of Digital Subscriber Line (DSL) technology. Furthermore, users are expected to require more bandwidth on the move. While current technologies can meet this bandwidth demand, their useful range is limited. This limitation opens up opportunities for technologies such as Orthogonal Frequency Division Multiplexing.

1.2 DIGITAL COMMUNICATION SYSTEM ARCHITECTURE

OFDM is a digital modulation technique, so a short introduction to digital communication systems is given first. A digital communication system transmits information in digital form from one point to another, as shown in Figure 1.1.
Figure 1.1 A typical digital communication system (blocks, in order: Source of Information, Transmitter, Channel, Receiver, Received Information)

The three basic elements of a communication system are the transmitter, the channel and the receiver. The source of information produces the messages that are to be transmitted to the receiver at the other end. A transmitter can consist of a source encoder, a channel coder and a modulator. The source encoder provides an efficient representation of the information, which conserves resources. The channel coder may include error detection and correction codes. The modulation process then converts the baseband signal into a bandpass signal before transmission.

During transmission, the signal experiences impairments that attenuate its amplitude and distort its phase. The signal is also corrupted by noise, which is assumed to be Gaussian distributed.

At the receiving end, the steps taken in the transmitter are performed in reverse order. Ideally, the same information is recovered at the receiving end.

1.3 ORTHOGONAL FREQUENCY DIVISION MULTIPLEXING

Orthogonal frequency division multiplexing (OFDM) is a multi-carrier digital modulation technique that has been recognized as an excellent method for high-speed bi-directional wireless data communication. OFDM effectively squeezes multiple modulated carriers tightly together, reducing the required bandwidth while keeping the modulated signals orthogonal so that they do not interfere with each other.

OFDM is similar to FDM but much more spectrally efficient, because the sub-channels are spaced much closer together (to the point where they actually overlap) [1]. This is done by choosing frequencies that are orthogonal, which means that they are perpendicular in a mathematical sense, allowing the spectrum of each sub-channel to overlap another without interfering with it. Figure 1.2 shows the effect: the required bandwidth is greatly reduced by removing the guard bands (which are present in FDM) and allowing the signals to overlap.
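The orthogonality described above can be checked numerically. The following is a minimal sketch, not part of the original design: it assumes the sub-carriers are sampled complex exponentials of the form exp(j*2*pi*k*n/N) with N = 64, and the helper names `subcarrier` and `inner_product` are hypothetical. The inner product of a sub-carrier with itself is close to N, while the inner product of two different sub-carriers is close to zero.

```python
import cmath

def subcarrier(k, n_points):
    """Return n_points samples of the k-th complex sub-carrier exp(j*2*pi*k*n/N)."""
    return [cmath.exp(2j * cmath.pi * k * n / n_points) for n in range(n_points)]

def inner_product(a, b):
    """Inner product of two equal-length complex sequences."""
    return sum(x * y.conjugate() for x, y in zip(a, b))

N = 64  # number of sub-carriers, as in the project specification
same = inner_product(subcarrier(3, N), subcarrier(3, N))
different = inner_product(subcarrier(3, N), subcarrier(4, N))
print(abs(same))       # close to N
print(abs(different))  # close to 0
```

Because the cross term vanishes, a receiver can recover each sub-carrier's data independently even though the spectra overlap.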
Figure 1.2 Spectrum overlap in OFDM [6]

1.4 A TYPICAL OFDM SYSTEM

Figure 1.3 shows a detailed OFDM communication system. Each block is briefly described below.

Figure 1.3 Complete OFDM system (Transmitter: Scrambler, Reed Solomon Encoder, Convolutional Encoder, Interleaver, Constellation Mapper, Inverse Fast Fourier Transform, Addition of Cyclic Prefix; Channel; Receiver: Removal of Cyclic Prefix, Fast Fourier Transform, Constellation De-Mapper, De-Interleaver, Viterbi Decoder, Reed Solomon Decoder, Descrambler)

1.4.1 SCRAMBLER / DESCRAMBLER

Data bits are given to the transmitter as inputs. These bits pass through a scrambler that randomizes the bit sequence. This makes the input sequence more dispersed, so that the dependence of the input signal's power spectrum on the actual transmitted data is eliminated [2].

At the receiver end, descrambling is the last step. The de-scrambler simply recovers the original data bits from the scrambled bits.
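As an illustration of how scrambling and descrambling can mirror each other, here is a minimal sketch of an additive scrambler. It is an assumption, not the design used in this project: the generator polynomial x^7 + x^4 + 1 (the one used in IEEE 802.11a) and the seed value are chosen only for illustration, and the function names are hypothetical. The data is XORed with a pseudo-random sequence, so applying the same operation a second time restores the original bits.

```python
def lfsr_sequence(seed, length):
    """Generate `length` bits from a 7-bit LFSR with feedback x^7 + x^4 + 1 (assumed)."""
    state = list(seed)  # state[0] is the newest bit, state[6] the oldest
    out = []
    for _ in range(length):
        feedback = state[3] ^ state[6]  # taps at x^4 and x^7
        out.append(feedback)
        state = [feedback] + state[:-1]
    return out

def scramble(bits, seed=(1, 0, 1, 1, 1, 0, 1)):
    """XOR the data with the LFSR sequence; applying it twice restores the data."""
    prbs = lfsr_sequence(seed, len(bits))
    return [b ^ p for b, p in zip(bits, prbs)]

data = [0] * 16                 # a long run of zeros, the kind of input scrambling breaks up
scrambled = scramble(data)
recovered = scramble(scrambled)  # descrambling is the same operation with the same seed
```

Both ends must start from the same seed for this scheme to work, which is why a scrambler and its de-scrambler are normally reset together.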

1.4.2 REED-SOLOMON ENCODER / DECODER

The scrambled bits are then fed to the Reed Solomon Encoder, which is part of the Forward Error Correction (FEC) stage. Reed Solomon coding is an error-correction coding technique. Input data is over-sampled and parity symbols are calculated, which are then appended to the original data [3]. In this way redundant bits are added to the actual message, which provides immunity against severe channel conditions. A Reed Solomon code is represented in the form RS (n, k), where

$n = 2^m - 1$    (1.1)

$k = 2^m - 1 - 2t$    (1.2)

Here m is the number of bits per symbol, k is the number of input data symbols (to be encoded), n is the total number of symbols (data + parity) in the RS codeword and t is the maximum number of symbol errors that can be corrected. At the receiver, the Reed Solomon coded symbols are decoded and the parity symbols are removed.

1.4.3 CONVOLUTIONAL ENCODER / DECODER

The Reed Solomon coded bits are further coded by the Convolutional encoder, which also adds redundant bits. In this type of coding, each m-bit symbol is transformed into an n-bit symbol; m/n is known as the code rate. The transformation of an m-bit symbol into an n-bit symbol depends on the last k data symbols, so k is known as the constraint length of the Convolutional code [4].

The Viterbi algorithm is used to decode the convolutionally encoded bits at the receiver side. Viterbi decoding is most suitable for Convolutional codes with k ≤ 10.

1.4.4 INTERLEAVER / DE-INTERLEAVER

Interleaving is done to protect the data from burst errors during transmission. Conceptually, the incoming bit stream is re-arranged so that adjacent bits are no longer adjacent to each other. The data is broken into blocks and the bits within a block are re-arranged [5]. In terms of OFDM, the bits within an OFDM symbol are re-arranged so that adjacent bits are placed on non-adjacent sub-carriers.

De-interleaving rearranges the bits back into their original order during reception.
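The re-arrangement described above can be sketched with a simple row/column block interleaver. This is an illustrative assumption rather than the interleaver implemented in this project: the 3 x 4 block size and the function names are hypothetical. Bits are written into the block row by row and read out column by column, so bits that were adjacent at the input end up separated at the output.

```python
def interleave(bits, rows, cols):
    """Write row by row into a rows x cols block, read column by column."""
    assert len(bits) == rows * cols
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]

def deinterleave(bits, rows, cols):
    """Inverse of interleave: write column by column, read row by row."""
    assert len(bits) == rows * cols
    return [bits[c * rows + r] for r in range(rows) for c in range(cols)]

block = list(range(12))          # indices stand in for bits so the movement is visible
mixed = interleave(block, 3, 4)
restored = deinterleave(mixed, 3, 4)
print(mixed)     # [0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11]
print(restored)  # same order as block
```

In the output, inputs 0 and 1 are three positions apart, so a short burst of channel errors hits bits that were far apart in the original stream.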
1.4.5 CONSTELLATION MAPPER / DE-MAPPER

The Constellation Mapper maps the incoming (interleaved) bits onto the different sub-carriers. Different modulation techniques (such as QPSK, BPSK, QAM etc.) can be employed for different sub-carriers. The De-Mapper simply extracts the bits from the modulated symbols at the receiver.

1.4.6 INVERSE FAST FOURIER TRANSFORM / FAST FOURIER TRANSFORM

This is the most important block in the OFDM communication system. It is the IFFT that gives OFDM its orthogonality [1]. The IFFT transforms a spectrum (the amplitude and phase of each component) into a time domain signal. It converts a number of complex data points into the same number of points in the time domain. Similarly, the FFT at the receiver side performs the reverse task, i.e. conversion from the time domain back to the frequency domain.

1.4.7 ADDITION / REMOVAL OF CYCLIC PREFIX

In order to preserve the sub-carrier orthogonality and the independence of subsequent OFDM symbols, a cyclic guard interval is introduced. The guard period is specified as a fraction of the number of samples that make up an OFDM symbol. The cyclic prefix contains a copy of the end of the forthcoming symbol. Adding the cyclic prefix results in circular convolution between the transmitted signal and the channel impulse response. The frequency domain equivalent of circular convolution is simply the multiplication of the transmitted signal's frequency response by the channel frequency response. The received signal is therefore only a scaled version of the transmitted signal (in the frequency domain), so distortions due to severe channel conditions are eliminated [6]. The cyclic prefix is removed at the receiver end, and the prefix-free signal is passed through the remaining blocks of the receiver.

1.5 FIELD PROGRAMMABLE GATE ARRAY

By modern standards, a logic circuit with 20,000 gates is common. In order to implement large circuits, it is convenient to use a type of chip that has a large logic capacity.
A field-programmable gate array (FPGA) is a programmable logic device that supports the implementation of relatively large logic circuits [6]. An FPGA differs from other programmable logic technologies such as CPLDs and SPLDs in that it does not contain AND or OR planes. Instead, an FPGA consists of logic blocks for implementing the required functions.

An FPGA contains three main types of resources: logic blocks, I/O blocks for connecting to the pins of the package, and interconnection wires and switches. The logic blocks are arranged in a two-dimensional array, and the interconnection wires are organized as horizontal and vertical routing channels between the rows and columns of logic blocks [7]. The routing channels contain wires and programmable switches that allow the logic blocks to be interconnected in many ways. FPGAs can be used to implement logic circuits of more than a few hundred thousand equivalent gates in size [7]. Equivalent gates are a way of quantifying a circuit's size: the circuit is assumed to be built using only simple logic gates, and the number of gates needed is then estimated. Figure 1.4 gives a clear picture of the FPGA design flow.

Figure 1.4 FPGA design flow [7]

1.6 PROJECT OBJECTIVE

The objective of this project is to carry out an efficient implementation of the OFDM system (i.e. transmitter and receiver) using a Field Programmable Gate Array (FPGA). The FPGA has been chosen as the target platform because OFDM has large arithmetic processing requirements, which can become prohibitive if implemented in software on a Digital Signal Processor (DSP) [7]. However, the highly pipelined nature of much of the processing lends itself well to a hardware implementation. In addition, an FPGA implementation has the added advantage of allowing late modifications in response to real-world performance evaluation.

1.7 PROJECT SPECIFICATIONS

The complete OFDM system, comprising the transmitter and the receiver, has been implemented on a single FPGA board.
The overall specifications are as follows:

- FPGA board: Altera Cyclone III starter board (24,600 logic elements)
- HSMC to Santa Cruz daughter card (from TERASIC) for serial port communication
- Data input and output: PC's serial port
- Software used in the host PC: NI LabVIEW 7.1
- Software model of the OFDM system created in MATLAB
- Verilog used as the hardware description language
- ModelSim 6.1e used for simulation of the design
- Quartus II used to map the design to the targeted device (Altera Cyclone III)

The top level architecture of the proposed OFDM system is shown in Figure 1.5.

Mapping a software algorithm to hardware logic is challenging. A variable may correspond to a wire or a register depending on how it is used, and an operator may map to hardware such as adders, latches or multiplexers.

Figure 1.5 Top level architecture of the proposed OFDM system (blocks: PC, RS 232 interface, HSMC to Santa Cruz daughter card, Cyclone III FPGA board containing the RS-232 Receiver, OFDM Transmitter, OFDM Receiver and RS-232 Transmitter)

1.7.1 TRANSMITTER SPECIFICATIONS

Figure 1.6 shows a top-level block diagram of the OFDM transmitter. The single clock input reflects the synchronous operation of the system. The Reset input must be asserted for at least one clock cycle for the system to reset. The output of the transmitter is fed to the host PC via the serial port and also to the OFDM receiver. The specifications are listed below:

- OFDM with 64 sub-carriers (all data sub-carriers)
- All the sub-carriers are modulated using QPSK
- IFFT: 64-point, implemented using the radix-2^2 FFT algorithm
- Channel coding: Reed Solomon code + Convolutional code
- Reed Solomon Encoder: RS (15, 9)
- Convolutional Encoder: m = 1, n = 2, k = 7
- Code rate = 1/2
- Block Interleaver and 1/8 Cyclic Prefix

Figure 1.6 OFDM transmitter's top-level architecture (signals: Input serial data and Interface to RS232 Receiver on the input side; Clock, Reset and Control Block; Interface to RS232 Transmitter on the output side)

1.7.2 RECEIVER SPECIFICATIONS

Figure 1.7 shows a top-level block diagram of the receiver. Its specifications are the same as those of the transmitter. Here the recovered (demodulated) data is fed to the serial port.

Figure 1.7 OFDM receiver's top-level architecture (signals: OFDM Modulated Data in; Clock, Reset and Control Block; Recovered Data out through the Interface to RS232 Transmitter)

1.8 PROJECT DESIGN FLOW

The design procedure consists of the following steps:

- Creating a top-level design of the complete system
- Determining the basic operation of each block and creating the appropriate logic
- I/O integration of the various logic blocks
- Describing the design functionality in the Verilog hardware description language
- Simulating the design in ModelSim to check its functionality and to report deviations from the desired behavior
- Synthesizing the defined hardware, including slack and power optimization, followed by placement and routing
- Loading the FPGA bitstream file into the hardware
- Giving input to the system through the PC's RS232 port and testing the hardware

Figure 1.8 Project design flow

1.9 PROJECT SCOPE

Factors such as the data rate, the allowable bit rate of the input, the code rate of the Forward Error Correction stage and the noise immunity define the scope of this project. These factors are discussed in detail in the subsequent chapters.

[Figure 1.8 stages, in order: Top level design; Creating logic for each block; I/O integration of the blocks; RTL description of design functionality in Verilog; Simulation; Synthesis; Bit stream file fed to FPGA; Hardware testing]

CHAPTER 2
LITERATURE SURVEY

2.1 EVOLUTION OF OFDM

OFDM can be viewed as a collection of transmission techniques. When the technique is applied in a wireless environment, it is referred to as OFDM. In a wired environment, such as asymmetric digital subscriber lines (ADSL), it is referred to as discrete multitone (DMT). In OFDM, each carrier is orthogonal to all other carriers. However, this condition is not always maintained in DMT [8]. OFDM is an optimal version of multicarrier transmission schemes.

2.1.1 HISTORY OF OFDM

Although OFDM has become widely used only recently, the concept dates back some 40 years. The following table cites some landmark dates in the history of OFDM.

Table 2.1 A Brief History of OFDM

Year | Event
1966 | Chang shows that multi-carrier modulation can solve the multipath problem without reducing data rate [10]. This is generally considered the first official publication on multi-carrier modulation. Some earlier work was Holsinger's 1964 MIT dissertation [9] and some of Gallager's early work on waterfilling [11].
1971 | Weinstein and Ebert show that multi-carrier modulation can be accomplished using a DFT [12].
1985 | Cimini at Bell Labs identifies many of the key issues in OFDM transmission and does a proof-of-concept design [13].
1993 | DSL adopts OFDM, also called discrete multi-tone, following successful field trials / competitions at Bellcore versus equalizer-based systems.
1999 | The IEEE 802.11 committee on wireless LANs releases the 802.11a standard for OFDM operation in the 5 GHz U-NII band.
2002 | The IEEE 802.16 committee releases an OFDM-based standard for wireless broadband access for metropolitan area networks under revision 802.16a.
2003 | The IEEE 802.11 committee releases the 802.11g standard for operation in the 2.4 GHz band.
2003 | The multi-band OFDM standard for ultra wideband is developed, showing OFDM's usefulness in low-SNR systems.

Frequency Division Multiplexing (FDM) is also a form of multi-channel transmission. The use of FDM goes back a long way: more than one low-rate signal, such as telegraph, was carried over a relatively wide bandwidth channel using a separate carrier frequency for each signal [1]. To facilitate separation of the signals at the receiver, the carrier frequencies were spaced sufficiently far apart that the signal spectra did not overlap. Empty spectral regions between the signals ensured that they could be separated with readily realizable filters. The resulting spectral efficiency was therefore quite low.

2.2 THE OFDM SYSTEM

A detailed explanation of the OFDM system was given in the previous chapter, in which the different building blocks of an OFDM communication system were discussed. The following is a brief review of those concepts.

In 1971 the Discrete Fourier Transform (DFT) was used in baseband modulation/demodulation in order to achieve orthogonality [7]. Since the DFT has heavy computational requirements, the Fast Fourier Transform (FFT) was utilized. For an N-point DFT the required number of computations is $N^2$, whereas for the FFT it is $N \log N$, which is far fewer. In this way the problem of bandwidth inefficiency caused by placing guard bands between sub-channels was solved, and a new technique, Orthogonal Frequency Division Multiplexing, came into being.
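To make the saving concrete, the two operation counts can be evaluated for the 64-point transform used in this project. The sketch below simply applies the N^2 and N log N estimates quoted above; taking the logarithm to base 2 is an assumption appropriate for a radix-2 FFT, and the function names are hypothetical. For N = 64 it gives 4096 operations for the direct DFT against 384 for the FFT, roughly a tenfold reduction.

```python
import math

def dft_operations(n):
    """Operation count for a direct N-point DFT, using the N^2 estimate."""
    return n * n

def fft_operations(n):
    """Operation count for a radix-2 FFT, using the N*log2(N) estimate (base 2 assumed)."""
    return n * int(math.log2(n))

N = 64  # transform size used in this project
print(dft_operations(N), fft_operations(N))  # 4096 384
```

These are order-of-magnitude estimates only; the exact multiplier and adder counts depend on the FFT architecture chosen.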
As OFDM is a multi-carrier modulation technique, the input data is split and mapped onto different sub-carriers. Each carrier is modulated using a single-carrier modulation technique such as BPSK, QPSK or QAM.

The OFDM system successfully avoids inter-channel interference (ICI) because the carriers are kept orthogonal. In addition, a cyclic prefix (CP) is added before the start of each transmitted symbol to act as a guard period that prevents inter-symbol interference (ISI), provided that the delay spread in the channel is less than the guard period [17]. This guard period is specified as a fraction of the number of samples that make up a symbol.

2.3 ADVANTAGES AND DISADVANTAGES OF OFDM

A major advantage of OFDM is its resilience to multipath, which is the effect of multiple reflected signals arriving at the receiver. Multipath results in interference and frequency-selective fading, which OFDM is able to overcome by transmitting on parallel, lower-rate sub-channels. This makes OFDM ideal for handling the harsh conditions of the mobile wireless environment.

The introduction of the cyclic prefix made OFDM systems resistant to time dispersion [18]. The OFDM symbol rate is low because a data stream is divided into several parallel streams before transmission. This makes the fading slow enough for the channel to be considered constant during one OFDM symbol interval.

The cyclic prefix is a crucial feature of OFDM, used to combat the inter-symbol interference (ISI) and inter-channel interference (ICI) introduced by the multipath channel through which the signal propagates [1]. The basic idea is to replicate part of the OFDM time-domain waveform from the back to the front to create a guard period. The duration of the guard period should be longer than the worst-case delay spread of the target multipath environment. The use of a cyclic prefix instead of a plain guard interval simplifies channel equalization in the demodulator.

In wired systems, OFDM can offer an efficient bit-loading technique [1].
It enables a system to allocate different numbers of bits to different sub-channels based on their individual SNR. Hence, an efficient transmission can be achieved.

One of the major disadvantages of OFDM is its high peak-to-average power ratio (PAPR) [6]. This places a high demand on the linearity of amplifiers.

Second, synchronization errors can destroy the orthogonality and cause interference. Phase noise and Doppler shift can degrade the performance of an OFDM system [1]. A lot of effort is required to design accurate frequency synchronizers for OFDM.

OFDM's high spectral efficiency and resistance to multipath make it an extremely suitable technology for meeting the demands of wireless data traffic. This has made it not only ideal for new technologies such as WiMAX and Wi-Fi but also one of the prime technologies being considered for use in future fourth generation (4G) networks.

2.4 APPLICATIONS OF OFDM

Initially, OFDM applications were scarce because of their implementation complexity. OFDM has since been adopted as the new European digital audio broadcasting (DAB) standard and for terrestrial digital video broadcasting (DVB) [19].

In fixed-wire applications, OFDM is employed in asymmetric digital subscriber line (ADSL) and high bit-rate digital subscriber line (HDSL) systems. It has also been proposed for power line communication systems because of its resilience to dispersive channels and narrowband interference. It has been employed in WiMAX as well.

2.5 VERILOG HARDWARE DESCRIPTION LANGUAGE

Verilog HDL is one of the two most common Hardware Description Languages (HDL) used by integrated circuit (IC) designers. The other one is VHDL.

An HDL allows a design to be simulated early in the design cycle in order to correct errors or experiment with different architectures. Designs described in an HDL are technology-independent, easy to design and debug, and usually more readable than schematics, particularly for large circuits.
Verilog can be used to describe designs at four levels of abstraction [20]:

(i) Algorithmic level (much like C code with if, case and loop statements).
(ii) Register transfer level (RTL uses registers connected by Boolean equations).
(iii) Gate level (interconnected AND, NOR etc.).
(iv) Switch level (the switches are MOS transistors inside gates).

The language also defines constructs that can be used to control the input and output of simulation.

More recently, Verilog has been used as an input to synthesis programs, which generate a gate-level description (a netlist) for the circuit. Some Verilog constructs are not synthesizable. The way the code is written also greatly affects the size and speed of the synthesized circuit.

2.6 SYNTHESIS PROCESS IN VERILOG HDL

Synthesis constructs a gate-level netlist from a model of a circuit described in Verilog. The synthesis process is shown in the diagram below.

Figure 2.1 Synthesis Process in Verilog Environment (elements: Verilog Model, Synthesizer, RTL module builder, Unoptimized gate level netlist, Logic Optimizer, Target Technology, Area and Timing Constraints, Optimized netlist)

A synthesis program may generate an RTL netlist, which consists of register-transfer level blocks such as flip-flops, arithmetic-logic units and multiplexers interconnected by wires. This is performed by the RTL module builder, which builds each of the required RTL blocks in the user-specified target technology, or acquires it from a library of predefined components.

The above synthesis process may produce an unoptimized gate-level netlist. A logic optimizer can use the produced netlist and the specified constraints to produce an optimized gate-level netlist.
This netlist can be programmed directly into an FPGA chip.

CHAPTER 3
TRANSMITTER DESIGN AND IMPLEMENTATION

3.1 INTRODUCTION

The proposed OFDM system consists of an OFDM baseband transmitter and an OFDM baseband receiver. This chapter gives details of the complete architecture of the proposed design and elaborates on the design and implementation of the transmitter portion of the project.

The transmitter gets its input from the serial port of the host PC. The transmitter modulates the incoming stream by splitting it and placing it onto separate sub-carriers (64 in our case). After passing through the various blocks, the modulated data is given as input to the receiver and is also sent back to the host PC (via the serial port) for demonstration purposes.

3.2 OFDM SYSTEM HARDWARE ARCHITECTURE

The proposed system has been implemented on Altera's Cyclone III starter board. This board does not have a serial port, so we used an HSMC to Santa Cruz daughter card (from TERASIC). This daughter card contains an Altera standard HSMC connector and a serial port. The HSMC connector plugs into the HSMC connector present on the Cyclone III board, thereby providing an RS232 physical connection to the FPGA board.

An RS232 receiving module takes the serial stream and extracts the 8-bit payload by removing the start and stop bits. Figure 3.1 shows the format of the data stream in serial communications (RS232 standard).

Figure 3.1 Serial communication format (8 bit data + start bit + stop bit) (fields shown: Start, D7 D6 D5 D4 D3 D2 D1 D0, Stop)

The 1-byte data from the RS232 receiver is stored in a FIFO register. Data from the FIFO is given (bit by bit) to the transmitter module. Figure 3.2 depicts the hardware architecture of the project, highlighting only the transmitter portion.

Figure 3.2 Complete Architecture of the proposed OFDM system (transmitter highlighted) (blocks: HSMC to Santa Cruz connector with High Speed Mezzanine Connector interface and RS232 port; RS232 Receiver; FIFO; OFDM Transmitter comprising Scrambler, RS Encoder, Conv. Encoder, Interleaver, Constellation mapper, IFFT and Cyclic Prefix; FIFO; RS232 Transmitter; OFDM Receiver; Control Unit; on-board 50 MHz clock and PLL; Input and Output)

The modulated output from the transmitter is fed into another FIFO and then taken out (byte by byte) into the RS232 transmitter, which prepares the data for serial transmission over the RS232 interface by adding start and stop bits. The serial port operates at 115.2 kbps.

There is a 50 MHz on-board clock source which, in conjunction with the PLL core (provided with the Quartus II software), can be used to produce the required clock frequencies. The output of the PLL then provides the clock(s) to all the modules.

Figure 3.3 shows an I/O view of the proposed system and Table 3.1 describes the input and output signals of the OFDM system.

Figure 3.3 I/O view of the OFDM system (inputs: in_data, clock, arst_n; output: out_data)

Table 3.1 OFDM system signal descriptions

Signal name | Type | Width | Description
in_data | Input | 1 | Data input to the OFDM system
clock | Input | 1 | Clock signal (via 50 MHz on-board clock)
arst_n | Input | 1 | Asynchronous reset (asserted at negative edge)
out_data | Output | 1 | Demodulated output data

3.3 THE TRANSMITTER

Figure 3.2 shows the various building blocks of the transmitter. The control unit synchronizes the operation of all the blocks in order to avoid any timing mismatches. Each of these blocks is discussed in detail in the subsequent sections.

As mentioned above, the transmitter gets its input from the FIFO register one bit per clock cycle. This implies that the input to the transmitter is 1 bit wide.
Only when the FIFO is full does the transmitter start extracting data from it. Similarly, when the FIFO becomes empty the transmitter stops taking data from it. Therefore, the transmitter makes use of certain control and status signals provided by the FIFO to determine when to ask the FIFO for data and when to stop taking input data.

In a similar fashion, the output of the transmitter is also stored in a FIFO register. In order for this FIFO to determine when to start storing output data from the transmitter, the transmitter provides a status signal that tells this FIFO that data is present on the output lines. Figure 3.4 shows the I/O diagram for the transmitter and Table 3.2 gives the description of the signals in and out of the transmitter.

Figure 3.4 I/O diagram of the transmitter

Table 3.2 Transmitter signal descriptions
Signal name   Type    Width  Description
in_data       Input   1      Input data to the transmitter
clock         Input   1      Clock 20 MHz (output of PLL)
arst_n        Input   1      Asynchronous reset (asserted on negative edge)
wrfull        Input   1      FIFO status signal, asserted when FIFO is full
readempty     Input   1      FIFO status signal, asserted when FIFO is empty
out_data      Output  48     Modulated data coming out of the transmitter
readreq       Output  1      FIFO control signal, requests data from FIFO (transmitter asserts this signal when the FIFO is full)
start_output  Output  1      Asserted when there is data present on the out_data lines

3.4 FIFO

First In First Out (FIFO) is a popular data structure (also known as a queue) that is used for buffering in order to provide flow control. We obtained the FIFO from Altera's Megafunction Wizard (Quartus II). This parameterized Megafunction allows creating FIFOs of any width and depth with various options of control and status signals. Using technology-specific modules allows for quick prototyping of the design. Hence all we had to do was provide appropriate parameters and interface the Megafunction in our design.
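The handshake between the input FIFO and the transmitter can be illustrated with a small behavioural model. This is only a sketch and not the Altera Megafunction: the class name BitFifo, the function transmitter_input and the depth of 8 are assumptions made for the example; only the signal names wrfull, readempty and readreq are taken from Table 3.2. For simplicity the model does not write and read in the same cycle.

```python
from collections import deque


class BitFifo:
    """Behavioural FIFO model (hypothetical; depth chosen for illustration)."""

    def __init__(self, depth):
        self.depth = depth
        self.bits = deque()

    @property
    def wrfull(self):
        return len(self.bits) == self.depth

    @property
    def readempty(self):
        return len(self.bits) == 0

    def write(self, bit):
        if not self.wrfull:
            self.bits.append(bit)

    def read(self):
        return self.bits.popleft()


def transmitter_input(stream, depth=8):
    """Return (bits taken by the transmitter, bits left in the FIFO).

    readreq is raised once wrfull is seen and dropped once readempty is seen,
    mirroring the start/stop rule described for the transmitter.
    """
    fifo = BitFifo(depth)
    pending = deque(stream)
    readreq = False
    taken = []
    while pending or readreq:
        if readreq:
            taken.append(fifo.read())
            if fifo.readempty:
                readreq = False
        else:
            fifo.write(pending.popleft())
            if fifo.wrfull:
                readreq = True
    return taken, list(fifo.bits)
```

With a depth of 8, a 16-bit stream is passed through in two bursts of 8 bits, whereas a 10-bit stream leaves its last 2 bits waiting in the FIFO because the full condition is never reached a second time.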
The following sections describe the various building blocks of the OFDM transmitter shown in Figure 3.2. The functions of these blocks and their role in the OFDM system were briefly discussed in Chapter 1; here the hardware implementation details of these blocks are discussed.

3.5 SCRAMBLER

A scrambler (often referred to as a randomizer) is a device that manipulates a data stream before transmission. The purpose of scrambling is to eliminate the dependence of a signal's power spectrum upon the actual transmitted data, making it more dispersed to meet maximum power spectral density requirements, because if the power is concentrated in a narrow frequency band, it can interfere with adjacent channels [14].

3.5.1 DESIGN OF SCRAMBLER

Figure 3.5 shows the input/output parameters of the Scrambler. The input bus is 1 bit wide and arst_n is the asynchronous reset input. A negative edge on the arst_n input resets the Scrambler. A bit is latched in at the positive edge of the clock. See Table 3.3 for a description of the signals.

Figure 3.5 Scrambler I/O diagram

Table 3.3 Scrambler signal descriptions
Signal name   Type    Width  Description
in            Input   1      Input data to the transmitter
clock         Input   1      Positive edge clock
arst_n        Input   1      Asynchronous reset (Negative edged)
enable        Input   1      If high, input is present on the line in
out           Output  1      Output scrambled data

Scramblers can be implemented using a Linear Feedback Shift Register (LFSR) [9]. An LFSR is a simple register composed of memory elements (flip-flops) and modulo-2 adders (i.e. XOR gates). Feedback is taken from two or more memory elements, which are XOR-ed and fed back to the first stage (memory element) of the LFSR. In the proposed design, a standard 7-bit scrambler has been used to randomize the incoming bits. An initial seed value is stored in the LFSR when arst_n is asserted; this value may be any random bit string except for all zeroes or all ones.
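A software model of such a 7-stage scrambler can help when checking simulation waveforms. The sketch below takes the feedback from stages 4 and 7, as in the logic diagram of this section, and assumes that the feedback bit itself is what is shifted into the first stage while the output is the feedback XOR-ed with the input (an additive scrambler). The text can also be read as feeding the scrambled output back instead, so this is one possible reading and not necessarily the implemented circuit. The function name scramble and the seed used in the usage note are hypothetical.

```python
def scramble(bits, seed):
    """Additive 7-stage LFSR scrambler sketch.

    state[0] models stage 1 (the first stage) and state[6] models stage 7.
    Assumption: the feedback bit (stage 4 XOR stage 7) enters stage 1, and
    each output bit is the input bit XOR-ed with that feedback bit.
    """
    if len(seed) != 7 or not any(seed):
        raise ValueError("seed must hold 7 bits and must not be all zeroes")
    state = list(seed)
    out = []
    for b in bits:
        feedback = state[3] ^ state[6]   # modulo-2 sum of stages 4 and 7
        out.append(b ^ feedback)
        state = [feedback] + state[:6]   # shift every stage by one position
    return out
```

Under this additive assumption, applying the same function with the same seed a second time restores the original bits, which is how a matching de-scrambler would be modelled.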
If the initial seed contains all zeroes or all ones then the LFSR is locked in a state where every output value is the same, i.e. either one or zero.

Figure 3.6 Scrambler logic diagram

Figure 3.6 shows the basic construction of the scrambler. A feedback output, which is the modulo-2 sum of the contents of memory elements 4 and 7, is XOR-ed with the input; the result is designated as the output and is also shifted into the first stage. These memory elements are flip-flops (D flip-flops are used here), with the output of each flip-flop acting as the input for the next flip-flop.

Figure 3.7 shows, in detail, the circuit diagram of the scrambler. We can see that the reset (arst_n) is asserted on the negative edge; this is shown by the bubble at the reset pins of the flip-flops.

Figure 3.7 Circuit diagram of Scrambler

3.6 REED SOLOMON ENCODER

Reed Solomon forward error correcting codes have become commonplace in modern digital communications. Although they were invented in 1960 by Irving Reed and Gustave Solomon, then working at MIT Lincoln Labs [21], it was many years before technology caught up and was able to provide efficient hardware implementations. Versions of Reed Solomon codes are now used in error correction systems found just about everywhere, including [23]:

Storage devices (hard disks, compact disks, DVD, barcodes)
Wireless communications (mobile phones, microwave links)
Digital television
Satellite communications (including deep space missions like Voyager)
Broadband modems (ADSL, xDSL etc)

3.6.1 DESCRIPTION OF THE REED SOLOMON CODE

Reed Solomon codes work by adding extra information (redundancy) to the original data. The encoded data can then be stored or transmitted. When the encoded data is recovered it may have errors introduced, for instance by scratches on a CD, imperfections on a hard disk surface or radio frequency interference with mobile phone reception.
The added redundancy allows a decoder (with certain restrictions) to detect which parts of the received data are corrupted, and correct them [22]. The number of errors the code can correct depends on the amount of redundancy added.

RS codes are systematic linear block codes. An RS code is a block code because the code is put together by splitting the original message into fixed-length blocks. Each block is further subdivided into m-bit symbols [22]. Each symbol has a fixed width, usually 3 to 8 bits. In the proposed design, each symbol is 4 bits wide.

The linear nature of the codes ensures that in practice every possible m-bit word is a valid symbol. For instance, with a 4-bit code all possible 4-bit words are valid for encoding, and there is no need to worry about what data is being transmitted.

Systematic means that the encoded data consists of the original data with the extra 'parity' symbols appended to it [22].

An RS code is partially specified as RS (n, k) with m-bit symbols, where

$n = 2^m - 1$    (3.1)
$k = 2^m - 1 - 2t$    (3.2)
$t = (n - k)/2$    (3.3)

where m is the number of bits per symbol, k is the number of input data symbols (to be encoded), n is the total number of symbols (data + parity) in the RS codeword and t is the maximum number of symbols that can be corrected. The difference n-k (usually called 2t) is the number of parity symbols that have been appended to make the encoded block.

In the proposed design n=15 and k=9, represented by RS (15, 9). This gives m=4 and t=3. Therefore, each symbol is 4 bits wide and a maximum of 3 symbols can be corrected in the decoder. Figure 3.8 graphically represents an n-symbol code showing the parity and data portions.

Figure 3.8 RS (n, k) code

The power of Reed Solomon codes lies in being able to correct a corrupted symbol with a single bit error just as easily as a symbol with all its bits in error. This makes RS codes particularly suitable for correcting burst errors.
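The relations (3.1) to (3.3) can be checked with a few lines of arithmetic. The helper below is only a worked example; the function name rs_parameters and the dictionary keys are chosen for illustration and are not part of the design.

```python
def rs_parameters(m, k):
    """Derive the RS (n, k) parameters from the symbol width m and k."""
    n = 2 ** m - 1                      # equation (3.1)
    if k not in range(1, n) or (n - k) % 2 != 0:
        raise ValueError("k must satisfy 0 < k < n with an even n - k")
    t = (n - k) // 2                    # equation (3.3)
    return {
        "n": n,
        "k": k,
        "parity_symbols": n - k,        # 2t
        "t": t,
        "codeword_bits": n * m,
        "data_bits": k * m,
        "bits_in_t_symbols": t * m,     # bits covered by t whole symbols
    }
```

For m=4 and k=9 this gives n=15, six parity symbols, t=3 and a 60-bit codeword made of 36 data bits and 24 parity bits.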
Usually the encoded data is transmitted or stored as a sequence of bits. In the proposed design up to 12 bits could be corrupted, affecting at most 3 symbols, and the original message could still be recovered. However, it does mean that RS codes are relatively sensitive to evenly spaced errors.

3.6.2 GALOIS FIELD ARITHMETIC

Reed Solomon codes are based on finite fields, often called Galois fields. Rather than look at individual numbers and equations, the approach of modern mathematicians is to look at all the numbers that can be obtained from some given initial collection by using operators such as addition, subtraction, multiplication and division. The resulting collection is called a field. Some fields, like the set of real numbers, are infinite.

Galois fields have the useful property that any operation on an element of the field will always result in another element of the field [23]. The field is also finite, so it can be fully represented by a fixed-length binary word. An arithmetic operation that, in traditional mathematics, results in a value out of the field gets mapped back into the field; it is a form of modulo arithmetic [23].

Galois arithmetic has very little to do with counting things; 2+2 is not necessarily 4. For ease of handling, the Galois field elements are often called by their binary equivalent, but this can be misleading. There are many Galois fields, and part of the RS specification is to define which field is used.

Galois arithmetic is ideally suited to hardware implementation [23]. Addition and subtraction consist of simply XORing two symbols together. Multiplication is a little more difficult (as always) but can be done using purely combinational logic.

An RS code with 4-bit symbols will use a Galois field GF(2^4), consisting of 16 symbols. Thus every possible 4-bit value is in the field. The order in which the symbols appear depends on the primitive polynomial [23]. This polynomial is used in a simple iterative algorithm to generate each element of the field. Different polynomials will generate different fields.
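The iterative generation of the field elements can be sketched as follows, using the primitive polynomial 1 + X + X^4 adopted in this design (equation 3.4). Each element is held as a 4-bit integer whose bit i is the coefficient of alpha^i; this storage convention and the function names are assumptions made for the example. The helper lsb_first prints the coefficients of 1, alpha, alpha^2, alpha^3 from left to right, which is the ordering the entries of Table 3.4 appear to follow.

```python
PRIM_POLY = 0b10011  # X^4 + X + 1; bit i is the coefficient of X^i


def gf16_powers():
    """Return [alpha^0, alpha^1, ..., alpha^14] as 4-bit integers."""
    elems = []
    x = 1
    for _ in range(15):
        elems.append(x)
        x = x * 2                # multiply by alpha
        if x & 0b10000:          # degree 4 reached: apply alpha^4 = alpha + 1
            x ^= PRIM_POLY
    return elems


def gf16_add(a, b):
    """Addition (and subtraction) in GF(2^4) is a bitwise XOR."""
    return a ^ b


def gf16_mul(a, b):
    """Multiply by adding exponents modulo 15 (zero handled separately)."""
    if a == 0 or b == 0:
        return 0
    powers = gf16_powers()
    return powers[(powers.index(a) + powers.index(b)) % 15]


def lsb_first(x):
    """Format x as the coefficients of 1, alpha, alpha^2, alpha^3."""
    return "".join(str((x >> i) & 1) for i in range(4))
```

For example, lsb_first(gf16_powers()[4]) yields 1100, i.e. alpha^4 = 1 + alpha, and adding alpha^4 to alpha^0 returns alpha^1.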
The primitive polynomial used in the proposed design for GF(2^4) is

$P(X) = 1 + X + X^4$    (3.4)

The elements of the Galois field GF(2^4) are generated by using this primitive polynomial. The symbol α is used to give the power representation of each element.

Table 3.4 Elements of GF(2^4) and their binary equivalents
Power representation   Binary representation (coefficients of 1, α, α^2, α^3)
0                      0000
α^0                    1000
α^1                    0100
α^2                    0010
α^3                    0001
α^4                    1100
α^5                    0110
α^6                    0011
α^7                    1101
α^8                    1010
α^9                    0101
α^10                   1110
α^11                   0111
α^12                   1111
α^13                   1011
α^14                   1001

These values are calculated by substituting α for X in the primitive polynomial, such that

$\alpha^4 = \alpha + 1$    (3.5)

This gives the value of α^4, and for the value of α^5 we write

$\alpha^5 = \alpha \cdot \alpha^4$    (3.6)
$\alpha^5 = \alpha(\alpha + 1)$    (3.7)
$\alpha^5 = \alpha^2 + \alpha$    (3.8)

Similarly the rest of the elements are calculated.

The final parameter that is used to generate RS codes is the generator polynomial [23]. This polynomial is of order 2t (6 in our case). It is obtained as follows:

$G(X) = (X + \alpha)(X + \alpha^2)(X + \alpha^3)(X + \alpha^4) \cdots (X + \alpha^{2t})$    (3.9)

For our case (i.e. RS (15, 9) and 2t=6) the generator polynomial turns out to be

$G(X) = \alpha^6 + \alpha^9 X + \alpha^6 X^2 + \alpha^4 X^3 + \alpha^{14} X^4 + \alpha^{10} X^5 + X^6$    (3.10)

Given n, k, the symbol width m, the Galois field primitive polynomial P and the generator polynomial G, the Reed-Solomon code is fully specified.

3.6.3 ENCODER DESIGN

Since the code is systematic, the whole of the block can be read into the encoder and then output on the other side without alteration. Once the kth data symbol has been read in, the parity symbol calculation is finished, and the parity symbols can be output to give the full n symbols.

The idea of the parity words is to create a long polynomial (n coefficients long; it contains the message and the parity) which can be divided exactly by the RS generator polynomial. That way, at the decoder the received message block can be divided by the RS generator polynomial. If the remainder of the division is zero, then no errors are detected. If there is a remainder, then there are errors. Dividing one polynomial by another is not conceptually easy, but it is not too hard to understand by following the mathematics in some of the references. The encoder acts to divide the polynomial represented by the k message symbols D(x) by the RS generator polynomial G(x).

Figure 3.9 depicts the top-level architecture of the proposed Reed Solomon Encoder. This is a bit-serial Reed Solomon Encoder, which means that its input bus is one bit wide. One bit is latched per positive edge of the clock.
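The division-based parity calculation can be prototyped in software before it is committed to hardware. The sketch below builds G(X) from its roots as in equation (3.9) and computes the parity as the remainder of X^(2t) D(X) divided by G(X). It is a symbol-level reference model rather than the bit-serial circuit: the function names, the integer representation (bit i is the coefficient of alpha^i) and the ordering of symbols within the codeword list are assumptions made for the example.

```python
PRIM_POLY = 0b10011  # X^4 + X + 1


def build_tables():
    """Exponent and logarithm tables for GF(2^4)."""
    exp = []
    x = 1
    for _ in range(15):
        exp.append(x)
        x = x * 2
        if x & 0b10000:
            x ^= PRIM_POLY
    log = {v: i for i, v in enumerate(exp)}
    return exp, log


EXP, LOG = build_tables()


def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 15]


def generator_poly(two_t=6):
    """Coefficients of (X + a)(X + a^2)...(X + a^2t), lowest degree first."""
    g = [1]
    for i in range(1, two_t + 1):
        root = EXP[i % 15]
        nxt = [0] * (len(g) + 1)
        for d, c in enumerate(g):
            nxt[d] ^= gf_mul(c, root)   # c * a^i stays at degree d
            nxt[d + 1] ^= c             # c * X moves to degree d + 1
        g = nxt
    return g


def poly_mod(dividend, g):
    """Remainder of dividend / g (g monic, lists are lowest degree first)."""
    rem = list(dividend)
    deg_g = len(g) - 1
    for d in range(len(rem) - 1, deg_g - 1, -1):
        coef = rem[d]
        if coef:
            for j, gc in enumerate(g):
                rem[d - deg_g + j] ^= gf_mul(coef, gc)
    return rem[:deg_g]


def rs_encode(message, two_t=6):
    """Systematic codeword: parity symbols followed by the message symbols."""
    g = generator_poly(two_t)
    parity = poly_mod([0] * two_t + list(message), g)
    return parity + list(message)
```

The exponents of the computed generator coefficients can be compared against equation (3.10), and dividing any codeword produced by rs_encode by G(X) should leave a zero remainder, which is the check the decoder relies on.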
With 4 bits per symbol, 4 clock cycles are required to input a symbol. arst_n is the same asynchronous reset signal as in the Scrambler.

Table 3.5 Signal descriptions for Reed Solomon Encoder
Signal name   Type    Width  Description
in_data       Input   1      Input data to the Reed Solomon Encoder
clock         Input   1      Positive edge clock
arst_n        Input   1      Asynchronous reset (Negative edged)
enable        Input   1      If high, input is present on the line in_data
out           Output  1      Output RS encoded data

The encoder contains the following three building blocks (as shown in Figure 3.9):
Shift Registers
Galois field addition and multiplication
Redundancy interval controller

Figure 3.9 Top-level structure of the Reed Solomon Encoder

The shift register block contains 2t shift registers, each m bits wide. Therefore, for our case there are 6 shift registers, each 4 bits wide. One of these six registers has parallel loading capability as well.

Figure 3.10 shows the detailed architecture of the Reed Solomon Encoder; the reset and clock signals are not shown. We can see the six shift registers. The output of each register becomes the shift input of the next register stage. An exception is Reg5; its output is XORed with the input data bit and then ANDed with the complement of the redundancy interval bit (red), and the output of the AND gate becomes the shift input for Reg4. Outputs of these registers act as inputs for the Galois Field block described next.

Figure 3.10 Detailed architecture of Reed Solomon Encoder

The Galois Field Adder and Multiplier block performs all the Galois Field arithmetic functions.
Figure 3.11 depicts the internal architecture of the GF multiplier and adder. It works as follows: R0 to R5 are the register outputs that are shifted out into the GF circuit (as shown in Figure 3.10). This circuit multiplies the contents of each register by a constant multiplier which is established by connections to the XOR gates. For instance, R1 is connected to the 2nd and 4th XOR gates, so R1 is multiplied by 0101, which is α^9 and is also a coefficient of the generator polynomial. In this way every register is multiplied by the corresponding coefficient of the generator polynomial. After these multiplications the products are added. This process takes four clock cycles and in the fourth cycle the result is loaded into R5 as shown in Figure 3.10.

Figure 3.11 Galois Field multiplier and adder

Figure 3.10 shows how the Redundancy interval controller is connected to the main circuit. For the first 36 clock cycles (9x4) the redundancy signal (named red in Figure 3.10) is low and data bits go into the circuit and through the multiplexer as well. After that the red signal goes high, allowing the parity bits to pass through the multiplexer. Redundancy is the name given to the interval during which data bits are not allowed to get into the circuit and parity bits are brought out. To achieve this, a 6-bit counter is employed. Using this counter, a high output is obtained when the counter reaches 36 and it is brought back low when the counter reaches 60 (36+24).

3.7 CONVOLUTIONAL ENCODER

Convolutional coding is part of the Forward Error Correction (FEC) done in communication systems. The purpose of FEC is to improve the capacity of a channel by adding some carefully designed redundant information to the data being transmitted through the channel [4]. The process of adding this redundant information is known as channel coding [4]. Convolutional codes operate on serial data, one or a few bits at a time.
There are a variety of useful convolutional codes, and a variety of algorithms for decoding the received coded information sequences to recover the original data.

Convolutional codes are usually described using two parameters: the code rate and the constraint length. The code rate, m/n, is expressed as a ratio of the number of bits into the convolutional encoder (m) to the number of channel symbols output by the convolutional encoder (n) in a given encoder cycle. The constraint length parameter, K, denotes the "length" of the convolutional encoder, i.e. how many k-bit stages are available to feed the combinatorial logic that produces the output symbols. Convolutional codes are often used to improve the performance of digital radio, mobile phones, and satellite links.

In the proposed design a convolutional encoder with a code rate of 1/2 has been chosen, i.e. m=1 and n=2. A constraint length of 7 is used because it is standard and its decoding can be done efficiently using the popular Viterbi decoding algorithm.

3.7.1 ENCODER DESIGN

Figure 3.12 shows the I/O parameters of the Convolutional Encoder. The input bus is 1 bit wide and arst_n is the asynchronous reset input. A negative edge on the arst_n input resets the encoder. A bit is latched in at the positive edge of the clock. For every input bit there is a two-bit-wide output designated by even and odd. Table 3.6 gives a description of the I/O signals of the Convolutional Encoder.
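A behavioural model of this rate-1/2, K=7 encoder is sketched below. It uses the tap positions given for the circuit of this section (even: stages 0, 1, 3, 4 and 6; odd: stages 0, 3, 4, 5 and 6) and assumes that stage 6 is the leftmost stage, i.e. the one that receives each new input bit. That stage ordering and the function name conv_encode are assumptions made for the example.

```python
EVEN_TAPS = (0, 1, 3, 4, 6)
ODD_TAPS = (0, 3, 4, 5, 6)


def conv_encode(bits):
    """Return one (even, odd) pair per input bit.

    Assumption: reg[6] is the leftmost stage and takes the new input bit,
    while older bits move toward reg[0]. The register starts at all zeroes.
    """
    reg = [0] * 7
    out = []
    for b in bits:
        reg = reg[1:] + [b]          # shift toward stage 0, load stage 6
        even = 0
        for t in EVEN_TAPS:
            even ^= reg[t]           # modulo-2 addition via XOR
        odd = 0
        for t in ODD_TAPS:
            odd ^= reg[t]
        out.append((even, odd))
    return out
```

Under this ordering, feeding a single 1 followed by six zeroes produces the even sequence 1011011 and the odd sequence 1111001, i.e. 133 and 171 in octal, the generator pair commonly used with a constraint length of 7.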
Figure 3.12 Convolutional Encoder I/O Diagram

Table 3.6 Signal descriptions for Convolutional Encoder
Signal name   Type    Width  Description
in            Input   1      Input data to the Convolutional Encoder
clock         Input   1      Positive edge clock
arst_n        Input   1      Asynchronous reset (Negative edged)
enable        Input   1      If high, input is present on the line in
even          Output  1      Least significant bit of the output
odd           Output  1      Most significant bit of the output

A convolutional encoder can be implemented using either a shift register or an Algorithmic State Machine [16]. However, a shift register gives an easy-to-implement and area-efficient solution.

For the configuration of m=1, n=2 and K (constraint length) = 7, Figure 3.13 shows how the convolutional encoder is implemented in the proposed design using a shift register. Initially all zeroes are stored in the register. When the first input bit arrives it is shifted into the register from the left and the 2-bit output appears on the lines designated as even and odd.

Figure 3.13 Convolutional Encoder: Circuit Diagram

The even output is generated by adding the contents of the 0th, 1st, 3rd, 4th and 6th stages of the shift register, whereas the odd output is generated by adding the contents of the 0th, 3rd, 4th, 5th and 6th stages. This addition is modulo-2 addition carried out through XOR gates. Just like the Scrambler, the memory elements here are D flip-flops.

3.8 INTERLEAVER

Interleaving is mainly used in digital data transmission technology to protect the transmission against burst errors. These errors overwrite a lot of bits in a row, but seldom occur. The device that performs interleaving is known as an interleaver.

Conceptually, the incoming bit stream is re-arranged so that adjacent bits are no longer adjacent to each other.
In practice the data is broken into blocks and the bits within a block are re-arranged. In the proposed design, a block consists of 64 symbols (128 bits). The number of bits in each symbol depends upon the single-carrier modulation technique applied to produce that symbol.

Figure 3.14 shows how an interleaver is generally implemented [23]. Two memory elements (usually RAMs) are used. In the first RAM the incoming block of bits is stored in sequential order. The data from the first RAM is then read out in a permuted, non-sequential order (determined by an algorithm) so that the bits are re-arranged, stored in the second RAM and then read out.

Figure 3.14 Interleaving concept

As mentioned above, the incoming bit stream is broken into blocks; when interleaving in the OFDM system the block size should be equal to the size of an OFDM symbol. Since there are 64 sub-carriers and each sub-carrier is modulated using QPSK, there are 128 bits in one OFDM symbol. Hence, the job of the interleaver is to re-arrange the bits within the OFDM symbol.

3.8.1 INTERLEAVER DESIGN

As discussed above, the function that the interleaver has to perform is to read 128 bits, re-arrange them and read them out. This can be accomplished by using RAMs for temporarily storing the bits, after which the bits can be read out from the RAMs in the desired order. Remember that the block before the interleaver is the Convolutional Encoder, which gives an output of two bits. Therefore the input bus of the interleaver should be two bits wide.

Figure 3.15 Interleaver I/O diagram (A top-level architecture)

Figure 3.15 shows the top-level architecture of the interleaver.
Table 3.7 Signal descriptions for Interleaver
Signal name   Type    Width  Description
in            Input   2      Input data to the Interleaver
clock         Input   1      Positive edge clock
arst_n        Input   1      Asynchronous reset (Negative edged)
enable        Input   1      If high, input is present on the line in
out           Output  2      Output of the interleaver

Note that the input and output buses are two bits wide. The three building blocks of the interleaver are:
Block Memory
Controller
Address ROM

The block memory contains the memory elements necessary to store the incoming block of data. There are a total of four memory elements; each is a 64x1 RAM. Four RAMs are used in order to achieve pipelined operation. Two of these RAMs are used for writing a block while another block is being read out from the other two RAMs. In this way the RAMs are alternately switched between reading and writing modes. Hence, reading and writing are done simultaneously without stalling the data stream. The configuration of each of these RAMs is such that two bits are written at a time in two memory locations and one bit is read at a time. Recall that the input to the interleaver is two bits wide, which this write configuration accommodates. Two 64x1 memories are used instead of a single 128x1 memory because two bits are to be read at a time.

While writing a block of data (i.e. 128 bits), 16 bits are alternately written into the 64x1 RAMs. That is to say, the first 16 bits are written to the first RAM, the next 16 to the second RAM, the next 16 again to the first RAM and so on. This is done in order to keep the two bits that have to be read (in the desired order) in separate RAMs.

The job of the controller is to guide the incoming block of data to the correct memory blocks, to switch the RAMs between reading and writing modes, and to switch between the two RAMs for 16 alternate bits in writing mode. This is done by using counters.

The address ROM is a 64x6 ROM that stores read addresses for the RAMs.
Note that a single ROM is enough for the four RAMs. This is because only two RAMs at a time are in the read mode and the two bits that are read out of the two RAMs are in the same memory locations as per the design. Each location of the ROM is 6 bits wide because a 6-bit address is required to read from a RAM having 64 locations.

Figure 3.16 shows the circuit diagram of the interleaver. Counter1 and Counter2 provide the write addresses for the four RAMs 1A, 2A, 1B and 2B. Counter C is a 3-bit counter that controls switching between either RAM 1A and RAM 2A or RAM 1B and RAM 2B, depending upon which RAMs are in write mode. Counter1 and Counter2 are 5-bit counters; after every 8th count, control switches to either Counter1 or Counter2, and this is controlled by Counter C. The SYNC signal decides which RAMs must write and which should read. When SYNC is 0, RAM 1A and RAM 2A are in write mode and RAM 1B and RAM 2B are in read mode; the opposite is the case when SYNC is high.

For the first data block SYNC remains 0 and therefore the block is written to RAM 1A and RAM 2A. When the last bit of the block is written SYNC goes high and RAM 1A and RAM 2A go into read mode, whereas RAM 1B and RAM 2B go into write mode and the next block is written to these RAMs. At the same time the previous block is read out of RAMs 1A and 2A in the desired order.

The contents of the Address ROM are shown in Table 3.8. Note that the output of the ROM is connected to the read address pin of all four RAMs.
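The entries of the Address ROM listed in Table 3.8 follow a regular pattern: location i holds (i mod 4) x 16 + floor(i / 4). This closed form is inferred from the table and is not stated by the design itself, so it should be treated as an observation. The sketch below regenerates the ROM contents from that formula and applies them to a single 64-entry memory; the function names are hypothetical, and the model ignores the split of each block across two RAMs.

```python
def address_rom():
    """Read addresses for ROM locations 0..63 (formula inferred from the table)."""
    return [(i % 4) * 16 + i // 4 for i in range(64)]


def read_in_rom_order(memory):
    """Read a 64-entry memory in the order given by the address ROM."""
    if len(memory) != 64:
        raise ValueError("memory must hold 64 entries")
    return [memory[a] for a in address_rom()]


def restore_order(shuffled):
    """Inverse permutation, as a de-interleaver model would apply it."""
    rom = address_rom()
    original = [None] * 64
    for position, address in enumerate(rom):
        original[address] = shuffled[position]
    return original
```

Consecutive reads are 16 locations apart, so bits that were written next to each other end up four read positions apart within the block.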
Figure 3.16 Circuit diagram of Interleaver

Table 3.8 Contents of Address ROM (in Interleaver)
ROM location (decimal) and contents (decimal), four entries per row:
Loc  Contents   Loc  Contents   Loc  Contents   Loc  Contents
 0    0          1   16          2   32          3   48
 4    1          5   17          6   33          7   49
 8    2          9   18         10   34         11   50
12    3         13   19         14   35         15   51
16    4         17   20         18   36         19   52
20    5         21   21         22   37         23   53
24    6         25   22         26   38         27   54
28    7         29   23         30   39         31   55
32    8         33   24         34   40         35   56
36    9         37   25         38   41         39   57
40   10         41   26         42   42         43   58
44   11         45   27         46   43         47   59
48   12         49   28         50   44         51   60
52   13         53   29         54   45         55   61
56   14         57   30         58   46         59   62
60   15         61   31         62   47         63   63

3.9 CONSTELLATION MAPPER

The Constellation Mapper maps the incoming bits onto separate sub-carriers. In the proposed design there are 64 sub-carriers and each of them is modulated using QPSK; therefore the function of the Constellation Mapper is to map every two bits onto a single carrier, because in QPSK two bits make up one symbol.

Figure 3.17 shows the constellation diagram of QPSK. Mapping of bits onto constellation points is done in accordance with Gray code so that adjacent constellation points differ in just one bit. Table 3.9 shows the data bits and the corresponding constellation points.

Figure 3.17 QPSK constellation diagram

Table 3.9 Mapping of bits to constellation points
Data bits   Constellation point
00           0.707 + j0.707
01          -0.707 + j0.707
10           0.707 - j0.707
11          -0.707 - j0.707

The block before the Constellation Mapper is the Interleaver, which gives an output of two bits per clock cycle. Therefore, two bits are mapped to a constellation point every clock cycle.

3.9.1 DESIGN OF CONSTELLATION MAPPER

A ROM is used to store the constellation points. Each constellation point is represented by 48 bits in binary.
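The 48-bit ROM words of the mapper can be derived from the constellation points of Table 3.9 with a short calculation. The sketch below assumes the word layout specified in this section: the real part in the upper 24 bits, the imaginary part in the lower 24 bits, and each part a two's-complement number with 8 integer bits and 16 fractional bits. Truncation toward zero is an additional assumption chosen for the example, as are the names to_fixed and qpsk_rom.

```python
import math

FRAC_BITS = 16
WORD_BITS = 24


def to_fixed(value):
    """24-bit two's-complement word with 16 fractional bits."""
    scaled = int(value * 2 ** FRAC_BITS)      # int() truncates toward zero
    return scaled & (2 ** WORD_BITS - 1)      # wrap negatives into 24 bits


def qpsk_rom():
    """Map each 2-bit address to a 48-bit word (real part in the upper half)."""
    a = 1 / math.sqrt(2)
    points = {
        0b00: (a, a),
        0b01: (-a, a),
        0b10: (a, -a),
        0b11: (-a, -a),
    }
    return {
        addr: to_fixed(re) * 2 ** WORD_BITS + to_fixed(im)
        for addr, (re, im) in points.items()
    }
```

Formatting the entries with format(word, "012X") gives 00B50400B504 for address 00 and FF4AFC00B504 for address 01, which can be compared against the hexadecimal contents listed for the mapper ROM.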
In these 48 bits, the most significant 24 bits represent the real part and the least significant 24 bits represent the imaginary part. In both the real and imaginary parts the most significant 8 bits are the integer part and the least significant 16 bits represent the fractional part. 2's complement notation has been used to represent negative numbers.

The size of the ROM is 4x48. The incoming input bits (2 bits) act as the address for the ROM. Table 3.10 shows the ROM contents at each address location. Each of these values in the ROM is a constellation point corresponding to the data bits, which here act as addresses for the ROM.

Table 3.10 Contents of the ROM (in Constellation Mapper)
Address (binary)   Contents (HEX)
00                 00B50400B504
01                 FF4AFC00B504
10                 00B504FF4AFC
11                 FF4AFCFF4AFC

Figure 3.18 Constellation Mapper

Figure 3.18 shows the circuit of the Constellation Mapper. It contains nothing but a ROM. Note that the input is two bits wide and the output is 48 bits wide. For a description of the I/O signals of the constellation mapper see Table 3.11.

Table 3.11 Signal descriptions for Constellation Mapper
Signal name   Type    Width  Description
in            Input   2      Input to the constellation mapper (acting as address for the above shown ROM)
clock         Input   1      Positive edge clock
out           Output  48     Output of the constellation mapper (representing 48 bit complex number)

3.10 INVERSE FAST FOURIER TRANSFORM

In 1971 the Discrete Fourier Transform (DFT) was used in baseband modulation/demodulation in order to achieve orthogonality [24]. Since the DFT has heavy computational requirements, the Fast Fourier Transform (FFT) was utilized. For an N-point DFT the required number of computations is N(N-1), whereas for the FFT/IFFT it is on the order of N log2(N), which is much smaller.

The FFT/IFFT operates on finite sequences. Waveforms which are analog in nature must be sampled at discrete points before the FFT/IFFT algorithm can be applied.
The Discrete Fourier Transform (DFT) operates on a sampled time-domain signal which is periodic. The equation for the DFT is:

$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi n k / N}$    (3.11)

X(k) represents the DFT frequency output at the k-th spectral point, where k ranges from 0 to N-1. The quantity N represents the number of sample points in the DFT data frame. The quantity x(n) represents the nth time sample, where n also ranges from 0 to N-1. In the general equation, x(n) can be real or complex.

The corresponding inverse discrete Fourier transform (IDFT) of the sequence X(k) gives a sequence x(n) defined only on the interval from 0 to N-1 as follows:

$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, e^{j 2\pi n k / N}$    (3.12)

The DFT equation can be re-written as:

$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}$    (3.13)

The quantity $W_N^{nk}$ is defined as:

$W_N^{nk} = e^{-j 2\pi n k / N}$    (3.14)

This quantity is called the twiddle factor. It is the sine and cosine basis function written in polar form [13].

Examination of equation (3.11) reveals that the computation of each point of the DFT requires (N-1) complex multiplications and (N-1) complex additions (the first term in the sum involves e^{j0} = 1). Thus, computing N points of the DFT requires N(N-1) complex multiplications and N(N-1) complex additions.

As N increases, the number of multiplications and additions required becomes significant, because multiplication requires a relatively large amount of processing time even on a computer. Thus, many methods for reducing the number of multiplications have been investigated over the last 50 years [12].

3.10.1 RADIX-2^2 ALGORITHM

When the number of data points N in the FFT/IFFT is a power of 4 (i.e., N = 4^v), we can, of course, always use a radix-2 algorithm for the computation. However, for this case, it is more efficient computationally to employ a radix-4 FFT algorithm.

In the decimation-in-frequency algorithm, the outputs or the frequency domain points are regrouped or subdivided. Consider the FFT equation:

$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi n k / N}$    (3.15)

As an example we consider N=16. We split or decimate the N-point input sequence into four subsequences, x(4n), x(4n+1), x(4n+2), x(4n+3), n = 0, 1, ..., N/4-1. Therefore, we get X(k), X(k+N/4), X(k+N/2) and X(k+3N/4). This process is called decimation in frequency. This decimation continues until each DFT becomes a 4-point DFT. Each 4-point DFT is known as a butterfly when represented graphically. Figure 3.19 shows a radix-4 FFT butterfly.

Since in the proposed design there are 64 sub-carriers, the input to the FFT is 64 complex numbers; hence a 64-point FFT is required. For a 4^n-point FFT, n stages are required, with N/4 4-point DFTs per stage. Therefore in our case there are 3 stages (64 = 4^3) and 16 4-point DFTs per stage, i.e. 16 butterflies per stage.

Figure 3.19 Radix-4 FFT butterfly

In the decimation-in-frequency FFT algorithm, the outputs are decimated; therefore, inputs to the FFT are given in the actual order [25]. In this way we get the output in a re-arranged order.

In the proposed design the radix-2^2 DIT FFT algorithm is targeted because its butterfly is as simple as that of radix-2 and its number of complex multiplications is as low as that of radix-4. Figure 3.20 shows a radix-2 butterfly; its simplicity speaks for itself.

Figure 3.20 Radix-2 FFT Butterfly

In the radix-2^2 algorithm, a radix-4 butterfly is created using two radix-2 butterflies. The benefit of using the radix-2^2 algorithm is the ease of controlling the butterfly due to its simplicity and the decreased number of stages and complex multipliers.

3.10.2 IFFT DESIGN

From here on, the term FFT refers to both the IFFT and the FFT. There are two basic ways to implement an FFT in hardware: one uses a pipelined architecture and the other a memory-based architecture. The former requires fewer hardware resources and hence occupies less area, but requires a greater number of clock cycles.
On the other hand, the memory-based architecture requires more hardware resources but takes fewer clock cycles. In the proposed design the pipelined architecture has been chosen in order to make the FFT design area-efficient. Additionally, a fixed-point FFT implementation has been carried out to avoid any overflows resulting from the complex multiplications.

Figure 3.21 shows the I/O diagram of the IFFT, and the I/O signals are described in Table 3.12.

Figure 3.21 IFFT I/O diagram

Table 3.12 Signal descriptions for IFFT

    Signal Name   Type     Width   Description
    arst_n        Input    1       Asynchronous reset (negative edged)
    clock         Input    1       Positive edged clock
    enable        Input    1       When high, data is present on the realinput and imginput lines
    realinput     Input    24      Real part of the input complex number
    imginput      Input    24      Imaginary part of the input complex number
    realoutput    Output   24      Real part of the output complex number
    imgoutput     Output   24      Imaginary part of the output complex number

Complex data is fed in at one data-point per clock cycle. The enable signal is asserted in the clock cycle before the first data-point is presented.

Figure 3.22 is a block diagram of the 64-point radix-2^2 fixed-point FFT. The module consists of six radix-2 butterflies, shift registers associated with each butterfly, two complex multipliers, two twiddle factor generators, and a controller that provides the control signals. The feedback shift registers vary in length from 1 to 32 entries, and are labelled accordingly (freg1 to freg32).

Figure 3.22 Architecture of 64-point radix-2^2 FFT

Figure 3.23 bf2i and bf2ii radix-2 butterflies

Each group of two butterflies, consisting of a bf2i and a bf2ii, together emulates a radix-4 butterfly. Figure 3.23 shows the internals of each and how they are connected together.
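To make the butterfly idea concrete, the following is a minimal sketch of the add/subtract core of a radix-2 butterfly on 24-bit signed complex data. It is not the project's bf2i or bf2ii module: the module name, the port names and the choice to halve each result (one plausible way of keeping a fixed-point pipeline from overflowing) are assumptions made for illustration only.

```verilog
// Illustrative radix-2 butterfly core (hypothetical names, not the project RTL).
// Computes (a + b)/2 and (a - b)/2 on signed complex inputs.
module radix2_butterfly_sketch #(
    parameter WIDTH = 24
) (
    input  signed [WIDTH-1:0] a_re, a_im,       // first operand (the delayed sample)
    input  signed [WIDTH-1:0] b_re, b_im,       // second operand (the current sample)
    output signed [WIDTH-1:0] sum_re, sum_im,   // (a + b) / 2
    output signed [WIDTH-1:0] diff_re, diff_im  // (a - b) / 2
);
    // One extra bit so the addition and subtraction cannot overflow.
    wire signed [WIDTH:0] s_re = a_re + b_re;
    wire signed [WIDTH:0] s_im = a_im + b_im;
    wire signed [WIDTH:0] d_re = a_re - b_re;
    wire signed [WIDTH:0] d_im = a_im - b_im;

    // Dropping the LSB is an arithmetic divide by two (assumed scaling rule).
    assign sum_re  = s_re[WIDTH:1];
    assign sum_im  = s_im[WIDTH:1];
    assign diff_re = d_re[WIDTH:1];
    assign diff_im = d_im[WIDTH:1];
endmodule
```

In a single-path delay feedback stage the first operand would come from the feedback shift register and the second from the incoming serial stream; that routing is left out of the sketch.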
These modules operate on a principle known as Single-path Delay Feedback (SDF) [25]. The FFT radix-2 butterfly must have two inputs in order to produce the next FFT intermediate value, but the data in our scenario is available only in a serial mode. The SDF mechanism provides a solution where the first input is delayed until the second input is presented, after which the calculation can proceed. Both the bf2i and bf2ii modules accomplish this by multiplexing the first input to a shift register of sufficient length so that that data-point is present at the butterfly input when the second data-point appears.

A counter provides the control signals for these multiplexers, which are internal to the butterfly modules. The counter additionally provides signals to the bf2ii for switching the adder operations and swapping the real and imaginary input wires. These mechanisms effect a multiplication of the input by j.

In order to avoid overflow, the data set is scaled down as it propagates through the pipeline. The FFT operation consists of a long series of summations, and thus either the dynamic range of the numerical representation must be large (floating-point or block floating-point), or the numerical data must be scaled down. Since the module is fixed point, the latter strategy is used.

3.11 CYCLIC PREFIX ADDER

The cyclic prefix is a replica of a fractional portion of the end of an OFDM symbol that is placed at the beginning of the symbol. It removes the inter-symbol interference that can occur due to multipath, provided that its duration is greater than the delay spread.

3.11.1 DESIGN OF CYCLIC PREFIX ADDER

The architecture of the cyclic prefix adder simply consists of an address ROM that stores addresses, a RAM to store incoming data in sequential order and a counter that provides read addresses to the RAM. Figure 3.24 shows the top-level architecture of the cyclic prefix adder. Refer to Table 3.13 for the description of the I/O signals.
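The read-address sequence that such a counter and address ROM would have to produce can be sketched as follows, assuming the 64-sample symbol and 8-sample prefix used in this design. The sketch computes the address arithmetically instead of looking it up in a ROM, so it is an alternative illustration rather than the implemented architecture; the module and port names are hypothetical.

```verilog
// Illustrative read-address generator for a cyclic prefix adder
// (hypothetical names; arithmetic replaces the address ROM of the design).
module cp_read_addr_sketch (
    input        clock,
    input        arst_n,   // asynchronous reset, active low
    input        enable,   // advance by one output sample per clock
    output [5:0] rd_addr,  // RAM read address, 0..63
    output       last      // high on the 72nd (final) output sample
);
    reg [6:0] count;       // output sample index, 0..71

    always @(posedge clock or negedge arst_n) begin
        if (!arst_n)
            count <= 7'd0;
        else if (enable)
            count <= (count == 7'd71) ? 7'd0 : count + 7'd1;
    end

    // count 0..7  -> addresses 56..63 (the prefix: last eight samples)
    // count 8..71 -> addresses 0..63  (the symbol itself)
    // The 6-bit addition wraps modulo 64, which gives exactly this mapping.
    assign rd_addr = count[5:0] + 6'd56;
    assign last    = (count == 7'd71);
endmodule
```

Reading the RAM in this order emits 72 samples per symbol without copying any data, which is the same effect the ROM-based addressing is described as achieving.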
Figure 3.24 Top level architecture of cyclic prefix adder

Table 3.13 Signal descriptions for Cyclic Prefix Adder

    Signal Name   Type     Width   Description
    arst_n        Input    1       Asynchronous reset (negative edged)
    clock         Input    1       Positive edged clock
    enable        Input    1       When high, data is present on the in lines
    in            Input    48      Input complex number
    out           Output   48      Output complex number

In the proposed design, the last eight complex data points of the OFDM symbol are replicated at the beginning of the symbol; therefore a total of 72 (64 + 8) data points are actually transmitted.

CHAPTER 4
RECEIVER DESIGN AND IMPLEMENTATION

4.1 INTRODUCTION

This chapter gives a detailed description of the implementation of the receiver part of the project. The receiver has been implemented on the same Cyclone III board. It consumes about 5600 out of the 24,600 logic elements present on the board.

The OFDM receiving unit receives its input directly from the transmitter whenever its output is available. The receiver follows the exact reverse of the procedure followed in the transmitter. It receives the complex (modulated) output points, performs demodulation and recovers the original bits sent to the transmitter.

4.2 THE RECEIVER

The I/O diagram of the receiver is shown in Figure 4.1. We can see that there are no control or status signals to or from a FIFO; the reason is that the modulated data from the transmitter is fed directly to the receiver as input. The signals shown are described in Table 4.1.
Figure 4.1 I/O diagram of the OFDM receiver

Table 4.1 OFDM Receiver signal descriptions

    Signal Name    Type     Width   Description
    in_data        Input    48      Input data to the receiver
    clock          Input    1       Clock 20 MHz (output of PLL)
    arst_n         Input    1       Asynchronous reset (asserted on negative edge)
    enable         Input    1       When asserted, data is present on the in_data lines
    out_data       Output   1       Demodulated data coming out of the receiver
    start_output   Output   1       Asserted when there is data present on the out_data lines

Figure 4.2 shows the hardware architecture of the complete OFDM system, highlighting the receiver part this time. The various blocks that constitute the receiver are shown. The receiver, just like the transmitter, operates at a clock frequency of 20 MHz provided by the on-board PLL.

The rest of the chapter is dedicated to the detailed description and design of the blocks inside the OFDM receiver as shown in Figure 4.2.

Figure 4.2 Complete Architecture of the proposed OFDM system (receiver highlighted)

4.3 CYCLIC PREFIX REMOVER

The cyclic prefix was added at the transmitting end in order to avoid inter-symbol interference; therefore during reception it must be eliminated before any further processing of the received signal. This is done by simply skipping the first eight samples of the received OFDM symbol. In hardware this is implemented in the control unit. The control unit only enables the next block (FFT) when the first eight samples of the received OFDM symbol have been skipped.
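The skipping behaviour described above can be sketched as a small counter that gates the downstream enable. This is an illustrative guess at the control logic, assuming 72 received samples per symbol (8 prefix samples followed by 64 data samples); the module name, port names and exact enable timing are hypothetical and not taken from the project's control unit.

```verilog
// Illustrative cyclic prefix remover control (hypothetical names and timing).
module cp_remove_ctrl_sketch (
    input  clock,
    input  arst_n,   // asynchronous reset, active low
    input  enable,   // high while a received sample is present
    output keep      // high for samples 8..71, low for the 8 prefix samples
);
    reg [6:0] count; // index of the current sample within the symbol, 0..71

    always @(posedge clock or negedge arst_n) begin
        if (!arst_n)
            count <= 7'd0;
        else if (enable)
            count <= (count == 7'd71) ? 7'd0 : count + 7'd1;
    end

    // Only samples after the prefix are passed on to the next block.
    assign keep = enable && (count >= 7'd8);
endmodule
```

A real control unit would also have to respect the timing of the FFT's enable input, which this sketch does not model.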
4.4 FAST FOURIER TRANSFORM

Details of the FFT/IFFT algorithm and its hardware implementation were given in the previous chapter. The only difference is that the description there was for the IFFT (although the FFT was mentioned in some places). To implement the FFT in hardware the algorithm is the same; the only differences are that the divider is removed and the real and imaginary parts at the input are swapped, i.e. real becomes imaginary and imaginary becomes real. The same goes for the output, i.e. the real and imaginary parts at the output are swapped as well. Figure 4.3 depicts the scenario.

Figure 4.3 FFT

4.5 CONSTELLATION DE-MAPPER

The function of the constellation demapper is to map the QPSK symbols (complex numbers) coming from the output of the FFT to the data points shown in the constellation diagram in Figure 4.4. Basically it is the inverse of the procedure carried out in the constellation mapper at the transmitter.

Figure 4.4 QPSK constellation diagram

4.5.1 DESIGN OF CONSTELLATION DE-MAPPER

The mapping of data points to QPSK symbols (as done in the transmitter) is shown in Table 4.2.

Table 4.2 Data points mapped to constellation points

    Address (binary)   Constellation point   Constellation point (HEX)
    00                  0.707 + j0.707       00B50400B504
    01                 -0.707 + j0.707       FF4AFC00B504
    10                  0.707 - j0.707       00B504FF4AFC
    11                 -0.707 - j0.707       FF4AFCFF4AFC

Therefore, the incoming constellation points are mapped back onto the data points as shown in Table 4.2. Figure 4.5 shows the I/O diagram of the constellation demapper and Table 4.3 describes the signals.
Figure 4.5 I/O diagram of constellation demapper

Table 4.3 Signal descriptions for Constellation De-mapper

    Signal Name   Type     Width   Description
    in            Input    48      Input constellation points
    clock         Input    1       Positive edge clock
    out           Output   2       Output data points corresponding to Table 4.2
    arst_n        Input    1       Asynchronous reset (negative edged)

Instead of going into the hardware architecture, the design is shown using the Verilog code. A simple case statement is used to construct the design. The code is shown below:

    always @(in)
    begin
        // {sign bit of the real part, sign bit of the imaginary part}
        case ({in[47], in[23]})
            2'b00:   tmp_out = 2'b00;
            2'b01:   tmp_out = 2'b10;
            2'b10:   tmp_out = 2'b01;
            2'b11:   tmp_out = 2'b11;
            default: tmp_out = 2'b00;
        endcase
    end

Figure 4.6 Verilog code showing the logic behind implementation of constellation demapper

4.6 DE-INTERLEAVER

In the previous chapter interleaving was defined as a process in which bits, within a block of 128 bits, are re-arranged in order to avoid burst errors. De-interleaving performs the inverse task: it re-arranges the interleaved bits into their original order.

Recall the row-column method of interleaving discussed in the previous chapter. De-interleaving is done the same way, the difference being that the number of rows and the number of columns are interchanged. For example, if we perform interleaving on a block of 16 bits using a matrix with 8 rows and 2 columns, then the interleaved pattern can be de-interleaved using a matrix with 2 rows and 8 columns.

Hence the only difference between the hardware architectures of the interleaver and the de-interleaver is the contents of the address ROM, which provides the read addresses to the RAM that stores the data to be de-interleaved. Table 4.4 shows the new contents of the address ROM for the de-interleaver.
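The contents listed in Table 4.4 follow a regular pattern: entry i holds 8 * (i mod 8) + floor(i / 8), which for a 6-bit location is simply the upper and lower three bits exchanged. The sketch below expresses that pattern directly; it is an illustration of the row-column rule, not the project's ROM, and the module and port names are hypothetical.

```verilog
// Illustrative generator for the de-interleaver read addresses of Table 4.4
// (hypothetical names; the design itself stores these values in a ROM).
module deint_addr_sketch (
    input  [5:0] location,  // ROM location, 0..63
    output [5:0] contents   // value Table 4.4 lists for that location
);
    // contents = 8 * (location mod 8) + floor(location / 8)
    assign contents = {location[2:0], location[5:3]};
endmodule

// Small self-checking style test bench for a few table entries.
module deint_addr_sketch_tb;
    reg  [5:0] location;
    wire [5:0] contents;

    deint_addr_sketch dut (.location(location), .contents(contents));

    initial begin
        location = 6'd1;  #1 $display("%0d -> %0d", location, contents); // 1 -> 8
        location = 6'd8;  #1 $display("%0d -> %0d", location, contents); // 8 -> 1
        location = 6'd57; #1 $display("%0d -> %0d", location, contents); // 57 -> 15
        $finish;
    end
endmodule
```

Whether a ROM or this wiring is cheaper depends on the target device; the ROM has the advantage that the interleaver and de-interleaver can share one architecture and differ only in contents, as the text notes.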
Table 4.4 Contents of Address ROM (in De-Interleaver)
(ROM location and contents are both in decimal)

    Location  Contents   Location  Contents   Location  Contents   Location  Contents
    0         0          16        2          32        4          48        6
    1         8          17        10         33        12         49        14
    2         16         18        18         34        20         50        22
    3         24         19        26         35        28         51        30
    4         32         20        34         36        36         52        38
    5         40         21        42         37        44         53        46
    6         48         22        50         38        52         54        54
    7         56         23        58         39        60         55        62
    8         1          24        3          40        5          56        7
    9         9          25        11         41        13         57        15
    10        17         26        19         42        21         58        23
    11        25         27        27         43        29         59        31
    12        33         28        35         44        37         60        39
    13        41         29        43         45        45         61        47
    14        49         30        51         46        53         62        55
    15        57         31        59         47        61         63        63

4.7 VITERBI DECODER

The Viterbi decoder decodes convolutional codes. We have used Altera's Viterbi Decoder IP core in our design. Altera's Viterbi IP core is a parameterized, synthesizable IP core that allows for parallel as well as hybrid implementation of the Viterbi decoder.

4.8 REED SOLOMON DECODER

The Reed Solomon decoder decodes the codes generated by the Reed Solomon encoder. For the implementation of the Reed Solomon decoder we have again used Altera's Reed Solomon Decoder IP.

4.9 DESCRAMBLER

This block simply descrambles the scrambled data.

4.9.1 DESCRAMBLER DESIGN

Figure 4.7 shows the input/output parameters of the descrambler. A bit is latched in at the positive edge of the clock. See Table 4.5 for a description of the signals.

Figure 4.7 De-scrambler I/O diagram

Table 4.5 De-scrambler signal descriptions

    Signal Name   Type     Width   Description
    in            Input    1       Input data to the descrambler
    clock         Input    1       Positive edge clock
    arst_n        Input    1       Asynchronous reset (negative edged)
    enable        Input    1       If high, input is present on the line in
    out           Output   1       Output data

Figure 4.8 shows the descrambler. Note that its structure is almost the same as that of the scrambler.

Figure 4.8 De-scrambler logic diagram

[Logic diagram: register stages numbered 6 down to 0 with two modulo-2 adders between in and out]

CHAPTER 5
SIMULATION, SYNTHESIS AND RESULTS

5.1 INTRODUCTION

This chapter discusses the simulation results obtained from ModelSim with random input samples, and also the important synthesis results obtained from Quartus II. The accuracy of the output has been compared to the output from MATLAB simulation. The results are divided into two sections, one for the OFDM transmitter and one for the OFDM receiver. The output from each of the modules is shown, followed by the overall output.

5.2 SIMULATION OF OFDM TRANSMITTER

5.2.1 SCRAMBLER

To verify proper functioning, the scrambler was initially loaded with a seed value of 1110101 and the following input bit stream was given to it:

    in: 0110101000

The output was:

    out: 1101110001

Figure 5.1 Scrambler simulation results

After a dry run of the scrambler using high-level modelling in Verilog, it was verified that the output was correct.

5.2.2 REED SOLOMON ENCODER

In order to check the proper functioning of the Reed Solomon encoder, a test bench was written in Verilog. The input given to the encoder through the test bench was a string of 36 alternating bits (9 symbols) starting with 0, such that:

    in: 555555555H

It is well known in the art that if all the input symbols to a Reed Solomon encoder are identical, then the parity symbols will all be identical as well and will be equal to the input symbols. Therefore, the output turned out to be:

    out: 555555555555555H

Figure 5.2 Reed Solomon Encoder simulation results

Other input combinations were also given, and the desired results were achieved, which verified proper functioning of the encoder.

5.2.3 CONVOLUTIONAL ENCODER

Simulating the Verilog code of the convolutional encoder generated the waveform shown in Figure 5.3. It can be seen that first of all a low pulse was given to the arst_n (reset) input in order to initialize the shift register with all zeroes. Next, the following bit stream was given at the input:

    in: 1011101

The output turned out to be:

    out: 11010001011100
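The bit stream quoted above is what a rate-1/2, constraint-length-7 encoder with the widely used generator polynomials 133 and 171 (octal) produces from an all-zero state. The sketch below assumes those generators and that output ordering; they are inferred from the test vector, not taken from the project's RTL, and the module and port names are hypothetical.

```verilog
// Illustrative rate-1/2 convolutional encoder, K = 7, generators 133/171 octal
// (assumed from the quoted test vector; names are hypothetical).
module conv_encoder_sketch (
    input        clock,
    input        arst_n,  // asynchronous reset, active low: clears the shift register
    input        enable,  // high when a new bit is present on in
    input        in,
    output [1:0] out      // out[1] is transmitted first, then out[0]
);
    reg [5:0] sr;         // sr[5] = previous input bit, ..., sr[0] = bit from 6 clocks ago

    // window[6] is the current bit, window[0] the oldest bit.
    wire [6:0] window = {in, sr};

    // 133 octal = 1011011, 171 octal = 1111001 (MSB aligned with the current bit).
    assign out[1] = ^(window & 7'b1011011);
    assign out[0] = ^(window & 7'b1111001);

    always @(posedge clock or negedge arst_n) begin
        if (!arst_n)
            sr <= 6'd0;
        else if (enable)
            sr <= window[6:1];   // shift the current bit in
    end
endmodule
```

Feeding the bits 1, 0, 1, 1, 1, 0, 1 after reset gives the output pairs 11, 01, 00, 01, 01, 11, 00, which concatenate to the 14-bit stream reported in the text. The sketch presents both coded bits in parallel; a serial output would need an additional multiplexer running at twice the input rate.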

Figure 5.3 Simulation Waveform of the Convolutional Encoder

For a 7-bit input a 14-bit output is generated. Figure 5.3 shows the resultant waveforms after the simulation of the convolutional encoder. Once again this circuit was taken through a dry run using high-level modelling in Verilog and the results were verified.

5.2.4 INTERLEAVER

The waveform for the interleaver extends to 128 clock cycles and is therefore not shown here. For an input block of data containing alternating 1s and 0s the output was:

    out: 0000000011111111000000001111111100000000... and so on

This clearly shows how the bit positions have been changed.

5.2.5 CONSTELLATION MAPPER

The following waveform shows that when an input of 10 was given to the constellation mapper, the output was:

    out: 00b504ff4afch

which is correct according to Table 3.4.
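The hexadecimal word can be checked by hand. The working below assumes that each 24-bit half is a two's-complement number with 16 fractional bits and that the real part occupies the upper half; this format is inferred from the values quoted in the report rather than stated in it.

```latex
\begin{align*}
\texttt{00B504}_{16} &= 46340, &
\frac{46340}{2^{16}} &\approx 0.7071 \\[4pt]
\texttt{FF4AFC}_{16} &= 16730876, &
\frac{16730876 - 2^{24}}{2^{16}} &= \frac{-46340}{2^{16}} \approx -0.7071
\end{align*}
```

Under that assumption the word 00B504FF4AFC represents 0.707 - j0.707, the constellation point listed for data value 10.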

Figure 5.4 Constellation Mapper simulation results

5.2.6 IFFT

The IFFT was tested by giving the following 64 complex data points:

    h00b504000000, h030000000000, h00b504000000, ..., h00b504000000

which is equivalent to:

    0.707, 3, 0.707, ..., 0.707

Figure 5.5 IFFT simulation results

The outputs were h2f8bc000000, h5db504000000, h0000005db504 and so on. On verification with MATLAB the results turned out to be correct.

5.2.7 CYCLIC PREFIX ADDER

The inputs given to the cyclic prefix adder were:

    47'h000000100101, 47'h000010100001, 47'h001110100101, 47'h110010100101,
    47'h000010100101, 47'h010101000101, 47'h011110100101, 47'h000011100101 ...
    47'h000011100101

The outputs turned out to be:

    47'h000011100101, 47'h000011100101, 47'h000011100101, 47'h000011100101,
    47'h000011100101, 47'h000011100101, 47'h000011100101, 47'h000011100101,
    47'h000000100101, 47'h000010100001, 47'h001110100101, 47'h110010100101,
    47'h000010100101, 47'h010101000101, 47'h011110100101, 47'h000011100101 ...
    47'h000011100101

Note that the first eight outputs are actually the last eight inputs and the rest of the output points are the same as the inputs. The following waveform shows the same.

Figure 5.6 Cyclic Prefix Adder simulation result

5.3 SYNTHESIS OF OFDM TRANSMITTER

Table 5.1 shows some important synthesis results for each module of the OFDM transmitter.

Table 5.1 Important Synthesis results for the OFDM Transmitter

    Module (Entity)          Number of logic elements   Number of memory bits
    Scrambler                17                         0
    Reed Solomon Encoder     49                         0
    Convolutional Encoder    10                         0
    Interleaver              38                         640
    Constellation Mapper     0                          192
    IFFT                     1992                       6336
    Cyclic Prefix Adder      84                         6528
    Total                    2190                       13696

5.4 SIMULATION OF OFDM RECEIVER

The cyclic prefix remover simply removes the cyclic portion added at the transmitting end, and the simulation of the next block, the FFT, is similar to that of the IFFT, so it is not shown. In addition to these blocks, the simulations of the Viterbi decoder and the Reed Solomon decoder are also not shown because their IP cores are used in the project.
Simulation results for the Constellation De-Mapper, De-Interleaver and De-Scrambler follow.

5.4.1 CONSTELLATION DE-MAPPER

As described in previous chapters, the constellation demapper maps the incoming QPSK constellation points to actual data according to Table 4.2. For the inputs

    h00b50400b504 (which is 0.707 + j0.707) and
    hFF4AFC00B504 (which is -0.707 + j0.707)

the outputs turned out to be

    00 and 01

as shown in Figure 5.7. The results are in accordance with Table 4.2.

Figure 5.7 Constellation De-Mapper simulation results

5.4.2 DE-INTERLEAVER

Just like that of the interleaver, the simulation waveform of the de-interleaver extends to 128 cycles, so it cannot be shown here.

5.4.3 DE-SCRAMBLER

The inverse of scrambling is done by the de-scrambler. For the input

    b111111111000000000

the output was

    b110111111111000010

which is shown in Figure 5.8. The output has been verified using MATLAB (using the scrambler block in Simulink).

Figure 5.8 De-Scrambler simulation results

5.5 SYNTHESIS OF OFDM RECEIVER

Table 5.2 Important Synthesis results for the OFDM Receiver

    Module (Entity)            Number of logic elements   Number of memory bits
    DeScrambler                19                         0
    Reed Solomon Decoder       200                        0
    Viterbi Decoder            900                        256
    De-Interleaver             38                         640
    Constellation De-Mapper    100                        0
    FFT                        1992                       6336
    Total                      5439                       7232

REFERENCES

[1] Ahmed R. S. Bahai and Burton R. Saltzberg, Multi-Carrier Digital Communications. Kluwer Academic Publishers, 2002.
[2] Scrambler (Randomizer), Wikipedia the free encyclopedia, http://en.wikipedia.org/wiki/Scrambler_%28randomizer%29.
[3] Encoding-Decoding Reed Solomon codes, Adina Matache, Department of Electrical Engineering, University of Washington, http://www.ee.ucla.edu/~matache/rsc/node3.html#SECTION00021000000000000000.
[4] A Tutorial on Convolutional Coding with Viterbi Decoding, Spectrum Applications, http://home.netcom.com/~chip.f/viterbi/tutorial.html.
[5] Interleaver, Wikipedia the free encyclopedia, http://en.wikipedia.org/wiki/Interleaver.
[6] Jeffrey G. Andrews, Rias Muhammad, Fundamentals of WiMAX. Prentice Hall Communications Engineering, 2006.
[7] Aseem Pandey, Shyam Ratan Agrawalla and Shrikant Manivannan, VLSI Implementation of OFDM, Wipro Technologies, September 2002.
[8] Dusan Matic, OFDM as a possible modulation technique for multimedia applications in the range of mm waves, TUD-TVS, 1998.
[9] J. L. Holsinger, Digital communication over fixed time-continuous channels with memory, with special application to telephone channels, PhD thesis, Massachusetts Institute of Technology, 1964.
[10] R. W. Chang, Synthesis of band-limited orthogonal signals for multichannel data transmission, Bell Systems Technical Journal, 45:1775-1796, December 1966.
[11] R. G. Gallager, Information Theory and Reliable Communications. Wiley, 1968.
[12] S. Weinstein and P. Ebert, Data transmission by frequency-division multiplexing using the discrete Fourier transform, IEEE Transactions on Communications, 19(5):628-634, October 1971.
[13] L. J. Cimini, Analysis and simulation of a digital mobile channel using orthogonal frequency division multiplexing, IEEE Transactions on Communications, 33(7):665-675, July 1985.
[14] Lattice Semiconductor white paper, Implementing WiMAX OFDM Timing and Frequency Offset Estimation in Lattice FPGAs, 2005.
[15] Doelz, M. L., Heald, E. T. and Martin, D. L., "Binary Data Transmission Techniques for Linear Systems." Proc. I.R.E., 45: 656-661, May 1957.
[16] S. B. Weinstein and P. M. Ebert, Data transmission by frequency-division multiplexing using the discrete Fourier transform, IEEE Trans. Communications, COM-19(5): 628-634, Oct. 1971.
[17] Orthogonal Frequency Division Multiplexing Tutorial, Intuitive Guide to Principles of Communications, http://www.complextoreal.com
[18] Magis Networks, Inc. white paper, Orthogonal Frequency Division Multiplexing (OFDM) Explained, 2001.
[19] Orthogonal Frequency-Division Multiplexing (OFDM), the International Union of Radio Science (URSI), Lulea University of Technology, 2002.
[20] Michael D. Ciletti, Advanced Digital Design with the Verilog HDL, Xilinx Design Series. Prentice Hall, 2002.
[21] Reed Solomon error-correction code, Wikipedia the free encyclopedia, http://en.wikipedia.org/wiki/Reed-Solomon_error_correction
[22] Bernard Sklar, Digital Communications - Fundamentals and Applications. Communication Engineering Services, Tarzana, California, 2003.
[23] Interleaver, Wikipedia the free encyclopedia, http://en.wikipedia.org/wiki/Interleaver
[24] DFT, Wikipedia the free encyclopedia, en.wikipedia.org/wiki/Discrete_Fourier_transform
[25] Fast Fourier Transform, Wolfram MathWorld, mathworld.wolfram.com/FastFourierTransform.html

APPENDIX A
RTL CODE IN VERILOG FOR OFDM TRANSMITTER

    //**********************************************
    // OFDM System - OFDM Transmitter and
    // Receiver
    //**********************************************
    module OFDMSystem (
        input  in,
        input  clock,
        input  arst_n,
        output TxD,
        output start_output
    );

        wire        clock1, clock2;
        wire        out_fifo, readreq;
        wire        wrempty;
        wire        readfull, readempty;
        wire [7:0]  in_data;
        wire        wrfull;
        wire [7:0]  q;
        wire [47:0] out_data;
        wire        idle, RxD_data_ready;
        wire        rdempty1, rdfull1, wrempty1, wrfull1;
        wire [9:0]  rdusedw;
        wire [6:0]  wrusedw;
        wire        TxD_busy, startserialtrans;
        reg         start_serialtrans, start_trans;
        reg  [2:0]  skipbytecount;

        //******************************************
        // PLL
        //******************************************
        PLL pll (clock, clock1, clock2);

        //*****************************************
        // FIFO
        //*****************************************
        fifo input_data_fifo (
            in_data, clock1, readreq, clock2, RxD_data_ready,
            out_fifo, readempty, readfull, wrempty, wrfull
        );

        //***************************************
        // OFDM Transmitter module
        //***************************************
        OFDM_transmitter transmitter (
            clock1, arst_n, out_fifo, out_data,
            wrfull, readreq, readempty, start_output
        );

        //****************************************
        // RS-232 Asynchronous Receiver
        //****************************************
        async_receiver SerialReceiver (
            clock2, arst_n, in, RxD_data_ready, in_data, idle
        );

        //****************************************
        // RS-232 Asynchronous Transmitter
        //****************************************
        async_transmitter serialtrans (
            clock2, arst_n, start_trans, q, TxD, TxD_busy
        );

        //*******************************************
        // FIFO for storing transmitter's
