of 68/68
Issue 72 Third Quarter 2010 Xcell journal Xcell journal SOLUTIONS FOR A PROGRAMMABLE WORLD SOLUTIONS FOR A PROGRAMMABLE WORLD INSIDE Xilinx Rad-Hard FPGA Reaches for the Stars Biometrics App Rides Reconfigurable Hardware How to Maintain Repeatable Results in Xilinx FPGA Designs Timing Constraints Tutorial ISE Design Suite 12.2 Tips 4th-Generation Partial Reconfiguration INSIDE Xilinx Rad-Hard FPGA Reaches for the Stars Biometrics App Rides Reconfigurable Hardware How to Maintain Repeatable Results in Xilinx FPGA Designs Timing Constraints Tutorial ISE Design Suite 12.2 Tips 4th-Generation Partial Reconfiguration www.xilinx.com/xcell/ Xilinx 7 Series: FPGAs Make Play for Logic IC Dominance Xilinx 7 Series: FPGAs Make Play for Logic IC Dominance

Xcell Journal Issue 72 - Xilinx - All Programmable · Issue 72 Xcell journal Third Quarter 2010 SOLUTIONS FOR A PROGRAMMABLE WORLD INSIDE Xilinx Rad-Hard FPGA Reaches for the Stars

  • View
    213

  • Download
    0

Embed Size (px)

Text of Xcell Journal Issue 72 - Xilinx - All Programmable · Issue 72 Xcell journal Third Quarter 2010...

  • Issue 72Third Quarter 2010

    Xcell journalXcell journalS O L U T I O N S F O R A P R O G R A M M A B L E W O R L DS O L U T I O N S F O R A P R O G R A M M A B L E W O R L D

    INSIDE

    Xilinx Rad-Hard FPGA Reaches for the Stars

    Biometrics App RidesReconfigurable Hardware

    How to Maintain RepeatableResults in Xilinx FPGA Designs

    Timing Constraints Tutorial

    ISE Design Suite 12.2 Tips 4th-Generation Partial Reconfiguration

    INSIDE

    Xilinx Rad-Hard FPGA Reaches for the Stars

    Biometrics App RidesReconfigurable Hardware

    How to Maintain RepeatableResults in Xilinx FPGA Designs

    Timing Constraints Tutorial

    ISE Design Suite 12.2 Tips 4th-Generation Partial Reconfiguration

    www.xilinx.com/xcell/

    Xilinx 7 Series: FPGAs Make Play for Logic IC Dominance

    Xilinx 7 Series: FPGAs Make Play for Logic IC Dominance

  • Learn more about the new Spartan-6 and Virtex-6 FPGA baseboards and FMC modules designed by Avnet at www.em.avnet.com/drc

    Avnet Electronics Marketing introduces three new development kits

    based on the Xilinx Targeted Design Platform (TDP) methodology.

    Designers now have access to the silicon, software tools and

    reference designs needed to quickly ramp up new designs. This

    approach accelerates time-to-market and allows you to focus on

    creating truly differentiated products.

    Critical to the TDP methodology is the FPGA Mezzanine Card

    (FMC) from the VITA standards body. Avnet has collaborated with

    several industry-leading semiconductor manufacturers to create a

    host of FMC modules that add functionality and interfaces to the

    new baseboards, allowing for easy customization to meet design-

    specific requirements.

    ©Avnet, Inc. 2010. All rights reserved. AVNET is a registered trademark of Avnet, Inc.

    New baseboards for Spartan®-6and Virtex®-6 FPGAs» Spartan-6 LX16 Evaluation Kit» Spartan-6 LX150T Development Kit» Virtex-6 LX130T Development Kit

    New FMC Modules for Baseboards

    » Dual Image Sensor FMC

    » DVI I/O FMC

    » Industrial Ethernet FMC

    More are soon to be released!

    Development kits help ramp up new Spartan®-6 or Virtex®-6 FPGA designs

    1 800 332 8638www.em.avnet.com

  • L E T T E R F R O M T H E P U B L I S H E R

    Xilinx, Inc.2100 Logic DriveSan Jose, CA 95124-3400Phone: 408-559-7778FAX: 408-879-4780www.xilinx.com/xcell/

    © 2010 Xilinx, Inc. All rights reserved. XILINX, the Xilinx Logo, and other designated brands includedherein are trademarks of Xilinx, Inc. All other trade-marks are the property of their respective owners.

    The articles, information, and other materials includedin this issue are provided solely for the convenience ofour readers. Xilinx makes no warranties, express,implied, statutory, or otherwise, and accepts no liabilitywith respect to any such articles, information, or othermaterials or their use, and any use thereof is solely atthe risk of the user. Any person or entity using suchinformation in any way releases and waives any claim itmight have against Xilinx for any loss, damage, orexpense caused thereby.

    PUBLISHER Mike [email protected]

    EDITOR Jacqueline Damian

    ART DIRECTOR Scott Blair

    DESIGN/PRODUCTION Teie, Gelwicks & Associates1-800-493-5551

    ADVERTISING SALES Dan [email protected]

    INTERNATIONAL Melissa Zhang, Asia [email protected]

    Christelle Moraga, Europe/Middle East/[email protected]

    Miyuki Takegoshi, [email protected]

    REPRINT ORDERS 1-800-493-5551

    Xcell journal

    www.xilinx.com/xcell/

    Teens to Technologists: Thanks for a Great Childhood’ve seen many things over my years covering the electronics industry, but at the DesignAutomation Conference this past June, I witnessed a first: a group of EEs and EDA folks, eyesabrim with tears of pride. What could cause such a reaction in what would otherwise seem a

    prosaic event? No, it wasn’t a leak from one of the foundry demos, nor an announcement that allEDA tools will henceforth be government subsidized. In fact, the moment came in the closing sec-onds of an event called “High School Panel: You Don’t Know Jack,” when four very bright teen pan-elists looked over the crowd and then did the unexpected: thanked them. “Thank all of you for thechips and technologies you create—thank you for making my childhood so great,” said one of them,without a hint of irony. The gesture prompted engineers from Broadcom, Qualcom, Nvidia,Cadence and Synopsys alike…and yours truly…to all tear up.

    “High School Panel: You Don’t Know Jack” is becoming a regular highlight of the DesignAutomation Conference’s Pavilion Panel series. As in years past, this year’s panel featured JasperDesign Automation CEO Kathryn Kranen interviewing four teenagers (two girls and two boys)about their technology usage, what products are in, what products are out and what features theywould like to see in future gadgets. The panel is meant to give attendees a glimpse into the tech-nology usage of this finicky yet vitally important set of consumers and purchasing influencers. Thisyear’s foursome was exceptionally impressive and surprisingly gracious. Over years of exposure totechnology and social media, these kids have become master multitaskers while still maintainingstellar GPAs (three are off to prominent colleges, while the fourth has one year of high school left).

    If you are the parent of a teen, you probably won’t find it too shocking that all four panelistsdescribed how, from the moment school lets out, they immediately connect to the Internet, mostlyvia laptops. “I have to stay connected,” said one boy. A fellow panelist boots her laptop and down-loads the photos she took that day from her cell phone or camera. She opens the photo files in AdobePhotoshop to airbrush any skin blemishes and then downloads those modified pictures onto herFacebook page. All four panelists said they have tens of photo albums on Facebook and have friendswith hundreds. Increasingly, they are adding video to their Facebook profiles or launching their ownYouTube channels. Facebook is the hub of their social lives because it allows them, as one panelistsummarized, “to see what my friends are doing and what outings I was not invited to attend.”

    The kids gave Facebook, Twitter, YouTube and Hulu big thumbs-up, while giving thumbs-downto the once popular MySpace, which panelists said has turned into a site to merely hear band demos.Panelists liked the iPhone, even though none of them own one because of the relatively high price ofthe phone and the access plan. But they did not like the iPad—“it’s just a big iPod Touch that I can’tfit into my pocket,” said one panelist. Panelists also said they prefer laptops over desktops but notethat desktops are more reliable and are upgradable, which is good for power gaming. They had mixedfeelings about TV, indicating they almost never watch TV on the flat-screen anymore but insteadcatch their favorite shows at a time of their choosing on the Web, typically via Hulu or YouTube.

    The technology improvements these young people would most like to see are largely in line withthe top IC and system design challenges of the day. Longer battery life topped the list. A close sec-ond was devices and applications that better facilitate multitasking. “I have six different IM/chatsthat I use regularly and it’s hard to use all those at once,” said one panelist. “I’d like to have them allin one place.”

    What all these data points mean, I’ll leave you to interpret. But certainly one thing is clear. Thetechnology you create is having a remarkable effect on our youth and, seemingly, the future they willmarch into. And if these DAC panelists are any indication, it’s a future that we can all be proud of.

    I

    Mike SantariniPublisher

  • C O N T E N T S

    VIEWPOINTS XCELLENCE BY DESIGN APPLICATION FEATURES 1212

    2424

    Letter from the Publisher Teen Panelists at DAC Say Thanks for the Memories…3

    Xcellence in A&DNew Xilinx Rad-Hard FPGA Reaches for the Stars…12

    Xcellence in Automotive Multiple MicroBlazes Ease Integration in Real-Time System…18

    Xcellence in ISM Biometrics May Be Killer App for Dynamic Partial Reconfiguration…24

    Xcellence in CommunicationsA Better Crypto Engine, the Programmable Way…32

    Xcellence in Wireless CommunicationsLTE Simulator Rides Xilinx Virtex-5 FPGAs…36

    Cover StoryXilinx Redefines State of the Art with New 7 Series FPGAs

    66

    1818

  • T H I R D Q U A R T E R 2 0 1 0 , I S S U E 7 2

    Xperts Corner Maintaining Repeatable Results in Xilinx FPGA Designs…40

    Xplanation: FPGA 101A Tutorial on Timing Constraints…46

    Xplanation: FPGA 101Simplifying Metastability with IDDR…52

    XTRA READING

    THE XILINX XPERIENCE FEATURES4040

    Xtra, Xtra The latest Xilinx tool updates and patches, as of July 2010…55

    Are You Xperienced? Training at the cusp of the programmable imperative…58

    Xamples… A mix of new and popular application notes…60

    Tools of Xcellence A bit of news about our partners and their latest offerings…62

    Xclamations! Share your wit and wisdom by supplying a caption for our wild and wacky artwork…66

    4646

    6666

    Xcell Journal recently received 2010 APEX Awards of Excellence in the

    categories “Magazine & Journal Writing” and “Magazine & Journal Design and Layout”.

  • 6 Xcell Journal Third Quarter 2010

    Xilinx Redefines State of the Art With New 7 Series FPGAs Xilinx Redefines State of the Art With New 7 Series FPGAs

    COVER STORY

  • The new 7 series is the first XilinxFPGA family created entirely under thewatch of Gavrielov, who joined the compa-ny in late 2007 after serving as a CEO ofVerisity, a design tool provider. Before that,he worked for many years in managementat ASIC house LSI Logic. Gavrielov has setXilinx on an aggressive path to growth,with the main driver being an industry-leading line of FPGAs, which culminates inthe 7 series, and the Targeted DesignPlatform strategy (see cover story, XcellJournal No. 68; http://www.xilinx.com/publications/archives/xcell/Xcell68.pdf ).

    To enable this growth, the 7 series boastsseveral significant refinements, including anew unified and scalable architecture, a pri-mary emphasis on power reduction andmassive capacity, enabling better overall sys-tem performance (Figure 1).

    Starts With a Unified ArchitectureUp until the introduction of the 7 series,Xilinx’s FPGA landscape has primarilycentered on the high-performance Virtexfamily and the high-volume Spartan®

    family. When Xilinx originally intro-duced these two lines in the late 1990s,the Virtex and Spartan devices used radi-cally different architectures. From a userperspective, the two families had notabledifferences, and so did the IP for eachdevice and the design experience in work-ing with them. If you wanted to increasethe size of your end product from aSpartan design to a Virtex design or viceversa, the differences in architecture, IPand pin counts became apparent.

    But with the unified architecture of the7 series, those variances disappear. Withthe 7 series, Xilinx will not be introducinga new device under the Spartan name andhas instead developed a complete lineup ofFPGAs—primarily in three families, fromlowest cost to highest performance—allbased on the familiar Virtex FPGA archi-tecture (Figure 2).

    Virtex remains the moniker for the 7series’ highest-end FPGAs. This newVirtex-7 family delivers breakthroughcapacity with up to 2 million logic cells andbetter than twofold system performanceimprovement over previous generations.

    by Mike SantariniPublisher, Xcell JournalXilinx, [email protected]

    FPGAs have advanced remarkably ever sincethey first hit the market in the mid-1980s as1,500-ASIC-gate-equivalent devices. Twodecades later, with the launch of Xilinx’s new7 series, the FPGA stands poised to fulfill itshistoric promise of one day displacing ASICsas the electronics industry’s mainstream logicIC. With the introduction of the 7 seriesFPGAs, Xilinx® is transitioning from beingjust a PLD maker to a premier supplier oflogic ICs by providing lower total cost ofownership for low- to medium-volumeapplications and equivalent total cost ofownership for higher-volume applicationstraditionally addressed by ASICs and ASSPs.What’s more, this total-cost-of-ownershipbenefit combines with the traditional FPGAadvantages of faster time-to-market and riskreduction. Together, all of these factors meanthat FPGAs are emerging as the de factologic IC solution for most applications.

    As part of the 7 series release, Xilinx willbring to market an unprecedented 2 mil-lion-logic-cell FPGA, which is 2.5x thecapacity of the largest Virtex®-6 device.Depending on whom you ask, how youdesign and what application you are target-ing, that means the largest 7 series FPGAdelivers the clout of anywhere from 15 mil-lion to 40 million equivalent ASIC gates.Thus, in the last 10 years, Xilinx hasincreased the capacity of its FPGAs by morethan 30x at equivalent price points of thedevices it produced 10 years ago.

    But a huge capacity increase is onlythe beginning of the 7 series story. Thesebeefy FPGAs run faster than the previ-ous-generation Virtex-6, but at half thepower consumption.

    “ASICs are not dead, nor will they dieentirely, but they really are only viable fora very small number of applications thathave the very highest volumes,” saidMoshe Gavrielov, Xilinx’s chief executiveofficer. “Where once you had to ask whywould you go with an FPGA, today youhave to seriously ask yourself why wouldn’twe use an FPGA?”

    Third Quarter 2010 Xcell Journal 7

    Three families of 28-nm devices attack the mainstream and high-end ASIC and ASSP markets.

    COVER STORY

  • For a smooth transition from Spartan-6FPGAs in the low-cost market, the newArtix™-7 family leads the industry inprice, power and small form factor for cost-sensitive, low-power applications.

    The final member of the trio smoothlyfills the space between the Virtex-7 on thehigh end and the Artix-7 line at the mass-market level. The Kintex™-7 introduces anew price/performance advantage andgives Xilinx a platform for displacing main-stream ASICs and ASSPs.

    Victor Peng, senior vice president forprogrammable platforms development at

    Xilinx, predicts that having a solidmidrange product in the form of Kintex-7will allow Xilinx to offer a comprehensiveFPGA lineup that is more highly applica-tion targeted.

    “In previous generations, Xilinx wouldfill that middle ground by creating a high-er-performance, higher-capacity version ofSpartan and at the same time a lower-cost,lower-capacity and lower-performance ver-sion of Virtex,” said Peng. “But theSpartan and Virtex architectures, IP andpin counts were very different. With theArtix, Kintex and Virtex families all built

    in a unified 7 series architecture, cus-tomers will find it much easier to migratetheir designs across the families, enablingbetter leverage of their developmentinvestments in IP.”

    They will also be able to migrate designblocks in the 7 series to the logic portion ofthe upcoming Extensible ProcessingPlatform (see cover story, Xcell JournalNo. 71; http://www.xilinx.com/publications/archives/xcell/Xcell71.pdf ), because the EPPand 7 series devices all use the sameVirtex logic architecture structure. Further,the common logic architecture supports

    the ARM AXI4 (Advanced ExtensibleInterface) protocol. This means that Xilinx’sinternal IP developers and hundreds of IPpartners can more easily target and imple-ment AXI-compliant IP on Xilinx FPGAs.Chances are many customers have theirown IP already built to comply with AXI,further facilitating moving designs fromASIC or ASSP to 7 series FPGAs.

    Peng notes that in addition to offeringgreat benefits for customers and IP part-ners, the unified architecture allowsXilinx to become more focused andaligned in all product development efforts

    going forward. “It means our organizationcan focus on doing things once ratherthan twice,” said Peng.

    28-nm HPL: Right Mix of Power, Capacity and Performance With the new 7 series family, Xilinx hasalso modified its manufacturing strategy tobetter align with the realities of modern ICdesign by choosing to implement itsdevices on a newly refined high-k metalgate (HKMG) high-performance, low-power (HPL) process with Taiwanesefoundry TSMC.

    Traditionally, FPGA vendors have imple-mented their designs on the highest-per-formance variation of each new siliconprocess as fast as foundries could make theprocesses available. However, starting with90-nm process technologies, leakage startedto become a big problem. It only got worseat 65 nm and 40 nm. At the 28-nm processnode, if unaddressed, leakage current canaccount for well over 50 percent of a device’spower consumption. In addition to usingpower when a device isn’t running, duringoperation leakage creates extra heat, whichin turn increases the leakage. Especially incontinual-use, high-performance applica-tions, this vicious cycle can lead to shorteneddevice lifetimes and catastrophic IC failures.This greatly impacts the viability of using anFPGA in a given application as well as thereliability of a system.

    The foundries have made remarkablestrides to stem the leakage in their high-performance processes at 28 nm. Xilinxworked with its new foundry partner,TSMC, to refine the foundry’s newHKMG HPL process for the 7 seriesFPGA, emphasizing low power combinedwith the usual gains in capacity and systemperformance when shrinking the geometry.

    Peng said that by going with the HPLrather than the HP process, Xilinx willreduce static power by 50 percent with lessthan a 3 percent impact on performance.The use of the HPL process combined withthe comprehensive power-savings enhance-ments implemented in the 7 series results ina 50 percent reduction in total power com-pared with devices at the same densities inthe last generation of products.

    8 Xcell Journal Third Quarter 2010

    COVER STORY

    2x Price / Performance

    FPGAMarket

    2x PowerReduction

    2x System Performance

    2.5xCapacity

    ASICs/ASSPMarket

    40nm

    7 Series(28nm)

    Figure 1 – The new Xilinx 7 series pushes the boundaries for FPGAs.

  • Third Quarter 2010 Xcell Journal 9

    The 50 percent lower-power benefitgives design teams two options, said Peng.“You either run a similar-size Virtex-6 orSpartan-6 design at half the power in the 7series, or you can double the size of the logicfunctions in your [new] design and remainat your previous power budget,” he said. “Bygoing with the HPL process, we have givencustomers much more usable performanceas well as more logic gates to implementmore functions in their designs.”

    Xilinx CEO Gavrielov notes that bychoosing the higher-capacity but lower-power variant of the 28-nm process, Xilinxis leading the FPGA industry in aligningwith the microprocessor industry. Almost adecade ago, MPU makers realized thatcranking up clock rates in these newerprocess geometries would only createextremely leaky, thermally challengeddevices that would fail.

    “We learned from the processor side ofthe semiconductor business that given therealities of processes today, the best way to

    achieve performance is through higherintegration and efficiency, as opposed tosimply making things move faster,” saidGavrielov. “With today’s processes, if youjust make things faster, you drain morepower and create thermal problems—which degrades power and performance.We need to pay a lot of attention to end-customer applications and ensure we strikethe right balance between meeting low-power needs while simultaneously meeting

    the system performance needs of the appli-cation. We believe we will deliver a greatvalue proposition with the 7 series FPGAsthat will delight customers.”

    Peng notes that had Xilinx gone withan HP process for an incremental clockspeedup, the significant increase in powerfor the fairly insignificant increase in per-formance would have burdened users withpaying extra attention to designing aroundpower and thermal issues. They mighthave incurred extra systems costs associat-ed with adding elaborate heat spreading or

    even fan or liquid cooling and relatedpower circuitry to the end system.

    HPL is just one of about a dozen tech-nologies Xilinx employs to reduce power inthe 7 series, Gavrielov said. For example,Xilinx reduced configuration logic voltagefrom 2.5 to 1.8 V, and optimized each of thehard blocks—DSP, Block RAM, SelectIO™and others—using HVT, RVT and LVT tran-sistors to reduce static power while optimiz-ing performance and area. As a result, each

    DSP slice consumes 1/12 the power of theequivalent logic implementation. By optimiz-ing the ratio of these tightly integrated hardblocks throughout the FPGA fabric, Xilinxwas able to achieve the greatest performanceand lowest power while preserving flexibility.

    Customers can also use the intelligentclock-gating feature introduced in the ISE®

    Design Suite 12 to give their 7 series designsan additional 20 percent reduction indynamic power consumption. And finally,users can get a dramatic savings in powerconsumption by leveraging Xilinx’s fourth-

    COVER STORY

    Unpublished Work © Copyright 2009 Xilinx

    3 New Families Based on a Unified Architecture

    Industry’s Best Price / Performance

    Compared to Virtex-6

    ✔ Comparable performance

    ✔ 50% lower cost

    ✔ 50% less power

    Industry’s HighestSystem Performance

    and Capacity

    Compared to Virtex-6

    ✔ 2.5x larger

    ✔ Up to 2M logic cells

    ✔ 1.9Tbps serial bandwidth

    ✔ Up to 28Gbps line rate

    ✔ EasyPath cost reduction

    Lowest Power and Cost

    Compared to Spartan-6

    ✔ 30% more performance

    ✔ 35% lower cost

    ✔ 50% less power

    ✔ 50% smaller footprint

    All Optimized for Power & Improved Price/Performance

    Common Logic Cells, BRAMs, Interfaces

    Easy Design Scalability

    Figure 2 – The three new families in the 7 series unified architecture offer users a smooth path from lowest cost to highest volume.

  • generation partial-reconfiguration method-ology to effectively “turn off ” portions ofthe design when they are not in use.

    The upshot? By going with an HPLprocess, taking other power reductionmeasures and rolling out its new devices ina unified architecture, Xilinx now offers acomprehensive line of FPGAs, from thehigh-volume low-power lines to thoseboasting the highest system performanceand capacity the industry has seen to date.

    The Virtex-7, Kintex-7 and Artix-7 FamiliesPatrick Dorsey, senior director of market-ing at Xilinx, said the three new families inthe 7 series will allow Xilinx to capture aneven greater share of the ASIC and ASSPmarket and penetrate even more deeplyinto a broader number of vertical markets,from low-powered medical devices to thehighest-performing wired and wireless net-working equipment.

    At the entry level, “The new Artix-7family provides the lowest absolute powerand cost, with small-form-factor packag-ing,” said Dorsey. Densities range from20,000 to 355,000 logic cells. The devicesare 30 percent faster and consume 50 per-cent less power than Spartan-6 FPGAs at35 percent lower price points. When mov-ing from Spartan-6 FPGAs to Artix-7devices, designers can expect up to 85 per-cent lower static power and up to 35 per-cent lower dynamic power consumption.

    GTP serial transceivers support linerates up to 3.75 Gbits/second. Other keyfeatures include 3.3-volt-capable I/O forinterfacing to legacy components and wire-bond packaging for the lowest cost, withoptional chip-scale packaging for the small-est form factor and 1.0-mm ball spacing forlow-cost PCB manufacturing.

    Dorsey said that because the new familyis based upon the Virtex architecture, it alsonow includes many of the advanced featuresof the Virtex family that were not availablein the Spartan line. For example, the Artix-7includes an enhanced System Monitor ana-

    log function, now called XADC (analogcapability), to allow users to monitor thefunctionality, temperature, touch sensor,motion control and other real-world analogactivities in the system. The integratedXADC technology will enable a whole newclass of mixed-signal applications.

    Further, with these refined specs, theArtix-7 FPGAs better target the low-powerperformance requirements for applicationssuch as ultrasound equipment. The devicesalso now address the small-form-factor, low-power requirements of lens control modulesfor high-end commercial digital cameras aswell as next-generation automotive infotain-ment systems driven by 12 V. Artix-7 devicesalso meet the strict SWAP-C (size, weight,power and cost) requirements of militaryavionics and communications applications.

    Kintex-7 FPGA FamilyDorsey said that with the new midrangefamily, Kintex-7, Xilinx now providesFPGAs with the best price-for-perform-ance on the market. “With the Kintex-7family, we offer devices that are less thanhalf the price and power consumption ofVirtex-6 FPGAs but equal in performanceand functionality,” said Dorsey.

    The Kintex-7 devices will be especiallywelcomed in applications that require cost-effective signal processing, he said. That’sbecause they offer abundant DSP slices(from 120 to 1,540), up to 5,663 kbits ofdistributed static RAM and 28,620 kbits ofinternal block static RAM, and between fourand sixteen 10.3-Gbps GTX serial trans-ceivers. Dorsey said Kintex-7 devices will beequally attractive to Virtex users seeking alower-cost alternative as well as to customerswho have traditionally used Spartan FPGAsbut are scaling their designs to the next levelof system performance. Indeed, with logicdensities ranging from 30,000 to 400,000gates and with 40 percent higher perform-ance than Artix-7 FPGAs, Kintex-7 devicesequal the performance of Virtex-6 and aresignificantly faster than Spartan-6 FPGAs.

    Dorsey said the Kintex-7 parts are idealfor implementing Long Term Evolution(LTE) wireless radio and baseband subsys-tems. And thanks to the recent release ofXilinx’s fourth-generation partial-reconfigu-ration technology, 7 series customers can fur-ther reduce power and cost, for widedeployment in femto, pico and mainstreambase stations. The serial connectivity, memo-ry and logic performance of these devices is agood fit for high-volume wired communica-tions as well, Dorsey added, citing equip-ment such as 10G passive optical network(PON) optical line terminal (OLT) line cardsthat bring high-speed networking to theneighborhood and home as one example.

    In addition, Kintex-7 FPGAs are alsosuited for use in high-definition 3D flat-panel displays in consumer electronics mar-kets; video-over-Internet Protocol bridgesthat enable next-generation broadcast video-on-demand systems; and high-performanceimage processing required for militaryavionics and ultrasound equipment that cansupport up to 128 high-resolution channels.

    Virtex-7 FPGA FamilyFor its part, the high-end Virtex-7 familytakes the industry’s most successful FPGAarchitecture to new heights by deliveringmore than a doubling in capacity and 30percent faster system performance alongwith 50 percent lower power than theVirtex-6 FPGA predecessors.

    Dorsey said the Virtex-7 FPGAs are wellsuited for communications systems requir-ing the highest performance, capacity andbandwidth. With its Virtex-7T and Virtex-7XT variants, this FPGA line boasts ultra-high-end devices that push the limits ofFPGA technologies in terms of the numberand performance of embedded serial trans-ceivers, DSP slices, memory blocks andhigh-speed I/O to establish new bench-marks for the industry.

    Virtex-7 devices offer up to 36 GTX10.3-Gbps serial transceivers, ultrahigh-endlogic capacity with as many as 2 million

    10 Xcell Journal Third Quarter 2010

    COVER STORY

    T h e Vi r t e x - 7 f a m i l y t a k e s t h e i n d u s t r y ’ s m o s t s u c c e s s f u l F P G Aa r c h i t e c t u r e t o n e w h e i g h t s b y d o u b l i n g t h e c a p a c i t y a n d b o o s t i n g s y s t e m p e r f o r m a n c e a t 5 0 p e r c e n t l o w e r p o w e r.

  • Third Quarter 2010 Xcell Journal 11

    logic cells and the highest parallel I/Obandwidth in the industry, with up to1,200 SelectIO™ pins. This I/O configura-tion enables the greatest number of parallelbanks of 72-bit DDR3 memory availableon the market, supporting 2,133 Mbps.

    Meanwhile, the new Virtex-7XTdevices also provide the highest serialbandwidth in a single FPGA, with up to72 GTH transceivers at 13.1 Gbps, or 80GTH and GTX transceivers (24 runningat 13.1 Gbps and 56 at 10.3 Gbps, respec-tively). In addition, the devices featurehigher DSP-to-logic ratios for greaterthroughput, with up to 3,960 DSP slicesat 600 MHz delivering 4.7 TMACs. Also,the 7XT FPGAs have higher on-chipBRAM-to-logic ratios with up to 65Mbits, for low-latency data buffering.Dorsey said that Xilinx will eventually adddevices with 28-Gbps transceivers to thisfamily; the release details are forthcoming.

    The new Virtex-7 FPGAs target the high-est-performance wireless, wired and broad-cast infrastructure subsystems, said Dorsey.The teraMACC signal-processing capabili-ties of Virtex-7 FPGAs enable advancedradar and high-performance computing sys-tems. Product developers can replace ASICsand multichip-set ASSP solutions with sin-gle-FPGA implementations of 100GE linecards to increase bandwidth, while simulta-neously lowering the power and cost. Otherapplications include 100-Gbit OpticalTransport Network (OTN) muxponders forintegrated multiplexer/transponder applica-tions, 300G Interlaken bridges and 400Goptical network cards.

    In addition, these ultrahigh-end devicesprovide the logic density, performance andI/O bandwidth needed to build next-gen-eration test and measurement equipment.For systems where ASIC production is jus-tified, Virtex-7 FPGAs enable designers touse fewer devices during prototyping andemulation in order to lower cost andreduce interconnect/design complexity.

    EasyPath—a Further Cost AlternativeDorsey said the company’s EasyPath™ pro-gram extends the value of Xilinx 7 seriesFPGAs to provide the lowest total cost ofownership for medium- to higher-volume

    applications on the order of 100,000 units.This total cost of ownership requires thatcustomers assume only the development andunit cost. In addition, they receive the fulladvantages of time-to-market and risk reduc-tion that FPGAs offer. This further bolstersXilinx’s value as a strategic logic IC supplier.

    EasyPath provides a cost reduction bycoupling Xilinx’s FPGA manufacturingprocess to the customer’s design. This resultsin the same silicon with the same features,but only guaranteed to work with a givendesign. Dorsey said that EasyPath-7 takes sixweeks to complete from design freeze andoffers a guaranteed 35 percent cost reductionand no minimum-order quantity, with nocustomer engineering effort required—all fora $300,000 nonrecurring engineering cost

    “Now you have the peace of mind thatonce you design the FPGA, you can get tolower cost by targeting either Kintex-7 orArtix-7 and, if further cost reductions arenecessary to support higher volumes, bygoing to EasyPath-7,” said Dorsey. “What’smore, if you’ve already completed yourFPGA design and want to go with EasyPath,you can let your purchasing department han-dle the rest, since no further customer engi-neering resources are required.”

    Next-Generation Targeted Design PlatformsAlong with announcing the new family,Xilinx is also launching a second generationof Targeted Design Platforms, application-specific design aids which the company firstrolled out in 2009 in tandem with therelease of the Virtex-6 and Spartan-6FPGAs. Xilinx’s Targeted Design Platformstrategy gives system designers access tosimpler, smarter design methodologies forcreating FPGA-based solutions through theintegration of five key elements: FPGAdevices, design tools, IP, development kitsand targeted reference designs.

    Early-access ISE Design Suite softwaresupporting the new FPGA families hasshipped to a limited number of early-adopter customers and partners. First ship-ments of the devices will begin in the firstquarter of 2011.

    For more information on the 7 seriesFPGAs, visit http://www.xilinx.com/technology/roadmap/7 series-fpgas.htm.

    COVER STORY

  • 12 Xcell Journal Third Quarter 2010

    New Xilinx Rad-Hard FPGA Reaches for the StarsNew Xilinx Rad-Hard FPGA Reaches for the Stars

    XCEL LENCE IN AEROSPACE & DEFENSE

  • by Maury WrightPresidentWDC Marketing

    Electronic systems designs headed to spacenaturally require high reliability, but thedesign task is further complicated by expo-sure to radiation that can cause sporadic cir-cuit failures. From a functional perspective,FPGAs with inherent reconfigurable attrib-utes are a perfect match for space. FPGAsenable a single system to perform multipletasks and let mission teams remotely recon-figure a system, either fixing a bug oradding new functionality. Now Xilinx® hasan FPGA—the Virtex®-5QV—that is rad-hard and can deliver the full benefits ofprogrammability to space programs. Thedesign teams get an off-the-shelf solutionwith all the advantages of a 65-nanome-ter commercial SRAM-based FPGA,including ready access to developmentand prototyping tools.

    It’s hard to underestimate the value anFPGA can offer in an application such asspace-bound systems. Once a system, satel-lite, rocket or spacecraft is deployed, thereis little or no ability to make hands-onchanges to it, so the reprogrammability ofan FPGA is a huge benefit. To be sure,microprocessors and microcontrollers canalso be reprogrammed. But FPGAs excel indata-flow applications where functionssuch as packet inspection or signal-process-ing algorithms implemented in hardwarelogic offer far more processing throughputthan do traditional microprocessors. Andthe FPGA hardware can be easily reconfig-ured to support new algorithms.

    Given the advances in circuit densityand the mix of hardwired IP blocks andconfigurable logic, the latest FPGA tech-nologies can capture the bulk of a sys-tem’s functionality. For example, theVirtex-5QV includes Ethernet MACfunctions and high-speed transceivers togo along with DSP slices and config-urable logic (for details on the FPGAcapabilities, see sidebar, next page).

    A rad-hard IC that's derived from acommercial FPGA family also offers signif-icant benefits in the development process.Design teams can do development work

    Third Quarter 2010 Xcell Journal 13

    Virtex-5QV device offers a flexible,cost-effective alternative for design teams working on advanced, reconfigurable space applications.

    Virtex-5QV device offers a flexible,cost-effective alternative for design teams working on advanced, reconfigurable space applications.

    XCEL LENCE IN AEROSPACE & DEFENSE

  • with readily available commercial devicesand development tools and then seamlesslymove the design to the rad-hard targetsystem platform at any point in the devel-opment process.

    Space Presents Reliability ChallengesTo deploy FPGAs in space applications,however, designers have to understand theenvironment and learn how to mitigateissues that affect reliability. For example, anumber of radiation-induced effects havebeen identified as a problem area for space-based designs. The list includes single-event upsets, single-event functionalinterrupts, single-event latchups, single-

    event transients and total ionizing doseeffects. (See the second sidebar for moreinformation on these effects.)

    Designers working on space applica-tions haven’t traditionally had the free-dom to use ICs such as FPGAs withoutcarefully considering ways to mitigateradiation effects. Specialty ASIC houseshave radiation-hardened IC manufactur-ing processes. But ASIC design cycles arelengthy and expensive, and the quantityof devices the application will actuallyneed simply doesn’t justify the time andeffort, given viable alternatives.

    The radiation-hardened ASIC processesare also many generations behind state-of-

    the-art commercial IC processes. Forexample, the rad-hard ASICs are still in the150-nm or less-dense process nodes.Indeed, modern FPGAs offer performanceand circuit density that match those ofradiation-hardened ASICs, along withmuch faster development cycles.

    Radiation Tolerance and TMRIn the past, designers who wanted to useFPGAs have had to combine radiation-tol-erant ICs with techniques that further mit-igate single-event upset (SEU) effects.Xilinx has long addressed the need for radi-ation resistance in space-targeted designs.Radiation-tolerant FPGAs such as the

    14 Xcell Journal Third Quarter 2010

    XCEL LENCE IN AEROSPACE & DEFENSE

    Rad-Hard FPGA Delivers State-of-the-Art Benefits

    The Virtex-5QV offers a unique value proposition. This FPGA is rad-hard out of the box and also offers state-of-the-artreprogrammable-logic density and hardwired IP blocks. Design teams working on space applications get ASIC-likecircuit density without the ASIC NRE costs.The FPGA includes more than 130,000 logic cells for large, complex designs. The architecture is based on six-input LUTs

    and the IC employs a diagonal interconnect structure that ultimately packs designs more efficiently in terms of silicon utiliza-tion and results in better performance and lower power consumption.

    The design is based on the second generation of Xilinx’s Advanced Silicon Modular Block (ASMBL™) column-based archi-tecture. ASMBL has allowed Xilinx to produce mixes of configurable logic and hardwired IP that are optimized for specificapplications.

    The Virtex-5QV includes 320 Enhanced DSP slices to complement the programmable logic. Each slice includes a 25 x 18-bit multiplier, an adder and an accumulator. Designers can cascade the IC’s 36-kbit Block RAM elements to produce large, gen-eral-purpose memory arrays. The device includes 298 such blocks. Each block can also be configured as two 18-kbit blocks, sothere is little wasted silicon for applications requiring smaller RAM arrays.

    For networking and I/O operations, the Virtex-5QV includes a number of hardwired IP blocks. Six Ethernet media-accesscontroller (MAC) functions can operate in 10-, 100 and 1,000-Mbps modes. Eighteen RocketIO™ transceivers support datatransfers at rates ranging from 150 Mbps to 3.125 Gbps. The MACs can use some of the RocketI/O transceivers for physical-layer (PHY) connections or link to external PHYs via a soft Media Independent Interface implemented in programmable logic.

    The IC also includes three PCI Express® blocks compatible with the PCI Express Base Specification version 1.1. Designs canimplement x1-, x4- or x8-lane channels with each of the three blocks. The RocketIO transceivers are also available for PCIExpress I/O.

    The device features a number of other functions important in high-performance system designs. Six clock management tiles(CMTs) can each generate clocks that operate up to 450 MHz. Each CMT includes dual digital clock managers (DCMs) anda phase-locked loop (PLL). The DCMs enable zero-delay buffering, frequency synthesis and clock-phase shifting. The PLLs addsupport for input jitter filtering and phase-matched clock division.

    Xilinx will manufacture the IC in a 65-nm copper CMOS process with a 1-V core voltage. A ceramic flip-chip column grid arraypackage will ensure signal integrity. And Xilinx will guarantee operation over the full military temperature range of -55ºC to +125ºC.

    – Maury Wright

  • Third Quarter 2010 Xcell Journal 15

    Virtex-4QV have immunity to single-eventlatchup (SEL), and can withstand a totalionizing dose (TID) up to 300 krads(Si).Xilinx combines these radiation-tolerant ICswith system-level techniques such as triple-modular redundancy (TMR) to ensure relia-bility. In a TMR design, three separateinstantiations of a system perform the sametask. A voting circuit compares the resultsand considers it correct if at least two sys-tems produce the same result.

    Xilinx has developed a tool that cangreatly simplify the implementation of aTMR methodology. The TMRTool acceler-ates the design cycle by allowing the designteam to focus on design and debug ratherthan TMR. The tool works seamlessly withany HDL and synthesis tool to automati-cally build TMR into a design.

    The TMRTool also goes beyond base-line TMR functionality. It triplicates allclocks and throughput logic to protectagainst single-event transients (SETs). Italso triplicates feedback logic and insertsmajority voters on all feedback paths. Andthe tool triplicates all outputs and usesminority voters to detect and disable incor-rect output paths.

    By inserting voters on all feedback paths,the TMRTool overcomes a problem with thetechnology in designs with finite statemachines. In most TMR-based state-machine designs, an SEU that causes an errorin one of the three state machines ultimatelyrequires that the state machines be reset forsynchronization. But the voters in feedbackpaths ensure that the state machines remaincontinuously synchronized and can operatecontinuously through SEUs.

    Rad-Hard by DesignWhile earlier FPGAs such as the Virtex-4QV have been successfully deployed inspace applications and have been radiation-tolerant, the new Virtex-5QV was designed

    from the ground up with a rad-hard-by-design (RHBD) methodology. The result-ing FPGA is truly a rad-hard space-gradeIC. Where the Virtex-4QV requires thatdesigners add mitigation, the Virtex 5QVis rad-hard out of the box.

    The Virtex-5QV design team’s specificgoal was to provide intrinsic hardness from

    SEU, SET and other effects to critical circuitelements in the device. As with all SRAM-based FPGAs, configuration memory con-trols all aspects of device operation and istherefore critical to reliable operation. TheVirtex-5QV design utilizes dual-node latchesthat control write operations to memorycells. Writes occur only when both latches areenabled synchronously. The implementationoffers 1,000 times the hardness to SEUs rel-ative to memory latches in commercial ver-

    sions of the FPGA. Moreover, the latch is vir-tually impervious to proton interaction.

    Upset Hardening in HardwareEffectively obviating the need for TMR atthe application design level, the Virtex-5QV design team used a variety of tech-niques in implementing the underlying

    FPGA memory and circuit elements in thedevice. Xilinx took special care in harden-ing the 35 million configuration cells andthe 81,920 user flip-flops. Both incorpo-rate a clever self-redundant storage circuitthat has double the normal number oftransistors. The result is a very low suscep-tibility to SEU. Xilinx optimized the layoutof this important structure by fabricatingmany variants and subjecting them to in-beam irradiation studies.

    XCEL LENCE IN AEROSPACE & DEFENSE

    Figure 1 – Xilinx built its latest space-grade FPGA, the Virtex-5QV, to be rad-hard.

    The Vir tex-5QV design team aimed to provide intr insic hardness fromsingle-event upsets and transients to cr i t ical e lements in the device.

    A novel la tch implementat ion of fers almost 1,000 t imes the hardness to SEUs of commercial versions of the FPGA.

  • In addition, Xilinx made sure that all theclock, data and asynchronous inputs to theflip-flops are protected. These are capable ofsuppressing single-event transients and pre-vent them from turning into upsets.

    Xilinx employs proprietary methods toprotect against SEUs during the critical start-

    up period when the FPGA is configured.One result is that designers don't have totake extra steps to ensure the elimination ofsingle-event functional interrupts (SEFIs),which traditionally require an intrusiveFPGA restart and reconfiguration. GarySwift, senior staff engineer for space products

    development and radiation testing at Xilinx,said the implementation reduces GEO (geo-stationary earth orbit) environment SEFI ratesby more than two orders of magnitude (rela-tive to the already low once per century for therad-tolerant Virtex-4QV family) to about thanone SEFI fault in 10,000 years. “Interestingly,

    XCEL LENCE IN AEROSPACE & DEFENSE

    Understanding Radiation-Induced Effects

    Going back to the 1950s, engineers have documented the adverse effect that radiation can have on electronic circuits.With the advent of ICs and the constant move to finer process geometries, the potential for radiation-induced errorsgrew. The impact ranges from soft errors that are easy to detect and correct to actual device failures. Design engineers working on space applications must prepare for a number of problems.

    • A single-event upset (SEU) is a change of state in an IC, such as a change in the value of a memory bit caused by a radi-ation strike. SEUs are also called soft errors because the instance of an SEU has no long-term effect on the IC and infact, the soft error can often be found and fixed.

    • A single-event functional interrupt (SEFI) is similar to an SEU in that it is typically the result of a single ion strike. ButSEFIs result in a temporary instance of some element of the IC not functioning properly. In some cases, an SEFI-induced fault remains until power cycles, and in other cases the condition is truly temporary.

    • A single-event latchup (SEL) is a potentially more damaging event typically caused by ions or protons generated by cos-mic rays or solar flares. The radiation induces a high-current state that results in a full or partial loss of IC functionality.In some cases, power-cycling an IC eliminates the SEL condition. In other cases the device may be permanently flawed.

    • The single-event transient (SET) encompasses the concept of an SEU, but includes more-complex errors induced by aradiation strike. An SET, for instance, might affect a clock and propagate multiple errors throughout memory or logic.

    • Total ionizing dose (TID) effects lead to the failure of an IC based on the aggregate exposure to radiation over time.Typically, performance parameters decline in an IC as the TID, measured in rads, increases over time. Radiation createselectron-hole pairs in the oxide layer of an IC—slowly changing the threshold voltage of transistors.

    Xilinx and its partners have gone to great lengths to understand the radiation effects and to test and characterize mitiga-tion techniques in FPGAs. Back in 2002, Xilinx and the Jet Propulsion Laboratory founded the Xilinx Radiation TestConsortium (XRTC—originally referred to as the Single-Event Effects Consortium, or SEEC). The consortium currentlyhas 14 members, including universities, research laboratories, and defense and space contractors.

    The partners are generally organizations that need to use FPGAs in space applications and have a vested interest in accu-rately assessing radiation-induced effects and radiation-tolerant and –hardened designs. The consortium website includes acomprehensive library of research papers (http://www.xilinx.com/esp/aero_def/radiation_effects.htm) that have validated thereliable use of SRAM-based FPGAs in space applications.

    Memory—including SRAM—has long been considered among the most susceptible circuits to radiation-induced effectsand specifically to SEUs. That fact led some to question the suitability of SRAM-based FPGAs in space applications.

    SRAM-based FPGAs, however, offer tremendous flexibility relative to programmable devices based on nonvolatile mem-ory. For starters, SRAM-based FPGAs, and specifically those from Xilinx, have consistently offered the highest level of inte-gration available in programmable ICs. Moreover, SRAM-based FPGAs are easily reconfigured, allowing a system to servemultiple applications and allowing teams to remotely reconfigure a system to fix a flaw in the system implementations thatisn’t revealed until after deployment. The capacity and performance of the most advanced SRAM-based FPGAs far exceedthose of the most advanced programmable nonvolatile devices.

    — Maury Wright

    16 Xcell Journal Third Quarter 2010

  • Third Quarter 2010 Xcell Journal 17

    the upset mitigation is so effective with vir-tually no susceptibility to protons that GEOis the worst-case orbit for SEFIs and SEUs,exactly the opposite of what space radiationexperts have learned to expect,” said Swift.

    In addition, Xilinx’s Digitally ControlledImpedance (DCI) allows adjustment outputimpedance and input termination valueswithout external components. Finally,Xilinx ensured its Block RAMs are pro-tected from outputting erroneous data inspite of upsets via an error-detection-and-correction circuit to eradicate upsets.

    Virtex-5QV Delivers ResultsThe Virtex-5QV delivers results unmatchedby previous FPGAs. The ICs are fully char-acterized for space radiation effects in heavyion and proton environments. TheseFPGAs will withstand a TID of 700krads(Si), based on method 1019 as definedin MIL-STD-833.

    Immunity to single-event latchup isdefined by a threshold to linear-energytransfer (LET)—the amount of energytransferred to material as an ionizing particletravels through that material. The Virtex-5QV meets the MIL-STD-833 requirementthat LET is greater than 100 MeV/mg-cm2

    (mega electronvolts per milligram per cen-timeter squared). In short, the device isessentially immune to SEL effects.

    SEU immunity in the configurationmemory and control logic is defined in termsof deployment in a GEO environment rela-tive to a space platform that travels 36,000km per day. Based on 35 Mbits on an IC thatcould be subject to an SEU, the IC will suf-fer 3.8 x 10-10 errors per bit per day.

    With the availability of the Virtex-5QVFPGA, space teams will have access to astate-of-the-art reprogrammable platformbuilt on a 65-nm copper process technology.The teams can prototype their work withwidely available commercial FPGAs and eas-ily accessible development tools, and thendeploy systems that the Xilinx Radiation TestConsortium has proven to be reliable in theharsh radiation environment in space.

    The Virtex-5QV device will sample in thecurrent quarter, with general productionavailability planned for first half of 2011.For more information, visit http://www.xilinx.com/products/virtex5qv/index.htm.

    Maury Wright is an electronics engineer turnedtechnology journalist and industry consultant,with broad experience in technology areas rangingfrom microprocessors to digital media, wirelessand power management. Wright worked at EDNMagazine for 22 years, serving as editor-in-chiefand editorial director for five years. Wright alsoserved as editor of the EE Times Digital Homeand Power Management websites.

    XCEL LENCE IN AEROSPACE & DEFENSE

    Rad TolerantFPGAs

    Rad TolerantFPGAs

    Rad-HardFPGAs

    200KRad

    300KRad>700KRad

    XQR2V1000

    XQR2V3000

    XQR2V6000

    XQR4VSX55

    XQR4VFX60

    XQR4VFX140

    XQR4VLX200

    XQR5VFX130

    Rel

    ativ

    e P

    erfo

    rman

    ce

    2004 2006 2008 2010

    GetonTarget

    Is your marketingmessage reachingthe right people?

    Hit your target by advertising your product or service in the Xilinx

    Xcell Journal, you’ll reach thousands of qualified engineers, designers, and

    engineering managers worldwide.

    The Xilinx Xcell Journal is an award-winning publication, dedicated specifically to helping programmable

    logic users – and it works.

    We offer affordable advertising rates and a variety

    of advertisement sizes to meet any budget!

    Call today: (800) 493-5551 or e-mail us at

    [email protected]

    Figure 2 – The Virtex-5QV rad-hard FPGA is a crowning achievement in Xilinx’s long, rich history of providing leading-edge IC technology for space applications.

  • 18 Xcell Journal Third Quarter 2010

    Multiple MicroBlazes Ease Integrationin Real-Time Automotive SystemMultiple MicroBlazes Ease Integrationin Real-Time Automotive System

    XCEL LENCE IN AUTOMOTIVE

  • by Martin ThompsonPrincipal Product EngineerTRW [email protected]

    It is a commonly held view that it is harderto develop software for multiple-processorsystems than single-processor systems. Butin fact, this is not always the case. Ourdesign team at TRW Conekt, the consul-tancy arm of TRW Automotive, recentlyundertook a project that demonstrates howpartitioning the hardware to match theproblem at hand allows development of veryefficient systems using many processors.

    Our team was tasked with providingembedded processing electronics to run incars for a project known as “Foot-LITE”(led by MIRA Ltd. and sponsored by theU.K. government-backed TechnologyStrategy Board, the Department forTransport and the Engineering andPhysical Sciences Research Council). Thisproject provides feedback to drivers abouttheir driving habits from the perspectiveof both safety and fuel economy.

    The system gives the feedback to thedriver in two ways. First, a dashboard-mounted smartphone display system(designed by Brunel University and devel-oped by HW Communications Ltd.) pro-vides real-time communication with thedriver about events that require immediateattention. In addition, the system collectscontinuous journey data, including videostreams of particular “events,” and thenuploads them to an Internet-based serverfor users to view at their leisure. The deci-sion about which events to flag to the useris made by an algorithm developed bypartner Ricardo UK, based on drivingadvice from another partner, the Instituteof Advanced Motorists.

    The project will fit this system to a smallfleet of 30 vehicles (see Figure 1). Test driverswill be members of the public employed byproject partner Hampshire County Council.

    The project has progressively incorpo-rated the research results obtained by thecollaboration of 12 industrial, govern-mental and academic partners. Thismeans that we’ve needed a very flexiblesolution to our processing challenges.

    Third Quarter 2010 Xcell Journal 19

    When facing a project with ever-changing requirements and software contributions from multiple locations, multiprocessors can actually help.

    When facing a project with ever-changing requirements and software contributions from multiple locations, multiprocessors can actually help.

    XCEL LENCE IN AUTOMOTIVE

  • Base SystemWe had available to us a processing sys-tem, already under development in anoth-er project, to perform image-processingtasks (Figure 2).

    This system is based around a singleXilinx® Spartan®-3A XC3SD3400A deviceconnected to four independent blocks ofDDR memory, an architecture thatallows users to implement many different

    processor/logic configurations. For exam-ple, you could use the whole FPGA fabricas pure logic resources crafted entirely inHDL. Alternatively, you could use high-er-level tools such as the Xilinx EDK toimplement four (or more) soft-coremicrocontrollers. Each of the four wouldhave access to its own private DDR mem-ory device, protecting data from interfer-ence by the other microcontrollers. Forother simple tasks, you could include

    additional processors by making use ofembedded BRAM blocks.

    In addition, I/O to the outside world isconfigurable using small daughterboards,which allows for quick turnaround of cus-tomized I/O sets for different projects.

    The project partners decided very earlyon that a USB interface would be desirable,as it allows you to add a wide variety ofperipherals to the system. This necessitated

    some form of USB stack—we obtained oneusing the Petalinux implementation ofucLinux—and a daughterboard with aUSB host device.

    The use of Linux also gives us a simpleway to manage the SPI flash devices thatthe system provides for FPGA bitstreamand application code storage. Weinstalled a simple JFFS2 file system toallow in-field application updates, eitherover Ethernet (using FTP) or by booting

    with a USB Memory Stick that contains ascript to upload new code to the internalflash. In a traditional embedded system,all this would require the software team towrite low-level application code.However, with Linux available to us, wecan easily write simple Bash scripts tocontrol these processes.

    Foot-LITE AlgorithmsRicardo developed the core algorithmsthat assess the driver’s actions and imple-mented them on its rCube rapid prototyp-ing system (http://www.ricardo.com/en-gb/Engineer ing-Consul t ing/Automotive-E x p e r t i s e / C o n t r o l s - - E l e c t r o n i c s /Embedded-Software/rCube/). We used thisapproach for initial simulator trials andthree test vehicles. In the test vehicles, anembedded vision system (based around anexisting TRW product—coincidentallyalso containing an FPGA) measured dis-tance to the vehicle in front and assessedthe vehicle’s position within the lane. Aradar system provided an alternativesource of range information in the testvehicles. As a step toward a productionimplementation, we eliminated the radarsystem for the larger-scale trials, as thevision system provided sufficient informa-tion for the application

    We fitted the vehicle with a forward-facing video camera and processing subsys-tem, which are combined into a small unitthat fits near the rear-view mirror.Embedded algorithms in this subsystemprocess the video images to measure thedistance between the car and the edge ofthe lane. In addition, a parallel algorithmdetects vehicles in front of the Foot-LITEvehicle and provides a measure of the head-way distance. This subsystem transmits itsdata to the Foot-LITE system unit usingthe automotive-standard controller-areanetwork (CAN) bus.

    20 Xcell Journal Third Quarter 2010

    Home PC

    CAN

    BackOffice DB

    Internet

    Camera &

    Engine

    ImageProcessing

    GPS

    Bluetooth

    On Car

    Figure 1 – Foot-LITE system

    XCEL LENCE IN AUTOMOTIVE

    At the beginning, we envis ioned provid ing a s ingle-processor system. However, i t was soon apparent that

    a dedicated processor would ease the task o f in tegrat ion for each i terat ion o f a lgor i thmic development .

  • Third Quarter 2010 Xcell Journal 21

    We integrated a three-axis accelerometerand yaw-rate sensing package into theFoot-LITE unit, potentially providing theFoot-LITE algorithms with access to high-rate, low-latency vehicle dynamics infor-mation when required.

    The Foot-LITE algorithms fuse all thisdata to provide a set of simple outputs tothe driver relating to his or her driving style.

    Algorithm ImplementationAt the beginning, we envisioned providinga single-processor system. However, it wassoon apparent that a dedicated processorwould ease the task of integration for eachiteration of algorithmic development. Weisolated the main host processor and theFoot-LITE algorithm processor by imple-

    menting communications using theMicroBlaze™ Fast Simplex Link (FSL) bussystem. This allows a complete isolation ofthe processors’ memories (unlike the popu-lar shared-memory methodologies), whichgreatly eases the integration task, since bugscannot “migrate” from one processor toanother via memory corruptions.

    In addition, there is no competition forprocessor cycles, which means our partnerscan be confident that any changes we make

    to the host application will not affect theirapplication’s performance.

    We developed a collection of wrapperfunctions that allow us to “drop in” thecode-generated C from the Simulink®

    compiler without having to make majorchanges to the interface. We provide asmall amount of nonvolatile memoryonboard via an I2C bus, which is requiredfor storing various tune parameters withinthe Foot-LITE algorithms. This necessitat-ed a simple wrapper to provide the algo-rithms with easy access from the Simulinkenvironment to read this memory at start-up and write it back at shutdown.

    The system needed to measure accelera-tions and yaw rate, as well as communicat-ing over the CAN bus with the lane- and

    vehicle-detection system. As we already hadlow-level CAN drivers and were concernedas to the timeliness of a Linux applicationmeasuring the vehicle dynamics informa-tion at 40-millisecond rates, we decided toinsert a third MicroBlaze into the system.This saved porting CAN drivers to Linux,and allowed deterministic performance viaanother isolated processing node—criticalto the algorithms—which made use of thedynamics measures. In addition, this

    approach allowed us to split the task ofwriting the software to allow parallel devel-opment. Once again, we used FSL as theinterface between the dynamics processorand the Foot-LITE algorithm processor.

    Video Capture and CompressionThe initial conception of the system pro-vided simple measures of lane width andoffset, distance to the vehicle in front andso on from the vision system, transmittedover CAN to the Foot-LITE algorithmunit. The project partners decided toenhance this setup by capturing videoframes for transmission to the server, toprovide off-line contextual assistance forinterpreting the advice the system gave.Given that the requirement was only for“Internet-quality” video (300 x 200 pixelsat 5 Hz), we felt we could easily assign afourth MicroBlaze to the task of compress-ing the video stream to a simple set ofJPEG images in real time. The image com-ing from the camera was a wide-VGA (720x 480 at 30 Hz) video stream. Clearly,downsampling the image was a task to beperformed in hardware.

    We designed a simple peripheral to han-dle the downsampling operation by simplydropping alternate pixels and lines to pro-duce a 360 x 240 image. This peripheralalso drops four in five frames to producethe required frame rate. Nothing morecomplex is needed to produce visiblyacceptable results, since the JPEG processrenders aliasing artifacts invisible. We usedSystem Generator to develop this peripher-al, as it makes export to EDK very straight-forward, and we already have experience ofusing System Generator for more-compleximage-processing tasks.

    The downsampling peripheral busmasters the data into the SDRAM con-nected to the JPEG processor, which thencompresses each frame, as it arrives, into acircular buffer until the Foot-LITE algo-rithm sends a flag. The JPEG processorsends the compressed video frames (againover FSL) to the host MicroBlaze. Weused a code library from the IndependentJPEG Group and found that it neededvery little optimization to operate at 5 Hz.Again, having an isolated processor

    Figure 2 – Processing module is Spartan-based.

    XCEL LENCE IN AUTOMOTIVE

  • enabled another software engineer (basedat a different site) to work on this aspectof the system in parallel.

    Bluetooth to Smartphone and OBDEase of installation was a critical factor forthe project. Minimizing the number of

    wires that the system requires was also animportant consideration. We choseBluetooth as the interface to the smart-phone. Drivers for standard USBBluetooth dongles are standard in theucLinux kernel, although we had to buildthe user-space tools ourselves. These have a

    number of dependencies on other items ofcode, which are also cross-compiled withthe Petalinux tool set and added to theucLinux file system.

    Once we had decided on Bluetooth forthe smartphone interface, it was a naturalchoice to use Bluetooth for the interface to

    22 Xcell Journal Third Quarter 2010

    LMB

    Other

    Linux MicroBlaze60MHzFPU,

    32KB I-Cache,32KB D-Cache

    EPCSPI

    BRAM32KB

    MPMC

    64MB120MHz

    DDR

    USBHost

    Bluetooth

    Flash

    Ether-net

    AlgorithmMicroBlaze

    60MHzFPU,

    8KB I-Cache,2KB D-Cache

    UART/Timer/GPIO

    BRAM16KB

    MPMC

    64MB120MHz

    DDR

    DynamicsMicroBlaze

    60MHzFPU,

    8KB I-Cache,2KB D-Cache

    BRAM16KB

    MPMC

    64MB120MHz

    DDR

    JPEG MicroBlaze60MHzFPU,

    8KB I-Cache,2KB D-Cache

    BRAM16KB

    MPMC

    64MB120MHz

    DDR

    VideoIn

    VideoDownsample

    PLB

    FSL

    Bus Key:

    I2C

    NVM

    SPI

    Dynamicssensors

    UART/Timer/GPIO

    CAN

    Car

    UART/Timer/GPIO

    UART/Timer/GPIO

    Figure 3 – FPGA block diagram showing major external elements

    XCEL LENCE IN AUTOMOTIVE

  • Third Quarter 2010 Xcell Journal 23

    the On-Board Diagnostic System. Wemade use of a standard off-the-shelfBluetooth-OBD interface module, remov-ing another wired link from the system.

    Easier DebuggingDebugging a system with multiple, parallelthreads of execution is always challenging.But splitting the system across multipleprocessors actually makes things easier.There is no requirement for a multithread-aware debugger (as might be needed whentrying to debug multiple processors withinthe Linux environment). The Xilinxdebugger (XMD) can connect to multipleprocessors, and by using TCL (the ToolCommand Language, which XMD under-stands), we can automate the setup anddownload of the code under test to multi-ple processors. Of course, the commonembedded-system debug approach usingprintf statements was also available, sinceeach processor has its own serial port.

    Another tool of great value when debug-ging the interprocessor communicationswas ChipScope™ Pro. This embeddedlogic analyzer built into the FPGA fabricallowed us to capture the data passing overthe FSL links and narrow down subtle bugsto either the sender or the receiver, andfrom there to the offending line of code.

    The isolation provided by using fourprocessors means that once a particular ele-ment is debugged it will (to a large extent)not need to be looked at again. There arenone of the weird interactions that alwayscause problems when integrating disparatecode into a large, monolithic application,or when running multiple processes on asingle processor.

    FPGA ImplementationIn this project, there is almost no HDL—simply a top-level wrapper integrating theEDK-based design with a tiny piece ofwatchdog code to guarantee the systemshuts down after the driver has turned theignition off. EDK generates the vastmajority of the FPGA (the MHS file ismore than 1,300 lines long!), with SystemGenerator producing the video downsam-pler. We configured all four microcon-trollers with caches and floating-point

    units. With four processors, four DDRmemory interfaces and a collection ofperipherals (including Ethernet, SPI, IIC,CAN, UARTs, timers, GPIOs), around70 percent of the device’s lookup tablesare occupied (around 28,000 LUTs). As isusual with microcontroller-based FPGAs,the block memories are very highly uti-lized at better than 90 percent, or 119BRAMs, but the DSP blocks are relative-ly lightly used: Only the floating-pointunits in each processor (eight in each, fora total of 32) need them.

    Bringing it All TogetherThe host microprocessor boots the Linuxkernel from internal flash and then mountsits local file systems. The slave processorseach have an FSL-based boot loader, whichaccepts a standard S-record, parses it,copies it into local memory and executes it.The Linux processor simply streams the S-record from the file system direct to theFSL pseudo-file (using the built-in dd util-ity). As described already, all interprocessorcommunication takes place over a fullyconnected mesh of FSL links. These are all32 bits wide and operate at 60 MHz, pro-viding plenty of low-latency communica-tion bandwidth. Although avoiding sharedmemory may seem limiting, the upside isthat this system provides the isolationbenefits already discussed. The hardwarearchitecture matches the applicationrequirement subdivision well, which cre-ates an intuitive software partition.

    The Foot-LITE algorithm microproces-sor sends triggers to the JPEG compressorwhen required and communicates with thesmartphone display. The Linux processorintermediates between the Bluetooth com-munications and the rest of the system(Figure 3). In addition to the immediatesignals to the driver, it sends a continuousstream of information about the state ofthe vehicle and occasional streams of videofor onward transmission to a central servervia the smartphone.

    At the end of a journey, when the driv-er switches off the ignition, the mainprocessor informs the slave processors,which can then perform their own shut-down procedures (such as writing updated

    parameters to the nonvolatile tune storage)before informing the main processor thatthey are in a safe state for shutdown. At thispoint, the host processor signals to thepower supply and the system enters a verylow-power sleep mode, awaiting the nextturn of the ignition. In the unlikely casethat the software does not send the shut-down signal within a couple of minutes ofthe ignition turnoff, a hardware timer inthe FPGA fabric pulls the power, to avoidflattening the vehicle battery.

    In the final stages of the project, twoacademic members of the consortium(Newcastle University and the Universityof Southampton) will analyze the datastreamed out of the vehicle in actual high-way use to evaluate the effectiveness of thesystem in altering driver behavior.

    The FPGA AdvantageFPGAs provide huge flexibility, whichmeets the needs of evolving projectsmuch more easily than a fixed hardwareplatform. The ability to mix in customhardware for intensive applications (forexample, video) is also beneficial. If youuse Linux to gain the huge benefits ofready-made high-level access to periph-erals such as Ethernet, you don’t need tocompromise on real-time performance,as you can push those critical tasks intotheir own microprocessor. Finally, if alarge, geographically distributed team isdeveloping the software, having a hard-ware architecture that matches the func-tional split provides benefits indevelopment and integration.

    For more information, contact the authorat [email protected] You can readmore about the Foot-LITE project athttp://www.foot-lite.net. MIRA Ltd. is theproject lead. The other industrial projectpartners are Auto-txt Ltd., HampshireCounty Council, HW CommunicationsLtd., The Institute of Advanced MotoristsLtd., Ricardo UK Ltd., Transport forLondon, TRW Conekt and Zettlex PrintedTechnologies Ltd. The academic partners areBrunel University, Newcastle UniversityTransport Operations Research Group andUniversity of Southampton TransportationResearch Group.

    XCEL LENCE IN AUTOMOTIVE

  • 24 Xcell Journal Third Quarter 2010

    Making Biometrics the Killer App of FPGA Dynamic Partial Reconfiguration

    Making Biometrics the Killer App of FPGA Dynamic Partial ReconfigurationRun-time reconfigurable hardware technologybrings key advantages in the design of automatic personal recognition systems.

    Run-time reconfigurable hardware technologybrings key advantages in the design of automatic personal recognition systems.

    XCEL LENCE IN I SM

    by Francisco FonsPhD CandidateUniversity Rovira i Virgili, Tarragona, [email protected]

    Mariano FonsPhD CandidateUniversity Rovira i Virgili, Tarragona, [email protected]

  • In the current era of communications andinformation technologies, automatic bio-metric personal recognition systems repre-sent the state of the art in high-performancesignal- and image-processing applications. Infact, it is not difficult to find in our daily livessystems requesting our personal authentica-tion/identification before allowing us to usethem; electronic tellers, computers, mobilephones and even cars require such authoriza-tion. Many end-user applications thatdemand better levels of security than PINs,passwords or ID cards use personal recogni-tion algorithms based on biometric (physio-logical or behavioral) characteristics, usuallydelivering them as a kernel.

    As a proof of concept, we developed anautomatic fingerprint authentication system(AFAS) on the second smallest Xilinx®

    FPGA device in the Virtex®-4 LX family,making use of the Xilinx Early Access PartialReconfiguration design flow and tools. Theexperimental results demonstrate it is possi-ble to embed a full, highly demanding bio-metric recognition algorithm in such a smallFPGA at an extremely low cost, processing itin real-time while preserving data accuracyand precision in its physical implementationby multiplexing functionality on the fly overa reduced set of resources placed in a partial-ly reconfigurable region (PRR) of the device.These promising results, together with theproven maturity of the technology we used,encourage us to move this solution fromresearch to industry, in an attempt to makepartial reconfiguration (PR) available to theconsumer world in the way of secure com-mercial products.

    Basics of BiometricsComputationally complex applicationsprocessed in real time, driven at low rates ofpower consumption and synthesized at lowcost are unavoidable requirements today inthe design and development of embeddedsystems, particularly when addressed tomass-production niches. In this context,dynamic partial self-reconfiguration of sin-gle-context FPGAs arises as a firm techno-logical alternative, able to deliver a highfunctional density of resources to efficient-ly balance all those demands for time-,power- and cost-sensitive applications.

    Finally, cost-effectiveness is probably themost important reason for biometrics tomake use of partial reconfigurability. Inaggressive markets like consumer electronicsor automotive, vendors must market theirsystems at a competitive cost. Customersdemand products with the highest level ofsecurity at the lowest possible price point.

    The way to improve security and reliabil-ity is by increasing the computational powerof the biometric recognition algorithm. Thisincrement of computation usually involves alike increment in execution time and also incost (resources). However, the cost is hardlyaffected in those scenarios where the design isbased on dynamic-partial-reconfigurationtechnology. Using PR, designers can parti-tion that new computation and schedule it as

    new processing stages added to the currentsequential execution flow of the application.Thus, cost often can be held invariant tofunctional changes of the algorithm.

    Designers can partition the biometricrecognition algorithm into a series of mutu-ally exclusive stages that are processedsequentially, where the outputs or results ofone stage become the input data for the next.This sequential order means designers canmultiplex hardware resources in time andcustomize them to execute a different task orrole at each moment, increasing their func-tional density and thus keeping constantthe total number of resources needed toprocess the entire algorithm. Moreover, thereconfiguration overhead is short enough

    Software-defined radio, aerospace mis-sions and cryptography are some of theknown applications that exploit the bene-fits of dynamic partial reconfiguration ofprogrammable logic devices today. In thiscontext, our group is applying PR to anapplication space that hasn’t traditionallyleveraged it: biometrics. As security hasbecome a major issue in today’s digitalinformation environment, especially forapplication fields like e-commerce, e-health, e-passports, e-banking or e-voting,among others, we believe the use of PR inbiometrics holds great promise.

    However, biometrics is complex. Itrequires stringent and computationallyintensive image/signal processing in realtime, along with a great deal of flexibility.

    In addition, personal recognition algo-rithms are in continuous evolution. As theresearch community expends major effortin this field, error rates like false acceptanceand false rejection are improving. As a con-sequence, consumers are growing moreconfident about biometric systems, andacceptance is increasing. Given thatprogress in biometrics technology is expect-ed to continue in the future, biometricproducts already in the market will have toadmit upgrades in the field just to avoidgetting obsolete, and for this they requireopen system architectures. In this regard,the flexible hardware found in run-timereconfigurable FPGA devices enables theversatility and scalability needed.

    Third Quarter 2010 Xcell Journal 25

    XCEL LENCE IN I SM

    Given that progress in b iometr icstechnology is expected to cont inue in the fu ture , b iometr ic productsalready in the market wi l l have

    to admit upgrades in the f ie ld jus t to avoid ge t t ing obsole te ,

    and for th is they require open system archi tec tures.

  • so as not to eclipse the benefitsgained by hardware acceleration.

    Furthermore, reconfiguring oneset of resources on the fly will notinterrupt the rest of the resourcesavailable in the FPGA. In this way,the resources that are not reconfig-ured continue to operate and guar-antee the link with the exteriorworld for the entire life cycle of theapplication.

    Our challenge in this work con-sisted in demonstrating that PR fitswell in the development of complexpersonal recognition algorithmsbased on biometric characteristics,making use of a two-dimensionaldesign abstraction level throughwhich the functionality is managednot only in space but also in time.We describe this target step by stepin the next sections.

    Automatic Fingerprint Authentication SystemFingerprint verification is one ofthe most popular and reliable bio-metric techniques used in automat-ic personal recognition. Essentially,the technique splits the AFASapplication into two processes orstages carried out at different timesand in different conditions: enroll-ment and recognition.

    Enrollment is the system config-uration process through which theuser gets registered. Generally, theuser exposes his or her fingerprint tothe system, which submits it to a setof computationally intensive image-processing phases aimed at extract-ing all relevant, permanent and distinctiveinformation that will permit the system tounequivocally recognize the fingerprint’sgenuine owner. This set of characteristicsbecomes the user ID, which the system storesin its database. This process is normally con-ducted off-line, in a secure environment andunder the guidance of expert staff.

    Once the user is registered, the next timehis or her fingerprint is exposed to the systemin the recognition stage, the system willcheck to see if it corresponds with any

    authorized member within the database. Allthe processing tasks performed in the enroll-ment are repeated now to again extract thosedistinctive characteristics from the live fin-gerprint sample. The system then comparesthese characteristics with the informationstored as user templates in the database toconclude whether the live scan matches anyof the registered templates. Recognitioncomes in two modalities depending on thesize of the database: authentication, when aone-to-one (or one-to-few) matching is

    processed; and identification, when thematching is one-to-many due to the factthat many users are registered in the sys-tem. Recognition is normally performedonline in a less-secure environment andunder real-time constraints.

    Each of these stages is, in its turn,partitioned into a series of mutuallyexclusive tasks designed to extract fromthe fingerprint image such informationas will distinguish one user from theothers. With that object in view, thesystem carries out specific computa-tions, such as image processing (2Dconvolution, morphologic operations),trigonometrics (sin, cos, atan, sqrt) [1]or statistics (average value, variance).

    Thus, the biometric application isorganized in a set of tasks that areprocessed following a sequential flow.A task cannot start unless the previoustask has finished, since the output dataof a given task is the input data for thenext one in the chain. Moreover, mostof these tasks are repeated in bothenrollment and recognition stages.

    Figure 1 enumerates the tasks thattake place in the presented algorithm.The first task is the image acquisition.Depending on the size of the sensor, asystem may acquire the whole image atone touch (complete image sensor) orin slices (sweeping sensor). In the sec-ond scenario—which was the case weused—an additional image reconstruc-tion phase is necessary. The full finger-print image gets composed by the setof consecutive and partially overlappedslices acquired [2].

    Once we have the whole recon-structed image, the next task consists of

    segmenting it in the foreground (that is, theregion of interest, based on the ridges andvalleys of the fingertip skin) from the back-ground. We perform this process by con-volving the image, pixel by pixel, withdirectional filters made up of Sobel masks ofkernel 5x5. Afterwards, we normalize theimage at a specific mean and variance.

    Next, we enhance this normalized imagethrough an isotropic filtering, whichretrieves relevant image information fromsome potential regions of the captured

    26 Xcell Journal Third Quarter 2010

    XCEL LENCE IN I SM

    Figure 1 – Spatial partitioning and floorplanning of the AFAStake place in one static region and one reconfigurable region of the Virtex-4. Temporal partitioning of the application in

    sequential stages occurs in the reconfigurable region.

  • Third Quarter 2010 Xcell Journal 27

    image initially lost or disturbed by the noisein the acquisition phase, making use of akernel 13x13 [3]. Once this step hasimproved the quality of the image, the nexttask is to compute the field orientationmap, which determines the dominant direc-tion of ridges and valleys in each localregion of the image foreground. The result-ant field orientation is then submitted to anew filtering stage (kernel 5x5) to obtain arefined field orientation map.

    Until this point, the image has beenworked at 8-bit gray scale. Now, in thebinarization process, Gabor directionalfilters of kernel 7x7 convolve the gray-scale image to improve the definition ofthe ridges and valleys and convert each ofthe gray-scale pixels to a 1-bit binary(black or white) dot. The image is thensubmitted to a new loop to smooth andredraw the shapes of the resultant ridgesand valleys. Later, the thinning or skele-tonization task converts the black-and-white image to one with black ridges onepixel wide. From that image it is not diffi-cult to extract the fingerprint characteris-

    tic points or minutiae, that is, the ridgeendings and bifurcations.

    Finally, with the minutiae and the fieldorientation data already obtained, the fin-gerprint template and sample can bealigned. The first way of accomplishing thisis through a brute-force algorithm thatmoves one image over the other—takinginto consideration both translation androtation movements as well as some admis-sible tolerances due to the image distortioncoming from the skin elasticity in theacquisition phase—to find the best align-ment between them [4]. The next step is tomatch the sample and template to obtain alevel of similarity between them, which theautomatic system will use to decide if bothimages correspond to the same person [5].

    All this processing, illustrated in Figure4, is performed on fingerprint images of500-dpi resolution, 8-bit gray scale and upto 280 x 512 pixels, acquired throughsweeping technology via the thermal fin-gerprint sensor FingerChip from AtmelCorp. and computed in the Xilinx Virtex-4XC4VLX25 FPGA device.

    System ArchitectureThe Virtex-4 FPGA device becomes thecomputational unit of the AFAS platform.Flash memory plays the role of system data-base, storing nonvolatile information likebitstreams as well as specific application datasuch as user fingerprint templates or config-uration settings of the biometric algorithm.The system also uses DDR-SDRAM mem-ory to temporarily store intermediate data orimages obtained in each processing stage.We implemented a serial communicationlink, in our case an RS-232 transceiver con-nected to a UART controller—the lattersynthesized in the resources of the FPGA—to use for debugging purposes, just to trans-fer the resulting image of each stage to a PCin order to visualize the fingerprint imagesor results of each step. Finally, a sweepingfingerprint sensor, addressed to capture thebiometric characteristic of the user, acts asinput of the recognition algorithm, asdepicted in Figure 2.

    Regarding the computation unit, theFPGA is detached in two regions, as shownin Figure 3: a static region occupied by a

    XCEL LENCE IN I SM

    Reg

    Reg

    Reg

    Reg

    BM BM

    BM

    BM

    BM

    BMPRR FIFO

    PRR FIFO

    INTs

    Cfg FIFO

    SelectMAPI/F

    AFAS I/F

    PLATFORMFLASH

    DDRSDRAM

    RS-232

    XILINX ML401 PLATFORM

    UART CONTROLLER

    MULTI-PORTMEMORY CONTROLLER

    INT CONTROLLER TIMER EXT MEMORY CONTROLLER

    BRAMLOCAL MEMORY

    LINEARFLASH

    FINGERPRINTSENSOR

    VIRTEX-4 XC4VLX25SYSTEM ON CHIP

    PRRFIFO

    FPGA

    ICAPI/F

    PARTIALLY RECONFIGURABLE REGION

    APPLICATION SPECIFIC HARDWARE COPROCESSORS

    PRR RECONFIGURATION CONTROLLER

    FPGA CONFIGURATION MEMORY

    MMU MST MMU SLV

    MICROBLAZENPI DXCL IXCL PLBV46

    PLBV46

    ILMB DLMB

    Figure 2 – System architecture and functional components breakdown of the suggested AFAS.

  • whole multiprocessor CoreConnect bus sys-tem; and a partially reconfigurable regionthat is used to place—on demand and mul-tiplexed in time as long as the processingadvances—the custom biometric coproces-sors or IP responsible for the different

    sequential tasks of the recognition algorithm.The multiprocessor CoreConnect bus sys-tem mainly comprises a MicroBlaze™processor and other standard peripheralsalong with a custom reconfiguration con-troller, this one linked to the ICAP port.

    All the processing tasks are enumeratedfrom 0 (static) to B in Figure 1, accordingto sequential execution order. Customhardware coprocessors implement all thetasks in the PRR, with the exception of thefingerprint acquisition process, which theMicroBlaze performs in software.

    The reason behind this specific hard-ware/software partitioning is that thesweeping sensor needs an integration timeof 5 milliseconds to acquire consecutiveslices. That’s enough time for it to performthe image reconstruction on the fly directlyin software under MicroBlaze control.Therefore, it is not necessary to implementthis image reconstruction with a customhardware coprocessor.

    The image acquisition consists of cap-turing 100 slices at a rate of 5 ms per slice,with each slice consisting of 280 x 8 pix-els. Software handles the reconstruction inreal time by detecting the overlapping ofrows of pixels between each two consecu-tives image slices.

    We implemented the rest of the tasks,however, as custom hardware coprocessorsin the PRR of the FPGA simply because ofreal-time constraints. Once the processingof each particular task is finished, the recon-figuration controller, located on the staticregion of the device and instructed by theMicroBlaze processor, replaces the coproces-sor currently instantiated in the PRR by theone corresponding to the next stage of thebiometric algorithm. The reconfigurationcontroller does this job by simply down-loading the new partial bitstream into thePRR and transferring this data directly fromDDR-SDRAM to the internal FPGA con-figuration memory via the ICAP interface.

    It is important to note that we used astandard interface based on FIFO memo-ries and flip-flop registers between the stat-ic and the reconfigurable regions. Thisallows us to develop standard biometriccoprocessors or IP placed in the PRR thatare totally independent of the multiproces-sor bus the system uses, be it AMBA®,CoreConnect, Wishbone or some other, asdepicted in Figure 2. This point is funda-mental in order to guarantee standardiza-tion and portability of the biometricalgorithm to different platforms.

    28 Xcell Journal Third Quarter 2010

    XCEL LENCE IN I