Techniques for Low Power Turbo Coding in Software Radio

Joe Antoon, Adam Barnett


Software Defined Radio

• Single transmitter for many protocols
• Protocols completely specified in memory
• Implementation:
  – Microprocessors
  – Field-programmable logic

Why Use Software Radio?

• Wireless protocols are constantly reinvented
  – 5 Wi-Fi protocols
  – 7 Bluetooth protocols
  – Proprietary mouse and keyboard protocols
  – Mobile phone protocol alphabet soup

• Custom DSP logic for each protocol is costly

So Why Not Use Software Radio?

• Requires high-performance processors
• Consumes more power

[Diagram: general-purpose processors (inefficient), application-specific logic (efficient), field-programmable logic (inefficient)]

Turbo Coding

• Channel coding technique
• Throughput nears the theoretical limit
• Great for bandwidth-limited applications
  – CDMA2000
  – WiMAX
  – NASA's MESSENGER probe

Turbo Coding Considerations

• Presents a design trade-off
• Turbo coding is computationally expensive
• But it reduces cost in other areas
  – Bandwidth
  – Transmission power

Reducing Power in Turbo Decoders

• FPGA turbo decoders
  – Use dynamic reconfiguration
• General-processor turbo decoders
  – Use a logarithmic number system

Generic Turbo Encoder

[Diagram: data stream s feeds one component encoder directly (parity p1) and a second component encoder through an interleaver (parity p2); the received streams are r, q1, and q2]

Generic Turbo Decoder

[Diagram: two component decoders exchanging soft information through an interleaver]

Decoder Design Options

• Multiple algorithms can be used to decode
• Maximum A-Posteriori (MAP)
  – Most accurate estimate possible
  – Complex computations required
• Soft-Output Viterbi Algorithm (SOVA)
  – Less accurate
  – Simpler calculations

FPGA Design Options

• Goal: make an adaptive decoder

[Diagram: the decoder takes received data and parity and outputs the original sequence; a tunable parameter trades low power / low accuracy against high power / high accuracy]

Component Encoder

• M blocks are 1-bit registers
• Memory provides the encoder state

[Diagram: two 1-bit registers (M, M) feed a generator function (GF)]

[Trellis diagram: encoder states 00, 01, 10, 11 over time, with transitions labeled by input bits 0 and 1]
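The component encoder above can be sketched in software. This is a minimal sketch of a rate-1/2 recursive systematic encoder with two 1-bit registers; the generator taps used here are illustrative assumptions, not the ones from the slides.

```python
# Sketch of a component encoder with two 1-bit memory registers
# (the two M blocks). The generator-function taps are assumed.
def component_encode(bits):
    m1 = m2 = 0                       # encoder state: two 1-bit registers
    out = []
    for s in bits:
        feedback = s ^ m2             # generator function (assumed taps)
        parity = feedback ^ m1 ^ m2   # parity output (assumed taps)
        out.append((s, parity))       # systematic bit + parity bit
        m1, m2 = feedback, m1         # shift the new value into the registers
    return out
```

Each input bit produces the systematic bit s and one parity bit, matching the s/p1 structure of the generic encoder diagram.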

Viterbi’s Algorithm

• Determine the most likely output
• Simulate the encoder state given the received values

[Trellis diagram: states s0, s1, s2 over time, with received value/parity pairs (r0, p0), (r1, p1), (r2, p2) and decisions d0, d1, d2]

Viterbi’s Algorithm

• Write: compute the branch metric (likelihood)
• Traceback: compute the path metric, output data
• Update: compute the distance between paths

• Rank paths by path metric and choose the best
• For N memory bits:
  – Must calculate 2^(N-1) paths for each state

Adaptive SOVA

• SOVA: inflexible path system scales poorly
• Adaptive SOVA: heuristic
  – Limit to at most M paths
  – Discard a path if its metric falls below threshold T
  – Discard all but the top M paths when too many survive
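The Adaptive SOVA pruning heuristic can be sketched as a small routine. This assumes paths are (metric, bits) pairs and that a higher metric is better; both are modeling assumptions, not details from the slides.

```python
import heapq

# Sketch of Adaptive SOVA path pruning: keep at most M candidate
# paths, and drop any path whose metric falls below threshold T.
def prune_paths(paths, M, T):
    survivors = [p for p in paths if p[0] >= T]   # threshold discard
    if len(survivors) > M:                        # too many: keep the top M
        survivors = heapq.nlargest(M, survivors, key=lambda p: p[0])
    return survivors
```

Both rules cap the work per trellis step, which is what lets the adaptive decoder trade accuracy for power.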

Implementing in Hardware

[Block diagram: inputs q and r feed a Branch Metric Unit, then an Add-Compare-Select unit and Survivor Memory, coordinated by a Control unit]

Implementing in Hardware

• Controller
  – Controls memory
  – Selects paths
• Branch Metric Unit
  – Computes likelihood
  – Considers all possible "next" states
• Add, Compare, Select
  – Appends to the path metric
  – Discards paths
• Survivor Memory
  – Stores / discards path bits

Implementing in Hardware

Add, Compare, Select Unit

[Block diagram: present-state path values and branch values are combined (compute, compare) into next-state path values; the path distance is compared against threshold T to discard paths]

Dynamic Reconfiguration

• Bit Error Rate (BER)
  – Changes with signal strength
  – Changes with the number of paths used

• Change hardware at runtime
  – Weak signal: use many paths, preserve accuracy
  – Strong signal: use few paths, save power
  – Sample the SNR every 250k bits and reconfigure
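The runtime policy above can be sketched as a simple decision rule. The SNR cutoffs and path counts below are illustrative assumptions; only the 250k-bit sampling interval comes from the slides.

```python
# Sketch of the dynamic reconfiguration policy: sample the SNR
# every 250k bits and choose a path budget for the decoder.
SAMPLE_INTERVAL = 250_000  # bits between SNR samples (from the slides)

def choose_num_paths(snr_db):
    if snr_db < 2.0:       # weak signal: many paths, favor accuracy
        return 16
    elif snr_db < 6.0:     # moderate signal: middle ground
        return 8
    else:                  # strong signal: few paths, save power
        return 4
```

In the FPGA design, changing the path budget means loading a different decoder configuration rather than changing a variable.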


Experimental Results

• K (the number of encoder bits) is proportional to average speed and power

Experimental Results

• FPGA decoding has much higher throughput
  – Due to parallelism

Experimental Results

• ASOVA performs worse than commercial cores
• However, it is much better in other metrics
  – Power
  – Memory usage
  – Complexity

Future Work

• Use the present reconfiguration means to design
  – Partial reconfiguration
  – Dynamic voltage scaling

• Compare to power-efficient software methods

Power-Efficient Implementation of a Turbo Decoder in SDR System

• Turbo coding systems are created using one of three general processor types
  – Fixed Point (FXP)
    • Cheapest, simplest to implement, fastest
  – Floating Point (FLP)
    • More precision than fixed point
  – Logarithmic Number System (LNS)
    • Simplifies complex operations
    • Complicates simple add/subtract operations

Logarithmic Number System

• X = {s, x = log_b(|X|)}
  – s = sign bit; the remaining bits hold the number's value
• Example
  – Let b = 2
  – Then the decimal number 8 is represented as log_2(8) = 3
  – Numbers are stored in computer memory in 2's complement form (3 = 01111101) (sign bit = 0)

Why use Logarithmic System?

• Greatly simplifies multiplication, division, roots, and exponents
  – Multiplication simplifies to addition
    • E.g. 8 * 4 = 32; in LNS, 3 + 2 = 5 (2^5 = 32)
  – Division simplifies to subtraction
    • E.g. 8 / 4 = 2; in LNS, 3 – 2 = 1 (2^1 = 2)

Why use Logarithmic System?

• Roots are done as right shifts
  – E.g. sqrt(16) = 4; in LNS, 4 shifted right = 2 (2^2 = 4)
• Exponents are done as left shifts
  – E.g. 8^2 = 64; in LNS, 3 shifted left = 6 (2^6 = 64)
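The four simplified operations above can be sketched together. This is a minimal sketch with base b = 2, using floating-point logs for clarity; real LNS hardware would use fixed-point exponents, where halving and doubling are literal shifts.

```python
from math import log2

# Sketch of LNS arithmetic, base b = 2: a value X is stored as
# x = log2(|X|), so multiply/divide become add/subtract, and
# square root / squaring become shifts of the stored exponent.
def to_lns(v):      return log2(v)
def lns_mul(x, y):  return x + y     # 8 * 4  ->  3 + 2 = 5
def lns_div(x, y):  return x - y     # 8 / 4  ->  3 - 2 = 1
def lns_sqrt(x):    return x / 2     # right shift of the exponent
def lns_sq(x):      return x * 2     # left shift of the exponent
def from_lns(x):    return 2.0 ** x
```

For example, `from_lns(lns_mul(to_lns(8), to_lns(4)))` reproduces 8 * 4 = 32 with a single addition.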

So why not use LNS for all processors?

• Unfortunately, addition and subtraction are greatly complicated in LNS
  – Addition: log_b(|X| + |Y|) = x + log_b(1 + b^z)
  – Subtraction: log_b(|X| – |Y|) = x + log_b(1 – b^z)
  – Where z = y – x
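The LNS addition identity above can be checked directly in a short sketch (base b = 2). The log2(1 + 2^z) correction term is exactly what real LNS hardware reads from a lookup table.

```python
from math import log2

# Sketch of LNS addition: with x = log2(|X|) and y = log2(|Y|),
# log2(|X| + |Y|) = x + log2(1 + 2**z), where z = y - x.
def lns_add(x, y):
    z = y - x
    return x + log2(1.0 + 2.0 ** z)  # correction term: table lookup in hardware
```

For example, adding 8 and 4 in LNS gives `lns_add(3, 2)`, which equals log2(12), as expected.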

• Turbo coding/decoding is computationally intense, requiring more multiplies, divides, roots, and exponentials than adds or subtracts

Turbo Decoder block diagram


• Each bit decision requires a subtraction, a table lookup, and an addition

Proposed new block diagram

• As the difference between e^a and e^b grows, the error between the value stored in the lookup table and the exact computation becomes negligible

• For this simulation, a difference of d > 5 was used
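The thresholded approximation can be sketched as follows. This uses the standard identity max*(a, b) = max(a, b) + ln(1 + e^(-|a-b|)); the code mirrors the hardware behavior of skipping the lookup when the difference exceeds 5, where the correction term is below 0.007.

```python
from math import exp, log

# Sketch of the thresholded max* operation. In hardware, the
# correction term comes from an SRAM lookup table; for d > 5
# the table is bypassed and 0 is added instead.
def max_star(a, b, d_threshold=5.0):
    d = abs(a - b)
    if d > d_threshold:
        return max(a, b)                    # SRAM disabled: add 0
    return max(a, b) + log(1.0 + exp(-d))   # table-lookup path
```

Since ln(1 + e^(-5)) is about 0.0067, the skipped correction changes the metric only negligibly while saving the SRAM read.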

How it works

• For d > 5:
  – The new mux (on the right) ignores the SRAM input and simply adds 0 to the MAX result
  – The pre-decoder circuitry disables the SRAM to conserve power

Comparing the 3 simulations

• Comparisons were made between a 16-bit fixed-point microcontroller, a 16-bit floating-point processor, and a 20-bit LNS processor

• 11 bits would be sufficient for FXP and FLP, but 16-bit processors are much more common

• Similarly, 17 bits would suffice for the LNS processor, but 20-bit is a common width

Power Consumption

Latency

• Recall: Max*(a,b) = ln(e^a+e^b)

Power savings

• The pre-decoder circuitry adds 11.4% power consumption compared to an SRAM read

• So when an SRAM read is required, the modified system uses 111.4% of the unmodified system's power

• However, when the SRAM is blocked, it uses only 11.4% of the power used before

Power savings

• The CACTI simulations for the system reported that the Max* operation accounted for 40% of all operations in the decoder

• The modified Max* operations saved 69% of the power compared to the unmodified system

• This leads to an overall power savings of 69% * 40% = 27.6%

Conclusion

• Turbo codes are computationally intense, requiring more complex operations than simple ones

• LNS processors simplify complex operations at the expense of making adding and subtracting more difficult

Conclusion

• Using an LNS processor with slight modifications can reduce power consumption by 27.6%

• Overall latency is also reduced, due to the ease of complex operations in an LNS processor compared to FXP or FLP processors
