Viterbi Algorithm
Advanced Architectures
Lecture 15
6.973 Communication System Design – Spring 2006
Massachusetts Institute of Technology
Vladimir Stojanović
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Radix-2 ACS
[Figure panels: radix-2 trellis; 2-way ACS; radix-2 ACS unit]
[Figure labels: branch metrics λ_n for branches 0x0, 0x1, 1x0, 1x1; previous state metrics Γ_{n-1} for states 0x and 1x; updated state metrics Γ_n and decisions d_n for states x0 and x1; each 2-way ACS adds, compares, and selects]
$\Gamma_n^{x0} = \min\left(\Gamma_{n-1}^{0x} + \lambda_n^{0x0},\ \Gamma_{n-1}^{1x} + \lambda_n^{1x0}\right)$
Figure by MIT OpenCourseWare.
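A 2-way ACS adds a branch metric to each of the two candidate predecessor state metrics, compares the two sums, and selects the smaller one together with a one-bit decision. The following is a minimal Python sketch of that operation; the function name `acs2`, its argument names, and the tie-break toward the first candidate are assumptions made for illustration, not part of the lecture.

```python
def acs2(gamma_0x, gamma_1x, lam_0x, lam_1x):
    """2-way add-compare-select (illustrative sketch).

    gamma_0x, gamma_1x: state metrics of the two predecessor states.
    lam_0x, lam_1x: branch metrics of the two incoming branches.
    Returns (new_state_metric, decision); decision 0 selects the first
    predecessor, 1 the second. Ties go to the first (an assumption).
    """
    cand0 = gamma_0x + lam_0x   # add
    cand1 = gamma_1x + lam_1x   # add
    if cand0 <= cand1:          # compare
        return cand0, 0         # select
    return cand1, 1


metric, decision = acs2(5, 3, 2, 6)
```

The decision bit is what a decoder would store for later trace-back; the new metric feeds the next iteration.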
Radix-4 trellis
[Figure panels: 8-state radix-2 trellis (stages n-2, n-1, n); 4-state subtrellis; 8-state radix-4 trellis (stages n-2, n); states numbered 0 to 7]
Figure by MIT OpenCourseWare.

Radix-2^k complexity and speed measures

k   Radix   Ideal speedup   Complexity increase   Area efficiency
1   2       1               1                     1
2   4       2               2                     1
3   8       3               4                     0.75
4   16      4               8                     0.5

Figure by MIT OpenCourseWare.
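The radix-2^k measures can be reproduced with a few lines of Python. This is only a sketch: it assumes that the ideal speedup equals k, that the complexity increase follows 2^(k-1) (a pattern read off the tabulated values 1, 2, 4, 8, not a formula stated on the slide), and that area efficiency is the ratio of the two. The function name `radix_measures` is hypothetical.

```python
def radix_measures(k):
    """Speed and complexity measures for a radix-2^k ACS (illustrative)."""
    speedup = k                     # k trellis steps per iteration
    complexity = 2 ** (k - 1)       # assumed growth pattern, see lead-in
    return {
        "radix": 2 ** k,
        "ideal_speedup": speedup,
        "complexity_increase": complexity,
        "area_efficiency": speedup / complexity,
    }


table = [radix_measures(k) for k in range(1, 5)]
for row in table:
    print(row)
```

Under these assumptions radix-4 is the highest radix that keeps the area efficiency at 1.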
Radix-4 ACS
[Figure panels: radix-4 trellis; 4-way ACS; radix-4 ACS unit]
[Figure labels: sixteen radix-4 branch metrics λ_n with four-bit labels 0000 through 1111; previous state metrics Γ_{n-2}; updated state metrics Γ_n and decisions d_n for states x00, x01, x10, x11; four 4-way ACS blocks, each adding, comparing, and selecting]
Figure by MIT OpenCourseWare.
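A radix-4 ACS collapses two radix-2 trellis steps into a single 4-way add-compare-select over summed branch metrics. The sketch below checks numerically that both give the same state metrics. It assumes a shift-register trellis in which the new input bit enters as the least significant state bit, and it uses random integers as stand-in branch metrics; the function names and the indexing `bm[p][b]` (metric for leaving state `p` on input bit `b`) are illustrative assumptions.

```python
import random


def radix2_step(metrics, bm):
    """One radix-2 ACS iteration over all states."""
    n = len(metrics)
    new = []
    for t in range(n):
        preds = (t >> 1, (t >> 1) | (n >> 1))   # the two predecessors of t
        new.append(min(metrics[p] + bm[p][t & 1] for p in preds))
    return new


def radix4_step(metrics, bm1, bm2):
    """One radix-4 ACS iteration: 4-way select over two-step path metrics."""
    n = len(metrics)
    v = n.bit_length() - 1                      # number of state bits
    new = []
    for t in range(n):
        b1, b0 = (t >> 1) & 1, t & 1            # input bits of the two steps
        cands = []
        for q in range(4):                      # two oldest bits of the ancestor
            p = (t >> 2) | (q << (v - 2))       # state two steps back
            mid = ((p << 1) | b1) & (n - 1)     # intermediate state
            cands.append(metrics[p] + bm1[p][b1] + bm2[mid][b0])
        new.append(min(cands))
    return new


random.seed(0)
N = 8
old = [random.randrange(16) for _ in range(N)]
bm1 = [[random.randrange(15) for _ in range(2)] for _ in range(N)]
bm2 = [[random.randrange(15) for _ in range(2)] for _ in range(N)]
two_steps = radix2_step(radix2_step(old, bm1), bm2)
one_step = radix4_step(old, bm1, bm2)
```

The radix-4 unit needs the sums `bm1 + bm2` precomputed, which is the job of a radix-4 branch metric unit.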
Radix-4 ACS implementation
• Use ripple-carry adders and comparators
  • Take advantage of the ripple profile to hide the compare
• Delay is 17% longer than the 2-way ACS due to
  • Increased adder fanout
  • 4:1 mux instead of 2:1 mux
• Overall, this results in a 1.7x speedup compared to the 2-way ACS (two trellis steps per iteration at 1.17x the delay: 2/1.17 ≈ 1.7)
Figure from Black, P. J., and T. H. Meng. "A 140-Mb/s, 32-state, Radix-4 Viterbi Decoder." IEEE Journal of Solid-State Circuits 27 (1992): 1877-1885. Copyright 1992 IEEE. Used with permission.
Image removed due to copyright restrictions.
Radix-4 placement and routing
• Paths given as Hamiltonian cycles (each cycle visits every node in the graph exactly once)
Images removed due to copyright restrictions.
Modulo arithmetic for ACS
• The Viterbi algorithm inherently bounds the maximum dynamic range ∆max of the state metrics
  • ∆max ≤ λmax log2 N (N: number of states, λmax: maximum branch metric of the radix-2 trellis)
• Number theory
  • Given two numbers a and b such that |a-b| < ∆
  • The comparison of a and b can be evaluated from (a-b) mod 2∆ without ambiguity
• Hence state metrics can be updated and compared modulo 2∆max
  • Choose the state metric precision to implement the modulo by ignoring the state metric overflow
  • Required state metric precision is equal to twice the maximum dynamic range of the updated state metrics
  • Required number of bits is Γbits = ceil[log2(∆max + kλmax)] + 1
    • k accounts for the branch metric addition
• Example values (for the 32-state radix-4 decoder)
  • k=2, λmax=14, ∆max=70, Γbits=8
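The example values can be checked directly, and the modulo comparison can be illustrated with wrapped integers. The sketch below is one possible reading, not the lecture's implementation: the function names are hypothetical, and the comparison is realized as a sign-bit test on the wrapped difference, which is valid only while the true difference stays below half the modulus.

```python
import math


def state_metric_bits(num_states, lam_max, k):
    """Dynamic range bound and state metric width from the formulas above."""
    delta_max = lam_max * math.log2(num_states)
    bits = math.ceil(math.log2(delta_max + k * lam_max)) + 1
    return delta_max, bits


def mod_less_than(a, b, bits):
    """True if a < b for metrics stored modulo 2**bits.

    Assumes the true (unwrapped) metrics differ by less than 2**(bits - 1).
    """
    diff = (a - b) & ((1 << bits) - 1)   # wrapped difference
    return (diff >> (bits - 1)) == 1     # sign bit set means a - b < 0


delta_max, bits = state_metric_bits(num_states=32, lam_max=14, k=2)

# True metrics 250 and 260 wrap to 250 and 4 in an 8-bit register.
wrapped_a, wrapped_b = 250 % 256, 260 % 256
a_is_smaller = mod_less_than(wrapped_a, wrapped_b, bits)
```

Because the overflow is simply discarded, no rescaling or normalization step is needed in the ACS loop.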
Branch metric unit
• Example: 8-level soft input, R=1/2, K=6 (32 states)
  • λ(S1S2) = |G1-S1| + |G2-S2| (G: received sample, S: expected sample), λmax = 14
  • 4 bits required for the radix-2 branch metrics
  • 5 bits for the radix-4 branch metrics
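The bit widths follow from the metric definition, as the short Python sketch below illustrates. It assumes that the expected samples for coded bits 0 and 1 sit at the extreme levels 0 and 7 of the 8-level input (the slide does not state the mapping), and the function names are hypothetical.

```python
def radix2_branch_metrics(g1, g2, levels=8):
    """Metrics |G1-S1| + |G2-S2| for the four expected symbol pairs.

    Assumes expected sample 0 for coded bit 0 and levels-1 for coded bit 1.
    """
    hi = levels - 1
    return {(s1, s2): abs(g1 - s1 * hi) + abs(g2 - s2 * hi)
            for s1 in (0, 1) for s2 in (0, 1)}


def radix4_branch_metrics(first, second):
    """Sum the radix-2 metrics of two consecutive steps (16 combinations)."""
    return {(a, b): first[a] + second[b] for a in first for b in second}


# Worst case over all received sample pairs.
lam_max = max(max(radix2_branch_metrics(g1, g2).values())
              for g1 in range(8) for g2 in range(8))
radix2_bits = lam_max.bit_length()

worst_radix4 = radix4_branch_metrics(radix2_branch_metrics(7, 7),
                                     radix2_branch_metrics(0, 0))
radix4_bits = max(worst_radix4.values()).bit_length()
```

With these assumptions the radix-2 metrics span 0 to 14 and the radix-4 sums span 0 to 28, which gives the 4-bit and 5-bit widths.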
Images removed due to copyright restrictions.
State-metric initialization
• Need to start from the right state metrics for the dynamic range bound to hold (and for modulo arithmetic to be valid)
  • This is because the trellis structure imposes constraints on the state metric values
  • For example, state 0 and state 1 have a common ancestor state one iteration back
    • This constrains their state metrics to differ by at most λmax
• Find the right initial metrics through simulation (with all-zero inputs) until steady state is reached
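The simulation idea can be sketched on a small code. The example below is illustrative only: it assumes a 4-state rate-1/2 code with generators 7 and 5 (octal) rather than the 32-state code of the lecture, expected soft levels 0 and 7, all-zero received samples, and an all-zero starting guess for the metrics. All names are hypothetical.

```python
LEVEL_SPAN = 7    # distance between the two expected soft levels (assumed)
NUM_STATES = 4    # 4-state example code, generators 7 and 5 octal (assumed)


def branch_metric(p, b):
    """Metric for leaving state p on input bit b with all-zero received samples."""
    newer, older = p & 1, p >> 1
    coded_ones = (b ^ newer ^ older) + (b ^ older)
    return LEVEL_SPAN * coded_ones


def steady_state_metrics(max_iters=100):
    """Iterate the ACS recursion until the normalized metrics stop changing."""
    metrics = [0] * NUM_STATES
    for _ in range(max_iters):
        new = [min(metrics[p] + branch_metric(p, t & 1)
                   for p in (t >> 1, (t >> 1) | (NUM_STATES >> 1)))
               for t in range(NUM_STATES)]
        low = min(new)
        new = [m - low for m in new]     # remove the common offset
        if new == metrics:
            return metrics
        metrics = new
    raise RuntimeError("no steady state reached")


initial_metrics = steady_state_metrics()
```

For this small code the metrics of states 0 and 1 end up differing by exactly λmax = 14, consistent with the common-ancestor constraint.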
Figure removed due to copyright restriction.
Decoder block diagram
Figure by MIT OpenCourseWare.
[Figure: decoder block diagram. Blocks: two radix-2 BMUs, radix-4 branch metric unit, radix-4 ACS array, radix-4 pretrace-back unit, decision memory, radix-16 trace-back unit, LIFO. Signal labels: decoder inputs G1, G2 (2 x 3); radix-2 metrics (4 x 4); radix-4 metrics (16 x 5); radix-4 decisions (32 x 2); radix-16 decisions (32 x 4); decisions (4); decoder output]
Decision memory organization
• Survivor paths always merge L=5K steps back
Figure by MIT OpenCourseWare.
[Figure labels: decision vector; write region with write block; read region with survivor path merge block and decode block, traversed by trace-back; block lengths D, D, and L]
Advanced algorithmic transformations
• Sliding block Viterbi decoder (SBVD)
  • Based on two important observations (µ+1 = constraint length of the code)
    1. Survivor paths merge L=5µ iterations back into the trellis
    2. After K=5µ steps, the state metrics are independent of their initial values
  • The unknown state at time n can be decoded using only information from the block [n-K, n+L]
• Cannot store all the values in memory
  • Have to obtain them "on the fly"
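The block-based decoding idea can be tried on a toy example. The sketch below decodes the bit at time n from the window [n-K, n+L] only, starting from all-equal metrics and tracing back from an arbitrary end state. It is a simplified illustration under stated assumptions: a 4-state rate-1/2 code with generators 7 and 5 (octal), hard-decision Hamming branch metrics, a noise-free channel, and K = L = 5µ = 10. All function names are hypothetical.

```python
import random

MU = 2                      # state bits; constraint length is MU + 1
NUM_STATES = 1 << MU


def outputs(p, b):
    """Coded bit pair for leaving state p on input bit b (generators 7, 5)."""
    newer, older = p & 1, p >> 1
    return (b ^ newer ^ older, b ^ older)


def encode(bits):
    state, coded = 0, []
    for b in bits:
        coded.append(outputs(state, b))
        state = ((state << 1) | b) & (NUM_STATES - 1)
    return coded


def block_decisions(block):
    """Forward ACS over one block, starting from all-equal state metrics."""
    metrics, decisions = [0] * NUM_STATES, []
    for r in block:
        new, dec = [], []
        for t in range(NUM_STATES):
            cands = []
            for d in (0, 1):
                p = (t >> 1) | (d << (MU - 1))
                o = outputs(p, t & 1)
                cands.append(metrics[p] + (o[0] != r[0]) + (o[1] != r[1]))
            d = 0 if cands[0] <= cands[1] else 1
            new.append(cands[d])
            dec.append(d)
        metrics = new
        decisions.append(dec)
    return decisions


def decode_at(received, n, warm, tail, start_state=0):
    """Decode the input bit at time n from received[n-warm : n+tail+1] only."""
    lo = max(0, n - warm)
    decisions = block_decisions(received[lo:n + tail + 1])
    state = start_state                       # arbitrary state at the block end
    for dec in reversed(decisions[n - lo + 1:]):
        state = (state >> 1) | (dec[state] << (MU - 1))
    return state & 1                          # newest bit of the state at time n


random.seed(1)
bits = [random.randint(0, 1) for _ in range(60)]
received = encode(bits)                       # noise-free channel (assumed)
decoded = [decode_at(received, n, 5 * MU, 5 * MU) for n in range(50)]
```

Each call to `decode_at` is independent of the others, which is what allows blocks to be processed in parallel.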
SBVD implementation
• Can find the shortest path by running forward or backward
• At step m
  • Forward processing
    • 4 survivors
  • Backward processing
    • 4 shortest paths
  • Combined
    • Smallest concatenated state metric
    • Starting state for trace-back of the shortest path
• Continue forward and backward operation
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
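Combining the forward and backward passes at step m amounts to adding the two metric vectors state by state and taking the minimum. A minimal Python sketch follows; the function name and the example metric values are made up for illustration.

```python
def best_meeting_state(forward_metrics, backward_metrics):
    """Pick the state with the smallest concatenated (forward + backward) metric.

    Returns (state, total_metric); the state is where trace-back of the
    shortest path would start in both directions.
    """
    totals = [f + b for f, b in zip(forward_metrics, backward_metrics)]
    best = min(range(len(totals)), key=totals.__getitem__)
    return best, totals[best]


# Hypothetical metrics of the 4 forward survivors and 4 backward paths.
state, total = best_meeting_state([3, 1, 4, 2], [2, 5, 0, 1])
```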
Forward vs. Forward-Backward
• Can decode more than one state (M states)
• Fw-Bw has reduced decoding delay and skew buffer memory
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
Continuous stream processing
• Cut the incoming stream into overlapping chunks
• Process in parallel
• Outputs are non-overlapping
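The chunking can be sketched as a window generator: each block reads extra samples before and after the span it decodes, so inputs overlap while outputs tile the stream. The parameter names (`m` decoded steps per block, `warm` samples before, `tail` samples after) and the function name are assumptions for illustration.

```python
def sbvd_windows(stream_len, m, warm, tail):
    """Yield (in_start, in_end, out_start, out_end) per block, ends exclusive."""
    for out_start in range(0, stream_len, m):
        out_end = min(out_start + m, stream_len)
        in_start = max(0, out_start - warm)        # overlap with previous block
        in_end = min(stream_len, out_end + tail)   # overlap with next block
        yield in_start, in_end, out_start, out_end


windows = list(sbvd_windows(stream_len=50, m=20, warm=10, tail=10))
```

Since no block depends on the result of another, the windows can be handed to separate decoder instances.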
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
Systolic SBVD architecture
• Since the block is bounded
  • Can unroll and pipeline
    • ACS
    • Trace-back
• M=2L
  • Has to be a multiple of L
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
Example for L=2
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
ACS units
Figures from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
A 64-state example
• Not fully parallel (8 radix-4 ACS units)
  • 2 radix-4 butterflies in each cycle (4 ACS units per butterfly)
  • 8 cycles to update all 64 states (64 states / 8 ACS units) for one radix-4 iteration (i.e., two radix-2 steps)
Images removed due to copyright restrictions.
Readings
• [1] A.P. Hekstra, "An alternative to metric rescaling in Viterbi decoders," IEEE Transactions on Communications, vol. 37, no. 11, pp. 1220-1222, 1989.
• [2] P.J. Black and T.H. Meng, "A 140-Mb/s, 32-state, radix-4 Viterbi decoder," IEEE Journal of Solid-State Circuits, vol. 27, no. 12, pp. 1877-1885, 1992.
• [3] P.J. Black and T.Y. Meng, "A 1-Gb/s, four-state, sliding block Viterbi decoder," IEEE Journal of Solid-State Circuits, vol. 32, no. 6, pp. 797-805, 1997.
• [4] M. Anders, S. Mathew, R. Krishnamurthy and S. Borkar, "A 64-state 2GHz 500Mbps 40mW Viterbi accelerator in 90nm CMOS," 2004 Symposium on VLSI Circuits, Digest of Technical Papers, pp. 174-175, 2004.