The implementation challenges of polar codes
Robert G. Maunder
CTO, AccelerComm
February 2018
Abstract
Although polar codes are a relatively immature channel coding technique with no previous standardised applications, they have been selected by the 3rd Generation Partnership Project (3GPP) to provide error correction in the New Radio (NR) standard for 5th Generation (5G) mobile communications. The hardware acceleration of polar encoding and decoding will be necessary in order to meet the strict requirements of many 5G applications. However, the processes of polar encoding and decoding are complicated and it is not trivial to translate them into hardware. This white paper provides a tutorial on the polar encoding and decoding processes, before discussing the challenges of their hardware implementation.
I. INTRODUCTION
In mobile communication, channel coding may be used to protect information against the effects of transmission errors, which may be caused by noise, interference or poor signal strength. More specifically, a channel encoder is used to encode the information in the transmitting device, which may be a basestation, a handset or another user device. This allows a corresponding channel decoder to be used in the receiving device, in order to mitigate the transmission errors and recover the transmitted information.
In recent decades, several high-performance channel codes have been developed, which allow information to be reliably transmitted at rates that closely approach the theoretical limit that is imposed by the channel capacity. Specifically, turbo codes have been used in 3rd Generation (3G) and 4th Generation (4G) mobile communication standards, while Low Density Parity Check (LDPC) codes have been adopted in WiFi and satellite standards. More recently, polar codes [1] have emerged, offering particularly strong error correction performance for short messages. However, polar codes are much less mature than turbo and LDPC codes, having no previous standardised applications.
At the time of writing, the 3rd Generation Partnership Project (3GPP) is defining the so-called New Radio (NR) standard [2], as a candidate for 5th Generation (5G) mobile communication. Here, polar codes have been selected to provide channel coding in the control channel of the enhanced Mobile BroadBand (eMBB) applications of NR, as well as in the Physical Broadcast Channel (PBCH). Polar codes have also been identified as a candidate to provide channel coding for the data and control channels of the Ultra Reliable Low Latency Communication (URLLC) and massive Machine Type Communication (mMTC) applications of NR.
In addition to setting a strict requirement for ultra-reliable error correction, 5G imposes a requirement for the error correction to be completed quickly, with a lower latency than in 3G or 4G. Owing to this, many 5G applications will require polar encoding and decoding to be implemented using high-performance hardware acceleration, which must consume minimal hardware resources and power.
Sections II and III of this white paper provide tutorials for the algorithms that underpin the processes of polar encoding and decoding, respectively. Following this, Section IV discusses the challenges of implementing these algorithms in hardware. Finally, we offer some concluding remarks in Section V.
© AccelerComm 2018 www.accelercomm.com
ACCELERCOMM WHITE PAPER: THE IMPLEMENTATION CHALLENGES OF POLAR CODES
II. POLAR ENCODER
A polar encoder comprises three successive components, namely information block conditioning, the polar encoder kernel and encoded block conditioning, as shown in Figure 1. These components are discussed in the following paragraphs.
The input to the information block conditioning component may be referred to as an information block, which comprises K information bits, where K may be referred to as the information block size. The information block conditioning component interlaces the K information bits with N − K redundant bits, which may be frozen bits [1], Cyclic Redundancy Check (CRC) bits [3] and/or Parity Check (PC)-frozen bits [4] in the NR polar code. Here, frozen bits always adopt a value of 0, while CRC and PC-frozen bits adopt values that are obtained as functions of the information bits. The information block conditioning component generates the redundant bits and interlaces them into positions that are identified by a prescribed method, which is also known to the polar decoder. Furthermore, the information block conditioning component additionally performs code block segmentation, interleaving and scrambling operations in the NR polar code, as shown in Figures 6–8. The output of the information block conditioning component may be referred to as a kernel information block, which comprises N kernel information bits, where N may be referred to as the kernel block size. Here, the information block conditioning must be completed such that N is a power of 2 that is greater than K. In the NR polar code, N may adopt values of up to Nmax = 1024.
Fig. 1: Top-level schematic of a polar encoder and decoder.
The input to the polar encoder kernel is a kernel information block and its output may be referred to as a kernel encoded block, which comprises N kernel encoded bits. The operation of the polar encoder kernel may be illustrated by a polar code graph representation, which is exemplified in Figure 2. Here, the symbol ⊕ represents a binary eXclusive-OR (XOR) operation. Note that the graph comprises N inputs on its left edge and N outputs on its right edge, corresponding to the N kernel information bits and the N kernel encoded bits, respectively. The graph comprises log2(N) stages, each of which comprises N/2 vertically aligned XORs, giving a total of log2(N) · N/2 XORs. Note that there are data dependencies between successive stages, which enforce a left-to-right processing schedule. More specifically, the data dependencies prevent the computation of the XORs in a particular stage until after the XORs in the stage to its left have been computed.
Fig. 2: Polar code graphs for N ∈ {2, 4, 8}.

Note that successive graph representations have recursive relationships. More specifically, the graph representation for a polar encoding kernel operation having a kernel block size of N = 2 comprises a single stage, containing a single XOR. The first of the N = 2 kernel encoded bits is obtained as the XOR of the N = 2 kernel information bits, while the second kernel encoded bit is equal to the second kernel information bit. For greater kernel block sizes N, the graph representation may be considered to be a vertical concatenation of two graph representations for a kernel block size of N/2, followed by an additional stage of XORs, as shown in Figure 2. In analogy with the N = 2 kernel described above, the first N/2 of the N kernel encoded bits are obtained as XORs of corresponding bits from the outputs of the two N/2 kernels, while the second N/2 of the kernel encoded bits are equal to the output of the second N/2 kernel.
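The recursive relationship described above may be sketched in software as follows. This is a minimal illustration rather than a hardware design, and the function name `polar_kernel_encode` is our own choice for this sketch.

```python
def polar_kernel_encode(u):
    """Recursively apply the polar encoder kernel to a list of N bits (N a power of 2)."""
    n = len(u)
    if n == 1:
        return list(u)  # a single bit passes through unchanged
    half = n // 2
    top = polar_kernel_encode(u[:half])     # output of the first N/2 kernel
    bottom = polar_kernel_encode(u[half:])  # output of the second N/2 kernel
    # Additional stage: the first half is the XOR of the two outputs,
    # while the second half equals the output of the second N/2 kernel.
    return [a ^ b for a, b in zip(top, bottom)] + bottom


print(polar_kernel_encode([0, 1]))  # N = 2 kernel: [0 XOR 1, 1] -> [1, 1]
print(polar_kernel_encode([0, 0, 0, 1, 0, 0, 0, 1]))  # -> [0, 0, 0, 0, 1, 1, 1, 1]
```

Applying the kernel twice returns the original bits, which offers a convenient self-check when experimenting with this sketch.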
The input to the encoded block conditioning component of the polar encoder is a kernel encoded block and its output may be referred to as an encoded block, which comprises M encoded bits, where M may be referred to as the encoded block size. The resultant polar coding rate is given by R = K/M, where the encoded block conditioning must be completed such that M is greater than K, although M may be higher or lower than N. The encoded block conditioning component may use various techniques to generate the M encoded bits. More specifically, repetition [5] may be used to repeat some of the N bits in the kernel encoded block, while shortening or puncturing techniques [5] may be used to remove some of the N bits in the kernel encoded block. Note that shortening removes bits that are guaranteed to have values of 0, while puncturing removes bits that may have values of either 0 or 1. In addition to this rate matching operation, the encoded block conditioning component also performs sub-block interleaving, channel interleaving and code block concatenation operations in the NR polar code, as shown in Figures 6–8. Following polar encoding, the encoded block may be provided to a modulator, which transmits it over a communication channel.
The complete polar encoding process is exemplified in Figure 3, for the case where a particular arrangement of frozen bits is used to convert the K = 4 information bits [1001] into the M = 8 encoded bits [00001111].
Fig. 3: Example polar encoding process, using the N = 8 polar code graph, illustrating the case where a particular arrangement of frozen bits is used to convert the K = 4 information bits [1001] into the M = 8 encoded bits [00001111].
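The example of Figure 3 may be reproduced with the short sketch below, which interlaces frozen bits with the information bits and then processes the polar code graph one stage at a time. The function names are hypothetical, and the information bit positions {3, 5, 6, 7} are an assumption: the text does not list them, but they are consistent with the input and output bits quoted for this example. No CRC bits, PC-frozen bits or rate matching are modelled, so M = N = 8.

```python
def interlace_frozen(info_bits, info_positions, n):
    """Place the K information bits at info_positions; every other position is frozen to 0."""
    kernel_info = [0] * n
    for bit, pos in zip(info_bits, info_positions):
        kernel_info[pos] = bit
    return kernel_info


def kernel_encode_stages(bits):
    """Process the polar code graph stage by stage, from left to right."""
    x = list(bits)
    n = len(x)
    step = 1
    while step < n:  # log2(N) stages in total
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                x[i] ^= x[i + step]  # one of the N/2 XORs in this stage
        step *= 2
    return x


INFO_POSITIONS = [3, 5, 6, 7]  # assumption, not stated in the text

kernel_info = interlace_frozen([1, 0, 0, 1], INFO_POSITIONS, 8)
encoded = kernel_encode_stages(kernel_info)
print(kernel_info)  # [0, 0, 0, 1, 0, 0, 0, 1]
print(encoded)      # [0, 0, 0, 0, 1, 1, 1, 1]
```

The inner loops of each stage are independent of each other, whereas each pass of the outer loop depends on the previous one, which mirrors the data dependencies between successive stages.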
III. POLAR DECODER
In the receiver, the demodulator’s role is to recover information pertaining to the encoded block. However, the demodulator is typically unable to obtain absolute confidence about the values of the M bits in the encoded block, owing to the random nature of the noise in the communication channel. The demodulator may express its confidence about the values of the bits in the encoded block by generating a soft encoded block, which comprises M encoded soft bits. Each soft bit may be represented in the form of a Logarithmic Likelihood Ratio (LLR)

$$\mathrm{LLR} = \ln\left[\frac{\Pr(\mathrm{bit} = 0)}{\Pr(\mathrm{bit} = 1)}\right],$$

where Pr(bit = 0) and Pr(bit = 1) are the probabilities that the corresponding bit has the value 0 and 1, respectively. Here, a positive LLR indicates that the demodulator has greater confidence that the corresponding bit has a value of 0, while a negative LLR indicates greater confidence in the bit value 1. The magnitude of the LLR expresses the degree of confidence, where an infinite magnitude corresponds to absolute confidence in this bit value, while a magnitude of 0 indicates that the demodulator has no information about whether the bit value of 0 or 1 is more likely.
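As a small numerical illustration of the LLR definition, the sketch below converts a probability into an LLR and back into a hard decision. The function names are hypothetical, and resolving an LLR of exactly 0 in favour of the bit value 0 is an arbitrary convention adopted for this sketch.

```python
import math


def llr_from_probability(p0):
    """LLR = ln(Pr(bit = 0) / Pr(bit = 1)), where Pr(bit = 1) = 1 - p0."""
    if p0 <= 0.0:
        return -math.inf  # absolute confidence in the bit value 1
    if p0 >= 1.0:
        return math.inf   # absolute confidence in the bit value 0
    return math.log(p0 / (1.0 - p0))


def hard_decision(llr):
    """Positive LLRs favour 0 and negative LLRs favour 1 (ties resolved as 0 here)."""
    return 0 if llr >= 0 else 1


for p0 in (0.5, 0.9, 0.1):
    llr = llr_from_probability(p0)
    print(p0, round(llr, 3), hard_decision(llr))
```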
A polar decoder comprises three successive components, namely encoded block conditioning, the polar decoder kernel and information block conditioning, as shown in Figure 1. These components are discussed in the following paragraphs.
The input to the encoded block conditioning component of the polar decoder is a soft encoded block and its output may be referred to as a soft kernel encoded block, which comprises N kernel encoded LLRs. In order to convert the M encoded LLRs into N kernel encoded LLRs, infinite-valued LLRs may be interlaced with the soft encoded block, to occupy the positions that correspond to the 0-valued kernel encoded bits that were removed by shortening in the polar encoder. Likewise, 0-valued LLRs may be interlaced with the soft encoded block, to occupy the positions where kernel encoded bits were removed by puncturing. In the case of repetition, the LLRs that correspond to replicas of a particular kernel encoded bit may be summed and placed in the corresponding position within the soft kernel encoded block. Additionally, the encoded block conditioning component must perform the inverse of the sub-block interleaving, channel interleaving and code block concatenation operations in the NR polar code, as shown in Figures 6–8.
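The three cases described above may be sketched as a single function. This is an illustration under stated assumptions: the function name, the `source_index` mapping (the kernel position from which each received LLR originates) and the example patterns are hypothetical, and they do not reproduce the bit selection rules of the NR standard.

```python
import math


def derate_match(encoded_llrs, source_index, shortened, n):
    """Build N kernel encoded LLRs from M encoded LLRs.

    source_index[m] is the kernel position that encoded LLR m corresponds to.
    shortened is the set of kernel positions removed by shortening.
    Positions that are neither received nor shortened are treated as punctured.
    """
    kernel_llrs = [0.0] * n          # punctured positions: no information
    for pos in shortened:
        kernel_llrs[pos] = math.inf  # shortened bits are known to be 0
    for llr, pos in zip(encoded_llrs, source_index):
        kernel_llrs[pos] += llr      # replicas of the same bit are summed
    return kernel_llrs


# Repetition (N = 4, M = 6): kernel positions 0 and 1 were each sent twice.
print(derate_match([1.0, -2.0, 0.5, 3.0, 0.25, -1.5], [0, 1, 2, 3, 0, 1], set(), 4))
# Puncturing (N = 8, M = 6): positions 0 and 1 were not sent.
print(derate_match([1.0] * 6, [2, 3, 4, 5, 6, 7], set(), 8))
# Shortening (N = 8, M = 6): positions 6 and 7 were known zeros and not sent.
print(derate_match([1.0] * 6, [0, 1, 2, 3, 4, 5], {6, 7}, 8))
```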
The input to the polar decoder kernel is a soft kernel encoded block and its output may be referred to as a recovered kernel information block, which comprises N recovered kernel information bits. The polar decoder kernel may operate on the basis of various different algorithms, including Successive Cancellation (SC) decoding [1] and Successive Cancellation List (SCL) decoding [6], which are detailed in Sections III-A and III-B, respectively.
The input to the information block conditioning component of the polar decoder is a recovered kernel information block and its output may be referred to as a recovered information block, which comprises K recovered information bits. The recovered information block may be obtained by removing all redundant bits from the recovered kernel information block. Additionally, the information block conditioning component must perform the inverse of the code block segmentation, interleaving and scrambling operations in the NR polar code, as shown in Figures 6–8.
A. SC decoding
A polar decoder kernel that operates on the basis of SC decoding may be considered to have a similar graph structure to a polar encoder, as illustrated in Figure 2. An SC decoder performs computations pertaining to the XORs in the graph, according to a sequence that is dictated by data dependencies. However, the functionality of each XOR in the graph varies, when performing operations on LLRs and at different steps in the SC decoding process. More specifically, there are three types of computations that can be performed by a particular XOR in the graph, depending on the availability of LLRs provided on the connections on its right-hand side, as well as upon the availability of bits provided on the connections on its left-hand side.
The first occasion when an XOR can contribute to the SC decoding process is when an LLR has been provided by each of the connections on its right-hand side. As shown in Figure 4(a), we refer to the first and second of these two LLRs as $x_a$ and $x_b$, respectively. This enables the XOR to compute an LLR $x_c$ for the first of the two connections on its left-hand side, according to the f function

$$x_c = f(x_a, x_b) = 2\tanh^{-1}\big(\tanh(x_a/2)\tanh(x_b/2)\big) \tag{1}$$
$$\approx \operatorname{sign}(x_a)\operatorname{sign}(x_b)\min(|x_a|, |x_b|), \tag{2}$$

where sign(·) returns −1 if its argument is negative and +1 if its argument is positive. Here, (2) is referred to as the min-sum approximation.
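The exact f function of (1) and its min-sum approximation of (2) may be compared numerically using the following sketch. The function names are our own, and treating sign(0) as +1 is an assumption made here for definiteness.

```python
import math


def sign(x):
    """Return -1 for negative arguments and +1 otherwise (sign(0) = +1 is assumed)."""
    return -1.0 if x < 0 else 1.0


def f_exact(xa, xb):
    """Equation (1)."""
    return 2.0 * math.atanh(math.tanh(xa / 2.0) * math.tanh(xb / 2.0))


def f_min_sum(xa, xb):
    """Equation (2), the min-sum approximation."""
    return sign(xa) * sign(xb) * min(abs(xa), abs(xb))


xa, xb = 2.41, -3.12  # the first pair of LLRs combined in the example of Figure 5
print(f_min_sum(xa, xb))  # -2.41
print(f_exact(xa, xb))    # same sign, somewhat smaller magnitude
```

The approximation avoids the transcendental functions altogether, requiring only comparisons and sign manipulation. Note also that this floating-point form of (1) raises a domain error in Python when both LLRs are large enough for the tanh product to round to ±1.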
Fig. 4: The three computations that can be performed for an XOR in the polar code graph: (a) the f function, (b) the g function and (c) partial sum calculation.
Later in the SC decoding process, a bit $u_a$ will be provided on the first of the connections on the left-hand side of the XOR, as shown in Figure 4(b). Together with the LLRs $x_a$ and $x_b$ that were previously provided using the connections on the right-hand side, this enables the XOR to compute an LLR $x_d$ for the second of the two connections on its left-hand side, according to the g function

$$x_d = g(x_a, x_b, u_a) = (-1)^{u_a} x_a + x_b. \tag{3}$$
Later still, a bit $u_b$ will be provided on the second of the connections on the left-hand side of the XOR, as shown in Figure 4(c). Together with the bit $u_a$ that was previously provided using the first of the connections on the left-hand side, this enables the partial sum computation of bits $u_c$ and $u_d$ for the first and second connections on the right-hand side of the XOR, where

$$u_c = \mathrm{XOR}(u_a, u_b), \tag{4}$$
$$u_d = u_b. \tag{5}$$
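The g function of (3) and the partial sum computations of (4) and (5) are simple enough to be sketched in a few lines. The function names are hypothetical, and the numerical inputs are LLR values that appear in the example of Figure 5.

```python
def g(xa, xb, ua):
    """Equation (3): (-1)^ua * xa + xb."""
    return xb - xa if ua else xb + xa


def partial_sum(ua, ub):
    """Equations (4) and (5): returns (uc, ud)."""
    return ua ^ ub, ub


print(round(g(0.72, 0.09, 0), 2))   # 0.81
print(round(g(2.41, -3.12, 1), 2))  # -5.53
print(partial_sum(0, 1))            # (1, 1)
```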
As may be appreciated from the discussions above, the f function of (1) or (2) may be used to propagate LLRs from right to left within the graph, while the partial sum computations of (4) and (5) may be used to propagate bits from left to right, and the g function of (3) may be used to switch from propagating bits to propagating LLRs.
In order that LLRs can be propagated from right to left, it is necessary to provide LLRs on the connections on the right-hand edge of the graph. This is performed at the start of the SC decoding process, by providing successive LLRs from the soft kernel encoded block on successive connections on the right-hand edge of the graph. Likewise, it is necessary to provide bits on the connections of the left-hand edge of the graph, in order to facilitate the propagation of bits from left to right. Here, a further data dependency beyond those described above is imposed. If the position of a particular connection on the left-hand edge of the graph corresponds to the position of an information bit in the kernel information block, then the bit that is input into that connection depends on the LLR that is output from that connection. More specifically, if a positive LLR is output on the connection, then a value of 0 may be selected for the corresponding bit of the recovered kernel information block and then input into the connection. Meanwhile, a negative LLR allows a value of 1 to be selected for the corresponding bit of the recovered kernel information block and then input into the connection. In the case of a connection corresponding to a redundant bit within the kernel information block, the value of that redundant bit may be input into the connection as soon as it is known. Here, frozen bits always adopt the value 0, but the values of CRC and PC bits will not become available until related information bits have been recovered.
In combination, the data dependencies described above impose a requirement for the information bits within the recovered kernel information block to be obtained one at a time on the connections on the left-hand edge of the graph, in order from top to bottom. More specifically, the SC decoding process begins by using the f function of (1) or (2) to propagate LLRs from the right-hand edge of the graph to the top connection on the left-hand edge of the graph, allowing the first bit to be recovered. Following this, each successive bit from top to bottom is recovered by using the partial sum computations of (4) and (5) to propagate bits from left to right, then using the g function of (3) for a particular XOR to switch from bit propagation to LLR propagation, before using the f function to propagate LLRs to the next connection on the left-hand edge of the graph, allowing the corresponding bit to be recovered. This process is illustrated in the example of Figure 5.
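The SC decoding process may be sketched in software as a recursion over the two N/2 sub-graphs, which visits the f, g and partial sum computations in an order that respects the data dependencies described above. This is a minimal sketch rather than a hardware schedule: it uses the min-sum f function of (2), handles only frozen and information bits (no CRC or PC bits), and assumes that positions {0, 1, 2, 4} are frozen, which is consistent with the values of Figure 5 but is not stated in the text. The LLRs are those provided at step (0) of Figure 5.

```python
def f(xa, xb):
    """Min-sum f function of (2)."""
    s = -1.0 if (xa < 0) != (xb < 0) else 1.0
    return s * min(abs(xa), abs(xb))


def g(xa, xb, ua):
    """g function of (3)."""
    return xb - xa if ua else xb + xa


def encode(u):
    """Polar encoder kernel, used here to compute the partial sums of (4) and (5)."""
    if len(u) == 1:
        return list(u)
    half = len(u) // 2
    top, bottom = encode(u[:half]), encode(u[half:])
    return [a ^ b for a, b in zip(top, bottom)] + bottom


def sc_decode(llrs, frozen):
    """Recover the kernel information bits one at a time, from top to bottom."""
    n = len(llrs)
    if n == 1:
        # Left-hand edge of the graph: frozen bits are 0, otherwise use the LLR sign.
        return [0] if frozen[0] or llrs[0] >= 0 else [1]
    half = n // 2
    upper_llrs = [f(llrs[i], llrs[i + half]) for i in range(half)]
    u_upper = sc_decode(upper_llrs, frozen[:half])
    sums = encode(u_upper)  # bits propagated from left to right
    lower_llrs = [g(llrs[i], llrs[i + half], sums[i]) for i in range(half)]
    u_lower = sc_decode(lower_llrs, frozen[half:])
    return u_upper + u_lower


LLRS = [2.41, -0.87, 3.56, 0.09, -3.12, 1.15, -0.72, -2.66]     # step (0) of Figure 5
FROZEN = [True, True, True, False, True, False, False, False]  # assumed arrangement

kernel_bits = sc_decode(LLRS, FROZEN)
info_bits = [b for b, is_frozen in zip(kernel_bits, FROZEN) if not is_frozen]
print(kernel_bits)  # [0, 0, 0, 1, 0, 0, 0, 1]
print(info_bits)    # [1, 0, 0, 1]
```

Note that the fifth kernel bit is decided as 0 because it is frozen, even though its LLR of −2.02 is negative.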
B. SCL decoding
In the SC decoding process described in Section III-A, the value selected for each bit in the recovered information block depends on the sign of the corresponding LLR, which in turn depends on the values selected for all previous recovered information bits. If this approach results in the selection of the incorrect value for a particular bit, then this will often result in the cascading of errors in all subsequent bits. The selection of an incorrect value for an information bit may be detected with consideration of the subsequent frozen bits, since the decoder knows that these bits should have values of 0. More specifically, if the corresponding LLR has a sign that would imply a value of 1 for a frozen bit, then this suggests that an error may have been made during the decoding of one of the preceding information bits. However, in the SC decoding process, there is no opportunity to consider alternative values for the preceding information bits. Once a value has been selected for an information bit, the SC decoding process moves on and the decision is final.
Fig. 5: Example SC decoding process, using the N = 8 polar code graph, for the case where a particular arrangement of frozen bits is used to convert a particular set of M = 8 encoded LLRs into the K = 4 recovered information bits [1001]. The LLRs obtained using the f and g functions of (2) and (3) are shown above each connection. The bits obtained using the partial sum computations of (4) and (5) are shown below each connection. The accompanying numbers in parentheses identify the step of the SC decoding process where the corresponding LLR or bit becomes available.

This motivates SCL decoding [6], which enables a list of alternative values for the information bits to be considered. As the decoding process progresses, it considers both options for the value of each successive information bit. More specifically, an SCL decoder maintains a list of candidate kernel information blocks, where the list and the kernel information blocks are built up as the SCL decoding process proceeds. At the start of the process, the list comprises only a single kernel information block having a length of zero bits. Whenever the decoding process reaches a frozen bit, a bit value of 0 is appended to the end of each candidate kernel information block in the list. However, whenever the decoding process reaches an information bit, two replicas of the list of candidate kernel information blocks are created. Here, the bit value of 0 is appended to each block in the first replica and the bit value of 1 is appended to each block in the second replica. Following this, the two lists are merged to form a new list having a length which is double that of the original list. This continues until the length of the list reaches a limit L, which is typically chosen as a power of two. From this point onwards, each time the length of the list is doubled when considering an information bit, the worst L among the 2L candidate kernel information blocks are identified and pruned from the list. In this way, the length of the list is maintained at L until the SCL decoding process completes.
Throughout this process, the worst candidate kernel information blocks are identified by comparing and sorting metrics that are computed for each block [7], based on the LLRs obtained on the left-hand edge of the polar code graph. These LLRs are obtained throughout the SCL decoding process by using separate replicas of the partial sum computations of (4) and (5) to propagate the bits from each candidate kernel information block into the polar code graph, from left to right. Following this, separate replicas of the g and f computations of (1)–(3) may be used to propagate corresponding replicas of the LLRs from right to left, as in the SC decoding process described in Section III-A. The metric associated with appending the bit value $u_{l,j}$ in the position j ∈ [0, N − 1] to the candidate kernel information block l is given by

$$\phi_{l,j}(u_{l,j}) = \phi_{l,j-1} + \ln\left(1 + e^{-(1-2u_{l,j})x_{l,j}}\right) \tag{6}$$
$$\approx \begin{cases} \phi_{l,j-1} & \text{if } u_{l,j} = \tfrac{1}{2}\big(1 - \operatorname{sign}(x_{l,j})\big) \\ \phi_{l,j-1} + |x_{l,j}| & \text{otherwise,} \end{cases} \tag{7}$$

where $x_{l,j}$ is the corresponding LLR and $\phi_{l,j-1}$ is the metric that was calculated for the candidate kernel information block in the previous step of the SCL decoding process. Here, (7) is referred to as the min-sum approximation. Note that since the metrics accumulate across all bit positions j ∈ [0, N − 1], they must be calculated for all L candidate kernel information blocks whenever a frozen bit value of 0 is appended, as well as for all 2L candidates when both possible values of an information bit are considered. In the latter case, the 2L metrics are sorted and the L candidates having the highest values are identified as being the worst and are pruned from the list.
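The metric update of (6) and its approximation of (7) may be compared using the sketch below. The function names are our own, and an LLR of exactly 0 is treated as agreeing with the bit value 0, which is an assumption.

```python
import math


def metric_exact(prev_metric, x, u):
    """Equation (6): prev + ln(1 + exp(-(1 - 2u) * x))."""
    return prev_metric + math.log1p(math.exp(-(1 - 2 * u) * x))


def metric_approx(prev_metric, x, u):
    """Equation (7): no penalty if u agrees with the sign of x, else a penalty of |x|."""
    agrees = (u == 0) if x >= 0 else (u == 1)
    return prev_metric if agrees else prev_metric + abs(x)


x = -4.09  # an LLR that favours the bit value 1
for u in (0, 1):
    print(u, round(metric_exact(0.0, x, u), 3), metric_approx(0.0, x, u))
```

In both forms, a candidate is penalised when its appended bit value disagrees with the sign of the LLR, with the penalty growing with the magnitude of the LLR.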
Following the completion of the SCL decoding process, the candidate kernel information block having the lowest metric may be selected as the recovered kernel information block. Alternatively, in CRC-aided SCL decoding [8], all candidates in the list that do not satisfy a CRC are pruned, before the candidate having the lowest metric is selected and output. The error correction capability of the NR polar code is characterised in Figures 9–11.
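The SCL decoding process described in this section may be sketched as follows, under several simplifying assumptions. The sketch uses the min-sum computations of (2) and (7), handles only frozen and information bits without any CRC, and recomputes the LLR of each bit from scratch for every candidate rather than storing and copying intermediate LLRs and partial sums as a hardware implementation would. The function names are hypothetical, and the frozen bit arrangement and LLRs are those assumed for the example of Figure 5.

```python
def f(xa, xb):
    s = -1.0 if (xa < 0) != (xb < 0) else 1.0
    return s * min(abs(xa), abs(xb))


def g(xa, xb, ua):
    return xb - xa if ua else xb + xa


def encode(u):
    if len(u) == 1:
        return list(u)
    half = len(u) // 2
    top, bottom = encode(u[:half]), encode(u[half:])
    return [a ^ b for a, b in zip(top, bottom)] + bottom


def bit_llr(llrs, decided, j):
    """LLR of kernel information bit j, given the bits decided for positions 0..j-1."""
    n = len(llrs)
    if n == 1:
        return llrs[0]
    half = n // 2
    if j < half:
        upper = [f(llrs[i], llrs[i + half]) for i in range(half)]
        return bit_llr(upper, decided, j)
    sums = encode(decided[:half])
    lower = [g(llrs[i], llrs[i + half], sums[i]) for i in range(half)]
    return bit_llr(lower, decided[half:], j - half)


def scl_decode(llrs, frozen, list_size):
    """Return the surviving (bits, metric) candidates, best first."""
    paths = [([], 0.0)]  # a single candidate having a length of zero bits
    for j in range(len(llrs)):
        extended = []
        for bits, metric in paths:
            x = bit_llr(llrs, bits, j)
            for u in ([0] if frozen[j] else [0, 1]):
                agrees = (u == 0) if x >= 0 else (u == 1)
                extended.append((bits + [u], metric if agrees else metric + abs(x)))
        extended.sort(key=lambda path: path[1])  # lowest metric is best
        paths = extended[:list_size]             # prune the worst candidates
    return paths


LLRS = [2.41, -0.87, 3.56, 0.09, -3.12, 1.15, -0.72, -2.66]
FROZEN = [True, True, True, False, True, False, False, False]  # assumed arrangement

best_bits, best_metric = scl_decode(LLRS, FROZEN, list_size=4)[0]
print(best_bits)  # [0, 0, 0, 1, 0, 0, 0, 1]
```

With a list size of 1, this sketch makes the same decisions as SC decoding. For these particular LLRs, larger list sizes select the same candidate, since its only penalty is incurred at the frozen bit in position 4.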
IV. CHALLENGES OF HARDWARE IMPLEMENTATION
There are several challenges associated with the hardware implementation of polar encoders and, in particular, polar decoders. This section begins by discussing challenges that are common to the implementation of both polar encoders and polar decoders, before discussing additional challenges that are specific to polar decoders.
Data dependencies. As described in Sections II and III, the polar encoding and decoding processes are characterised by particular data dependencies, which require the various processing operations to be completed in a particular sequence. This limits the degree of parallel processing that can be achieved during the implementation of polar encoders and decoders. This is particularly challenging in the case of polar decoders, owing to the serial nature of the SC and SCL algorithms. More specifically, the corresponding data dependencies require the kernel information bits to be recovered one after another, in order from top to bottom of the polar code graph. During the polar decoding process, the data dependencies allow different numbers of operations to be completed in parallel at different times, as illustrated in the example of Figure 5. In order to minimise the number of steps required to complete the decoding process, a large amount of hardware may be used so that a single processing step is sufficient to complete the largest number of parallel operations that are supported by the decoder data dependencies. However, the data dependencies will prevent much of this hardware from being used throughout the rest of the decoding process, which may motivate the use of a smaller amount of hardware and a greater number of steps. Either way, the ratio of hardware resource usage to the latency required to complete the decoding process may be unfavourable, unless sophisticated alternative techniques can be developed and utilised.
Routing. A particular challenge in the implementation of polar encoders and decoders is routing the correct information to the correct hardware components at the correct time. As illustrated by the graph representations of Figure 2, the polar encoder and decoder include intricate networks of internal connections, particularly as the kernel block size N becomes large. Unless sophisticated techniques for routing information around the polar code graph are developed, large interconnection networks are required to enable information to be routed between each pairing of hardware components. This is a particular challenge in the polar decoder, where partial sum bits must be routed from the left-hand edge of the graph to the computation of g functions that are distributed all over the graph, for example.
Flexibility. The 5G NR polar code is required to support a wide variety of kernel block sizes N, comprising up to a maximum of Nmax = 1024 bits. This requires a compromise to be struck between providing enough hardware to complete the processing of the longest block lengths with a low latency, and providing so much hardware that it cannot be fully exploited when completing the processing of the shortest block lengths. Unless sophisticated techniques for managing this challenge are developed, a poor ratio of hardware resource usage to the latency required to complete the decoding process will result for either the short or the long block lengths.
Interlacing. As described in Sections II and III, the block conditioning components of the polar encoder and decoder are required to insert or remove bits in the various blocks, in order to transform between block sizes of K, N and M. Here, the specific positions of the inserted or removed bits depend on the particular combination of K, N and M, requiring the use of very flexible interlacer and deinterlacer circuits, which must be capable of inserting or removing an arbitrary number of bits in arbitrary positions within the various blocks. Here, sophisticated techniques are required in order to facilitate hardware-efficient block conditioning with low latency.
Complicated block conditioning. The information block conditioning and encoded block conditioning employed in the NR polar code are very complicated, since they include code block segmentation, CRC attachment, CRC interleaving, CRC scrambling, PC and frozen bit insertion, sub-block interleaving, rate matching, channel interleaving and code block concatenation, as shown in Figures 6–8. Furthermore, there are intricate interdependencies between these operations, where the frozen bit insertion process in the information block conditioning is dependent on the rate matching operation in the encoded block conditioning, for example. In contrast to other channel codes, where the various information and encoded block conditioning operations can be completed separately, using independent processing blocks, the NR polar code requires its processing blocks to be tightly coupled together in order to maximise the achievable performance.
The following challenges are specific to the implementation of polar decoders.
Decoder complexity. The complexity of a polar decoder is much greater than that of a polar encoder for three reasons. Firstly, while polar encoders operate on the basis of bits, polar decoders operate on the basis of the probabilities of bits, which require more memory to store and more complex computations. Secondly, while polar encoders only have to consider the particular permutation of the information block that they are presented with, polar decoders must consider all possible permutations of the information block and must select that which is most likely. Finally, while polar encoders only process each information block once, an SCL polar decoder must process each information block L times, in order to achieve sufficiently strong error correction. For these reasons, the latency, hardware resource usage and power consumption of polar decoders are typically orders of magnitude greater than those of polar encoders.
Copy. As described in Section III-B, the SCL decoding process creates replicas of the list of candidate kernel information blocks, as well as all associated intermediate LLRs and bits. However, copying this large amount of information within a hardware implementation imposes particular challenges for the implementation of the memory architecture. One option is to employ memory blocks having very large bandwidths, allowing the copy process to be completed within a small number of steps. Alternatively, the copy process could be completed over many steps, requiring only a moderate memory bandwidth. Either way, the ratio of hardware resource usage to the latency required to complete the decoding process is unfavourable, unless sophisticated alternative techniques can be developed and utilised. This challenge is particularly important, since the hardware resource usage of polar decoders is typically dominated by memory.
Sort. Another key challenge in the implementation of the SCL decoding process is imposed by metric sorting. As described in Section III-B, this sort is required in order to identify and prune the worst L candidate kernel information blocks, among the merged list of 2L candidates. One option is to employ a large amount of hardware to simultaneously compare every one of the 2L candidates with every other one of the candidates, so that the sorting can be completed within a short latency. Alternatively, the hardware resource requirement can be reduced by structuring successive comparisons to efficiently reuse intermediate results, at the cost of increasing the latency required to rank the 2L candidates. However, either way, the ratio of hardware resource usage to the latency required to complete the decoding process is unfavourable, unless sophisticated alternative techniques can be developed and utilised.
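The first option, in which every candidate is compared with every other candidate, can be sketched in software as follows. This is a minimal model under stated assumptions, not a description of any particular hardware design: the function name is hypothetical, and a smaller metric is assumed to mark a more likely candidate. In hardware, all of the comparisons inside the loops could be evaluated in parallel, which is why this approach offers a short latency at the cost of roughly (2L)² comparators.

```python
def prune_by_pairwise_rank(metrics: list, keep: int) -> list:
    """Return the indices of the `keep` candidates having the smallest metrics.

    Each candidate's rank is the number of candidates that beat it, so no
    comparison depends on the outcome of any other comparison.
    """
    n = len(metrics)
    survivors = []
    for i in range(n):
        # Ties are broken by index, so that every candidate gets a unique rank.
        rank = sum(
            1
            for j in range(n)
            if metrics[j] < metrics[i] or (metrics[j] == metrics[i] and j < i)
        )
        if rank < keep:
            survivors.append(i)
    return survivors


# Example: a merged list of 2L = 8 candidates is pruned to L = 4 survivors.
merged_metrics = [3.0, 0.5, 2.0, 0.5, 7.0, 1.0, 4.0, 6.0]
kept = prune_by_pairwise_rank(merged_metrics, keep=4)
```

Here `kept` holds the indices 1, 2, 3 and 5, which are the four candidates with the smallest metrics. The second option described above corresponds to replacing the all-pairs ranking with a staged sorting structure, which uses fewer comparators but requires several dependent steps.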
CRC integration. CRC bits are employed by the NR polar code in order to facilitate error detection and also to improve the error correction capability of the polar decoder. However, there is a tradeoff between the error detection capability and the error correction capability. In order to meet the error detection reliability requirements of NR, the CRC bits must be handled very carefully, in a manner which is not captured in the NR standards. In particular, the CRC (and PC) bits must be decoded as an integral part of the polar decoding process, using an unconventional decoding technique. This is in contrast to conventional CRCs, which may be decoded separately from other channel codes, in independent processing blocks, leading to a much simpler implementation.
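For comparison, the conventional, separable arrangement can be sketched as a CRC check that runs after list decoding has finished, in the style of CRC-aided list decoding [3]. This is not the integrated technique referred to above, which is not specified in this paper. The function names are hypothetical, a smaller metric is assumed to mark a more likely candidate, and the generator x⁶ + x⁵ + 1 is used only as a short example polynomial.

```python
from typing import Optional


def crc_remainder(bits: list, poly: list) -> list:
    """Remainder of bits(x) * x^r divided by poly(x) over GF(2), r = len(poly) - 1."""
    r = len(poly) - 1
    reg = list(bits) + [0] * r
    for i in range(len(bits)):
        if reg[i]:
            # Subtract (XOR) the generator, aligned with the current leading 1.
            for j, p in enumerate(poly):
                reg[i + j] ^= p
    return reg[-r:]


def passes_crc(block: list, poly: list) -> bool:
    """True if the last r bits of block equal the CRC of the bits before them."""
    r = len(poly) - 1
    return crc_remainder(block[:-r], poly) == block[-r:]


def select_candidate(candidates: list, metrics: list, poly: list) -> Optional[list]:
    """Return the most likely candidate that passes the CRC, or None if none does."""
    order = sorted(range(len(candidates)), key=lambda i: metrics[i])
    for i in order:
        if passes_crc(candidates[i], poly):
            return candidates[i]
    return None


# Example polynomial x^6 + x^5 + 1, most significant coefficient first.
POLY = [1, 1, 0, 0, 0, 0, 1]
message = [1, 0, 1, 1]
good = message + crc_remainder(message, POLY)
bad = [good[0] ^ 1] + good[1:]  # same block with its first bit in error

# The erroneous candidate has the better metric, but fails the CRC.
chosen = select_candidate([bad, good], [0.1, 0.9], POLY)
```

In this example `chosen` is the error-free block, even though it has the worse metric. Note that every CRC bit consumed for candidate selection in this way is no longer available purely for error detection, which is one way to view the tradeoff mentioned above.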
V. CONCLUSIONS
In this white paper, we have discussed the selection of polar codes in the 5G NR standard and have provided tutorials on the polar encoding and decoding processes, paying particular attention to the SC and SCL decoding algorithms. Furthermore, we have discussed the challenges associated with the hardware implementation of polar encoders and decoders, noting that these challenges are particularly great in the case of the polar decoder, since its complexity is orders of magnitude greater than that of the polar encoder.
At AccelerComm, we have been researching polar codes since they were first published in 2009. We have drawn upon our expertise and intuition for polar codes in order to develop polar encoder and decoder solutions that address all of the challenges described in this white paper. We offer patent-pending first-to-market polar encoder and decoder Intellectual Property (IP), which allows all of the 5G requirements to be met in Field Programmable Gate Array (FPGA) and Application Specific Integrated Circuit (ASIC) implementations. More specifically, we have developed sophisticated solutions that overcome all of the challenges described in Section IV, offering much greater flexibility, error correction capability and hardware efficiency than all previously published implementations of polar encoders and decoders.
REFERENCES
[1] E. Arikan, “Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,” IEEE Transactions on Information Theory, vol. 55, no. 7, pp. 3051–3073, July 2009.
[2] 3rd Generation Partnership Project; Technical Specification Group Radio Access Network; NR; Multiplexing and channel coding (Release 15), 3GPP Std. TS 38.212, Rev. 15.0.0, December 2017.
[3] K. Niu and K. Chen, “CRC-aided decoding of polar codes,” IEEE Communications Letters, vol. 16, no. 10, pp. 1668–1671, October 2012.
[4] Huawei, HiSilicon, “Polar code construction for NR,” in 3GPP TSG RAN WG1 Meeting #86bis, Lisbon, Portugal, October 2016, R1-1608862.
[5] ZTE, ZTE Microelectronics, “Rate matching of polar codes for eMBB,” in 3GPP TSG RAN WG1 Meeting #88, Athens, Greece, February 2017, R1-1701602.
[6] I. Tal and A. Vardy, “List decoding of polar codes,” in 2011 IEEE International Symposium on Information Theory Proceedings, July 2011, pp. 1–5.
[7] A. Balatsoukas-Stimming, M. B. Parizi, and A. Burg, “LLR-based successive cancellation list decoding of polar codes,” IEEE Transactions on Signal Processing, vol. 63, no. 19, pp. 5165–5179, October 2015.
[8] K. Niu and K. Chen, “CRC-aided decoding of polar codes,” IEEE Communications Letters, vol. 16, no. 10, pp. 1668–1671, October 2012.
[9] T. Erseghe, “Coding in the finite-blocklength regime: Bounds based on Laplace integrals and their asymptotic approximations,” IEEE Transactions on Information Theory, vol. 62, no. 12, pp. 6854–6883, December 2016.
Prof. Robert G. Maunder is an industry authority on error correction and channel coding. As a professor at the University of Southampton, he built a team of experts and published over 100 IEEE papers and resources on the joint design of algorithms and hardware implementations for error correction, including turbo, LDPC and polar codes. This expertise is being leveraged by Prof. Maunder’s founding of AccelerComm, which is a semiconductor IP-core company specialising in patent-pending channel coding solutions.
[Block diagram. Key: section of TS 38.212, 7.1.1 to 7.1.5. PBCH encoder blocks: PBCH payload generation, PBCH payload interleaving, PBCH payload scrambling, CRC24C attachment, CRC interleaving, frozen bit insertion, polar encoding, sub-block interleaving, rate matching, multiplexing onto PBCH. PBCH decoder blocks: demultiplexing from PBCH, rate dematching, sub-block deinterleaving, distributed-CRC-aided SCL polar decoding, PBCH payload descrambling, PBCH payload deinterleaving, PBCH payload extraction. Supporting blocks: determination of higher layer parameters, determination of known bits.]
Fig. 6: Block diagram of the polar encoder and decoder employed by the Public Broadcast Channel (PBCH) of 3GPP New Radio.
[Block diagram. Key: section of TS 38.212, 7.3.1 to 7.3.4. PDCCH encoder blocks: DCI bit sequence generation, zero padding, ones-initialised CRC24C attachment, CRC scrambling, CRC interleaving, frozen bit insertion, polar encoding, sub-block interleaving, rate matching, multiplexing onto PDCCH. PDCCH decoder blocks: demultiplexing from PDCCH, rate dematching, sub-block deinterleaving, distributed-CRC-aided SCL polar decoding, DCI bit sequence extraction. Supporting blocks: determination of information block length, determination of encoded block length, determination of RNTI.]
Fig. 7: Block diagram of the polar encoder and decoder employed by the Physical Downlink Control Channel (PDCCH) of 3GPP New Radio.
[Block diagram. Key: section of TS 38.212, 6.3.1.1/6.3.2.1 to 6.3.1.6/6.3.2.6. PUCCH/PUSCH encoder blocks: UCI bit sequence generation, code block segmentation, CRC6 or CRC11 attachment, PC and frozen bit insertion, polar encoding, sub-block interleaving, rate matching, channel interleaving, code block concatenation, multiplexing onto PUCCH/PUSCH. PUCCH/PUSCH decoder blocks: demultiplexing from PUCCH/PUSCH, code block segmentation, channel deinterleaving, rate dematching, sub-block deinterleaving, PC/CRC-aided SCL polar decoding, code block concatenation, UCI bit sequence extraction. The polar path applies for K∈[12,1706]; for K∈[1,11], short block encoding and decoding (identical to LTE) with rate matching and dematching are used instead. Supporting blocks: determination of information block length, determination of encoded block length.]
Fig. 8: Block diagram of the polar encoder and decoder employed by the Physical Uplink Control Channel (PUCCH) of 3GPP New Radio.
[Plot: PBCH polar code, K = 32, M = 864, QPSK, AWGN. Horizontal axis: Es/N0 [dB], from −12 to −5. Vertical axis: BLER, from 10⁻³ to 10⁰. Curves: L = 1, L = 2, L = 4, L = 8, L = 16, L = 32 and capacity.]
Fig. 9: Plot of Block Error Rate (BLER) versus channel Signal to Noise Ratio (SNR) Es/N0 for the Public Broadcast Channel (PBCH) polar code of 3GPP New Radio, when using Quadrature Phase Shift Keying (QPSK) for communication over an Additive White Gaussian Noise (AWGN) channel. Here, K is the number of bits in each information block, M is the number of bits in each encoded block and L is the list size used during min-sum Successive Cancellation List (SCL) decoding. The simulation of each SNR was continued until 1000 block errors were observed. Capacity plots are provided by the O(n⁻²) metaconverse PPV upper bound [9].
[Plot: PDCCH polar code, BLER = 0.001, QPSK, AWGN. Horizontal axis: K, from 8 to 128. Vertical axis: required Es/N0 [dB], from −15 to 10. Curves: M = 108, 216, 432, 864 and 1728, each for L = 8, L = 16 and capacity.]
Fig. 10: Plot of Signal to Noise Ratio (SNR) Es/N0 required to achieve a Block Error Rate (BLER) of 10⁻³ versus the number of bits in each information block K for the Physical Downlink Control Channel (PDCCH) polar code of 3GPP New Radio, when using Quadrature Phase Shift Keying (QPSK) for communication over an Additive White Gaussian Noise (AWGN) channel. Here, M is the number of bits in each encoded block and L is the list size used during min-sum Successive Cancellation List (SCL) decoding. The simulation of each SNR was continued until 100 block errors were observed. Capacity plots are provided by the O(n⁻²) metaconverse PPV upper bound [9].
[Plot: PUCCH polar code, BLER = 0.001, QPSK, AWGN. Horizontal axis: K, from 8 to 2048. Vertical axis: required Es/N0 [dB], from −25 to 15. Curves: M = 54, 108, 216, 432, 864, 1728, 3456, 6912 and 13824, each for L = 8, L = 16 and capacity.]
Fig. 11: Plot of Signal to Noise Ratio (SNR) Es/N0 required to achieve a Block Error Rate (BLER) of 10⁻³ versus the number of bits in each information block K for the Physical Uplink Control Channel (PUCCH) polar code of 3GPP New Radio, when using Quadrature Phase Shift Keying (QPSK) for communication over an Additive White Gaussian Noise (AWGN) channel. Here, M is the number of bits in each encoded block and L is the list size used during min-sum Successive Cancellation List (SCL) decoding. The simulation of each SNR was continued until 100 block errors were observed. Capacity plots are provided by the O(n⁻²) metaconverse PPV upper bound [9].