CN110771047B

CN110771047B - Polarity decoder for LLR domain computation with F and G functions

Info

Publication number: CN110771047B
Application number: CN201880039349.2A
Authority: CN
Inventors: R·蒙德; M·布雷扎; 钟世达; I·安德雷德; 陈泰海
Original assignee: Communication Co ltd
Current assignee: Communication Co ltd
Priority date: 2017-06-15
Filing date: 2018-06-12
Publication date: 2023-08-04
Anticipated expiration: 2038-06-12
Also published as: EP3639374C0; CN110741557B; EP3639374B1; GB2563419A8; GB2563463A; EP3639376B1; US20210159915A1; CN110771047A; GB201714766D0; GB2563419B; GB2563463B; WO2018229073A1; EP3639376A1; US20200212936A1; EP3639374A1; US11165448B2; GB2563419A; WO2018229068A1; CN110741557A; GB201709505D0

Abstract

A polar decoder core (111) is described. The polar decoder core (111) comprises a processing unit (2201) having: at least one input configured to receive at least one input log-likelihood ratio (LLR) (2202, 2203); logic configured to manipulate the at least one input LLR; and at least one output configured to output the manipulated at least one LLR. The logic of the processing unit (2201) includes only a single two-input adder (2207) to manipulate the at least one input LLR, and the input LLR and the manipulated LLR are in the format of a fixed-point number representation comprising a two's complement binary number and an additional sign bit.

Description

Polarity decoder for LLR domain computation with F and G functions

Technical Field

The technical field of the present invention relates to a polar decoder, a communication unit, an integrated circuit and a method for polar decoding. The invention is applicable to, but not limited to, polar decoding for current and future communication standards.

Background

According to the principles of Forward Error Correction (FEC) and channel coding, polar coding [1] may be used to protect information from transmission errors within imperfect communication channels, which may be subject to noise and other adverse effects. More specifically, a polar encoder is used in the transmitter to encode the information, and a corresponding polar decoder is used in the receiver to mitigate transmission errors and recover the transmitted information. The polar encoder converts an information block comprising K bits into an encoded block comprising a greater number of bits M > K according to a prescribed encoding procedure. In this way, the encoded block conveys the K bits of information from the information block along with M-K bits of redundancy. The polar decoder can exploit this redundancy, according to a prescribed decoding procedure, to estimate the values of the original K bits of the information block. Provided that the conditions of the communication channel are not too severe, the polar decoder can correctly estimate the values of the K bits of the information block with high probability.

The polar encoding process comprises three steps. In the first step, information block conditioning, redundant bits are inserted into the information block at prescribed positions to increase its size from K bits to N bits, where N is a power of 2. In the second step, the polar encoding core, the N bits of the resulting core information block are combined in different combinations using successive exclusive OR (XOR) operations according to a prescribed graph structure. The graph structure comprises n = log2(N) successive stages, each comprising N/2 XOR operations that combine specific pairs of bits. In the third step, encoded block conditioning is applied to the resulting core encoded block to adjust its size from N bits to M bits. This may be achieved by repeating or removing particular bits of the core encoded block according to a prescribed method, to produce the encoded block that is transmitted over a channel or stored in a storage medium.

A soft encoded block is received from the channel or retrieved from the storage medium. The polar decoding process comprises three steps, which correspond to the three steps of the polar encoding process, but in reverse order. In the first step, encoded block conditioning, redundant soft bits are inserted or combined into the soft encoded block at prescribed positions to adjust its size from M soft bits to N soft bits, where N is a power of 2. In the second step, the polar decoding core, the N soft bits of the resulting core encoded block are combined in different combinations using a Successive Cancellation (SC) [1] or Successive Cancellation List (SCL) [7] process, which operates on the basis of the prescribed graph structure. In the third step, information block conditioning is applied to the resulting recovered core information block to reduce its size from N bits to K bits. This may be achieved by removing particular bits of the recovered core information block according to a prescribed method, to produce the recovered information block.
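The second step described above (n = log2(N) stages of N/2 XOR operations each) can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names, the natural (non-bit-reversed) bit ordering, the zero-valued frozen bits and the information positions 3, 5, 6 and 7 are all assumptions made for the example.

```python
def polar_encode_core(u):
    """Apply log2(N) stages of N/2 XORs to the N-bit core information block u.

    Assumes natural bit ordering, i.e. x = u * F^(kron n) over GF(2)
    with F = [[1, 0], [1, 1]].
    """
    n_bits = len(u)
    if n_bits == 0 or n_bits & (n_bits - 1):
        raise ValueError("core block size N must be a power of 2")
    x = list(u)
    step = 1
    while step < n_bits:                 # one pass per stage
        for j in range(n_bits):
            if (j & step) == 0:          # N/2 bit pairs per stage
                x[j] ^= x[j + step]
        step *= 2
    return x


def insert_frozen_bits(info_bits, info_positions, n_bits):
    """Hypothetical information block conditioning: frozen bits are set to 0."""
    u = [0] * n_bits
    for bit, pos in zip(info_bits, info_positions):
        u[pos] = bit
    return u


# K = 4 information bits placed at assumed positions 3, 5, 6, 7 of an N = 8 block.
u = insert_frozen_bits([1, 0, 0, 1], [3, 5, 6, 7], 8)
print(polar_encode_core(u))
```

With these assumed positions, the sketch maps a = [1001] to the encoded bits [0, 0, 0, 0, 1, 1, 1, 1]; no encoded block conditioning is needed because M = N = 8 in this example.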

Several hardware implementations of SC [1] and SCL [7] polar decoders have previously been proposed [8], [14]-[24], which can flexibly support different core block sizes N ∈ {2, 4, 8, ..., N_max} at runtime. These decoders conceptually represent the polar code using a graph [15] (or equivalently a tree [18]), the size of which varies depending on the core block size N. As illustrated in fig. 7, the graph includes N inputs on its right edge, which accept soft bits from the demodulator (typically in the form of Log Likelihood Ratios (LLRs) [8]), and N outputs on its left edge, which supply hard bit decisions for the information bits and frozen bits. Between these two edges, the graph comprises log2(N) horizontally concatenated stages, each comprising N/2 vertically aligned XOR operations.

The hardware implementations of [8], [14]-[24] employ dedicated hardware to combine soft bits at the location of each XOR in the graph using the f and g functions [8], and to conceptually propagate them from right to left in the graph. Likewise, dedicated hardware is conceptually employed at the left edge of the graph to convert the soft bits into hard bit decisions, and to compute and sort the SCL path metrics [8]. Finally, dedicated hardware is used to combine the hard bit decisions according to the XORs in the graph, and the resulting partial sum bits are conceptually propagated from left to right in the graph so that they can be used by the g functions. Note that the dependency of the g function on the partial sum bits imposes a set of data dependencies, which require all of the above operations to be performed according to a particular schedule. This leaves only a limited degree of freedom for performing operations in parallel, which varies as the decoding process proceeds.

The line decoder of [14] achieves a high degree of parallel processing during soft bit propagation, which allows all f and g functions to be computed within a latency of 2N-2 clock cycles. This is achieved using L lines of N_max/2 processing units, where L = 1 in SC decoding and L > 1 is the list size in SCL decoding. Each processing unit is capable of computing one f function or one g function in each clock cycle. This degree of parallelism is sufficient to simultaneously perform the maximum number of computations that are not prevented by data dependencies within any single stage of the graph. This peak opportunity for parallel processing is encountered when computing the g functions of the rightmost stage in the graph, when N = N_max. However, the data dependencies described above prevent the parallelism from being fully exploited when N < N_max, or when computing f or g functions at other times during the decoding process. For this reason, the line decoder of [14] suffers from poor hardware efficiency and also requires an excessive memory bandwidth, since it must grant simultaneous access to up to N_max soft bits.

Motivated by this, [8], [15]-[24] improve the hardware efficiency and memory bandwidth requirement by reducing the degree of parallel processing from L·N_max/2 to L·P, where P ∈ {1, 2, 4, 8, ...}. However, this approach is still unable to exploit all of its parallelism in the leftmost stages, and it requires several clock cycles to perform the f and g functions of the rightmost stages, increasing the total latency associated with the f and g computations to [expression missing from the source text] clock cycles. In addition to the clock cycles required for the f and g computations, SCL decoders typically require at least one additional clock cycle to compute and sort the path metrics associated with each of the N hard bit decisions made on the left edge of the graph. In the case of line decoding, performing the f, g and path metric computations, and sorting the latter, requires a total latency of 3N-2 clock cycles. However, in [32], [33], the path metrics are computed and sorted for several bits at a time, together with the corresponding f and g functions of the leftmost stages of the graph. When making 2^k hard bit decisions at a time, this approach reduces the total number of clock cycles required by line decoding to N/2^(k-2) - 2 [33], where k ∈ {1, 2, 3, ...}. Note that the latency of SCL decoding may be further reduced when the polar code employs a low coding rate. In this case, any computations associated with the frozen bits at the start of the block may be skipped, although this technique does not improve the worst-case latency, which is encountered at high coding rates.
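The interplay between the f functions, the g functions and the partial sum bits described above can be sketched as a recursive SC decoder. This is a minimal software illustration under stated assumptions, not any of the cited hardware architectures: it assumes natural bit ordering, zero-valued frozen bits, the min-sum form of f, and the convention that a positive LLR favours a bit value of 0. The function names are hypothetical.

```python
def f(a, b):
    """Min-sum f function: sign(a) * sign(b) * min(|a|, |b|)."""
    sign = -1 if (a < 0) != (b < 0) else 1
    return sign * min(abs(a), abs(b))


def g(a, b, u):
    """g function: (-1)^u * a + b, where u is a partial sum bit."""
    return b - a if u else b + a


def sc_decode(llrs, frozen):
    """Return (decoded bits, partial sum bits) for one sub-block of the graph."""
    n = len(llrs)
    if n == 1:
        # Left edge of the graph: frozen bits are assumed 0, others are hard decisions.
        bit = 0 if frozen[0] or llrs[0] >= 0 else 1
        return [bit], [bit]
    h = n // 2
    upper, lower = llrs[:h], llrs[h:]
    # The f functions only need soft bits, so they can be computed first.
    u_a, x_a = sc_decode([f(a, b) for a, b in zip(upper, lower)], frozen[:h])
    # The g functions must wait for the partial sum bits x_a (the data dependency).
    u_b, x_b = sc_decode([g(a, b, s) for a, b, s in zip(upper, lower, x_a)], frozen[h:])
    # Partial sums propagate back towards the right edge of the graph.
    return u_a + u_b, [p ^ q for p, q in zip(x_a, x_b)] + x_b


llrs = [4, 4, 4, 4, -4, -4, -4, -4]            # noiseless LLRs for b = [00001111]
frozen = [True, True, True, False, True, False, False, False]  # assumed pattern
bits, _ = sc_decode(llrs, frozen)
print([bits[i] for i in (3, 5, 6, 7)])
```

Each recursive call must finish its f branch before its g branch can start, which is the scheduling constraint that limits parallelism in the architectures discussed above.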

Note that the propagation of the partial sum bits is typically performed concurrently with the computations described above, within the same clock cycles. In [8], [15], [30], partial sum update logic is used to accumulate different combinations of the decoded bits, and an interconnection network is used to deliver them to the corresponding g function processing units. This results in a large hardware overhead and a long critical path, which limits the achievable hardware efficiency, throughput and latency. In contrast, the feed-forward architecture of [19], [21], [28], [32], [34] uses dedicated hardware to propagate the partial sum bits to each successive stage of the graph. However, the complexity of the feed-forward architecture grows rapidly with each successive stage, limiting the maximum core block length N_max that can be supported and limiting the hardware efficiency. In contrast, the approach of [17], [22], [27], [35] uses a simplified polar encoder core to compute the partial sum bits, although this does not benefit from reusing computations that are performed as a natural part of the decoding process.

In the previous polar decoder hardware implementations described above, the hardware resource usage is typically dominated by memory. For example, since LLRs must be stored at the interface between each pair of consecutive stages in the graph, memory occupies 90% of the hardware in the L = 8 SCL decoder of [8]. The second largest contributor to hardware resource usage is the processing and propagation of LLRs and partial sum bits, which occupies approximately 5% of the hardware in the L = 8 SCL decoder of [8]. Of this processing and propagation hardware, approximately 80% is dedicated to the interconnection network associated with the partial sum bits [15]. Finally, approximately 1% of the hardware is dedicated to path metric computation and sorting in the L = 8 SCL decoder of [8] and in the L = 4 SCL decoders of [18], [19]. However, in the multi-bit approaches of [32], [33], these operations may be expected to occupy more hardware.

Disclosure of Invention

The present invention provides a polar decoder, a communication unit, an integrated circuit and a method for polar decoding as described in the appended claims.

Specific embodiments of the invention are set forth in the dependent claims.

These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereinafter.

Drawings

Further details, aspects and embodiments of the invention will be described, by way of example only, with reference to the accompanying drawings. In the drawings, like reference numbers may be used to identify similar or functionally similar elements. Elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale.

Fig. 1 illustrates an example top-level schematic diagram of a communication unit with a polarity encoder and a polarity decoder, suitable for use in accordance with example embodiments of the present invention.

FIG. 2 illustrates graphical representations of examples of the generator matrices F, [...] and [...], in accordance with an example embodiment of the present invention.

FIG. 3 illustrates an example polar encoding process, using a graphical representation of the generator matrix [...], for the case in which K = 4 information bits a = [1001] are converted into M = 8 encoded bits b = [00001111] using a particular frozen bit pattern, in accordance with an example embodiment of the present invention.

Fig. 4 illustrates examples of three calculations that may be performed for the basic calculation unit of the proposed polar decoder core, according to an example embodiment of the invention: (a) f function, (b) g function, and (c) part and calculation.

FIG. 5 illustrates an example of an SC decoding process, using a graphical representation of the generator matrix [...], for the case in which a particular vector [...] of M = 8 encoded LLRs is converted into K = 4 recovered information bits a = [1001] using a particular frozen bit pattern, according to an example embodiment of the invention.

FIG. 6 illustrates an example schematic of the proposed polar decoder core for the case of C_max = 5, in accordance with an example embodiment of the invention.

FIG. 7 illustrates a graphical representation of the generator matrix [...], which has been grouped into C = 4 columns comprising s = [1; 2; 2; 1] stages, corresponding to s_o = 1 and s_i = 2, according to an example embodiment of the invention.

Fig. 8 illustrates an example flowchart of a decoding process employed by the proposed polar decoder core, whereby each period around the main loop of the flowchart corresponds to one step of the decoding process, according to an example embodiment of the present invention.

Fig. 9 illustrates an example timing diagram of a proposed polar decoder core according to an example embodiment of the present invention.

Fig. 10 illustrates an exemplary diagram of the multiple steps required for the decoding process of the proposed polar decoder core according to an exemplary embodiment of the present invention.

FIG. 11 illustrates a graphical representation of the generator matrix [...] that employs C = 4 columns comprising s = [1; 2; 2; 1] stages, according to an example embodiment of the invention.

Fig. 12 illustrates an example schematic diagram of a proposed processing unit that may be reconfigured to perform either the "f" function of (2) or the "g" function of (3) according to an example embodiment of the invention.

Fig. 13 illustrates an example of a known technique for a 2's complement implementation of the "f" function of (2): (a) simple implementation; (b) reducing hardware implementation; (c) reducing implementation of critical paths.

FIG. 14 illustrates an example schematic diagram of the internal data path in the proposed polar decoder core, for the example of s_i = 2 and n_i = 8, according to an example embodiment of the present invention.

FIG. 15 illustrates an example schematic diagram of the external data path for SC decoding in the proposed polar decoder core, for the example of s_o = 2 and n_i = 4, according to an example embodiment of the invention.

FIG. 16 illustrates an example schematic diagram of the partial sum data path in the proposed polar decoder core, for the example of s_i = 2 and n_i = 8, according to an example embodiment of the invention.

Fig. 17 illustrates an example schematic diagram of interactions between the internal data paths of the proposed polar decoder core, LLR memory blocks and the controller, according to an example embodiment of the present invention.

FIG. 18 illustrates an example schematic of the interaction between the internal data path of the polar decoder core, the bit memory block and the controller, for the case of s_i = 1 and n_i = 4, according to an example embodiment of the invention.

Fig. 19 illustrates an example of the contents of the LLR memory after completion of the decoding process, for the case of N = 128, N_max = 128, s_o = 1, s_i = 2 and n_i = 8, according to an example embodiment of the invention.

Fig. 20 illustrates an example of the contents of the LLR and bit memories after completion of the decoding process, for the case of N = 64, N_max = 128, s_o = 1, s_i = 2 and n_i = 8, according to an example embodiment of the invention.

Fig. 21 illustrates an example of the contents of the LLR and bit memories after completion of the decoding process, for the case of N = 32, N_max = 128, s_o = 1, s_i = 2 and n_i = 8, according to an example embodiment of the invention.

Fig. 22 illustrates an example of the contents of the LLR and bit memories after completion of the decoding process, for the case of N = 16, N_max = 128, s_o = 1, s_i = 2 and n_i = 8, according to an example embodiment of the invention.

Fig. 23 illustrates an example of the contents of the LLR and bit memories after completion of the decoding process, for the case of N = 8, N_max = 128, s_o = 1, s_i = 2 and n_i = 8, according to an example embodiment of the invention.

Fig. 24 illustrates an exemplary computing system that may be employed in an electronic device or a wireless communication unit to perform polarity encoding operations in accordance with some example embodiments of the present invention.

Detailed Description

In a first aspect, examples of the present invention describe a polar decoder core comprising a processing unit having at least one input configured to receive at least one input log-likelihood ratio (LLR), logic configured to manipulate the at least one input LLR, and at least one output configured to output the manipulated at least one LLR. The polar decoder core is characterized in that the logic of the processing unit includes only a single two-input adder to manipulate the at least one input LLR. The input LLR and the manipulated LLR are in the format of a fixed-point number representation comprising a two's complement binary number and an additional sign bit. In this way, the hardware complexity of the processing unit is reduced to that of a single adder and some supporting logic.

In some examples, the processing unit is configured either to perform the "g" function or the "f" function at any given time, or to perform only one of the "g" function and the "f" function. In this way, the hardware of the processing unit can be minimized, either by flexibly reusing it to perform both the "g" and "f" functions where both are needed, or by optimizing it to perform only one of the two functions where the other is not needed.

In some examples, the "f" function comprises:

$$f(x_a, x_b) = \mathrm{sign}(x_a)\,\mathrm{sign}(x_b)\,\min(|x_a|, |x_b|)$$

where x_a and x_b denote the first and second input LLRs, and sign(·) returns "-1" if its argument is negative and "+1" if its argument is positive. In this way, hardware complexity is reduced compared with variants of the "f" function that use the tanh function or other complex functions.
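To illustrate the complexity trade-off mentioned above, the following sketch compares the min-sum form of the "f" function with the well-known tanh-based form from the polar coding literature. The function names are hypothetical, and floating-point values are used here purely for illustration; the processing unit described in this document operates on fixed-point numbers.

```python
import math


def f_min_sum(a, b):
    """Min-sum approximation: needs only comparisons and sign logic."""
    sign = -1.0 if (a < 0) != (b < 0) else 1.0
    return sign * min(abs(a), abs(b))


def f_tanh(a, b):
    """Tanh-based form: 2 * atanh(tanh(a/2) * tanh(b/2)).

    Note: for very large |a| and |b| the product rounds to +/-1 in floating
    point and math.atanh raises ValueError, so moderate inputs are assumed.
    """
    return 2.0 * math.atanh(math.tanh(a / 2.0) * math.tanh(b / 2.0))


for a, b in [(4.0, -3.0), (1.5, 2.0), (-0.5, -2.0)]:
    print(a, b, f_min_sum(a, b), round(f_tanh(a, b), 3))
```

The two forms always agree in sign, and the min-sum magnitude is never smaller than the tanh-based magnitude, which is why min-sum is a common hardware-friendly substitute.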

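In some examples, the "g" function comprises:

$$g(x_a, x_b, \hat{u}) = (-1)^{\hat{u}}\,x_a + x_b$$

where x_a and x_b denote the first and second input LLRs and û denotes the partial sum bit. In this way, the processing unit is able to perform the core operations of the successive cancellation and successive cancellation list decoding algorithms.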

In some examples, the at least one input LLR is represented using a fixed-point number representation having W+1 bits, in which one bit label denotes the additional sign bit, another denotes the bit that serves as the sign bit of the two's complement binary number part of the fixed-point number representation, and a further label denotes the least significant bit (LSB) of the two's complement binary number part. In this way, the additional sign bit can eliminate the frequent negation of two's complement numbers that would otherwise be required during the successive cancellation and successive cancellation list decoding algorithms.

In some examples, the single two-input adder comprises two inputs and is configured to provide a two's complement output, where each input takes a value derived from the fixed-point number representations of the input LLRs, and the two's complement output comprises a second number ("W+1") of bits, including an additional bit to avoid overflow. In this way, the need to clip the output of each two-input adder is eliminated, thereby enhancing the error correction capability of the polar decoder.

In some examples, the output of the processing unit comprises a third number ("W+2") of bits, which combines the additional bit introduced by the single two-input adder with the additional sign bit. In this way, the need to clip the output of each processing unit is eliminated, thereby enhancing the error correction capability of the polar decoder.

In some examples, when implementing the "g" function, the single two-input adder is used to manipulate the two's complement binary numbers of the input LLRs. Depending on the partial sum bit and the additional sign bit of at least one input LLR, the two's complement binary part of the first LLR is either added to, or subtracted from, the two's complement binary part of the second LLR, to obtain the two's complement binary number of the manipulated output LLR. In this way, the "g" function may be completed using the same operations as the "f" function, allowing the hardware to be efficiently reused for both functions.

In some examples, when implementing the "f" function, the single two-input adder is used to manipulate the two's complement binary numbers of the input LLRs. Depending on the additional sign bits of the at least one input LLR, the two's complement binary part of the first LLR is either added to, or subtracted from, the two's complement binary part of the second LLR, in order to determine the min term of the "f" function. The operation is completed by using the most significant bit (MSB) of the resulting two's complement number output by the single two-input adder to select either the two's complement binary part of the first LLR or the two's complement binary part of the second LLR, which provides the two's complement binary number of the manipulated at least one LLR for output. In this way, the "f" function may be completed using only a single two-input adder, rather than the two or more two-input adders used in other implementations.

In some examples, the additional sign bit of the manipulated at least one LLR is obtained as a function of at least one of: the MSB of the two's complement binary part of the at least one input LLR, and the additional sign bit of the at least one input LLR. In this way, the additional sign bit may be obtained using only simple logic hardware.

In some examples, the additional sign bit of the manipulated at least one LLR is taken as the value of the additional sign bit of the second LLR. In this way, no additional logic hardware is required to obtain the additional sign bit.
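A behavioural sketch of a processing unit that uses one addition or subtraction per "f" or "g" evaluation is given below. It is an illustration derived from the description above, not the circuit of the patent: it assumes that the value of an LLR equals the two's complement part negated when the additional sign bit is set, that "f" is the min-sum function, and that "g" is (-1)^u times the first LLR plus the second LLR. Python integers stand in for the W-bit parts, and all names are hypothetical.

```python
def llr_value(sign_bit, part):
    """Value of an LLR held as (additional sign bit, two's complement part)."""
    return -part if sign_bit else part


def f_unit(a_sign, a, b_sign, b):
    """Min-sum 'f': one add/subtract, then the MSB of the result drives a select."""
    a_neg, b_neg = a < 0, b < 0              # MSBs of the two's complement parts
    d = b - a if a_neg == b_neg else a + b   # the single adder; fits in W+1 bits
    select_b = (d < 0) != b_neg              # True when |b| is the smaller magnitude
    if select_b:
        return a_sign ^ b_sign ^ int(a_neg), b
    return a_sign ^ b_sign ^ int(b_neg), a


def g_unit(a_sign, a, b_sign, b, u):
    """'g': add or subtract the first part; the sign bit is copied from the second LLR."""
    subtract = u ^ a_sign ^ b_sign
    return b_sign, (b - a if subtract else b + a)


# Example: first LLR = -(3) = -3 (sign bit set), second LLR = +5.
print(llr_value(*f_unit(1, 3, 0, 5)))       # f(-3, 5)
print(llr_value(*g_unit(1, 3, 0, 5, 1)))    # g(-3, 5, u=1)
```

In this sketch the output sign bit of "f" is a simple XOR of input sign bits and one MSB, and the output sign bit of "g" is copied unchanged, so neither function needs to negate a two's complement number.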

In some examples, the polar decoder core further includes an external data path comprising an f/g function graph having a first number (s_o) of processing stages. Each of the s_o processing stages comprises a second number of processing units that perform only "f" functions and a second number of processing units that perform only "g" functions. In this way, some processing units may be optimized to perform only the "f" function, while other processing units may be optimized to perform only the "g" function, thereby reducing hardware usage.

In some examples, the polar decoder core includes an internal data path comprising a plurality of processing units arranged into a plurality (s_i) of processing stages, each configured to perform at least one of an "f" function or a "g" function. The rightmost stage comprises a first number (n_i/2) of processing units, and each successive stage to the left of the rightmost stage comprises half as many processing units as the stage to its right. In this way, the hardware of the internal data path can be flexibly reused to perform different combinations of the "f" and "g" functions, thereby reducing hardware usage.

In some examples, an access index (v) in the range 0 to 2^(s_c) - 1 is expressed in base 2 as a binary number having a first number (s_c) of bits, wherein each successive bit, from right to left, is used to control whether an "f" function or a "g" function is performed by the processing units of each successive stage, from left to right, of the internal data path. That is, the least significant bit (LSB) of the binary number controls the leftmost stage of the plurality of processing units and the most significant bit (MSB) of the binary number controls the rightmost stage of the plurality of processing units. In this way, control of the processing units can be achieved using simple hardware based only on a counter for the access index.

In some examples, an incrementally increasing bit width of the fixed-point number representation is used in each successive processing stage from right to left. In this way, overflow in the external and internal data paths can be avoided, thereby improving the error correction capability of the polar decoder.
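The mapping from the access index to the function performed by each stage can be sketched as below. The function name is hypothetical, and the convention that a bit value of 0 selects "f" and a bit value of 1 selects "g" is an assumption; the text above only states which bit controls which stage.

```python
def stage_functions(v, s_c):
    """Return the function ('f' or 'g') for each stage, leftmost stage first.

    Bit 0 (the LSB) of v controls the leftmost stage and bit s_c - 1
    (the MSB) controls the rightmost stage. Assumes 0 -> 'f' and 1 -> 'g'.
    """
    if not 0 <= v < (1 << s_c):
        raise ValueError("access index out of range")
    return ["g" if (v >> stage) & 1 else "f" for stage in range(s_c)]


# A simple counter over v walks through every f/g combination for s_c = 2.
for v in range(4):
    print(v, stage_functions(v, 2))
```

Under this assumed convention, counting v upwards toggles the leftmost stage fastest, which is consistent with the leftmost stages of the graph being revisited most often during successive cancellation decoding.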

In some examples, the polar decoder core further includes a clipping circuit 2411, the clipping circuit 2411 configured to reduce a bit width (W) of the LLR output on a leftmost stage of the plurality of processing units to match a bit width of the LLR on a rightmost stage of the plurality of processing units. In this way, all LLR memory blocks may use the same number of bits to represent LLRs without requiring a greater number of bits in consecutive LLR memory blocks. This reduces hardware usage while minimizing the use of clipping to preserve the error correction capability of the polar decoder.
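A clipping operation of this kind can be sketched as saturation to the range of a narrower two's complement number. The function name and the use of the full asymmetric two's complement range are assumptions for illustration; the exact clipping behaviour of circuit 2411 is not specified in this passage.

```python
def clip_llr(x, width):
    """Saturate integer x to the range of a `width`-bit two's complement number."""
    lo = -(1 << (width - 1))
    hi = (1 << (width - 1)) - 1
    return max(lo, min(hi, x))


# An LLR that has grown by two bits across two stages is clipped back to W = 6 bits.
print(clip_llr(100, 6), clip_llr(-100, 6), clip_llr(5, 6))
```

Saturating once, at the output of the leftmost stage, rather than after every adder, means that values which stay in range pass through unchanged.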

In some examples, the clipping circuit 2411 is configured to additionally reduce the bit width of an intermediate processing stage between a rightmost stage of the plurality of processing units and a leftmost stage of the plurality of processing units. In this way, the hardware resource usage of the processing unit in the leftmost stage can be reduced at the cost of slightly reducing the error correction capability of the polar decoder.

In some examples, the polar decoder core further includes a plurality of LLR memory blocks coupled to the plurality of processing units, wherein each respective input LLR is converted to a two's complement fixed-point number before being stored in the plurality of LLR memory blocks. In this way, the number of bits that must be stored in the LLR memory blocks is reduced, thereby reducing the associated hardware usage.

In some examples, when an input LLR is written to an LLR memory block and the additional sign bit of its fixed-point number representation is set, the two's complement binary number part of the fixed-point number representation is negated by inverting all of its bits and then using a further single two-input adder to increment the resulting value, thereby converting it to a two's complement fixed-point representation. In this way, the conversion from the fixed-point number representation to the two's complement fixed-point representation can be accomplished using only simple hardware.

In some examples, when an input LLR is read from an LLR memory block, its two's complement binary number is converted to the fixed-point number representation by appending a zero-valued additional sign bit. In this way, the conversion from the two's complement fixed-point representation to the fixed-point number representation can be accomplished using only simple hardware.
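The two conversions described above can be sketched at the bit level as follows. The function names are hypothetical, and the sketch assumes that the value of the fixed-point representation is the two's complement part negated when the additional sign bit is set.

```python
def to_memory(sign_bit, part_bits, width):
    """Write path: fold the additional sign bit into a `width`-bit two's complement word.

    If the sign bit is set, all bits are inverted and the result is incremented.
    Note: the most negative part value cannot be negated in `width` bits and
    wraps around; handling of that corner case is outside this sketch.
    """
    mask = (1 << width) - 1
    if sign_bit:
        part_bits = ((part_bits ^ mask) + 1) & mask
    return part_bits


def from_memory(stored_bits):
    """Read path: append a zero-valued additional sign bit."""
    return 0, stored_bits


def twos_to_int(bits, width):
    """Interpret a `width`-bit word as a signed two's complement integer."""
    return bits - (1 << width) if bits & (1 << (width - 1)) else bits


stored = to_memory(1, 0b000101, 6)           # sign bit set, part = +5, value = -5
print(format(stored, "06b"), twos_to_int(stored, 6))
print(from_memory(stored))
```

Only the write path needs an incrementer; the read path is pure wiring, which matches the claim that both conversions need only simple hardware.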

In a second aspect, examples of the invention describe a communication unit comprising a polar decoder core according to the first aspect.

In a third aspect, examples of the invention describe an integrated circuit comprising a polar decoder core according to the first aspect.

In a fourth aspect, examples of the invention describe a method of polar decoding corresponding to the first aspect. The method comprises: receiving at least one input log-likelihood ratio (LLR) in the format of a fixed-point number representation comprising a two's complement binary number and an additional sign bit; manipulating the at least one input LLR in that format; and outputting the manipulated at least one LLR in the format of a fixed-point number representation comprising a two's complement binary number and an additional sign bit.

In a fifth aspect, examples of the invention describe a non-transitory tangible computer program product comprising executable code stored therein for polar decoding according to the fourth aspect.

Motivated by the discussion above, the present invention provides a novel polar decoder architecture that enables flexible, low-latency, hardware-efficient SCL polar decoding. Rather than processing one stage of the polar code graph at a time, the proposed architecture achieves a higher degree of parallelism by processing several consecutive stages at once. It is demonstrated that this parallel processing can be fully exploited throughout the majority of the f and g computations, achieving greater hardware utilization than line and semi-parallel architectures. Furthermore, since several consecutive stages are processed at once, memory is only required at the interfaces between each pair of consecutive groups of stages, rather than at the interfaces between each pair of consecutive individual stages. This significantly reduces the overall memory requirement of the proposed architecture relative to previous implementations, which is particularly significant because memory is the largest contributor to hardware resource usage.

Although the examples of the present invention are described with reference to the use of LLR memory blocks, it is contemplated that these memory blocks are used to store soft bits of any form, and the use of LLR memory blocks to store soft bits as LLRs is for illustrative purposes only.

Although examples of the present invention are described with reference to integrated circuit implementations within applications of wireless communication receivers, it is contemplated that in other examples, the present invention may be applied in other implementations and other applications. For example, the circuits and concepts described herein may be constructed as, for example, an application specific integrated circuit, an application specific instruction set processor, an application specific standard product, a field programmable gate array, a general purpose graphics processing unit, a system on a chip, a hardware implementation within a configurable processor. Similarly, it is contemplated that in other examples, for example, the software implementation may be comprised within a central processing unit, digital signal processor, or microcontroller. In addition to wireless communication receivers, the present invention may be configured into wireless communication transceivers or communication devices for other communication channels, such as optical, wired, or ultrasonic channels. Furthermore, the present invention may be configured into a storage device to provide FEC for data recovered from, for example, optical, magnetic, quantum or solid state media.

Examples of the present invention also provide a method and architecture for decoding information according to the principles of polarity decoding, with the objective of providing FEC during communication over unreliable channels or during storage in unreliable media. Examples of the present invention also provide a method and architecture that provides flexible support for information blocks that include multiple bits that vary from block to block.

Some examples of the invention are described with reference to the New Radio (NR) standard, which is currently being defined by the Third Generation Partnership Project (3GPP) as a candidate for fifth generation (5G) mobile communications. Currently, polar coding and decoding have been chosen to provide FEC in the uplink and downlink control channels for NR enhanced mobile broadband (eMBB) applications, as well as in the Physical Broadcast Channel (PBCH). Polar coding and decoding have also been identified as candidates for providing FEC for the uplink and downlink data and control channels of the ultra-reliable low latency communication (URLLC) and massive machine type communication (mMTC) applications of NR. Alternatively, some examples of the invention are described without reference to a particular standardized application. More broadly, the present invention may be applied to any future communication standard that selects polarity encoding and decoding to provide FEC. Furthermore, the present invention may be applied in non-standardized communication applications that may use polarity encoding and decoding to provide FEC for communication over wireless, wired, optical, ultrasound or other communication channels. Likewise, the invention may be applied in storage applications using polarity encoding and decoding to provide FEC in optical, magnetic, quantum, solid state and other storage media.

In some examples, the circuits and functions described herein may be implemented using discrete components and circuits, while in other examples, operations may be performed in a signal processor, e.g., in an integrated circuit.

Because the illustrated embodiments of the present invention may be implemented, to a great extent, using electronic components and circuits known to those skilled in the art, details will not be explained in any greater extent than that considered necessary as illustrated below, for the understanding and appreciation of the underlying concepts of the present invention and in order not to obfuscate or distract from the teachings of the present invention.

Detailed description of the drawings

Referring now to fig. 1, there is illustrated a top-level schematic diagram of a communication unit 116 including a polarity encoder and a polarity decoder, suitable for use in accordance with an example of the present invention. In this example of the communication unit 116, the skilled person will appreciate that many other components and circuits (such as frequency generation circuits, controllers, amplifiers, filters, etc.) are not shown for simplicity purposes only. In other examples, it is contemplated that block 116 may also take the form of an integrated circuit that includes a polarity decoder (and in some cases block conditioning and polarity decoding processing functions), such as for use in a communication unit, a memory unit, or any electronic device designed to use polarity decoding. In other examples, it is contemplated that block 116 may take the form of software running on a general purpose computing processor.

The polarity decoder includes three sequential components, namely an information block adjustment 112, a polarity decoder core 111, and an encoding block adjustment 110. These components will be discussed in the following paragraphs. To provide the context of this discussion, fig. 1 illustrates the communication or storage channel 108 and the corresponding components of the polarity encoder, namely the information block adjustment 101, the polarity encoder core 102, and the encoding block adjustment 103, although they operate in reverse order. As will be discussed in the following paragraphs, the polarity decoder operates based on the recovered information block 115, the recovered core information block 114, the soft core coding block 113, and the soft coding block 109. Correspondingly, the polarity encoder operates based on the information block 104, the core information block 105, the core encoding block 106 and the encoding block 107, although they are processed in the reverse order.

In order to understand the operation of the polarity decoder, in particular the polarity decoder core 111, it is first worth considering the operation of the polarity encoder core 102. In the context of a polar encoder, the input of the information block adjusting component 101 may be referred to as an information block 104, the block size of the information block 104 being K. More specifically, the information block is a row vector containing K information bits $a = [a_i]_{i=0}^{K-1}$, where $a_i \in \{0, 1\}$. The information block adjusting component 101 interleaves the K information bits with N-K redundancy bits, which may be, for example, frozen bits [1], Cyclic Redundancy Check (CRC) bits [2], Parity Check (PC) frozen bits [3], User Equipment Identification (UE-ID) bits [4], or hash bits [5].

Here, the frozen bits typically take a logical value of "0", while the CRC bits, PC frozen bits or hash bits may take values obtained from the information bits, or from redundancy bits that were interleaved earlier in the process. The information block adjusting component 101 generates the redundancy bits and interleaves them into positions identified by a prescribed method, which is also known to the polarity decoder. The information block adjusting component 101 may also comprise an interleaving operation, which may for example implement a bit reversal permutation [1]. The output of the information block adjusting component 101 may be referred to as a core information block 105, the core information block 105 having a block size N. More specifically, the core information block 105 is a row vector comprising N core information bits $u = [u_j]_{j=0}^{N-1}$, where $u_j \in \{0, 1\}$. Here, the information block adjustment must be completed such that N is a power of 2 that is greater than K, to provide compatibility with the polarity encoder core, which operates based on a generator matrix whose dimensions are a power of 2, as will be discussed below. The input of the polarity encoder core 102 is the core information block u 105, and the output of the polarity encoder core 102 may be referred to as a core encoded block 106, the block size of the core encoded block 106 matching the core block size N. More specifically, the core encoded block 106 is a row vector comprising N core encoded bits $x = [x_i]_{i=0}^{N-1}$, where $x_i \in \{0, 1\}$. Here, the core encoded block 106 is obtained according to the modulo-2 matrix multiplication $x = uF^{\otimes n}$, in which the modulo-2 sum of two bit values may be obtained as their XOR. Here, the generator matrix $F^{\otimes n}$ is given by the [n = log2(N)]th Kronecker power of the following kernel matrix:

$$F = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$$

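Note that successive Kronecker powers of the kernel matrix may be obtained recursively, where each power $F^{\otimes n}$ is obtained by replacing each logical "1" in the previous power $F^{\otimes (n-1)}$ with the kernel matrix, and by replacing each logical "0" with a 2x2 zero matrix. Thus, the nth Kronecker power $F^{\otimes n}$ of the kernel matrix is of size $2^n \times 2^n$. For example,

$$F^{\otimes 2} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 1 \end{bmatrix}, \qquad F^{\otimes 3} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}$$

Here, u = [1011] gives $x = uF^{\otimes 2} = [1101]$, and u = [11001001] gives $x = uF^{\otimes 3} = [00110111]$.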
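By way of illustration only, the recursive construction of the Kronecker powers and the modulo-2 matrix multiplication may be sketched in Python as follows. The function names `kronecker_power` and `encode_by_matrix` are hypothetical and are not part of the described apparatus; the kernel matrix is taken to be [[1, 0], [1, 1]], which is consistent with the N=2 behaviour described for FIG. 2.

```python
def kronecker_power(n):
    """Return the n-th Kronecker power of the kernel [[1, 0], [1, 1]] as a list of rows."""
    kernel = [[1, 0], [1, 1]]
    g = [[1]]  # zeroth Kronecker power
    for _ in range(n):
        # Each "1" of the previous power becomes a copy of the kernel,
        # and each "0" becomes a 2x2 block of zeros.
        g = [[a * b for a in row_g for b in row_k]
             for row_g in g for row_k in kernel]
    return g


def encode_by_matrix(u):
    """Compute x = u * F^(n) modulo 2, where len(u) = 2**n."""
    size = len(u)
    n = size.bit_length() - 1
    g = kronecker_power(n)
    return [sum(u[j] * g[j][i] for j in range(size)) % 2 for i in range(size)]


print(encode_by_matrix([1, 0, 1, 1]))              # [1, 1, 0, 1]
print(encode_by_matrix([1, 1, 0, 0, 1, 0, 0, 1]))  # [0, 0, 1, 1, 0, 1, 1, 1]
```

The two printed results correspond to the inputs u = [1011] and u = [11001001] considered above.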

The skilled person will appreciate that the degree of integration of a circuit or component may in some cases depend on the implementation. Furthermore, it is contemplated that in some examples a signal processor may be included in the communication unit 116 and adapted to implement encoder and decoder functions. Alternatively, as shown in fig. 1, a single processor may be used to implement both the processing of the transmit and receive signals and some or all of the baseband/digital signal processing functions. It is apparent that the various components within the wireless or wired communication unit 116, such as the polar encoder described, may be implemented in discrete or integrated component form, so the final structure is application or design specific.

In some examples, the operation of the polarity encoder core 102 may be represented by a graphical representation 201, 202, 203 of the generator matrix $F^{\otimes n}$, as illustrated in fig. 2. Referring now to FIG. 2, an example graphical representation 200 of the generator matrices $F$ 201, $F^{\otimes 2}$ 202 and $F^{\otimes 3}$ 203 is illustrated in accordance with an example of the present invention. The graphical representations 200 of the generator matrices are examples of small polar code graphs, whereas in general a polar code graph may be much larger and have any dimension n>0. Thus, the example in fig. 2 illustrates a much simpler arrangement than would exist in practice, purely for explanatory purposes and so as not to obscure the description of the invention.

Here, each modulo-2 addition $\oplus$ 204 may be implemented using a binary exclusive-or (XOR) operation. Note that each graph includes "N" inputs on its left edge 205 and "N" outputs on its right edge 206, corresponding to the "N" core information bits of "u" 105 and the "N" core encoded bits of "x" 106. The graphical representations of the generator matrices $F$ 201, $F^{\otimes 2}$ 202 and $F^{\otimes 3}$ 203 comprise n = log2(N) stages 207, each stage comprising N/2 vertically aligned XORs 204, giving a total of $N \log_2(N)/2$ XORs. Note that there are data dependencies between successive stages 207 that enforce a left-to-right processing schedule. More specifically, the data dependencies prevent the computation of the XORs in a particular stage 207 until after the XORs in the stage 207 to its left have been computed.

In some examples, in a manner similar to the recursive nature of the successive Kronecker powers $F^{\otimes n}$, the successive graphical representations of these generator matrices also have a recursive relationship. More specifically, the graphical representation 200 of the polar encoding core operation 201 for a core block size of N=2 includes a single stage 207, the single stage 207 containing a single XOR 204. Notably, in the example polarity encoder, the first of the N=2 core encoded bits is obtained as the XOR of the N=2 core information bits, while the second core encoded bit is equal to the second core information bit. For a larger core block size "N", the graphical representation may be considered as a vertical concatenation of two graphical representations for a core block size of N/2, followed by an additional stage 207 of XORs. Similar to the N=2 core described above, the first N/2 of the N core encoded bits are obtained as the XORs of the corresponding bits output by the two N/2 cores, while the last N/2 core encoded bits are equal to the outputs of the second N/2 core.
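As a minimal sketch of the stage-by-stage view of the encoder core, the following Python function walks through the n = log2(N) stages from left to right and counts the XORs performed. The name `encode_by_graph` is hypothetical. The sketch assumes that the two connections of each XOR are adjacent in the leftmost stage and that their vertical offset doubles in each successive stage, which matches the recursive structure described above.

```python
def encode_by_graph(u):
    """Apply the XOR stages of the polar code graph, left to right.

    Returns the core encoded bits and the number of XORs performed.
    """
    x = list(u)
    size = len(x)
    xor_count = 0
    offset = 1  # vertical separation of the two connections joined by an XOR
    while offset < size:
        for top in range(size):
            if top & offset == 0:  # 'top' is the upper connection of an XOR
                x[top] ^= x[top + offset]
                xor_count += 1
        offset *= 2  # the next stage to the right uses twice the offset
    return x, xor_count


print(encode_by_graph([1, 0, 1, 1]))              # ([1, 1, 0, 1], 4)
print(encode_by_graph([1, 1, 0, 0, 1, 0, 0, 1]))  # ([0, 0, 1, 1, 0, 1, 1, 1], 12)
```

The XOR counts of 4 and 12 agree with N log2(N)/2 for N=4 and N=8 respectively.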

In this example, the input of the encoding block adjustment component 103 of the polarity encoder is the core encoded block x 106, and its output may be referred to as an encoded block 107, the block size of the encoded block 107 being M. More specifically, the encoded block is a row vector comprising M encoded bits $b = [b_k]_{k=0}^{M-1}$, where $b_k \in \{0, 1\}$.

Here, the resulting polarity encoding rate is given by R = K/M, where the encoding block adjustment 103 must be completed such that "M" is greater than "K". The encoding block adjustment component 103 may use various techniques to generate the "M" encoded bits in the encoded block b 107, where "M" may be higher or lower than "N". More specifically, repetition [6] may be used to repeat some of the "N" bits in the core encoded block "x", while shortening or puncturing techniques [6] may be used to remove some of the "N" bits in the core encoded block "x". Note that shortening removes bits that are guaranteed to have a logical value of "0", while puncturing removes bits that may have a logical value of either "0" or "1". The encoding block adjustment component may also include an interleaving operation. After polarity encoding, the encoded block "b" 107 may be provided to a modulator, which transmits it over the communication channel 108.

Referring now to fig. 3, an example polarity encoding process is illustrated using an extension of the graphical representation 300 of the generator matrix $F^{\otimes 3}$ 203, for the case where a specific frozen bit pattern is used to convert the K=4 information bits a = [1001] 104 into the M=8 encoded bits b = [00001111] 107. More specifically, the information block adjustment 101 converts the K=4 information bits a = [1001] 104 into the N=8 core information bits u = [00010001] 105. The polarity encoder core 102 then converts these into the N=8 core encoded bits x = [00001111] 106 using the polar code graph 203. Here, the input paths can be traced through the various XOR operations to identify the output. Finally, the encoding block adjustment 103 preserves all of the core encoded bits, to provide the M=8 encoded bits b = [00001111] 107.
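The three encoding steps of this example may be sketched as follows. The frozen bit pattern of fig. 3 is not reproduced in the text, so the information bit positions `INFO_POSITIONS = [3, 5, 6, 7]` are an assumption, chosen because they reproduce the quoted core information block u = [00010001]; the function names are likewise hypothetical.

```python
# Assumed information bit positions (all other positions carry frozen "0" bits).
INFO_POSITIONS = [3, 5, 6, 7]


def adjust_information_block(a, size, info_positions):
    """Interleave the information bits with frozen bits of value 0."""
    u = [0] * size
    for bit, position in zip(a, info_positions):
        u[position] = bit
    return u


def polar_encoder_core(u):
    """Apply the XOR stages of the polar code graph from left to right."""
    x = list(u)
    offset = 1
    while offset < len(x):
        for top in range(len(x)):
            if top & offset == 0:
                x[top] ^= x[top + offset]
        offset *= 2
    return x


a = [1, 0, 0, 1]                                    # K = 4 information bits
u = adjust_information_block(a, 8, INFO_POSITIONS)  # N = 8 core information bits
x = polar_encoder_core(u)                           # N = 8 core encoded bits
b = list(x)                                         # M = N: all bits are preserved

print(u)  # [0, 0, 0, 1, 0, 0, 0, 1]
print(b)  # [0, 0, 0, 0, 1, 1, 1, 1]
```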

In the receiver, the role of the demodulator is to recover information about the encoded block. However, owing to the random nature of the noise in the communication channel 108, the demodulator is typically unable to obtain absolute confidence in the values of the M bits in the encoded block 107. The demodulator may express its confidence in the values of the bits in the encoded block 107 by generating a soft encoded block $\tilde{b}$ 109 of block size M. More specifically, the soft encoded block 109 is a row vector comprising M encoded soft bits $\tilde{b} = [\tilde{b}_k]_{k=0}^{M-1}$. Each soft bit may be represented in the form of a Log-Likelihood Ratio (LLR):

$$\tilde{b}_k = \ln\frac{\Pr(b_k = 0)}{\Pr(b_k = 1)}$$

where $\Pr(b_k = 0)$ and $\Pr(b_k = 1)$ are probabilities that sum to 1.

Here, a positive LLR $\tilde{b}_k$ indicates that the demodulator has greater confidence that the corresponding bit $b_k$ has a value of "0", while a negative LLR indicates greater confidence in the bit value "1". The magnitude of the LLR expresses how much confidence the demodulator has, where an infinite magnitude corresponds to absolute confidence in the bit value, and a magnitude of 0 indicates that the demodulator has no information about whether the bit value "0" or "1" is more likely.
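The sign and magnitude conventions of the LLR may be illustrated with a short sketch, assuming the natural logarithm is used. The function names are hypothetical, and resolving an LLR of exactly zero to the bit value "0" is an arbitrary choice made for this sketch only.

```python
import math


def llr_from_probability(p0):
    """LLR of a bit whose probability of being "0" is p0, for 0 < p0 < 1."""
    return math.log(p0 / (1.0 - p0))


def hard_decision(llr):
    """Bit value favoured by the LLR (a zero LLR is resolved to 0 by assumption)."""
    return 0 if llr >= 0 else 1


print(llr_from_probability(0.5))                 # 0.0: no information either way
print(hard_decision(llr_from_probability(0.9)))  # 0: positive LLR favours "0"
print(hard_decision(llr_from_probability(0.1)))  # 1: negative LLR favours "1"
```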

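In an alternative approach, each soft bit may be represented by a pair of log-likelihoods (LLs), one for each possible value of the bit:

$$\tilde{b}_k(0) = \ln[\Pr(b_k = 0)], \qquad \tilde{b}_k(1) = \ln[\Pr(b_k = 1)]$$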

The polarity decoder includes three sequential components, namely, an encoded block adjustment 110, a polarity decoder core 111, and an information block adjustment 112, as shown in fig. 1. These components will be discussed in the following paragraphs.

The input of the encoded block adjustment component 110 of the polar decoder is the soft encoded block $\tilde{b}$ 109, and its output may be referred to as a soft core encoded block 113 of block size N. More specifically, the soft core encoded block 113 is a row vector comprising "N" core encoded LLRs $\tilde{x} = [\tilde{x}_i]_{i=0}^{N-1}$. To convert the M encoded LLRs into "N" core encoded LLRs, infinite-valued LLRs may be interleaved with the soft encoded block 109, to occupy the positions within the soft core encoded block that correspond to the "0"-valued core encoded bits removed by shortening in the polar encoder. Likewise, zero-valued LLRs may be interleaved with the soft encoded block 109, to occupy the positions of the core encoded bits removed by puncturing. In the case of repetition, the LLRs that correspond to copies of a particular core encoded bit may be summed and placed in the corresponding position within the soft core encoded block 113. If interleaving is employed within the encoding block adjustment component 103 of the polarity encoder, a corresponding de-interleaving operation may also be performed.
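A minimal sketch of this conversion is given below. The function name `adjust_soft_encoded_block` and its `mapping` and `shortened` parameters are hypothetical: `mapping[k]` is assumed to give the core encoded position to which the k-th received LLR belongs, and any position that receives no LLR and is not shortened is treated as punctured.

```python
import math


def adjust_soft_encoded_block(soft_bits, size, mapping, shortened=()):
    """Convert M encoded LLRs into N core encoded LLRs."""
    x_soft = [0.0] * size            # punctured positions keep an LLR of zero
    for k, llr in enumerate(soft_bits):
        x_soft[mapping[k]] += llr    # repeated copies of a bit are summed
    for position in shortened:
        x_soft[position] = math.inf  # shortened bits are known to be "0"
    return x_soft


# Repetition: M = 6 LLRs for N = 4 bits, where bits 0 and 1 were sent twice.
print(adjust_soft_encoded_block([1.0, -2.0, 0.5, 3.0, 0.5, -1.0], 4,
                                [0, 1, 2, 3, 0, 1]))
# [1.5, -3.0, 0.5, 3.0]

# M = 2 LLRs for N = 4 bits: position 0 punctured and position 3 shortened.
print(adjust_soft_encoded_block([1.0, -1.0], 4, [1, 2], shortened=[3]))
# [0.0, 1.0, -1.0, inf]
```

Any de-interleaving would be folded into `mapping` in this sketch.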

The input to the polar decoder core 111 is the soft core encoded block $\tilde{x}$ 113, and its output may be referred to as a recovered core information block 114 of block size "N". More specifically, the recovered core information block 114 is a row vector comprising "N" recovered core information bits $\hat{u} = [\hat{u}_j]_{j=0}^{N-1}$, where $\hat{u}_j \in \{0, 1\}$. In some examples, the polarity decoder core 111 may operate using a variety of different algorithms, including Successive Cancellation (SC) decoding [1] and Successive Cancellation List (SCL) decoding [7].

The input to the information block adjustment component 112 of the polarity decoder is the recovered core information block $\hat{u}$ 114, and its output may be referred to as a recovered information block 115 of block size "K". More specifically, the recovered information block 115 is a row vector comprising "K" recovered information bits $\hat{a} = [\hat{a}_i]_{i=0}^{K-1}$, where $\hat{a}_i \in \{0, 1\}$. The recovered information block is obtained by removing all redundancy bits from the recovered core information block $\hat{u}$ 114. If interleaving is employed within the information block adjusting component 101 of the polarity encoder, a corresponding de-interleaving operation may also be performed.

1) SC decoding:

A polarity decoder core operating based on SC decoding may be considered to have a graph structure 201, 202, 203 similar to that of the polarity encoder, as shown in fig. 2. It can be observed that each stage 207 of the graph comprises N/2 basic computation units, each resembling the N=2 graph 201. More specifically, each basic computation unit has two connections on its left edge, which are connected to basic computation units in the stage 207 immediately to the left, or to the left edge 205 of the graph if there is no stage to the left. These connections on the left edge of the basic computation unit are horizontally aligned with two connections on its right edge, which are connected to basic computation units in the stage 207 immediately to the right, or to the right edge 206 of the graph if there is no stage to the right. Within the basic computation unit, the first of the two connections on the right edge is connected to both connections on the left edge via an XOR 204, while the second connection on the right edge is connected directly to the second connection on the left edge. In the leftmost stage of the graph, the two connections on the left and right edges of each basic computation unit are vertically adjacent to each other. In the other stages, however, the two connections of each basic computation unit are vertically separated from each other by an offset that doubles in each successive stage 207.

The SC decoder performs computations pertaining to the basic computation units according to a sequence dictated by data dependencies. More specifically, there are three types of computation that can be performed for a particular basic computation unit, depending on the availability of LLRs provided on the connections 403, 404 on its right edge and on the availability of bits provided on the connections 401, 402 on its left edge.

The first occasion on which a basic computation unit can contribute to the SC decoding process is when LLRs have been provided on both of the connections 403, 404 on its right edge. As shown in fig. 4 (a), we refer to the first and second of these two LLRs as $\tilde{x}_a$ and $\tilde{x}_b$, respectively. This enables the basic computation unit to compute an LLR $\tilde{x}_c$ for the first 401 of the two connections on its left edge, according to the following f-function:

$$\tilde{x}_c = 2\tanh^{-1}\left(\tanh(\tilde{x}_a/2)\tanh(\tilde{x}_b/2)\right) \qquad (1)$$

$$\tilde{x}_c \approx \operatorname{sign}(\tilde{x}_a)\operatorname{sign}(\tilde{x}_b)\min(|\tilde{x}_a|, |\tilde{x}_b|) \qquad (2)$$

where sign(·) returns "-1" if its argument is negative and "+1" if its argument is positive.

Later in the SC decoding process, a bit $\hat{u}_a$ will be provided on the first 401 of the connections on the left edge of the basic computation unit, as shown in fig. 4 (b). Together with the LLRs $\tilde{x}_a$ and $\tilde{x}_b$ previously provided on the connections 403, 404 on the right edge, this enables the basic computation unit to compute an LLR $\tilde{x}_d$ for the second 402 of the two connections on its left edge, according to the following g-function:

$$\tilde{x}_d = (-1)^{\hat{u}_a}\tilde{x}_a + \tilde{x}_b \qquad (3)$$

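Later, as shown in FIG. 4 (c), a bit $\hat{u}_b$ will be provided on the second 402 of the connections on the left edge of the basic computation unit. Together with the bit $\hat{u}_a$ previously provided on the first 401 of the connections on the left edge, this enables the partial sum computation of the bits $\hat{u}_c$ and $\hat{u}_d$ for the first 403 and second 404 connections on the right edge of the basic computation unit, where:

$$\hat{u}_c = \operatorname{XOR}(\hat{u}_a, \hat{u}_b) \qquad (4)$$

$$\hat{u}_d = \hat{u}_b \qquad (5)$$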

As can be appreciated from the discussion above, the f-function of (1) or (2) can be used to propagate LLRs from right to left within the graph, while the partial sum computations of (4) and (5) can be used to propagate bits from left to right, and the g-function of (3) can be used to switch from propagating bits to propagating LLRs.
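The three computations of a basic computation unit may be sketched in Python as follows. The function and argument names are hypothetical: `xa` and `xb` stand for the two LLRs on the right edge, and `ua` and `ub` for the two bits on the left edge. `f_exact` follows the form of (1), `f_min_sum` the approximation of (2), `g` the form of (3), and `partial_sum` the forms of (4) and (5).

```python
import math


def f_exact(xa, xb):
    """LLR for the first left connection, exact form.

    Note: math.atanh raises ValueError if the product rounds to +/-1,
    which can happen for very large LLR magnitudes.
    """
    return 2.0 * math.atanh(math.tanh(xa / 2.0) * math.tanh(xb / 2.0))


def f_min_sum(xa, xb):
    """LLR for the first left connection, min-sum approximation."""
    sign = -1.0 if (xa < 0) != (xb < 0) else 1.0
    return sign * min(abs(xa), abs(xb))


def g(xa, xb, ua):
    """LLR for the second left connection, given the bit on the first."""
    return (-xa if ua else xa) + xb


def partial_sum(ua, ub):
    """Bits for the first and second right connections."""
    return ua ^ ub, ub


print(f_min_sum(2.0, -2.5))  # -2.0
print(g(2.0, -2.5, 1))       # -4.5
print(partial_sum(1, 1))     # (0, 1)
```

The min-sum form avoids the transcendental functions of the exact form, at the cost of slightly overestimating the magnitude of the result.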

In order that LLRs can be propagated from right to left, LLRs must first be provided on the connections on the right edge 206 of the graph. This is done at the start of the SC decoding process, by providing successive LLRs from the soft core encoded block $\tilde{x}$ 113 on successive connections on the right edge 206 of the graph. Likewise, bits must be provided on the connections on the left edge 205 of the graph, in order that bits can be propagated from left to right. Here, a further data dependency beyond those described above is imposed. If the position of a particular connection on the left edge of the graph corresponds to the position of an information bit in the core information block u 105, then the bit input into that connection depends on the LLR output from that connection. More specifically, if a positive LLR is output on the connection, then a value of "0" may be selected for the corresponding bit of the recovered core information block $\hat{u}$ 114, and the value "0" is then input into the connection. Meanwhile, a negative LLR allows a value of "1" to be selected for the corresponding bit of the recovered core information block 114, and the value "1" is then input into the connection. In the case of a connection corresponding to a redundancy bit in the core information block u 105, the value of the redundancy bit may be input into the connection as soon as it is known. Here, the values of frozen bits and UE-ID bits may be known before the SC decoding process begins, but the values of CRC, PC and hash bits may not become available until the related information bits have been recovered.

In combination, the data dependencies described above require the bits of the recovered core information block $\hat{u}$ 114 to be obtained one at a time, in order from top to bottom, on the connections on the left edge 205 of the graph. More specifically, the SC decoding process begins by using the f-function of (1) or (2) to propagate LLRs from the right edge 206 of the graph to the top connection on the left edge 205 of the graph, allowing the first bit to be recovered. Following this, each successive bit from top to bottom is recovered by using the partial sum computations of (4) and (5) to propagate bits from left to right, then using the g-function of (3) for a particular basic computation unit to switch from bit propagation to LLR propagation, before using the f-function to propagate LLRs to the next connection on the left edge 205 of the graph, allowing the corresponding bit to be recovered. This process is illustrated in the example of fig. 5.

FIG. 5 illustrates an example of an SC decoding process according to an example embodiment of the invention, using an extension of the graphical representation of the generator matrix $F^{\otimes 3}$ 203, for the case where a particular frozen bit pattern is used to convert a particular vector $\tilde{b}$ of M=8 encoded LLRs 109 into the K=4 recovered information bits $\hat{a}$ = [1001] 115. The LLRs obtained using the f and g functions of equations (2) and (3) are shown above each connection. The bits obtained using the partial sum computations of equations (4) and (5) are shown below each connection. The accompanying numbers in brackets identify the step of the SC decoding process in which the corresponding LLR or bit becomes available.
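The SC decoding schedule may be sketched recursively, as below. This is an illustration under stated assumptions rather than the schedule of fig. 5 itself: the LLR values of fig. 5 are not reproduced in the text, so the `llrs` vector is invented for illustration; the frozen bit pattern `FROZEN` is assumed; all redundancy bits are assumed to be frozen bits of value "0"; the min-sum form of the f-function is used; and the function names are hypothetical.

```python
def f_min_sum(xa, xb):
    sign = -1.0 if (xa < 0) != (xb < 0) else 1.0
    return sign * min(abs(xa), abs(xb))


def g(xa, xb, ua):
    return (-xa if ua else xa) + xb


def sc_decode(llrs, frozen):
    """Return (recovered core information bits, bits propagated to the right edge)."""
    if len(llrs) == 1:
        # Left edge of the graph: frozen bits are 0, others follow the LLR sign.
        bit = 0 if frozen[0] or llrs[0] >= 0 else 1
        return [bit], [bit]
    half = len(llrs) // 2
    upper, lower = llrs[:half], llrs[half:]
    # The f-function propagates LLRs right to left for the upper half.
    u1, v1 = sc_decode([f_min_sum(a, b) for a, b in zip(upper, lower)],
                       frozen[:half])
    # The g-function uses the partial sums v1 to obtain LLRs for the lower half.
    u2, v2 = sc_decode([g(a, b, v) for a, b, v in zip(upper, lower, v1)],
                       frozen[half:])
    # The partial sum computation propagates bits left to right.
    return u1 + u2, [p ^ q for p, q in zip(v1, v2)] + v2


FROZEN = [True, True, True, False, True, False, False, False]  # assumed pattern
llrs = [2.0, 1.5, -0.5, 3.0, -2.5, -1.0, -2.0, -1.5]           # illustrative values
u_hat, x_hat = sc_decode(llrs, FROZEN)
a_hat = [bit for bit, is_frozen in zip(u_hat, FROZEN) if not is_frozen]

print(u_hat)  # [0, 0, 0, 1, 0, 0, 0, 1]
print(a_hat)  # [1, 0, 0, 1]
```

In this illustration the third LLR has the wrong sign for the encoded block [00001111], yet the decoder still recovers the information bits [1001], because the frozen bits constrain the decisions.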

2) SCL decoding:

In one example of the SC decoding process described herein, the value selected for each bit in the recovered information block 115 depends on the sign of the corresponding LLR, which in turn depends on the values selected for all previously recovered information bits. If this approach results in an incorrect value being selected for a particular bit, it will often cause a cascade of errors in all subsequent bits. The selection of an incorrect value for an information bit may be detected when subsequent frozen bits are considered, since the decoder knows that these bits should have a value of "0". More specifically, if the corresponding LLR has a sign that would imply a value of "1" for a frozen bit, this indicates that an error has occurred during the decoding of one of the preceding information bits. However, the SC decoding process offers no opportunity to consider alternative values for the preceding information bits. Once a value has been selected for an information bit, the SC decoding process moves on and the decision is final.

This motivates SCL decoding, which enables a list of alternative values for the information bits to be considered [7]. As the decoding process proceeds, it considers both options for the value of each successive information bit. More specifically, the SCL decoder maintains a list of candidate core information blocks, where the list and the core information blocks are built up as the SCL decoding process proceeds. At the start of the process, the list includes only a single core information block having a length of zero. Each time the decoding process reaches a frozen bit, a bit value of "0" is appended to the end of each core information block in the list. However, each time the decoding process reaches an information bit, two copies of the list of candidate core information blocks are created. Here, a bit value of "0" is appended to each block in the first copy, and a bit value of "1" is appended to each block in the second copy. The two lists are then merged to form a new list that is twice the length of the original list. This continues until the length of the list reaches a limit L, which is typically chosen to be a power of 2. From this point on, each time the length of the list is doubled when considering an information bit, the worst L of the 2L candidate core information blocks are identified and pruned from the list. In this way, the length of the list remains at L until the SCL decoding process is complete.

Here, the worst candidate core information blocks are identified by comparing and sorting metrics that are computed for each block [8], based on the LLRs obtained on the left edge 205 of the polar code graph. These LLRs are obtained throughout the SCL decoding process by using separate copies of the partial sum computations of (4) and (5) to propagate the bits of each candidate core information block from left to right into the polar code graph. Following this, separate copies of the f and g computations of (1) - (3) may be used to propagate the corresponding LLRs from right to left, as in the example SC decoding process described herein. The metric $\phi_{l,j}$ associated with appending the bit value $\hat{u}_{l,j}$ in position $j \in [0, N-1]$ to candidate core information block l is given by:

where $\tilde{x}_{l,j}$ is the corresponding LLR and $\phi_{l,j-1}$ is the metric calculated for the candidate core information block in the previous step of the SCL decoding process. Note that since the metrics accumulate across all bit positions $j \in [0, N-1]$, they must be calculated for all L candidate core information blocks whenever a frozen bit value of "0" is appended, and for all 2L candidates when both possible values of an information bit are considered. In the latter case, the 2L metrics are sorted, and the L candidates with the highest values are identified as the worst candidates and pruned from the list.

After completion of the SCL decoding process, the candidate core information block with the lowest metric may be selected as the recovered core information block 114. Alternatively, in CRC assisted SCL decoding [9], all candidates in the list that do not satisfy the CRC are pruned before the candidate with the lowest metric is selected and output.
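The list management described above may be sketched as follows. This is a deliberately simple illustration, not an efficient or authoritative implementation: the LLR of each bit is recomputed from scratch for every candidate; all redundancy bits are assumed to be frozen bits of value "0"; and the metric update shown, which adds the LLR magnitude whenever the appended bit contradicts the sign of the LLR, is a commonly used approximation adopted here as an assumption, not necessarily the expression used in this document. The names, the frozen bit pattern and the LLR values are hypothetical.

```python
def encode(u):
    x = list(u)
    offset = 1
    while offset < len(x):
        for top in range(len(x)):
            if top & offset == 0:
                x[top] ^= x[top + offset]
        offset *= 2
    return x


def f_min_sum(xa, xb):
    sign = -1.0 if (xa < 0) != (xb < 0) else 1.0
    return sign * min(abs(xa), abs(xb))


def g(xa, xb, ua):
    return (-xa if ua else xa) + xb


def next_bit_llr(llrs, previous_bits):
    """LLR of the next bit, given the bits already appended to a candidate."""
    if len(llrs) == 1:
        return llrs[0]
    half = len(llrs) // 2
    upper, lower = llrs[:half], llrs[half:]
    if len(previous_bits) < half:
        return next_bit_llr([f_min_sum(a, b) for a, b in zip(upper, lower)],
                            previous_bits)
    v1 = encode(previous_bits[:half])  # partial sums of the upper half
    return next_bit_llr([g(a, b, v) for a, b, v in zip(upper, lower, v1)],
                        previous_bits[half:])


def scl_decode(llrs, frozen, list_size):
    """Return up to list_size (bits, metric) candidates, best (lowest metric) first."""
    candidates = [([], 0.0)]  # a single candidate of length zero
    for is_frozen in frozen:
        extended = []
        for bits, metric in candidates:
            llr = next_bit_llr(llrs, bits)
            for value in ((0,) if is_frozen else (0, 1)):
                contradicts = (llr < 0) != (value == 1)
                penalty = abs(llr) if contradicts else 0.0  # assumed metric update
                extended.append((bits + [value], metric + penalty))
        extended.sort(key=lambda candidate: candidate[1])
        candidates = extended[:list_size]  # prune the worst candidates
    return candidates


FROZEN = [True, True, True, False, True, False, False, False]  # assumed pattern
llrs = [2.0, 1.5, -0.5, 3.0, -2.5, -1.0, -2.0, -1.5]           # illustrative values
best_bits, best_metric = scl_decode(llrs, FROZEN, list_size=4)[0]
print(best_bits, best_metric)  # [0, 0, 0, 1, 0, 0, 0, 1] 0.5
```

CRC assisted selection could be added by filtering the returned candidates before the one with the lowest metric is chosen.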

The proposed polar decoder core

Referring now to FIG. 6, an example schematic diagram of the proposed polar decoder core 1600 is illustrated for the case of $C_{\max} = 5$, according to an example embodiment of the invention. The proposed polar decoder core 111 comprises data path 1601, 1602, 1603, memory 1604, 1605 and controller 1606 components. More specifically, an inner data path 1601, an outer data path 1602 and $C_{\max} - 2$ copies of the partial sum data path 1603 are employed. In addition, $C_{\max} - 1$ bit memory blocks 1605 and $C_{\max}$ LLR memory blocks 1604 are employed. In contrast to known processor architectures for implementing decoders, examples of the present invention may flexibly group all of the stages in the polar code graph into between 1 and $C_{\max}$ columns at run time, depending on the core block size N, where in some examples $C_{\max}$ may be selected at design time. By contrast, some prior art techniques always use a fixed number of columns, which does not vary with the core block size, while other prior art techniques can only group the leftmost stages into a column and require all other stages to remain individual.

In this way, examples of the present invention benefit from the advantage of using columns, namely a reduction in the number of steps required to complete the polarity decoding process. Examples of the present invention also retain the flexibility to support long core block sizes N without requiring columns of excessive width, and hence excessive hardware. Likewise, some examples of the invention retain the flexibility to support short core block sizes N while maintaining high utilisation of the inner data path hardware, and thus maintaining hardware efficiency.

More specifically, rather than processing one stage of the polar code graph at a time, the proposed architecture achieves a higher degree of parallelism by processing the several consecutive stages within each column at once. This parallel processing can be fully exploited in most of the f and g computations, giving greater hardware utilisation than linear and semi-parallel architectures. Furthermore, since several consecutive stages are processed at once, memory is required only at the interface between each pair of consecutive columns, rather than at the interface between each pair of consecutive stages. This significantly reduces the overall memory requirement of the proposed architecture relative to previous implementations, which is particularly impactful because memory is the largest contributor to hardware resource usage. Finally, a simple mechanism for propagating partial sum bits is proposed, which is also impactful, since partial sum propagation is the second largest contributor to hardware resource usage in previous implementations.

More specifically, under the control of the controller 1606, each of the inner data path 1601, the outer data path 1602 and the partial sum data paths 1603 may be directed to process one sub-row of one row of one column in each step of the polarity decoding process. Here, the inputs of the data path 1601, 1602 or 1603 are read from the LLR and/or bit memory blocks 1604 and 1605 that reside at the interface on the appropriate edge of the current column, depending on whether information is being propagated from left to right or from right to left in the polar code graph. Likewise, depending on the direction of the information flow, the outputs of the data path 1601, 1602 or 1603 are written to the LLR and/or bit memory blocks 1604 and 1605 that reside at the interface on the appropriate edge of the current column. In this way, bits or LLRs may be passed between the processing performed in adjacent columns by reading from and writing to the same memory block 1604 or 1605.

The LLRs and bits are arranged within these memory blocks 1604, 1605 in a manner that allows the data paths 1601, 1602, or 1603 to perform seamless read and write operations without requiring complex interconnection networks or complex control signals.

Architecture

The proposed polar decoder core 111 enables flexible decoding of one recovered core information block 114 at a time, wherein consecutive recovered core information blocks may have a core block size N that may vary from block to block.

More specifically, the core block size N may take the value of any power of 2 between 2 and $N_{\max}$, where $N_{\max}$ is a parameter fixed at design time. At the start 1801 of the polarity decoding process, the soft core encoded block $\tilde{x}$ 113 is loaded 1802 into the LLR input 1607 of the polarity decoder core 111 over a series of $N/\min(N, n_l)$ consecutive steps. The LLR input 1607 has a width that can accept $n_l$ LLRs in each step, where the parameter $n_l$ is fixed at design time. Here, each LLR may be represented using a 2's complement fixed-point number, whose bit width is fixed at design time. In the case where $N < n_l$, an equal number of zero-valued LLRs is inserted after each LLR in the soft core encoded block 113, in order to increase its length to $n_l$ before it is provided to the proposed polar decoder core 111. Throughout the polarity decoding process, a redundancy bit pattern and the corresponding redundancy bit values are provided to the corresponding inputs 1608 of the proposed polarity decoder core 111. Each of these inputs has a width that can accept $2^{s_o}$ pattern bits or redundancy bits in each step, which are provided to the proposed polar decoder core 111 on an on-demand basis, according to the needs of the polarity decoding process. In the case where $N < 2^{s_o}$, asserted frozen bit flags are appended to the end of the frozen bit pattern, in order to increase its length to $2^{s_o}$.
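The loading of the LLR input may be sketched as follows. The function name `llr_input_steps` is hypothetical, and the sketch assumes that both N and the input width (written `n_l` here) are powers of 2 and that, when N is at least `n_l`, the LLRs are presented in consecutive groups of `n_l`; the text does not state the order in which the groups are presented.

```python
def llr_input_steps(soft_core_block, n_l):
    """Split the soft core encoded block into the groups loaded in each step."""
    size = len(soft_core_block)
    if size < n_l:
        # Insert an equal number of zero-valued LLRs after each LLR,
        # so that a single step of width n_l is filled.
        zeros_per_llr = n_l // size - 1
        padded = []
        for llr in soft_core_block:
            padded.append(llr)
            padded.extend([0] * zeros_per_llr)
        return [padded]
    return [soft_core_block[i:i + n_l] for i in range(0, size, n_l)]


print(llr_input_steps([1, -2, 3, -4], 8))  # [[1, 0, -2, 0, 3, 0, -4, 0]]
print(len(llr_input_steps(list(range(16)), 4)))  # 4
```

In both cases the number of steps equals N/min(N, n_l): one step for N=4 with a width of 8, and four steps for N=16 with a width of 4.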

After the polarity decoding process is completed, a series of $N/\min(N, n_b)$ successive steps is used to output 1803 the recovered core information block 114 on the bit output 1609 of the proposed polar decoder core 111, which has a width of $n_b$ bits. In the case where $N < n_b$, zero-valued bits may be removed from the end of the output 1609 of the proposed polar decoder core 111.

When decoding a soft-core encoded block 113 of block size $N$, one such example of the proposed polar decoder core 111 operates on the basis of the graph representation 201, 202, 203 of the polar code generator matrix $\mathbf{F}^{\otimes n}$. Here, the $n = \log_2(N)$ stages 207 within the graphs 201, 202, 203 are grouped into $C$ columns 1701, 1702, where each column includes a certain number of successive stages 207. Each column 1701, 1702 may be referenced by its index $c \in [0, C-1]$, with the leftmost column 1701 having the index $c = 0$ and the rightmost column having the index $c = C-1$. The number of stages in each column 1701, 1702 may be expressed using a row vector $\mathbf{s} = [s_c]_{c=0}^{C-1}$, where $s_0$ is the number of stages in the leftmost column 1701 and $s_{C-1}$ is the number of stages in the rightmost column. Here, $\mathbf{s}$ needs to be chosen such that $\sum_{c=0}^{C-1} s_c = n$. This is illustrated in fig. 7, for the case where the graph representation of the generator matrix $\mathbf{F}^{\otimes 6}$ is grouped into $C = 4$ columns 1701, 1702 comprising $\mathbf{s} = [1; 2; 2; 1]$ stages 207. In the proposed polar decoder core 111, the leftmost column with index $c = 0$ is called the outer column 1701, and the other columns with indices $c \in [1, C-1]$ are referred to as the set of inner columns 1702.

The particular number of stages in each column 1701, 1702 is selected depending on the core block size $N$ and on the parameters $s_o$ and $s_i$, which are fixed at design time. Here, $s_o$ specifies the maximum number of stages that can be accommodated in the outer column 1701, and can take any value in the range from 0 to $n_{\max} = \log_2(N_{\max})$. Meanwhile, $s_i$ specifies the maximum number of stages that can be accommodated in each inner column 1702, and can take any value in the range from 1 to $n_{\max} - s_o$. If the number of stages in the graph $n = \log_2(N)$ satisfies $n \le s_o$, then the graphs 201, 202, 203 are decomposed into only $C = 1$ column, namely the outer column 1701, which will include $s_0 = n$ stages 207. Otherwise, the graphs 201, 202, 203 are decomposed into $C = \lceil (n - s_o)/s_i \rceil + 1$ columns, where the outer column 1701 includes $s_0 = s_o$ stages 207, the rightmost inner column 1702 includes $s_{C-1} = n - s_o - (C-2)s_i$ stages 207, and all other inner columns 1702 include $s_c = s_i$ stages 207. This is illustrated in fig. 7, where in the case of graphs 201, 202, 203 comprising $n = 6$ stages 207, $s_o = 1$ and $s_i = 2$ give $\mathbf{s} = [1; 2; 2; 1]$. Note that in an alternative arrangement, the $n - s_o$ rightmost stages 207 may be distributed among the $C - 1$ inner columns 1702 using any other combination that satisfies $s_c \le s_i$ for all $c \in [1, C-1]$, although this requires modification of the design described in this section.
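The grouping rule above can be sketched in a few lines of Python. This is a minimal illustration only; the function name `decompose_stages` is hypothetical, and the column count is derived from the stated stage allocation rather than quoted from the specification.

```python
def decompose_stages(n, s_o, s_i):
    """Return s = [s_0, ..., s_{C-1}], the number of stages in each column.

    n   : number of stages in the graph, n = log2(N)
    s_o : design-time maximum number of stages in the outer column
    s_i : design-time maximum number of stages in each inner column
    """
    if n <= s_o:
        return [n]                          # a single outer column holds all n stages
    # One outer column plus ceil((n - s_o) / s_i) inner columns (integer ceiling).
    C = (n - s_o + s_i - 1) // s_i + 1
    s = [s_o] + [s_i] * (C - 2)             # outer column, then the full inner columns
    s.append(n - s_o - (C - 2) * s_i)       # rightmost inner column takes the remainder
    return s
```

With the values used for fig. 7, `decompose_stages(6, 1, 2)` returns `[1, 2, 2, 1]`.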

Note that if the maximum number of stages in the graph $n_{\max} = \log_2(N_{\max})$ satisfies $n_{\max} = s_o$, the graphs 201, 202, 203 will always be decomposed into only $C_{\max} = 1$ column 1701, including a maximum of $s_{0,\max} = n_{\max}$ stages 207. Otherwise, the graphs 201, 202, 203 are decomposed into a maximum of $C_{\max} = \lceil (n_{\max} - s_o)/s_i \rceil + 1$ columns 1701, 1702, where the outer column 1701 includes a maximum of $s_{0,\max} = s_o$ stages 207, the rightmost inner column 1702 includes a maximum of $s_{C-1,\max} = n_{\max} - s_o - (C_{\max}-2)s_i$ stages 207, and all other inner columns 1702 include a maximum of $s_{c,\max} = s_i$ stages 207. The set of columns 1701, 1702 is associated with a vector of subcode radixes $\mathbf{r} = [r_c]_{c=0}^{C-1}$, where each subcode radix is given by:

$$r_c = 2^{\sum_{c'=0}^{c} s_{c'}}$$

Here, the subcode radix $r_c$ of a particular column 1701, 1702 quantifies the core block size $N$ that would result if the graphs 201, 202, 203 included only the stages 207 in that column and in the columns to its left. Note that the subcode radix $r_c$ of each successive column 1701, 1702 grows from left to right. The corresponding maximum subcode radix is given by:

$$r_{c,\max} = 2^{\sum_{c'=0}^{c} s_{c',\max}}$$

Each column 1701, 1702 includes a number of rows, which may be expressed using a vector $\mathbf{R} = [R_c]_{c=0}^{C-1}$, where the number of rows in a particular column is given by $R_c = N/r_c$.
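As a worked check of these definitions, the following Python sketch computes the subcode radixes and row counts for a given stage vector. The function names are hypothetical and chosen only for illustration.

```python
def subcode_radixes(s):
    """r_c = 2 ** (s_0 + ... + s_c) for each column c."""
    radixes, stages_so_far = [], 0
    for s_c in s:
        stages_so_far += s_c            # stages in this column and all columns to its left
        radixes.append(2 ** stages_so_far)
    return radixes


def rows_per_column(N, r):
    """R_c = N / r_c for each column c."""
    return [N // r_c for r_c in r]
```

For the fig. 7 example with $N = 64$ and stage vector [1; 2; 2; 1], this gives radixes [2; 8; 32; 64] and row counts [32; 8; 2; 1], so the rightmost column contains a single row.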

Here, each row 1703 includes a sub-graph comprising $s_c$ stages 207 and $r_c$ successive connections on its left and right edges, which are horizontally aligned. It can be observed in fig. 7 that the row definition given above results in no interconnection between any pair of rows 1703 within any particular column 1701, 1702. Each row 1703 of each column 1701, 1702 may be accessed one or more times by the polarity decoding process, for example to perform XOR operations or the f and g functions. More specifically, the processing associated with a particular row in a particular column may be performed on more than one temporally separated occasion during the polarity decoding process, where each temporally separated set of processing may be referred to as an "access" to the row. However, accesses to rows 1703 in columns 1701, 1702 toward the right of the graphs 201, 202, 203 are more computationally intensive than accesses toward the left, because the number of connections $r_c$ within each row 1703 of each column grows from left to right. Nevertheless, it can be observed in fig. 7 that the rows 1703 in the rightmost columns can be broken down into subrows 1704, with no connections between the subrows 1704. Thus, the computations associated with a particular access to a row 1703 at a particular time during the polarity decoding process may be spread over several successive steps, each performing the computations 1804, 1805, 1806 for a different subrow 1704 in the row 1703. In this way, the polarity decoding process is completed one step at a time, where each step may correspond to one or more hardware clock cycles, depending on whether and how pipelining is used. By using more subrows 1704 per row 1703 in the columns 1702 toward the right of the graphs 201, 202, 203, the number of computations performed in each step of the decoding process can be maintained at a relatively constant level, regardless of which column is being accessed.

Formally, the number of subrows that make up each row 1703 of each column 1701, 1702 may be expressed using a vector $\mathbf{S} = [S_c]_{c=0}^{C-1}$. Here, $S_c$ must be a power of 2 and cannot exceed $r_c/2^{s_c}$, in order to ensure that there are no connections between the subrows 1704. Note that this implies that the rows 1703 in the outer column 1701 cannot be further broken down into subrows 1704. Each subrow 1704 includes a sub-graph comprising $s_c$ stages 207 and $n_c = r_c/S_c$ connections on its left and right edges, which are horizontally aligned but vertically offset from each other by $r_c/n_c$ positions. Here, $n_c$ is referred to as the block size of the subrow 1704, and must be a power of 2 in the range $[2^{s_c}, r_c]$. In the proposed polar decoder core 111, the particular block size of the subrows 1704 in each inner column 1702 is selected to be $n_c = \min(r_c, n_i)$. Here, $n_i$ specifies the maximum inner subrow block size, which is a parameter fixed at design time and can take the value of any power of 2 in the range from $2^{s_i}$ to $N_{\max}$. In fig. 7, each row 1703 of each column 1701, 1702 is enclosed in a dashed box, and the first subrow 1704 in the first row 1703 of each column 1701, 1702 is highlighted in bold.

This is illustrated in fig. 7, where in the case of graphs 201, 202, 203 comprising $n = 6$ stages 207, $n_i = 8$ gives $\mathbf{S} = [1; 1; 4; 8]$.
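The subrow layout can be reproduced with a short Python sketch. The function name `subrow_layout` is hypothetical, and treating the outer column as a single subrow of block size $r_0$ is an assumption that follows from the statement that outer rows cannot be decomposed.

```python
def subrow_layout(r, n_i):
    """Return (n, S): the subrow block size and subrow count for each column.

    r   : list of subcode radixes, one per column (outer column first)
    n_i : design-time maximum inner subrow block size
    """
    # Outer column: one subrow per row (assumption: its block size is r_0).
    n = [r[0]] + [min(r_c, n_i) for r_c in r[1:]]   # inner columns: n_c = min(r_c, n_i)
    S = [r_c // n_c for r_c, n_c in zip(r, n)]       # S_c = r_c / n_c
    return n, S
```

For radixes [2; 8; 32; 64] and a maximum inner block size of 8, this returns block sizes [2; 8; 8; 8] and subrow counts [1; 1; 4; 8], matching the example above.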

Fig. 8 illustrates an example flowchart of the decoding process employed by the proposed polar decoder core, according to an example embodiment of the present invention, whereby each cycle around the main loop of the flowchart corresponds to one step of the decoding process. The flowchart starts at 1801 and, at 1802, the LLRs of the soft-core encoded block 113 are loaded into the proposed polar decoder core 111. At 1807, the current column index $c$ is initialized to $c = C-1$, the current row index $\mathbf{y}$ is initialized to a zero-valued vector of length $C$, and the current subrow index $s$ is initialized to 0.

At 1808, a determination is made as to whether $c > 0$, in order to identify whether the current column is an inner column. If so, the flowchart proceeds to 1809, where $v = \mathrm{mod}(y_{c-1}, r_c/r_{c-1})$ is used to identify the index of the current access to the current subrow in the current row of the current column. Subsequently, at 1805, the partial sum data paths 1 through $c$ are used to propagate the partial sum bits from column 0 to the current column. Subsequently, at 1806, the internal data path is used to process the current access to the current subrow in the current row of the current column. At 1813, the test $s = S_c - 1$ is used to determine whether the access with index $v$ has been made to all subrows in the current row. If not, then at 1812 the subrow index $s$ is incremented so that the next subrow will be accessed. The flowchart then returns to 1808 to continue processing the subrows in the current row of the current inner column.

Conversely, if it is determined at 1813 that the access with index $v$ has now been made to all of the subrows in the current row of the current inner column, the flowchart proceeds to 1814. Here, the test $v = r_c/r_{c-1} - 1$ is used to determine whether the last access has been made to all subrows in the current row of the current inner column. If not, the flowchart proceeds directly to 1818; if so, the flowchart first proceeds to 1816 and then to 1818. At 1816, the row index of the current column is incremented, so that the next row down will be accessed when the current inner column is accessed again later in the polarity decoding process. At 1818, the current column index $c$ is decremented, so that the column to the left, whether it is the outer column or another of the inner columns, will be accessed next. At 1821, the subrow index $s$ is reset to 0, so that the next access to a row in an inner column will begin from its top subrow. Thereafter, the flowchart returns to 1808.

If it is determined at 1808 that $c > 0$ is not satisfied, identifying that the current column is the outer column, the flowchart proceeds to 1804. Here, the outer data path is used to process the current row $y_0$ in the outer column. Thereafter, at 1810, the test $y_0 = R_0 - 1$ is used to determine whether the bottom row in the outer column has been accessed. If not, the flowchart proceeds to 1815, where the row index of the outer column is incremented, so that the next row down will be accessed when the outer column is accessed again later in the polarity decoding process. Next, the procedure of 1817, 1820 and 1819 is used to determine which inner column should be accessed next. At 1817, the column index $c$ is initialized to the index of the rightmost inner column, $C-1$. At 1819, $c$ is decremented until, at 1820, $\mathrm{mod}(y_0 2^{s_o}, r_{c-1}) = 0$. Thereafter, the flowchart proceeds to 1821, where the subrow index $s$ is reset to 0, and then returns to 1808.

Conversely, if it is determined at 1810 that the bottom row of the outer column has been accessed, then the recovered core information block 114 is output from the proposed polarity decoder core 111, and the process ends at 1811.
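The control flow of fig. 8, as described above, can be traced with the following Python sketch, which only counts steps and performs no LLR or bit processing. The function name `count_steps` is hypothetical. Two assumptions are made: the factor $2^{s_o}$ in test 1820 is taken as the outer column radix `r[0]`, and a guard stops the search at $c = 1$, which also covers the single-column case.

```python
def count_steps(r, S):
    """Walk the main loop of the flowchart and count the decoding steps.

    r[c] is the subcode radix and S[c] the number of subrows per row of column c.
    """
    C = len(r)
    N = r[-1]                      # the rightmost radix equals the core block size
    R0 = N // r[0]                 # number of rows in the outer column
    c, y, s, steps = C - 1, [0] * C, 0, 0              # 1807: initialization
    while True:
        steps += 1                 # one cycle around the main loop is one step
        if c > 0:                                      # 1808: inner column
            visits = r[c] // r[c - 1]
            v = y[c - 1] % visits                      # 1809: index of this access
            # 1805 and 1806 (partial sum and internal data paths) would run here.
            if s != S[c] - 1:                          # 1813: more subrows remain
                s += 1                                 # 1812
                continue
            if v == visits - 1:                        # 1814: last access to this row
                y[c] += 1                              # 1816
            c -= 1                                     # 1818
        else:                                          # outer column
            # 1804 (outer data path) would run here.
            if y[0] == R0 - 1:                         # 1810: bottom row reached
                return steps                           # 1811
            y[0] += 1                                  # 1815
            c = C - 1                                  # 1817
            while c > 1 and (y[0] * r[0]) % r[c - 1] != 0:   # 1820
                c -= 1                                 # 1819
        s = 0                                          # 1821
```

For the fig. 7 parameters, `count_steps([2, 8, 32, 64], [1, 1, 4, 8])` returns 112: 32 steps in the outer column, 32 in each of the first two inner columns, and 16 in the rightmost column.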

In some examples, the proposed polar decoder core 111 completes the decoding process in accordance with the data dependencies. As the decoding process proceeds, computations are performed for different rows 1703 in different columns 1701, 1702 according to a particular schedule, as shown in the flowchart of fig. 8. Each row 1703 in the outer column 1701 will be accessed once by the process, while each row 1703 in each particular inner column 1702 will be accessed $2^{s_c}$ times by the process, where $s_c$ is the number of stages in the column. The decoding process begins by passing the LLRs of the soft-core encoded block 113 to the single row 1703 in the rightmost column. The decoding process then uses the f function of (1) or (2) to perform calculations on these LLRs during the first access to this single row 1703 in the rightmost column. Whenever an access to a row 1703 in an inner column 1702 is completed, it will pass the resulting LLRs to one of the connected rows 1703 in the column to its left, with the particular row 1703 selected being the topmost one that has not yet been accessed. The decoding process will then use the f function of (1) or (2) to perform calculations on these LLRs during the first access to this row 1703 in the column to the left. Whenever an access 1804 to a row 1703 in the outer column 1701 is completed, it will contribute bits to the recovered core information block 114. Thereafter, the partial sum equations of (4) and (5) will be used to pass 1805 the partial sum bits from this row 1703 in the outer column 1701 to the leftmost inner column 1702 having a horizontally aligned row 1703 that has so far completed fewer than $2^{s_c}$ accesses. At the same time, the decoding process will perform an access to this row 1703, where the g function of (3) is used to combine these bits with the LLRs that were provided at the beginning of the first access to this row 1703.

Note that each access to a row 1703 in an inner column 1702 may be performed over multiple successive steps of the decoding process, with each step 1806 operating on a different one of the subrows 1704 in the row 1703. The subrows 1704 may be processed in any order, although the flowchart of fig. 8 illustrates the case of top-to-bottom processing. Here, the partial sum bits are propagated 1805 from the outer column 1701 to the subrows 1704 in the inner column 1702 in the same step in which they are used by the g function of (3), as discussed below. Note that this same method can be used for both SC and SCL decoding procedures. In the case of SCL decoding, each access to each subrow 1704 uses parallel processing to perform the computations associated with all $L$ candidate core information blocks in the list at the same time.

As shown in fig. 9, the total number of steps required to complete the decoding process can be obtained by combining the number of accesses made to each row 1703 in each column 1701, 1702 with the number of subrows 1704 in each column, as plotted in fig. 10, giving a total of $\frac{N}{2^{s_0}} + \sum_{c=1}^{C-1} \frac{2^{s_c} N}{n_c}$ steps.
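The step count can also be evaluated directly from the quantities defined earlier: each of the $R_0 = N/2^{s_0}$ outer rows is accessed once in one step, and each of the $R_c = N/r_c$ rows of inner column $c$ is accessed $2^{s_c}$ times with $S_c = r_c/n_c$ steps per access. The following Python sketch, with a hypothetical function name, combines these counts; it is a consistency check rather than a quotation of the specification's expression.

```python
def total_steps(N, s, n):
    """Number of decoding steps for core block size N.

    s[c] is the number of stages and n[c] the subrow block size of column c.
    Loading and output steps are not included.
    """
    outer = N // 2 ** s[0]                                  # one step per outer row
    inner = sum((2 ** s[c]) * N // n[c] for c in range(1, len(s)))
    return outer + inner
```

For $N = 64$, stage vector [1; 2; 2; 1] and block sizes [2; 8; 8; 8], this gives 32 + 32 + 32 + 16 = 112 steps.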

Fig. 10 illustrates an example plot of the number of steps required for the decoding process of the proposed polar decoder core, according to an example embodiment of the present invention. It plots the number of steps required for the decoding process of the proposed polar decoder core 111 as a function of the core block length $N$, the number of stages $s_o$ in the external data path 1602, the number of stages $s_i$ in the internal data path 1601 and the block size $n_i$ of the internal data path. For the case of $L = 8$ list decoding and for each plotted parameterization, "path" quantifies the number of fixed-point adders in the critical data path length, "outadd" quantifies the number of fixed-point adders that must be located in the external data path 1602 and "add" quantifies the number of fixed-point adders that must be located in the internal data path 1601. In addition, for the case where $N_{\max} = 1024$, "LLRmem" quantifies the required capacity of the LLR memory 1604 in LLRs, while "bitmem" quantifies the required capacity of the bit memory 1605 in bits, including the memory for the candidate core information blocks obtained by the external data path 1602.

Note that a further $N/\min(N, n_l)$ steps are needed to load 1802 the LLRs of the soft-core encoded block 113 into the proposed polar decoder core 111 before the decoding process can begin. Note that in an alternative example arrangement, the processing of the rightmost column 1702 may begin toward the end of the loading 1802 of the soft-core encoded block 113, allowing some concurrency to be achieved, subject to modification of the design shown. In the case of SC decoding, the recovered core information block 114 may be output 1803 from the proposed polar decoder core 111 $2^{s_o}$ bits at a time while the outer column 1701 in the graphs 201, 202, 203 is being processed, albeit at sporadic times that depend on when the decoding process accesses 1804 the outer column 1701. However, in the case of SCL decoding, the output 1803 of the recovered core information block 114 cannot start until all processing has been completed and the best of the $L$ candidate core information blocks has been selected. In this case, a further $N/\min(N, n_b)$ steps are required to output 1803 the recovered core information block 114. Each step may correspond to a single clock cycle in a hardware implementation, depending on whether and how pipelining is applied.

The number of steps used by three parameterizations of the proposed polar decoder core is plotted against the core block length $N$ in fig. 10. The legend of the figure also quantifies the computational and memory resources used by each parameterization, as will be described in detail in the sections below. As may be expected, the number of steps can be reduced by parameterizing the proposed polar decoder core with more stages $s_o$ in the external data path 1602, more stages $s_i$ in the internal data path 1601 and a larger internal data path block size $n_i$. While these faster parameterizations use more computing resources and have longer critical paths, they tend to use less memory, because they use fewer columns. Fig. 10 compares the proposed polar decoder core with the line decoder of [14] and the semi-parallel decoder of [15], which have been parameterized to use the multi-bit technique of [26] to recover $2^{s_o}$ core information bits at a time. As shown in fig. 10, the proposed polar decoder with $s_o = 2$ completes the decoding process using fewer steps than the benchmarks employing the same parameter value $s_o = 2$. In addition, it uses fewer computing resources and less than 25% of the amount of LLR memory. Furthermore, the proposed polar decoder core employs an elegant approach to partial sum propagation that imposes very little hardware overhead. Since LLR memory and partial sum propagation are the two largest contributors to hardware resource usage, it is expected that the hardware efficiency of the proposed polar decoder may be four to five times better than that of state-of-the-art polar decoders.

The proposed method can be considered to employ the conventional polar code graphs 201, 202, 203 as the basis for LLR propagation using the f and g functions of (1) - (3). However, a novel rearrangement of the polar code graphs 201, 202, 203 is used as the basis for bit propagation 1805 using the partial sum equations of (4) and (5).

This rearrangement is illustrated in fig. 11, for the case where the graph representation 201, 202, 203 of the generator matrix $\mathbf{F}^{\otimes 6}$ has been decomposed into $C = 4$ columns comprising $\mathbf{s} = [1; 2; 2; 1]$ stages 207. Here, it can be observed that the bottom $r_{c-1}$ XORs have been removed from each stage 207 of each row 1703 of the inner columns 1702, where $r_{c-1}$ is the subcode radix of the column to the left, as defined above. In their place, XORs 2101 have been introduced at the interface between each inner column 1702 and the column to its right. More specifically, each of the top $r_c - r_{c-1}$ bits passed from each row 1703 of each inner column 1702 to the column to its right is XORed 2101 with a particular one of the bottom $r_{c-1}$ bits passed from that row 1703. Here, the particular bit is identified such that both bits in each XOR pair have the same index modulo $r_{c-1}$, where each bit index ranges from 0 to $N-1$ before the modulo operation and from 0 to $r_{c-1}-1$ after the modulo operation.
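The pairing rule at the column interface can be sketched as follows. This Python fragment is one reading of the rule for a single row, with a hypothetical function name; it models only the interface XORs 2101 and not the stages inside the column.

```python
def interface_xor(row_bits, r_prev):
    """Apply the interface XORs to the bits leaving one row of an inner column.

    row_bits : the r_c bits of the row, top first
    r_prev   : subcode radix r_{c-1} of the column to the left
    """
    r_c = len(row_bits)
    bottom = row_bits[r_c - r_prev:]                  # bottom r_{c-1} bits pass unchanged
    # A row starts at a multiple of r_c, so local and global indices agree modulo r_prev.
    top = [bit ^ bottom[i % r_prev]                   # partner has the same index mod r_prev
           for i, bit in enumerate(row_bits[:r_c - r_prev])]
    return top + bottom
```

For a column with a single stage, where $r_c = 2 r_{c-1}$, this reduces to XORing the top half of the row with the bottom half, which is the usual partial sum relationship.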

As shown in fig. 6, the proposed polar decoder core 111 includes internal data path 1601, external data path 1602, partial sum data path 1603, LLR memory block 1604, bit memory block 1605, and controller 1606 components. More specifically, although the proposed polar decoder core 111 includes only a single instance of the external data path 1602 and of the internal data path 1601, it includes $C_{\max} - 2$ instances of the partial sum data path 1603, $C_{\max} - 1$ instances of the bit memory block 1605 and $C_{\max}$ instances of the LLR memory block 1604. Here, the external data path 1602 interfaces with the bit output of the polar decoder core 111, and can be considered to reside within the outer column 1701 with index $c = 0$. Meanwhile, the internal data path 1601 may be considered to reside within different inner columns 1702 with different indices $c \in [1, C-1]$ during different steps of the decoding process. In addition, the partial sum data path 1603 with index $c \in [1, C-2]$ may be considered to reside within the inner column 1702 of the polar code graphs 201, 202, 203 with the corresponding index $c$.

In addition, the inner column 1702 with index $c \in [1, C-2]$ may be considered to interface with the column to its left via the bit memory block 1605 and LLR memory block 1604 having index $c$, and with the column to its right via the bit memory block 1605 and LLR memory block 1604 having index $c+1$. Furthermore, the rightmost column 1702 with index $C-1$ may be considered to interface with the LLR input 1607 of the proposed polar decoder core 111 via the LLR memory block 1604 having index $C_{\max}$. As shown in fig. 6, the external data path 1602, the bit memory blocks 1605, and the partial sum data paths 1603 form a chain, which represents the $C$ columns 1701, 1702 in the polar code graphs 201, 202, 203. As the decoding process accesses different inner columns 1702 in the graphs 201, 202, 203, the internal data path 1601 may take its input from, and provide its output to, different points in the chain. In some example embodiments, fig. 6 also illustrates a mechanism for bypassing 1610 the bit memory blocks 1605 in the chain. This is the mechanism mentioned above that allows bits to be propagated 1805 from the external data path 1602, through successive partial sum data paths 1603, and into the internal data path 1601 within a single step of the decoding process, regardless of which inner column 1702 is being accessed. Note that in some examples, with SCL decoding, the data paths 1601, 1602, 1603 and the memories 1604, 1605 have sufficient resources to perform the computations for all $L$ candidate core information blocks in parallel.

The proposed polar decoder core 111 has significant differences from all previously proposed polar decoding methods. The programmable architectures of [10], [11] employ a serial approach that uses a schedule adhering to the data dependencies described above, performing the computations associated with a single f or g function in each step. In contrast, the proposed method performs all computations associated with a subrow 1704 in each step, resulting in higher parallelism, higher throughput, and lower latency. The unrolled decoders of [12], [13] achieve very high parallelism by employing different dedicated hardware for each f or g calculation in the polarity decoding process. However, each step of the polarity decoding process uses the hardware for only a single f or g calculation, resulting in high latency. Although this approach can achieve high throughput by overlapping many decoding processes at once, it has limited flexibility. In contrast, the proposed method is completely flexible, in that its computing hardware can be reused for each subrow 1704 in the polar code graphs 201, 202, 203, even if the subrows contain fewer stages 207 or have smaller block sizes than the hardware assumes. The line decoder of [14] achieves a high degree of parallel processing by simultaneously performing all f and g calculations associated with the rightmost stage 207 of a polar code graph 201, 202, 203 of a particular size. However, the data dependencies described above prevent this parallelism from being fully exploited when processing the other stages 207 in the graphs 201, 202, 203. Instead, successively smaller subsets of the hardware are reused to perform the processing for each successive stage 207 to the left, resulting in poor hardware efficiency and flexibility.

To address this, the semi-parallel decoders of [8], [15] - [24] improve hardware efficiency and flexibility by reducing the degree of parallel processing, which requires several processing steps to perform the computations of the rightmost stage 207, but they still cannot exploit all of the available parallelism at the leftmost stages 207. In contrast, each step of the proposed method achieves a high degree of parallelism by simultaneously performing computations that span not only up and down the length of each column, but also across the multiple stages 207 in each column 1701, 1702. More specifically, the proposed method uses a tree structure to perform the computations for each subrow 1704, which ensures that parallelism is typically exploited regardless of which column 1701, 1702 is being accessed and regardless of the graph size. This achieves a high degree of flexibility, high hardware efficiency, high throughput and low latency.

Although several polarity decoding methods employing the concept of columns 1701, 1702 have been previously proposed, none of them applies columns in the fully generalized manner of the proposed polar decoder core 111, in which any number of columns 1701, 1702 may be employed, each comprising a potentially different and arbitrary number of stages 207. The tree structures of [14], [25] - [29] operate on the basis of a single column 1701 that includes all of the stages 207 in the polar code graph 201, 202, 203, but this approach supports only a single core block length and may result in a large hardware resource requirement. In the methods of [30], [31], the polar code graph 201, 202, 203 is broken down into two columns 1701, 1702 that include an equal number of stages 207, but again this supports only a single core block length. By contrast, the methods of [32], [33] use an outer column 1701 that may include several stages 207, but all other stages are processed separately, using the semi-parallel approach described above. In contrast to these approaches, the proposed polar decoder core 111 can benefit from the generalized application of columns 1701, 1702 owing to its novel memory architectures. These are necessary because bits and LLRs are written in particular groupings while processing one column 1701, 1702, but are read in different groupings while processing an adjacent column 1701, 1702. The proposed memory architectures seamlessly enable read and write operations using these groupings, ensuring that the correct groups of bits and LLRs are delivered to the correct locations at the correct times. Furthermore, the proposed method facilitates a significant memory reduction, since the bits and LLRs are stored only at the boundaries between each pair of consecutive columns 1701, 1702 and not at the larger number of boundaries between each pair of consecutive stages 207.

These same novel memory architectures also serve as the basis for the partial sum propagation 1805 in the proposed polar decoder core 111, where the bypass mechanism 1610 is used to pass bits from the outer column 1701 to any of the inner columns 1702 in a single step of the decoding process. This is in contrast to the partial sum propagation methods that have been previously proposed. In [8], [15], and [30], partial sum update logic is used to accumulate different combinations of decoded bits, and a complex interconnection network is used to deliver them to the corresponding g function processing. This results in a significant hardware overhead and a long critical path, thereby limiting the achievable hardware efficiency, throughput and latency. By contrast, the feed-forward architectures of [19], [21], [28], [32], [34] use dedicated hardware to propagate the partial sum bits to each successive stage 207 of the polar code graph 201, 202, 203. However, the complexity of the feed-forward architecture grows rapidly for each successive stage 207, limiting the range of core block lengths that can be supported and limiting hardware efficiency. Alternatively, the methods of [17], [22], [27], [35] use a simplified polar encoder core 102 to compute the partial sums, although this does not benefit from reusing calculations performed as a natural part of the decoding process, as in the proposed method.

Data path

The proposed polar decoder core 111 uses dedicated hardware data paths 1601, 1602, 1603 to implement the f and g LLR functions of (2) and (3) and the partial sum functions of (4) and (5). While the latter may be implemented using a network of XOR gates 204, the f and g functions may be implemented using a network of fixed-point processing units 2201. In some examples, the internal data path 1601 may perform the computations 1806 associated with one access to one subrow 1704 in a row 1703 of one inner column 1702. Also, in some examples, the external data path 1602 may perform the computations 1804 associated with a row 1703 in the outer column 1701. Finally, in some examples of the partial sum chain described herein, each instance of the partial sum data path 1603 may be used to propagate 1805 the partial sums through one inner column 1702.

1) Processing unit and fixed point number representation:

The proposed processing unit 2201 of fig. 12 accepts two fixed-point input LLRs 2202 and 2203, a bit input 2204 and a mode input 2205. Depending on the binary value provided by the mode input 2205, the processing unit 2201 combines the other inputs to generate a fixed-point output LLR 2206 according to either (2) or (3), as depicted in fig. 4.
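The behaviour of such a unit can be modelled at the integer level as follows. This Python sketch is an assumption-laden illustration: equations (2) and (3) are not reproduced in this section, so the sketch assumes that (2) is the usual min-sum form of the f function and that (3) is the usual g function; the mode encoding (0 for f, 1 for g) and all names are hypothetical.

```python
def f(x_a, x_b):
    """Assumed min-sum f function: sign(x_a) * sign(x_b) * min(|x_a|, |x_b|)."""
    sign = -1 if (x_a < 0) != (x_b < 0) else 1
    return sign * min(abs(x_a), abs(x_b))


def g(x_a, x_b, u_a):
    """Assumed g function: (-1)**u_a * x_a + x_b."""
    return x_b - x_a if u_a else x_b + x_a


def processing_unit(x_a, x_b, u_a, mode):
    """Select between f and g; mode 0 selects f and mode 1 selects g (assumption)."""
    return g(x_a, x_b, u_a) if mode else f(x_a, x_b)
```

For example, with input LLRs 3 and -5 the f mode returns -3, while the g mode returns -2 when the bit input is 0 and -8 when it is 1.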

Some previous implementations of polar decoders [10], [13] use a 2's complement fixed-point number representation, in which each LLR $\tilde{x}$ is expressed as a vector of $W$ bits $[\tilde{x}_w]_{w=1}^{W}$, where $\tilde{x}_1$ is both the Most Significant Bit (MSB) and the sign bit, $\tilde{x}_W$ is the Least Significant Bit (LSB) and $\tilde{x} = -2^{W-1}\tilde{x}_1 + \sum_{w=2}^{W} 2^{W-w}\tilde{x}_w$. In this way, the g function of (3) can be implemented using a single adder. Here, subtraction may be accomplished, where required, by complementing all bits in the 2's complement fixed-point representation of the LLR being subtracted and then adding it to the other LLR, together with an additional "1" supplied through the carry-in of the full adder circuit. In the f function of (2), it is necessary to negate $\tilde{x}_a$ and $\tilde{x}_b$, when they are negative, in order to determine the absolute values $|\tilde{x}_a|$ and $|\tilde{x}_b|$, respectively.
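The subtraction technique described above, complementing the subtrahend and adding 1 through the carry-in, can be checked with a small Python model of a $W$-bit adder. The bit width of 6 and all function names are arbitrary choices for illustration.

```python
W = 6                       # illustrative bit width (assumption)
MASK = (1 << W) - 1


def to_bits(x):
    """Encode a signed integer as a W-bit 2's complement pattern."""
    return x & MASK


def from_bits(b):
    """Decode a W-bit 2's complement pattern into a signed integer."""
    return b - (1 << W) if b & (1 << (W - 1)) else b


def adder(a, b, carry_in=0):
    """W-bit adder with carry-in; the carry-out is discarded."""
    return (a + b + carry_in) & MASK


def subtract(a_bits, b_bits):
    """Compute a - b with a single adder: a + (bitwise complement of b) + 1."""
    return adder(a_bits, ~b_bits & MASK, carry_in=1)
```

For example, `from_bits(subtract(to_bits(5), to_bits(7)))` evaluates to -2, so a single adder suffices for both the addition and the subtraction cases of the g function.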

Fig. 13 illustrates examples of known techniques for 2's complement implementation of the "f" function of (2): (a) a naive implementation; (b) an implementation with reduced hardware; (c) an implementation with a reduced critical path.

In a naive implementation of the f function, each of these two negations can be implemented by complementing 2301 all bits in the 2's complement fixed-point representation of the LLR and adding 1 using an adder circuit 2302, to produce the absolute values as shown in fig. 13a. Thereafter, $\min(|\tilde{x}_a|, |\tilde{x}_b|)$ can be implemented according to the compare-and-select operation shown in fig. 13a, by using a third adder 2303 to subtract $|\tilde{x}_b|$ from $|\tilde{x}_a|$ and using the resulting sign bit to select 2304 either $|\tilde{x}_a|$ or $|\tilde{x}_b|$. Finally, a fourth adder 2305 is required to negate $\min(|\tilde{x}_a|, |\tilde{x}_b|)$, which may be needed depending on the signs of $\tilde{x}_a$ and $\tilde{x}_b$. In a more sophisticated 2's complement implementation, the functionality of the first three adders 2302, 2303 described above may be implemented using only a single adder 2306. This enables the f function to be implemented using two adders in series, with the second adder 2307 performing the negation if necessary, as shown in fig. 13b. To reduce the critical path length to only a single adder, an alternative implementation may use three adders 2306, 2308 in parallel to implement the f function, as shown in fig. 13c. Here, one adder 2306 is used to combine the functions of the first three adders 2302, 2303 described above and to determine which of $\tilde{x}_a$, $-\tilde{x}_a$, $\tilde{x}_b$ or $-\tilde{x}_b$ gives the result of the f function. Meanwhile, the other two adders 2308 calculate $-\tilde{x}_a$ and $-\tilde{x}_b$, in case these values are selected 2309 by the first adder 2306.

Some other previous implementations of polar decoders [15], [16], [26], [36] have used a sign-magnitude fixed-point number representation, in which each LLR $\tilde{x}$ is expressed as a vector of $W$ bits $[\tilde{x}_w]_{w=1}^{W}$, where $\tilde{x}_1$ is the sign bit, $\tilde{x}_2$ is the MSB, $\tilde{x}_W$ is the LSB and $\tilde{x} = (-1)^{\tilde{x}_1} \sum_{w=2}^{W} 2^{W-w}\tilde{x}_w$. Meanwhile, some previous implementations [29] use a 1's complement fixed-point number representation, where $\tilde{x} = (1 - 2^{W-1})\tilde{x}_1 + \sum_{w=2}^{W} 2^{W-w}\tilde{x}_w$. Although these methods allow the f function of (2) to be completed using a single adder, additional adders are also required to convert to and from the 2's complement fixed-point number representation in order to perform the g function of (3). Alternatively, these methods may be implemented using only a single adder to perform both the f and g functions, at the cost of sometimes introducing $\pm 1$ errors into the resulting LLRs, thereby degrading the error correction capability of the polar decoder [29].

In contrast to these previous implementations, the input LLRs, output LLRs and internal operations of the proposed processing unit 2201 of Fig. 12 use a fixed-point number representation in which a two's complement number is supplemented with an additional sign bit. More specifically, each input LLR $\tilde{x}$ 2202, 2203 is represented as a vector of W+1 bits, where $\tilde{x}_0$ is the additional sign bit, $\tilde{x}_1$ serves as both the MSB and the two's complement sign bit, $\tilde{x}_W$ is the LSB and $\tilde{x} = (-1)^{\tilde{x}_0}\left(-2^{W-1}\tilde{x}_1 + \sum_{w=2}^{W} 2^{W-w}\tilde{x}_w\right)$. Here, the sign of the LLR can be obtained as the XOR of $\tilde{x}_0$ and $\tilde{x}_1$. In other words, the additional sign bit indicates whether the value represented by the two's complement fixed-point number should be negated in order to recover the true LLR value. Note that in alternative arrangements, the additional sign bit may be placed elsewhere in the vector rather than at the front, for example by reordering the W+1 bits of the proposed fixed-point number representation and/or by using an LSB-first rather than MSB-first two's complement representation. This illustrates that, in some cases, the index w in the label of each bit may relate to its significance or function rather than to its position. Note that some of the previous efforts cited above temporarily use a binary flag to indicate that an accompanying two's complement fixed-point number needs to be negated. However, these flags are not passed between processing units or into memory. In particular, none of the processing units of previous efforts has the input circuitry required to accept inputs 2202, 2203 in the proposed fixed-point number representation.

The proposed processing unit 2201 employs only a single adder 2207, which can be shared to perform both the g function of (3) and the f function of (2), as characterized by the schematic and truth table of Fig. 12. In some cases, the single adder of a particular processing unit may perform the g function in some clock cycles and the f function in others. Alternatively, in some cases the single adder may be used only to perform the f function, and in other cases only to perform the g function. The two inputs 2208 of the adder each have W bits, which are derived from the two's complement parts of the two input LLRs, and the output 2209 comprises W+1 bits to avoid overflow. For example, W+1 = 7-bit fixed-point representations of the two input LLRs each comprise a two's complement number of W = 6 bits and an additional sign bit. The W = 6 bits of each two's complement number may be provided to the single adder. This produces a two's complement output comprising W+1 = 7 bits, which avoids overflow when both inputs have large magnitudes. When this two's complement output is combined with the additional sign bit, the resulting fixed-point number representation comprises W+2 = 8 bits. Depending on the values of the bit input to the g function and of the additional sign bits of the two input LLRs, the two's complement part of the g function's output LLR is obtained by using adder 2207 either to add the two's complement part of one input LLR to that of the other or to subtract it from that of the other. As is conventional, a control signal may be used to control whether the two-input adder computes the sum or the difference of its two's complement inputs. More specifically, the control signal may be XORed with each bit of one of the two's complement inputs before that input is provided to the single adder, so that all bits of the input are inverted when the control signal is asserted. In addition, the control signal may be provided to the carry-in input of the adder. It is contemplated that all references hereafter to a single two-input adder include all such variants. The minimum term of the f function can also be implemented by performing this addition or subtraction using adder 2207, depending on the values of the two's complement sign bits of the two input LLRs, which enables a high degree of hardware reuse. The MSB of the resulting two's complement number can then be used to select 2210 the two's complement part of one or other input LLR to provide the two's complement part of the f function's output LLR. The additional sign bits of the output LLRs of both the f and g functions may be obtained using simple combinational logic, as characterized by the truth table of Fig. 12. Owing to the additional bit introduced by adder 2207, the output 2206 of the proposed processing unit 2201 comprises W+2 bits, in which the output LLR is represented in the same manner as the inputs but with a two's complement part of W+1 bits. Note that the proposed approach does not introduce any ±1 errors into the resulting LLRs, and so preserves the same error correction capability as the two's complement fixed-point number representation, while each processing unit 2201 uses only a single adder 2207.
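The single-adder principle can be illustrated with a short behavioral model. The following Python sketch is not a transcription of the truth table of Fig. 12. It assumes that (2) and (3) take the usual min-sum forms f(a, b) = sign(a)·sign(b)·min(|a|, |b|) and g(a, b, u) = (-1)^u·a + b, models each LLR as a hypothetical pair (additional sign bit, two's complement part), and uses unbounded Python integers in place of W-bit vectors. All function names are illustrative.

```python
def value(llr):
    """Recover the true LLR from (additional_sign_bit, twos_complement_part)."""
    s, t = llr
    return -t if s else t


def adder(x, y, subtract):
    """The single two-input adder: x + y, or x - y when `subtract` is set."""
    return x - y if subtract else x + y


def g_unit(a, b, u):
    """g function: one add or subtract; the output keeps b's additional sign bit."""
    (sa, ta), (sb, tb) = a, b
    return (sb, adder(tb, ta, subtract=bool(sa ^ sb ^ u)))


def f_unit(a, b):
    """f function: one add or subtract compares magnitudes, then a select."""
    (sa, ta), (sb, tb) = a, b
    na, nb = int(ta < 0), int(tb < 0)          # two's complement sign bits
    d = adder(tb, ta, subtract=(na == nb))     # sign of d orders |ta| and |tb|
    if (d < 0) == (tb < 0):                    # ta has the smaller magnitude
        return (sa ^ sb ^ nb, ta)
    return (sa ^ sb ^ na, tb)                  # tb has the smaller magnitude
```

In this model neither function ever negates a two's complement number. The negation is deferred by toggling the additional sign bit, which is why one adder suffices and no ±1 error arises.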

Note that in the external data path 1602 of Section II-B3, some of the processing units 2201 are only required to perform one or other of the f and g functions. In these cases, the mode input 2205 and all circuitry specific to the unused mode may be removed. Note that the two's complement fixed-point numbers provided to the LLR input 1607 of the proposed polar decoder core 111 can be converted to the proposed fixed-point number representation by supplementing each of them with a zero-valued additional sign bit. It is also contemplated that the zero-valued additional sign bit may be combined with the bits of the two's complement fixed-point number in any order, since other contemplated examples cover reorderings of the proposed fixed-point number representation. Thereafter, the proposed fixed-point representation may be used throughout the proposed polar decoder core 111, without any need to convert to two's complement or any other fixed-point representation. For example, LLR memory 5 in the example of Fig. 6 may store LLRs using a two's complement representation and may include an optional conversion circuit 1621 on its output port to supply the zero-valued additional sign bits. Alternatively, the amount of LLR memory 1604 required to store each LLR may be reduced by one bit by using an adder to convert the LLR to a two's complement fixed-point number before it is written. More specifically, if the additional sign bit is set, the two's complement number may be negated by inverting all of its bits and then incrementing the result using an adder. For example, in the example of Fig. 6, LLR memories 1 to 4 may store LLRs using a two's complement representation and may include an optional conversion circuit 1620 on their input ports, which negates the two's complement part of the proposed fixed-point representation depending on the value of the corresponding additional sign bit. These conversion circuits 1620, 1621 are optional, depending on how the LLRs are stored in memory. To convert back to the proposed fixed-point representation when an LLR is read from the LLR memory block 1604, the two's complement fixed-point number may be supplemented with a zero-valued additional sign bit, using an optional conversion circuit 1621 on the output ports of LLR memories 1 to 4 in the example of Fig. 6.
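The two conversions can be sketched in Python as follows. This is a behavioral illustration only; the function names and the pair layout (additional sign bit, two's complement part) are assumptions for the example, and the most negative W-bit value is not handled specially, so it wraps on negation.

```python
def from_twos_complement(t):
    """Output-port conversion: supply a zero-valued additional sign bit."""
    return (0, t)


def to_twos_complement(llr, width):
    """Input-port conversion: fold the additional sign bit into the number.

    If the additional sign bit is set, the width-bit two's complement part
    is negated by inverting all of its bits and then incrementing.
    """
    s, t = llr
    if not s:
        return t
    mask = (1 << width) - 1
    bits = ((~t & mask) + 1) & mask            # invert, then increment
    return bits - (1 << width) if bits >> (width - 1) else bits
```

Storing the converted value saves one bit per LLR at the cost of an adder on the write path, which is the trade-off described above.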

2) Internal data path:

The internal data path 1601 is used to perform all LLR and bit calculations for each access 1806 to each sub-row 1704 in the inner columns 1702 of the polar code graph 201, 202, 203. In some examples, the internal data path 1601 may be parameterized by s_i and n_i, referred to herein as the number of internal data path stages and the internal data path block size, respectively. Note that using a larger value for n_i is similar to processing several sub-rows having a smaller n_i at the same time. In this example, the values of these parameters are fixed at design time, where the number of internal data path stages s_i may adopt any value in the range 1 to n_max - s_o, while the internal data path block size n_i may adopt the value of any power of two within a range that extends up to N_max.

Fig. 14 illustrates an example schematic of the internal data path in the proposed polar decoder core, for the example of s_i = 2 and n_i = 8, according to an example embodiment of the present invention. This example schematic of the internal data path 1601 may be suitable for SC decoding. In the case of SCL decoding, L parallel copies of the schematic may be used, where L is the list size. The internal data path 1601 has an input v that identifies which access to the current sub-row 1704 is in progress, where the access index lies in the range 0 to $2^{s_c}-1$. Note that this input is not shown in Fig. 14 for simplicity. In the case of SC decoding, the internal data path 1601 takes inputs from n_i bits on its left edge 2401. In this example, these input bits originate from previous accesses of the external data path 1602 and the internal data path 1601 to the columns to the left, via successive hops through the partial sum data path 1603 and the bit memory blocks 1605. The vector of bit inputs can be decomposed into $2^{s_c}$ equal-length sub-vectors, each corresponding to one connected row 1703 in the column to the left. However, during a particular access v to the current sub-row 1704, only the first v sub-vectors contain valid bits, since processing will have been completed for only the first v connected rows 1703 in the column to the left. Note that the last sub-vector of input bits never provides valid bits, since the lowest connected row 1703 in the column to the left is not accessed until after the last access to the current row 1703 in the current column. Thus, in an alternative arrangement, the inputs belonging to this last sub-vector and all circuitry connected to them may be removed. Further, the internal data path 1601 takes inputs from n_i LLRs on its right edge 2402, which originate from previous accesses by the internal data path 1601 to the column 1702 immediately to the right, via the corresponding LLR memory block 1604. Here, the proposed fixed-point number representation described above may be used for each LLR. The internal data path 1601 provides outputs for n_i bits on its left edge 2403, which are provided to the partial sum data path 1603 of Section II-B4 via the corresponding bit memory block 1605. Further, in some examples, the internal data path 1601 provides outputs for n_i fixed-point LLRs on its left edge 2404, which are provided to the column 1701, 1702 immediately to the left via the corresponding LLR memory block 1604. However, only a subset of these outputs carry valid LLRs, as identified by n_i write enable signals output on the left edge of the internal data path 1601. Note that these write enable signals are not shown in Fig. 14 for simplicity.

As shown in Fig. 14, the internal data path 1601 includes a graph 2405 of XORs 204. Here, each input on the left edge of the XOR graph 2405 is taken from the corresponding bit input 2401 on the left edge of the internal data path 1601, while each output from the right edge of the XOR graph 2405 is provided to the corresponding bit output 2403, which is also on the left edge of the data path. Note that the XOR graph 2405 resembles the rightmost s_i stages 207 in the graph representation of the generator matrix. However, the lowest XORs 204 in each stage are omitted from the XOR graph 2405 of the internal data path 1601, because they would be connected to the lowest input bits, which never carry valid bits, as described above. These correspond to the XORs 204 that are omitted from the rearranged graph of Fig. 11. Note that when the number of stages s_c in the current column is lower than s_i, the number of stages in the XOR graph 2405 is reduced to match s_c by disabling the XORs 204 in the leftmost stages of the graph 2405. This may be achieved by using AND gates 2406 to mask the corresponding vertical connections in the data path, as shown in Fig. 14.

Further, in some examples, the internal data path 1601 may include a network 2407 of processing units 2201, each of which may be configured at run time to perform either the f function of (2) or the g function of (3). Each input on the right edge of the processing unit network 2407 is taken from the corresponding LLR input 2402 on the right edge of the internal data path 1601, while each output from the left edge of the network is provided to an LLR output 2404 on the left edge of the data path. The network 2407 comprises s_i stages, where the rightmost stage comprises n_i/2 processing units 2201 and each successive stage to the left comprises half as many processing units 2201 as the stage to its right.

In some examples, the processing units may be configured to operate on the basis of the fixed-point number representation described herein, where incrementally wider bit widths are used in each successive stage from right to left. However, a clipping circuit 2411 may be used to reduce the bit width of the soft bits or LLRs output on the left edge of the processing unit network, so that it matches the bit width of the soft bits or LLRs input on the right edge. In alternative arrangements, clipping may additionally be performed between particular stages of the processing unit network, which reduces the hardware resource requirements of the internal data path at the cost of degrading the error correction capability of the polar decoder. The critical path through the processing unit network comprises s_i processing units 2201 in series, and the total number of processing units 2201 is given by $n_i(1-2^{-s_i})$, as quantified in Fig. 10 for the case of L = 8 list decoding, which implies that L = 8 copies of the internal data path 1601 are required. The processing units 2201 in the network 2407 are connected together to form a binary tree. These connections are arranged in accordance with the topmost XORs 204 in the rightmost s_i stages 207 of the graph representation 201, 202, 203 of the generator matrix. Note that this tree structure is similar to those of [26], [30] and [32], although those previous implementations do not flexibly support different core block lengths N at run time. Note that when the number of stages s_c in the current column is lower than s_i, the number of stages in the processing unit network 2407 is reduced to match s_c by using multiplexers 2408 to bypass the processing units 2201 in the leftmost stages of the network 2407, as shown in Fig. 14.

The processing units 2201 perform either the f function of (2) or the g function of (3), depending on which access v to the current sub-row 1704 is in progress. More specifically, the access index v is converted to a binary number having s_c digits, which is applied in reverse order, so that its LSB maps to the leftmost stage of processing units in the internal data path and its Most Significant Bit (MSB) maps to the rightmost stage. If the bit in a particular position of this reversed binary representation of the access index has the value "0", the processing units 2201 in the corresponding stage of the network perform the f function of (2). Conversely, if the corresponding bit is "1", these processing units 2201 perform the g function of (3). Here, multiplexers 2409 are used to deliver the correct bits from the XOR graph 2405 to each processing unit 2201 that calculates the g function.
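The selection rule described above can be sketched as follows. This Python fragment is illustrative only; the function name and the list layout, in which element 0 corresponds to the leftmost stage, are assumptions for the example.

```python
def stage_functions(v, s_c):
    """Return the function performed by each stage during access v.

    Element 0 is the leftmost stage and is driven by the LSB of v;
    element s_c - 1 is the rightmost stage and is driven by the MSB.
    A '0' bit selects the f function and a '1' bit selects the g function.
    """
    if not 0 <= v < (1 << s_c):
        raise ValueError("access index out of range for s_c stages")
    return ["g" if (v >> k) & 1 else "f" for k in range(s_c)]
```

With s_c = 2, the four accesses therefore use (f, f), (g, f), (f, g) and (g, g) when listed from the leftmost stage to the rightmost stage.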

As shown in Fig. 14, an arrangement of multiplexers 2408 positions the LLRs generated by the processing unit network among the n_i LLR outputs on the left edge 2404 of the internal data path 1601. Circuitry is also provided to assert the write enable outputs at the positions corresponding to these LLRs. More specifically, the arrangement of multiplexers 2408 maps the LLR having each index m provided by the processing unit network 2407 to a different one of the n_i outputs on the left edge 2404 of the internal data path 1601, having an index n(m) ∈ [0, n_i - 1] that depends on the first index j_c.

Here, j_c ∈ [0, N-1] is referred to as the first index, which denotes the vertical index of the topmost connection of the polar code graph 201, 202, 203 that belongs to the current sub-row 1704 in the current column c, where j_c = 0 for the topmost sub-row in the topmost row. The first index may be obtained according to:

$j_c = y_c r_c + s$

where y_c ∈ [0, N/r_c - 1] is the index of the row 1703 currently being accessed in column c, and s ∈ [0, max(r_c/n_i, 1) - 1] is the index of the sub-row 1704 being accessed within this row 1703. Among the vector of n_i write enable signals output on the left edge of the internal data path 1601, the subset of signals having the indices n(m) is asserted. In some examples, this operation of the multiplexers 2408 and the write enable signals allows the LLRs output by the internal data path 1601 to be written directly to the corresponding LLR memory block 1604. In some examples, the controller 1606 may be configured to insert pipeline registers between some or all of the stages in the XOR graph 2405 and the processing unit network 2407.
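As a worked illustration of the first index, the following Python sketch evaluates j_c = y_c r_c + s together with the stated ranges of y_c and s. The function name and the argument order are assumptions made for the example.

```python
def first_index(y_c, s, r_c, n_i, N):
    """Evaluate j_c = y_c * r_c + s after checking the ranges given in the text."""
    rows = N // r_c                  # y_c lies in [0, N / r_c - 1]
    sub_rows = max(r_c // n_i, 1)    # s lies in [0, max(r_c / n_i, 1) - 1]
    if not 0 <= y_c < rows:
        raise ValueError("row index y_c out of range")
    if not 0 <= s < sub_rows:
        raise ValueError("sub-row index s out of range")
    return y_c * r_c + s
```

For instance, with N = 16, r_c = 8 and n_i = 4, each row holds two sub-rows, and the second sub-row of the second row has first index 9.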

3) External data path:

In the case of SC decoding, the external data path 1602 of Fig. 15 may be used to perform all LLR and bit calculations 1804 for each row 1703 in the outer column 1701 of the polar code graph 201, 202, 203. The external data path 1602 is parameterized by s_o, referred to as the number of external data path stages. In some examples, the value of this parameter is fixed at design time, and may adopt any value in the range 0 to n_max = log_2(N_max). Here, it is assumed that the external data path is no wider than the n_i LLRs provided by the corresponding LLR memory block 1604 of Fig. 6. Otherwise, the interface with that LLR memory block 1604 would require a greater width, together with modifications to the controller 1606.

The external data path 1602 takes inputs on its left edge from redundant bits 2501 and redundant bit flags 2502, which originate from the corresponding inputs 1608 of the proposed polar decoder core 111. The external data path 1602 also takes inputs from n_i LLRs on its right edge 2503, which originate from the internal data path 1601 via the corresponding LLR memory block 1604. Furthermore, the external data path 1602 provides outputs for n_i bits on its right edge 2504, which are provided to the internal data path 1601 and the partial sum data path 1603 via the corresponding bit memory block 1605. In addition, the external data path 1602 provides bit outputs on its left edge 2505 for the recovered core information block 114. In the case of SC decoding, these bits can be written directly to the bit output 1609 of the proposed polar decoder core 111, which therefore adopts the same width.

The external data path 1602 operates on the graph representation 201, 202, 203 of the generator matrix, performing all XOR, f and g operations in accordance with the data dependencies described previously. Thus, the external data path 1602 includes an XOR graph, which comprises s_o stages of XORs 204. In addition, the external data path 1602 includes an f/g graph, which also comprises s_o stages, each made up of processing units 2201 as described herein.

The processing units 2201 operate on the basis of the fixed-point number representation, with incrementally wider bit widths being used in each successive processing unit 2201 along the critical path shown in Fig. 15.

The inputs on the right edge 2503 of the f/g graph comprise fixed-point LLRs, as shown in Fig. 15. An arrangement of multiplexers 2506 selects these LLRs from among the n_i LLRs provided by the inputs on the right edge 2503 of the external data path 1602. More specifically, according to n(m) = m n_i / r_1, the arrangement of multiplexers 2506 selects the LLR having each index m from the input having index n(m) ∈ [0, n_i - 1] among the n_i inputs on the right edge 2503 of the external data path 1602.

Note that in some cases, the LLRs at certain indices on the input of the f/g graph are set to the greatest positive value supported by the fixed-point number representation. These additional LLRs have no effect on the decoding process, since they correspond to asserted frozen bit flags that are appended to the frozen bit vector.

The external data path 1602 also includes a circuit 2507 for selecting the value of a bit output on the left edge 2505 of the external data path. More specifically, if the corresponding redundancy bit flag is set 2502, the value of the corresponding redundancy bit 2501 is adopted. If not, a value is selected for the bit using the sign of the corresponding LLR, where a positive LLR gives a bit value of 0 and a negative LLR gives a bit value of 1. These decisions inform the XOR and g operations performed within the graph and also drive the bit outputs on the left edge 2505 of the external data path 1602.
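The decision rule of circuit 2507 can be summarized in a few lines of Python. This is a behavioral sketch with a hypothetical function name. The text does not state how an LLR of exactly zero is resolved, so mapping it to bit value 0 is an assumption made here.

```python
def decide_bit(redundant_bit_flag, redundant_bit, llr):
    """Select the value of one output bit on the left edge of the data path."""
    if redundant_bit_flag:
        return redundant_bit          # flagged positions adopt the redundant bit
    # Positive LLR gives 0 and negative LLR gives 1.
    # Assumption: an LLR of exactly zero is treated as non-negative.
    return 1 if llr < 0 else 0
```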

After all XOR operations 204 within the external data path 1602 have been completed, a vector of bits is produced on the right edge of the XOR graph, as shown in Fig. 15. An arrangement of multiplexers 2508 positions these bits among the n_i outputs on the right edge 2504 of the external data path 1602. More specifically, according to n(m) = m n_i / r_1, the arrangement of multiplexers 2508 maps the bit having each index m on the output of the XOR graph to a different one of the n_i outputs on the right edge 2504 of the external data path 1602, having index n(m) ∈ [0, n_i - 1], while zero-valued bits are provided to all other outputs on the right edge 2504 of the external data path 1602. In some examples, the controller 1606 may be configured to insert pipeline registers between some or all of the stages in the XOR graph and the f/g graph.
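The output mapping n(m) = m n_i / r_1 can be illustrated with the following Python sketch. The function name is hypothetical, and it is assumed for the example that the XOR graph produces r_1 bits and that r_1 divides n_i.

```python
def scatter_bits(xor_graph_bits, n_i):
    """Place bit m at output n(m) = m * n_i / r_1 and zero all other outputs.

    Assumption: r_1 equals the number of bits produced by the XOR graph.
    """
    r_1 = len(xor_graph_bits)
    if r_1 == 0 or n_i % r_1:
        raise ValueError("r_1 must be non-zero and divide n_i")
    outputs = [0] * n_i
    for m, bit in enumerate(xor_graph_bits):
        outputs[m * n_i // r_1] = bit
    return outputs
```

For example, two bits scattered across n_i = 8 outputs occupy positions 0 and 4. The input-side multiplexers 2506 apply the same index mapping in the opposite direction.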

In the case of SCL decoding, the external data path 1602 must additionally be able to perform all partial sum, f and g calculations for all candidates in the list. In addition, the external data path 1602 must calculate the metrics of (7), which are accumulated over successive core information bits. Here, registers may be used to pass the metrics between successive accesses to successive rows 1703 in the outer column 1701. Additionally, in some examples, the external data path 1602 requires sorting circuitry in order to identify and select the L candidates having the lowest metrics. Finally, a bit memory block having a capacity of L·N_max bits is required to store the L candidate core information blocks. In this case, an additional pointer memory [18] may be used to assist in addressing the bit memory block. Fig. 10 quantifies the total number of adders required to implement the f, g, metric and sorting calculations for the case of L = 8 SCL decoding.

4) Partial sum data path:

The partial sum data path 1603 is used to perform the XOR operations 2101 that are omitted from the XOR graph in the internal data path 1601 for each sub-row, and to propagate 1805 bits from left to right in the polar code graph 201, 202, 203. The partial sum data path 1603 is parameterized by s_i and n_i, referred to as the number of internal data path stages and the internal data path block size, respectively. Note that using a larger value for n_i is similar to processing several sub-rows having a smaller n_i at the same time. As described above, in some examples the values of these parameters are fixed at design time, where the number of internal data path stages s_i may adopt any value in the range 1 to n_max - s_o, while the internal data path block size n_i may adopt the value of any power of two within a range that extends up to N_max.

In this example, the schematic of the partial sum data path 1603 shown in Fig. 16 is for SC decoding. In the case of SCL decoding, L parallel copies of the schematic may be used, where L is the list size. In the case of SC decoding, the partial sum data path 1603 takes inputs from n_i bits on its left edge 2601, which originate from the right edge 2504 of the external data path 1602 and the left edge 2403 of the internal data path 1601, via successive hops through other copies of the partial sum data path 1603 and via the bit memory blocks 1605. The partial sum data path 1603 provides outputs for n_i bits on its right edge 2602, which are provided to the left edge 2401 of the internal data path 1601, via successive hops through other copies of the partial sum data path 1603 and via the bit memory blocks 1605.

As shown in Fig. 16, the bottommost output bits are set equal to the corresponding input bits. However, each of the topmost output bits is obtained as the XOR 204 of the corresponding input bit with one of the bottommost input bits. Here, the particular bit is identified such that the two bits in each XOR pair have the same index modulo the number of bottommost bits, where each bit index lies in the range "0" to n_i - 1 before the modulo operation. Since the partial sum data path 1603 is invoked at the interface between each successive pair of inner columns 1702, the XORs 204 of the partial sum data path 1603 correspond to the additional XORs 2101 introduced in the rearranged graph of Fig. 11.
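The pairing rule can be made concrete with a small Python sketch. Because the number of bottommost bits is not legible in this text, it is passed in as a hypothetical parameter k. It is also assumed that index 0 is the topmost bit, so that the bottommost bits carry the highest indices, and that k divides n_i.

```python
def partial_sum_stage(bits, k):
    """Single-stage XOR of the partial sum data path, under the stated assumptions.

    The bottommost k outputs copy their inputs. Every other output is the
    XOR of its own input with the bottommost input whose index is congruent
    to it modulo k.
    """
    n_i = len(bits)
    if k <= 0 or n_i % k:
        raise ValueError("k must be positive and divide n_i")
    base = n_i - k                   # index of the first bottommost bit
    outputs = list(bits)
    for i in range(base):
        outputs[i] = bits[i] ^ bits[base + i % k]
    return outputs
```

Each output depends on at most two inputs, so the critical path of this sketch is a single XOR.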

Note that in an alternative arrangement, the results of the XORs 204 performed by the internal data path 1601 may be discarded after they have been used as inputs to the g functions, rather than being output on the left edge 2403 of the internal data path 1601 and stored in the bit memory 1605. In this case, the partial sum data path 1603 must be relied upon to perform all XOR operations 204 for the corresponding sub-row during the propagation 1805 of the partial sums. This can be achieved by replacing the XOR graph of Fig. 16 with a complete XOR graph, which resembles the rightmost s_i stages 207 in the graph representation 201, 202, 203 of the generator matrix. However, this approach would require s_i n_i / 2 XORs 204, which is generally more than the number of XORs 204 employed by the proposed approach. Furthermore, the critical path would comprise s_i XORs 204 in series, compared with the single XOR 204 of the proposed approach.

Memory device

The proposed polar decoder core 111 employs two types of memory, an LLR memory block 1604 and a bit memory block 1605.

1) LLR memory:

As shown in Fig. 17, the proposed polar decoder core 111 employs C_max two-dimensional blocks of LLR memory 1604, namely LLR memory 1 to LLR memory C_max. Conceptually, each LLR memory c ∈ [1, C_max - 1] can be considered to be located at the interface on the left edge of the inner column 1702 having index c ∈ [1, C_max - 1], while LLR memory C_max can be considered to reside at the interface between the rightmost column 1702 and the LLR input 1607 of the proposed polar decoder core 111. The memory block having index c comprises a single Random Access Memory (RAM), having a width of n_i fixed-point LLRs and a depth of max(r_{c-1,max}/n_i, 1) addresses, where the width and depth represent the two dimensions of the memory block. Multiplying width by depth and summing over all blocks, the total LLR memory requirement of the proposed polar decoder core 111 is given by $\sum_{c=1}^{C_{max}} \max(r_{c-1,max}, n_i)$ LLRs. Note that alternative arrangements may accommodate the C_max memory blocks within a single RAM, by extending its depth to accommodate all memory blocks in the depth dimension, rather than using C_max different RAMs to accommodate the C_max memory blocks in a third, RAM dimension. However, such alternative arrangements would imply a different data path interface and controller 1606 design from those described below and elsewhere in this description. In some examples, it is assumed that the external data path is no wider than n_i. Otherwise, LLR memory 1 and LLR memory C_max would require a greater width, together with modifications to the controller 1606, in order to support the interface with the external data path 1602.
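The block dimensions stated above lend themselves to a short sizing calculation. In the following Python sketch, the function names are hypothetical and the list of r_{c-1,max} values is an illustrative input rather than a parameterization taken from this description.

```python
def llr_block_depth(r_prev_max, n_i):
    """Depth of one LLR memory block in addresses: max(r_{c-1,max} / n_i, 1)."""
    return max(r_prev_max // n_i, 1)


def total_llr_capacity(r_prev_max_per_block, n_i):
    """Total capacity in LLRs: width n_i times depth, summed over all blocks."""
    return sum(n_i * llr_block_depth(r, n_i) for r in r_prev_max_per_block)
```

For example, with n_i = 4 and illustrative values of 2, 8 and 32 for r_{c-1,max}, the three blocks have depths of 1, 2 and 8 addresses, for a total of 44 LLRs.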

Note that in the case of SCL decoding, the LLR memory blocks 1604 having indices "1" to C_max - 1 need to be replicated L times, which may be accommodated in the RAM dimension or in the width dimension. Here, additional pointer memories [18] may be used to assist in addressing between these copies of the memory. However, since the LLRs provided by the LLR input 1607 of the polar decoder core 111 are common to all L decoding attempts, only a single copy of the LLR memory block 1604 having index C_max is required. For the case of L = 8 SCL decoding, Fig. 10 quantifies the total capacity of the LLR memory blocks, excluding the pointer memory. Owing to these considerations, the LLRs provided to the LLR input 1607 of the proposed polar decoder core 111 are always stored in the LLR memory block 1604 having index C_max, regardless of how many columns C are used to decode the current core block length N. As an additional benefit, the LLR memory block 1604 having index C_max may interface with the LLR input 1607 of the proposed polar decoder core 111 using a width n_l that is decoupled from the width n_i of the internal data path. In this way, LLRs may be loaded rapidly into the proposed polar decoder, regardless of how the internal data path 1601 is parameterized.

However, for simplicity, it is assumed in this example that n_l = n_i. In the case where the number N of input LLRs is smaller than the width n_i of LLR memory C_max, an equal number of zero-valued LLRs is inserted after each input LLR before they are provided to the input of the memory, so that they occupy its full width. Note that in the case where the LLR input 1607 of the proposed polar decoder core 111 uses a two's complement fixed-point number representation, the LLR memory block 1604 having index C_max may store the supplied two's complement LLRs directly, without the additional sign bit introduced by the proposed fixed-point number representation of some examples.

Fig. 17 illustrates a single LLR memory block 1604 for the case of s_i = 1 and n_i = 4. As shown in Fig. 17, the RAM of each LLR memory block 1604 has an n_i-LLR read data port 2701, which outputs the n_i LLRs across the width of a particular one of the max(r_{c-1,max}/n_i, 1) addresses, where the particular address is selected by an input provided on address port 2702. Likewise, the RAM has an n_i-LLR write port 2703, as shown in Fig. 17. The write port 2703 accepts inputs that can update the n_i LLRs across the width of a particular address, which is selected by an input provided on address port 2704. However, these n_i LLRs are only updated when the corresponding write enable signals 1615 are asserted. It is assumed that n_i individual write enable signals 1615 may be used to control whether each of the n_i LLRs is written individually. If a particular hardware RAM implementation does not natively support this feature, the write port may be driven by n_i multiplexers 1614, which multiplex the input LLRs with feedback from the read port 2701, as shown in Fig. 17. In this way, the n_i write enable signals 1615 individually control the LLRs selected by these n_i multiplexers, either writing a new LLR value to the RAM or retaining the current LLR value by writing back the corresponding LLR obtained from the read port 2701.
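The multiplexer-based emulation of per-LLR write enables can be modeled as a read-modify-write. The following Python class is a behavioral sketch with hypothetical names, not a description of any particular RAM macro.

```python
class LlrRam:
    """RAM of `depth` addresses, each holding n_i LLRs across its width."""

    def __init__(self, depth, n_i):
        self.rows = [[0] * n_i for _ in range(depth)]

    def read(self, address):
        """Read port: return the n_i LLRs stored at one address."""
        return list(self.rows[address])

    def write(self, address, llrs, write_enable):
        """Write port driven through n_i multiplexers.

        Each multiplexer passes the new LLR when its write enable is
        asserted and otherwise feeds back the value from the read port,
        so the whole row is always written.
        """
        current = self.read(address)
        self.rows[address] = [
            new if enable else old
            for new, old, enable in zip(llrs, current, write_enable)
        ]
```

A write with enables (1, 0, 1, 0) therefore updates only the first and third LLRs at the selected address and leaves the others unchanged.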

As shown in Fig. 17, each operation of the internal data path 1601 within column c reads from and writes to the LLR memory block 1604 having index c+1, using the corresponding write enable signals 1615. Likewise, each operation of the external data path 1602 reads from the LLR memory block 1604 having index 1 if C > 1, and otherwise from the LLR memory block 1604 having index C_max. These interfaces between the LLR memory blocks 1604 and the data paths 1601, 1602 are specifically designed to avoid the need for a complex routing network that would allow any LLR in a memory block to be read or written by any input or output of the data paths 1601, 1602. Instead, the arrangement of the LLRs within the memory blocks is designed such that only a simple routing network is needed between the LLR memory blocks 1604 and the data paths 1601, 1602. Likewise, in some examples, it is designed such that only a limited number of control signals are required from the controller 1606. More specifically, during each step of the decoding process, the n_i LLRs across the width of a particular address within the appropriate memory block are read and delivered directly to either the internal data path 1601 or the external data path 1602. Likewise, whenever the internal data path 1601 operates, the LLRs and write enable signals 1615 that it delivers are used directly to write a subset of the n_i LLRs across the width of a particular address within the appropriate memory block. The controller 1606 need only provide the appropriate read and write addresses 2702, 2704 to the two memory blocks 1604.

2) Bit memory:

As shown in FIG. 6, the proposed polarity decoder core 111 employs C_max - 1 three-dimensional blocks of bit memory 1605, namely bit memory 1 to bit memory C_max - 1. Conceptually, bit memory c can be considered to be located on the left edge of the column 1702 having the corresponding index c, at the interface with the column 1701, 1702 having the index c-1. Here, the bit memory block 1605 having the index c comprises […] RAMs, each having a width of n_i bits and a depth of […] addresses, where the RAM, width and depth dimensions represent the three dimensions of the memory block 1605. The total bit memory requirement of the proposed polarity decoder core 111 is given by […] bits.

Note that in the case of SCL decoding, L replicas of the bit memory blocks 1605 are required, which may be accommodated in either the RAM dimension or the width dimension. In this case, an additional pointer memory [18] may be used to facilitate addressing between these replicas. The total capacity of the bit memory blocks 1605 is quantified in FIG. 10 for the case of L = 8 SCL decoding, including the output bit memory described in some examples but excluding the pointer memory. Note that alternative arrangements may exchange the roles of the RAM and width dimensions, instead employing n_i RAMs, each having a width of […] bits, although this would imply a data path interface and controller 1606 design different from those described in the following description and in other examples. As mentioned, some examples assume […]. In the case where […], bit memory 1 requires a larger width of […], together with modifications to the controller 1606 in order to support the interface with the external data path 1602.

A single bit memory block 1605 is illustrated in FIG. 18 for the case of s_i = 1 and n_i = 4. As shown in FIG. 18, each RAM in each block of the bit memory 1605 has an n_i-bit read port 2801. The read port 2801 outputs the n_i bits across the width of a particular one of the […] addresses across the depth of the RAM. Here, as shown in FIG. 18, the particular address is selected by an input provided on the address port 2802. Likewise, each RAM has an n_i-bit write port 2803, as shown in FIG. 18. The write port 2803 accepts inputs that can update the n_i bits across the width of a particular address, which is selected by an input provided on the address port 2804. However, each of these n_i bits is only updated if the corresponding write enable signal 1616 is asserted. Here, n_i individual write enable signals 1616 may be used to control individually whether each of the n_i bits is written. If a particular hardware RAM implementation does not support this function natively, the write port 2803 may be driven by n_i multiplexers 1617, which multiplex the input bits with feedback from the read port 2801. For simplicity, this mechanism is not shown in FIG. 18, although it is shown in FIG. 6. In this way, the n_i write enable signals 1616 individually control the bits selected by these n_i multiplexers 1617, either writing new bit values to the RAM or maintaining the current bit values by writing back the corresponding bits obtained from the read port 2801.

As shown in FIG. 6, the external data path 1602, the C-2 instances of the partial sum data path 1603 and the C-1 instances of the bit memory block 1605 form a chain. More specifically, bit memory 1 resides between the external data path 1602 and partial sum data path 1, while bit memory c ∈ [2, C-2] resides between partial sum data path c-1 and partial sum data path c, and bit memory C-1 terminates the chain, residing to the right of partial sum data path C-2. In a step of the decoding process that accesses a sub-row 1704 in the inner column c, the multiplexer 1612 connected to the bit input and output on the left edge of the internal data path 1601 is controlled so that it interfaces with bit memory c. Here, FIG. 18 details the interface between bit memory c and its neighboring data paths 1601, 1602, 1603.

These interfaces between the bit memory blocks 1605 and the respective data paths 1601, 1602, 1603 are specifically designed to avoid the need for a complex routing network, which would otherwise be required to allow any input or output of the data paths 1601, 1602, 1603 to read or write any bit in the memory block 1605. Instead, the arrangement of the bits in the memory block 1605 is designed such that only a simple routing network is required between the bit memory block 1605 and the data paths 1601, 1602, 1603. Likewise, in this example, it is designed such that only a limited number of control signals are required from the controller 1606. More specifically, the address ports of the […] RAMs within a particular bit memory block 1605 are all tied together, so that the controller 1606 needs to generate only a single address 2802, 2804 for each bit memory block 1605. Further, the bit inputs 2601 on the left edge of partial sum data path c and the bit inputs 2401 on the left edge of the internal data path 1601 are each read from bit memory c on a simple width-wise basis, as described in detail below. Similarly, the bit output 2403 on the left edge of the internal data path 1601 is written to bit memory c on a width-wise basis. In contrast, the bit output 2602 on the right edge of partial sum data path c-1 is written to bit memory c on a simple RAM-wise basis, as described in detail below. Likewise, the bit output 2504 on the right edge of the external data path 1602 is written to bit memory 1 on a RAM-wise basis. In some alternative examples, the width-wise bit memory accesses may be replaced by RAM-wise accesses, and vice versa, although this would imply a data path interface and controller 1606 design different from those described below and elsewhere.

For both the width-wise and the RAM-wise interfaces between a bit memory block 1605 and a data path, the bit having the position l ∈ [0, n_i - 1] in the input or output of the data path is read from or written to a particular location within the width of a particular address within the depth of a particular one of the RAMs in the memory block 1605. This location in the memory block 1605 may be identified by a width coordinate w_l ∈ [0, n_i - 1], a depth coordinate d_l and a RAM coordinate r_l. As described above, the arrangement of the bits in each memory block 1605 and the proposed operation of the polarity decoder core 111 are such that the address ports 2802, 2804 of the […] RAMs within a particular bit memory block 1605 may all be tied together. This implies that, for both the width-wise and the RAM-wise interfaces, all n_i bits that are accessed together will have the same depth coordinate, that is, d_l has the same value for all l ∈ [0, n_i - 1].

In addition, the bit having the position l ∈ [0, n_i - 1] in a width-wise data path interface only accesses locations in the bit memory block 1605 having the corresponding width coordinate w_l = l. However, this bit in the data path interface may need to access any of the possible RAM coordinates at different times during the polarity decoding process. Thus, a multiplexer 2805 that selects among the […] RAMs is the only circuitry needed to provide the l-th bit to the width-wise data path input.

More specifically, the multiplexer 2805 selects between the bits provided at the l-th position of the read port 2801 of each of the […] RAMs, as shown in FIG. 18. Here, in some examples, the controller 1606 is required to provide n_i RAM read coordinates to the bit memory block 1605, which may be decoded to provide a separate control signal to each of the n_i multiplexers 2805. In contrast, the l-th bit of the width-wise data path output does not require any additional circuitry, since the bit can be provided to the l-th position of the write port of each of the […] RAMs, and the write enable signals 1616 can be used to control which of the RAMs is updated. Here, in some examples, the controller 1606 is required to provide n_i RAM write coordinates to the bit memory block 1605, which may be decoded to assert n_i of the write enable signals 1616.
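The width-wise access pattern described above may be modelled in software as follows. This is a minimal sketch under stated assumptions: the class `BitMemoryBlock`, its method names and the use of `None` to mean "no write enable asserted for this lane" are all hypothetical and serve only to illustrate that lane l always uses width coordinate l, that all RAMs share one address, and that only the RAM coordinate is selected per lane.

```python
class BitMemoryBlock:
    """Three-dimensional bit memory model: mem[ram][depth][width]."""

    def __init__(self, num_rams, depth, n_i):
        self.n_i = n_i
        self.mem = [
            [[0] * n_i for _ in range(depth)] for _ in range(num_rams)
        ]

    def read_widthwise(self, addr, ram_coords):
        # All RAMs share the single address `addr`. For lane l, a per-lane
        # multiplexer picks RAM ram_coords[l]; the width coordinate is
        # hardwired to l.
        return [self.mem[ram_coords[l]][addr][l] for l in range(self.n_i)]

    def write_widthwise(self, addr, bits, ram_coords):
        # Lane l's bit is offered to position l of every RAM; only the RAM
        # named by ram_coords[l] has its write enable asserted.
        for l, (bit, r) in enumerate(zip(bits, ram_coords)):
            if r is not None:  # None models "no write enable for this lane"
                self.mem[r][addr][l] = bit


blk = BitMemoryBlock(num_rams=2, depth=4, n_i=4)
blk.write_widthwise(1, [1, 1, 0, 1], [0, 1, None, 1])
print(blk.read_widthwise(1, [0, 1, 0, 1]))
```

In this model the write path needs no multiplexers at all, which mirrors the observation that the write enable signals alone decide which RAM is updated.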

Furthermore, the bit having the position l ∈ [0, n_i - 1] in a RAM-wise data path output is only written to locations in the memory block 1605 having the corresponding RAM coordinate r_l = […]. However, this bit may need to be written to any of the possible width coordinates w_l ∈ [0, n_i - 1] at different times during the polarity decoding process. Thus, as shown in FIG. 18, multiplexers 2806 are the only circuitry required to provide each of the n_i inputs to the write port 2803 of each of the […] RAMs. This is because each input of the RAM having the RAM coordinate r_l is only selected from the subset of the data path outputs having positions l ∈ [0, n_i - 1] that satisfy […]. Here, the controller 1606 may be required to provide n_i width write coordinates to the memory block 1605, which may be decoded to assert n_i of the write enable signals 1616 and to provide control signals to the corresponding subset of the multiplexers 2806.

As described above, in a step of the decoding process that accesses a sub-row 1704 in the inner column c, a particular selection of bits is read width-wise from each bit memory block 1605 having an index c' ∈ [1, c-1], passed through the partial sum data path 1603 having the index c' and written RAM-wise into the bit memory block 1605 having the index c'+1. Note that a subset of the locations in bit memory c' that are written RAM-wise by partial sum data path c'-1 will also be read width-wise by partial sum data path c'. Thus, the bit memories having indices in the range 2 to c'-1 are operated in a transparent mode, so that the bit values provided by a write operation become available to a read operation in the same step of the decoding process. More specifically, in addition to the feedback 1617 from the read port of each RAM in bit memory c' to its write port, a bypass 1610 is provided so that the bits provided to the write port 2803 by partial sum data path c'-1 can be fed directly to the read port 2801. As shown in FIG. 18, multiplexers 1610 are provided to select between the output provided by the read port 2801 of bit memory c' and the input provided by partial sum data path c'-1. These multiplexers may be driven by the same write enable signals 1616 that control the operation of the corresponding write port. This allows bits to propagate 1805 from bit memory 1 through the chain of partial sum data paths 1603 and bit memory blocks 1605 described herein, and to be delivered to the bit input 2401 on the left edge of the internal data path 1601. Here, the controller 1606 provides control signals to the bit memory blocks 1605 to ensure that the correct bits are XORed 2101 together in the partial sum data paths 1603. After the operation of the internal data path 1601 has completed, the bits provided by the bit output 2403 on its left edge are written to the bit memory block 1605 having index c. Here, a multiplexer 1613 is provided at the input to the write port 2803 to select between the outputs provided by partial sum data path c'-1 and by the internal data path 1601. Note that these multiplexers 1613 are located after the point from which the transparent bypass 1610 is taken, in order to prevent the creation of an infinite feedback loop.
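The transparent mode described above can be sketched as a single function acting on one address of one RAM. This is an illustrative model only: the function name `access` and its `transparent` flag are hypothetical, and the sketch assumes, as the text states, that the bypass multiplexers are driven by the same write enable signals as the write port.

```python
def access(row, incoming, write_enable, transparent=True):
    """Model one step at one address; returns (stored_after, read_output).

    row          -- bits currently stored across the width of the address
    incoming     -- bits offered to the write port in this step
    write_enable -- per-lane write enables, also driving the bypass muxes
    """
    # Write port with read feedback: a lane keeps its old bit unless enabled.
    updated = [new if en else old
               for old, new, en in zip(row, incoming, write_enable)]
    # Transparent mode: the bypass makes freshly written bits visible to the
    # read port in the same step; otherwise the read sees the old contents.
    read_output = list(updated) if transparent else list(row)
    return updated, read_output


stored, seen = access([0, 1, 0, 1], [1, 1, 1, 0], [True, False, False, True])
print(stored, seen)
```

With `transparent=False` the same call would return the old row as the read output, which is why a chain of non-transparent memories could not propagate partial sum bits within a single step.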

Controller:

As previously mentioned, the proposed polarity decoding process comprises a total of […] steps. During each step in which a sub-row 1704 in the inner column 1702 having index c is processed 1806, the controller 1606 is required to provide read control signals to the bit memory blocks 1605 having indices 1 to c. In addition, the controller 1606 is required to provide read control signals to LLR memory c+1 when processing 1806 a sub-row 1704 in an inner column 1702 having index c ∈ [1, C-2], or to LLR memory C_max when processing 1806 a sub-row 1704 in the inner column C-1. Furthermore, when processing 1806 a sub-row 1704 in the inner column 1702 having index c, the controller 1606 is required to provide write control signals to the bit memory blocks 1605 having indices 2 to c and to the LLR memory block 1604 having index c. During each step in which a row 1703 in the external column 1701 having index c = 0 is processed 1804, the controller 1606 is required to provide write control signals to bit memory 1, and to provide read control signals to LLR memory 1 if C > 1, or to LLR memory C_max if C = 1. The controller 1606 is designed such that each memory write operation seamlessly arranges the corresponding bits or LLRs in memory, so that they can subsequently be read seamlessly without requiring a complex interconnection network.
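The memory selection rules in the preceding paragraph may be summarised by the following sketch, which returns the memory indices that receive read and write control signals for a step in column c. The function name `control_targets` and the dictionary keys are hypothetical; only the index rules are taken from the text, and the sketch says nothing about the addresses themselves.

```python
def control_targets(c, C, C_max):
    """Return which memories the controller drives when column c is processed.

    c     -- column index (0 is the external column, 1..C-1 are inner columns)
    C     -- number of columns in use
    C_max -- index of the LLR memory loaded from the decoder inputs
    """
    if c == 0:
        # External column: write bit memory 1, read LLR memory 1 or C_max.
        return {
            "read_bits": [],
            "write_bits": [1],
            "read_llr": 1 if C > 1 else C_max,
            "write_llr": None,
        }
    if not 1 <= c <= C - 1:
        raise ValueError("inner column index must lie in [1, C-1]")
    return {
        "read_bits": list(range(1, c + 1)),   # bit memories 1 to c
        "write_bits": list(range(2, c + 1)),  # bit memories 2 to c
        "read_llr": c + 1 if c <= C - 2 else C_max,
        "write_llr": c,
    }


print(control_targets(2, C=4, C_max=5))
```

For example, with C = 4 and C_max = 5, the right-most inner column c = 3 reads its LLRs from LLR memory 5 rather than from LLR memory 4.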

In addition to the various signals used in the flow chart of FIG. 8, the operation of the controller 1606 depends on a signal referred to as the first index j_c ∈ [0, N-1]. This represents the vertical index in the polar code graph 201, 202, 203 to which the top-most connection of the sub-row 1704 currently being accessed in column c belongs, where j_c = 0 for the top-most sub-row 1704 in the top-most row 1703. The first index may be obtained according to j_c = y_c r_c + s, where y_c ∈ [0, N/r_c - 1] is the index of the row 1703 currently being accessed in column c, and s ∈ [0, max(r_c/n_i, 1) - 1] is the index of the sub-row 1704 currently being accessed within that row 1703. During the propagation 1805 of partial sum bits through successive bit memory blocks 1605 and replicas of the partial sum data path 1603, the first index associated with each column c' ∈ [1, c-1] is obtained according to:

[…]

where […] is the index of the row 1703 currently being accessed in the current column c.

As previously described in some examples, both the read and the write accesses to the LLR memory blocks 1604 may be performed width-wise. The position l ∈ [0, n_i - 1] in the input or output of LLR memory c accesses the LLR stored at particular depth d_l and width w_l coordinates,

where […].

In all cases, w_l = l.

As described herein in some examples, it is assumed that circuitry is provided to load 1802 the LLRs from the corresponding inputs 1607 of the proposed polarity decoder core 111 into LLR memory C_max. The controller 1606 is required to operate this loading circuitry such that, when the internal data path 1601 performs the processing 1806 of a particular sub-row 1704 in column C-1, it can read the corresponding LLRs from LLR memory C_max using the depth coordinate:

[…]

In addition, when either the internal data path 1601 or the external data path 1602 performs the processing 1804, 1806 of a particular sub-row 1704 in column c ∈ [0, C-2], it reads from LLR memory c+1 using the depth coordinate:

[…]

In contrast, when the internal data path 1601 performs the processing 1806 of a particular sub-row 1704 in column c, it writes to LLR memory c using the depth coordinate:

[…]

Here, it can be observed that the width coordinate w_l = l is independent of the first index j_c, and may therefore be hardwired in accordance with the width-wise operation described in some examples. In contrast, the depth coordinate d_l must be controlled by the controller 1606 as a function of the first index j_c. Note, however, that the depth coordinate d_l is independent of the position index l, so the controller 1606 is only required to provide a single address 2702, 2704 to the memory block 1604. Note also that the LLR provided at the position l ∈ [0, n_i - 1] of the write port is only written to the LLR memory block 1604 if the write enable signal 1615 having the corresponding position l ∈ [0, n_i - 1] is asserted, as described in some examples.

As described in some examples, the internal data path 1601 performs both its read and its write accesses to the bit memory blocks 1605 width-wise. For these width-wise memory accesses, the position l ∈ [0, n_i - 1] in the input or output of bit memory c accesses the bit stored at particular depth d_l, RAM r_l and width w_l coordinates, according to:

[…]

w_l = l.

Here, it can be observed that the width coordinate w_l = l is independent of the first index j_c, and may therefore be hardwired in accordance with the width-wise operation described in some examples. In contrast, the depth d_l and RAM r_l coordinates must be controlled by the controller 1606 as a function of the first index j_c. Note, however, that the depth coordinate d_l is independent of the position index l, so the controller 1606 is only required to provide a single address 2802, 2804 to the memory block. Note also that in some cases where n_i > r_c, the above approach may result in two or more input bits attempting to write to the same location in the bit memory block 1605. In this case, the bit having the lowest index l should be written to the memory, while the other competing bits can be safely discarded.
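The rule for competing writes (the bit with the lowest position index is kept and the others are discarded) can be expressed as a small priority-resolution routine. This is a sketch under stated assumptions: the function name `resolve_writes` and the representation of a memory location as a `(ram, depth, width)` tuple are hypothetical, and the coordinate equations themselves are not reproduced here.

```python
def resolve_writes(targets, bits):
    """Resolve competing writes so that the lowest position index wins.

    targets -- targets[l] is the (ram, depth, width) location that position l
               attempts to write, or None if position l does not write
    bits    -- bits[l] is the bit offered by position l
    Returns a dict mapping each written location to the surviving bit.
    """
    winners = {}
    for loc, bit in zip(targets, bits):
        # Positions are visited in increasing l, so the first claim on a
        # location belongs to the lowest index; later claims are discarded.
        if loc is not None and loc not in winners:
            winners[loc] = bit
    return winners


writes = resolve_writes(
    [(0, 0, 1), (0, 0, 1), None, (1, 0, 3)],  # positions 0 and 1 collide
    [1, 0, 1, 1],
)
print(writes)
```

In this example, positions 0 and 1 target the same location, so the bit from position 0 is kept and the bit from position 1 is dropped.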

As described in some examples, the write accesses to the bit memory blocks 1605 by the external data path 1602 and the partial sum data paths 1603 are performed RAM-wise. For these RAM-wise memory accesses, the position l ∈ [0, n_i - 1] in the input of bit memory c+1 accesses the bit stored at particular depth d_l, RAM r_l and width w_l coordinates, according to:

[…]

Here, it can be observed that the RAM coordinate r_l is independent of the first index j_c, and may therefore be hardwired in accordance with the RAM-wise operation described in some examples. In contrast, the depth d_l and width w_l coordinates must be controlled by the controller 1606 as a function of the first index j_c. Note, however, that the depth coordinate d_l is independent of the position index l, so the controller 1606 is only required to provide a single address 2802, 2804 to the memory block. The above-described method of controlling the memory read and write operations results in a characteristic arrangement of the LLRs and bits within the memory blocks 1604, 1605.

FIGs. 19 to 23 provide various examples of this characteristic arrangement after the decoding process has completed. Each figure shows the index j ∈ [0, N-1] of the connection between two adjacent columns 1701, 1702 in the polar code graph 201, 202, 203 that provides the LLR or bit stored at each RAM, depth and width coordinate in the corresponding memory block 1604, 1605.

Referring now to FIG. 24, there is illustrated an exemplary computing system 2400 that may be used to implement polarity encoding according to some example embodiments of the invention. This type of computing system may be used in a wireless communication unit. Those skilled in the relevant art will also recognize how to implement the invention using other computer systems or architectures. The computing system 2400 may represent, for example, a desktop, laptop or notebook computer, a handheld computing device (PDA, cellular telephone, palmtop, etc.), a mainframe, server, client, or any other type of special or general purpose computing device as may be desirable or appropriate for a given application or environment. Computing system 2400 can include one or more processors, such as a processor 2404. Processor 2404 may be implemented using a general-purpose or special-purpose processing engine, such as a microprocessor, microcontroller, or other control logic. In this example, processor 2404 is connected to a bus 2402 or other communication medium. In some examples, computing system 2400 may be a non-transitory tangible computer program product including executable code stored therein for implementing polarity encoding.

The computing system 2400 may also include a main memory 2408, such as a Random Access Memory (RAM) or other dynamic memory, for storing information and instructions to be executed by the processor 2404. Main memory 2408 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 2404. The computing system 2400 may similarly include a Read Only Memory (ROM) or other static storage device coupled to the bus 2402 for storing static information and instructions for the processor 2404.

The computing system 2400 may also include an information storage system 2410, which may include, for example, a media drive 2412 and a removable storage interface 2420. The media drive 2412 may include a drive or other mechanism to support fixed or removable storage media, such as a hard disk drive, a floppy disk drive, a magnetic tape drive, an optical disk drive, a Compact Disc (CD) or Digital Video Disc (DVD) read or write drive (R or RW), or another removable or fixed media drive. The storage medium 2418 may include, for example, a hard disk, floppy disk, magnetic tape, optical disk, CD or DVD, or other fixed or removable medium that is read by and written to by the media drive 2412. As these examples illustrate, the storage medium 2418 may include a computer-readable storage medium having particular computer software or data stored therein.

In alternative embodiments, the information storage system 2410 may include other similar components for allowing computer programs or other instructions or data to be loaded into the computing system 2400. Such components can include, for example, a removable storage unit 2422 and an interface 2420, such as a program cartridge and cartridge interface, a removable memory (e.g., a flash memory or other removable memory module) and memory slot, and other removable storage units 2422 and interfaces 2420 that allow software and data to be transferred from the removable storage unit 2422 to the computing system 2400.

The computing system 2400 may also include a communication interface 2424. The communication interface 2424 may be used to allow software and data to be transferred between the computing system 2400 and external devices. Examples of communication interface 2424 can include a modem, a network interface (such as an ethernet or other NIC card), a communication port (e.g., a Universal Serial Bus (USB) port), a PCMCIA slot and card, etc. Software and data transferred via communications interface 2424 are in the form of signals which may be electronic, electromagnetic, optical or other signals capable of being received by communications interface 2424. These signals are provided to communication interface 2424 via channel 2428. The channel 2428 may carry signals and may be implemented using wireless media, wire or cable, fiber optic or other communications media. Some examples of channels include telephone lines, cellular telephone links, RF links, network interfaces, local or wide area networks, and other communication channels.

In this document, the terms "computer program product," "computer-readable medium," and the like may be used generally to refer to media such as the memory 2408, the storage device 2418 or the storage unit 2422. These and other forms of computer-readable media may store one or more instructions for use by the processor 2404, to cause the processor to perform specified operations. Such instructions, generally referred to as "computer program code" (which may be grouped in the form of computer programs or other groupings), when executed, enable the computing system 2400 to perform functions of embodiments of the present invention. Note that the code may directly cause the processor to perform specified operations, be compiled to do so, and/or be combined with other software, hardware and/or firmware elements (e.g., libraries for performing standard functions) to do so.

In embodiments where elements are implemented using software, the software may be stored in a computer-readable medium and loaded into computing system 2400 using, for example, removable storage drive 2422, drive 2412 or communication interface 2424. When executed by the processor 2404, the control logic (in this example, software instructions or computer program code) causes the processor 2404 to perform the functions of the invention as described herein.

In the foregoing specification, the invention has been described with reference to specific examples of embodiments thereof. It will be apparent, however, that various modifications and changes may be made therein without departing from the scope of the invention as set forth in the appended claims, and that the claims are not limited to the specific examples described above.

The connections discussed herein may be any type of connection suitable for transmitting signals from or to a corresponding node, unit or device, e.g., via an intermediary device. Thus, unless otherwise indicated or stated, connections may be, for example, direct connections or indirect connections. Connections may be illustrated or described with reference to a single connection, multiple connections, unidirectional connections, or bidirectional connections. However, different embodiments may vary the implementation of the connection. For example, a separate unidirectional connection may be used instead of a bidirectional connection, and vice versa. Moreover, the plurality of connections may be replaced by a single connection that transmits the plurality of signals serially or in a time division multiplexed manner. Likewise, a single connection carrying multiple signals may be separated into various different connections carrying subsets of those signals. Thus, there are many options for transmitting signals.

Those skilled in the art will recognize that the architecture described herein is merely exemplary, and that in fact many other architectures can be implemented that achieve the same functionality.

Any arrangement of components to achieve the same functionality is effectively "associated" such that the desired functionality is achieved. Thus, any two components herein combined to achieve a particular functionality can be seen as "associated with" each other such that the desired functionality is achieved, irrespective of architectures or intermediary components. Likewise, any two components so associated can also be viewed as being "operably connected," or "operably coupled," to each other to achieve the desired functionality.

Furthermore, those skilled in the art will recognize that boundaries between the above described operations are merely illustrative. Multiple operations may be combined into a single operation, a single operation may be distributed among additional operations, and operations may be performed at least partially overlapping in time. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments.

The invention is described herein with reference to an integrated circuit device comprising, for example, a microprocessor configured to perform the functions of a polarity decoder. However, it will be appreciated that the invention is not limited to such integrated circuit devices, and may equally be applied to integrated circuit devices comprising any alternative type of operational functionality. Examples of such integrated circuit devices comprising alternative types of operational functionality may include, by way of example only, Application Specific Integrated Circuit (ASIC) devices, Field Programmable Gate Array (FPGA) devices, or devices integrated with other components. Furthermore, because the illustrated embodiments of the present invention may, for the most part, be implemented using electronic components and circuits known to those skilled in the art, details are not explained to any greater extent than that considered necessary for the understanding and appreciation of the underlying concepts of the present invention, in order not to obfuscate or distract from the teachings of the present invention. Alternatively, the circuit and/or component examples may be implemented as any number of separate integrated circuits or separate devices interconnected with each other in a suitable manner.

As another example, examples or portions thereof may be implemented as software or code representations of physical circuitry or may be converted to logical representations of physical circuitry, such as in any suitable type of hardware description language.

Moreover, the present invention is not limited to physical devices or units implemented in non-programmable hardware, but may also be applied in programmable devices or units capable of executing a desired polarity code by operating in accordance with suitable program code, such as minicomputers, personal computers, notepads, personal digital assistants, electronic games, automotive and other embedded systems, cellular telephones, and other various wireless devices, commonly referred to herein as "computer systems".

However, other modifications, variations, and alternatives are also possible. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of other elements or steps than those listed in a claim. Furthermore, the terms "a" or "an," as used herein, are defined as one or more than one. Likewise, the use of introductory phrases such as "at least one" and "one or more" in the claims should not be construed to imply that the introduction of another claim element by the indefinite articles "a" or "an" limits any particular claim containing such introduced claim element to inventions containing only one such element, even if the same claim includes the introductory phrases "one or more" or "at least one" and indefinite articles such as "a" or "an". The same is true for the use of definite articles. Unless otherwise indicated, terms such as "first" and "second" are used to arbitrarily distinguish between the elements such terms describe. Thus, these terms are not necessarily intended to indicate temporal or other prioritization of such elements. The mere fact that certain measures are recited in mutually different claims does not indicate that a combination of these measures cannot be used to advantage.

Reference to the literature

[1] Arikan, "Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels," IEEE Transactions on Information Theory, vol.55, no.7, pp.3051-3073, month 7 in 2009.

[2] K.Niu and K.Chen, "CRC-aided decoding of polar codes," IEEE Communications Letters, vol.16, no.10, pp.1668-1671, month 10 in 2012.

[3] Huawei, hiSilicon, "Polar code construction for NR," 3GPP TSG RAN WG1 Meeting#86bis,Lisbon,Portugal, month 10 of 2016, R1-1608862.

[4] Huawei, hiSilicon, "Evaluation of channel coding schemes for control channel," 3GPP TSG RAN WG1 Meeting#86bis,Lisbon,Portugal, month 10 of 2016, R1-1608863.

[5] CATT, "Polar codes design for eMBB control channel," at 3GPP TSG RAN WG1 AH NR Meeting,Spokane,USA, month 1 in 2017, R1-1700242.

[6] ZTE, ZTE Microelectronics, "Rate matching of polar codes for eMBB," at 3GPP TSG RAN WG1 Meeting#88,Athens,Greece, month 2 in 2017, R1-1701602.

[7] Tal and A.Vardy, "List decoding of polar codes," 2011IEEE International Symposium on Information Theory Proceedings, 7 months 2011, pp.1-5.

[8] Balatsoukas-Stimming, M.B.Parizi and A.Burg, "Llr-based successive cancellation list decoding of polar codes," IEEE Transactions on Signal Processing, vol.63, no.19, pp.5165-5179, 10 months 2015.

[9] K.Niu and K.Chen, "Crc-aided decoding of polar codes," IEEE Communications Letters, vol.16, no.10, pp.1668-1671, month 10 in 2012.

[10] G.Sarkis, P.Giard, A.Vardy, C.Thibeault and W.J.Gross, "Fast polar decoders: algorithm and implementation," IEEE Journal on Selected Areas in Communications, vol.32, no.5, pp.946-957, month 5 in 2014.

[11]P.Giard, A.Balatsoukas-Stimming, G.Sarkis, C.Thibeault and W.J. Gross, "Fast Low-complexity decoders for low-rate polar codes," Journal of Signal Processing Systems, pp.1-11,2016 [ Online ]]. The method can obtain:http://dx.doi.orq/10.1007/s1 1265-016-1 173-v

[12] P. Giard, G. Sarkis, C. Thibeault and W. J. Gross, "237 Gbit/s unrolled hardware polar decoder," Electronics Letters, vol. 51, no. 10, pp. 762-763, 2015.

[13] P. Giard, G. Sarkis, C. Thibeault and W. J. Gross, "Multi-mode unrolled architectures for polar decoders," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 63, no. 9, pp. 1443-1453, Sep. 2016.

[14] C. Leroux, I. Tal, A. Vardy and W. J. Gross, "Hardware architectures for successive cancellation decoding of polar codes," in 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2011, pp. 1665-1668.

[15] C. Leroux, A. J. Raymond, G. Sarkis and W. J. Gross, "A semi-parallel successive-cancellation decoder for polar codes," IEEE Transactions on Signal Processing, vol. 61, no. 2, pp. 289-299, Jan. 2013.

[16] A. Mishra, A. J. Raymond, L. G. Amaru, G. Sarkis, C. Leroux, P. Meinerzhagen, A. Burg and W. J. Gross, "A successive cancellation decoder ASIC for a 1024-bit polar code in nm CMOS," in 2012 IEEE Asian Solid State Circuits Conference (A-SSCC), Nov. 2012, pp. 205-208.

[17] Y. Fan and C.-Y. Tsui, "An efficient partial-sum network architecture for semi-parallel polar codes decoder implementation," IEEE Transactions on Signal Processing, vol. 62, no. 12, pp. 3165-3179, Jun. 2014.

[18] A. Balatsoukas-Stimming, A. J. Raymond, W. J. Gross and A. Burg, "Hardware architecture for list successive cancellation decoding of polar codes," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 61, no. 8, pp. 609-613, Aug. 2014.

[19] J. Lin and Z. Yan, "An efficient list decoder architecture for polar codes," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 23, no. 11, pp. 2508-2518, Nov. 2015.

[20] Y. Fan, J. Chen, C. Xia, C.-Y. Tsui, J. Jin, H. Shen and B. Li, "Low-latency list decoding of polar codes with double thresholding," in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr. 2015, pp. 1042-1046.

[21] J. Lin, C. Xiong and Z. Yan, "A high throughput list decoder architecture for polar codes," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 24, no. 6, pp. 2378-2391, Jun. 2016.

[22] Y. Fan, C. Xia, J. Chen, C.-Y. Tsui, J. Jin, H. Shen and B. Li, "A low-latency list successive-cancellation decoding implementation for polar codes," IEEE Journal on Selected Areas in Communications, vol. 34, no. 2, pp. 303-317, Feb. 2016.

[23] G. Berhault, C. Leroux, C. Jego and D. Dallet, "Hardware implementation of a soft cancellation decoder for polar codes," in 2015 Conference on Design and Architectures for Signal and Image Processing (DASIP), Sep. 2015, pp. 1-8.

[24] G. Sarkis, I. Tal, P. Giard, A. Vardy, C. Thibeault and W. J. Gross, "Flexible and low-complexity encoding and decoding of systematic polar codes," IEEE Transactions on Communications, vol. 64, no. 7, pp. 2732-2745, Jul. 2016.

[25] C. Zhang, B. Yuan and K. K. Parhi, "Reduced-latency SC polar decoder architectures," in 2012 IEEE International Conference on Communications (ICC), Jun. 2012, pp. 3471-3475.

[26] B. Yuan and K. K. Parhi, "Low-latency successive-cancellation polar decoder architectures using 2-bit decoding," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 61, no. 4, pp. 1241-1254, Apr. 2014.

[27] O. Dizdar and E. Arikan, "A high-throughput energy-efficient implementation of successive cancellation decoder for polar codes using combinational logic," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 63, no. 3, pp. 436-447, Mar. 2016.

[28] C. Xiong, J. Lin and Z. Yan, "A multiple mode area-efficient SCL polar decoder," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 24, no. 12, pp. 3499-3512, Dec. 2016.

[29] C. Kim, H. Yun, S. Ajaz and H. Lee, "High-throughput low-complexity successive-cancellation polar decoder architecture using one's complement scheme," Journal of Semiconductor Technology and Science, vol. 15, no. 3, pp. 427-435, 2015.

[30] A. Pamuk and E. Arikan, "A two phase successive cancellation decoder architecture for polar codes," in 2013 IEEE International Symposium on Information Theory, Jul. 2013, pp. 957-961.

[31] X. Liang, J. Yang, C. Zhang, W. Song and X. You, "Hardware efficient and low-latency CA-SCL decoder based on distributed sorting," in 2016 IEEE Global Communications Conference (GLOBECOM), Dec. 2016, pp. 1-6.

[32] C. Xiong, J. Lin and Z. Yan, "Symbol-decision successive cancellation list decoder for polar codes," IEEE Transactions on Signal Processing, vol. 64, no. 3, pp. 675-687, Feb. 2016.

[33] B. Yuan and K. K. Parhi, "Low-latency successive-cancellation list decoders for polar codes with multibit decision," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 23, no. 10, pp. 2268-2280, Oct. 2015.

[34] C. Zhang and K. K. Parhi, "Low-latency sequential and overlapped architectures for successive cancellation polar decoder," IEEE Transactions on Signal Processing, vol. 61, no. 10, pp. 2429-2441, May 2013.

[35] T. Che and G. S. Choi, "An efficient partial sums generator for constituent code based successive cancellation decoding of polar codes," CoRR, vol. abs/1611.09452, 2016. [Online]. Available: http://arxiv.org/abs/1611.09452

[36] J. Sha, X. Liu, Z. Wang and X. Zeng, "A memory efficient belief propagation decoder for polar codes," China Communications, vol. 12, no. 5, pp. 34-41, May 2015.

Claims

1. A polar decoder core (111) comprising a processing unit (2201), the processing unit (2201) having:

at least one input configured to receive at least one input log-likelihood ratio LLR (2202, 2203);

logic configured to manipulate at least one input LLR; and

at least one output configured to output the manipulated at least one LLR;

wherein the polar decoder core (111) is characterized in that the logic of the processing unit (2201) comprises only a single two-input adder (2207) to manipulate the at least one input LLR, and the input LLR and the manipulated LLR are in the format of a fixed-point number representation comprising a two's complement binary number and an additional sign bit.

2. The polar decoder core (111) of claim 1, wherein the processing unit (2201) is configured to:

(i) perform either the "g" function or the "f" function at any given time; or

(ii) only ever perform one of: the "g" function or the "f" function.

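3. The polar decoder core (111) of claim 2, wherein at least one of the following functional conditions exists: the "f" function includes:

$$f(\tilde{x}_a, \tilde{x}_b) = \operatorname{sign}(\tilde{x}_a)\,\operatorname{sign}(\tilde{x}_b)\,\min(|\tilde{x}_a|, |\tilde{x}_b|)$$

wherein sign(·) returns "-1" if its argument is negative and "+1" if its argument is positive; the "g" function includes:

$$g(\tilde{x}_a, \tilde{x}_b, \hat{u}_a) = (-1)^{\hat{u}_a}\,\tilde{x}_a + \tilde{x}_b$$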
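For orientation, the two functions named in claims 2 and 3 can be sketched on plain integers. This is only an illustrative sketch: it assumes the min-sum form of "f" (product of the signs times the smaller magnitude) and a "g" that adds or subtracts the first LLR according to a partial sum bit, consistent with the add/subtract alternatives recited in claim 7. The names `f_ref`, `g_ref` and `u` are hypothetical, and treating a zero argument as "+1" in `sign` is an assumption, since the claim only covers negative and positive arguments.

```python
def sign(x: int) -> int:
    """Return -1 for a negative argument, +1 otherwise (zero -> +1 is an assumption)."""
    return -1 if x < 0 else 1


def f_ref(xa: int, xb: int) -> int:
    """Min-sum 'f': product of the two signs times the smaller magnitude."""
    return sign(xa) * sign(xb) * min(abs(xa), abs(xb))


def g_ref(xa: int, xb: int, u: int) -> int:
    """'g': add xa to xb when the partial sum bit u is 0, subtract it when u is 1."""
    return xb - xa if u else xb + xa
```

For example, `f_ref(-3, 5)` keeps the smaller magnitude 3 and carries the negative sign product, while `g_ref(3, 5, 1)` subtracts the first LLR from the second.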

4. The polar decoder core (111) of claim 1 or claim 2, wherein each of the at least one input LLR (2202, 2203) is represented using a fixed-point number representation having W+1 bits:

$$\tilde{x} = \langle \tilde{x}^{*},\ \tilde{x}_{W-1},\ \tilde{x}_{W-2},\ \ldots,\ \tilde{x}_{0} \rangle$$

wherein

$\tilde{x}^{*}$ is the label of the additional sign bit,

$\tilde{x}_{W-1}$ is the label of the bit that serves as both the most significant bit (MSB) and the sign bit of the two's complement binary number part of the fixed-point number representation, and

$\tilde{x}_{0}$ is the label of the least significant bit (LSB) of the two's complement binary number part of the fixed-point number representation.
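The W+1-bit format of claim 4 can be pictured with a small data type. This is a minimal sketch under stated assumptions: the class name `FixedLLR` and its fields are hypothetical, and the interpretation that a set additional sign bit negates the two's complement part is inferred from the conversion described in claim 16 rather than quoted from the claim.

```python
from typing import NamedTuple


class FixedLLR(NamedTuple):
    star: int   # additional sign bit (assumed: 1 negates the two's complement part)
    tc: int     # two's complement part, held here as a signed Python integer
    width: int  # W, the number of bits in the two's complement part

    def value(self) -> int:
        """Integer value represented by the W+1 bits."""
        return -self.tc if self.star else self.tc

    def bits(self) -> str:
        """Bit string: additional sign bit first, then the W two's complement bits (MSB to LSB)."""
        mask = (1 << self.width) - 1
        return f"{self.star}{self.tc & mask:0{self.width}b}"
```

With W = 4, `FixedLLR(0, -3, 4).bits()` is `"01101"`, and setting the additional sign bit on the same two's complement part gives the value +3. One consequence of this format is that a value can be negated by flipping a single bit, without using an adder.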

5. The polar decoder core (111) of claim 1, wherein the single two-input adder (2207) includes two inputs (2208) and is configured to provide a two's complement output (2209), wherein each input (2208) is taken from the fixed-point number representations $\tilde{x}_a$ and $\tilde{x}_b$, and the two's complement output (2209) includes a second number of bits including an additional bit to avoid overflow, the second number of bits being "W+1" bits.

6. The polar decoder core (111) of claim 5, wherein the output (2206) of the processing unit (2201) includes a third number "W+2" of bits, combining the additional bit introduced by the single two-input adder (2207) with the additional sign bit.

7. The polar decoder core (111) of claim 2, when implementing the "g" function, wherein the single two-input adder (2207) is used to manipulate the two's complement binary numbers of the at least one input LLR, based on the partial sum bit $\hat{u}_a$ and the additional sign bit of the at least one input LLR, to obtain the two's complement binary number part of the LLR $\tilde{x}_d$ by:

(i) adding the two's complement binary number part of the first LLR $\tilde{x}_a$ to the two's complement binary number part of the second LLR $\tilde{x}_b$, or

(ii) subtracting the two's complement binary number part of the first LLR $\tilde{x}_a$ from the two's complement binary number part of the second LLR $\tilde{x}_b$.
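The single-adder "g" operation of claim 7 can be sketched as follows. The sketch assumes that the additional sign bit negates the two's complement part, that the output takes the additional sign bit of the second LLR (as claim 9 recites), and that the add-or-subtract choice is the XOR of the partial sum bit with the two additional sign bits; that XOR rule is derived here from those assumptions, not quoted from the claim. All function and parameter names are hypothetical.

```python
from itertools import product


def value(star: int, tc: int) -> int:
    """Value of an LLR whose additional sign bit (assumed) negates its two's complement part."""
    return -tc if star else tc


def g_unit(a_star: int, a_tc: int, b_star: int, b_tc: int, u: int):
    """One add or one subtract on the two's complement parts; returns (d_star, d_tc)."""
    subtract = u ^ a_star ^ b_star                    # assumed control rule
    d_tc = b_tc - a_tc if subtract else b_tc + a_tc   # the single two-input adder
    return b_star, d_tc                               # output reuses the second LLR's sign bit


def check_g_unit(width: int = 3) -> bool:
    """Exhaustively compare g_unit with (-1)^u * xa + xb for W = width."""
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    for a_star, b_star, u in product((0, 1), repeat=3):
        for a_tc, b_tc in product(range(lo, hi + 1), repeat=2):
            d_star, d_tc = g_unit(a_star, a_tc, b_star, b_tc, u)
            expected = (-1) ** u * value(a_star, a_tc) + value(b_star, b_tc)
            if value(d_star, d_tc) != expected:
                return False
            if not -(1 << width) <= d_tc < (1 << width):  # must fit in W+1 bits
                return False
    return True
```

The exhaustive check also confirms that the adder output always fits in W+1 two's complement bits, which matches the one extra bit recited in claim 5.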

8. The polar decoder core (111) of claim 2, when implementing the "f" function, wherein the single two-input adder (2207) is used to manipulate the two's complement binary numbers of the at least one input LLR to obtain the min term of the "f" function, based on the additional sign bits of the at least one input LLR, by:

(i) adding the two's complement binary number part of the first LLR $\tilde{x}_a$ to the two's complement binary number part of the second LLR $\tilde{x}_b$, or

(ii) subtracting the two's complement binary number part of the first LLR $\tilde{x}_a$ from the two's complement binary number part of the second LLR $\tilde{x}_b$;

and using the MSB of the resulting two's complement number output from the single two-input adder (2207) to select (2210) either the two's complement binary number part of the first LLR $\tilde{x}_a$ or the two's complement binary number part of the second LLR $\tilde{x}_b$, to provide the two's complement binary number part of the manipulated at least one LLR $\tilde{x}_c$ for output.
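One way the add-or-subtract-then-select structure of claim 8 can yield the min term is sketched below. It is a sketch under assumptions, not the patented circuit: the rule used here for choosing between addition and subtraction (compare the MSBs of the two two's complement parts) and the XOR expressions for the output sign bit are this sketch's own choices, picked so that the arithmetic works out when the additional sign bit is taken to negate the two's complement part. All names are hypothetical.

```python
from itertools import product


def value(star: int, tc: int) -> int:
    """Value of an LLR whose additional sign bit (assumed) negates its two's complement part."""
    return -tc if star else tc


def f_ref(xa: int, xb: int) -> int:
    """Reference min-sum 'f' on plain integers (zero is treated as positive)."""
    sa = -1 if xa < 0 else 1
    sb = -1 if xb < 0 else 1
    return sa * sb * min(abs(xa), abs(xb))


def f_unit(a_star: int, a_tc: int, b_star: int, b_tc: int):
    """One add or subtract, then an MSB-controlled selection; returns (c_star, c_tc)."""
    msb_a = 1 if a_tc < 0 else 0
    msb_b = 1 if b_tc < 0 else 0
    # Assumed rule: subtract when the MSBs agree, add when they differ.
    r = b_tc - a_tc if msb_a == msb_b else b_tc + a_tc   # the single two-input adder
    r_msb = 1 if r < 0 else 0
    if r_msb ^ msb_b:                                    # second LLR has the smaller magnitude
        return a_star ^ b_star ^ msb_a, b_tc
    return a_star ^ b_star ^ msb_b, a_tc                 # first LLR has the smaller magnitude


def check_f_unit(width: int = 3) -> bool:
    """Exhaustively compare f_unit with f_ref for W = width."""
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    for a_star, b_star in product((0, 1), repeat=2):
        for a_tc, b_tc in product(range(lo, hi + 1), repeat=2):
            c_star, c_tc = f_unit(a_star, a_tc, b_star, b_tc)
            if value(c_star, c_tc) != f_ref(value(a_star, a_tc), value(b_star, b_tc)):
                return False
    return True
```

The selected two's complement part passes through unchanged, so only the additional sign bit has to be recomputed; no second adder is needed to negate the result.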

9. The polar decoder core (111) of claim 7, wherein the additional sign bit of the manipulated at least one LLR, being one of $\tilde{x}_c$ and $\tilde{x}_d$, is obtained:

as a function of at least one of: the MSB of the two's complement binary number part of the at least one input LLR and the additional sign bit of the at least one input LLR;

as the value of the additional sign bit of the second LLR $\tilde{x}_b$.

10. The polar decoder core (111) of claim 2, wherein the polar decoder core (111) further comprises an external data path (1602), the external data path (1602) comprising:

an f/g function graph comprising a first number $s_o$ of processing stages (207), wherein each of the first number $s_o$ of processing stages (207) includes a second number of processing units (2201) that perform only the "f" function and a second number of processing units (2201) that perform only the "g" function.

11. The polar decoder core (111) of claim 2, wherein the polar decoder core (111) comprises an internal data path (1601), the internal data path (1601) comprising a plurality of processing units (2201), the plurality of processing units (2201) being arranged as a number $s_i$ of processing stages (207) configured to perform at least one of an "f" function and a "g" function, wherein the rightmost stage comprises a first number $n_i/2$ of processing units (2201), and each successive stage to the left of the rightmost stage contains half as many processing units (2201) as the processing stage (207) to its right.

12. The polar decoder core (111) of claim 11, wherein an access index v in the range 0 to $2^{s_c}-1$ is expressed in base 2 as a binary number having a first number $s_c$ of bits, wherein each successive bit from right to left is used to control whether the "f" function or the "g" function is performed by the processing units of each successive stage, from left to right, of the plurality of processing units (2201) in the internal data path, such that the least significant bit (LSB) of the binary number controls the leftmost stage of the plurality of processing units (2201) and the most significant bit (MSB) of the binary number controls the rightmost stage of the plurality of processing units (2201).
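The bit-to-stage mapping of claim 12 can be sketched in a few lines. The function name `stage_functions` is hypothetical, and the polarity of the control (bit 0 selects "f", bit 1 selects "g") is an assumption, since the claim only states which bit controls which stage.

```python
from typing import List


def stage_functions(v: int, s_c: int) -> List[str]:
    """Return the function chosen for each stage, ordered from leftmost to rightmost.

    Bit k of v (k = 0 is the LSB) controls the k-th stage counted from the left,
    so the LSB drives the leftmost stage and the MSB drives the rightmost stage.
    Assumed polarity: 0 selects 'f', 1 selects 'g'.
    """
    if not 0 <= v < (1 << s_c):
        raise ValueError("v must lie in the range 0 to 2**s_c - 1")
    return ["g" if (v >> k) & 1 else "f" for k in range(s_c)]
```

For instance, with three stages and v = 3 (binary 011), the two leftmost stages are controlled by the two set low-order bits and the rightmost stage by the cleared MSB.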

13. The polar decoder core (111) of claim 10 or claim 11, wherein an incremental bit width of the fixed-point number representation is used in each successive processing stage (207) from right to left.

14. The polar decoder core (111) of claim 13, further comprising a clipping circuit (2411), the clipping circuit (2411) configured to perform at least one of: reducing the bit width W of the LLRs output by the leftmost stage of the plurality of processing units (2201) to match the bit width of the LLRs at the rightmost stage of the plurality of processing units (2201), when an incremental bit width of the fixed-point number representation is used in each successive processing stage from right to left; additionally reducing the bit width at an intermediate processing stage (207) between the rightmost stage of the plurality of processing units (2201) and the leftmost stage of the plurality of processing units (2201).
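Claims 13 and 14 describe a bit width that grows by stage and is then reduced again. A minimal sketch of one common way to reduce the width of a two's complement number is saturation, shown below. The claim itself says only that the bit width is reduced, so the choice of saturation, and the name `clip_twos_complement`, are assumptions of this sketch.

```python
def clip_twos_complement(tc: int, width: int) -> int:
    """Saturate a signed integer to the range of a `width`-bit two's complement number."""
    lo = -(1 << (width - 1))       # most negative representable value
    hi = (1 << (width - 1)) - 1    # most positive representable value
    return max(lo, min(hi, tc))
```

Values already inside the target range pass through unchanged, so clipping only alters LLRs whose magnitude grew beyond what the narrower width can hold.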

15. The polar decoder core (111) of claim 10, further comprising a plurality of LLR memory blocks (1604, 1605) coupled to the plurality of processing units (2201), the LLR memory blocks (1604, 1605) each configured to convert respective input LLRs to two's complement fixed-point numbers for storage in the plurality of LLR memory blocks (1604, 1605).

16. The polar decoder core (111) of claim 1, wherein, when writing the input LLR into the LLR memory block (1604), if the additional sign bit of the fixed-point number representation is set, the two's complement binary number part of the fixed-point number representation is negated by inverting all of its bits and then using a further single two-input adder (2207) to increment the resulting value, thereby converting to a two's complement fixed-point number representation.

17. The polar decoder core (111) of claim 1 or claim 2, wherein the two's complement binary numbers of the at least one input LLR are converted to the fixed-point number representation by appending the two's complement binary number to a zero-valued additional sign bit when the input LLR is read from the LLR memory blocks (1604, 1605).
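The memory-side conversions of claims 16 and 17 can be sketched on raw bit patterns. This is an illustrative sketch: the names `to_memory` and `from_memory` are hypothetical, bit patterns are held as non-negative Python integers, and the result is assumed to be kept to the same `width` bits, so the most negative pattern maps to itself (the usual two's complement corner case).

```python
from typing import Tuple


def to_memory(star: int, tc_bits: int, width: int) -> int:
    """Write path: fold the additional sign bit into a plain two's complement pattern."""
    mask = (1 << width) - 1
    tc_bits &= mask
    if star:
        inverted = ~tc_bits & mask         # invert every bit of the two's complement part
        tc_bits = (inverted + 1) & mask    # then increment (the further adder)
    return tc_bits


def from_memory(tc_bits: int) -> Tuple[int, int]:
    """Read path: prepend a zero-valued additional sign bit to the stored pattern."""
    return 0, tc_bits
```

With W = 4, writing the pattern 0011 (+3) with the additional sign bit set stores 1101 (-3), and reading it back yields the pair (0, 1101), which represents the same value.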

18. A communication unit comprising a polar decoder core (111), the polar decoder core (111) comprising a processing unit (2201), the processing unit (2201) having:

logic configured to manipulate at least one input LLR; and

at least one output configured to output the manipulated at least one LLR;

19. An integrated circuit for a wireless communication unit, the integrated circuit comprising a polar decoder core (111), the polar decoder core (111) comprising a processing unit (2201), the processing unit (2201) having:

logic configured to manipulate at least one input LLR; and

at least one output configured to output the manipulated at least one LLR;

20. A method of polar decoding, characterized in that the method comprises, at a polar decoder core (102) having a processing unit (2201) that comprises within its logic circuit only a single two-input adder (2207):

receiving at least one input log-likelihood ratio, LLR, (2202, 2203) in a format of a fixed-point representation comprising 2's complement binary number and additional sign bits;

manipulating the at least one input LLR in a format of a fixed-point representation comprising 2's complement binary number and additional sign bits; and outputting the manipulated at least one LLR in a format of a fixed-point representation comprising 2's complement binary number and additional sign bits.