US20230412197A1 - Acceleration of s-polar ecc throughput by scheduler - Google Patents


Info

Publication number
US20230412197A1
Authority
US
United States
Prior art keywords
node
decoding
paths
path
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US17/807,217
Other versions
US11848687B1 (en
Inventor
Amit Berman
Sarit Buzaglo
Ariel Doubchak
Current Assignee
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date
Filing date
Publication date
Application filed by Samsung Electronics Co Ltd filed Critical Samsung Electronics Co Ltd
Priority to US17/807,217 priority Critical patent/US11848687B1/en
Assigned to SAMSUNG ELECTRONICS CO., LTD. reassignment SAMSUNG ELECTRONICS CO., LTD. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BUZAGLO, SARIT, BERMAN, AMIT, Doubchak, Ariel
Priority to KR1020220087087A priority patent/KR20230172992A/en
Priority to CN202310087663.0A priority patent/CN117254880A/en
Application granted granted Critical
Publication of US11848687B1 publication Critical patent/US11848687B1/en
Publication of US20230412197A1 publication Critical patent/US20230412197A1/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H03M — CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/09 Error detection only, e.g. using cyclic redundancy check [CRC] codes or single parity bit
    • H03M13/13 Linear codes
    • H03M13/1515 Reed-Solomon codes
    • H03M13/253 Error detection or forward error correction by signal space coding, e.g. Trellis Coded Modulation [TCM], with concatenated codes
    • H03M13/2906 Combining two or more codes or code structures using block codes
    • H03M13/2936 Combining a block and a convolutional code, comprising an outer Reed-Solomon code and an inner convolutional code
    • H03M13/373 Decoding with erasure correction and erasure determination, e.g. for packet loss recovery or setting of erasures for the decoding of Reed-Solomon codes
    • H03M13/3738 Decoding with judging correct decoding
    • H03M13/45 Soft decoding, i.e. using symbol reliability information
    • H03M13/6572 Implementations using a tree structure, e.g. implementations in which the complexity is reduced by a tree structure from O(n) to O(log(n))
    • H04L — TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/004 Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L1/0056 Systems characterized by the type of code used
    • H04L1/0057 Block codes
    • H04L1/0061 Error detection codes

Definitions

  • Embodiments of the disclosure are directed to methods of performing error correction in digital communications that shorten latency, and to hardware implementation of the same.
  • S-Polar is a Generalized Concatenated Code (GCC) with Reed-Solomon (RS) codes as its outer codes and polar codes as its inner codes.
  • the information is encoded into an N×J array using S−1 outer codes and S inner codes.
  • the inner codes, C_in^(0), C_in^(1), . . . , C_in^(S−1), are linear and nested codes, i.e., the s-th code is contained in the (s−1)-th code, for 1 ≤ s ≤ S−1.
  • K_in^(s) is the dimension of the s-th code, where K_in^(0) > K_in^(1) > . . . > K_in^(S−1).
  • the outer codes, C_out^(1), . . . , C_out^(S−1), are assumed to be systematic codes of length J with dimensions 0 < K_out^(1) ≤ . . . ≤ K_out^(S−1) ≤ J.
  • the s-th outer code, for 1 ≤ s ≤ S−1, is defined over an extension field of GF(2) of dimension K_in^(s−1) − K_in^(s).
  • the inner codes encode information and some parities of the outer codes into rows of the array. The encoded rows are mapped to the coset domain, on which the outer codes operate.
  • FIG. 1 illustrates the structure of a codeword of an N ⁇ J GCC with S stages.
  • the white cells represent information symbols, whereas the grey cells represent parities.
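The nested-dimension constraints above can be sanity-checked with a short sketch. Everything here is illustrative: the function name and the example values are hypothetical, and only the constraints come from the text (strictly decreasing nested inner dimensions, outer dimensions 0 < K_out^(1) ≤ . . . ≤ K_out^(S−1) ≤ J, and the s-th outer code working over an extension field of dimension K_in^(s−1) − K_in^(s)).

```python
# Hypothetical sanity check of the S-Polar / GCC parameters described above.
def check_gcc_params(k_in, k_out, J):
    """k_in: [K_in^(0), ..., K_in^(S-1)]; k_out: [K_out^(1), ..., K_out^(S-1)]."""
    S = len(k_in)
    assert len(k_out) == S - 1, "one outer code per stage 1..S-1"
    # inner codes are nested with strictly decreasing dimensions
    assert all(k_in[s] > k_in[s + 1] for s in range(S - 1))
    # outer dimensions: 0 < K_out^(1) <= ... <= K_out^(S-1) <= J
    assert 0 < k_out[0] and all(a <= b for a, b in zip(k_out, k_out[1:]))
    assert k_out[-1] <= J, "outer codes have length J"
    # alphabet size of the s-th outer code: GF(2^(K_in^(s-1) - K_in^(s)))
    return [2 ** (k_in[s - 1] - k_in[s]) for s in range(1, S)]
```

For instance, inner dimensions [10, 8, 5] with outer dimensions [3, 6] and J = 7 give outer-code alphabets of sizes 2^2 and 2^3.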
  • the method includes, when v is a frozen leaf, expanding all paths by a 0 bit and setting the hard decision β_v^(l) to 0, for every path index 1 ≤ l ≤ L; a frozen leaf forces the hard decision to be zero.
  • the method includes, for a node that is the root of a subtree with frozen leaves only (a RATE-0 node), for a node that is the root of a subtree with information leaves only (a RATE-1 node), and for a node that is the root of a subtree whose leaves are all frozen except for the rightmost leaf (a REP node), updating the L best path metrics and computing the corresponding hard-decision vectors without visiting other nodes in the subtrees of the binary tree.
  • updating the L best path metrics is performed in parallel for all paths in the list, wherein a latency of the path metric update depends on a depth of the node and not on a list size.
  • the method includes selecting t paths, where t < L, for those leaves for which the correct path is likely to be among the best t paths; applying a CRC detector on the k-th leaf; and continuing decoding until the number of paths equals one.
  • determining those leaves for which the correct path is likely to be among the best t < L paths comprises one or more of: selecting those paths whose path metric is greater than a predetermined threshold, using machine learning to select the best t < L decoding paths, using a classifier to select the best t < L decoding paths, or using forward prediction on each path to select the best t < L decoding paths.
  • each path is associated with a memory that stores 2N−1 soft-decision values, and each memory is assigned a unique processing block that performs soft calculations along the decoding tree.
  • each node in the decoding tree stores soft calculations and hard decisions, wherein a hard decision is represented by one bit and a soft calculation is represented by a plurality of bits.
  • the method further includes saving, for a node at depth d, 2^(n−d) soft calculations and 2^(n−d) hard decisions for each path in the list; and pruning paths from the decoding tree.
  • soft calculations in a left child of a root, v_l, form a vector λ_l of length 2^(n−1) which takes values in a set of size q^2
  • soft calculations in a right child of the root, v_r, form a vector λ_r of length 2^(n−1) with entries taking values in a set of size 2q^2, wherein λ_{r,i} depends on the hard-decision bit β_{l,i}.
  • the method includes determining a value of λ_{l,i} for coordinates 2i and 2i+1, for 0 ≤ i < 2^(n−1), by accessing a lookup table of size q^2; and determining a value of λ_{r,i} by accessing a lookup table of size 2q^2.
  • a first node from the left has q^4 possible values for its soft calculations
  • a second node from the left has 2q^4 possible values for its soft calculations
  • a third node from the left has 4q^4 possible values for its soft calculations
  • a last node has 8q^4 possible values for its soft calculations.
  • the method includes determining a value of λ_{l,i} for the leftmost node by accessing a lookup table of size q^4; for the next node from the left, by accessing a lookup table of size 2q^4; for the next node, by accessing a lookup table of size 4q^4; and for the rightmost node, by accessing a lookup table of size 8q^4.
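The table sizes quoted above can be made concrete with a small sketch. The min-sum-style f/g update rules below are an assumption (the text does not commit to a particular soft-update rule); the point is only that, over an alphabet of q soft values, the left-child update ranges over q^2 possible inputs and the right-child update, which also sees a hard-decision bit, over 2q^2.

```python
import itertools

# Sketch of the lookup-table precalculation for a small channel output
# alphabet; f and g are illustrative min-sum style update rules.
def build_luts(alphabet):
    def f(a, b):                      # left-child soft update
        return (1 if a * b >= 0 else -1) * min(abs(a), abs(b))

    def g(a, b, bit):                 # right-child update, uses hard bit
        return b + (1 - 2 * bit) * a

    f_lut = {(a, b): f(a, b)
             for a, b in itertools.product(alphabet, repeat=2)}
    g_lut = {(a, b, u): g(a, b, u)
             for a, b in itertools.product(alphabet, repeat=2)
             for u in (0, 1)}
    return f_lut, g_lut               # sizes q^2 and 2*q^2
```

With q = 3 soft values the tables hold 9 and 18 entries; precomputing them once replaces per-codeword arithmetic in the first layer with table accesses.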
  • decoding each row of an S-polar code array into codewords of C^(0) includes: providing a node v in a decoding path l at depth d in the perfect binary tree with a vector λ_v^(l) of length 2^d of soft information from a parent node, v_p; computing, for every path in a list of paths in the binary tree, a vector λ_{v_l}^(l) of length 2^(d−1) of soft information for the left child, v_l, of node v; providing node v with a vector β_{v_l}^(l) of length 2^(d−1) of hard decisions from the left child; and using β_{v_l}^(l), together with λ_v^(l), to create a soft-information vector λ_{v_r}^(l) of length 2^(d−1), which is passed to the right child, v_r, of node v.
  • FIG. 1 illustrates the structure of a codeword of an N ⁇ J GCC with S stages.
  • FIG. 2 illustrates message passing of a node in a binary tree, according to an embodiment of the disclosure.
  • FIG. 3 illustrates stepped SCL decoding in a pipeline, according to an embodiment of the disclosure.
  • FIG. 4 (A) is an example of a perfect decoding tree of a polar code of length 16 and dimension 10 using stepped SCL decoding and stepped SSCL decoding, according to embodiments of the disclosure.
  • FIG. 4 (B) is an example of a pruned tree of the stepped SSCL decoding of a polar code of length 16 and dimension 10 , according to embodiments of the disclosure.
  • FIG. 5 illustrates the throughput of an efficient chunk scheduler, according to an embodiment of the disclosure.
  • FIG. 6 is a block diagram of a system for implementing methods of performing error correction of S-polar codes that shorten latency, according to an embodiment of the disclosure.
  • FIG. 7 is a flowchart of a method of simplified successive cancellation list (SSCL) error decoding of S-polar codes, according to embodiments of the disclosure.
  • FIG. 8 is a simplified successive cancellation list (SSCL) error decoding of S-polar codes for channels with a small output alphabet, according to an embodiment of the disclosure.
  • FIG. 9 is a flowchart of a method of operation of a latency improving scheduler, according to an embodiment of the disclosure.
  • Embodiments of the disclosure provide latency-improving techniques for the S-Polar code that rely on stepped SCL decoding, early CRC detection, and the combination of stepped SCL decoding with simplified SCL decoding.
  • when the channel output alphabet is small, memory size and latency are reduced by precalculating all possible outcomes of the first few layers in the tree and by using lookup tables.
  • embodiments provide a throughput-efficient scheduler that is implemented on chunks of frames, to make use of hardware duplications of the row decoder.
  • GCC decoding is also performed in multiple stages. At the first stage, rows of the noisy array are decoded to codewords of C^(0); these rows are called frames. The frames that were successfully decoded are mapped into cosets. If enough cosets were obtained, then the first outer code can decode the first K_in^(0) − K_in^(1) bits of the cosets for all frames, which in turn allows the remaining frames to be decoded with the better correctability of the code C^(1). This process proceeds until either an outer code fails or all frames were decoded successfully by the inner code decoders and all cosets were decoded successfully by the outer code decoders.
  • Embodiments of the disclosure: (1) apply CRC detection earlier and use a shorter list size (stepped SCL) when possible; (2) provide pipeline decoding for S-Polar frames that increases throughput using stepped SCL decoding, for given hardware resources such as memories and processing units; (3) combine (1) and (2) with simplified SCL decoding to further decrease latency, memory size, and power; (4) perform precalculations that are stored in lookup tables, to save calculations and memory for the massive first layers of the decoding tree in SCL decoding when the channel output alphabet is small; and (5) provide a throughput-efficient scheduler for GCC that applies row and column decoding simultaneously and reduces latency by decreasing the expected number of frames being decoded. If the row decoder is duplicated, the scheduler can be applied on chunks of frames.
  • the polar codes in the S-Polar are decoded with a Successive Cancellation List (SCL) decoder.
  • the SCL decoder outputs a list of L codewords, each with a score that indicates how likely the codeword is to be the correct codeword. Performance is further improved by using CRC-aided polar codes to detect the correct word in the list.
  • the SCL decoder has a high latency, which can be decreased by using a Simplified Successive Cancellation List (SSCL) decoder.
  • the decoding tree of the list decoder is pruned, and hard-decision procedures on the leaves of the tree are defined such that the same decisions as the SCL decoder are obtained, and thus there is no performance loss.
  • the SSCL has fewer operations that are performed sequentially, which results in a latency improvement.
  • the number of operations that potentially can be performed in parallel depends on the size of the list.
  • a useful observation is that the list size does not have to be the same throughout the decoding process, and there are points in which the list size L can be much smaller, without affecting the performance.
  • Such a list decoder is known as a stepped SCL decoder.
  • the list sizes are optimized for each leaf so that the performance loss is minimized.
  • the list size gradually increases until it reaches a maximal size.
  • the CRC code that is used to narrow down the L codewords in the list to one codeword can be applied earlier in the decoding process, and thus, at some point of the decoding, the list size can be reduced to one and remains that way until the decoding is over, with a small performance loss.
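The early-CRC idea, narrowing the list to a single path as soon as the CRC detector can be applied, might be sketched as follows. This is illustrative only: zlib.crc32 stands in for whatever CRC the code actually carries, and the fallback policy when no path passes the check is a guess.

```python
import zlib

# Illustrative early-CRC pruning: once the decoded prefix carries the CRC,
# the list is narrowed to one path and decoding continues with list size 1.
def early_crc_prune(paths, expected_crc):
    """paths: list of (bits_tuple, metric); smaller metric = more likely."""
    survivors = [p for p in paths
                 if zlib.crc32(bytes(p[0])) == expected_crc]
    if not survivors:     # no path passes the CRC: fall back to best metric
        survivors = paths
    return min(survivors, key=lambda p: p[1])
```

Note that the CRC can override the metric order: a wrong path with a better metric is discarded if it fails the check.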
  • combining stepped SCL, early CRC, and SSCL into one decoding technique can provide a latency improvement in practical situations in which memory and processing units are limited. This combined technique is described in Section II. Note that both the simplified SCL and the stepped SCL methods, as well as early CRC, also save time complexity and memory.
  • a stepped SCL decoder can also be exploited by the S-Polar decoder to decode a number of rows simultaneously, while sharing the memory and processing units between different frames, thus allowing better usage of the available hardware.
  • when the list size is small, more rows can be processed simultaneously using the available resources.
  • when the list size grows, each row requires more processing units, and thus fewer rows can be processed until the point at which the list size drops to one again by the early CRC detection. This sharing of resources between multiple rows results in a further reduction in the overall latency. This method is described in Section II.
  • the processing and saving of the first few layers of the tree, which have the largest amount of data to be saved and the largest number of operations, can be skipped. Although operations on a node can be performed in parallel, this requires more logic gates, which may not be present in a real hardware implementation.
  • lookup tables are used that hold all possible results of the values in the first layers. The number of layers saved this way depends on the size of the channel output alphabet, the length of the polar code, and the list size.
  • another throughput improvement for the S-Polar code is attained by choosing a scheduler that is more throughput oriented.
  • the scheduler of the S-Polar decoder decides when to apply the outer code decoder and when to apply the inner code decoder.
  • a naive scheduler decodes all rows, then applies the outer code decoder only for the next outer code, and repeats the process with the remaining rows that were not successfully decoded.
  • a throughput-oriented scheduler can decode rows only until the number of decoded rows is large enough for the next outer code decoder to succeed. It can also keep applying outer code decoders as long as possible, returning to the row decoder at a higher stage, at which the decoding error probability is lower.
  • a hardware architecture can implement a throughput-oriented scheduler in which a chunk of C rows is decoded simultaneously by the row decoder. This hardware architecture is described in Section IV.
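A throughput-oriented scheduler of the kind described above might look like the following sketch. Everything named here is a hypothetical stand-in: row_decode(frame, stage) for the inner (row) decoder at a given stage, and need for the per-stage number of frames the next outer decoder requires. Rows are decoded only until the next outer code has enough frames; survivors retry at the next stage, where the inner code has better correctability.

```python
# Sketch of a throughput-oriented GCC scheduler (illustrative names).
def schedule(frames, row_decode, need, stages):
    decoded, pending = set(), list(range(len(frames)))
    for stage in range(stages):
        for i in list(pending):
            if len(decoded) >= need[stage]:
                break                 # outer decoder can already succeed
            if row_decode(frames[i], stage):
                decoded.add(i)
                pending.remove(i)
        if len(decoded) < need[stage]:
            return decoded, pending, False   # this outer stage would fail
        # outer stage succeeds; remaining rows are retried at stage + 1
    return decoded, pending, True
```

The naive scheduler corresponds to never taking the early break; the early break is what reduces the expected number of row decodings.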
  • SSCL decoding is based on the observation that some nodes of the tree have enough information available such that hard decisions can be made on these nodes efficiently, without having to traverse the subtrees of these nodes.
  • in the SSCL, there are three types of nodes that can be handled this way.
  • RATE-0 node: this node is the root of a subtree with frozen leaves only; hence the hard decisions at this node must also be zeros, and there is no need to visit other nodes in its subtree.
  • RATE-1 node: this node is the root of a subtree with information leaves only; hence the hard decisions at this node can be obtained from the soft decisions available at this node and, again, there is no need to visit other nodes in its subtree.
  • REP node: this node is the root of a subtree whose leaves are all frozen leaves, except for the rightmost leaf, which corresponds to the largest index.
  • the hard decision can be all-zeros or all-ones, and both options are considered for each path, without having to visit other nodes in the REP node's subtree.
  • this node is the root of a subtree whose leaves are all information leaves, except for the leftmost leaf, which corresponds to the smallest index.
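The leaf patterns above can be recognized directly from the frozen-bit mask of a subtree's leaves. The sketch below is illustrative; the labels RATE-0, RATE-1, and REP follow the text, while "SPC" (single parity check) is the usual name in the SSCL literature for the last pattern and is an added label, not one used in the text.

```python
# Illustrative classifier for SSCL special nodes, given the frozen-bit
# mask of a subtree's leaves in left-to-right order (True = frozen).
def classify(frozen):
    if all(frozen):
        return "RATE-0"     # all frozen: hard decisions are all zeros
    if not any(frozen):
        return "RATE-1"     # all information leaves
    if all(frozen[:-1]) and not frozen[-1]:
        return "REP"        # hard decision is all-zeros or all-ones
    if frozen[0] and not any(frozen[1:]):
        return "SPC"        # all information except the leftmost leaf
    return "GENERIC"        # subtree must be traversed
```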
  • the message passing of a node in the tree is shown in FIG. 2 , according to an embodiment of the disclosure.
  • the central node v represents the current node
  • nodes v l and v r are the left and right child nodes, respectively, of the current node
  • v_p is the parent node of the current node.
  • FIG. 7 is a flowchart of a method of simplified successive cancellation list (SSCL) error decoding of S-polar codes, according to embodiments of the disclosure.
  • a node v at depth d receives, at step 72 , a vector λ_v^(l) of length 2^d of soft information from its parent node, v_p.
  • node v computes a vector λ_{v_l}^(l) of length 2^(d−1) of soft information for its left child, v_l.
  • node v receives a vector β_{v_l}^(l) of length 2^(d−1) of hard decisions from its left child and uses it, together with λ_v^(l), to create a vector λ_{v_r}^(l) of length 2^(d−1), which it passes to its right child.
  • v receives a vector β_{v_r}^(l) of length 2^(d−1) of hard decisions from its right child and uses it, together with β_{v_l}^(l), to create a vector β_v of length 2^d of hard decisions, which it passes to its parent node. If, at step 76 , v is the i-th leaf of the perfect tree, 0 ≤ i < 2^n, then, for every path in the list, two path metrics are updated, one for each candidate value of the new bit.
  • the path metrics PM_i^(l) are the log-likelihood ratios of path l through leaf node i being a correct decoding path that represents a codeword.
  • if the soft value at leaf v is non-negative, β_v^(l) is set to 0; otherwise, β_v^(l) is set to 1. If v is a frozen leaf, then all paths are expanded by a 0 bit and β_v^(l) is set to 0, for every path index 1 ≤ l ≤ L.
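The leaf-level path expansion can be sketched as follows. The exact path-metric formula is not reproduced in the text above, so the code uses the common hardened approximation from the SCL literature (penalize a path by |LLR| when its bit disagrees with the LLR sign, where a non-negative LLR favors 0); frozen leaves are forced to the 0 bit, as stated.

```python
# Sketch of expanding all paths at a leaf (hardened path-metric update,
# an assumption; the disclosure's exact metric formula is not shown).
def extend_paths(paths, llr, frozen):
    """paths: list of (bits, metric); returns all candidate extensions."""
    out = []
    for bits, pm in paths:
        for u in ((0,) if frozen else (0, 1)):   # frozen leaf: u forced to 0
            disagrees = (u == 1) == (llr >= 0)   # bit vs. sign of the LLR
            out.append((bits + (u,), pm + (abs(llr) if disagrees else 0.0)))
    return out
```

In a full list decoder this expansion doubles the candidate set at each information leaf, after which only the L best metrics are kept.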
  • the L best path metrics can be updated and the corresponding hard decision vectors can be computed without the need to visit other nodes in the subtrees.
  • the path metric update can be performed in parallel for all paths in the list, and the latency of the path metric update mainly depends on the depth of the node and not on the list size.
  • m = min{L−1, 2^(n−d)} bits of the hard-decision vectors are calculated sequentially; hence, the list size for this node type does affect the latency of its path-metric update.
  • Example 1: Using statistics on the location of the correct path in the list for a CRC-aided polar code of length 512 and dimension 480, it was found that, by taking the list size to be 32 until the leaf of index 256 and then increasing the list size to 64, almost the same FER as SCL decoding with a list of size 64 is obtained. Moreover, the CRC detector can be applied when reaching the leaf of index 300, and SC decoding is used on the remaining 212 leaves without a significant FER reduction.
  • each path requires its own memory that stores 2N−1 soft-decision values, and each memory is assigned its own unique processing block that performs the soft calculations along the decoding tree.
  • the S-Polar setting benefits from the fact that many frames must be decoded; by allocating the memories efficiently in pipeline decoding, latency can be reduced without a significant loss of FER.
  • the latency reduction is due to the fact that the majority of the memories are released earlier, and hence decoding of new frames in the pipeline can begin when the previous frames are decoded in the successive cancellation decoding mode.
  • the following example illustrates the stepped SCL pipeline decoding from Example 1, when 4×32 memories are utilized.
  • Example 2: The stepped SCL decoding from Example 1 includes three phases. In phase 1 the list size is 32; in phase 2 the list size is 64; and in phase 3 the list size is one. Since phase 2 lasts only from leaf index 256 until leaf index 300, it has the lowest latency. Assume that the latency of phase 2 is x cycles and that the latency of phase 1 is 2x cycles. Assume also that there are a total of 4×32 memories. The decoding starts in phase 1 with the first frame, and after x cycles the decoding starts in phase 1 with the second frame, using 2×32 memories for both frames. After 2x cycles, another 32 memories are used to decode the first frame in phase 2 and another 32 memories are used to decode the third frame in phase 1.
  • FIG. 3 illustrates stepped SCL decoding in a pipeline, according to an embodiment of the disclosure. For each frame, phases 1, 2, and 3 are depicted with the leftmost, the middle, and the rightmost rectangles, respectively. The latency per frame is about x cycles, where x is the latency of phase 2, depicted by the middle rectangle.
  • the 4×32 memories can be split between two frames and phases 1 and 2 can be applied on two frames simultaneously.
  • the latency per frame will be the average of the latencies of phases 1 and 2, which is 1.5x cycles.
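The two memory allocation strategies above can be compared with a small latency model (illustrative Python sketch, not part of the claimed embodiment; the phase latencies x and 2x are taken from Example 2, and phase 3 is assumed negligible):

```python
# Latency model for the stepped SCL decoding of Example 2:
# phase 1 (list size 32) takes 2x cycles, phase 2 (list size 64)
# takes x cycles; phase 3 (list size one) is assumed negligible.

def pipeline_latency_per_frame(x: float) -> float:
    """With 4x32 memories, phases of consecutive frames overlap in a
    pipeline, so a new frame completes about every x cycles (the
    phase-2 latency, the interval between frame completions)."""
    return x

def split_memory_latency_per_frame(x: float) -> float:
    """Splitting the 4x32 memories between two frames runs phases 1
    and 2 on two frames simultaneously; a pair of frames takes
    2x + x cycles, i.e. 1.5x cycles per frame on average."""
    return (2 * x + x) / 2

assert pipeline_latency_per_frame(100.0) == 100.0
assert split_memory_latency_per_frame(100.0) == 150.0
```

Under this model, the pipeline of FIG. 3 completes frames 1.5 times faster than splitting the memories between two frames.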
  • Subsection II-A reviewed the SSCL decoding and described how it reduces latency by pruning the decoding tree.
  • Subsection II-B suggested that stepped SCL decoding with early CRC detection can decrease the latency of the decoding, when frames are decoded in a pipeline for a given number of memories, since memories are released earlier and become available for new frames. This subsection will describe the effects of embodiments of the disclosure that combine the two concepts of stepped and simplified SCL decoding.
  • a simplified SCL decoding can reduce the latency by reducing the latency of this phase. For other phases, it reduces memory size, which allows duplication of hardware and decreases latency.
  • the combined decoding also saves power.
  • Each node in the decoding tree stores soft calculations and hard decisions. While a hard decision is only one bit, a soft calculation is represented by a number of bits. If the node is at depth d, then 2 n-d soft calculations and 2 n-d hard decisions are saved for each path in the list.
  • the list size on a given node might not be the same for soft calculations and hard decisions, since the former is used when traversing down from a parent node to its child node, and the latter is used later on, when traversing up from a child node to its parent node.
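The per-node storage described above can be sketched as follows (illustrative Python; the soft-value width f in bits is a parameter, and separate list sizes for soft calculations and hard decisions are allowed, as noted above):

```python
def node_memory_bits(n: int, d: int, soft_list: int, hard_list: int,
                     f: int) -> int:
    """Bits stored at a depth-d node of the decoding tree of a polar
    code of length 2^n: 2^(n-d) soft calculations of f bits each for
    every path in the soft-calculation list, plus 2^(n-d) one-bit
    hard decisions for every path in the hard-decision list."""
    width = 2 ** (n - d)
    return soft_list * width * f + hard_list * width * 1

# e.g. N=512 (n=9), a node at depth 3, list size 32 for both soft
# calculations and hard decisions, 6-bit soft values (illustrative)
assert node_memory_bits(9, 3, 32, 32, 6) == 32 * 64 * 6 + 32 * 64
```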
  • An example of a decoding tree of a polar code of length 16 and dimension 10 using stepped SCL decoding and stepped SSCL decoding according to embodiments of the disclosure is shown in FIG. 4 .
  • the perfect tree is shown in FIG. 4 (A) .
  • the white leaves are frozen leaves and the black leaves are information leaves.
  • the list sizes at each leaf are written below the leaf, and the early CRC detection takes place in the 10th leaf, reducing the list size to one.
  • the pruned tree of the stepped SSCL decoding for this code is shown in FIG. 4 (B) .
  • the number assigned to each directed edge is the list size when traversing the tree on that edge.
  • the list size for soft calculations at the root of the tree (layer 0) is one, and for its hard decisions the list size is 8. Notice that, once the CRC detector is applied, the list size is one, and thus nodes of type SPC can also be utilized.
  • the first node has list size one for soft calculations and list size 8 for hard decisions, and the second node consumes much more memory, with list size 8 for soft calculations and list size one for hard decisions. Notice that, when traversing from a node to its right child, the memory of the soft calculation is no longer needed and can be released.
  • the hard decisions of this node are no longer needed.
  • the left child of the root has list size one for soft calculations. This means that when the decoding reaches the right child of the root it uses memory for 7 more soft calculation vectors.
  • the first and last nodes are leaves and therefore at these nodes only the path metrics need to be updated, which does not use much memory, since it is one value per path in the list.
  • the second node has a list of size two for soft calculations and a list of size 8 for hard decisions.
  • the third node has a list of size 8 for soft calculations and a list of size one for hard decisions. Again, when reaching the third node in the second layer, memory is added for 6 more soft calculation vectors.
  • This section describes a method according to an embodiment that saves memory and soft calculations for the first layers of the decoding tree of the SCL decoding, when the channel output alphabet is small, and the code length and list size are large.
  • a method described here is applicable to stepped and simplified SCL decoding as well.
  • FIG. 8 is a flowchart of a method of simplified successive cancellation list (SSCL) error decoding of S-polar codes for channels with a small output alphabet, according to an embodiment.
  • the decoder receives, at step 82 , a vector of length 2 n of Log Likelihood Ratios (LLR), and each entry of the vector can take one of q possible values.
  • the soft calculations in the left child of the root, vl, form a vector αl of length 2n-1 which takes values in a set of size q2.
  • a lookup table of size q2 can be used to access the value of αl,i directly.
  • the soft calculations in the right child of the root, vr, form a vector, αr, of length 2n-1 with entries taking values in a set of size 2q2, since αr,i also depends on the hard decision bit βl,i.
  • a lookup table of size 2q2 is used to find the value of αr,i.
  • lookup tables with a total of 3q2 entries are used (one of size q2 and one of size 2q2) and the channel outputs can be represented by N log2 q bits. If, at step 83, 3q2f+N log2 q<<Nf, where f is the number of bits that represent a soft calculation value, then this technique saves both memory and calculations. Then, at step 84, values of αl,i are determined by accessing the lookup table of size q2, and values of αr,i are determined by accessing the lookup table of size 2q2.
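The memory condition above can be checked numerically (illustrative Python sketch; the min-sum kernels f_minsum and g below are a common approximation of the left- and right-child soft calculations and are an assumption, not taken from the disclosure):

```python
import math

def f_minsum(a: float, b: float) -> float:
    # left-child soft calculation (min-sum approximation, assumed)
    sign = -1.0 if (a < 0) != (b < 0) else 1.0
    return sign * min(abs(a), abs(b))

def g(a: float, b: float, u: int) -> float:
    # right-child soft calculation, given the hard decision bit u
    return b + (1 - 2 * u) * a

def lut_saves_memory(q: int, n: int, f: int) -> bool:
    """Condition from the text: a size-q^2 table for the left child
    and a size-2q^2 table for the right child (3q^2 entries of f bits
    in total), plus N channel outputs of log2(q) bits each, must be
    smaller than storing N raw f-bit soft values."""
    N = 2 ** n
    return 3 * q * q * f + N * math.log2(q) < N * f

assert lut_saves_memory(q=4, n=12, f=8) is True    # small alphabet, long code
assert lut_saves_memory(q=16, n=6, f=6) is False   # short code: tables dominate
```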
  • This idea can be applied to the second layer, in which there are q4 possible values for soft calculations in the first node from the left, 2q4 for the second node from the left, 4q4 for the third node, and 8q4 for the last node of this layer.
  • a value of αl,i for the leftmost node is determined by accessing a lookup table of size q4; a value of αl,i for the next leftmost node, by accessing a lookup table of size 2q4; a value of αl,i for the next rightmost node, by accessing a lookup table of size 4q4; and a value of αl,i for the rightmost node, by accessing a lookup table of size 8q4.
  • there is a degree of freedom in determining the number of frames to be decoded at a certain stage, before trying to decode the cosets of that stage using an RS decoder.
  • the block of the decoder that determines which frames to decode and when to call the RS decoder is called a scheduler.
  • a naive scheduler decodes all frames that were not yet successfully decoded at every stage.
  • the RS decoder must wait until the row decoder has finished.
  • An approach according to an embodiment decodes only enough frames for the RS decoder to be likely to succeed.
  • the RS decoder and the row decoder can work simultaneously, which increases throughput.
  • many frames are decoded at later stages, which have a greater probability of success; hence, on average, an approach according to an embodiment decreases the expected number of times the row decoder is used, which increases throughput.
  • an RS code is a Maximum Distance Separable (MDS) code. Therefore, if J is the length of the RS code and K is its dimension, a successful decoding is guaranteed if the number of erased frames, for example, frames that were not yet decoded or failed to decode, plus twice the number of mis-corrected frames is at most J−K. This can be written as ne+2nme≤J−K, where ne is the number of erased frames and nme is the number of mis-corrected frames.
  • the number m can be determined using the probabilities of failure and mis-correction of the row decoder at each stage and the desired total probability of failure.
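The MDS guarantee and the resulting scheduler threshold can be written directly (illustrative Python sketch):

```python
def rs_decoding_guaranteed(n_e: int, n_me: int, J: int, K: int) -> bool:
    """An RS code is MDS, so decoding is guaranteed to succeed when
    the number of erased frames plus twice the number of
    mis-corrected frames is at most J - K."""
    return n_e + 2 * n_me <= J - K

def scheduler_threshold(K: int, m: int) -> int:
    """Frames the scheduler waits for before calling the RS decoder.
    With K + 2m decoded frames, n_e = J - (K + 2m) frames are erased,
    so up to m mis-corrections still satisfy n_e + 2*n_me <= J - K."""
    return K + 2 * m

J, K, m = 20, 15, 1                      # illustrative parameters
assert scheduler_threshold(K, m) == 17
assert rs_decoding_guaranteed(J - scheduler_threshold(K, m), m, J, K)
```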
  • a throughput efficient scheduler can operate on groups of C frames called chunks.
  • FIG. 9 is a flowchart of a method of operation of a latency improving scheduler, according to an embodiment of the disclosure. Assuming that the S-polar code has been represented as a perfect binary tree at step 91, the scheduler submits, at step 92, C frames to multiple row decoders and counts the number of decoded frames, for example, frames that were successfully decoded. The row decoders decode the frames as described above. Once this number reaches K+2m, RS decoding at this stage begins at step 93, while more chunks of frames are decoded at the same stage.
  • If, at step 94, the RS decoder succeeds, the upcoming chunk will be decoded at the next stage of decoding, at step 95, using the cosets from the RS decoder. Otherwise, at step 96, RS decoding is repeated on more frames, since new chunks of frames are being decoded at this point.
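The scheduler loop of FIG. 9 can be sketched as follows (illustrative Python; the `row_decode` and `rs_decode` interfaces are hypothetical stand-ins for the row decoder and RS decoder blocks):

```python
def chunk_scheduler(chunks, row_decode, rs_decode, K, m):
    """Submit chunks of frames to the row decoders; once K + 2m frames
    of the current stage have decoded, try RS decoding while further
    chunks are still being row-decoded. On RS success, upcoming
    chunks are decoded at the next stage.
    row_decode(frame, stage) returns the decoded frame or None;
    rs_decode(decoded_frames, stage) returns cosets or None."""
    stage = 0
    decoded = []  # frames successfully decoded at the current stage
    for chunk in chunks:
        for frame in chunk:
            result = row_decode(frame, stage)
            if result is not None:
                decoded.append(result)
        # enough frames for the RS decoder to be likely to succeed
        if len(decoded) >= K + 2 * m:
            if rs_decode(decoded, stage) is not None:
                stage += 1       # cosets recovered: advance the stage
                decoded = []
    return stage

# toy run: row decoding always succeeds, RS succeeds once called
final_stage = chunk_scheduler([[0, 1, 2], [3, 4, 5]],
                              lambda f, s: f, lambda d, s: d,
                              K=4, m=1)
assert final_stage == 1
```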
  • FIG. 5 illustrates the throughput of an efficient chunk scheduler, according to an embodiment of the disclosure.
  • RSD refers to a Reed-Solomon decoder.
  • decoding of the ith chunk at stage 0 begins.
  • decoding of the ith chunk at stage 0 ends and decoding of the i+1st chunk at stage 0 begins.
  • the RS decoder has succeeded and the i+2nd chunk is decoded at stage 1.
  • decoding of the i+2nd chunk at stage 1 ends, but there are not enough decoded frames to apply the RS decoder of stage 2.
  • decoding of the i+3rd chunk at stage 1 begins.
  • decoding of the i+3rd chunk at stage 1 ends and decoding of the i+4th chunk at stage 1 begins.
  • embodiments of the present disclosure can be implemented in various forms of hardware, software, firmware, special purpose processes, or a combination thereof.
  • the present disclosure can be implemented in hardware as an application-specific integrated circuit (ASIC), or as a field programmable gate array (FPGA).
  • the present disclosure can be implemented in software as an application program tangibly embodied on a computer readable program storage device. The application program can be uploaded to, and executed by, a machine comprising any suitable architecture.
  • FIG. 6 is a block diagram of a system for implementing methods of performing error correction of S-polar codes that shorten latency, according to an embodiment of the disclosure.
  • a computer system 61 for implementing the present disclosure can comprise, inter alia, a central processing unit (CPU) or controller 62 , a memory 63 and an input/output (I/O) interface 64 .
  • the computer system 61 is generally coupled through the I/O interface 64 to a display 65 and various input devices 36 such as a mouse and a keyboard.
  • the support circuits can include circuits such as cache, power supplies, clock circuits, and a communication bus.
  • the memory 63 can include random access memory (RAM), read only memory (ROM), disk drive, tape drive, etc., or a combination thereof.
  • the present disclosure can be implemented as a routine 67 that is stored in memory 63 and executed by the CPU or controller 62 to process the signal from the signal source 68 .
  • the computer system 61 is a general purpose computer system that becomes a specific purpose computer system when executing the routine 67 of the present disclosure.
  • embodiments of the present disclosure can be implemented as an ASIC or FPGA 67 that is in signal communication with the CPU or controller 62 to process the signal from the signal source 68 .
  • the computer system 61 also includes an operating system and micro instruction code.
  • the various processes and functions described herein can either be part of the micro instruction code or part of the application program (or combination thereof) which is executed via the operating system.
  • various other peripheral devices can be connected to the computer platform such as an additional data storage device and a printing device.

Abstract

A method of simplified successive cancellation list (SSCL) error decoding of S-polar codes includes representing an S-polar code as a perfect binary tree; providing a node v with a vector αv (l) of soft information from a parent node; computing a vector αv l (l) of soft information for a left child of node v; providing node v with a vector βv l (l) of hard decisions from the left child and using it with αv (l) to create a soft information vector αv r (l) and passing it to a right child of node v; providing node v with a vector βv r (l) of hard decisions from its right child and using it with βv l (l) to create a hard decision vector, βv, of hard decisions, and passing it to its parent node; updating, when v is an ith leaf of the perfect tree, two path metrics; and selecting paths obtained by expanding current paths with a lowest path metric.

Description

    BACKGROUND
  • Embodiments of the disclosure are directed to methods of performing error correction in digital communications that shorten latency, and to hardware implementation of the same.
  • S-Polar is a Generalized Concatenated Code (GCC) with Reed-Solomon (RS) codes as its outer codes and polar codes as its inner codes. In a GCC, the information is encoded into an N×J array using S−1 outer codes and S inner codes. The inner codes, Cin (0), Cin (1), . . . , Cin (S−1), are linear and nested codes, i.e., the sth code is contained in the s−1th code, for 1≤s≤S−1. In particular, if Kin (s) is the dimension of the sth code, then Kin (0)≥Kin (1)≥ . . . ≥Kin (S−1). The outer codes, Cout (1), . . . , Cout (S−1), are assumed to be systematic codes of length J and dimensions 0<Kout (1)≤ . . . ≤Kout (S−1)<J. When considering the outer codes, it is convenient to include the two trivial outer codes for stages 0 and S, which have dimensions Kout (0)=0 and Kout (S)=J, respectively. The sth outer code, for 1≤s<S, is defined over an extension field of GF(2) of dimension Kin (s−1)−Kin (s). The inner codes encode information and some parities of the outer codes into rows of the array. The encoded rows are mapped to the coset domain, on which the outer codes operate. The coset of a row codeword is a vector of length Kin (0)−Kin (S−1), such that for each stage s, the Kin (0)−Kin (s) first bits of the vector allow reducing a codeword of C(0) to a codeword of C(s); thus, the side information from the cosets increases the correctability of the row. More precisely, the encoding of the GCC is performed in S stages, where at each stage s, new information bits and parities from the previous outer codes are encoded into n=Kout (s+1)−Kout (s) codewords of C(s) and stored as n rows of the array. The encoded rows are mapped into the coset domain. The s+1th outer code encodes the current t=Kout (s)−Kout (s+1) columns of the coset array systematically. The parities of the obtained codewords are transmitted to the next inner code for encoding. FIG. 1 illustrates the structure of a codeword of an N×J GCC with S stages. 
The white cells represent information symbols, whereas the grey cells represent parities.
  • SUMMARY
  • According to an embodiment of the disclosure, there is provided a method of simplified successive cancellation list (SSCL) error decoding of S-polar codes, including: representing an S-polar code of length N=2n as a perfect binary tree with 2N−1 nodes, wherein n is a non-negative integer, wherein, for an lth path in a list of L paths through the binary tree, 1≤l≤L, wherein an S-Polar code is a generalized concatenated code (GCC) with Reed-Solomon (RS) codes as its outer codes and polar codes as its inner codes, wherein information is encoded into an N×J array using S−1 outer codes and S inner codes, wherein S and J are non-negative integers, wherein a GCC is encoded in S stages, where at each stage s, new information bits and parities from a previous outer code are encoded into n=Kout s+1−Kout s codewords of C(s) and stored as n rows of the array, wherein C(s) is a codeword at stage s, wherein Kout s is the amount of data in an outer codeword at stage s, wherein the encoded rows are mapped into a coset array in which an s+1th outer code systematically encodes a current t=Kout s−Kout s+1 columns of the coset array, and wherein parities of obtained codewords are transmitted to a next inner code for encoding; providing a node v in a decoding path l at a depth d in the perfect binary tree with a vector, αv (l), of length 2d of soft information from a parent node, vp; computing, for every path in a list of paths in the binary tree, a vector, αv l (l), of length 2d-1 of soft information for a left child, vl of node v; providing node v with a vector, βv l (l), of length 2d-1 of hard decisions from the left child and using vector, βv l (l), together with αv (l), to create a soft information vector, αv r (l), of length 2d-1, and passing vector, αv r (l) to a right child of node v, providing node v with a vector, βv r (l), of length 2d-1 of hard decisions from its right child and using vector, βv r (l), together with βv r (l), to create a hard decision vector, βv, of 
length 2d of hard decisions, and passing vector, βv, to its parent node; when v is an ith leaf of the perfect tree, 0≤i<2n, then, for every path in the list of paths, updating two path metrics according to PMi (2l)=PMi-1 (l)+ln(1+exp(−αv,i (l))) and PMi (2l+1)=PMi-1 (l)+ln(1+exp(αv,i (l))), wherein PMi (l) is a log likelihood ratio of path (l) through leaf i representing a codeword; selecting L of 2L paths obtained by expanding a current L paths with a 0 bit or with a 1 bit with a lowest path metric; and if an lth path is expanded by a 0 bit, setting βv (l) to 0, otherwise, setting βv (l) to 1.
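The path metric update and path selection in the method above can be sketched as follows (illustrative Python; `math.log1p(math.exp(x))` computes ln(1+exp(x)) as in the update rule):

```python
import math

def expand_paths(path_metrics, leaf_llrs, L):
    """One leaf step of SCL decoding: each surviving path l, with
    metric PM[l] and leaf LLR leaf_llrs[l], is expanded by bit 0
    (penalty ln(1+exp(-llr))) and by bit 1 (penalty ln(1+exp(llr)));
    the L candidate expansions with the lowest metrics survive.
    Returns (metric, parent_path_index, bit) tuples."""
    candidates = []
    for l, (pm, llr) in enumerate(zip(path_metrics, leaf_llrs)):
        candidates.append((pm + math.log1p(math.exp(-llr)), l, 0))
        candidates.append((pm + math.log1p(math.exp(llr)), l, 1))
    candidates.sort()              # lowest path metric first
    return candidates[:L]

# a strongly positive LLR favors expanding by a 0 bit on the best path
survivors = expand_paths([0.0], [2.0], L=2)
assert survivors[0][2] == 0 and survivors[1][2] == 1
```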
  • According to a further embodiment of the disclosure, the method includes, when v is a frozen leaf, expanding all paths by a 0 bit and setting βv (l) to 0, for every path index, 1≤l≤L, wherein a frozen leaf forces a hard decision to be zeros.
  • According to a further embodiment of the disclosure, the method includes, for a node that is a root of a subtree with frozen leaves only (a RATE-0 node), for a node that is a root of a subtree with information leaves only (a RATE-1 node), and for a node that is a root of a subtree whose leaves are all frozen leaves, except for the rightmost leaf (a REP node), updating the L best path metrics and computing corresponding hard decision vectors without visiting other nodes in subtrees of the binary tree.
  • According to a further embodiment of the disclosure, for a RATE-0 node and a REP node, updating the L best path metrics is performed in parallel for all paths in the list, wherein a latency of the path metric update depends on a depth of the node and not on a list size.
  • According to a further embodiment of the disclosure, updating the L best path metrics comprises, for a RATE-1 node of depth d, sequentially calculating m=min{L−1; 2n-d} bits of the hard decision wherein a latency of the path metric update depends on a list size.
  • According to a further embodiment of the disclosure, the method includes selecting t paths, where t<L, for those leaves for which a correct path is likely to be among a best t paths; applying a CRC detector on a kth leaf; and continuing decoding until a number of paths equals one.
  • According to a further embodiment of the disclosure, determining those leaves for which a right path is likely to be among a best t<L paths comprises one or more of selecting those paths whose path metric is greater than a predetermined threshold, using machine learning to select the best t<L decoding paths, using a classifier to select the best t<L decoding paths, or using forward prediction on each path to select the best t<L decoding paths.
  • According to a further embodiment of the disclosure, different paths are processed in parallel, each path is associated with a memory that stores 2N−1 soft decision values, and each memory is assigned a unique processing block that performs soft calculations along the decoding tree.
  • According to a further embodiment of the disclosure, each node in the decoding tree stores soft calculations and hard decisions, wherein a hard decision is represented by one bit, a soft calculation is represented by a plurality of bits. The method further includes saving, for a node at depth d, 2n-d soft calculations and 2n-d hard decisions for each path in the list; and pruning paths from the decoding tree.
  • According to an embodiment of the disclosure, there is provided a method of simplified successive cancellation list (SSCL) error decoding of S-polar codes, including: representing an S-polar code of length N=2n as a perfect binary tree with 2N−1 nodes, wherein n is a non-negative integer, wherein, for an lth path in a list of L paths through the binary tree, 1≤l≤L, wherein an S-Polar code is a generalized concatenated code (GCC) with Reed-Solomon (RS) codes as its outer codes and polar codes as its inner codes, wherein information is encoded into an N×J array using S−1 outer codes and S inner codes, wherein S and J are non-negative integers, wherein a GCC is encoded in S stages, where at each stage s, new information bits and parities from a previous outer code are encoded into n=Kout s+1−Kout s codewords of C(s) and stored as n rows of the array, wherein C(s) is a codeword at stage s, wherein Kout s is the amount of data in an outer codeword at stage s, wherein the encoded rows are mapped into a coset array in which an s+1th outer code systematically encodes a current t=Kout s−Kout s+1 columns of the coset array, and wherein parities of obtained codewords are transmitted to a next inner code for encoding; receiving a vector of length 2n of log likelihood ratios (LLR), wherein each entry of the vector can take one of q possible values and L is a list size. For a first layer of decoding, soft calculations in a left child of a root, vl, is a vector αl of length 2n-1 which takes values in a set of size q2, and soft calculations in a right child of the root, vr, is a vector, αr, of length 2n-1 with entries taking values in a set of size 2q2, wherein αr,i depends on a hard decision bit βl,i. When 3q2f+N log2 q<<Nf, where f is a number of bits that represent a soft calculation value, the method includes determining a value of αl,i for coordinates 2i and 2i+1, for 0≤i<2n-1, by accessing a lookup table of size q2; and determining a value of αr,i by accessing a lookup table of size 2q2. For a second layer of decoding, a first node from a left has q4 possible values for soft calculations, a second node from the left has 2q4 possible values for soft calculations, a third node from the left has 4q4 possible values for soft calculations, and a last node has 8q4 possible values for soft calculations. When 15q4f+N log2 q<<Nf(L/2+1), the method includes determining a value of αl,i for a leftmost node by accessing a lookup table of size q4; determining a value of αl,i for a next leftmost node by accessing a lookup table of size 2q4; determining a value of αl,i for a next rightmost node by accessing a lookup table of size 4q4; and determining a value of αl,i for a rightmost node by accessing a lookup table of size 8q4.
  • According to an embodiment of the disclosure, there is provided a method of error decoding S-polar codes, including: representing an S-polar code of length N=2n as a perfect binary tree with 2N−1 nodes, wherein, for an lth path in a list of L paths through the binary tree, 1≤l≤L, wherein an S-Polar code is a generalized concatenated code (GCC) with Reed-Solomon (RS) codes as its outer codes and polar codes as its inner codes, wherein information is encoded into an N×J array using S−1 outer codes and S inner codes, wherein a GCC is encoded in S stages, where at each stage s, new information bits and parities from a previous outer code are encoded into n=Kout s+1−Kout s codewords of C(s) and stored as n rows of the array, wherein the encoded rows are mapped into a coset array in which an s+1th outer code systematically encodes a current t=Kout s−Kout s+1 columns of the coset array, and wherein parities of obtained codewords are transmitted to a next inner code for encoding; submitting a plurality of frames to multiple row decoders and counting a number of frames that were successfully decoded, wherein a frame is a plurality of rows of an S-polar code array that are decoded to codewords of C(0); performing Reed-Solomon (RS) decoding on the codewords of C(0) when a number of successfully decoded frames reaches K+2m, wherein K is a dimension of a codeword of C(0) and m is a number of mis-corrects, while decoding additional pluralities of frames at a same stage; and wherein when the RS decoding succeeds, decoding an upcoming plurality of frames at a next stage of decoding, using cosets from the RS decoding, otherwise, repeating RS decoding on an additional plurality of frames.
  • According to a further embodiment of the disclosure, decoding each row of an S-polar code array into codewords of C(0) includes: providing a node v in a decoding path l at a depth d in the perfect binary tree with a vector, αv (l), of length 2d of soft information from a parent node, vp; computing, for every path in a list of paths in the binary tree, a vector, αv l (l), of length 2d-1 of soft information for a left child, vl of node v; providing node v with a vector, βv l (l), of length 2d-1 of hard decisions from the left child and using vector, βv l (l), together with αv (l), to create a soft information vector, αv r (l), of length 2d-1, and passing vector, αv r (l) to a right child of node v; providing node v with a vector, βv r (l), of length 2d-1 of hard decisions from its right child and using vector, βv r (l), together with βv l (l), to create a hard decision vector, βv, of length 2d of hard decisions, and passing vector, βv, to its parent node; when v is an ith leaf of the perfect tree, 0≤i<2n, then, for every path in the list of paths, updating two path metrics according to PMi (2l)=PMi-1 (l)+ln(1+exp(−αv,i (l))) and PMi (2l+1)=PMi-1 (l)+ln(1+exp(αv,i (l))), wherein PMi (l) is a log likelihood ratio of path (l) through leaf i representing a codeword; selecting L of 2L paths obtained by expanding a current L paths with a 0 bit or with a 1 bit with a lowest path metric; and if an lth path is expanded by a 0 bit, setting βv (l) to 0, otherwise, setting βv (l) to 1.
  • According to a further embodiment of the disclosure, the method includes, when v is a frozen leaf, expanding all paths by a 0 bit and setting βv (l) to 0, for every path index, 1≤l≤L, wherein a frozen leaf forces a hard decision to be zeros.
  • According to a further embodiment of the disclosure, the method includes, for a node that is a root of a subtree with frozen leaves only (a RATE-0 node), for a node that is a root of a subtree with information leaves only (a RATE-1 node), and for a node that is a root of a subtree whose leaves are all frozen leaves, except for the rightmost leaf (a REP node), updating the L best path metrics and computing corresponding hard decision vectors without visiting other nodes in subtrees of the binary tree.
  • According to a further embodiment of the disclosure, for a RATE-0 node and a REP node, updating the L best path metrics is performed in parallel for all paths in the list, wherein a latency of the path metric update depends on a depth of the node and not on a list size.
  • According to a further embodiment of the disclosure, updating the L best path metrics includes, for a RATE-1 node of depth d, sequentially calculating m=min{L−1; 2n-d} bits of the hard decision wherein a latency of the path metric update depends on a list size.
  • According to a further embodiment of the disclosure, the method includes selecting t paths, where t<L, for those leaves for which a correct path is likely to be among a best t paths; applying a CRC detector on a kth leaf; and continuing decoding until a number of paths equals one.
  • According to a further embodiment of the disclosure, determining those leaves for which a right path is likely to be among a best t<L paths comprises one or more of selecting those paths whose path metric is greater than a predetermined threshold, using machine learning to select the best t<L decoding paths, using a classifier to select the best t<L decoding paths, or using forward prediction on each path to select the best t<L decoding paths.
  • According to a further embodiment of the disclosure, different paths are processed in parallel, each path is associated with a memory that stores 2N−1 soft decision values, and each memory is assigned a unique processing block that performs soft calculations along the decoding tree.
  • According to a further embodiment of the disclosure, each node in the decoding tree stores soft calculations and hard decisions, wherein a hard decision is represented by one bit, a soft calculation is represented by a plurality of bits. The method further includes saving, for a node at depth d, 2n-d soft calculations and 2n-d hard decisions for each path in the list; and pruning paths from the decoding tree.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 illustrates the structure of a codeword of an N×J GCC with S stages.
  • FIG. 2 illustrates message passing of a node in a binary tree, according to an embodiment of the disclosure.
  • FIG. 3 illustrates stepped SCL decoding in a pipeline, according to an embodiment of the disclosure.
  • FIG. 4(A) is an example of a perfect decoding tree of a polar code of length 16 and dimension 10 using stepped SCL decoding and stepped SSCL decoding, according to embodiments of the disclosure.
  • FIG. 4(B) is an example of a pruned tree of the stepped SSCL decoding of a polar code of length 16 and dimension 10, according to embodiments of the disclosure.
  • FIG. 5 illustrates the throughput of an efficient chunk scheduler, according to an embodiment of the disclosure.
  • FIG. 6 is a block diagram of a system for implementing methods of performing error correction of S-polar codes that shorten latency, according to an embodiment of the disclosure.
  • FIG. 7 is a flowchart of a method of simplified successive cancellation list (SSCL) error decoding of S-polar codes, according to embodiments of the disclosure.
  • FIG. 8 is a flowchart of a method of simplified successive cancellation list (SSCL) error decoding of S-polar codes for channels with a small output alphabet, according to an embodiment of the disclosure.
  • FIG. 9 is a flowchart of a method of operation of a latency improving scheduler, according to an embodiment of the disclosure.
  • DETAILED DESCRIPTION I. Introduction
  • Embodiments of the disclosure provide latency improving techniques for S-Polar codes that rely on stepped SCL decoding, early CRC detection, and the combination of stepped SCL decoding with simplified SCL decoding. When the channel output alphabet is small, memory size and latency are reduced by precalculating all possible outcomes of one of the first few layers in the tree and by using lookup tables. For GCC, embodiments provide a throughput efficient scheduler that is implemented on chunks of frames, to make use of hardware duplications of the row decoder.
  • GCC decoding is also performed in multiple stages. At the first stage, rows of the noisy array are decoded to codewords of C(0). These rows are called frames. The frames that were successfully decoded are mapped into cosets. If enough cosets were obtained, then the first outer code can decode the first Kin (0)−Kin (1) bits of the cosets for all frames, which in turn allows the remaining frames to be decoded with the better correctability of the code C(1). This process proceeds until either an outer code fails or all frames have been decoded successfully by the inner code decoders and all cosets have been decoded successfully by the outer code decoders.
  • Embodiments of the disclosure (1) apply CRC detection earlier and use a shorter list size (stepped SCL) when possible; (2) provide pipeline decoding for S-Polar frames that increases the throughput using the stepped SCL decoding, for given hardware resources, such as memories and processing units; (3) combine (1) and (2) with simplified SCL decoding to further decrease latency, memory size, and power; (4) perform precalculations that are stored in lookup tables to save calculations and memory for the massive first layers of the decoding tree for SCL decoding, in case the channel output alphabet is small; (5) provide a throughput efficient scheduler for GCC that applies row and column decoding simultaneously and reduces latency by decreasing the expected number of frames that are being decoded. If the row decoder is duplicated, the scheduler can be applied on chunks of frames.
  • To achieve high performance, the polar codes in the S-Polar are decoded with a Successive Cancellation List (SCL) decoder. The SCL decoder outputs a list of L codewords, each with a score that indicates how likely that codeword is to be the correct codeword. Performance is further improved by using CRC-aided polar codes to detect the correct word from the list. However, the SCL decoder has a high latency. The latency of the SCL decoder can be decreased by using Simplified Successive Cancellation List (SSCL) decoding. A simplified method was originally proposed for successive cancellation decoding, which is the same as SCL with a list of size one. In this technique, the decoding tree of the list decoder is pruned and hard decision procedures on the leaves of the tree are defined, such that the same decisions of the SCL decoder are obtained, and thus there is no performance loss. The decoding tree of the SCL decoder is a perfect binary tree with N=2^n leaves which the list decoder traverses sequentially, and thus the size of this tree determines the latency of the SCL decoder. By pruning the tree, the SSCL has fewer operations that are performed sequentially, which results in a latency improvement.
  • For every node in the tree, the number of operations that can potentially be performed in parallel depends on the size of the list. A useful observation is that the list size does not have to be the same throughout the decoding process, and there are points at which the list size L can be much smaller without affecting the performance. Such a list decoder is known as a stepped SCL decoder. For example, the leaves of the tree can be split into four groups of N/4 consecutive leaves, and a different list size used for each group.
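For illustration, such a stepped schedule can be sketched as follows; the particular group sizes (8, 16, 32, 32) and the function name are assumptions for the example, not values fixed by the disclosure:

```python
def stepped_list_size(leaf_index, n_leaves, group_sizes=(8, 16, 32, 32)):
    """Return the SCL list size to use at a given leaf of the decoding
    tree: the n_leaves = 2**n leaves are split into four groups of
    n_leaves // 4 consecutive leaves, each decoded with its own list
    size (gradually increasing, as in an optimized stepped schedule)."""
    group = min(leaf_index // (n_leaves // 4), 3)
    return group_sizes[group]

# With N = 16 leaves: leaves 0-3 use list size 8, 4-7 use 16, 8-15 use 32.
sizes = [stepped_list_size(i, 16) for i in range(16)]
```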
  • According to an embodiment, using simulations, the list sizes are optimized for each leaf so that the performance loss is minimized. In an optimal series of list sizes, the list size gradually increases until it reaches a maximal size. In addition, the CRC code that is used to narrow down the L codewords in the list to one codeword can be applied earlier in the decoding process, and thus, at some point of the decoding, the list size can be reduced to one and remains that way until the decoding is over, with a small performance loss. According to an embodiment, combining stepped SCL, early CRC, and SSCL into one decoding technique can provide a latency improvement in practical situations in which memory and processing units are limited. This combined technique is described in Section II. Note that both the simplified SCL and the stepped SCL methods, as well as early CRC, also save time complexity and memory.
  • A stepped SCL decoder according to an embodiment can also be exploited by the S-Polar decoder to decode a number of rows simultaneously, while sharing the memory and processing units between different frames, thus allowing better usage of the available hardware. When the list size is small, more rows can be processed simultaneously using the available resources. When the list size is increased, each row requires more processing units and thus fewer rows can be processed, until the point at which the list size drops to one again by the early CRC detection. This sharing of resources between multiple rows results in a further reduction in the overall latency. This method is described in Section II.
  • In an embodiment, if the noisy channel has a small output alphabet, then the processing and saving of the first few layers of the tree, which have the largest amount of data to be saved and the largest number of operations, can be skipped. Although operations on a node can be performed in parallel, this requires more logical gates, which may not be present in a real hardware implementation. To skip the layers, lookup tables are used that include all possible results of the values in the first layers to be kept. The number of layers saved this way depends on the size of the channel output alphabet, the length of the polar code, and the list size.
  • Finally, in an embodiment, another throughput improvement for the S-Polar code is attained by choosing a scheduler that is more throughput oriented. The scheduler of the S-Polar decoder decides when to apply the outer code decoder and when to apply the inner code decoder. A naive scheduler will decode all rows, then apply the outer code decoder only for the next outer code, and repeat the process with the remaining rows that were not successfully decoded. A throughput-oriented scheduler can decode rows only until the number of decoded rows is large enough for the next outer code decoder to succeed. It can also keep applying outer code decoders while it can, returning to the row decoder at a higher stage, in which the decoding error probability is lower. Although such scheduling increases throughput, it also has the potential to decrease performance, since the outer code decoder is more likely to produce mis-corrects. However, the performance loss can be mitigated by increasing the number of corrected rows that are required by the scheduler before it calls the outer code decoder. In addition, a hardware architecture can implement a throughput-oriented scheduler in which a chunk of C rows is decoded simultaneously by the row decoder. This hardware architecture is described in Section IV.
  • II. Stepped SSCL for S-Polar Decoder
  • In this section the concepts of SSCL decoding and stepped SCL decoding are separately described, and then these two concepts are combined into stepped SSCL decoding.
  • A. SSCL Decoding
  • According to an embodiment, SCL decoding of a polar code of length N=2^n is based on message passing of soft and hard decisions through the nodes of a perfect binary tree with 2N−1 nodes. Since the message passing is performed sequentially on the nodes of the tree, the latency of the decoder increases with the size of the tree. When the decoder reaches a leaf of the tree it makes a hard decision on a single bit, which it propagates back to the parent node. If the list size is L, the hard decision is performed by taking the most likely L paths of the 2L paths that are obtained by expanding the current L paths with a 0 bit or with a 1 bit. Some leaves, called frozen leaves, force the hard decision of the decoder to be zero, hence all paths in the list are expanded with a 0 bit. SSCL decoding is based on the observation that some nodes of the tree have enough information available such that hard decisions can be made on these nodes efficiently, without having to traverse the subtrees of these nodes. Currently, there are three types of nodes that can be handled this way by the SSCL.
  • (1) RATE-0 node: This node is a root of a subtree with frozen leaves only, hence the hard decisions in this node must also be zeros and there is no need to visit other nodes in its subtree.
  • (2) RATE-1 node: This node is a root of a subtree with information leaves only, hence the hard decisions in this node can be obtained from the soft decisions available in this node and again, there is no need to visit other nodes in its subtree.
  • (3) REP node: This node is a root of a subtree whose leaves are all frozen leaves, except for the rightmost leaf, which corresponds to the largest index. For such a node, the hard decision can be all-zeros or all-ones, and both options are considered for each path, without having to visit other nodes in the REP node subtree.
  • In addition to the three node types above, there is another type that can be used for successive cancellation decoding, which is equivalent to SCL with a list of size one.
  • (4) SPC node: This node is a root of a subtree whose leaves are all information leaves, except for the leftmost leaf, which corresponds to the smallest index.
  • This method has no effect on the frame error rate (FER).
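A minimal sketch of how the four node types above can be recognized from the frozen pattern of a subtree's leaves; the function name and the True-means-frozen convention are illustrative assumptions:

```python
def classify_node(frozen):
    """Classify the root of a subtree by the frozen pattern of its
    leaves, listed left to right (True = frozen leaf), following the
    four SSCL node types; 'OTHER' means no special structure applies."""
    if all(frozen):
        return "RATE-0"   # all leaves frozen
    if not any(frozen):
        return "RATE-1"   # all leaves carry information
    if all(frozen[:-1]) and not frozen[-1]:
        return "REP"      # only the rightmost leaf is an information leaf
    if frozen[0] and not any(frozen[1:]):
        return "SPC"      # only the leftmost leaf is frozen
    return "OTHER"
```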
  • The message passing of a node in the tree is shown in FIG. 2 , according to an embodiment of the disclosure. In FIG. 2 , the central node v represents the current node, nodes vl and vr are the left and right child nodes, respectively, of the current node, and pv is the parent node of the current node.
  • FIG. 7 is a flowchart of a method of simplified successive cancellation list (SSCL) error decoding of S-polar codes, according to embodiments of the disclosure. After representing the S-polar code of length N=2^n as a perfect binary tree with 2N−1 nodes, at step 71, for the lth path in the list, 1≤l≤L, a node v at depth d receives, at step 72, a vector, α_v^(l), of length 2^d of soft information from its parent node, v_p. At step 73, for every path in the list, node v computes a vector, α_{v_l}^(l), of length 2^(d−1) of soft information for its left child, v_l. At step 74, node v receives a vector, β_{v_l}^(l), of length 2^(d−1) of hard decisions from its left child and uses it, together with α_v^(l), to create a vector, α_{v_r}^(l), of length 2^(d−1), which it passes to its right child. At step 75, v receives a vector, β_{v_r}^(l), of length 2^(d−1) of hard decisions from its right child and uses it, together with β_{v_l}^(l), to create a vector, β_v, of length 2^d of hard decisions, which it passes to its parent node. If, at step 76, v is the ith leaf of the perfect tree, 0≤i<2^n, then, for every path in the list, two path metrics are updated according to

  • PM_i^(2l) = PM_{i−1}^(l) + ln(1 + exp(−α_{v,i}^(l)))

  • and

  • PM_i^(2l+1) = PM_{i−1}^(l) + ln(1 + exp(α_{v,i}^(l)));
  • and, at step 77, the L paths with lowest path metric among the 2L paths are selected. The path metrics PM_i^(l) are the log-likelihood ratios of path l through leaf node i being a correct decoding path that represents a codeword. At step 78, if the lth path was expanded by a 0 bit, then β_v^(l) is set to 0. Otherwise, β_v^(l) is set to 1. If v is a frozen leaf, then all paths are expanded by a 0 bit and β_v^(l) is set to 0, for every path index, 1≤l≤L.
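The leaf update of steps 76 and 77 can be sketched as follows; this is a simplified model assuming an LLR convention in which a positive soft value favors a 0 bit, and the function and variable names are illustrative:

```python
import math

def expand_paths(path_metrics, llrs, L):
    """One SCL leaf step: expand each surviving path with a 0 bit and a
    1 bit, penalizing the expansion that disagrees with the soft value,
    then keep the L paths with the lowest metric.

    path_metrics[l] holds PM_{i-1}^(l); llrs[l] holds the soft value
    alpha_{v,i}^(l) seen by path l at this leaf (positive favors 0).
    Returns a list of (parent_path_index, bit, new_metric), best first.
    """
    candidates = []
    for l, (pm, a) in enumerate(zip(path_metrics, llrs)):
        # expansion by a 0 bit: PM_i^(2l)   = PM_{i-1}^(l) + ln(1+e^{-a})
        candidates.append((l, 0, pm + math.log1p(math.exp(-a))))
        # expansion by a 1 bit: PM_i^(2l+1) = PM_{i-1}^(l) + ln(1+e^{+a})
        candidates.append((l, 1, pm + math.log1p(math.exp(a))))
    candidates.sort(key=lambda c: c[2])
    return candidates[:L]
```

A hardware decoder would typically replace ln(1+e^{±a}) with the approximation max(0, ±a) to avoid the exponential.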
  • For nodes of types RATE-0, RATE-1, and REP, the L best path metrics can be updated and the corresponding hard decision vectors can be computed without the need to visit other nodes in the subtrees. Note that for nodes of type RATE-0 and REP, the path metric update can be performed in parallel for all paths in the list, and the latency of the path metric update mainly depends on the depth of the node and not on the list size. However, for a RATE-1 node of depth d, m=min{L−1, 2^(n−d)} bits of the hard decision vectors are calculated sequentially and hence, the list size for this node type does affect the latency of its path metric update.
  • B. Stepped SCL Decoding
  • In an embodiment, recall that in SCL decoding with a list of size L, at each leaf 2L path metrics are computed and the L paths with lowest path metric are chosen. The probability of the correct path being among the best t<L paths varies from one leaf to another. At some leaves, the correct path is likely to be among the best t paths and thus, decreasing the list size to t<L might have an insignificant impact on the FER. Moreover, since SCL is often applied to CRC-aided polar codes, only part of the information bits are encoded with the CRC, which allows applying the CRC detector at the kth leaf instead of at the end of the decoding. Once the CRC detector is applied, the decoding continues until the path list is of size one, i.e., with successive cancellation decoding. Notice that, for the SC decoding, there are also nodes of type SPC.
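As a toy sketch of the early-CRC idea, the list can be narrowed to one path as soon as the CRC-protected prefix has been decoded; the 8-bit CRC, its polynomial, and the function names below are illustrative assumptions, not the CRC of the disclosure:

```python
def crc8(bits):
    """Toy CRC-8 (polynomial x^8+x^2+x+1, i.e. 0x107) over a bit list,
    standing in for the CRC used in CRC-aided polar codes."""
    reg = 0
    for b in bits:
        reg = ((reg << 1) | b) & 0x1FF
        if reg & 0x100:
            reg ^= 0x107
    return reg & 0xFF

def early_crc_select(paths, k):
    """Once the first k bits (payload plus 8 CRC bits) of each path have
    been decoded, keep only the best-metric path whose prefix passes the
    CRC; the rest of the decoding proceeds with a list of size one.
    paths is a list of (bit_list, path_metric) pairs."""
    for bits, metric in sorted(paths, key=lambda p: p[1]):
        payload, check = bits[:k - 8], bits[k - 8:k]
        if crc8(payload) == int("".join(map(str, check)), 2):
            return bits, metric
    return None  # no path passes: decoding failure
```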
  • Example 1. Using statistics on the location of the correct path in the list for a CRC-aided polar code of length 512 and dimension 480, it was found that, by taking the list size to be 32 until the leaf of index 256 and then increasing the list size to 64, almost the same FER is obtained as with SCL decoding with a fixed list of size 64. Moreover, the CRC detector can be applied when reaching the leaf of index 300, and SC decoding can be used on the remaining 212 leaves without a significant FER degradation.
  • According to an embodiment, in a hardware implementation of SCL decoding, different paths are processed in parallel. Each path requires its own memory that stores 2N−1 soft decision values, and each memory is assigned its own processing block that performs the soft calculations along the decoding tree. The S-Polar setting benefits from the fact that many frames must be decoded, and by allocating the memories efficiently in a pipeline decoding, it can reduce latency without a significant FER loss. The latency reduction is due to the fact that the majority of the memories are released earlier, and hence decoding of new frames in the pipeline can begin while the previous frames are decoded in the successive cancellation decoding mode. The following example illustrates the stepped SCL pipeline decoding from Example 1, when 4×32 memories are utilized.
  • Example 2. The stepped SCL decoding from Example 1 includes three phases. In phase 1 the list size is 32; in phase 2 the list size is 64; and in phase 3 the list size is one. Since phase 2 only lasts from leaf index 256 until leaf index 300, it has the lowest latency. Assume that the latency of phase 2 is x cycles and that the latency of phase 1 is 2x cycles. Assume also that there are a total of 4×32 memories. The decoding starts in phase 1 with the first frame, and after x cycles the decoding starts in phase 1 with the second frame, using 2×32 memories for both frames. After 2x cycles, another 32 memories are used to decode the first frame in phase 2 and another 32 memories are used to decode the third frame in phase 1. When the decoding in phase 2 of the first frame is completed, 2×32 memories are released and can be used to decode the second frame in phase 2 and a new frame in phase 1. At the same time, the first frame can be decoded in phase 3 using a separate memory. This way the decoding of phase 2 is done sequentially, one frame at a time, and the latency per frame is roughly x cycles. FIG. 3 illustrates stepped SCL decoding in a pipeline, according to an embodiment of the disclosure. For each frame, phases 1, 2, and 3 are depicted with the leftmost, the middle, and the rightmost rectangles, respectively. The latency per frame is about x cycles, where x is the latency of phase 2, depicted by the middle rectangle.
  • Alternatively, in an embodiment, the 4×32 memories can be split between two frames and phases 1 and 2 can be applied to two frames simultaneously. In this case the latency per frame will be the average of the latencies of phases 1 and 2, which is 1.5x cycles.
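One way to see why the pipelined schedule of Example 2 approaches x cycles per frame is a simple bank-cycle accounting; this is a simplified model introduced here for illustration, not a calculation from the disclosure:

```python
def min_cycles_per_frame(phases, total_banks):
    """Lower bound on steady-state cycles per frame: each phase of the
    stepped decoding occupies `banks` memory banks for `cycles` cycles,
    so one frame consumes sum(banks * cycles) bank-cycles, while the
    hardware supplies at most `total_banks` bank-cycles per cycle.
    phases is a list of (banks, cycles) pairs."""
    bank_cycles = sum(banks * cycles for banks, cycles in phases)
    return bank_cycles / total_banks

# Example 2: phase 1 uses 32 banks for 2x cycles, phase 2 uses 64 banks
# for x cycles (x normalized to 1.0); phase 3 memory is negligible.
bound = min_cycles_per_frame([(32, 2.0), (64, 1.0)], total_banks=128)
```

With these numbers the bound is exactly x, matching the claim that phase 2, run sequentially one frame at a time, paces the pipeline.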
  • C. Stepped SSCL Decoding
  • Subsection II-A reviewed the SSCL decoding and described how it reduces latency by pruning the decoding tree. Subsection II-B suggested that stepped SCL decoding with early CRC detection can decrease the latency of the decoding, when frames are decoded in a pipeline for a given number of memories, since memories are released earlier and become available for new frames. This subsection will describe the effects of embodiments of the disclosure that combine the two concepts of stepped and simplified SCL decoding.
  • Since the latency of stepped SCL decoding in a pipeline is governed by the latency of the phase with the maximum list size, simplified SCL decoding can reduce the overall latency by reducing the latency of this phase. For the other phases, it reduces memory size, which allows duplication of hardware and decreases latency. The combined decoding also saves power.
  • To understand the efficiency of the combined concepts, examine the decoding tree and the memory size needed at each node. Each node in the decoding tree stores soft calculations and hard decisions. While a hard decision is only one bit, a soft calculation is represented by a number of bits. If the node is at depth d, then 2^(n−d) soft calculations and 2^(n−d) hard decisions are saved for each path in the list. In stepped SCL decoding, the list size at a given node might not be the same for soft calculations and hard decisions, since the former is used when traversing down from a parent node to its child node, and the latter is used later on, when traversing up from a child node to its parent node.
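The per-node memory described above can be written out directly; the function name and bit accounting are assumptions consistent with the text:

```python
def node_memory_bits(n, d, soft_list_size, hard_list_size, f):
    """Bits held at a depth-d node of the decoding tree of a length
    2**n polar code: 2**(n-d) soft values of f bits each, per path in
    the soft-calculation list, plus 2**(n-d) one-bit hard decisions per
    path in the hard-decision list.  In stepped SCL decoding, the two
    list sizes at the same node may differ."""
    values = 2 ** (n - d)
    return values * (soft_list_size * f + hard_list_size)

# e.g. a depth-1 node of a length-16 code (n=4), with soft list size 8,
# hard list size 1, and 7-bit soft values:
bits = node_memory_bits(4, 1, 8, 1, 7)
```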
  • An example of a decoding tree of a polar code of length 16 and dimension 10 using stepped SCL decoding and stepped SSCL decoding according to embodiments of the disclosure is shown in FIG. 4 . The perfect tree is shown in FIG. 4(A). The white leaves are frozen leaves and the black leaves are information leaves. The list sizes at each leaf are written below the leaf, and the early CRC detection takes place at the 10th leaf, reducing the list size to one.
  • The pruned tree of the stepped SSCL decoding for this code is shown in FIG. 4(B). The number assigned to each directed edge is the list size when traversing the tree on that edge. The list size for soft calculations at the root of the tree (layer 0) is one, and for its hard decisions the list size is 8. Notice that, once the CRC detector is applied, the list size is one, and thus nodes of type SPC can also be utilized. In the first layer from the left: the first node has list size one for soft calculations and list size 8 for hard decisions, and the second node is much more memory consuming, with list size 8 for soft calculations and list size one for hard decisions. Notice that, when traversing from a node to its right child, the memory of the soft calculation is no longer needed and can be released. Similarly, when traversing from a node to its parent node, the hard decisions of this node are no longer needed. For example, the left child of the root has list size one for soft calculations. This means that when the decoding reaches the right child of the root it uses memory for 7 more soft calculation vectors. In the second layer from the left: the first and last nodes are leaves and therefore at these nodes only the path metrics need to be updated, which does not use much memory, since it is one value per path. The second node has a list of size two for soft calculations and a list of size 8 for hard decisions, and the third node has a list of size 8 for soft calculations and a list of size one for hard decisions. Again, when reaching the third node in the second layer, memory is added for 6 more soft calculation vectors. All nodes in the last layer are leaves and do not need additional memory other than the memory for the path metrics, which is relatively small. Thus, by combining stepped SCL with SSCL decoding, only two nodes in the pruned tree use the maximum list size for their soft calculations.
  • Another interesting observation is that in the last phase of the decoding, when the list size is one, nodes of type SPC can be considered, and thus further prune the tree in this phase, saving latency, power, and memory. The pruned tree in FIG. 4(B) has an SPC node.
  • III. Skipping the First Layers of the Decoding Tree in Channels with Small Output Alphabet
  • This section describes a method according to an embodiment that saves memory and soft calculations for the first layers of the decoding tree of the SCL decoding, when the channel output alphabet is small, and the code length and list size are large. In an embodiment, a method described here is applicable to stepped and simplified SCL decoding as well.
  • Let q be the number of channel outputs, N=2^n the polar code length, and L the list size. FIG. 8 is a flowchart of simplified successive cancellation list (SSCL) error decoding of S-polar codes for channels with a small output alphabet, according to an embodiment. In the beginning of the decoding, assuming that the S-polar code has been represented as a perfect binary tree at step 81, the decoder receives, at step 82, a vector of length 2^n of log-likelihood ratios (LLRs), where each entry of the vector can take one of q possible values. The soft calculation in the left child of the root, v_l, is a vector α_l of length 2^(n−1) whose entries take values in a set of size q^2. Instead of calculating these values, it is enough to know the output of the channel in coordinates 2i and 2i+1, for 0≤i<2^(n−1), and a lookup table of size q^2 can be used to access the value of α_{l,i} directly. Similarly, the soft calculation in the right child of the root, v_r, is a vector, α_r, of length 2^(n−1) with entries taking values in a set of size 2q^2, since α_{r,i} depends also on the hard decision bit β_{l,i}. A lookup table of size 2q^2 is used to find the value of α_{r,i}. Thus, to skip entirely the soft calculations of the first layer and to save the memory of layer zero, lookup tables of total size 3q^2 are used and the channel outputs can be represented by N log_2 q bits. If, at step 83, 3q^2·f + N log_2 q << N·f, where f is the number of bits that represent a soft calculation value, then this technique saves both memory and calculations. Then, at step 84, values of α_{l,i} are determined by accessing the lookup table of size q^2, and values of α_{r,i} are determined by accessing the lookup table of size 2q^2.
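A sketch of such precalculation for the first layer, assuming binary-input LLR combining (the exact boxplus f-function for the left child and the usual g-function for the right child); the symbol-to-LLR map and all names are illustrative assumptions:

```python
import itertools
import math

def build_first_layer_luts(symbol_llrs):
    """Precompute the first-layer soft calculations for every pair of
    channel output symbols.  symbol_llrs maps each of the q channel
    symbols to its LLR.  Returns (f_lut, g_lut): f_lut has q**2 entries
    keyed by (sym_2i, sym_2i+1); g_lut has 2*q**2 entries, one per
    value of the left-child hard decision bit."""
    def f(a, b):       # left-child update (exact boxplus; hardware
        return 2 * math.atanh(math.tanh(a / 2) * math.tanh(b / 2))
                       # would typically use the min-sum approximation)
    def g(a, b, bit):  # right-child update
        return b + (1 - 2 * bit) * a
    syms = list(symbol_llrs)
    f_lut = {(s, t): f(symbol_llrs[s], symbol_llrs[t])
             for s, t in itertools.product(syms, syms)}
    g_lut = {(s, t, bit): g(symbol_llrs[s], symbol_llrs[t], bit)
             for s, t in itertools.product(syms, syms) for bit in (0, 1)}
    return f_lut, g_lut
```

During decoding, α_{l,i} and α_{r,i} are then read from f_lut and g_lut using only the stored channel symbol indices, never the f-bit LLR values themselves.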
  • This idea can be applied to the second layer, in which there are q^4 possible values for soft calculations in the first node from the left, 2q^4 for the second node from the left, 4q^4 for the third node, and 8q^4 for the last node of this layer. Hence, using a table of size (1+2+4+8)q^4=15q^4, the first layer can be saved entirely and all the soft calculations of the second layer can be performed directly from the lookup table, the channel output vector, and the hard decisions of the second layer, for each path in the list. If, at step 85, 15q^4·f + N log_2 q << f·N·(L/2+1), then memory can be saved, calculations of the first layer can be entirely skipped, and the soft calculations of layer two can be updated using the lookup table. Then, at step 86, a soft value for the leftmost node is determined by accessing a lookup table of size q^4; a value for the next node is determined by accessing a lookup table of size 2q^4; a value for the third node is determined by accessing a lookup table of size 4q^4; and a value for the rightmost node is determined by accessing a lookup table of size 8q^4.
  • Example 3. Assume that q=2, n=9, L=16, and f=7. 2^9=512 bits are needed to represent the channel output vector instead of 7×512=3584. To update nodes in the second layer directly using the lookup table, 15×2^4×7=1680 bits are needed, and the memory of the first layer is saved, which is 7×16×2^8=28,672 bits. Overall, 28,672+(3584−512)−1680=30,064 bits are saved. Since the entire memory size needed is 512×16×7+512×7=60,928 bits, in this example about half the memory was saved and latency was reduced.
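The arithmetic of Example 3 can be checked mechanically; this sketch assumes, consistent with the totals in the example, that the overall memory counts the sixteen per-path soft memories plus the channel LLR vector:

```python
import math

# Parameters of Example 3.
q, n, L, f = 2, 9, 16, 7
N = 2 ** n

channel_bits = N * int(math.log2(q))   # compact channel representation: 512
naive_channel_bits = f * N             # storing the LLRs directly: 3584
lut_bits = 15 * q ** 4 * f             # second-layer lookup tables: 1680
layer1_bits = f * L * 2 ** (n - 1)     # first-layer soft memory: 28,672

# Net saving: drop layer 1, compress the channel vector, pay for the LUTs.
saved = layer1_bits + (naive_channel_bits - channel_bits) - lut_bits
# Baseline: L per-path soft memories of N values plus the channel LLRs.
total = N * L * f + N * f
```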
  • IV. Latency Improving Scheduler
  • According to an embodiment, when decoding a GCC with RS code as its outer code, for the S-Polar code, there is a degree of freedom to determine the number of frames to be decoded in a certain stage, before trying to decode the cosets of that stage using an RS decoder. The block of the decoder that determines which frames to decode and when to call the RS decoder is called a scheduler. This section describes an efficient scheduler according to an embodiment and also describes its architecture.
  • A naive scheduler decodes, at every stage, all frames that were not yet successfully decoded. In this approach, the RS decoder must wait until the row decoder has finished. An approach according to an embodiment decodes only enough frames for the RS decoder to be likely to succeed. In such an approach, the RS decoder and the row decoder can work simultaneously, which increases throughput. In addition, many frames are then decoded at later stages, which have a greater probability of success, and hence, on average, an approach according to an embodiment decreases the expected number of times the row decoder is used, which increases throughput.
  • To determine how many successful frames are needed for the RS decoder to have a good probability of success, use the fact that an RS code is a Maximum Distance Separable (MDS) code. Therefore, if J is the length of the RS code and K is its dimension, a successful decoding is guaranteed if the number of erased frames, for example, frames that were not yet decoded or whose decoding failed, plus twice the number of mis-corrected frames is at most J−K. This can be written as n_e + 2n_me ≤ J−K, where n_e is the number of erased frames and n_me is the number of mis-corrected frames. If the probability of having more than m mis-corrects is insignificant, then having K+2m decoded frames would be enough. This is because there will be n_e = J−K−2m and n_me ≤ m, and hence n_e + 2n_me ≤ J−K−2m+2m = J−K, as required. The number m can be determined using the probabilities of failure and mis-correct of the row decoder at each stage and the desired total probability of failure.
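The counting argument above can be captured in two small helpers; the function names are illustrative:

```python
def frames_needed(J, K, m):
    """Number of successfully decoded frames after which the RS decoder
    is called: with at most m mis-corrects among them, there remain
    n_e = J - (K + 2m) erasures and n_me <= m mis-corrects, which
    satisfy n_e + 2*n_me <= J - K."""
    return K + 2 * m

def rs_decoding_guaranteed(J, K, n_e, n_me):
    """Erasure/error condition under which decoding of an MDS (RS) code
    of length J and dimension K is guaranteed to succeed."""
    return n_e + 2 * n_me <= J - K
```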
  • For a hardware implementation according to an embodiment, a throughput-efficient scheduler can operate on groups of C frames called chunks. FIG. 9 is a flowchart of a method of operation of a latency improving scheduler, according to an embodiment of the disclosure. Assuming that the S-polar code has been represented as a perfect binary tree at step 91, the scheduler submits, at step 92, C frames to multiple row decoders and counts the number of decoded frames, for example, frames that were successfully decoded. The row decoders decode the frames as described above. Once this number reaches K+2m, RS decoding at this stage begins at step 93, while more chunks of frames are decoded at the same stage. If, at step 94, the RS decoder succeeds, the upcoming chunk will be decoded at the next stage of decoding, at step 95, using the cosets from the RS decoder. Otherwise, at step 96, RS decoding is repeated on more frames, since new chunks of frames are being decoded at this point.
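The per-stage control flow of steps 92 through 96 can be sketched as a loop over chunks; `decode_frame` and `rs_decode` are caller-supplied callables standing in for the row decoders and the RS decoder, and all names are illustrative:

```python
def chunk_scheduler(n_frames, C, threshold, decode_frame, rs_decode):
    """Sketch of the chunk scheduler for one stage: submit chunks of C
    frames to the row decoders, count the successes, and call the RS
    decoder as soon as `threshold` (= K + 2m) frames have been decoded;
    if RS decoding fails, keep decoding chunks and retry with more
    decoded frames.  Returns (stage_succeeded, frames_decoded)."""
    decoded = 0
    for start in range(0, n_frames, C):
        chunk = range(start, min(start + C, n_frames))
        decoded += sum(1 for frame in chunk if decode_frame(frame))
        if decoded >= threshold and rs_decode(decoded):
            return True, decoded   # advance to the next stage
    return False, decoded          # stage failed
```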
  • FIG. 5 illustrates the throughput of an efficient chunk scheduler, according to an embodiment of the disclosure. In FIG. 5 , RSD refers to a Reed-Solomon decoder. At (1), decoding of the ith chunk at stage 0 begins. At (2), decoding of the ith chunk at stage 0 ends and decoding of the i+1st chunk at stage 0 begins. There are also enough decoded frames for the RS decoder at stage 1 to begin. At (3), the RS decoder has succeeded and the i+2nd chunk is decoded at stage 1. At (4), decoding of the i+2nd chunk at stage 1 ends, but there are not enough decoded frames to apply the RS decoder of stage 2. In addition, decoding of the i+3rd chunk at stage 1 begins. At (5), decoding of the i+3rd chunk at stage 1 ends and decoding of the i+4th chunk at stage 1 begins. There are enough decoded frames and the RS decoder at stage 2 begins.
  • System Implementations
  • It is to be understood that embodiments of the present disclosure can be implemented in various forms of hardware, software, firmware, special purpose processes, or a combination thereof. In one embodiment, the present disclosure can be implemented in hardware as an application-specific integrated circuit (ASIC), or as a field programmable gate array (FPGA). In another embodiment, the present disclosure can be implemented in software as an application program tangibly embodied on a computer readable program storage device. The application program can be uploaded to, and executed by, a machine comprising any suitable architecture.
  • FIG. 6 is a block diagram of a system for implementing methods of performing error correction of S-polar codes that shorten latency, according to an embodiment of the disclosure. Referring now to FIG. 6 , a computer system 61 for implementing the present disclosure can comprise, inter alia, a central processing unit (CPU) or controller 62, a memory 63 and an input/output (I/O) interface 64. The computer system 61 is generally coupled through the I/O interface 64 to a display 65 and various input devices 66 such as a mouse and a keyboard. The support circuits can include circuits such as cache, power supplies, clock circuits, and a communication bus. The memory 63 can include random access memory (RAM), read only memory (ROM), disk drive, tape drive, etc., or a combination thereof. The present disclosure can be implemented as a routine 67 that is stored in memory 63 and executed by the CPU or controller 62 to process the signal from the signal source 68. As such, the computer system 61 is a general purpose computer system that becomes a specific purpose computer system when executing the routine 67 of the present disclosure. Alternatively, as described above, embodiments of the present disclosure can be implemented as an ASIC or FPGA 67 that is in signal communication with the CPU or controller 62 to process the signal from the signal source 68.
  • The computer system 61 also includes an operating system and micro instruction code. The various processes and functions described herein can either be part of the micro instruction code or part of the application program (or combination thereof) which is executed via the operating system. In addition, various other peripheral devices can be connected to the computer platform such as an additional data storage device and a printing device.
  • It is to be further understood that, because some of the constituent system components and method steps depicted in the accompanying figures can be implemented in software, the actual connections between the systems components (or the process steps) may differ depending upon the manner in which the present disclosure is programmed. Given the teachings of the present disclosure provided herein, one of ordinary skill in the related art will be able to contemplate these and similar implementations or configurations of the present disclosure.
  • While the present disclosure has been described in detail with reference to exemplary embodiments, those skilled in the art will appreciate that various modifications and substitutions can be made thereto without departing from the spirit and scope of the disclosure as set forth in the appended claims.

Claims (20)

1. A method of simplified successive cancellation list (SSCL) error decoding of S-polar codes, comprising:
representing an S-polar code of length N=2^n as a perfect binary tree with 2N−1 nodes, wherein n is a non-negative integer, wherein, for an lth path in a list of L paths through the binary tree, 1≤l≤L, wherein an S-Polar code is a generalized concatenated code (GCC) with Reed-Solomon (RS) codes as its outer codes and polar codes as its inner codes, wherein information is encoded into an N×J array using S−1 outer codes and S inner codes, wherein S and J are non-negative integers, wherein a GCC is encoded in S stages, where at each stage s, new information bits and parities from a previous outer code are encoded into n=K_out^(s+1)−K_out^(s) codewords of C(s) and stored as n rows of the array, wherein C(s) is a codeword at stage s, wherein K_out^(s) is the amount of data in an outer codeword at stage s, wherein the encoded rows are mapped into a coset array in which an s+1th outer code systematically encodes a current t=K_out^(s)−K_out^(s+1) columns of the coset array, and wherein parities of obtained codewords are transmitted to a next inner code for encoding;
providing a node v in a decoding path l at a depth d in the perfect binary tree with a vector, α_v^(l), of length 2^d of soft information from a parent node, v_p;
computing, for every path in a list of paths in the binary tree, a vector, α_{v_l}^(l), of length 2^(d−1) of soft information for a left child, v_l, of node v;
providing node v with a vector, β_{v_l}^(l), of length 2^(d−1) of hard decisions from the left child and using vector β_{v_l}^(l), together with α_v^(l), to create a soft information vector, α_{v_r}^(l), of length 2^(d−1), and passing vector α_{v_r}^(l) to a right child of node v;
providing node v with a vector, β_{v_r}^(l), of length 2^(d−1) of hard decisions from its right child and using vector β_{v_r}^(l), together with β_{v_l}^(l), to create a hard decision vector, β_v, of length 2^d of hard decisions, and passing vector β_v to its parent node;
when v is an ith leaf of the perfect tree, 0≤i<2^n, then, for every path in the list of paths, updating two path metrics according to

PM_i^(2l) = PM_{i−1}^(l) + ln(1 + exp(−α_{v,i}^(l)))

and

PM i (2l+1) =PM i-1 (l)+ln(1+exp(−αv,i (l))),
wherein PMi (l) is a loglikelihood ratio of path (l) through leaf i representing a codeword;
selecting, from the 2L paths obtained by expanding the current L paths with a 0 bit or with a 1 bit, the L paths with the lowest path metrics; and
if an lth path is expanded by a 0 bit, setting β_v^(l) to 0, otherwise, setting β_v^(l) to 1.
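The leaf-level path-metric update and list pruning recited in claim 1 can be sketched as follows. This is an illustrative Python sketch, not the claimed hardware implementation; the function and variable names are invented for exposition, and the ln(1+exp(·)) penalty form follows the claim.

```python
import math

def update_path_metrics(path_metrics, leaf_llrs, list_size):
    """Expand every surviving path with a 0 bit and a 1 bit, then keep
    the list_size candidates with the lowest (best) path metrics.

    path_metrics: current metric PM for each path l.
    leaf_llrs: soft value alpha reaching the current leaf on each path.
    Returns (metrics, parents, bits): for each survivor, its updated
    metric, the index of the parent path it extends, and the hard
    decision bit beta_v for that path.
    """
    candidates = []
    for l, (pm, llr) in enumerate(zip(path_metrics, leaf_llrs)):
        # a 0 bit is penalized when the LLR points toward 1, and vice versa
        candidates.append((pm + math.log1p(math.exp(-llr)), l, 0))
        candidates.append((pm + math.log1p(math.exp(llr)), l, 1))
    candidates.sort(key=lambda c: c[0])
    best = candidates[:list_size]
    return ([c[0] for c in best], [c[1] for c in best], [c[2] for c in best])
```

Expanding a single path with a positive LLR keeps the 0-bit extension as the better candidate, since a positive LLR favors a 0 decision.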
2. The method of claim 1, further comprising, when v is a frozen leaf, expanding all paths by a 0 bit and setting β_v^(l) to 0, for every path index, 1≤l≤L, wherein a frozen leaf forces a hard decision to be zero.
3. The method of claim 2, further comprising, for a node that is a root of a subtree with frozen leaves only (a RATE-0 node), for a node that is a root of a subtree with information leaves only (a RATE-1 node), and for a node that is a root of a subtree whose leaves are all frozen leaves except for the rightmost leaf (a REP node), updating the L best path metrics and computing corresponding hard decision vectors without visiting other nodes in the subtrees of the binary tree.
4. The method of claim 3, wherein for a RATE-0 node and a REP node, updating the L best path metrics is performed in parallel for all paths in the list, wherein a latency of the path metric update depends on a depth of the node and not on a list size.
5. The method of claim 3, wherein updating the L best path metrics comprises, for a RATE-1 node of depth d, sequentially calculating m=min{L−1, 2^(n−d)} bits of the hard decision, wherein a latency of the path metric update depends on a list size.
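As a hedged illustration of the special-node shortcuts in claims 3-5, the RATE-0 and REP updates can be written directly as metric updates on the node's soft vector, without descending into the subtree. Names and conventions here are assumptions for exposition, following the same ln(1+exp(·)) penalty as claim 1.

```python
import math

def rate0_metric_update(path_metric, alphas):
    """RATE-0 node: every leaf is frozen, so the hard-decision vector is
    all zeros and the path metric is updated in one shot from the node's
    soft vector; this can run in parallel across all paths in the list."""
    return path_metric + sum(math.log1p(math.exp(-a)) for a in alphas)

def rep_metric_updates(path_metric, alphas):
    """REP node: only the all-zeros and all-ones words are possible, so a
    path expands into exactly two candidates (all-zeros, all-ones)."""
    pm0 = path_metric + sum(math.log1p(math.exp(-a)) for a in alphas)
    pm1 = path_metric + sum(math.log1p(math.exp(a)) for a in alphas)
    return pm0, pm1
```

With strongly positive LLRs (favoring zeros), the all-zeros candidate pays almost no penalty while the all-ones candidate pays roughly the sum of the LLR magnitudes.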
6. The method of claim 1, further comprising:
selecting t paths, where t<L, for those leaves for which a correct path is likely to be among a best t paths;
applying a CRC detector on a kth leaf; and
continuing decoding until a number of paths equals one.
7. The method of claim 6, wherein determining those leaves for which a correct path is likely to be among a best t<L paths comprises one or more of selecting those paths whose path metric is greater than a predetermined threshold, using machine learning to select the best t<L decoding paths, using a classifier to select the best t<L decoding paths, or using forward prediction on each path to select the best t<L decoding paths.
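One simple realization of the pruning in claims 6-7 is a sort-and-threshold rule. This sketch assumes the convention of claim 1 that lower path metrics are better, so the threshold here drops high-metric (weak) paths; the exact threshold semantics are an illustrative choice, not taken from the claim.

```python
def prune_paths(path_metrics, t, threshold=None):
    """Keep at most t surviving paths at a leaf where the correct path is
    expected to be among the best t. Paths are ranked by metric (lower is
    better); an optional threshold additionally drops weak survivors.
    Returns the indices of the kept paths."""
    order = sorted(range(len(path_metrics)), key=lambda i: path_metrics[i])
    kept = order[:t]
    if threshold is not None:
        kept = [i for i in kept if path_metrics[i] <= threshold]
    return kept
```

Decoding then continues on the reduced list until, as in claim 6, a single path remains.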
8. The method of claim 6, wherein different paths are processed in parallel, each path is associated with a memory that stores 2N−1 soft decision values, and each memory is assigned a unique processing block that performs soft calculations along the decoding tree.
9. The method of claim 6, wherein each node in the decoding tree stores soft calculations and hard decisions, wherein a hard decision is represented by one bit and a soft calculation is represented by a plurality of bits, and further comprising:
saving, for a node at depth d, 2^(n−d) soft calculations and 2^(n−d) hard decisions for each path in the list; and
pruning paths from the decoding tree.
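The per-path memory of claims 8-9 is consistent with keeping one active node per depth: summing 2^(n−d) entries over depths 0..n gives 2N−1 soft values per path, matching the 2N−1 figure in claim 8. A small sketch of that accounting (function name and return convention are illustrative):

```python
def per_path_memory(n, f):
    """Storage for one decoding path of a length N = 2**n code: the
    active node at each depth d keeps 2**(n - d) soft calculations
    (f bits each) and 2**(n - d) hard-decision bits, so a path holds
    sum over d of 2**(n - d) = 2*N - 1 soft values in total.
    Returns (soft_bits, hard_decision_bits)."""
    soft_values = sum(2 ** (n - d) for d in range(n + 1))  # = 2*2**n - 1
    return soft_values * f, soft_values
```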
10. A method of simplified successive cancellation list (SSCL) error decoding of S-polar codes, comprising:
representing an S-polar code of length N=2^n as a perfect binary tree with 2N−1 nodes, wherein n is a non-negative integer, wherein, for an lth path in a list of L paths through the binary tree, 1≤l≤L, wherein an S-polar code is a generalized concatenated code (GCC) with Reed-Solomon (RS) codes as its outer codes and polar codes as its inner codes, wherein information is encoded into an N×J array using S−1 outer codes and S inner codes, wherein S and J are non-negative integers, wherein a GCC is encoded in S stages, where at each stage s, new information bits and parities from a previous outer code are encoded into K_out^(s+1)−K_out^(s) codewords of C(s) and stored as that many rows of the array, wherein C(s) is an inner code at stage s, wherein K_out^(s) is the amount of data in an outer codeword at stage s, wherein the encoded rows are mapped into a coset array in which an (s+1)th outer code systematically encodes a current t=K_out^(s)−K_out^(s+1) columns of the coset array, and wherein parities of the obtained codewords are transmitted to a next inner code for encoding;
receiving a vector of length 2^n of log-likelihood ratios (LLRs), wherein each entry of the vector can take one of q possible values and L is a list size,
wherein, for a first layer of decoding, the soft calculations in a left child of a root, v_l, form a vector α_l of length 2^(n−1) which takes values in a set of size q^2, and the soft calculations in a right child of the root, v_r, form a vector α_r of length 2^(n−1) with entries taking values in a set of size 2q^2, wherein α_{r,i} depends on a hard decision bit β_{l,i};
wherein when 3q^2 f + N log_2 q << N f, where f is a number of bits that represent a soft calculation value, the method further comprises:
determining a value of α_{l,i} for coordinates 2i and 2i+1, for 0≤i<2^(n−1), by accessing a lookup table of size q^2; and
determining a value of α_{r,i} by accessing a lookup table of size 2q^2; and
wherein, for a second layer of decoding, a first node from a left has q^4 possible values for soft calculations, a second node from the left has 2q^4 possible values for soft calculations, a third node from the left has 4q^4 possible values for soft calculations, and a last node has 8q^4 possible values for soft calculations,
wherein when 15q^4 + N log_2 q << f N (L/2+1), the method further comprises:
determining a value of α_{l,i} for a leftmost node by accessing a lookup table of size q^4;
determining a value of α_{l,i} for a next leftmost node by accessing a lookup table of size 2q^4;
determining a value of α_{l,i} for a next rightmost node by accessing a lookup table of size 4q^4; and
determining a value of α_{l,i} for a rightmost node by accessing a lookup table of size 8q^4.
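Claim 10 replaces first-layer soft calculations with table lookups over the q quantized LLR levels: q^2 entries for the left child and 2q^2 for the right child (one set per hard-decision bit). A hedged sketch of building those tables follows; the exact combining rules (boxplus for the left child, sign-adjusted sum for the right child) are the standard polar-decoding f/g forms and are assumed here rather than taken from the claim.

```python
import math

def build_first_layer_luts(levels):
    """Build the two first-layer lookup tables suggested by claim 10:
    a q**2-entry table for the left child's f-function over quantized
    LLR pairs, and a 2*q**2-entry table for the right child's
    g-function over (pair, hard-bit) combinations.
    `levels` is the list of q quantized LLR values."""
    f_lut = {}
    g_lut = {}
    for a in levels:
        for b in levels:
            # f: combined LLR of the left-child bit (boxplus of a and b)
            f_lut[(a, b)] = 2 * math.atanh(math.tanh(a / 2) * math.tanh(b / 2))
            for bit in (0, 1):
                # g: LLR of the right-child bit once the left hard
                # decision `bit` is known
                g_lut[(a, b, bit)] = b + (1 - 2 * bit) * a
    return f_lut, g_lut
```

With q=2 levels the tables have 4 and 8 entries, i.e. q^2 and 2q^2 as in the claim.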
11. A digital electronic circuit, tangibly embodying a program of instructions executed by the digital electronic circuit to perform method steps for error decoding S-polar codes, the method steps comprising:
receiving a plurality of frames through a digital electronic communication channel;
performing error correction on the plurality of frames to generate a corrected plurality of frames; and
outputting data included in the corrected plurality of frames to the digital electronic communication channel,
wherein performing the error correction comprises submitting the plurality of frames to multiple row decoders and counting a number of frames that were successfully decoded, wherein a frame is a plurality of rows of an S-polar code array that are decoded to codewords of C(0), wherein an S-polar code of length N=2^n is represented as a perfect binary tree with 2N−1 nodes, wherein, for an lth path in a list of L paths through the binary tree, 1≤l≤L, wherein an S-polar code is a generalized concatenated code (GCC) with Reed-Solomon (RS) codes as its outer codes and polar codes as its inner codes, wherein information is encoded into an N×J array using S−1 outer codes and S inner codes, wherein a GCC is encoded in S stages, where at each stage s, new information bits and parities from a previous outer code are encoded into K_out^(s+1)−K_out^(s) codewords of C(s) and stored as that many rows of the array, wherein the encoded rows are mapped into a coset array in which an (s+1)th outer code systematically encodes a current t=K_out^(s)−K_out^(s+1) columns of the coset array, and wherein parities of the obtained codewords are transmitted to a next inner code for encoding;
performing Reed-Solomon (RS) decoding on the codewords of C(0) when a number of successfully decoded frames reaches K+2m, wherein K is a dimension of a codeword of C(0) and m is a number of mis-corrects, while decoding additional pluralities of frames at a same stage; and
wherein when the RS decoding succeeds, decoding an upcoming plurality of frames at a next stage of decoding, using cosets from the RS decoding,
otherwise, repeating RS decoding on an additional plurality of frames.
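The scheduling rule of claim 11 — begin RS decoding of the outer code once K + 2m frames of the current stage have decoded successfully, while row decoding of further frames continues — can be sketched as below; the function name and boolean-result interface are illustrative assumptions.

```python
def schedule_rs_decode(frame_results, K, m):
    """Scheduler rule: row decoders report per-frame success in order;
    once K + 2*m frames have decoded successfully (K the outer-code
    dimension, m the tolerated number of mis-corrects), RS decoding may
    start while the remaining frames of the stage are still in flight.
    Returns the index of the frame at which RS decoding may be
    triggered, or None if not enough frames have succeeded."""
    needed = K + 2 * m
    successes = 0
    for i, ok in enumerate(frame_results):
        if ok:
            successes += 1
            if successes == needed:
                return i
    return None
```

If the subsequent RS decode fails, the scheduler simply waits for additional successfully decoded frames and retries, as in the final clause of the claim.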
12. The digital electronic circuit of claim 11, wherein decoding each row of an S-polar code array into codewords of C(0) comprises:
providing a node v in a decoding path l at a depth d in the perfect binary tree with a vector α_v^(l) of length 2^d of soft information from a parent node v_p;
computing, for every path in a list of paths in the binary tree, a vector α_{v_l}^(l) of length 2^(d−1) of soft information for a left child v_l of node v;
providing node v with a vector β_{v_l}^(l) of length 2^(d−1) of hard decisions from the left child, using vector β_{v_l}^(l) together with α_v^(l) to create a soft information vector α_{v_r}^(l) of length 2^(d−1), and passing vector α_{v_r}^(l) to a right child v_r of node v;
providing node v with a vector β_{v_r}^(l) of length 2^(d−1) of hard decisions from its right child, using vector β_{v_r}^(l) together with β_{v_l}^(l) to create a hard decision vector β_v of length 2^d of hard decisions, and passing vector β_v to its parent node;
when v is an ith leaf of the perfect tree, 0≤i<2^n, then, for every path in the list of paths, updating two path metrics according to

PM_i^(2l) = PM_{i−1}^(l) + ln(1 + exp(−α_{v,i}^(l)))

and

PM_i^(2l+1) = PM_{i−1}^(l) + ln(1 + exp(α_{v,i}^(l))),

wherein PM_i^(l) is a log-likelihood ratio of path (l) through leaf i representing a codeword;
selecting, from the 2L paths obtained by expanding the current L paths with a 0 bit or with a 1 bit, the L paths with the lowest path metrics; and
if an lth path is expanded by a 0 bit, setting β_v^(l) to 0, otherwise, setting β_v^(l) to 1.
13. The digital electronic circuit of claim 12, wherein the method steps further comprise, when v is a frozen leaf, expanding all paths by a 0 bit and setting β_v^(l) to 0, for every path index, 1≤l≤L, wherein a frozen leaf forces a hard decision to be zero.
14. The digital electronic circuit of claim 13, wherein the method steps further comprise, for a node that is a root of a subtree with frozen leaves only (a RATE-0 node), for a node that is a root of a subtree with information leaves only (a RATE-1 node), and for a node that is a root of a subtree whose leaves are all frozen leaves except for the rightmost leaf (a REP node), updating the L best path metrics and computing corresponding hard decision vectors without visiting other nodes in the subtrees of the binary tree.
15. The digital electronic circuit of claim 14, wherein for a RATE-0 node and a REP node, updating the L best path metrics is performed in parallel for all paths in the list, wherein a latency of the path metric update depends on a depth of the node and not on a list size.
16. The digital electronic circuit of claim 14, wherein updating the L best path metrics comprises, for a RATE-1 node of depth d, sequentially calculating m=min{L−1, 2^(n−d)} bits of the hard decision, wherein a latency of the path metric update depends on a list size.
17. The digital electronic circuit of claim 12, wherein the method steps further comprise:
selecting t paths, where t<L, for those leaves for which a correct path is likely to be among a best t paths;
applying a CRC detector on a kth leaf; and
continuing decoding until a number of paths equals one.
18. The digital electronic circuit of claim 17, wherein determining those leaves for which a correct path is likely to be among a best t<L paths comprises one or more of selecting those paths whose path metric is greater than a predetermined threshold, using machine learning to select the best t<L decoding paths, using a classifier to select the best t<L decoding paths, or using forward prediction on each path to select the best t<L decoding paths.
19. The digital electronic circuit of claim 17, wherein different paths are processed in parallel, each path is associated with a memory that stores 2N−1 soft decision values, and each memory is assigned a unique processing block that performs soft calculations along the decoding tree.
20. The digital electronic circuit of claim 17, wherein each node in the decoding tree stores soft calculations and hard decisions, wherein a hard decision is represented by one bit and a soft calculation is represented by a plurality of bits, and further comprising:
saving, for a node at depth d, 2^(n−d) soft calculations and 2^(n−d) hard decisions for each path in the list; and
pruning paths from the decoding tree.
US17/807,217 2022-06-16 2022-06-16 Acceleration of S-polar ECC throughput by scheduler Active US11848687B1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US17/807,217 US11848687B1 (en) 2022-06-16 2022-06-16 Acceleration of S-polar ECC throughput by scheduler
KR1020220087087A KR20230172992A (en) 2022-06-16 2022-07-14 Acceleration of s-polar ecc throughput by scheduler
CN202310087663.0A CN117254880A (en) 2022-06-16 2023-02-02 Acceleration of S-polarized ECC throughput by scheduler

Publications (2)

Publication Number Publication Date
US11848687B1 US11848687B1 (en) 2023-12-19
US20230412197A1 true US20230412197A1 (en) 2023-12-21

Family

ID=89133813

Country Status (3)

Country Link
US (1) US11848687B1 (en)
KR (1) KR20230172992A (en)
CN (1) CN117254880A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11387848B1 (en) * 2021-03-11 2022-07-12 Samsung Electronics Co., Ltd. Hierarchical error correction code

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Hessam Mahdavifar, Mostafa El-Khamy, Jungwon Lee, Inyup Kang; Performance Limits and Practical Decoding of Interleaved Reed-Solomon Polar Concatenated Codes; arXiv:1308.1144v1[cs.IT] 6 Aug 2013 (Year: 2013) *

Also Published As

Publication number Publication date
CN117254880A (en) 2023-12-19
KR20230172992A (en) 2023-12-26
US11848687B1 (en) 2023-12-19


Legal Events

Date Code Title Description
AS Assignment

Owner name: SAMSUNG ELECTRONICS CO., LTD., KOREA, REPUBLIC OF

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BERMAN, AMIT;BUZAGLO, SARIT;DOUBCHAK, ARIEL;SIGNING DATES FROM 20220424 TO 20220502;REEL/FRAME:060226/0566

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCF Information on status: patent grant

Free format text: PATENTED CASE