US20140223267A1 - Radix-4 viterbi forward error correction decoding - Google Patents

Radix-4 Viterbi forward error correction decoding

Info

Publication number
US20140223267A1
Authority
US
United States
Prior art keywords
circuit
metrics
codeword
signal
stages
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/246,506
Inventor
Elyar E. Gasanov
Pavel A. Panteleev
Ilya V. Neznanov
Andrey P. Sokolov
Yurii S. Shutkin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Avago Technologies International Sales Pte Ltd
Original Assignee
LSI Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by LSI Corp
Priority to US14/246,506
Assigned to DEUTSCHE BANK AG NEW YORK BRANCH, AS COLLATERAL AGENT: PATENT SECURITY AGREEMENT. Assignors: AGERE SYSTEMS LLC, LSI CORPORATION
Publication of US20140223267A1
Assigned to AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD.: ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: LSI CORPORATION
Assigned to AGERE SYSTEMS LLC, LSI CORPORATION: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENT RIGHTS (RELEASES RF 032856-0031). Assignors: DEUTSCHE BANK AG NEW YORK BRANCH, AS COLLATERAL AGENT
Legal status: Abandoned

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/25 Error detection or forward error correction by signal space coding, i.e. adding redundancy in the signal constellation, e.g. Trellis Coded Modulation [TCM]
    • H03M13/256 Error detection or forward error correction by signal space coding, i.e. adding redundancy in the signal constellation, e.g. Trellis Coded Modulation [TCM] with trellis coding, e.g. with convolutional codes and TCM
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/37 Decoding methods or techniques, not specific to the particular type of coding provided for in groups H03M13/03 - H03M13/35
    • H03M13/39 Sequence estimation, i.e. using statistical methods for the reconstruction of the original codes
    • H03M13/395 Sequence estimation, i.e. using statistical methods for the reconstruction of the original codes using a collapsed trellis, e.g. M-step algorithm, radix-n architectures with n>2
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/37 Decoding methods or techniques, not specific to the particular type of coding provided for in groups H03M13/03 - H03M13/35
    • H03M13/39 Sequence estimation, i.e. using statistical methods for the reconstruction of the original codes
    • H03M13/3961 Arrangements of methods for branch or transition metric calculation
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/65 Purpose and implementation aspects
    • H03M13/6508 Flexibility, adaptability, parametrability and configurability of the implementation
    • H03M13/6519 Support of multiple transmission or communication standards
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/65 Purpose and implementation aspects
    • H03M13/6522 Intended application, e.g. transmission or communication standard
    • H03M13/6525 3GPP LTE including E-UTRA
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/65 Purpose and implementation aspects
    • H03M13/6522 Intended application, e.g. transmission or communication standard
    • H03M13/6544 IEEE 802.16 (WIMAX and broadband wireless access)

Definitions

  • the present invention relates to forward error correction codes generally and, more particularly, to a method and/or apparatus for implementing radix-4 Viterbi forward error correction decoding.
  • the parameter L(u) is called a Log-Likelihood Ratio (LLR).
  • LLR value is a convenient measure that encapsulates both soft and hard bit information in a single number. The sign of the number corresponds to the hard decision while the magnitude gives a reliability estimate.
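The relationship between an LLR, its hard decision and its reliability can be sketched as follows. The definition L(u) = ln(P(u = +1) / P(u = −1)) is the conventional one and is an assumption here, since the defining formula is not reproduced in this text; the function names are hypothetical.

```python
import math

def llr_from_probability(p_plus: float) -> float:
    """Assumed conventional definition: L(u) = ln(P(u=+1) / P(u=-1))."""
    return math.log(p_plus / (1.0 - p_plus))

def hard_decision(llr: float) -> int:
    """The sign of the LLR is the hard decision (a tie is resolved to +1 here)."""
    return 1 if llr >= 0 else -1

def reliability(llr: float) -> float:
    """The magnitude of the LLR is the reliability estimate."""
    return abs(llr)

if __name__ == "__main__":
    for p in (0.99, 0.6, 0.2):
        value = llr_from_probability(p)
        print(p, hard_decision(value), round(reliability(value), 3))
```

A probability near 0.5 yields an LLR near zero, i.e. a decision with almost no reliability.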
  • the present invention concerns a method for forward error correction decoding.
  • the method generally includes steps (A) to (D).
  • Step (A) may calculate a plurality of metrics of a codeword using a forward error correction process on a trellis having a plurality of stages.
  • Step (B) may update the metrics over each of the stages.
  • Step (C) may permute the metrics in each of the stages.
  • Step (D) may generate a signal carrying a plurality of decoded bits of the codeword.
  • the objects, features and advantages of the present invention include providing radix-4 Viterbi forward error correction decoding that may (i) support multiple communications standards, (ii) share state metrics and branch metrics calculators between Viterbi decoding and turbo decoding, (iii) share schemes and parts between convolutional codes and turbo codes, (iv) permute state metrics and paths prior to buffering in memory and/or (v) compute state metrics and branch metrics in a single clock cycle.
  • FIG. 1 is a diagram of an example trellis for a convolutional code
  • FIG. 2 is a diagram of an example closest path through the trellis
  • FIG. 3 is a block diagram of an add-compare-select circuit
  • FIG. 4 is a diagram of fragments of the trellis
  • FIG. 5 is a block diagram of a state metrics calculator circuit
  • FIG. 6 is a diagram of four successive clock cycles of work of the state metrics calculator circuit
  • FIG. 7 is a block diagram of a scheme to permute the state metrics
  • FIG. 8 is a diagram of a portion of the trellis
  • FIG. 9 is a block diagram of a calculate path circuit
  • FIG. 10 is a block diagram of a path calculation circuit
  • FIG. 11 is a block diagram of an apparatus in accordance with a preferred embodiment of the present invention.
  • Some embodiments of the present invention generally concern a reconfigurable chip (or die) for decoding an encoded signal in accordance with two or more wireless communication standards.
  • the wireless communications standards may include, but are not limited to, a Long Term Evolution (LTE) standard (3GPP Release 8), an Institute of Electrical and Electronics Engineers (IEEE) 802.16 standard (WiMAX), a Wideband-CDMA/High Speed Packet Access (WCDMA/HSPA) standard (3GPP Release 7) and a CDMA-2000/Ultra Mobile Broadband (UMB) standard (3GPP2).
  • LTE Long Term Evolution
  • IEEE Institute of Electrical and Electronics Engineers
  • WiMAX Worldwide Interoperability for Microwave Access
  • WCDMA/HSPA Wideband-CDMA/High Speed Packet Access
  • UMB Ultra Mobile Broadband
  • Other wired and/or wireless communications standards may be implemented to meet the criteria of a particular application.
  • the FEC decoder generally includes a radix-4 turbo decoder that uses existing branch and state metrics calculators for the Viterbi process.
  • the FEC decoder generally performs at a high speed and occupies a small silicon area.
  • a processing time of the FEC decoder may be $C \cdot 2^{m} \cdot K$ clock cycles, where m is the constraint length and C may be a constant (e.g., approximately 1/16).
  • the value C may be 33/512 for convolutional codes with a constraint length of 8 (e.g., 256 states).
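As a worked example of the cycle count, the sketch below evaluates C·2^m·K with the stated values C = 33/512 and m = 8. The block length K = 1000 and the function name are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def decoding_cycles(c: Fraction, m: int, k: int) -> Fraction:
    """Clock cycles = C * 2^m * K, with m the constraint length and K the number of stages."""
    return c * (2 ** m) * k

if __name__ == "__main__":
    # C = 33/512 and m = 8 (256 states) come from the text; K = 1000 is an assumed example.
    cycles = decoding_cycles(Fraction(33, 512), 8, 1000)
    print(cycles)  # 33/512 * 256 = 16.5 cycles per stage, times 1000 stages
```

With these values the decoder spends 16.5 clock cycles per trellis stage, which is consistent with C being close to 1/16 (1/16 · 256 = 16).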
  • the FEC decoder may support the convolutional codes and the turbo codes from multiple wireless communication standards, including but not limited to, LTE, WiMAX, W-CDMA, and CDMA2000.
  • the FEC decoder may decode codewords compliant with the various communications standards while operating in different configurations.
  • the Viterbi process may be considered in a logarithmic domain.
  • the decoding process, in native form, may be challenging to implement because of the exponentiation and multiplication.
  • the multiplications generally become additions and the exponentials generally disappear.
  • Additions may be transformed according to standard rules. The additions are generally replaced using the Jacobi logarithm according to formula 2 as follows:
    $$\ln\left(e^{x} + e^{y}\right) = \max(x, y) + \ln\left(1 + e^{-|x - y|}\right) \qquad (2)$$
  • the Jacobi logarithm may be called a “max*” operation denoting essentially a maximum operator adjusted by a correction factor.
  • the max* operation is generally used in the Maximum A Posteriori (MAP) process.
  • MAP Maximum A Posteriori
  • a maximum operation (e.g., max(x,y)) may be used.
  • Referring to FIG. 1, a diagram of an example trellis 100 for a convolutional code is shown.
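The difference between the max* operation and the plain maximum can be sketched as follows. The sketch uses the standard Jacobi logarithm identity ln(e^x + e^y) = max(x, y) + ln(1 + e^(−|x − y|)); the function names are hypothetical.

```python
import math

def max_star(x: float, y: float) -> float:
    """Jacobi logarithm: the maximum plus a correction factor (as used in MAP decoding)."""
    return max(x, y) + math.log1p(math.exp(-abs(x - y)))

def max_plain(x: float, y: float) -> float:
    """Plain maximum: the correction factor is dropped (as used in the Viterbi process)."""
    return max(x, y)

if __name__ == "__main__":
    for x, y in ((1.0, 2.0), (0.0, 0.0), (-3.0, 5.0)):
        exact = math.log(math.exp(x) + math.exp(y))
        print(x, y, max_star(x, y), exact, max_plain(x, y))
```

The correction factor is at most ln 2 (when x = y) and shrinks quickly as |x − y| grows, which is why dropping it costs little accuracy while removing the logarithm and exponential from the hardware.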
  • the Viterbi process is generally based on the trellis 100 .
  • the process may be performed on a block of K received symbols that correspond to the trellis 100 having a finite number of K stages.
  • a transmitted bit u may be chosen from a set {−1, +1}.
  • Branch metrics (e.g., γ)
  • forward state metrics (e.g., α)
  • the forward state metrics α may also be called path metrics.
  • the example illustrated generally shows only 4 states in the trellis 100 .
  • the trellis 100 may have more states (e.g., usually 256 or 64 states).
  • the Viterbi process is essentially a largest path process. Basically, a coded sequence of bits $U_0$, $U_1$, $U_2$, ... may correspond to a path through an encoder trellis. Due to noise in the channel, a received sequence (e.g., r) may not correspond exactly to a path through the encoder trellis.
  • the decoder generally finds a path through the trellis 100 that is closest to the received sequence r, where the measure of “closest” may be determined by the likelihood function appropriate for the channel.
  • the closest path 110 (solid line) generally corresponds to a true sequence of the transmitted bits.
  • Other paths may exist early in the decoding, but are usually eliminated after several iterations.
  • an input of the radix-4 decoder may receive six soft values (e.g., $Z_1^{(1)}$, $Z_2^{(1)}$, $Z_1^{(2)}$, $Z_2^{(2)}$, $Z_1^{(3)}$ and $Z_2^{(3)}$).
  • a branch metric for edge e in the radix-4 Viterbi process for a rate 1/3 convolutional code may be computed by formula 3 as follows:
  • $u_1^{(i)}$, $u_2^{(i)}$ may be parity bits associated with the edge e.
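Formula 3 itself is not reproduced in this text. The sketch below shows one common form of a radix-4 branch metric, a correlation between the ±1 parity bits of an edge and the six soft values. This form, the ±1 bit mapping and the function name are all assumptions for illustration and may differ from the formula in the patent.

```python
from typing import Sequence

def branch_metric(u: Sequence[Sequence[int]], z: Sequence[Sequence[float]]) -> float:
    """Assumed correlation metric for one radix-4 edge of a rate 1/3 code.

    u[j][i] is the parity bit (+1 or -1) for trellis step j (0 or 1) and output i (0..2);
    z[j][i] is the matching received soft value. Two steps times three outputs gives
    the six soft values consumed per radix-4 stage.
    """
    return sum(u[j][i] * z[j][i] for j in range(2) for i in range(3))

if __name__ == "__main__":
    soft = [[0.9, -1.1, 0.4], [-0.2, 1.3, -0.8]]
    matching_edge = [[1, -1, 1], [-1, 1, -1]]   # signs agree with the soft values
    opposite_edge = [[-1, 1, -1], [1, -1, 1]]   # signs disagree everywhere
    print(branch_metric(matching_edge, soft), branch_metric(opposite_edge, soft))
```

Under this assumed form, an edge whose parity bits agree in sign with the received soft values receives the largest metric, which fits a process that selects maximum sums.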
  • the forward state metrics are recursively calculated and stored per formulae 4 and 5 as follows:
  • the paths through the trellis 100 generally have as many stages as the codeword is long. For a long data stream, a significant amount of data may be stored since the decoder would have to store 2 K paths and the path lengths grow longer with each stage. Furthermore, the long paths may result in a long decoding latency.
  • a single surviving path (e.g., maximum likelihood path) some number of stages back from the “current” stage of the trellis 100 generally permits the decoding to be ended early.
  • the initial stages of the survivor paths tend to merge if a sufficient decoding delay is allowed. Therefore, a “window” on the trellis 100 may be kept in memory.
  • the window generally includes the current stage and some number of previous stages. The number of the previous stages that the decoding looks at to make a decision is called the decoding depth, denoted by L.
  • the decoder may generate a decision on the code bits $U_{t-L}$.
  • the Viterbi process modified with the window may be called a sliding window Viterbi process.
  • the apparatus 120 may implement an Add-Compare-Select (ACS) circuit for state metrics calculations.
  • the circuit 120 generally comprises multiple adders (or modules) 122 a to 122 d and a circuit (or module) 124 .
  • the circuits 122 a to 124 may represent one or more modules and/or blocks that may be implemented as hardware, firmware, software, a combination of hardware, firmware and/or software, or other implementations.
  • a signal (e.g., SM1) and a signal (e.g., BM1) may be received by the circuit 122 a .
  • the circuit 122 b may receive a signal (e.g., SM2) and a signal (e.g., BM2).
  • a signal (e.g., SM3) and a signal (e.g., BM3) may be received by the circuit 122 c .
  • the circuit 122 d may receive a signal (e.g., SM4) and a signal (e.g., BM4).
  • the signals SM1, SM2, SM3 and SM4 collectively may form an input signal (e.g., SMIN).
  • the signals BM1, BM2, BM3 and BM4 collectively may form an input signal (e.g., BMIN).
  • the circuit 124 may receive the sums from the circuits 122 a to 122 d .
  • a signal (e.g., IND) may be generated by the circuit 124 .
  • the circuit 124 may also generate a signal (e.g., SMOUT).
  • the circuits 122 a to 122 d may implement adder circuits. Each circuit 122 a to 122 d may be operational to add a branch metric value and a respective state metric value. The sums may be the “add” portion of the add-compare-select operations.
  • the circuit 124 may implement a compare and select circuit.
  • the circuit 124 is generally operational to compare the sum values calculated by the circuits 122 a to 122 d .
  • the circuit 124 may also be operational to select a maximum sum value from among the sum values.
  • the selected maximum sum value may be presented in the signal SMOUT as a new state metric value.
  • the new state metric value may be computed per formula 6 as follows:
  • An index value i ∈ {0, ..., 3} of the selected maximum sum value may be presented in the signal IND.
  • a width of the signal IND may be 2 bits.
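The add-compare-select behavior described for the circuit 120 can be sketched in software as follows. The function name is hypothetical, and resolving ties in favor of the lowest index is an assumption, since the text does not state a tie-breaking rule.

```python
from typing import Sequence, Tuple

def add_compare_select(smin: Sequence[float], bmin: Sequence[float]) -> Tuple[float, int]:
    """Model of one ACS unit: four adders followed by a compare-and-select stage.

    smin holds SM1..SM4 and bmin holds BM1..BM4. Returns (SMOUT, IND), where SMOUT is
    the maximum sum and IND is its index in {0, ..., 3}, which fits in 2 bits.
    """
    if len(smin) != 4 or len(bmin) != 4:
        raise ValueError("a radix-4 ACS unit takes exactly four state/branch metric pairs")
    sums = [sm + bm for sm, bm in zip(smin, bmin)]           # "add"
    ind = max(range(4), key=lambda i: sums[i])               # "compare" (first maximum wins)
    return sums[ind], ind                                    # "select"

if __name__ == "__main__":
    print(add_compare_select([1, 5, 2, 0], [3, 0, 4, 1]))
```

Using the plain maximum here matches the Viterbi process; a MAP-style unit would replace the selection with the max* operation and would not produce a single index.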
  • Referring to FIG. 4, a diagram of fragments of the trellis 100 is shown.
  • the fragments may be called a fragment 130 a and a fragment 130 b .
  • a state metrics calculator of a decoder for turbo codes may process all of the state metrics simultaneously.
  • a normal state metrics calculator for turbo codes may be implemented as part of the Viterbi process in some embodiments of the present invention. Therefore, simultaneous processing of the states may involve processing half of the states per fragment 130 a and the other half per fragment 130 b.
  • the apparatus 140 may implement a State Metrics Calculator (SMC) circuit.
  • the apparatus 140 generally comprises multiple circuits (or modules) 142 a to 142 h .
  • Each circuit 142 a to 142 h may be a copy of the circuit 120 .
  • Each circuit 142 a to 142 h may represent one or more modules and/or blocks that may be implemented as hardware, firmware, software, a combination of hardware, firmware and/or software, or other implementations.
  • An input signal (e.g., AIN) may be received by the circuit 140 .
  • An input signal (e.g., GIN) may also be received by the circuit 140 .
  • the circuit 140 may generate a signal (e.g., INDOUT).
  • An output signal (e.g., AOUT) may be generated and presented by the circuit 140 .
  • the signal AIN may convey multiple (e.g., 8) input state metrics values (e.g., A1 to A8).
  • the input state metrics values A1 to A8 may correspond to the 8 left nodes of the fragments 130 a and 130 b .
  • the input state metrics values A1 to A8 may be divided into several (e.g., 2) groups.
  • a group A1IN generally includes the input state metrics values A1 to A4.
  • Another group A2IN may include the input state metrics values A5 to A8.
  • the group A1IN may be received by the circuits 142 a to 142 d .
  • the group A2IN may be received by the circuits 142 e to 142 h.
  • the signal GIN may carry multiple (e.g., 32) input branch metrics values corresponding to the edges of the fragments 130 a and 130 b of the trellis 100 .
  • the input branch metrics values may be divided into several (e.g., 8) groups.
  • a group G1 may carry multiple (e.g., 4) input branch metrics values (e.g., γ1 to γ4) to the circuit 142 a .
  • a group G2 may carry multiple input branch metrics values (e.g., γ5 to γ8) to the circuit 142 b , and so on.
  • a group G8 may carry multiple input branch metrics values (e.g., γ29 to γ32) to the circuit 142 h.
  • Each circuit 142 a to 142 h may generate a corresponding version of the signal SMOUT.
  • the signals SMOUT may carry output state metrics values (e.g., A1OUT to A8OUT).
  • the output state metrics values A1OUT to A8OUT may correspond to the 8 right nodes of the fragments 130 a and 130 b . Collectively, the output state metrics values A1OUT to A8OUT may form the signal AOUT.
  • Each circuit 142 a to 142 h may generate a corresponding version of the signal IND.
  • the signals IND may carry pairs of index values (e.g., IND1 to IND8). Each pair of index values IND1 to IND8 generally identifies where a maximum may be achieved. Collectively, the index values IND1 to IND8 may form the signal INDOUT.
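The wiring of the eight ACS units inside the state metrics calculator can be sketched as follows, assuming the grouping described above: the first four units share the input group A1 to A4, the last four share A5 to A8, and unit n receives the n-th group of four branch metrics. The function names and the lowest-index tie-break are assumptions.

```python
from typing import List, Sequence, Tuple

def acs(sm: Sequence[float], bm: Sequence[float]) -> Tuple[float, int]:
    """One add-compare-select unit: returns (maximum sum, index of the maximum)."""
    sums = [s + b for s, b in zip(sm, bm)]
    ind = max(range(4), key=lambda i: sums[i])
    return sums[ind], ind

def state_metrics_calculator(ain: Sequence[float],
                             gin: Sequence[float]) -> Tuple[List[float], List[int]]:
    """Model of eight parallel ACS units: 8 state metrics and 32 branch metrics in,
    8 state metrics (AOUT) and 8 two-bit indices (INDOUT) out."""
    if len(ain) != 8 or len(gin) != 32:
        raise ValueError("expected 8 state metrics and 32 branch metrics")
    a1in, a2in = ain[:4], ain[4:]                  # groups A1IN and A2IN
    aout, indout = [], []
    for n in range(8):
        group = gin[4 * n: 4 * n + 4]              # groups G1..G8
        sm_in = a1in if n < 4 else a2in            # units a..d use A1IN, e..h use A2IN
        smout, ind = acs(sm_in, group)
        aout.append(smout)
        indout.append(ind)
    return aout, indout

if __name__ == "__main__":
    print(state_metrics_calculator([0, 1, 2, 3, 10, 11, 12, 13], [0] * 32))
```

In hardware the eight units operate in the same clock cycle; the loop here only models the data routing, not the timing.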
  • the left side of diagram 150 may be the input state metrics values and the right side may be the output state metrics values.
  • the input values generally follow successively, but the output values do not follow consecutively.
  • the 8 input state values may be read from a single memory word (e.g., at single address or block of consecutive addresses). Since the 8 output state metrics are not consecutive, the calculated output state metrics should not be written to a single memory word because in the next iteration, the written information may be read in the successive manner.
  • the registers R1, R2, R3, R4 may buffer the information from 4 successive pieces of the state metrics.
  • the scheme 160 may comprise a state metrics permutator (A_P) scheme.
  • the inputs of the scheme 160 may be the state metrics values stored to the registers R1, R2, R3 and R4.
  • the outputs may be 4 pieces of successive state metrics. Each piece generally includes 8 values and may be written in a single memory word (e.g., a single address or block of consecutive addresses). Therefore, the 4 pieces of the state metrics values may be stored simultaneously in 4 memory banks operating in parallel.
  • An identification number of each memory bank may be determined from a few bits (e.g., the two most significant bits) of the state metrics values.
  • the identification numbers generally indicate which of the memory banks should receive the data. For example, all of the state metrics values in the signal A1 may have the two most-significant bits of “00” (e.g., indicating a memory bank 00), all of the state metrics values in the signal A2 may have the two most-significant bits of “01” (e.g., indicating a memory bank 01), and so on.
  • a given binary path may correspond to each state node of the trellis 100 .
  • a last part of the given path may be the state number; therefore, only the beginning part of the given path may be stored.
  • the stored beginning part of the given path may be denoted as $p_1 p_2 \ldots p_r$, where $p_1$ is the last bit and $p_r$ is the initial bit in the path.
  • Referring to FIG. 8, a diagram of a portion 170 of the trellis 100 is shown.
  • a corresponding path $p_1^i p_2^i \ldots p_r^i$ exists (the corresponding path may be the beginning part of the full path).
  • the maximum of the state metrics values (see formulae 4 and 5) may be achieved in the node q 3 (e.g., the shaded node). Therefore, the beginning part of the path corresponding to the node q (e.g., the right node) may be $10\,p_1^3 p_2^3 \ldots p_{r-2}^3$ and the full path may be 00q 1 .
  • the pair of bits $p_{r-1}^3 p_r^3$ may be presented at that time if a global maximum for all state metrics is achieved in the node q 3 .
  • the apparatus 180 may implement a Calculate Path (C_P) circuit.
  • the apparatus 180 is generally operational to calculate a path corresponding to a node of the trellis 100 .
  • the apparatus 180 generally comprises a circuit (or module) 182 .
  • the circuit 182 may represent one or more modules and/or blocks that may be implemented as hardware, firmware, software, a combination of hardware, firmware and/or software, or other implementations.
  • the circuit 180 may receive a signal (e.g., PIN). Circuit 180 may also receive the signal IND. A signal (e.g., PAIR) may be generated and presented from the circuit 180 . A signal (e.g., POUT) may be generated and presented from the circuit 180 .
  • the circuit 182 may receive multiple (e.g., 4) components within the signal PIN. The bits carried by the signal IND may be received by the circuit 182 at a selection port. The circuit 182 may route the components of the signal PIN to the signals POUT and PAIR. The signal POUT may contain the components selected by the circuit 182 and the bits from the signal IND. The signal PAIR may contain the components selected by the circuit 182 .
  • the circuit 182 may implement a multiplexer circuit. Circuit 182 is generally operational to multiplex the components received in the signal PIN based on the bits received in the signal IND.
  • the components of the signal PIN may be the several (e.g., 4) paths, each path corresponding to a respective node of the trellis 100 (e.g., the left nodes in FIG. 8 ).
  • the signal IND may be generated by the corresponding circuit 142 a to 142 h .
  • the bits of the signal IND generally show where a maximum is achieved among the paths (e.g., a most likely path).
  • the signal POUT may identify a result path for the output node (e.g., the right nodes in FIG. 8 ).
  • the signal PAIR may carry candidates (e.g., $p_{r-1} p_r$) for the pair of bits to be presented by the decoder.
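The multiplexing performed by the calculate path circuit can be sketched as follows, with each path held as a bit string whose first character is the most recent bit and whose last character is the initial bit. Placing the two IND bits directly at the front of the new path is an assumption based on the statement that POUT contains the selected components and the bits of IND; the exact bit mapping in the patent may differ. The function name is hypothetical.

```python
from typing import Sequence, Tuple

def calculate_path(pin: Sequence[str], ind: int) -> Tuple[str, str]:
    """Model of the path multiplexer.

    pin holds the four candidate paths (strings of '0'/'1', equal length r >= 2) and
    ind is the 2-bit index of the winning predecessor. Returns (POUT, PAIR): POUT keeps
    length r by prepending two new bits and dropping the two oldest bits, and PAIR holds
    the two oldest bits as output candidates.
    """
    if len(pin) != 4 or not 0 <= ind <= 3:
        raise ValueError("expected four paths and an index in 0..3")
    selected = pin[ind]                    # multiplexer controlled by IND
    new_bits = format(ind, "02b")          # assumed mapping of IND to path bits
    pout = new_bits + selected[:-2]        # shift in the new bits, keep r bits total
    pair = selected[-2:]                   # oldest two bits leave the window
    return pout, pair

if __name__ == "__main__":
    print(calculate_path(["000000", "010101", "110100", "111111"], 2))
```

Keeping the stored path at a fixed length is what lets the path memory words have a fixed width per path.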
  • the apparatus (or device or circuit) 190 may implement a path calculation circuit.
  • the apparatus 190 generally comprises multiple circuits (or modules) 192 a to 192 h .
  • Each circuit 192 a to 192 h may be a copy of the circuit 180 .
  • Each circuit 192 a to 192 h may represent one or more modules and/or blocks that may be implemented as hardware, firmware, software, a combination of hardware, firmware and/or software, or other implementations.
  • An input signal (e.g., PTHIN) may be received by the circuit 190 .
  • An input signal (e.g., INDIN) may also be received by the circuit 190 .
  • the circuit 190 may generate a signal (e.g., PTHOUT).
  • An output signal (e.g., PTHPAIR) may be generated and presented by the circuit 190 .
  • the signal PTHIN may convey multiple (e.g., 8) paths (e.g., P1 to P8).
  • the paths P1 to P8 may correspond to the 8 left nodes of the fragments 130 a and 130 b .
  • the paths P1 to P8 may be divided into several (e.g., 2) groups.
  • a group (e.g., P1IN)
  • Another group (e.g., P2IN)
  • the group P1IN may be received by the circuits 192 a to 192 d .
  • the group P2IN may be received by the circuits 192 e to 192 h.
  • the signal INDIN may carry the pairs of index values IND1 to IND8 generated by the circuit 140 .
  • the index values IND1 may be presented to the circuit 192 a .
  • the index values IND2 may be presented to the circuit 192 b , and so on.
  • the index values IND8 may be received by the circuit 192 h.
  • Each circuit 192 a to 192 h may generate a corresponding version of the signal POUT.
  • Each signal POUT may carry a corresponding path (e.g., P1OUT to P8OUT). Collectively, the paths P1OUT to P8OUT may form the signal PTHOUT.
  • Each circuit 192 a to 192 h may generate a corresponding version of the signal PAIR.
  • Each signal PAIR may carry a respective pair of bits (e.g., PAIR1 to PAIR8). Collectively, the pairs of bits PAIR1 to PAIR8 may form the signal PTHPAIR.
  • the circuit 190 may implement a path calculation circuit.
  • the circuit 190 may be operational to calculate paths corresponding to 8 nodes of the trellis 100 simultaneously.
  • the signal PTHIN may contain the 8 beginning parts of paths corresponding to the 8 input nodes (left nodes in FIG. 4 ).
  • the signal INDIN generally carries the eight 2-bit index values that show where the maximums are achieved.
  • the signal INDIN may be a delayed version of the signal INDOUT as generated by the circuit 140 .
  • the signal PTHOUT may contain the calculated 8 beginning parts of the paths corresponding to 8 output nodes (right nodes in FIG. 4 ).
  • the signal PTHPAIR generally carries the 8 candidates for the bit pair presented by the decoder.
  • the apparatus 200 may implement a forward error correction decoder.
  • the circuit 200 generally comprises a circuit (or module) 202 and one or more circuits (or modules) 204 a to 204 d .
  • the circuits 202 to 204 d may represent one or more modules and/or blocks that may be implemented as hardware, firmware, software, a combination of hardware, firmware and/or software, or other implementations.
  • An input signal (e.g., DATA) may be received by the circuit 202 .
  • the signal DATA may carry one or more encoded codewords.
  • a read address signal (e.g., R_A_ADR) may be received by the circuits 204 a (shown) and the circuit 204 b .
  • a write address signal (e.g., W_A_ADR) may be received by the circuits 204 a and 204 b (shown).
  • a read address signal (e.g., R_P_ADR) may be received by the circuits 204 c (shown) and 204 d .
  • a write address signal (e.g., W_P_ADR) may be received by the circuits 204 c and 204 d (shown).
  • An output signal (e.g., MAXPAIR) may be generated by the circuit 202 .
  • An output signal (e.g., MAXADR) may also be generated by the circuit 202 .
  • Each circuit 204 a to 204 d implements a memory circuit.
  • the circuits 204 a and 204 b may be operational to store state metrics values during the iterations.
  • Circuits 204 c and 204 d may store the path data during the iterations.
  • each circuit 204 a to 204 d may be implemented as a separate memory circuit.
  • two or more of the circuits 204 a to 204 d may be formed in a common memory circuit.
  • Other memory arrangements may be implemented to meet the criteria of a particular application.
  • the circuit 204 a may have 4 memory banks. Circuit 204 a may be used to store state metrics values.
  • the signal R_A_ADR may be a read address that successively changes from 0 to 31. The 2 most significant bits of the signal R_A_ADR may identify the numbers (e.g., 00, 01, 10, 11) of the memory banks. In some embodiments, the signal R_A_ADR may have a width of 5 bits.
  • the circuit 204 b may also have 4 memory banks. Circuit 204 b may be similar to the circuit 204 a .
  • the signal W_A_ADR may be a write address.
  • the permuted state metrics may be written from the registers R1 to R4 to all 4 memory banks simultaneously.
  • the write addresses of all memory banks may be the same address.
  • the signal W_A_ADR may have a width of 3 bits.
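The address split described for the state metrics memories can be sketched as follows. Treating the 3 least significant bits of the read address as the word address inside the selected bank is an assumption inferred from the 3-bit write address; the function name is hypothetical.

```python
from typing import Tuple

def split_read_address(r_a_adr: int) -> Tuple[int, int]:
    """Split a 5-bit read address (0..31) into (bank number, word address in the bank).

    The 2 most significant bits select one of 4 banks; the remaining 3 bits are assumed
    to address one of 8 words in that bank.
    """
    if not 0 <= r_a_adr <= 31:
        raise ValueError("R_A_ADR is a 5-bit address")
    bank = r_a_adr >> 3
    word = r_a_adr & 0b111
    return bank, word

if __name__ == "__main__":
    for adr in (0, 7, 8, 31):
        print(adr, split_read_address(adr))
    # 32 words of 8 state metrics each cover the 256 states of the example code.
    print(32 * 8)
```

Reads walk through the banks one word per cycle, whereas a write presents the same 3-bit address to all four banks at once, so four words are stored per write.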
  • the circuit 204 c may have 4 memory banks. Circuit 204 c may store paths corresponding to the nodes of the trellis 100 .
  • the signal R_P_ADR may be a read address that successively changes from 0 to 31.
  • the 2 most significant bits of the signal R_P_ADR may identify the numbers (e.g., 00, 01, 10, 11) of the memory banks.
  • the signal R_P_ADR may have a width of 5 bits.
  • the circuit 204 d may also have 4 memory banks. Circuit 204 d may be similar to the circuit 204 c .
  • the signal W_P_ADR may be a write address.
  • the permuted paths may be written from the registers R7 to R10 to all 4 memory banks simultaneously.
  • the write addresses of all memory banks may be the same address.
  • the signal W_P_ADR may have a width of 3 bits.
  • Circuit 202 generally comprises the circuit 140 , the circuit 190 , a circuit (or module) 206 , a circuit (or module) 208 , a circuit (or module) 210 , a circuit (or module) 212 and multiple registers (or modules) R0 to R10.
  • the circuits 206 to 212 and the registers R0 to R10 may represent one or more modules and/or blocks that may be implemented as hardware, firmware, software, a combination of hardware, firmware and/or software, or other implementations.
  • the circuit 206 may implement a branch metric calculation circuit.
  • the circuit 206 may be operational to calculate branch metrics for the codewords received in the signal DATA.
  • the branch metrics may be presented in the signal GIN.
  • the circuit 206 may be implemented by a common design.
  • the circuit 210 may implement a state metrics permutator circuit. Circuit 210 may be operational to permute the state metrics values as described for scheme 160 in FIG. 7 . The permuted state metrics may be stored in the circuits 204 a and 204 b alternately during even stages and odd stages.
  • the circuit 212 may implement a path permutator circuit. Circuit 212 may be operational to permute the paths in a manner similar to the circuit 210 .
  • the permuted paths may be stored in the circuits 204 c and 204 d alternately during even stages and odd stages.
  • the registers R0 to R10 may implement multi-bit register circuits.
  • Register R0 may buffer state metrics values from the circuits 204 a and 204 b to the circuit 140 .
  • Registers R1, R2 and R3 may buffer state metrics values from the register R4 to the circuit 210 .
  • the register R4 may buffer state metrics values from the circuit 140 to the circuits 208 and 210 and the registers R1, R2 and R3.
  • the registers R0, R1, R2, R3 and R4 may implement (8×Aw)-bit registers, where Aw may be a width of each state metrics value.
  • the register R5 may buffer index values from the circuit 140 to the circuit 190 .
  • the index values may be received from the circuit 140 in the signal INDOUT.
  • the index values may be transferred to the circuit 190 in the signal INDIN.
  • Register R5 generally has a width of 2 bits per index value.
  • Register R6 may buffer path data from the circuits 204 c and 204 d to the circuit 190 .
  • the registers R7, R8 and R9 may buffer paths from the register R10 to the circuit 212 .
  • Register R10 may buffer paths from the circuit 190 to the circuit 212 and the registers R7, R8 and R9.
  • Registers R6, R7, R8, R9 and R10 may implement (8×Pw)-bit registers, where Pw is a width of each path.
  • a codeword of length 2×N may be received by the circuit 200 in the signal DATA.
  • FIG. 11 generally illustrates an odd stage.
  • an initial set of branch metrics may be ready to present from the circuit 206 to the circuit 140 at the moment t0+d.
  • one or more read control signals may be presented to the circuit 204 a with the signal R_A_ADR set to a zero address.
  • an initial set of state metrics may be transferred from the circuit 204 a to the register R0.
  • the set of state metrics buffered in the register R0 may be transferred to the circuit 140 .
  • Register R0 may always be enabled.
  • the signal R_A_ADR may be incremented by 1 and a new portion of the state metrics may be received by the circuit 140 .
  • the circuit 140 is generally implemented as a full logic circuit (e.g., combinational hardware logic only). Therefore at the moment t0+d, the output state metrics may be presented from the circuit 140 to the register R4. Register R4 may always be enabled. At the moment t0+d+1, an enable port of register R1 may be asserted (e.g., enable state) and the initial portion of the state metrics may be stored to the register R1. At the moment t0+d+2, the enable port of the register R2 may be enabled and the next portion of the state metrics are generally stored to the register R2. At the moment t0+d+3, an enable port of register R3 is asserted and another portion of the state metrics may be stored to the register R3.
  • the circuit 210 may receive the 4 portions of state metrics at the same time from the registers R1 to R4. The circuit 210 may write the 4 permuted portions (e.g., successive state metrics values) into the 4 memory banks of the circuit 204 b . Therefore, at the moment t0+d+4, one or more write control signals should be presented to the circuit 204 b with the signal W_A_ADR set to the zero address.
  • the write control signals to the circuit 204 b may be asserted, the signal W_A_ADR may be incremented by 1 and a new portion from among the 32 state metrics values may be written in the 4 memory banks of the circuit 204 b.
  • the signal INDOUT may transfer index values from the circuit 140 to the register R5.
  • Register R5 may always be enabled.
  • the index values may be presented from the register R5 to the circuit 190 in the signal INDIN. Therefore, at the moment t0+d−1, the read control signals may be presented to the circuit 204 c with the signal R_P_ADR set to the zero address.
  • an initial set of paths may be transferred from the circuit 204 c to the register R6.
  • the initial set of paths may be transferred from the register R6 to the circuit 190 .
  • Register R6 may always be enabled. In each subsequent clock cycle, the signal R_P_ADR may be incremented by 1 and a new portion of the paths is presented to the circuit 190 .
  • the circuit 190 is generally implemented as a full logic circuit (e.g., combinational hardware logic only). Therefore, at the moment t0+d+1, the output paths may be presented from the circuit 190 to the register R10.
  • Register R10 may always be enabled.
  • the enable port of register R7 may be asserted and the initial portion of the paths is stored in the register R7.
  • the enable port of register R8 may be asserted and a next portion of the paths is stored in the register R8.
  • the enable port of register R9 may be enabled and another portion of the paths may be stored in the register R9.
  • the 4 portions of the paths may be transferred from the registers R7 to R10 to the circuit 212 in parallel.
  • the circuit 212 may write the 4 permuted portions (e.g., successive paths) into the 4 memory banks of the circuit 204 d simultaneously. Therefore, at the moment t0+d+5, the write control signals may be received by the circuit 204 d with the signal W_P_ADR set to the zero address.
  • the write control signals to the circuit 204 d may be asserted, the signal W_P_ADR may be incremented by 1 and a new portion from among the 32 paths may be written in the 4 memory banks of the circuit 204 d.
  • State metrics values and paths may be received by the circuit 208 beginning at the moment t0+d+1.
  • the signal R_A_ADR may become 31 and a last portion of the state metrics may be read from the circuit 204 a .
  • a last portion of the state metrics may be written to the circuit 204 b .
  • reads for the next stage may be started. Therefore, with a 2 clock cycle pause (delay) in each stage, a maximum of 34 clock cycles (2+256/8 clock cycles) may be used per stage.
  • reading from circuit 204 b for the next stage may begin at the moment t0+d+30 because the initial portion of the state metrics of the next stage is ready in the circuit 204 b . Therefore, the number of clock cycles (iterations) per stage may be reduced from 34 to 33.
  • the signal R_A_ADR becomes 31 and the last portion of the state metrics of the current stage may be read from the circuit 204 a .
  • the signal R_A_ADR may be set to the zero address and presented to the circuit 204 b .
  • a last portion of the state metrics of the current stage may be written to the circuit 204 b .
  • the initial portion of state metrics of next stage may be transferred from the circuit 204 b to the register R0 and the signal R_A_ADR may be incremented.
  • the signal MAXADR may contain q1q2 . . . q8.
  • the last iteration has been written to the circuits 204 b and 204 d .
  • FIGS. 3 , 5 - 7 and 9 - 11 may be implemented using one or more of a conventional general purpose processor, digital computer, microprocessor, microcontroller, RISC (reduced instruction set computer) processor, CISC (complex instruction set computer) processor, SIMD (single instruction multiple data) processor, signal processor, central processing unit (CPU), arithmetic logic unit (ALU), video digital signal processor (VDSP) and/or similar computational machines, programmed according to the teachings of the present specification, as will be apparent to those skilled in the relevant art(s).
  • the present invention may also be implemented by the preparation of ASICs (application specific integrated circuits), Platform ASICs, FPGAs (field programmable gate arrays), PLDs (programmable logic devices), CPLDs (complex programmable logic devices), sea-of-gates, RFICs (radio frequency integrated circuits), ASSPs (application specific standard products), one or more monolithic integrated circuits, one or more chips or die arranged as flip-chip modules and/or multi-chip modules or by interconnecting an appropriate network of conventional component circuits, as is described herein, modifications of which will be readily apparent to those skilled in the art(s).
  • the present invention thus may also include a computer product which may be a storage medium or media and/or a transmission medium or media including instructions which may be used to program a machine to perform one or more processes or methods in accordance with the present invention.
  • Execution of instructions contained in the computer product by the machine, along with operations of surrounding circuitry may transform input data into one or more files on the storage medium and/or one or more output signals representative of a physical object or substance, such as an audio and/or visual depiction.
  • the storage medium may include, but is not limited to, any type of disk including floppy disk, hard drive, magnetic disk, optical disk, CD-ROM, DVD and magneto-optical disks and circuits such as ROMs (read-only memories), RAMs (random access memories), EPROMs (erasable programmable ROMs), EEPROMs (electrically erasable programmable ROMs), UVPROM (ultra-violet erasable programmable ROMs), Flash memory, magnetic cards, optical cards, and/or any type of media suitable for storing electronic instructions.
  • the elements of the invention may form part or all of one or more devices, units, components, systems, machines and/or apparatuses.
  • the devices may include, but are not limited to, servers, workstations, storage array controllers, storage systems, personal computers, laptop computers, notebook computers, palm computers, personal digital assistants, portable electronic devices, battery powered devices, set-top boxes, encoders, decoders, transcoders, compressors, decompressors, pre-processors, post-processors, transmitters, receivers, transceivers, cipher circuits, cellular telephones, digital cameras, positioning and/or navigation systems, medical equipment, heads-up displays, wireless devices, audio recording, storage and/or playback devices, video recording, storage and/or playback devices, game platforms, peripherals and/or multi-chip modules.
  • Those skilled in the relevant art(s) would understand that the elements of the invention may be implemented in other types of devices to meet the criteria of a particular application.
  • the signals illustrated in FIGS. 3 , 5 and 9 - 11 represent logical data flows.
  • the logical data flows are generally representative of physical data transferred between the respective blocks by, for example, address, data, and control signals, and/or busses.
  • the system represented by the circuit 100 may be implemented in hardware, software or a combination of hardware and software according to the teachings of the present disclosure, as would be apparent to those skilled in the relevant art(s).
  • the term “simultaneously” is meant to describe events that share some common time period but the term is not meant to be limited to events that begin at the same point in time, end at the same point in time, or have the same duration.

Abstract

A method for forward error correction decoding. The method generally includes steps (A) to (D). Step (A) may calculate a plurality of metrics of a codeword using a forward error correction process on a trellis having a plurality of stages. Step (B) may update the metrics over each of the stages. Step (C) may permute the metrics in each of the stages. Step (D) may generate a signal carrying a plurality of decoded bits of the codeword.

Description

  • This application relates to U.S. Ser. No. 13/158,636, filed Jun. 13, 2011, which claims the benefit of Russian Application No. 2010149150, filed Dec. 2, 2010, each of which is incorporated by reference in their entirety.
  • The present application is related to co-pending Russian Application No. 2010148337 filed Nov. 29, 2010, and U.S. application Ser. No. 13/156,580 filed Jun. 9, 2011 which are hereby incorporated by reference in their entirety.
  • FIELD OF THE INVENTION
  • The present invention relates to forward error correction codes generally and, more particularly, to a method and/or apparatus for implementing radix-4 Viterbi forward error correction decoding.
  • BACKGROUND OF THE INVENTION
  • Wireless standards make extensive use of convolutional codes. Viterbi decoding often forms part of common convolutional decoders. The original Viterbi process, described in the late 1960's, has been overlooked in favor of less complex Viterbi processes.
  • The original derivation of the Viterbi process was in the probability domain. The output of the process is a sequence of decoded bits along with corresponding reliabilities. “Soft” reliability information is described by the A Posteriori Probability (APP) (i.e., P(u|y)). For an estimate of bit u (−1/+1) having received symbol y, an optimum soft output (i.e., L(u)) is calculated according to formula 1 as follows:
  • $$L(u) = \ln\left(\frac{P(u = +1 \mid y)}{P(u = -1 \mid y)}\right) \qquad (1)$$
  • The parameter L(u) is called a Log-Likelihood Ratio (LLR). The LLR value is a convenient measure that encapsulates both soft and hard bit information in a single number. The sign of the number corresponds to the hard decision while the magnitude gives a reliability estimate.
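  • The relationship between the LLR, the hard decision and the reliability may be illustrated with a minimal Python sketch. The function names `llr` and `hard_decision` and the tie-break at zero are assumptions made for illustration only and do not appear in the specification.

```python
import math

def llr(p_plus: float) -> float:
    """Formula 1: L(u) = ln(P(u=+1|y) / P(u=-1|y)), with P(u=-1|y) = 1 - P(u=+1|y)."""
    if not 0.0 < p_plus < 1.0:
        raise ValueError("p_plus must lie strictly between 0 and 1")
    return math.log(p_plus / (1.0 - p_plus))

def hard_decision(l_u: float) -> int:
    """The sign of the LLR gives the hard bit in {-1, +1}.
    Mapping an LLR of exactly zero to +1 is an arbitrary choice made here."""
    return 1 if l_u >= 0 else -1

# The magnitude |L(u)| serves as the reliability estimate.
value = llr(0.9)
print(hard_decision(value), abs(value))
```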
  • SUMMARY OF THE INVENTION
  • The present invention concerns a method for forward error correction decoding. The method generally includes steps (A) to (D). Step (A) may calculate a plurality of metrics of a codeword using a forward error correction process on a trellis having a plurality of stages. Step (B) may update the metrics over each of the stages. Step (C) may permute the metrics in each of the stages. Step (D) may generate a signal carrying a plurality of decoded bits of the codeword.
  • The objects, features and advantages of the present invention include providing radix-4 Viterbi forward error correction decoding that may (i) support multiple communications standards, (ii) share state metrics and branch metrics calculators between Viterbi decoding and turbo decoding, (iii) share schemes and parts between convolutional codes and turbo codes, (iv) permute state metrics and paths prior to buffering in memory and/or (v) compute state metrics and branch metrics in a single clock cycle.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • These and other objects, features and advantages of the present invention will be apparent from the following detailed description and the appended claims and drawings in which:
  • FIG. 1 is a diagram of an example trellis for a convolutional code;
  • FIG. 2 is a diagram of an example closest path through the trellis;
  • FIG. 3 is a block diagram of an add-compare-select circuit;
  • FIG. 4 is a diagram of fragments of the trellis;
  • FIG. 5 is a block diagram of a state metrics calculator circuit;
  • FIG. 6 is a diagram of four successive clock cycles of work of the state metrics calculator circuit;
  • FIG. 7 is a block diagram of a scheme to permute the state metrics;
  • FIG. 8 is a diagram of a portion of the trellis;
  • FIG. 9 is a block diagram of a calculate path circuit;
  • FIG. 10 is a block diagram of a path calculation circuit; and
  • FIG. 11 is a block diagram of an apparatus in accordance with a preferred embodiment of the present invention.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • Some embodiments of the present invention generally concern a reconfigurable chip (or die) for decoding an encoded signal in accordance with two or more wireless communication standards. The wireless communications standards may include, but are not limited to, a Long Term Evolution (LTE) standard (3GPP Release 8), an Institute of Electrical and Electronics Engineering (IEEE) 802.16 standard (WiMAX), a Wideband-CDMA/High Speed Packet Access (WCDMA/HSPA) standard (3GPP Release 7) and a CDMA-2000/Ultra Mobile Broadband (UMB) standard (3GPP2). Other wired and/or wireless communications standards may be implemented to meet the criteria of a particular application.
  • Some embodiments may provide a Forward Error Correcting (FEC) decoder. The FEC decoder generally includes a radix-4 turbo decoder that uses existing branch and state metrics calculators for the Viterbi process. The FEC decoder generally performs at a high speed and occupies a small silicon area. For a codeword of length K, a processing time of the FEC decoder may be $C \times 2^m \times K$ clock cycles, where m is the constraint length and C may be a constant (e.g., approximately 1/16). For example, the value C may be 33/512 for convolutional codes with a constraint length of 8 (e.g., 256 states). The FEC decoder may support the convolutional codes and the turbo codes from multiple wireless communication standards, including but not limited to, LTE, WiMAX, W-CDMA, and CDMA2000. The FEC decoder may decode codewords compliant with the various communications standards while operating in different configurations.
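  • As a rough check of the processing time expression, the following Python sketch evaluates C times 2 to the power m times K for the constraint length 8 example. The function name `decoder_cycles` and the codeword length of 1000 are illustrative assumptions and are not taken from the specification.

```python
from fractions import Fraction

def decoder_cycles(k: int, m: int = 8, c: Fraction = Fraction(33, 512)) -> Fraction:
    """Processing time in clock cycles: C * 2^m * K.
    Defaults follow the example in the text (m = 8, C = 33/512)."""
    return c * 2 ** m * k

# With m = 8 the expression reduces to 16.5 clock cycles per codeword bit.
print(decoder_cycles(1000))
```

  • Exact fractions are used here so that the ratio 33/512 is not rounded before the multiplication.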
  • The Viterbi process may be considered in a logarithmic domain. The decoding process, in native form, may be challenging to implement because of the exponentiation and multiplication. By implementing the process in the logarithmic domain, the multiplications generally become additions and the exponentials generally disappear. Additions may be transformed according to standard rules. The additions are generally replaced using the Jacobi logarithm according to formula 2 as follows:

  • $$\operatorname{max^*}(x, y) = \ln\left(e^{x} + e^{y}\right) = \max(x, y) + \ln\left(1 + e^{-|x - y|}\right) \qquad (2)$$
  • The Jacobi logarithm may be called a “max*” operation denoting essentially a maximum operator adjusted by a correction factor. The max* operation is generally used in the Maximum A Posteriori (MAP) process. In the Viterbi process, a maximum operation (e.g., max(x,y)) may be used.
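  • The identity in formula 2 may be checked numerically with a short Python sketch. The function name `max_star` is an assumption chosen for illustration.

```python
import math

def max_star(x: float, y: float) -> float:
    """Jacobi logarithm: max(x, y) plus the correction term ln(1 + e^(-|x - y|))."""
    return max(x, y) + math.log1p(math.exp(-abs(x - y)))

# The direct form ln(e^x + e^y) and the max-plus-correction form agree.
direct = math.log(math.exp(1.0) + math.exp(2.0))
print(max_star(1.0, 2.0), direct)
```

  • The correction term shrinks toward zero as |x−y| grows, which is why the plain maximum operation may stand in for max* in the Viterbi process.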
  • Referring to FIG. 1, a diagram of an example trellis 100 for a convolutional code is shown. The Viterbi process is generally based on the trellis 100. The process may be performed on a block of K received symbols that correspond to the trellis 100 having a finite number of K stages. A transmitted bit u may be chosen from a set {−1, +1}. Branch metrics (e.g., γ) and forward state metrics (e.g., α) are generally shown in the trellis 100. The forward state metrics α may also be called path metrics. The example illustrated generally shows only 4 states in the trellis 100. In a convolutional code, the trellis 100 may have more states (e.g., usually 256 or 64 states).
  • The Viterbi process is essentially a largest path process. Basically, a coded sequence of bits U0, U1, U2, . . . may correspond to a path through an encoder trellis. Due to noise in the channel, a received sequence (e.g., r) may not correspond exactly to a path through the encoder trellis. The decoder generally finds a path through the trellis 100 that is closest to the received sequence r, where the measure of “closest” may be determined by the likelihood function appropriate for the channel.
  • Referring to FIG. 2, a diagram of an example closest path 110 through the trellis 100 is shown. The closest path 110 (solid line) generally corresponds to a true sequence of the transmitted bits. Other paths may exist early in the decoding, but are usually eliminated after several iterations.
  • Consider a case involving a convolutional code with rate 1/3. At each clock cycle, an input of the radix-4 decoder may receive six soft values (e.g., $Z_1^{(1)}$, $Z_2^{(1)}$, $Z_1^{(2)}$, $Z_2^{(2)}$, $Z_1^{(3)}$ and $Z_2^{(3)}$). A branch metric for edge e in the radix-4 Viterbi process for rate 1/3 convolutional code may be computed by formula 3 as follows:
  • $$\gamma(e) = \sum_{i=1}^{3} \left( (-1)^{u_1^{(i)}} Z_1^{(i)} + (-1)^{u_2^{(i)}} Z_2^{(i)} \right) \qquad (3)$$
  • where $u_1^{(i)}$, $u_2^{(i)}$ may be parity bits associated with the edge e. The forward state metrics are recursively calculated and stored per formulae 4 and 5 as follows:
  • $$\alpha_0(s) = \begin{cases} 0, & s = 0 \\ -\infty, & s \neq 0 \end{cases} \qquad (4)$$
  • $$\alpha_{t+1}(s) = \max_{s' \xrightarrow{e} s} \left\{ \alpha_t(s') + \gamma(e) \right\}, \quad t = 0, 1, \ldots, K-2 \qquad (5)$$
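  • Formulae 3 to 5 may be sketched in Python as follows. The function names, the representation of the trellis as a list of (previous state, next state, branch metric) tuples and the two-state toy trellis in the usage lines are assumptions made for illustration. The specification does not prescribe a software data structure.

```python
import math

def branch_metric(u1, u2, z1, z2):
    """Formula 3: parity bits u1[i], u2[i] are 0 or 1; z1[i], z2[i] are soft values."""
    return sum((-1) ** a * x + (-1) ** b * y
               for a, b, x, y in zip(u1, u2, z1, z2))

def initial_metrics(num_states):
    """Formula 4: state 0 starts at 0, every other state at minus infinity."""
    return [0.0] + [-math.inf] * (num_states - 1)

def forward_step(alpha, edges):
    """Formula 5: for each state keep the best (alpha of predecessor + gamma).
    edges is a list of (s_prev, s_next, gamma) tuples (hypothetical layout)."""
    nxt = [-math.inf] * len(alpha)
    for s_prev, s_next, gamma in edges:
        nxt[s_next] = max(nxt[s_next], alpha[s_prev] + gamma)
    return nxt

# Toy usage on a 2-state trellis with made-up branch metrics.
alpha = initial_metrics(2)
alpha = forward_step(alpha, [(0, 0, 1.0), (0, 1, -1.0), (1, 0, 5.0), (1, 1, 2.0)])
print(alpha)
```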
  • If a data stream is decoded using the Viterbi process as described above, the paths through the trellis 100 generally have as many stages as the codeword is long. For a long data stream, a significant amount of data may be stored since the decoder would have to store 2K paths and the path lengths grow longer with each stage. Furthermore, the long paths may result in a long decoding latency.
  • In many cases, a single surviving path (e.g., maximum likelihood path) some number of stages back from the “current” stage of the trellis 100 generally permits the decoding to be ended early. The initial stages of the survivor paths tend to merge if a sufficient decoding delay is allowed. Therefore, a “window” on the trellis 100 may be kept in memory. The window generally includes the current stage and some number of previous stages. The number of the previous stages that the decoding looks at to make a decision is called the decoding depth, denoted by L. At time t, the decoder may generate a decision on the code bits U(t−L). An incorrect decoding decision on a finite decoding depth, called a truncation error, is typically small if the decoding depth is sufficiently large. For example, if a decoding depth of about five to ten constraint lengths is employed, little loss of performance due to truncation error may be experienced compared to using the full length. If the constraint length is m, the number of states (e.g., S) may be $2^m$. Considering convolutional codes with constraint lengths m=6 and m=8, the decoding depth may be set to approximately 40. The Viterbi process modified with the window may be called a sliding window Viterbi process.
  • Referring to FIG. 3, a block diagram of an apparatus 120 is shown. The apparatus (or device or circuit) 120 may implement an Add-Compare-Select (ACS) circuit for state metrics calculations. The circuit 120 generally comprises multiple adders (or modules) 122 a to 122 d and a circuit (or module) 124. The circuits 122 a to 124 may represent one or more modules and/or blocks that may be implemented as hardware, firmware, software, a combination of hardware, firmware and/or software, or other implementations.
  • A signal (e.g., SM1) and a signal (e.g., BM1) may be received by the circuit 122 a. The circuit 122 b may receive a signal (e.g., SM2) and a signal (e.g., BM2). A signal (e.g., SM3) and a signal (e.g., BM3) may be received by the circuit 122 c. The circuit 122 d may receive a signal (e.g., SM4) and a signal (e.g., BM4). The signals SM1, SM2, SM3 and SM4 collectively may form an input signal (e.g., SMIN). The signals BM1, BM2, BM3 and BM4 collectively may form an input signal (e.g., BMIN). The circuit 124 may receive the sums from the circuits 122 a to 122 d. A signal (e.g., IND) may be generated by the circuit 124. The circuit 124 may also generate a signal (e.g., SMOUT).
  • The circuits 122 a to 122 d may implement adder circuits. Each circuit 122 a to 122 d may be operational to add a branch metric value and a respective state metric value. The sums may be the “add” portion of the add-compare-select operations.
  • The circuit 124 may implement a compare and select circuit. The circuit 124 is generally operational to compare the sum values calculated by the circuits 122 a to 122 d. The circuit 124 may also be operational to select a maximum sum value from among the sum values. The selected maximum sum value may be presented in the signal SMOUT as a new state metric value. The new state metric value may be computed per formula 6 as follows:
  • $$\mathrm{SMOUT} = \max_{i \in \{0, \ldots, 3\}} \left\{ \mathrm{SM}_i + \mathrm{BM}_i \right\} \qquad (6)$$
  • An index value i ∈ {0, . . . , 3} of the selected maximum sum value may be presented in the signal IND. A width of the signal IND may be 2 bits.
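  • The add-compare-select operation of formula 6 may be modeled in software with the following minimal Python sketch. The function name `acs` and the choice of the lowest index when several sums tie are assumptions. The specification does not state how ties are resolved.

```python
def acs(sm, bm):
    """Add-compare-select over four (state metric, branch metric) pairs.
    Returns (SMOUT, IND): the maximum sum and its 2-bit index 0..3."""
    if len(sm) != 4 or len(bm) != 4:
        raise ValueError("expected four state metrics and four branch metrics")
    sums = [s + b for s, b in zip(sm, bm)]                   # the "add" step
    ind = max(range(4), key=lambda i: sums[i])               # compare; first maximum wins
    return sums[ind], ind                                    # select

smout, ind = acs([1, 2, 3, 4], [4, 1, 3, 0])
print(smout, format(ind, "02b"))
```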
  • Referring to FIG. 4, a diagram of fragments of the trellis 100 is shown. The fragments may be called a fragment 130 a and a fragment 130 b. Since the number of states of an encoder for turbo codes is generally fixed (e.g., 8 states), a state metrics calculator of a decoder for turbo codes may process all of the state metrics simultaneously. A normal state metrics calculator for turbo codes may be implemented as part of the Viterbi process in some embodiments of the present invention. Therefore, simultaneous processing of the states may involve processing half of the states per fragment 130 a and the other half per fragment 130 b.
  • Referring to FIG. 5, a block diagram of an apparatus 140 is shown. The apparatus (or device or circuit) 140 may implement a State Metrics Calculator (SMC) circuit. The apparatus 140 generally comprises multiple circuits (or modules) 142 a to 142 h. Each circuit 142 a to 142 h may be a copy of the circuit 120. Each circuit 142 a to 142 h may represent one or more modules and/or blocks that may be implemented as hardware, firmware, software, a combination of hardware, firmware and/or software, or other implementations.
  • An input signal (e.g., AIN) may be received by the circuit 140. An input signal (e.g., GIN) may also be received by the circuit 140. The circuit 140 may generate a signal (e.g., INDOUT). An output signal (e.g., AOUT) may be generated and presented by the circuit 140.
  • The signal AIN may convey multiple (e.g., 8) input state metrics values (e.g., A1 to A8). The input state metrics values A1 to A8 may correspond to the 8 left nodes of the fragments 130 a and 130 b. The input state metrics values A1 to A8 may be divided into several (e.g., 2) groups. A group A1IN generally includes the input state metrics values A1 to A4. Another group A2IN may include the input state metrics values A5 to A8. The group A1IN may be received by the circuits 142 a to 142 d. The group A2IN may be received by the circuits 142 e to 142 h.
  • The signal GIN may carry multiple (e.g., 32) input branch metrics values corresponding to the edges of the fragments 130 a and 130 b of the trellis 100. The input branch metrics values may be divided into several (e.g., 8) groups. A group G1 may carry multiple (e.g., 4) input branch metrics values (e.g., γ1 to γ4) to the circuit 142 a. A group G2 may carry multiple input branch metrics values (e.g., γ5 to γ8) to the circuit 142 b, and so on. A group G8 may carry multiple input branch metrics values (e.g., γ29 to γ32) to the circuit 142 h.
  • Each circuit 142 a to 142 h may generate a corresponding version of the signal SMOUT. The signals SMOUT may carry output state metrics values (e.g., A1OUT to A8OUT). The output state metrics values A1OUT to A8OUT may correspond to the 8 right nodes of the fragments 130 a and 130 b. Collectively, the output state metrics values A1OUT to A8OUT may form the signal AOUT.
  • Each circuit 142 a to 142 h may generate a corresponding version of the signal IND. The signals IND may carry pairs of index values (e.g., IND1 to IND8). Each pair of index values IND1 to IND8 generally identify where a maximum may be achieved. Collectively, the index values IND1 to IND8 may form the signal INDOUT.
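  • The grouping of the inputs of the circuit 140 may be modeled with the following Python sketch, in which eight add-compare-select units share two groups of four state metrics and each unit receives its own group of four branch metrics. The function names and the list-based signal layout are assumptions for illustration, and the sketch models behavior only, not the single-cycle combinational hardware.

```python
def acs(sm, bm):
    """One add-compare-select unit: returns (maximum sum, index of the maximum)."""
    sums = [s + b for s, b in zip(sm, bm)]
    ind = max(range(len(sums)), key=lambda i: sums[i])
    return sums[ind], ind

def state_metrics_calculator(ain, gin):
    """ain: 8 input state metrics (A1..A8); gin: 32 branch metrics (G1..G8, 4 each).
    Units 0..3 (142a-142d) use A1..A4; units 4..7 (142e-142h) use A5..A8."""
    if len(ain) != 8 or len(gin) != 32:
        raise ValueError("expected 8 state metrics and 32 branch metrics")
    aout, indout = [], []
    for k in range(8):
        group = ain[0:4] if k < 4 else ain[4:8]
        smout, ind = acs(group, gin[4 * k:4 * k + 4])
        aout.append(smout)
        indout.append(ind)
    return aout, indout

aout, indout = state_metrics_calculator(list(range(8)), [0] * 32)
print(aout, indout)
```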
  • Referring to FIG. 6, a diagram 150 of four successive clock cycles of work of the circuit 140 is shown. A record q1q2 . . . qm may denote the state metrics value α(q), where q=q1q2 . . . qm. The left side of diagram 150 may be the input state metrics values and the right side may be the output state metrics values. An initial iteration of the clock cycle may occur at a time t=t′. The next iteration may occur at a time t=t′+1. Another iteration may occur at a time t=t′+2. The final iteration may occur at a time t=t′+3. As illustrated in FIG. 6, the input values generally follow successively, but the output values do not follow consecutively.
  • Since 8 successive input state metrics values are processed at the same time, the 8 input state values may be read from a single memory word (e.g., at single address or block of consecutive addresses). Since the 8 output state metrics are not consecutive, the calculated output state metrics should not be written to a single memory word because in the next iteration, the written information may be read in the successive manner. Therefore, the values obtained for the moment t=t′ may be stored in a register (e.g., R1), the values obtained for the moment t=t′+1 may be stored in another register (e.g., R2), the values obtained for the moment t=t′+2 may be stored in a register (e.g., R3), and the values obtained for the moment t=t′+3 may be stored in a register (e.g., R4). After 4 clock cycles, the registers R1, R2, R3, R4 may buffer the information from 4 successive pieces of the state metrics.
  • Referring to FIG. 7, a block diagram of a scheme 160 to permute the state metrics is shown. The scheme 160 may comprise a state metrics permutator (A_P) scheme. A record q1q2 . . . qm may denote the state metrics value α(q), where q=q1q2 . . . qm. The inputs of the scheme 160 may be the state metrics values stored to the registers R1, R2, R3 and R4. The outputs may be 4 pieces of successive state metrics. Each piece generally includes 8 values and may be written in a single memory word (e.g., a single address or block of consecutive addresses). Therefore, the 4 pieces of the state metrics values may be stored simultaneously in 4 memory banks operating in parallel. An identification number of each memory bank may be determined from a few bits (e.g., the two most significant bits) of the state metrics values. The identification numbers generally indicate which of the memory banks should receive the data. For example, all of the state metrics values in the signal A1 may have the two most-significant bits of “00” (e.g., indicating a memory bank 00), all of the state metrics values in the signal A2 may have the two most-significant bits of “01” (e.g., indicating a memory bank 01), and so on.
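  • The reason four clock cycles of non-consecutive outputs can be regrouped into four memory words of consecutive states may be illustrated with the Python sketch below. It assumes m=8 and assumes that each radix-4 step drops the two least significant bits of a state number and inserts two new bits at the most significant end, so that the four left nodes of a fragment share their upper m−2 bits. The function names and this state numbering are illustrative assumptions and are not a description of the actual scheme 160.

```python
M = 8  # constraint length assumed for this sketch (256 states)

def outputs_for_word(word_addr, m=M):
    """States produced in one clock cycle from the 8 consecutive input states
    stored in memory word word_addr (two fragments of 4 left nodes each)."""
    out = []
    for frag in range(2):
        prefix = (word_addr * 8 + frag * 4) >> 2    # upper m-2 bits shared by 4 left nodes
        for new_bits in range(4):                   # the 4 right nodes of the fragment
            out.append((new_bits << (m - 2)) | prefix)
    return out

def permute(registers, m=M):
    """Regroup the contents of R1..R4 by the two most significant state bits,
    which identify the destination memory bank."""
    banks = {bank: [] for bank in range(4)}
    for reg in registers:
        for state in reg:
            banks[state >> (m - 2)].append(state)
    return {bank: sorted(states) for bank, states in banks.items()}

registers = [outputs_for_word(w) for w in range(4)]   # four successive clock cycles
print(registers[0])          # outputs of one cycle are not consecutive
print(permute(registers))    # each bank receives 8 consecutive states
```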
  • In the sliding window Viterbi process, a given binary path may correspond to each state node of the trellis 100. The length of the given path may match the decoding depth L. If the constraint length m=8, the decoding depth L may be 40. A last part of the given path may be the state number, therefore only the beginning part of the given path may be stored. A length of the beginning part of the given path may be r=L−m. The stored beginning part of the given path may be denoted as $p_1 p_2 \ldots p_r$, where $p_1$ is the last bit and $p_r$ is the initial bit in the path.
  • Referring to FIG. 8, a diagram of a portion 170 of the trellis 100 is shown. Suppose that for each left node qi (i=1, 2, 3, 4), a corresponding path $p_1^i p_2^i \ldots p_r^i$ exists (the corresponding path may be the beginning part of the full path). Suppose the maximum of the state metrics values (see formulae 4 and 5) may be achieved in the node q3 (e.g., the shaded node). Therefore, the beginning part of the path corresponding to the node q (e.g., the right node) may be $10\,p_1^3 p_2^3 \ldots p_{r-2}^3$ and the full path may be $00\,q_1 \ldots q_{m-2}\,10\,p_1^3 p_2^3 \ldots p_{r-2}^3$. The pair of bits $p_{r-1}^3 p_r^3$ may be presented at that time if a global maximum for all state metrics is achieved in the node q3.
  • Referring to FIG. 9, a block diagram of an apparatus 180 is shown. The apparatus (or device or circuit) 180 may implement a Calculate Path (C_P) circuit. The apparatus 180 is generally operational to calculate a path corresponding to a node of the trellis 100. The apparatus 180 generally comprises a circuit (or module) 182. The circuit 182 may represent one or more modules and/or blocks that may be implemented as hardware, firmware, software, a combination of hardware, firmware and/or software, or other implementations.
  • The circuit 180 may receive a signal (e.g., PIN). Circuit 180 may also receive the signal IND. A signal (e.g., PAIR) may be generated and presented from the circuit 180. A signal (e.g., POUT) may be generated and presented from the circuit 180. The circuit 182 may receive multiple (e.g., 4) components within the signal PIN. The bits carried by the signal IND may be received by the circuit 182 at a selection port. The circuit 182 may route the components of the signal PIN to the signals POUT and PAIR. The signal POUT may contain the components selected by the circuit 182 and the bits from the signal IND. The signal PAIR may contain the components selected by the circuit 182.
  • The circuit 182 may implement a multiplexer circuit. Circuit 182 is generally operational to multiplex the components received in the signal PIN based on the bits received in the signal IND. The components of the signal PIN may be the several (e.g., 4) paths, each path corresponding to a respective node of the trellis 100 (e.g., the left nodes in FIG. 8). The signal IND may be generated by the corresponding circuit 142 a to 142 h. The bits of the signal IND generally show where a maximum is achieved among the paths (e.g., a most likely path). The signal POUT may identify a result path for the output node (e.g., the right nodes in FIG. 8). The signal PAIR may carry candidates (e.g., $p_{r-1} p_r$) to the pair of bits to be presented by the decoder.
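  • The selection performed by the calculate path circuit may be sketched in Python as follows. Paths are modeled as bit strings written p1 to pr from left to right, and the two IND bits are prepended most significant bit first, following the FIG. 8 example in which the third node yields the prefix 10. The function name `calculate_path` and the string representation are assumptions for illustration only.

```python
def calculate_path(pin, ind):
    """pin: four stored paths as bit strings p1..pr; ind: index 0..3 of the winner.
    Returns (POUT, PAIR): the new stored path and the two oldest bits dropped from it."""
    if len(pin) != 4 or not 0 <= ind <= 3:
        raise ValueError("expected four paths and an index in 0..3")
    selected = pin[ind]                          # multiplexer controlled by IND
    pout = format(ind, "02b") + selected[:-2]    # IND bits followed by p1..p(r-2)
    pair = selected[-2:]                         # p(r-1) p(r), candidate output bits
    return pout, pair

pout, pair = calculate_path(["0000", "0101", "1110", "1111"], 2)
print(pout, pair)
```

  • The stored path keeps a constant length r because two new bits enter at the front while the two oldest bits leave as the candidate pair.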
  • Referring to FIG. 10, a block diagram of an apparatus 190 is shown. The apparatus (or device or circuit) 190 may implement a path calculation circuit. The apparatus 190 generally comprises multiple circuits (or modules) 192 a to 192 h. Each circuit 192 a to 192 h may be a copy of the circuit 180. Each circuit 192 a to 192 h may represent one or more modules and/or blocks that may be implemented as hardware, firmware, software, a combination of hardware, firmware and/or software, or other implementations.
  • An input signal (e.g., PTHIN) may be received by the circuit 190. An input signal (e.g., INDIN) may also be received by the circuit 190. The circuit 190 may generate a signal (e.g., PTHOUT). An output signal (e.g., PTHPAIR) may be generated and presented by the circuit 190.
  • The signal PTHIN may convey multiple (e.g., 8) paths (e.g., P1 to P8). The paths P1 to P8 may correspond to the 8 left nodes of the fragments 130 a and 130 b. The paths P1 to P8 may be divided into several (e.g., 2) groups. A group (e.g., P1IN) generally includes the paths P1 to P4. Another group (e.g., P2IN) may include the paths P5 to P8. The group P1IN may be received by the circuits 192 a to 192 d. The group P2IN may be received by the circuits 192 e to 192 h.
  • The signal INDIN may carry the pairs of index values IND1 to IND8 generated by the circuit 140. The index values IND1 may be presented to the circuit 192 a. The index values IND2 may be presented to the circuit 192 b, and so on. The index values IND8 may be received by the circuit 192 h.
  • Each circuit 192 a to 192 h may generate a corresponding version of the signal POUT. Each signal POUT may carry a corresponding path (e.g., P1OUT to P8OUT). Collectively, the paths P1OUT to P8OUT may form the signal PTHOUT.
  • Each circuit 192 a to 192 h may generate a corresponding version of the signal PAIR. Each signal PAIR may carry a respective pair of bits (e.g., PAIR1 to PAIR8). Collectively, the pairs of bits PAIR1 to PAIR8 may form the signal PTHPAIR.
  • The circuit 190 may implement a path calculation circuit. The circuit 190 may be operational to calculate paths corresponding to 8 nodes of the trellis 100 simultaneously. The signal PTHIN may contain the 8 beginning parts of paths corresponding to the 8 input nodes (left nodes in FIG. 4). The signal INDIN generally carries the eight 2-bit index values that show where the maximums are achieved. The signal INDIN may be a delayed version of the signal INDOUT as generated by the circuit 140. The signal PTHOUT may contain the calculated 8 beginning parts of the paths corresponding to 8 output nodes (right nodes in FIG. 4). The signal PTHPAIR generally carries the 8 candidates for the bit pair presented by the decoder.
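  • The fan-out of the two path groups to the eight multiplexers can be sketched as follows. This is an illustrative model only: paths are assumed to be bit strings with the newest bits first, index values are assumed to count the inputs of each group from zero, and the helper names c_p and path_calculation are hypothetical.

```python
from typing import List, Tuple


def c_p(pin: List[str], ind: int) -> Tuple[str, str]:
    """One multiplexer copy (192 a to 192 h): select a path, split off a pair."""
    selected = pin[ind]
    return format(ind, "02b") + selected[:-2], selected[-2:]


def path_calculation(pthin: List[str], indin: List[int]) -> Tuple[List[str], List[str]]:
    """Model of the 8-way path calculation: returns (PTHOUT, PTHPAIR)."""
    if len(pthin) != 8 or len(indin) != 8:
        raise ValueError("expected 8 paths and 8 index values")
    p1in, p2in = pthin[:4], pthin[4:]        # groups P1IN and P2IN
    pthout, pthpair = [], []
    for k, ind in enumerate(indin):
        group = p1in if k < 4 else p2in      # first four copies see P1IN, last four P2IN
        pout, pair = c_p(group, ind)
        pthout.append(pout)
        pthpair.append(pair)
    return pthout, pthpair


paths = [format(i, "08b") for i in range(8)]
pthout, pthpair = path_calculation(paths, [0, 1, 2, 3, 0, 1, 2, 3])
print(pthout)
print(pthpair)
```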
  • Referring to FIG. 11, a block diagram of an apparatus 200 is shown in accordance with a preferred embodiment of the present invention. The apparatus (or device or circuit) 200 may implement a forward error correction decoder. The circuit 200 generally comprises a circuit (or module) 202 and one or more circuits (or modules) 204 a to 204 d. The circuits 202 to 204 d may represent one or more modules and/or blocks that may be implemented as hardware, firmware, software, a combination of hardware, firmware and/or software, or other implementations.
  • An input signal (e.g., DATA) may be received by the circuit 202. The signal DATA may carry one or more encoded codewords. A read address signal (e.g., R_A_ADR) may be received by the circuits 204 a (shown) and the circuit 204 b. A write address signal (e.g., W_A_ADR) may be received by the circuits 204 a and 204 b (shown). A read address signal (e.g., R_P_ADR) may be received by the circuits 204 c (shown) and 204 d. A write address signal (e.g., W_P_ADR) may be received by the circuits 204 c and 204 d (shown). An output signal (e.g., MAXPAIR) may be generated by the circuit 202. An output signal (e.g., MAXADR) may also be generated by the circuit 202.
  • The circuit 202 may implement a decoder circuit. Circuit 202 is generally operational to calculate a plurality of state metrics and a plurality of paths of a codeword using a forward error correction process on the trellis 100. If the codeword length K=2×N, the trellis 100 generally has N stages. Approximately 2^(m−3)+1 iterations (clock cycles) may be performed by the circuit 202 in each stage to update the state metrics and the paths, where m is the constraint length. The circuit 202 may also be operational to update the state metrics and paths over the N stages. Multiple sets of the state metrics and paths may be permuted in each of the N stages. When the initial iterations have been completed, the circuit 202 may present the initial two decoded bits in the signal MAXPAIR. The iterations may continue to present additional decoded bit pairs until the entire codeword has been decoded.
  • Each circuit 204 a to 204 d implements a memory circuit. The circuits 204 a and 204 b may be operational to store state metrics values during the iterations. Circuits 204 c and 204 d may store the path data during the iterations. In some embodiments, each circuit 204 a to 204 d may be implemented as a separate memory circuit. In other embodiments, two or more of the circuits 204 a to 204 d may be formed in a common memory circuit. Other memory arrangements may be implemented to meet the criteria of a particular application.
  • In some embodiments, the circuit 204 a may have 4 memory banks. Circuit 204 a may be used to store state metrics values. The width of each memory bank may be 8×Aw bits, where Aw is the width of each state metrics value (e.g., Aw=12). A size of each memory bank may be 256/(8×4)=8 addressable words. The signal R_A_ADR may be a read address that successively changes from 0 to 31. The 2 most significant bits of the signal R_A_ADR may identify the numbers (e.g., 00, 01, 10, 11) of the memory banks. In some embodiments, the signal R_A_ADR may have a width of 5 bits.
  • The circuit 204 b may also have 4 memory banks. Circuit 204 b may be similar to the circuit 204 a. The signal W_A_ADR may be a write address. The permuted state metrics may be written from the registers R1 to R4 to all 4 memory banks simultaneously. The write addresses of all memory banks may be the same address. The signal W_A_ADR may have a width of 3 bits.
  • In some embodiments, the circuit 204 c may have 4 memory banks. Circuit 204 c may store paths corresponding to the nodes of the trellis 100. The width of each memory bank may be 8×Pw bits, where Pw may be a width of each path (e.g., Pw=32). A size of each memory bank may be 256/(8×4)=8 addressable words. The signal R_P_ADR may be a read address that successively changes from 0 to 31. The 2 most significant bits of the signal R_P_ADR may identify the numbers (e.g., 00, 01, 10, 11) of the memory banks. In some embodiments, the signal R_P_ADR may have a width of 5 bits.
  • The circuit 204 d may also have 4 memory banks. Circuit 204 d may be similar to the circuit 204 c. The signal W_P_ADR may be a write address. The permuted paths may be written from the registers R7 to R10 to all 4 memory banks simultaneously. The write addresses of all memory banks may be the same address. The signal W_P_ADR may have a width of 3 bits.
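  • The memory geometry described above can be checked with a short sketch. The text states only that the 2 most significant bits of the 5-bit read address identify the memory bank; treating the remaining 3 bits as the word address inside the bank is an assumption made here for illustration, as are the constant and function names.

```python
BANKS = 4             # memory banks per circuit 204 a to 204 d
WORDS_PER_BANK = 8    # 256 / (8 * 4) addressable words
VALUES_PER_WORD = 8   # state metrics values (or paths) per word
AW = 12               # example width of one state metrics value, in bits
PW = 32               # example width of one path, in bits


def split_read_address(addr: int) -> tuple:
    """Split a 5-bit read address (R_A_ADR or R_P_ADR) into (bank, word)."""
    if not 0 <= addr <= 31:
        raise ValueError("read address must fit in 5 bits")
    bank = addr >> 3       # 2 most significant bits select the bank
    word = addr & 0b111    # assumed: 3 least significant bits select the word
    return bank, word


print(BANKS * WORDS_PER_BANK * VALUES_PER_WORD)   # states covered per memory
print(VALUES_PER_WORD * AW, VALUES_PER_WORD * PW)  # word widths in bits
print([split_read_address(a) for a in (0, 10, 31)])
```

  • With these example widths, each state metrics word is 96 bits wide and each path word is 256 bits wide, and the 3-bit write addresses W_A_ADR and W_P_ADR are enough to address the 8 words of every bank.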
  • Circuit 202 generally comprises the circuit 140, the circuit 190, a circuit (or module) 206, a circuit (or module) 208, a circuit (or module) 210, a circuit (or module) 212 and multiple registers (or modules) R0 to R10. The circuits 206 to 212 and the registers R0 to R10 may represent one or more modules and/or blocks that may be implemented as hardware, firmware, software, a combination of hardware, firmware and/or software, or other implementations.
  • The circuit 206 may implement a branch metric calculation circuit. The circuit 206 may be operational to calculate branch metrics for the codewords received in the signal DATA. The branch metrics may be presented in the signal GIN. The circuit 206 may produce 32 branch metrics corresponding to edges of fragments 130 a and 130 b in FIG. 4. Therefore, the width of the output branch metrics may be 32×Bw, where Bw is a width of each branch metrics. In some embodiments Bw=12. Details of an embodiment of the circuit 206 may be found in co-pending Russian Application No. 2010148337 and U.S. application Ser. No. 13/156,580, hereby incorporated by reference in their entirety. In some embodiments, the circuit 206 may be implemented by a common design.
  • The circuit 208 may implement a maximum selection circuit. Circuit 208 is generally operational to find maximums among all state metrics and present the found results in the signals MAXPAIR and MAXADR. In some embodiments, 256 state metrics may be implemented (for m=8). Signal MAXADR may carry the address (e.g., a number of the state) of the maximal state metrics. The width of the signal MAXADR is generally 8 bits. The signal MAXPAIR may contain pairs of bits (i) obtained by the circuit 190 and (ii) corresponding to state node with the number in the signal MAXADR. The width of the signal MAXPAIR may be 2 bits. The information in the signals MAXADR and MAXPAIR may be the decoded output generated by a Viterbi decoding process (or circuit) of the decoder.
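  • As a rough software analogue of the maximum selection, the sketch below scans all state metrics and returns the state number together with the candidate pair of that state. The function name select_maximum and the tie-breaking rule (the lowest state number wins) are assumptions; the specification does not say how ties are resolved or how the hardware search is organized.

```python
from typing import List, Tuple


def select_maximum(state_metrics: List[int], pairs: List[str]) -> Tuple[int, str]:
    """Return (MAXADR, MAXPAIR) for one set of state metrics.

    state_metrics -- one metric per state (256 entries for m=8)
    pairs         -- candidate 2-bit pair per state, as produced by path calculation
    """
    if len(state_metrics) != len(pairs) or not state_metrics:
        raise ValueError("need one candidate pair per state metric")
    # max() keeps the first maximal index, so ties go to the lowest state number.
    maxadr = max(range(len(state_metrics)), key=state_metrics.__getitem__)
    return maxadr, pairs[maxadr]


metrics = [0] * 256
metrics[201] = 5
candidate_pairs = [format(i % 4, "02b") for i in range(256)]
maxadr, maxpair = select_maximum(metrics, candidate_pairs)
print(format(maxadr, "08b"), maxpair)
```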
  • The circuit 210 may implement a state metrics permutator circuit. Circuit 210 may be operational to permute the state metrics values as described for scheme 160 in FIG. 7. The permuted state metrics may be stored in the circuits 204 a and 204 b alternately during even stages and odd stages.
  • The circuit 212 may implement a path permutator circuit. Circuit 212 may be operational to permute the paths in a manner similar to the circuit 210. The permuted paths may be stored in the circuits 204 c and 204 d alternately during even stages and odd stages.
  • The registers R0 to R10 may implement multi-bit register circuits. Register R0 may buffer state metrics values from the circuits 204 a and 204 b to the circuit 140. Registers R1, R2 and R3 may buffer state metrics values from the register R4 to the circuit 210. The register R4 may buffer state metrics values from the circuit 140 to the circuits 208 and 210 and the registers R1, R2 and R3. The registers R0, R1, R2, R3 and R4 may implement (8×Aw)-bit registers, where Aw may be a width of each state metrics value.
  • The register R5 may buffer index values from the circuit 140 to the circuit 190. The index values may be received from the circuit 140 in the signal INDOUT. The index values may be transferred to the circuit 190 in the signal INDIN. Register R5 generally has a width of 2 bits per index value.
  • Register R6 may buffer path data from the circuits 204 c and 204 d to the circuit 190. The registers R7, R8 and R9 may buffer paths from the register R10 to the circuit 212. Register R10 may buffer paths from the circuit 190 to the circuit 212 and the registers R7, R8 and R9. Registers R6, R7, R8, R9 and R10 may implement (8×Pw)-bit registers, where Pw is a width of each path.
  • The following example generally describes the functionality of the circuit 200 for a case where a constraint length m=8 and the number of states is S=2^m=256. A codeword of length 2×N may be received by the circuit 200 in the signal DATA. The parameter N generally means that the radix-4 trellis 100 may have N stages and the Viterbi decoding process may utilize N stages. Processing each stage generally involves 2^(m−3)+1 clock cycles (e.g., 33 clock cycles for m=8). If the numbers of the stages start from 1, in each odd stage, information may be read from the circuits 204 a and 204 c and written to the circuits 204 b and 204 d respectively. In each even stage, information is generally read from the circuits 204 b and 204 d and written to the circuits 204 a and 204 c respectively. FIG. 11 generally illustrates an odd stage.
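  • The alternation between the two memory pairs is a ping-pong scheme, which can be summarized in a few lines. The function name memories_for_stage and the string labels are illustrative only.

```python
def memories_for_stage(stage: int) -> dict:
    """Return which memory circuits are read and written in a given stage.

    Stages are numbered from 1, as in the example in the text.
    """
    if stage < 1:
        raise ValueError("stage numbers start from 1")
    if stage % 2 == 1:   # odd stage: read 204 a / 204 c, write 204 b / 204 d
        return {"read": ("204a", "204c"), "write": ("204b", "204d")}
    # even stage: the roles of the two memory pairs are swapped
    return {"read": ("204b", "204d"), "write": ("204a", "204c")}


for stage in (1, 2, 3):
    print(stage, memories_for_stage(stage))
```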
  • Consider some given odd stage. Let t0 be a beginning clock cycle of the given stage. Let d be a delay (latency) of the circuit 206 to calculate the branch metrics. In some embodiments, d=4 clock cycles. Hence, an initial set of branch metrics may be ready to present from the circuit 206 to the circuit 140 at the moment t0+d. At the moment t0+d−2, one or more read control signals may be presented to the circuit 204 a with the signal R_A_ADR set to a zero address. At the moment t0+d−1, an initial set of state metrics may be transferred from the circuit 204 a to the register R0. At the moment t0+d, the set of state metrics buffered in the register R0 may be transferred to the circuit 140. Register R0 may always be enabled. In each subsequent clock cycle, the signal R_A_ADR may be incremented by 1 and a new portion of the state metrics may be received by the circuit 140.
  • The circuit 140 is generally implemented as a full logic circuit (e.g., combinational hardware logic only). Therefore at the moment t0+d, the output state metrics may be presented from the circuit 140 to the register R4. Register R4 may always be enabled. At the moment t0+d+1, an enable port of register R1 may be asserted (e.g., enable state) and the initial portion of the state metrics may be stored to the register R1. At the moment t0+d+2, the enable port of the register R2 may be asserted and the next portion of the state metrics is generally stored to the register R2. At the moment t0+d+3, an enable port of register R3 is asserted and another portion of the state metrics may be stored to the register R3. At the moment t0+d+4, the circuit 210 may receive the 4 portions of state metrics at the same time from the registers R1 to R4. The circuit 210 may write the 4 permuted portions (e.g., successive state metrics values) into the 4 memory banks of the circuit 204 b. Therefore, at the moment t0+d+4, one or more write control signals should be presented to the circuit 204 b with the signal W_A_ADR set to the zero address.
  • The above operations may be repeated cyclically. For example, at each moment t0+d+(4×k), the write control signals to the circuit 204 b may be asserted, the signal W_A_ADR may be incremented by 1 and a new portion from among the 32 state metrics values may be written in the 4 memory banks of the circuit 204 b.
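  • The grouping of portions into bank writes can be pictured with the sketch below. It models only the data movement: 32 portions of 8 state metrics values are collected four at a time and each group is written to one address across the 4 memory banks. The actual permutation of scheme 160 (FIG. 7) is not reproduced here, so the default identity function is a placeholder assumption, and the names write_stage and identity are hypothetical.

```python
from typing import Callable, List, Sequence

Portion = List[int]  # eight state metrics values updated in one clock cycle


def identity(group: Sequence[Portion]) -> List[Portion]:
    """Placeholder for the permutator; NOT the permutation of scheme 160."""
    return list(group)


def write_stage(portions: Sequence[Portion],
                permute: Callable[[Sequence[Portion]], List[Portion]] = identity):
    """Return banks[bank][address] after all portions of a stage are written."""
    if len(portions) != 32:
        raise ValueError("expected 32 portions of 8 values each")
    banks = [[None] * 8 for _ in range(4)]
    for addr in range(8):                        # write address increments per group
        group = portions[4 * addr: 4 * addr + 4]  # contents of registers R1 to R4
        for bank, word in enumerate(permute(group)):
            banks[bank][addr] = word             # all 4 banks share the same address
    return banks


banks = write_stage([[i] * 8 for i in range(32)])
print(banks[2][1])
```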
  • At the moment t0+d, the signal INDOUT may transfer index values from the circuit 140 to the register R5. Register R5 may always be enabled. At the moment t0+d+1, the index values may be presented from the register R5 to the circuit 190 in the signal INDIN. Therefore, at the moment t0+d−1, the read control signals may be presented to the circuit 204 c with the signal R_P_ADR set to the zero address. At the moment t0+d, an initial set of paths may be transferred from the circuit 204 c to the register R6. At the moment t0+d+1, the initial set of paths may be transferred from the register R6 to the circuit 190. Register R6 may always be enabled. In each subsequent clock cycle, the signal R_P_ADR may be incremented by 1 and a new portion of the paths is presented to the circuit 190.
  • The circuit 190 is generally implemented as a full logic circuit (e.g., combinational hardware logic only). Therefore, at the moment t0+d+1, the output paths may be presented from the circuit 190 to the register R10. Register R10 may always be enabled. At the moment t0+d+2, the enable port of register R7 may be asserted and the initial portion of the paths is stored in the register R7. At the moment t0+d+3, the enable port of register R8 may be asserted and a next portion of the paths is stored in the register R8. At the moment t0+d+4, the enable port of register R9 may be asserted and another portion of the paths may be stored in the register R9. At the moment t0+d+5, the 4 portions of the paths may be transferred from the registers R7 to R10 to the circuit 212 in parallel. The circuit 212 may write the 4 permuted portions (e.g., successive paths) into the 4 memory banks of the circuit 204 d simultaneously. Therefore, at the moment t0+d+5, the write control signals may be received by the circuit 204 d with the signal W_P_ADR set to the zero address.
  • The above operations may be repeated cyclically. For example, at each moment t0+d+1+(4×k), the write control signals to the circuit 204 d may be asserted, the signal W_P_ADR may be incremented by 1 and a new portion from among the 32 paths may be written in the 4 memory banks of the circuit 204 d.
  • State metrics values and paths may be received by the circuit 208 beginning at the moment t0+d+1. At the moment t0+d+29, the signal R_A_ADR may become 31 and a last portion of the state metrics may be read from the circuit 204 a. After 2 additional clock cycles (e.g., at the moment t0+d+31), a last portion of the state metrics may be written to the circuit 204 b. After the moment t0+d+31, reads for the next stage may be started. Therefore, with a 2 clock cycle pause (delay) in each stage, a maximum of 34 clock cycles (2+256/8 clock cycles) may be used per stage.
  • Alternatively, reading from circuit 204 b for the next stage may begin at the moment t0+d+30 because the initial portion of the state metrics of the next stage is ready in the circuit 204 b. Therefore, the number of clock cycles (iterations) per stage may be reduced from 34 to 33. Returning to the moment t0+d+29, the signal R_A_ADR becomes 31 and the last portion of the state metrics of the current stage may be read from the circuit 204 a. At the moment t0+d+30, the signal R_A_ADR may be set to the zero address and presented to the circuit 204 b. At the moment t0+d+31, a last portion of the state metrics of the current stage may be written to the circuit 204 b. At the moment t0+d+32, the initial portion of state metrics of the next stage may be transferred from the circuit 204 b to the register R0 and the signal R_A_ADR may be incremented. At the moment t0+d+33, the initial portion of the state metrics of the next stage is generally transferred from the register R0 to the circuit 140. Therefore, the circuit 140 may begin processing of the next stage at the moment t0+d+33. Since the previous stage start of the circuit 140 occurred at the moment t0+d, the number of clock cycles of a stage is 33 for m=8, or 2^(m−3)+1 for the general case.
  • In even stages, the flow of information from the circuits 204 a and 204 c to the circuits 204 b and 204 d may be reversed. Information may be read from the circuits 204 b and 204 d to the registers R0 and R6 respectively, updated, and written from the circuits 210 and 212 into the circuits 204 a and 204 c. Therefore, processing time of a codeword of length K=2×N may be approximately 33×N=33×K/2 clock cycles. If N>20, the initial pair of bits of the decoded codeword may be presented from the circuit 200 after 33×20 clock cycles.
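  • The cycle counts quoted above follow from simple arithmetic, reproduced in the sketch below. The function names are hypothetical, and the depth of 20 stages before the first output is taken directly from the example in the text. The last line compares the resulting cycles per received symbol with 2^m to show why a constant of roughly 1/16 appears in the claims.

```python
def cycles_per_stage(m: int) -> int:
    """Clock cycles per trellis stage: 2^(m-3) + 1."""
    return 2 ** (m - 3) + 1


def total_cycles(k: int, m: int = 8) -> int:
    """Approximate cycles to decode a codeword of length K = 2 * N."""
    if k % 2:
        raise ValueError("K is assumed to be even (K = 2 * N)")
    return cycles_per_stage(m) * (k // 2)


def first_output_latency(m: int = 8, depth: int = 20) -> int:
    """Cycles before the first decoded pair when N exceeds the given depth."""
    return cycles_per_stage(m) * depth


print(cycles_per_stage(8))       # 33
print(total_cycles(100))         # 33 * 50 = 1650
print(first_output_latency())    # 33 * 20 = 660
print(cycles_per_stage(8) / 2 / 2 ** 8)  # 33/512, close to 1/16
```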
  • After all iterations have completed, the signal MAXADR may contain q1q2 . . . q8. Suppose the last iteration has been written to the circuits 204 b and 204 d. Let p1p2 . . . p32 be the path corresponding to the state identified in the signal MAXADR. Therefore, q1q2 . . . q8p1p2 . . . p32 may be the last several bits of the decoded codeword, where q1 is the last bit of the decoded codeword.
  • The functions performed by the diagrams of FIGS. 3, 5-7 and 9-11 may be implemented using one or more of a conventional general purpose processor, digital computer, microprocessor, microcontroller, RISC (reduced instruction set computer) processor, CISC (complex instruction set computer) processor, SIMD (single instruction multiple data) processor, signal processor, central processing unit (CPU), arithmetic logic unit (ALU), video digital signal processor (VDSP) and/or similar computational machines, programmed according to the teachings of the present specification, as will be apparent to those skilled in the relevant art(s). Appropriate software, firmware, coding, routines, instructions, opcodes, microcode, and/or program modules may readily be prepared by skilled programmers based on the teachings of the present disclosure, as will also be apparent to those skilled in the relevant art(s). The software is generally executed from a medium or several media by one or more of the processors of the machine implementation.
  • The present invention may also be implemented by the preparation of ASICs (application specific integrated circuits), Platform ASICs, FPGAs (field programmable gate arrays), PLDs (programmable logic devices), CPLDs (complex programmable logic devices), sea-of-gates, RFICs (radio frequency integrated circuits), ASSPs (application specific standard products), one or more monolithic integrated circuits, one or more chips or die arranged as flip-chip modules and/or multi-chip modules or by interconnecting an appropriate network of conventional component circuits, as is described herein, modifications of which will be readily apparent to those skilled in the art(s).
  • The present invention thus may also include a computer product which may be a storage medium or media and/or a transmission medium or media including instructions which may be used to program a machine to perform one or more processes or methods in accordance with the present invention. Execution of instructions contained in the computer product by the machine, along with operations of surrounding circuitry, may transform input data into one or more files on the storage medium and/or one or more output signals representative of a physical object or substance, such as an audio and/or visual depiction. The storage medium may include, but is not limited to, any type of disk including floppy disk, hard drive, magnetic disk, optical disk, CD-ROM, DVD and magneto-optical disks and circuits such as ROMs (read-only memories), RAMs (random access memories), EPROMs (electronically programmable ROMs), EEPROMs (electronically erasable ROMs), UVPROM (ultra-violet erasable ROMs), Flash memory, magnetic cards, optical cards, and/or any type of media suitable for storing electronic instructions.
  • The elements of the invention may form part or all of one or more devices, units, components, systems, machines and/or apparatuses. The devices may include, but are not limited to, servers, workstations, storage array controllers, storage systems, personal computers, laptop computers, notebook computers, palm computers, personal digital assistants, portable electronic devices, battery powered devices, set-top boxes, encoders, decoders, transcoders, compressors, decompressors, pre-processors, post-processors, transmitters, receivers, transceivers, cipher circuits, cellular telephones, digital cameras, positioning and/or navigation systems, medical equipment, heads-up displays, wireless devices, audio recording, storage and/or playback devices, video recording, storage and/or playback devices, game platforms, peripherals and/or multi-chip modules. Those skilled in the relevant art(s) would understand that the elements of the invention may be implemented in other types of devices to meet the criteria of a particular application.
  • As would be apparent to those skilled in the relevant art(s), the signals illustrated in FIGS. 3, 5 and 9-11 represent logical data flows. The logical data flows are generally representative of physical data transferred between the respective blocks by, for example, address, data, and control signals, and/or busses. The system represented by the circuit 100 may be implemented in hardware, software or a combination of hardware and software according to the teachings of the present disclosure, as would be apparent to those skilled in the relevant art(s). As used herein, the term “simultaneously” is meant to describe events that share some common time period but the term is not meant to be limited to events that begin at the same point in time, end at the same point in time, or have the same duration.
  • While the invention has been particularly shown and described with reference to the preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made without departing from the scope of the invention.

Claims (20)

1. A method for forward error correction decoding, comprising the steps of:
(A) calculating a plurality of metrics of a codeword using a forward error correction process on a trellis having a plurality of stages;
(B) updating said metrics over each of said stages;
(C) permuting said metrics in each of said stages;
(D) writing said metrics as permuted into a memory; and
(E) generating a signal carrying a plurality of decoded bits of said codeword based on said metrics in said memory.
2. The method according to claim 1, wherein said trellis comprises a radix-4 trellis.
3. The method according to claim 1, wherein said forward error correction process comprises at least one of a turbo decoding process and a Viterbi decoding process.
4. The method according to claim 3, wherein said calculating of said metrics is common to both said turbo decoding process and said Viterbi decoding process.
5. The method according to claim 1, wherein (i) said codeword has a length of K received symbols and (ii) said codeword is decoded using C×2^m×K clock cycles, C being a constant less than one and greater than zero, and m being a constraint length of said codeword.
6. The method according to claim 5, wherein C has a value of approximately 1/16th.
7. The method according to claim 5, wherein m has a value of 8 and said codeword is decoded using approximately 33×K/2 clock cycles.
8. The method according to claim 1, wherein said codeword is compliant with (i) a first of a plurality of communications standards in a first of a plurality of configurations and (ii) a second of said communications standards in a second of said configurations.
9. The method according to claim 8, wherein said communications standards include at least two of (i) a Long Term Evolution (LTE) standard, (ii) an Institute of Electrical and Electronics Engineering (IEEE) 802.16 standard, (iii) a Wideband-CDMA/High Speed Packet Access (WCDMA/HSPA) standard and (iv) a CDMA-2000/Ultra Mobile Broadband (UMB) standard.
10. An apparatus comprising:
a memory; and
a circuit configured to (i) calculate a plurality of metrics of a codeword using a forward error correction process on a trellis having a plurality of stages, (ii) update said metrics over each of said stages, (iii) permute said metrics in each of said stages, (iv) write said metrics as permuted into said memory and (v) generate a signal carrying a plurality of decoded bits of said codeword based on said metrics in said memory.
11. The apparatus according to claim 10, wherein said trellis comprises a radix-4 trellis.
12. The apparatus according to claim 10, wherein said forward error correction process comprises at least one of a turbo decoding process and a Viterbi decoding process.
13. The apparatus according to claim 12, wherein said calculating of said metrics is common to both said turbo decoding process and said Viterbi decoding process.
14. The apparatus according to claim 10, wherein (i) said codeword has a length of K received symbols and (ii) said codeword is decoded using C×2^m×K clock cycles, C being a constant less than one and greater than zero, and m being a constraint length of said codeword.
15. The apparatus according to claim 14, wherein C has a value of approximately 1/16th.
16. The apparatus according to claim 14, wherein m has a value of 8 and said codeword is decoded using approximately 33×K/2 clock cycles.
17. The apparatus according to claim 10, wherein said codeword is compliant with (i) a first of a plurality of communications standards in a first of a plurality of configurations and (ii) a second of said communications standards in a second of said configurations.
18. The apparatus according to claim 17, wherein said communications standards include at least two of (i) a Long Term Evolution (LTE) standard, (ii) an Institute of Electrical and Electronics Engineering (IEEE) 802.16 standard, (iii) a Wideband-CDMA/High Speed Packet Access (WCDMA/HSPA) standard and (iv) a CDMA-2000/Ultra Mobile Broadband (UMB) standard.
19. The apparatus according to claim 10, wherein said apparatus is implemented as at least one integrated circuit.
20. An apparatus comprising:
means for calculating a plurality of metrics of a codeword using a forward error correction process on a trellis having a plurality of stages;
means for updating said metrics over each of said stages;
means for permuting said metrics in each of said stages;
means for writing said metrics as permuted into a memory; and
means for generating a signal carrying a plurality of decoded bits of said codeword based on said metrics in said memory.
US14/246,506 2010-12-02 2014-04-07 Radix-4 viterbi forward error correction decoding Abandoned US20140223267A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/246,506 US20140223267A1 (en) 2010-12-02 2014-04-07 Radix-4 viterbi forward error correction decoding

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
RU2010149150 2010-12-02
RU2010149150/08A RU2010149150A (en) 2010-12-02 2010-12-02 METHOD AND DEVICE (OPTIONS) FOR DECODING WITH PREVIOUS CORRECTION OF ERRORS BY VITERBY RADIX-4 ALGORITHM
US13/158,636 US8775914B2 (en) 2010-12-02 2011-06-13 Radix-4 viterbi forward error correction decoding
US14/246,506 US20140223267A1 (en) 2010-12-02 2014-04-07 Radix-4 viterbi forward error correction decoding

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US13/158,636 Continuation US8775914B2 (en) 2010-12-02 2011-06-13 Radix-4 viterbi forward error correction decoding

Publications (1)

Publication Number Publication Date
US20140223267A1 true US20140223267A1 (en) 2014-08-07

Family

ID=46163425

Family Applications (2)

Application Number Title Priority Date Filing Date
US13/158,636 Active 2032-06-23 US8775914B2 (en) 2010-12-02 2011-06-13 Radix-4 viterbi forward error correction decoding
US14/246,506 Abandoned US20140223267A1 (en) 2010-12-02 2014-04-07 Radix-4 viterbi forward error correction decoding

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US13/158,636 Active 2032-06-23 US8775914B2 (en) 2010-12-02 2011-06-13 Radix-4 viterbi forward error correction decoding

Country Status (2)

Country Link
US (2) US8775914B2 (en)
RU (1) RU2010149150A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107294544A (en) * 2017-06-27 2017-10-24 唯思泰瑞(北京)信息科技有限公司 Forward error correction, device and the decoder of convolutional code

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9467279B2 (en) 2014-09-26 2016-10-11 Intel Corporation Instructions and logic to provide SIMD SM4 cryptographic block cipher functionality
US10680749B2 (en) * 2017-07-01 2020-06-09 Intel Corporation Early-termination of decoding convolutional codes

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5497384A (en) * 1993-12-29 1996-03-05 International Business Machines Corporation Permuted trellis codes for input restricted partial response channels
US6445755B1 (en) * 1999-09-14 2002-09-03 Samsung Electronics Co, Ltd. Two-step soft output viterbi algorithm decoder using modified trace back
US20020131532A1 (en) * 2001-01-26 2002-09-19 Richard Chi Method and apparatus for detecting messages with unknown signaling characteristic
US6668026B1 (en) * 1999-05-28 2003-12-23 Sony Corporation Decoding method and apparatus
US7062000B1 (en) * 1999-01-29 2006-06-13 Sharp Kabushiki Kaisha Viterbi decoder

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5594742A (en) * 1990-12-20 1997-01-14 Communications Satellite Corporation Bidirectional trellis coding
US5815515A (en) * 1996-03-28 1998-09-29 Lsi Logic Corporation Edge metric calculation method and apparatus using permutations
US6233712B1 (en) * 1998-04-24 2001-05-15 Lsi Logic Corporation Apparatus and method for recovering information bits from a 64/256-quadrature amplitude modulation treliss coded modulation decoder
US7765459B2 (en) 2005-09-28 2010-07-27 Samsung Electronics Co., Ltd. Viterbi decoder and viterbi decoding method
US7721187B2 (en) * 2007-09-04 2010-05-18 Broadcom Corporation ACS (add compare select) implementation for radix-4 SOVA (soft-output viterbi algorithm)
US8271858B2 (en) * 2009-09-03 2012-09-18 Telefonaktiebolget L M Ericsson (Publ) Efficient soft value generation for coded bits in a turbo decoder
US8687746B2 (en) * 2010-05-27 2014-04-01 Qualcomm Incorporated SMU architecture for turbo decoder
RU2010148337A (en) * 2010-11-29 2012-06-10 ЭлЭсАй Корпорейшн (US) METHOD AND DEVICE (OPTIONS) FOR CALCULATING BRANCH METRICS FOR MULTIPLE COMMUNICATION STANDARDS

Also Published As

Publication number Publication date
RU2010149150A (en) 2012-06-10
US8775914B2 (en) 2014-07-08
US20120144274A1 (en) 2012-06-07

Similar Documents

Publication Publication Date Title
JP4907802B2 (en) Butterfly processor device used for communication decoding
EP3639374B1 (en) Low latency polar coding and decoding by merging of stages of the polar code graph
KR20080098391A (en) Map decoder with bidirectional sliding window architecture
JP4227481B2 (en) Decoding device and decoding method
US20180076831A1 (en) Partial sum computation for polar code decoding
US8621329B2 (en) Reconfigurable BCH decoder
CN1327653A (en) Component decoder and method thereof in mobile communication system
US20050149838A1 (en) Unified viterbi/turbo decoder for mobile communication systems
JP2014099944A (en) Methods and apparatus for low-density parity check decoding using hardware sharing and serial sum-product architecture
JP2018019401A (en) Reed-Solomon decoder and decoding method
US7020214B2 (en) Method and apparatus for path metric processing in telecommunications systems
KR20150125744A (en) High-Throughput Low-Complexity Successive-Cancellation Polar Decoder Architecture and Method
US20140223267A1 (en) Radix-4 viterbi forward error correction decoding
EP3202045A1 (en) Method and device for calculating a crc code in parallel
US20030026347A1 (en) Path metric normalization
US20030123563A1 (en) Method and apparatus for turbo encoding and decoding
KR20030036845A (en) A Decoder For Trellis-Based Channel Encoding
US11063614B1 (en) Polar decoder processor
US7979781B2 (en) Method and system for performing Viterbi decoding using a reduced trellis memory
US10084486B1 (en) High speed turbo decoder
Chen et al. Design of a low power viterbi decoder for wireless communication applications
JP2010130271A (en) Decoder and decoding method
US8644432B2 (en) Viterbi decoder for decoding convolutionally encoded data stream
US8699396B2 (en) Branch metrics calculation for multiple communications standards
US20120128102A1 (en) L-value generation in a decoder

Legal Events

Date Code Title Description
AS Assignment

Owner name: DEUTSCHE BANK AG NEW YORK BRANCH, AS COLLATERAL AG

Free format text: PATENT SECURITY AGREEMENT;ASSIGNORS:LSI CORPORATION;AGERE SYSTEMS LLC;REEL/FRAME:032856/0031

Effective date: 20140506

AS Assignment

Owner name: AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LSI CORPORATION;REEL/FRAME:035390/0388

Effective date: 20140814

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: AGERE SYSTEMS LLC, PENNSYLVANIA

Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENT RIGHTS (RELEASES RF 032856-0031);ASSIGNOR:DEUTSCHE BANK AG NEW YORK BRANCH, AS COLLATERAL AGENT;REEL/FRAME:037684/0039

Effective date: 20160201

Owner name: LSI CORPORATION, CALIFORNIA

Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENT RIGHTS (RELEASES RF 032856-0031);ASSIGNOR:DEUTSCHE BANK AG NEW YORK BRANCH, AS COLLATERAL AGENT;REEL/FRAME:037684/0039

Effective date: 20160201