WO2013067697A1 - Parallel decoding method and turbo decoder - Google Patents


Info

Publication number
WO2013067697A1
Authority
WO
WIPO (PCT)
Prior art keywords
buffer
interleaver
data
module
interleaving
Prior art date
Application number
PCT/CN2011/082028
Other languages
French (fr)
Chinese (zh)
Inventor
王毅
刘勇
王书歌
Original Assignee
中兴通讯股份有限公司 (ZTE Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Priority to PCT/CN2011/082028
Publication of WO2013067697A1


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 1/00 Arrangements for detecting or preventing errors in the information received
    • H04L 1/004 Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L 1/0045 Arrangements at the receiver end
    • H04L 1/0047 Decoding adapted to other signal detection operation
    • H04L 1/005 Iterative decoding, including iteration between signal detection and decoding operation
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M 13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M 13/27 Coding, decoding or code conversion, for error detection or error correction, using interleaving techniques
    • H03M 13/2771 Internal interleaver for turbo codes
    • H03M 13/2775 Contention or collision free turbo code internal interleaver
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M 13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M 13/65 Purpose and implementation aspects
    • H03M 13/6522 Intended application, e.g. transmission or communication standard
    • H03M 13/6525 3GPP LTE including E-UTRA
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 1/00 Arrangements for detecting or preventing errors in the information received
    • H04L 1/004 Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L 1/0056 Systems characterized by the type of code used
    • H04L 1/0064 Concatenated codes
    • H04L 1/0066 Parallel concatenated codes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 1/00 Arrangements for detecting or preventing errors in the information received
    • H04L 1/004 Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L 1/0056 Systems characterized by the type of code used
    • H04L 1/0071 Use of interleaving

Definitions

  • The present invention relates to decoding processing techniques and, in particular, to a parallel decoding method and a Turbo decoder.

Background Art
  • The Turbo code is an efficient channel coding method and, in essence, a variant of the convolutional code. It is characterized by high encoding and decoding complexity and large delay but excellent error performance, making it suitable for transmitting long code blocks with large data volumes where latency requirements are not strict.
  • The great advantage of the Turbo code is that it satisfies well the randomness condition of Shannon's channel coding theory; iterative decoding is used to obtain coding gain, so performance close to the Shannon limit can be achieved.
  • The Turbo decoder includes two soft-input soft-output (SISO) decoding units (a first decoding unit 101 and a second decoding unit 103) connected to each other by a first interleaver 102.
  • the output of the second decoding unit 103 is connected to the second interleaver 104 for deinterleaving, and the output of the second interleaver 104 is connected to the first decoding unit 101 and the hard decision module 105.
  • The Turbo decoder receives three data streams sent from outside, namely the systematic bits sb, the un-interleaved first parity bits p0, and the interleaved second parity bits p1; the inputs of each decoding unit are the extrinsic information (ab1, ab2) output by the other decoding unit, serving as a priori information, together with the systematic bits sb and the parity bits (p0 or p1).
  • the decoding process of the turbo decoder is an iterative process.
  • The extrinsic information ab2 output by the second decoding unit 103 is deinterleaved by the second interleaver 104 and used as the a priori information of the first decoding unit 101.
  • The input of the first decoding unit 101 further includes the systematic bits sb and the first parity bits p0. The extrinsic information ab1 output by the first decoding unit 101 is interleaved by the first interleaver 102 and used as the a priori information of the second decoding unit 103 to assist decoding; at the same time, the input of the second decoding unit 103 further includes the interleaved systematic bits sb and the second parity bits p1. This process is repeated iteratively until the hard-bit data hdb output by the hard decision module 105 satisfies the decoding requirement or the number of iterations reaches the specified value.
  • The hardware structures of the first decoding unit 101 and the second decoding unit 103 are identical, and the two never decode simultaneously. They can therefore be implemented as a single circuit shared by time multiplexing, saving hardware resources.
  • The first decoding unit 101 and the second decoding unit 103 mainly implement the log-domain Max-Log-MAP algorithm, in which multiplication and exponential operations are simplified to addition and maximum operations, reducing computational complexity and facilitating hardware implementation.
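The Max-Log-MAP simplification just described can be illustrated with a short sketch (an illustration of the general technique, not the patent's implementation): the exact Log-MAP recursion uses the Jacobian logarithm max*(a, b) = ln(e^a + e^b), which Max-Log-MAP truncates to a plain maximum.

```python
import math

def log_map_star(a, b):
    # Exact Jacobian logarithm used by Log-MAP:
    # max*(a, b) = ln(e^a + e^b) = max(a, b) + ln(1 + e^(-|a - b|))
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_log_map(a, b):
    # Max-Log-MAP drops the correction term, leaving only a comparison:
    # no multiplication or exponentiation is needed in hardware.
    return max(a, b)
```

The truncation error is at most ln 2 per pairwise operation, which is why Max-Log-MAP typically costs only a few tenths of a dB versus full Log-MAP while being far cheaper in hardware.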
  • Turbo decoders, which are among the key components of mobile communication terminal systems, are also subject to higher requirements: on the one hand they must meet the system's speed, and on the other hand the hardware circuit size should be reduced as much as possible.
  • The existing Turbo decoder supports only one mode: either only the Long Term Evolution (LTE) mode, or only the TD/W (TD-SCDMA/WCDMA/HSPA+) mode.
  • the main object of the present invention is to provide a parallel decoding method and decoder for solving the multi-mode compatibility problem of the turbo decoder.
  • Another object of the present invention is to solve the parallel interleaving problem of the Turbo decoder in the TD/W mode, further optimizing the multimode compatibility of the Turbo decoder.
  • The present invention provides an interleaver, the interleaver including: an LTE interleaving module, a TD/W interleaving module, and a selector;
  • the LTE interleaving module is configured to perform LTE-mode interleaving and/or deinterleaving on the input data and the a priori information;
  • the TD/W interleaving module is configured to perform TD/W-mode interleaving and/or deinterleaving on the input data and the a priori information;
  • the selector is configured to select and output the data obtained by the LTE interleaving module or the data obtained by the TD/W interleaving module.
  • the present invention also provides a Turbo decoder, the Turbo decoder comprising: a first interleaver and a second interleaver;
  • The first interleaver includes a first LTE interleaving module, a first TD/W interleaving module, and a first selector. The first LTE interleaving module is configured to perform LTE-mode interleaving on the input data and the extrinsic information obtained in the last MAP iteration; the first TD/W interleaving module is configured to perform TD/W-mode interleaving on the input data and the extrinsic information obtained in the last MAP iteration; and the first selector is configured to select, according to a preset mode, whether to output the interleaved data obtained by the first LTE interleaving module or the interleaved data obtained by the first TD/W interleaving module;
  • The second interleaver includes a second LTE interleaving module, a second TD/W interleaving module, and a second selector. The second LTE interleaving module is configured to perform LTE-mode deinterleaving on the extrinsic information output by the parallel MAP unit; the second TD/W interleaving module is configured to perform TD/W-mode interleaving and/or deinterleaving on the extrinsic information output by the parallel MAP unit; and the second selector is configured to select, according to the preset mode, and output the data obtained by the second LTE interleaving module or the data obtained by the second TD/W interleaving module.
  • The Turbo decoder further includes: an a priori information buffer, a systematic bit buffer, a parity bit buffer, a parity bit selector, a parallel MAP unit, and a main control module;
  • The main control module issues a read command to the a priori information buffer, the systematic bit buffer, and the parity bit buffer; after receiving the read command issued by the main control module, the a priori information buffer and the systematic bit buffer output the a priori information ab and the systematic bits sb, respectively, to the first interleaver; the first interleaver interleaves the input data according to a preset mode and outputs the interleaved data, or directly outputs the input data, to the parallel MAP unit;
  • After receiving the read command issued by the main control module, the parity bit buffer outputs the first parity bits p0 or the second parity bits p1 to the parallel MAP unit through the parity bit selector;
  • The parallel MAP unit performs MAP calculation on its input data and outputs the obtained MAP calculation result to the second interleaver; the second interleaver interleaves and/or deinterleaves the input data according to a preset mode and outputs the interleaved and/or deinterleaved data, or directly outputs the input data, to the a priori information buffer.
  • The Turbo decoder further includes a buffer device, connected between the second TD/W interleaving module and the a priori information buffer, or connected to the a priori information buffer as a component of the second TD/W interleaving module;
  • The buffer device includes N first-in first-out (FIFO) groups and N N-to-1 selectors, with each FIFO group connected to one N-to-1 selector; each FIFO group includes N first-in first-out buffers (FIFOs), and the N inputs of an N-to-1 selector are connected to the outputs of the N FIFOs in one FIFO group; here N denotes the maximum number of parallel decoding data paths, and N is a power of 2.
  • The parallel MAP unit includes N MAP sub-units; the a priori information buffer, the systematic bit buffer, and the parity bit buffer are each evenly divided into N sub-blocks (banks).
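As a rough illustration of how a block divided evenly into N banks can be addressed, a global symbol index maps to a (bank, offset) pair; the row/column naming follows the text, but the exact mapping below is an illustrative assumption, not taken from the patent.

```python
def bank_address(k, block_len, n):
    """Map a global symbol index k (0 <= k < block_len) to a
    (bank, offset) pair when a buffer is split evenly into n
    sub-blocks; illustrative convention, not the patent's."""
    sub_len = block_len // n           # block_len assumed divisible by n
    return k // sub_len, k % sub_len   # bank ("row"), offset ("column")
```

With N = 4 sub-units and a 6144-symbol block, each bank holds 1536 symbols, so index 1536 falls at offset 0 of bank 1.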
  • The MAP sub-unit includes a Beta overlap reverse recursion module and a Beta sliding-window reverse recursion module, where the Beta overlap reverse recursion module is configured to calculate the overlap portion of the Beta sliding window, and the Beta sliding-window reverse recursion module is configured to calculate the sliding-window portion of the Beta sliding window.
  • The systematic bit buffer, and the first parity bit buffer and the second parity bit buffer within the parity bit buffer, all adopt a ping-pong structure; each buffer includes one ping sub-buffer and one pong sub-buffer, and the total memory size of each sub-buffer is 6144.
  • The systematic bit buffer, the first parity bit buffer, and the second parity bit buffer each comprise two parts: one part with a memory group length of 5120, and the other with a memory group length of 1024.
  • The present invention also provides a parallel decoding method. The method includes: interleaving multipath data through the first interleaver in the Turbo decoder and performing parallel MAP calculation through the parallel MAP unit; buffering the interleaved and/or deinterleaved multipath data, together with the multipath column addresses generated by the second interleaver, into the buffer device of the Turbo decoder according to the multipath row addresses generated by the second interleaver and the order in which the parallel MAP unit outputs the multipath data; and outputting the multipath data buffered by the buffer device, together with the corresponding column addresses, to the a priori information buffer.
  • Buffering the interleaved and/or deinterleaved multipath data and the multipath column addresses generated by the second interleaver into the buffer device of the Turbo decoder, according to the multipath row addresses generated by the second interleaver and the order in which the parallel MAP unit outputs the multipath data, includes: inputting the multipath data output by the parallel MAP unit into the FIFO groups of the buffer device corresponding to the row addresses generated by the second interleaver, and then storing the multipath a priori information output by the MAP unit and the column addresses generated by the second interleaver into the corresponding FIFO buffers in each FIFO group according to the order in which the parallel MAP unit outputs the multipath data.
  • Outputting the multipath data buffered by the buffer device includes: each N-to-1 selector in the buffer device selects and outputs the data buffered by the FIFO holding the most data in its FIFO group, together with the corresponding column address, to the sub-blocks of the a priori information buffer.
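The buffer device described above can be sketched as an N x N matrix of software FIFOs (class and method names below are illustrative, not from the patent): writes are steered by the interleaver's row address, and each group's N-to-1 selector drains the deepest FIFO so that each bank of the a priori buffer absorbs at most one write per cycle.

```python
from collections import deque

class WritebackBuffer:
    """Sketch of the N x N FIFO buffer device described above."""

    def __init__(self, n):
        # groups[row][lane]: data destined for bank `row`, produced by
        # MAP lane `lane`.
        self.groups = [[deque() for _ in range(n)] for _ in range(n)]

    def write(self, lane, row, col, value):
        # `row` is the interleaver-generated row address selecting the
        # FIFO group; `col` is the in-bank write address carried along.
        self.groups[row][lane].append((col, value))

    def drain(self, row):
        # The group's N-to-1 selector outputs from the FIFO currently
        # holding the most data, or nothing if the group is empty.
        fifo = max(self.groups[row], key=len)
        return fifo.popleft() if fifo else None
```

Decoupling the write order (interleaved, bursty) from the read order (one item per bank per cycle) is what resolves the memory contention that parallel interleaving would otherwise cause.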
  • Performing the parallel MAP calculation through the parallel MAP unit includes: each MAP sub-unit in the parallel MAP unit performs MAP calculation on its input a priori information ab, systematic bits sb, and parity bits pb, and outputs the extrinsic information eb and the hard bits hdb; the calculation includes an Alpha calculation process and a Beta calculation process.
  • The Alpha calculation process and the Beta calculation process include: dividing each Beta sliding window into an "overlap" part and a "sliding window" part, with the overlap and the sliding window set to the same length. During calculation, the first beat calculates the overlap portion of the first Beta sliding window; the second beat calculates the sliding-window portion of the first Beta sliding window while calculating the overlap portion of the second Beta sliding window; the third beat calculates the Alpha values of the first sliding window and obtains the extrinsic information eb and the hard bits hdb, while calculating the Beta sliding-window portion of the second sliding window and the overlap portion of the third sliding window; and so on, until the last Beta and Alpha values are obtained.
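The beat-by-beat pipelining described above can be sketched as a schedule table (a model of the described timing only, not the hardware):

```python
def schedule(num_windows):
    """Beat-by-beat schedule of the pipelined recursion described
    above: on beat t the overlap of window t is computed, the Beta
    sliding window of window t-1, and the Alpha values (plus the
    extrinsic output eb and hard bits hdb) of window t-2."""
    beats = []
    for t in range(num_windows + 2):
        beats.append({
            'overlap': t if t < num_windows else None,
            'beta': t - 1 if 0 <= t - 1 < num_windows else None,
            'alpha': t - 2 if 0 <= t - 2 < num_windows else None,
        })
    return beats
```

From beat 2 onward all three engines are busy on consecutive windows, so a block of W windows finishes in W + 2 beats rather than the 3W beats a sequential schedule would take.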
  • The Turbo decoder provided by the present invention comprises an interleaver compatible with both the LTE mode and the TD/W mode, realizes multimode compatibility of the Turbo decoder, supports parallel decoding in the TD/W mode, and solves the parallel interleaving problem in the TD/W mode, so that the LTE-mode Turbo decoder and the TD/W-mode Turbo decoder can be combined into one, achieving the goal of reducing the hardware scale.
  • FIG. 1 is a schematic structural diagram of a Turbo decoder in the prior art
  • FIG. 2 is a schematic structural diagram of a Turbo decoder according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a parallel decoding process performed by a Turbo decoder according to Embodiment 2 of the present invention
  • FIG. 4 is a schematic structural diagram of a buffer device in a Turbo decoder according to Embodiment 2 of the present invention
  • FIG. 6 is a schematic diagram of a composition structure of a MAP subunit and a MAP calculation process according to Embodiment 4 of the present invention.
  • the basic idea of the present invention is to provide an interleaver and a turbo decoder that are compatible with the LTE mode and the TD/W mode, and implement parallel decoding in the TD/W mode based on the Turbo decoder.
  • The interleaver includes an LTE interleaving module, a TD/W interleaving module, and a selector, where the LTE interleaving module is configured to perform LTE-mode interleaving and/or deinterleaving on the input data and the extrinsic information obtained in the last MAP iteration; the TD/W interleaving module is configured to perform TD/W-mode interleaving and/or deinterleaving on the input data and the extrinsic information obtained in the last MAP iteration; and the selector is configured to select and output the data obtained by the LTE interleaving module or the data obtained by the TD/W interleaving module.
  • The Turbo decoder includes a first interleaver and a second interleaver. The first interleaver includes a first LTE interleaving module, a first TD/W interleaving module, and a first selector, where the first LTE interleaving module is configured to perform LTE-mode interleaving on the input data and the extrinsic information obtained in the last MAP iteration; the first TD/W interleaving module is configured to perform TD/W-mode interleaving on the input data and the extrinsic information obtained in the last MAP iteration; and the first selector is configured to select, according to the preset mode, and output the interleaved data obtained by the first LTE interleaving module or the interleaved data obtained by the first TD/W interleaving module;
  • The second interleaver includes a second LTE interleaving module, a second TD/W interleaving module, and a second selector. The second LTE interleaving module is configured to perform LTE-mode deinterleaving on the extrinsic information output by the parallel MAP unit; the second TD/W interleaving module is configured to perform TD/W-mode interleaving (in even iterations, the same below) and/or deinterleaving on the extrinsic information output by the parallel MAP unit, with the interleaved and/or deinterleaved data buffered in the a priori information buffer as the a priori information for the next MAP iteration; and the second selector is configured to select, according to the preset mode, and output the data obtained by the second LTE interleaving module or the data obtained by the second TD/W interleaving module.
  • In the TD/W mode, interleaving is performed in even-numbered MAP iterations, and deinterleaving is performed in odd-numbered MAP iterations.
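A minimal sketch of this alternation, assuming a permutation `perm` with the convention interleaved[i] = original[perm[i]] (the convention is our assumption, not stated in the text):

```python
def writeback(iteration, perm, data):
    """TD/W-mode write-back sketch: after an even MAP iteration the
    positive-sequence extrinsic data is interleaved, after an odd one
    it is de-interleaved, so the next iteration always reads the
    a priori buffer with sequential addresses."""
    if iteration % 2 == 0:
        return [data[p] for p in perm]       # interleave
    out = [None] * len(data)
    for i, p in enumerate(perm):             # de-interleave (inverse)
        out[p] = data[i]
    return out
```

Applying an even-iteration write-back followed by an odd-iteration write-back returns the data to its original order, which is what lets both phases read with positive-sequence addresses.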
  • The Turbo decoder further includes: an a priori information buffer, a systematic bit buffer, a parity bit buffer, a parity bit selector, a parallel MAP unit, and a main control module;
  • The main control module issues a read command to the a priori information buffer, the systematic bit buffer, and the parity bit buffer; after receiving the read command issued by the main control module, the a priori information buffer and the systematic bit buffer output the a priori information ab and the systematic bits sb, respectively, to the first interleaver; the first interleaver interleaves the input data according to a preset mode and outputs the interleaved data, or directly outputs the input data, to the parallel MAP unit;
  • After receiving the read command issued by the main control module, the parity bit buffer outputs the first parity bits p0 or the second parity bits p1 to the parallel MAP unit through the parity bit selector;
  • The parallel MAP unit performs MAP calculation on its input data and outputs the obtained MAP calculation result to the second interleaver; the second interleaver interleaves and/or deinterleaves the input data according to a preset mode, outputting the interleaved and/or deinterleaved data to the a priori information buffer, or directly outputs the input data to the a priori information buffer.
  • The interleaving process here applies to the TD/W mode; the LTE mode does not require interleaving at the second interleaver.
  • In the TD/W mode, when the number of MAP iterations is even, the calculation result output by the parallel MAP unit is itself un-interleaved positive-sequence data, so interleaving rather than deinterleaving is performed. The purpose of this interleaving is to allow the interleaved data to be read directly from the a priori information buffer with positive-sequence addresses at the next, odd-numbered, MAP iteration, thereby solving the memory conflict problem of multipath parallel interleaving in the TD/W mode. The order of the data after this write-back is exactly the interleaved order.
  • The Turbo decoder further includes a buffer device, connected between the second TD/W interleaving module and the a priori information buffer, or connected to the a priori information buffer as an internal component of the second TD/W interleaving module. The buffer device comprises N first-in first-out (FIFO) groups and N N-to-1 selectors; each FIFO group is connected to one N-to-1 selector; each FIFO group includes N first-in first-out buffers (FIFOs), and the N inputs of an N-to-1 selector are connected to the outputs of the N FIFOs in one FIFO group. Here N denotes the maximum number of parallel decoding data paths, and N is a power of 2.
  • The parallel MAP unit includes N MAP sub-units; the a priori information buffer, the systematic bit buffer, and the parity bit buffer are each evenly divided into N sub-blocks (banks).
  • The MAP sub-unit may include a Beta overlap reverse recursion module and a Beta sliding-window reverse recursion module, where the Beta overlap reverse recursion module is configured to calculate the overlap portion of the Beta sliding window, and the Beta sliding-window reverse recursion module is configured to calculate the sliding-window portion of the Beta sliding window.
  • The systematic bit buffer, and the first parity bit buffer and the second parity bit buffer within the parity bit buffer, all adopt a ping-pong structure; each buffer includes one ping sub-buffer and one pong sub-buffer, and the total memory size of each sub-buffer is 6144.
  • The systematic bit buffer, the first parity bit buffer, and the second parity bit buffer each comprise two parts: one part with a memory group length of 5120, and the other with a memory group length of 1024.
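The sizes quoted above line up as follows (the rationale in the comments is our inference, not stated in the text): 6144 is the maximum LTE turbo code block size, and the largest WCDMA turbo block of 5114 bits already fits in the 5120-deep part alone.

```python
# Sketch of the quoted buffer partitioning; sizes come from the text,
# the block-size rationale is our inference.
TDW_PART, EXTRA_PART = 5120, 1024
LTE_MAX_BLOCK = 6144      # maximum LTE turbo code block size
WCDMA_MAX_BLOCK = 5114    # maximum WCDMA turbo code block size

assert TDW_PART + EXTRA_PART == LTE_MAX_BLOCK
assert WCDMA_MAX_BLOCK <= TDW_PART
```

Splitting each sub-buffer this way would let a TD/W block occupy only the 5120-deep part while LTE blocks use both parts together.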
  • The present invention also provides a parallel decoding method. The method may include: interleaving the multipath data through the first interleaver in the Turbo decoder, performing parallel MAP calculation through the parallel MAP unit, interleaving and/or deinterleaving the N data paths output by the parallel MAP unit through the second interleaver, and writing the interleaved and/or deinterleaved data back into the N sub-blocks of the a priori information buffer.
  • According to the order in which the parallel MAP unit outputs the multipath data, the multipath a priori information output by the MAP unit and the column addresses generated by the second interleaver are stored correspondingly into the FIFO buffers in each FIFO group.
  • Outputting the multipath data buffered by the buffer device includes: each N-to-1 selector in the buffer device selects and outputs the extrinsic information buffered by the FIFO holding the most data in its FIFO group, and this extrinsic information and its corresponding column address are written into the sub-blocks of the a priori information buffer using the multipath column addresses as write addresses.
  • Performing the parallel MAP calculation through the parallel MAP unit may include: each MAP sub-unit in the parallel MAP unit performs MAP calculation on its input a priori information ab, systematic bits sb, and parity bits pb, and outputs the extrinsic information eb and the hard bits hdb; the calculation includes an Alpha calculation process and a Beta calculation process. The Alpha calculation process and the Beta calculation process include: dividing each Beta sliding window into an "overlap" part and a "sliding window" part of equal length. During calculation, the first beat calculates the overlap portion of the first Beta sliding window; the second beat calculates the sliding-window portion of the first Beta sliding window while calculating the overlap portion of the second Beta sliding window; the third beat calculates the Alpha values of the first sliding window and obtains the extrinsic information eb and the hard bits hdb, while calculating the Beta sliding-window portion of the second sliding window and the overlap portion of the third sliding window; and so on, until the last Beta and Alpha values are obtained.
  • In this embodiment, an interleaver supporting both the LTE mode and the TD/W mode is provided, and at the same time a Turbo decoder including the interleaver is provided, so that the Turbo decoder can be compatible with the LTE mode and the TD/W mode.
  • The interleaver provided in this embodiment is configured to interleave and/or deinterleave the input data according to a preset mode. The interleaver mainly includes an LTE interleaving module, a TD/W interleaving module, and a selector, where the LTE interleaving module is configured to perform LTE-mode interleaving and/or deinterleaving on the input data; the TD/W interleaving module is configured to perform TD/W-mode interleaving and/or deinterleaving on the input data; and the selector is configured to select and output the data obtained by the LTE interleaving module or the data obtained by the TD/W interleaving module.
  • Only one of the LTE interleaving module and the TD/W interleaving module in the interleaver is enabled at a time. Specifically, the LTE interleaving module and the TD/W interleaving module each determine whether to enable their interleaving function according to the preset mode.
  • The Turbo decoder provided in this embodiment mainly includes: an a priori information buffer, a systematic bit buffer, a parity bit buffer, a parity bit selector, a first interleaver, a parallel MAP unit, a second interleaver, and a main control module.
  • The first interleaver is configured to interleave the input data according to a preset mode. The first interleaver includes a first LTE interleaving module, a first TD/W interleaving module, and a first selector, where the first LTE interleaving module is configured to perform LTE-mode interleaving on the input data; the first TD/W interleaving module is configured to perform TD/W-mode interleaving on the input data; and the first selector is configured to select according to the preset mode (Mode): if the current mode is the LTE mode, the first selector selects and outputs the a priori information ab and the systematic bits sb output by the first LTE interleaving module to the parallel MAP unit; if the current mode is the TD/W mode, the first selector selects and outputs the a priori information ab and the systematic bits sb output by the first TD/W interleaving module to the parallel MAP unit.
  • The parallel MAP unit is configured to perform MAP calculation on its input data and output the obtained MAP calculation result to the second interleaver.
  • The second interleaver is configured to interleave and/or deinterleave the input data according to the preset mode. The second interleaver includes a second LTE interleaving module, a second TD/W interleaving module, and a second selector, where the second LTE interleaving module is configured to perform LTE-mode deinterleaving on the input data; the second TD/W interleaving module is configured to perform TD/W-mode interleaving and/or deinterleaving on the input data; and the second selector is configured to select, according to the preset mode, and output the data obtained by the second LTE interleaving module or the data obtained by the second TD/W interleaving module to the a priori information buffer.
  • The main control module is used to control the start and end of the decoding process; the a priori information buffer is used to store the a priori information ab obtained in the last MAP iteration; the systematic bit buffer is used to store the systematic bits sb; and the parity bit buffer includes a first parity bit buffer storing the first parity bits p0 and a second parity bit buffer storing the second parity bits p1. After receiving a read command issued by the main control module, each buffer outputs the data it holds.
  • A MAP iteration in the decoding process proceeds as follows:
  • the main control module issues a read command to the a priori information buffer, the systematic bit buffer, and the parity bit buffer;
  • After receiving the read command, the a priori information buffer and the systematic bit buffer output the a priori information ab and the systematic bits sb, respectively, to the first interleaver. In the LTE mode, when the current MAP iteration number is odd, the first interleaver interleaves the input a priori information ab and systematic bits sb according to the preset mode and outputs the result to the parallel MAP unit; when the current MAP iteration number is even, the first interleaver directly outputs the input a priori information ab and systematic bits sb to the parallel MAP unit.
  • In the TD/W mode, when the current MAP iteration number is odd, the first interleaver, in accordance with the preset mode, treats the a priori information ab, which has already been interleaved in advance, as the interleaving result: the a priori information ab is output directly to the parallel MAP unit, while the input systematic bits sb are interleaved and then output to the parallel MAP unit. When the current MAP iteration number is even, the first interleaver directly outputs the input a priori information ab and the systematic bits sb to the parallel MAP unit.
  • At the first MAP iteration, the a priori information ab is empty, so the first interleaver directly outputs the input systematic bits sb to the parallel MAP unit.
  • After receiving the read command issued by the main control module, the first parity bit buffer and the second parity bit buffer in the parity bit buffer acquire the current MAP iteration number from the main control module and judge it: when the current MAP iteration number is odd, the second parity bits p1 are output to the parallel MAP unit through the parity bit selector; when the current MAP iteration number is even, the first parity bits p0 are output to the parallel MAP unit through the parity bit selector;
  • The parallel MAP unit performs the MAP iterative operation on the input a priori information ab, systematic bits sb, and first parity bits p0 or second parity bits p1, and outputs the MAP calculation result to the second interleaver.
  • When the current MAP iteration number is odd, the second interleaver deinterleaves the input MAP extrinsic information to obtain the deinterleaved extrinsic information eb; when the current MAP iteration number is even, the second interleaver in the LTE mode directly uses the input MAP calculation result as its extrinsic information eb, while the second interleaver in the TD/W mode interleaves the input MAP calculation result to obtain the extrinsic information eb.
  • Under the control of the main control module, the second selector either outputs the external assignment information eb to the a priori information buffer as the a priori information ab required for the next MAP iteration, or converts the sign of the log-domain likelihood ratio llr corresponding to eb into the hard bit output information hdb of the Turbo decoder and outputs it to an external device.
  • The main control module controls the second interleaver to convert the sign of the log-domain llr corresponding to the external assignment information eb into hard bit data hdb. If it determines that the current MAP iteration number has reached the preset threshold or that the hard bit check passes, the Turbo decoder is controlled to output the current hard bit data hdb to the external device as the decoding result, and the current decoding process ends; if the current MAP iteration number has not reached the preset threshold and the hard bit check fails, the second interleaver is controlled to output the external assignment information eb to the a priori information buffer as the a priori information ab required for the next MAP iteration, and the next MAP iteration of the current decoding process continues.
  • The MAP iteration is repeated on the externally input systematic bits and check bits, with the iteration number incremented by 1 at each iteration, until the hard bit data hdb output by the Turbo decoder passes the check or the iteration number reaches the specified value.
  • the main control module can count the number of MAP iterations and end the decoding process when the number of MAP iterations reaches a preset threshold.
  • the main control module can also check the hard bit output information hdb, and when the check is correct, end the decoding process.
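The two termination conditions and the per-iteration parity selection can be sketched as a small control loop (illustrative Python; `run_map_iteration` and `check_ok` are hypothetical callbacks standing in for the hardware datapath and the checker):

```python
def decode_control(run_map_iteration, check_ok, max_iterations):
    """Hedged sketch of the main control module's loop described above:
    repeat MAP iterations, incrementing the count by 1 each time, until
    the hard bit data passes the check or the count reaches the preset
    threshold."""
    iterations = 0
    hdb = None
    while iterations < max_iterations:
        iterations += 1
        # odd iterations use the second check bits p1, even ones use p0
        check_bits = "p1" if iterations % 2 == 1 else "p0"
        hdb = run_map_iteration(iterations, check_bits)
        if check_ok(hdb):
            break
    return hdb, iterations
```
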
  • For the LTE mode, the first interleaver and the second interleaver described above may be identical in hardware structure and algorithm; the only difference is that the first interleaver is used to implement interleaving while the second interleaver is used to implement deinterleaving.
  • In the TD/W mode, the first interleaver only needs to interleave the input systematic bit sb; it performs no processing on the check bit pb and the a priori information ab, outputting the first check bit p0 or the second check bit p1 and the a priori information ab to the parallel MAP unit directly. The second interleaver inverse-interleaves the external assignment information eb output by the parallel MAP unit when the MAP iteration number is even and buffers the result in the a priori information buffer, and deinterleaves the external assignment information eb output by the parallel MAP unit when the MAP iteration number is odd, likewise buffering the result in the a priori information buffer.
  • a parallel decoding method and a corresponding Turbo decoder are provided, which can solve the memory conflict problem of parallel decoding in the TD/W mode, thereby implementing parallel decoding in the TD/W mode.
  • The parallel decoding method mainly includes: after the multiplexed data undergoes interleaving, parallel MAP calculation, and inverse-interleaving and/or deinterleaving, the processed multiplexed data and their corresponding column addresses are buffered by the cache device and then output to the multiple sub-blocks of the a priori information buffer.
  • Specifically, the multiple external assignment information output by the parallel MAP unit and the multiple column addresses generated by the second interleaver are input to the corresponding first-in first-out groups (FIFO groups), and are then stored into the first-in first-out buffers (FIFO, First In First Out) of each FIFO group according to the order in which the parallel MAP unit outputs the multiple external assignment information.
  • Outputting the buffered multiplexed data includes: each N-to-1 selector in the cache device compares the amount of data held in the N FIFOs of its FIFO group, selects in each beat the FIFO holding the most data, and sends the data buffered in that FIFO together with its corresponding column address to the multiple sub-blocks (Banks) of the a priori information buffer.
  • The Turbo decoder provided in this embodiment includes a cache device, where the cache device includes N first-in first-out groups (FIFO groups) and N N-to-1 selectors; each FIFO group is connected to one N-to-1 selector, each FIFO group contains N FIFOs, and the N inputs of an N-to-1 selector are connected to the outputs of the N FIFOs in the corresponding FIFO group.
  • N represents the maximum number of parallel decoding data paths; N is a power of 2, and its value range is [1, 16]. In practice, the value of N can be chosen according to the needs of the application.
  • the buffer device may be added to the turbo decoder provided in the first embodiment, and the buffer device is used as a part of the second interleaver to implement parallel decoding of the embodiment.
  • the buffer device may be connected between the second TD/W interleaving module and the a priori information buffer, or may be connected to the a priori information buffer as a component inside the second TD/W interleaving module.
  • The parallel MAP unit includes N MAP sub-units, such as MAP-0, MAP-1, ..., MAP-(N-1) shown in FIG. 3. The a priori information buffer, the systematic bit buffer, and the check bit buffer are each evenly divided into N banks, Bank-0, Bank-1, ..., Bank-(N-1) shown in FIG. 3, with every bank having the same size.
  • Each interleaving module in each interleaver includes an interleave address calculation unit and a cross unit. The interleave address calculation unit is configured to perform interleave address calculation according to the corresponding mode, obtaining the row addresses and column addresses used to interleave or deinterleave the data; the cross unit is configured to cross-sort the data input to it according to the row addresses obtained by the interleave address calculation unit.
  • The interleave address calculation unit in the LTE interleaving module is configured to perform interleave address calculation according to the interleaving algorithm and/or deinterleaving algorithm of the LTE mode; the interleave address calculation unit in the TD/W interleaving module is configured to perform interleave address calculation according to the interleaving algorithm and/or deinterleaving algorithm of the TD/W mode.
  • The first cross unit and the first interleave address calculation unit jointly implement the interleaving of data, and the second cross unit and the second interleave address calculation unit jointly implement the deinterleaving of data.
  • When the Turbo decoder performs parallel decoding in the LTE mode, it performs a MAP iteration on the externally input systematic bits and check bits as follows:
  • In each beat the first interleave address calculation unit outputs N "column addresses" to the N banks of the systematic bit buffer and the a priori information buffer as the "read addresses" for reading the data held there, and outputs N "row addresses" to the first cross unit. After receiving the read command of the main control module, the systematic bit buffer and the a priori information buffer each read N data from their own N banks according to the N "column addresses".
  • After receiving the read command of the main control module, the N banks of the first check bit buffer output N first check bits p0 through the check bit selector to the N MAP sub-units of the parallel MAP unit, or the N banks of the second check bit buffer output N second check bits p1 through the check bit selector to the N MAP sub-units of the parallel MAP unit. The N MAP sub-units of the parallel MAP unit perform the Max-Log-MAP calculation on the input systematic bits, a priori information, and first check bits p0 or second check bits p1, obtain N MAP calculation results, and output them to the second cross unit. The second cross unit cross-sorts the N MAP calculation results input to it to implement row deinterleaving; if the current MAP iteration number has not reached the preset threshold or the hard bit data hdb fails the check, the N external assignment information are written back according to the N column addresses output by the second interleave address calculation unit.
  • The N column addresses output by the first interleave address calculation unit are obtained by taking the interleave addresses of the N systematic bits and N pieces of a priori information modulo L, and the N row addresses output by the first interleave address calculation unit are obtained by taking the quotient of the same interleave addresses divided by L.
  • The natural-order address of a systematic bit or a piece of a priori information is its storage address; its interleave address is obtained by the first interleave address calculation unit interleaving the natural-order address. The N column addresses and N row addresses output by the second interleave address calculation unit are obtained in the same manner and are not described again.
  • For the check bits, the read address is obtained by taking the natural-order address modulo the length of the sub-data code block held in each bank; the N column addresses output by the second interleave address calculation unit are obtained by taking the natural-order addresses of the N first check bits p0 or second check bits p1 modulo L, and the N row addresses output by the second interleave address calculation unit are consistent with the order of the N MAP sub-units in the parallel MAP unit. The N column addresses output by the first interleave address calculation unit are all the same in a given beat, so in practical applications a single column address can be used.
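The modulo/quotient rule above can be stated as a tiny helper (illustrative Python; the function name is invented): for sub-block length L, an interleave address a maps to row (bank) a // L and in-bank column a % L, and conversely a = row * L + col.

```python
def split_interleave_address(addr, L):
    """Row address = quotient of the interleave address by the sub-block
    length L (selects the bank); column address = remainder (the read or
    write offset inside that bank)."""
    return addr // L, addr % L

# round trip: row * L + col recovers the original interleave address
row, col = split_interleave_address(4097, 384)
assert (row, col) == (10, 257) and row * 384 + col == 4097
```
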
  • When the Turbo decoder performs parallel decoding in the TD/W mode, the MAP iteration it performs on the externally input systematic bits and check bits is basically the same as the MAP iteration process in the LTE mode. The difference is that after the second cross unit outputs the N external assignment information, they are input to the cache device. The cache device buffers the N external assignment information and the corresponding column addresses generated by the second interleave address calculation unit into itself according to the row addresses generated by the second interleave address calculation unit and the order in which the parallel MAP unit outputs the multiple assignment information. Then, according to the amount of data held in each FIFO of each FIFO group, it selects in each beat the FIFO currently holding the most data and writes the data buffered in that FIFO to the N banks of the a priori information buffer.
  • The cache device includes N FIFO groups and N N-to-1 selectors, and each FIFO group includes N FIFOs, that is, FIFO_0, FIFO_1, ..., FIFO_(N-1) shown in the figure.
  • The N row addresses generated by the second interleaver determine the FIFO group into which the N external assignment information output by the parallel MAP unit and the N column addresses generated by the second interleaver are stored, and the order of the N external assignment information output by the parallel MAP unit determines the first-in first-out buffer (FIFO) within that group into which each of them is stored.
  • The cache device outputs the buffered external assignment information and column addresses in the manner described above, and writes the external assignment information into the N banks of the a priori information buffer using the column addresses as write addresses.
  • The method includes: in the TD/W mode, the second cross unit inputs the N column addresses generated by the second interleave address calculation unit and the N external assignment information output by the parallel MAP unit into the cache device together, into the corresponding FIFO groups according to the N row addresses generated by the second interleave address calculation unit, and buffers the N external assignment information and the N column addresses into the corresponding FIFOs according to the order of the N external assignment information output by the parallel MAP unit.
  • Each N-to-1 selector in the cache device selects one of the N FIFOs of the FIFO group to which it is connected, and the selected FIFO outputs its assignment information to the corresponding bank in the a priori information buffer.
  • the FIFO depth in each FIFO group converges within a limited range, and the maximum depth of the FIFO does not exceed 8 when N is 16.
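The conflict-avoidance idea can be exercised with a small behavioural model (illustrative Python, not the patent's hardware): per beat, each of N lanes pushes (column, datum) into FIFO group row = addr // L with FIFO index = lane; each group's selector then pops one entry per beat from its fullest FIFO, so each bank receives at most one write per beat.

```python
from collections import deque
import random

def simulate_cache_device(n, L, interleave):
    """Behavioural model of the FIFO-group cache device (assumptions:
    `interleave` is a bijection on range(n*L); lane i emits datum
    i*L + t on beat t).  Each group drains one entry per beat into its
    bank, so every bank sees at most one write per beat."""
    groups = [[deque() for _ in range(n)] for _ in range(n)]
    banks = [[None] * L for _ in range(n)]
    t, beats, max_depth = 0, L, 0
    while t < beats or any(f for g in groups for f in g):
        if t < beats:
            for lane in range(n):
                addr = interleave[lane * L + t]
                groups[addr // L][lane].append((addr % L, lane * L + t))
        t += 1
        for row, g in enumerate(groups):
            depths = [len(f) for f in g]
            max_depth = max(max_depth, *depths)
            if max(depths):
                col, datum = g[depths.index(max(depths))].popleft()
                banks[row][col] = datum   # one write per bank per beat
    return banks, max_depth

random.seed(1)
perm = list(range(4 * 8))
random.shuffle(perm)
banks, depth = simulate_cache_device(4, 8, perm)
# every datum lands in the bank/column given by its interleave address
assert all(banks[perm[d] // 8][perm[d] % 8] == d for d in range(32))
```

The observed maximum FIFO depth stays small in such runs, consistent with the bounded-depth behaviour stated above, though the bound of 8 for N = 16 is a property claimed for the real interleaving patterns, not proven by this toy model.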
  • In this embodiment the structure of the Turbo decoder is the same as that of the first embodiment, except that the systematic bit buffer, the first check bit buffer, and the second check bit buffer all adopt a ping-pong structure, and each buffer contains a ping sub-buffer and a pang sub-buffer.
  • To meet the maximum code block length of 6144 in the LTE mode, the total length of each memory group is 6144. Since the maximum code block length in the TD/W mode is only 5114, in the TD/W mode each sub-buffer has some spare space beyond what is needed to store one longest code block.
  • Accordingly, the systematic bit buffer, the first check bit buffer, and the second check bit buffer are each split into two parts: one memory group of length 5120 and another memory group of length 1024.
  • In the LTE mode, the two memory groups in each buffer are used together to store the corresponding data. In the TD/W mode, since the maximum code block length is 5114, the memory group of length 5120 in each buffer is used to store the corresponding data, while the memory groups of length 1024 in the buffers, taken together, have a total length of 6144; as the interleaved systematic bit code block length is at most 5114, they can be used to store the interleaved systematic bits.
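The buffer split can be sanity-checked with simple arithmetic (the factor of six — three buffers, each with a ping and a pang sub-buffer — is our reading of the "combined total length of 6144", not stated explicitly in the text):

```python
LTE_MAX, TDW_MAX = 6144, 5114   # maximum code block lengths per mode
MAIN, SMALL = 5120, 1024        # the two memory groups per sub-buffer

assert MAIN + SMALL == LTE_MAX  # LTE: both groups together hold one max block
assert MAIN >= TDW_MAX          # TD/W: the 5120 group alone suffices
# assumed: 3 buffers x ping/pang sub-buffers -> six spare 1024 groups,
# whose combined 6144 >= 5114 can hold the interleaved systematic bits
assert 6 * SMALL == 6144 and 6 * SMALL >= TDW_MAX
```
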
  • In Embodiment 4, the structure of the Turbo decoder and the parallel decoding process are identical to those of the second embodiment. Here N is 16, so the parallel MAP unit internally includes 16 MAP sub-units; the parallel MAP unit adopts the Max-Log-MAP algorithm, and each MAP sub-unit is a hardware implementation circuit of the Max-Log-MAP algorithm.
  • The data length to be calculated by each MAP sub-unit is L, where L is the sub-data block length of one bank. The Max-Log-MAP algorithm implemented by a MAP sub-unit includes a forward recursive Alpha calculation process and a reverse recursive Beta calculation process; the MAP calculation results are then obtained from the data at corresponding positions of the Alpha sequence and the Beta sequence.
  • The alpha recursion proceeds in forward order, so the alpha recursion results are output in the order 0, 1, 2, 3, ..., k; the beta recursion results are output in the order k, k-1, k-2, ..., 1, 0, where k represents the data length.
  • The three inputs sb, pb, and ab are fed in beta's reverse order, so the gamma values obtained are naturally in reverse order; the reverse-order gamma values sent to the beta reverse recursion module can therefore be used directly for the beta reverse recursion. At the same time, the reverse-order gamma values are buffered into the gamma buffer, and after the beta reverse recursion is completed, the gamma values are read from the gamma buffer in forward order and sent to the alpha forward recursion module to perform the alpha forward recursion.
  • The reverse-order beta values obtained by the beta recursion are first buffered into the beta buffer; when the alpha forward recursion outputs the forward-order alpha values, the beta values are read from the beta buffer in forward order and sent together with the forward-order alpha values to the eb calculation module, which outputs the forward-order eb values after the eb calculation. The hdb produced by the decision output is also in forward order.
  • In short, the reverse-order beta values are calculated and saved first; then, while the forward-order alpha values are calculated, the beta values are read back in forward order, completing the calculation of eb and hdb.
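The buffering and ordering described above can be sketched as follows (illustrative Python; the "recursions" here are placeholder running sums, NOT the real Max-Log-MAP state-metric updates — only the reverse/forward ordering and the buffer reads are modelled):

```python
def map_subunit_order(gamma_reversed):
    """Ordering sketch of one MAP sub-unit pass, per the text above:
    betas recurse straight off the reverse-order gammas and are buffered;
    the buffered gammas are then replayed in forward order for the alpha
    recursion, and eb comes out in forward order."""
    beta_buffer = []
    beta = 0
    for g in gamma_reversed:            # reverse-order beta recursion
        beta += g
        beta_buffer.append(beta)
    gamma_forward = list(reversed(gamma_reversed))   # replay gamma buffer
    beta_forward = list(reversed(beta_buffer))       # read betas forward
    eb, alpha = [], 0
    for g, b in zip(gamma_forward, beta_forward):    # forward alpha pass
        alpha += g
        eb.append(alpha + b)            # eb/hdb produced in forward order
    return eb
```
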
  • To reduce buffering, the sliding window control mode shown in c) of FIG. 6 is generally adopted. Specifically, a packet of length k is divided into several sliding windows. Each time the beta of one sliding window has been calculated, the calculation of the alpha corresponding to that sliding window begins, together with the calculation of its eb and hdb. While the alpha and eb/hdb of the current sliding window are being calculated, the beta calculation of the next sliding window is started; once the beta of the next sliding window has also been calculated, the calculation of the alpha and eb/hdb of the next sliding window can begin, and so on. In this way, it is not necessary to buffer k gamma values and k beta values for the entire packet; only the gamma and beta values of two sliding windows need to be buffered, thereby reducing the size of the buffers.
  • Figure 6 d) is a schematic diagram of the pipeline of the recursion mode shown in Figure 6 c). Each beta sliding window includes an overlap length, so the length of a beta sliding window is twice the length of an alpha sliding window; after each alpha sliding window is calculated, the calculation of the next sliding window starts only after waiting one sliding window time.
  • Figure 6 e) is a logical structure diagram of the improved MAP sub-unit of Figure 6 a). Each beta sliding window is divided into two parts: an "overlap" part and a "sliding window" part.
  • Figure 6 f) is the pipeline corresponding to Figure 6 e): the first beat calculates the overlap part of the first beta sliding window; the second beat calculates the sliding-window part of the first beta sliding window and the overlap part of the second beta sliding window; the third beat calculates the alpha/eb/hdb of the first sliding window, the sliding-window part of the second beta sliding window, and the overlap part of the third sliding window; and so on. The improved pipeline of Figure 6 f) is shorter and faster than the pipeline of Figure 6 d), improving throughput and pipeline efficiency.
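The beat-by-beat schedule of the improved pipeline can be generated programmatically (illustrative Python; this is our reading of the description of Fig. 6 f), with windows and beats indexed from 0):

```python
def improved_pipeline(num_windows):
    """Schedule of the improved pipeline: on beat i the overlap of beta
    window i, the sliding-window part of beta window i-1, and the
    alpha/eb/hdb of window i-2 all run in parallel, so a window's alpha
    starts only two beats after its beta overlap."""
    schedule = []
    for beat in range(num_windows + 2):
        ops = []
        if beat < num_windows:
            ops.append(f"beta-overlap[{beat}]")
        if 1 <= beat <= num_windows:
            ops.append(f"beta-window[{beat - 1}]")
        if beat >= 2:                    # alpha lags its beta by two beats
            ops.append(f"alpha/eb/hdb[{beat - 2}]")
        schedule.append(ops)
    return schedule
```

The full packet finishes in num_windows + 2 beats, which illustrates why this pipeline is shorter than the one of Fig. 6 d), where each alpha window waits an extra sliding-window time.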

Abstract

Provided are a Turbo decoder and a parallel decoding method. The Turbo decoder includes a first interleaver and a second interleaver. The first interleaver includes a first LTE interleaving module, a first TD/W interleaving module and a first selector. The second interleaver includes a second LTE interleaving module, a second TD/W interleaving module and a second selector. The method realizes multi-mode compatibility of the Turbo decoder, solves the parallel interleaving problem in the TD/W mode, and enables the Turbo decoder in the LTE mode and the Turbo decoder in the TD/W mode to integrate as one, thus reducing the hardware scale.

Description

Parallel decoding method and Turbo decoder

Technical Field
The present invention relates to decoding processing techniques, and in particular to a parallel decoding method and a Turbo decoder.

Background Art
The Turbo code is an efficient channel coding scheme. In essence it is a variant of the convolutional code, characterized by high encoding/decoding complexity and large delay but excellent bit-error performance, making it suitable for transmitting long code blocks with large data volumes where latency requirements are not strict. The great advantage of the Turbo code is that it satisfies well the randomness condition in Shannon's channel coding theory and uses iterative decoding to obtain the coding gain, so it can achieve performance approaching the Shannon limit.
As shown in FIG. 1, the Turbo decoder includes two soft-input soft-output (SISO) decoding units (a first decoding unit 101 and a second decoding unit 103) connected by a first interleaver 102; the output of the second decoding unit 103 is connected to a second interleaver 104 used for deinterleaving, and the output of the second interleaver 104 is connected to the first decoding unit 101 and a hard decision module 105. The Turbo decoder receives three externally supplied data streams, namely the systematic bit sb, the uninterleaved first check bit p0, and the interleaved second check bit p1. The output data of each decoding unit is the external assignment information (ab1, ab2), and the inputs are the a priori information, the systematic bits (sb), and the check bits (p0 or p1).
The decoding process of the Turbo decoder is an iterative process. The external assignment information ab2 output by the second decoding unit 103 is deinterleaved by the second interleaver 104 and used as the a priori information of the first decoding unit 101 to assist decoding; meanwhile, the input of the first decoding unit 101 also includes the systematic bit sb and the first check bit p0. The external assignment information ab1 output by the first decoding unit 101 is interleaved by the first interleaver 102 and used as the a priori information of the second decoding unit 103 to assist decoding; meanwhile, the input of the second decoding unit 103 also includes the interleaved systematic bit sb and the second check bit p1. This is iterated repeatedly until the hard bit data hdb output by the hard decision module 105 satisfies the decoding requirement or the number of iterations reaches a specified value. The first decoding unit 101 and the second decoding unit 103 have identical hardware structures and never decode at the same time, so the two decoding units can be designed as one set of circuitry shared in a time-division multiplexed manner, saving hardware resources. The first decoding unit 101 and the second decoding unit 103 mainly implement the Max-Log-MAP algorithm in the logarithmic domain, in which multiplications and exponentiations are simplified into additions and maximum-taking operations, thereby reducing the computational complexity and facilitating hardware implementation.
Against the background of the rapid development of mobile communication technology, mobile terminal systems place higher requirements on the coexistence of multiple communication modes and on the speed of each mode. Higher requirements are likewise placed on the Turbo decoder, one of the key components of a mobile communication terminal system: on the one hand it must meet the system's data rate, and on the other hand the scale of the hardware circuitry must be reduced as much as possible. However, existing Turbo decoders can support only one mode: either only the Long-Term Evolution (LTE) mode or only the (TD/W, TD-SCDMA/WCDMA/HSPA+) mode. In addition, during parallel decoding it must be guaranteed that the multiple data streams output by the interleaver map one-to-one onto the multiple regions of the buffer; otherwise, in the next iteration, two or more data would need to be read from the same region in the same beat, or two or more output data would need to be written back to the same region in the same beat, causing a memory access conflict. Since the Turbo interleaving algorithm of the TD/W mode cannot avoid memory conflicts, Turbo decoders supporting the TD/W mode have not been able to implement parallel interleaving. In view of this, the following problems need to be solved in the Turbo decoder: 1. realizing multi-mode compatibility of the Turbo decoder; 2. solving the parallel interleaving problem of the Turbo decoder in the TD/W mode.

Summary of the Invention
In view of this, the main object of the present invention is to provide a parallel decoding method and decoder that solve the multi-mode compatibility problem of the Turbo decoder; a further object of the present invention is to solve the parallel interleaving problem of the Turbo decoder in the TD/W mode, so that the multi-mode compatibility of the Turbo decoder is further optimized.
To achieve the above objects, the technical solution of the present invention is realized as follows:
The present invention provides an interleaver, including an LTE interleaving module, a TD/W interleaving module, and a selector, wherein:

the LTE interleaving module is configured to perform LTE-mode interleaving and/or deinterleaving on the input data and the a priori information;

the TD/W interleaving module is configured to perform TD/W-mode interleaving and/or deinterleaving on the input data and the a priori information;

the selector is configured to select and output either the data obtained by the LTE interleaving module or the data obtained by the TD/W interleaving module.
The present invention also provides a Turbo decoder, including a first interleaver and a second interleaver, wherein:

the first interleaver includes a first LTE interleaving module, a first TD/W interleaving module, and a first selector. The first LTE interleaving module is configured to perform LTE-mode interleaving on the input data and the external assignment information obtained in the previous MAP iteration; the first TD/W interleaving module is configured to perform TD/W-mode interleaving on the input data and the external assignment information obtained in the previous MAP iteration; and the first selector is configured to select and output, according to the preset mode, either the interleaved data obtained by the first LTE interleaving module or the interleaved data obtained by the first TD/W interleaving module;

the second interleaver includes a second LTE interleaving module, a second TD/W interleaving module, and a second selector. The second LTE interleaving module is configured to perform LTE-mode deinterleaving on the external assignment information output by the parallel MAP unit; the second TD/W interleaving module is configured to perform TD/W-mode inverse-interleaving and/or deinterleaving on the external assignment information output by the parallel MAP unit; and the second selector is configured to select and output, according to the preset mode, either the data obtained by the second LTE interleaving module or the data obtained by the second TD/W interleaving module.
In the above solution, the Turbo decoder further includes an a priori information buffer, a systematic bit buffer, a check bit buffer, a check bit selector, a parallel MAP unit, and a main control module;

wherein the main control module issues a read command to the a priori information buffer, the systematic bit buffer, and the check bit buffer. After receiving the read command issued by the main control module, the a priori information buffer and the systematic bit buffer output the a priori information ab and the systematic bit sb, respectively, to the first interleaver; the first interleaver interleaves the input data according to the preset mode and outputs the interleaved data, or outputs the input data directly, to the parallel MAP unit;

after receiving the read command issued by the main control module, the check bit buffer outputs the first check bit p0 or the second check bit p1 to the parallel MAP unit through the check bit selector;

the parallel MAP unit performs MAP calculation on the data input to it and outputs the obtained MAP calculation result to the second interleaver; the second interleaver inverse-interleaves and/or deinterleaves the input data according to the preset mode and outputs the processed data, or outputs the input data directly, to the a priori information buffer.
In the above solution, the Turbo decoder further includes a cache device connected between the second TD/W interleaving module and the a priori information buffer, or connected to the a priori information buffer as a component inside the second TD/W interleaving module;

the cache device includes N first-in first-out groups (FIFO groups) and N N-to-1 selectors, with each FIFO group connected to one N-to-1 selector, wherein each FIFO group contains N first-in first-out buffers (FIFOs) and the N inputs of an N-to-1 selector are connected to the outputs of the N FIFOs in one FIFO group; N represents the maximum number of parallel decoding data paths and is a power of 2.
In the above solution, the parallel MAP unit includes N MAP sub-units; the a priori information buffer, the systematic bit buffer, and the parity bit buffer are each evenly divided into N sub-blocks (banks). In the above solution, the MAP sub-unit includes a beta overlap backward recursion module and a beta sliding-window backward recursion module, where the beta overlap backward recursion module computes the overlap portion of a beta sliding window, and the beta sliding-window backward recursion module computes the sliding-window portion of the beta sliding window.
In the above solution, the systematic bit buffer, as well as the first parity bit buffer and the second parity bit buffer within the parity bit buffer, all adopt a ping-pong structure; each buffer contains a ping sub-buffer and a pong sub-buffer, and the total memory-bank length of each sub-buffer is 6144.
In the above solution, the systematic bit buffer, the first parity bit buffer, and the second parity bit buffer each consist of two parts, one part with a memory-bank length of 5120 and the other with a memory-bank length of 1024.
The present invention also provides a parallel decoding method. The method includes: after interleaving the multiple data paths through the first interleaver in the Turbo decoder and performing parallel MAP computation through the parallel MAP unit, buffering the inverse-interleaved and/or deinterleaved data paths, together with the multiple column addresses generated by the second interleaver, in the buffer device of the Turbo decoder according to the multiple row addresses generated by the second interleaver and the order in which the parallel MAP unit outputs the data paths; and then outputting the data paths buffered in the buffer device, together with their corresponding column addresses, to the a priori information buffer.
In the above solution, buffering the inverse-interleaved and/or deinterleaved data paths and the column addresses generated by the second interleaver in the buffer device of the Turbo decoder, according to the row addresses generated by the second interleaver and the order in which the parallel MAP unit outputs the data paths, includes: feeding the data paths output by the parallel MAP unit into the corresponding FIFO groups in the buffer device according to the row addresses generated by the second interleaver; and then, following the order in which the parallel MAP unit outputs the data paths, storing the a priori information paths output by the MAP unit, together with the corresponding column addresses generated by the second interleaver, into the FIFOs of each FIFO group.
In the above solution, outputting the data paths buffered in the buffer device includes: each N-to-1 selector in the buffer device selecting, within its FIFO group, the FIFO holding the most pending data, and outputting that FIFO's buffered data and corresponding column address to the sub-blocks of the a priori information buffer.
In the above solution, performing parallel MAP computation through the parallel MAP unit includes: each MAP sub-unit in the parallel MAP unit performing MAP computation on its input a priori information ab, systematic bits sb, and parity bits pb, with extrinsic information eb and hard bits hdb as outputs; the computation comprises an alpha computation process and a beta computation process;
The alpha computation process and the beta computation process include: dividing each beta sliding window into an "overlap" part and a "sliding window" part of equal length. During computation, the first beat computes the overlap part of the first beta window; the second beat computes the sliding-window part of the first beta window while computing the overlap part of the second beta window; the third beat computes the alpha values of the first window and obtains the extrinsic information eb and hard bits hdb, while computing the sliding-window part of the second window and the overlap part of the third window; and so on, until the last beta and alpha values are obtained.
The Turbo decoder provided by the present invention contains interleavers compatible with both the LTE mode and the TD/W mode, achieving multi-mode compatibility of the Turbo decoder and enabling parallel decoding in the TD/W mode. It solves the parallel interleaving problem in the TD/W mode, allowing the LTE-mode Turbo decoder and the TD/W Turbo decoder to be merged into one and thereby reducing the hardware scale.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic structural diagram of a Turbo decoder in the prior art;
FIG. 2 is a schematic structural diagram of a Turbo decoder according to Embodiment 1 of the present invention;
FIG. 3 is a schematic diagram of the parallel decoding process performed by the Turbo decoder according to Embodiment 2 of the present invention; FIG. 4 is a schematic structural diagram of the buffer device in the Turbo decoder according to Embodiment 2 of the present invention; FIG. 5 is a schematic structural diagram of the systematic bit buffer, the first parity bit buffer, and the second parity bit buffer according to Embodiment 3 of the present invention; FIG. 6 is a schematic diagram of the composition of a MAP sub-unit and its MAP computation process according to Embodiment 4 of the present invention.

DETAILED DESCRIPTION
The basic idea of the present invention is to provide an interleaver and a Turbo decoder compatible with both the LTE mode and the TD/W mode, and to implement parallel decoding in the TD/W mode based on this Turbo decoder.
An interleaver of the present invention includes an LTE interleaving module, a TD/W interleaving module, and a first selector. The LTE interleaving module performs LTE-mode interleaving and/or deinterleaving on the input data and on the extrinsic information obtained in the previous MAP iteration; the TD/W interleaving module performs TD/W-mode interleaving and/or deinterleaving on the input data and on the extrinsic information obtained in the previous MAP iteration; and the selector selects for output the data obtained by the LTE interleaving module or the data obtained by the TD/W interleaving module.
A Turbo decoder of the present invention includes a first interleaver and a second interleaver. The first interleaver includes a first LTE interleaving module, a first TD/W interleaving module, and a first selector; the first LTE interleaving module performs LTE-mode interleaving on the input data and on the extrinsic information obtained in the previous MAP iteration, the first TD/W interleaving module performs TD/W-mode interleaving on the input data, and the first selector selects for output, according to the preset mode, the interleaved data obtained by the first LTE interleaving module or the interleaved data obtained by the first TD/W interleaving module;
The second interleaver includes a second LTE interleaving module, a second TD/W interleaving module, and a second selector. The second LTE interleaving module performs LTE-mode deinterleaving on the extrinsic information output by the parallel MAP unit, and the deinterleaved data is buffered in the a priori information buffer as the a priori information for the next MAP iteration. The second TD/W interleaving module performs TD/W-mode inverse interleaving (at even-numbered iterations, likewise below) and/or deinterleaving on the extrinsic information output by the parallel MAP unit, and the inverse-interleaved and/or deinterleaved data is buffered in the a priori information buffer as the a priori information for the next MAP iteration. The second selector selects for output, according to the preset mode, the data obtained by the second LTE interleaving module or the data obtained by the second TD/W interleaving module. In practice, TD/W-mode inverse interleaving is performed in even-numbered MAP iterations, and TD/W-mode deinterleaving is performed in odd-numbered MAP iterations.
The Turbo decoder further includes an a priori information buffer, a systematic bit buffer, a parity bit buffer, a parity bit selector, a parallel MAP unit, and a main control module;
The main control module issues read commands to the a priori information buffer, the systematic bit buffer, and the parity bit buffer. Upon receiving the read command from the main control module, the a priori information buffer and the systematic bit buffer output the a priori information ab and the systematic bits sb, respectively, to the first interleaver; the first interleaver either interleaves the input data according to the preset mode and outputs the interleaved data, or passes the input data directly to the parallel MAP unit;
After receiving the read command from the main control module, the parity bit buffer outputs either the first parity bits p0 or the second parity bits p1 to the parallel MAP unit through the parity bit selector;
The parallel MAP unit performs MAP computation on its input data and outputs the resulting MAP computation results to the second interleaver; the second interleaver either performs inverse interleaving and/or deinterleaving on the input data according to the preset mode and outputs the inverse-interleaved and/or deinterleaved data to the a priori information buffer, or passes the input data directly to the a priori information buffer.
Here, the inverse interleaving applies only to the TD/W mode; the LTE mode does not require it. In the TD/W mode, when the MAP iteration count is even, the results output by the parallel MAP unit are themselves uninterleaved, natural-order data, so they require inverse interleaving rather than deinterleaving. The purpose of the inverse interleaving is to allow the interleaved data to be read directly from the a priori information buffer with sequential addresses in the following odd-numbered MAP iteration, thereby solving the memory conflict problem of multi-path parallel interleaving in the TD/W mode. The data order after inverse interleaving is, in fact, the order after interleaving.
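A toy illustration of the point above: if the even-numbered iteration scatter-writes its natural-order output at inverse-permuted positions, the next odd-numbered iteration can read the interleaved sequence with plain sequential addresses. The 4-element permutation here is arbitrary and purely for demonstration, not the patent's actual interleaver pattern.

```python
pi = [2, 0, 3, 1]                     # toy interleaver read pattern: step i reads element pi[i]
inv = [0] * len(pi)
for i, p in enumerate(pi):
    inv[p] = i                        # inverse permutation

natural = ["d0", "d1", "d2", "d3"]    # even-iteration MAP output, natural order
buf = [None] * len(pi)
for j, d in enumerate(natural):
    buf[inv[j]] = d                   # inverse-interleaved scatter-write

# The next (odd) iteration reads sequentially yet obtains the interleaved data:
assert buf == [natural[p] for p in pi]
```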
The second TD/W interleaving module further includes a buffer device, connected between the second TD/W interleaving module and the a priori information buffer, or connected to the a priori information buffer as an internal component of the second TD/W interleaving module. The buffer device includes N FIFO groups and N N-to-1 selectors, one FIFO group being connected to each N-to-1 selector; each FIFO group contains N FIFOs, and the N inputs of an N-to-1 selector are connected to the outputs of the N FIFOs in one FIFO group. Here N denotes the maximum number of parallel decoding data paths, and N is a power of 2.
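The structure above can be modeled as follows; this is a behavioral sketch for illustration only (class and method names are ours, not the patent's), with N fixed at 4 as an example.

```python
from collections import deque

# Behavioral model of the buffer device: N FIFO groups of N FIFOs each,
# with one N-to-1 selector per group. N must be a power of 2.
N = 4

class FifoGroup:
    def __init__(self, n):
        self.fifos = [deque() for _ in range(n)]

    def select(self, index):
        """N-to-1 selector: route the chosen FIFO's head entry to the output."""
        return self.fifos[index].popleft()

groups = [FifoGroup(N) for _ in range(N)]

groups[1].fifos[3].append("llr")        # one buffered extrinsic value
assert groups[1].select(3) == "llr"
assert (N & (N - 1)) == 0               # power-of-2 check
```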
Here, the parallel MAP unit includes N MAP sub-units; the a priori information buffer, the systematic bit buffer, and the parity bit buffer are each evenly divided into N sub-blocks (banks).
In particular, the MAP sub-unit may include a beta overlap backward recursion module and a beta sliding-window backward recursion module, where the beta overlap backward recursion module computes the overlap portion of a beta sliding window, and the beta sliding-window backward recursion module computes the sliding-window portion of the beta sliding window.
The systematic bit buffer, as well as the first parity bit buffer and the second parity bit buffer within the parity bit buffer, all adopt a ping-pong structure; each buffer contains a ping sub-buffer and a pong sub-buffer, and the total memory-bank length of each sub-buffer is 6144.
Here, the systematic bit buffer, the first parity bit buffer, and the second parity bit buffer each consist of two parts, one part with a memory-bank length of 5120 and the other with a memory-bank length of 1024.
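A minimal sketch of this two-part split; the helper name is ours. The two parts together cover 6144 entries, which is the maximum Turbo code block size in 3GPP LTE, so a flat sub-buffer address can be mapped to a (part, offset) pair:

```python
# Two-part memory-bank layout of each sub-buffer, as described above.
PART0_LEN, PART1_LEN = 5120, 1024
TOTAL = PART0_LEN + PART1_LEN   # 6144, the LTE maximum code block size

def map_address(addr):
    """Map a flat sub-buffer address to a (part, offset) pair (name is ours)."""
    if not 0 <= addr < TOTAL:
        raise ValueError("address out of range")
    return (0, addr) if addr < PART0_LEN else (1, addr - PART0_LEN)

assert TOTAL == 6144
assert map_address(5119) == (0, 5119)   # last entry of the 5120-deep part
assert map_address(5120) == (1, 0)      # first entry of the 1024-deep part
```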
The present invention also provides a parallel decoding method, which may include: interleaving the multiple data paths through the first interleaver in the Turbo decoder, performing parallel MAP computation through the parallel MAP unit, performing inverse interleaving and/or deinterleaving on the N data outputs of the parallel MAP unit through the second interleaver, and writing the inverse-interleaved and/or deinterleaved data back into the N sub-blocks of the a priori information buffer.
Performing inverse interleaving and/or deinterleaving on the N extrinsic information outputs of the parallel MAP unit through the second interleaver, and writing the inverse-interleaved and/or deinterleaved data back into the N sub-blocks of the a priori information buffer, includes: feeding the extrinsic information paths output by the parallel MAP unit into the corresponding FIFO groups in the buffer device according to the row addresses generated by the second interleaver; and then, following the order in which the MAP unit outputs the extrinsic information paths, storing the a priori information paths output by the MAP unit, together with the corresponding column addresses generated by the second interleaver, into the FIFOs of each FIFO group.
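The enqueue step above can be sketched as follows. This is an assumption-laden toy model (variable names and the 2-path sizing are ours): the row address selects the destination FIFO group, the path index selects the FIFO within that group, and the column address travels alongside the data.

```python
from collections import deque

# Enqueue side of the buffer device: route each path's extrinsic output to
# fifo_groups[row_addr][path], keeping the column address with the data.
N = 2
fifo_groups = [[deque() for _ in range(N)] for _ in range(N)]

# (extrinsic, row_addr, col_addr) produced in one beat, one tuple per path
outputs = [("e0", 1, 5), ("e1", 0, 2)]
for path, (eb, row, col) in enumerate(outputs):
    fifo_groups[row][path].append((eb, col))

assert fifo_groups[1][0][0] == ("e0", 5)   # path 0 landed in group 1
assert fifo_groups[0][1][0] == ("e1", 2)   # path 1 landed in group 0
```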
Outputting the data paths buffered in the buffer device includes: each N-to-1 selector in the buffer device selecting, within its FIFO group, the FIFO holding the most pending data, outputting that FIFO's buffered extrinsic information and corresponding column address, and writing the extrinsic information back to the sub-blocks of the a priori information buffer using the column addresses as write addresses.
Performing parallel MAP computation through the parallel MAP unit may include: each MAP sub-unit in the parallel MAP unit performing MAP computation on its input a priori information ab, systematic bits sb, and parity bits pb, with extrinsic information eb and hard bits hdb as outputs, comprising an alpha computation process and a beta computation process. The alpha computation process and the beta computation process include: dividing each beta sliding window into an "overlap" part and a "sliding window" part of equal length. During computation, the first beat computes the overlap part of the first beta window; the second beat computes the sliding-window part of the first beta window while computing the overlap part of the second beta window; the third beat computes the alpha values of the first window and obtains the extrinsic information eb and hard bits hdb, while computing the sliding-window part of the second window and the overlap part of the third window; and so on, until the last beta and alpha values are obtained.
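The beat-by-beat schedule described above can be written out explicitly. This is an interpretation of the text, not RTL: at beat t, the overlap of window t, the sliding-window recursion of window t-1, and the alpha/extrinsic computation of window t-2 run in parallel.

```python
# Pipeline schedule of the overlap / sliding-window / alpha stages per beat.
def pipeline_schedule(num_windows):
    beats = []
    for t in range(num_windows + 2):        # two extra beats drain the pipeline
        ops = []
        if t < num_windows:
            ops.append(("beta_overlap", t))
        if 1 <= t <= num_windows:
            ops.append(("beta_slide", t - 1))
        if t >= 2:
            ops.append(("alpha_and_extrinsic", t - 2))
        beats.append(ops)
    return beats

sched = pipeline_schedule(3)
assert sched[0] == [("beta_overlap", 0)]    # beat 1 of the text: overlap of window 0
assert sched[2] == [("beta_overlap", 2),    # beat 3: three stages run in parallel
                    ("beta_slide", 1),
                    ("alpha_and_extrinsic", 0)]
```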
Embodiment 1
This embodiment provides an interleaver that supports both the LTE mode and the TD/W mode, together with a Turbo decoder containing this interleaver, so that the Turbo decoder is compatible with both the LTE mode and the TD/W mode.
Specifically, the interleaver provided in this embodiment interleaves and/or deinterleaves the input data according to a preset mode. The interleaver mainly includes an LTE interleaving module, a TD/W interleaving module, and a selector: the LTE interleaving module performs LTE-mode interleaving and/or deinterleaving on the input data; the TD/W interleaving module performs TD/W-mode interleaving and/or deinterleaving on the input data; and the selector selects for output the data obtained by the LTE interleaving module or the data obtained by the TD/W interleaving module.
Only one of the LTE interleaving module and the TD/W interleaving module in the interleaver may be enabled at a time. Specifically, the LTE interleaving module and the TD/W interleaving module can each determine whether to enable its own interleaving function according to the preset mode (Mode).
As shown in FIG. 2, the Turbo decoder provided in this embodiment mainly includes an a priori information buffer, a systematic bit buffer, a parity bit buffer, a parity bit selector, a first interleaver, a parallel MAP unit, a second interleaver, and a main control module.
The first interleaver interleaves the input data according to the preset mode. It includes a first LTE interleaving module, a first TD/W interleaving module, and a first selector: the first LTE interleaving module performs LTE-mode interleaving on the input data, the first TD/W interleaving module performs TD/W-mode interleaving on the input data, and the first selector, according to the preset mode (Mode), selects for output to the parallel MAP unit the interleaved data obtained by the first LTE interleaving module or by the first TD/W interleaving module. If the current Mode is the LTE mode, the first selector outputs the a priori information ab and systematic bits sb from the first LTE interleaving module to the parallel MAP unit; if the current Mode is the TD/W mode, the first selector outputs the a priori information ab and systematic bits sb from the first TD/W interleaving module to the parallel MAP unit.
The parallel MAP unit performs MAP computation on its input data and outputs the resulting MAP computation results to the second interleaver.
The second interleaver performs inverse interleaving and/or deinterleaving on the input data according to the preset mode. It includes a second LTE interleaving module, a second TD/W interleaving module, and a second selector: the second LTE interleaving module performs LTE-mode deinterleaving on the input data, the second TD/W interleaving module performs TD/W-mode deinterleaving and/or inverse interleaving on the input data, and the second selector, according to the preset mode, selects for output to the a priori information buffer the data obtained by the second LTE interleaving module or by the second TD/W interleaving module.
The main control module controls the start and end of the decoding process. The a priori information buffer stores the a priori information ab obtained in the previous MAP iteration; the systematic bit buffer stores the systematic bits sb; and the parity bit buffer includes a first parity bit buffer storing the first parity bits p0 and a second parity bit buffer storing the second parity bits p1. Upon receiving a read command from the main control module, each buffer outputs the data it holds.
Specifically, when decoding with the Turbo decoder proposed in this embodiment, one MAP iteration in the decoding process proceeds as follows:
The main control module issues read commands to the a priori information buffer, the systematic bit buffer, and the parity bit buffer;
Upon receiving the read command from the main control module, the a priori information buffer and the systematic bit buffer output the a priori information ab and the systematic bits sb, respectively, to the first interleaver. In the LTE mode, when the current MAP iteration count is odd, the first interleaver interleaves the input a priori information ab and systematic bits sb according to the preset mode and outputs them to the parallel MAP unit; when the current MAP iteration count is even, the first interleaver outputs the input a priori information ab and systematic bits sb directly to the parallel MAP unit. In the TD/W mode, when the current MAP iteration count is odd, the first interleaver outputs the a priori information ab, which has already been inverse-interleaved beforehand, directly to the parallel MAP unit as interleaved a priori information, and interleaves the input systematic bits sb before outputting them to the parallel MAP unit; when the current MAP iteration count is even, the first interleaver outputs the input a priori information ab and systematic bits sb directly to the parallel MAP unit.
In particular, in the zeroth MAP iteration the a priori information ab is empty, and the first interleaver directly outputs the input systematic bits sb to the parallel MAP unit.
Upon receiving the read command from the main control module, the first parity bit buffer and the second parity bit buffer in the parity bit buffer obtain the current MAP iteration count from the main control module. If the current MAP iteration count is odd, the second parity bits p1 are output to the parallel MAP unit through the parity bit selector; if the current MAP iteration count is even, the first parity bits p0 are output to the parallel MAP unit through the parity bit selector;
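The parity bit selector above reduces to a one-line parity test on the iteration count; a minimal sketch (function name is ours):

```python
# Parity bit selector: odd MAP iterations use p1, even ones use p0.
def select_parity(iteration, p0, p1):
    return p1 if iteration % 2 == 1 else p0

assert select_parity(0, "p0", "p1") == "p0"   # even iteration -> first parity bits
assert select_parity(1, "p0", "p1") == "p1"   # odd iteration -> second parity bits
```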
The parallel MAP unit performs MAP iterative computation on the input a priori information ab, systematic bits sb, and first parity bits p0 or second parity bits p1, and outputs the MAP computation results to the second interleaver.
When the current MAP iteration count is odd, the second interleaver deinterleaves the input MAP extrinsic information to obtain the deinterleaved extrinsic information eb; when the current MAP iteration count is even, in the LTE mode the second interleaver takes the input MAP computation results directly as its extrinsic information eb, while in the TD/W mode the second interleaver inverse-interleaves the input MAP computation results to obtain the extrinsic information eb. The second selector outputs this extrinsic information eb to the a priori information buffer as the a priori information ab required for the next MAP iteration, or converts the sign of the log-likelihood ratio llr corresponding to the extrinsic information eb into the hard bit output hdb of the Turbo decoder for output to an external device; this can be controlled by the main control module. For example, the main control module controls the second interleaver to convert the sign of the log-likelihood ratio llr corresponding to the extrinsic information eb into hard bit data hdb. When it determines that the current MAP iteration count has reached a preset threshold or that the hard bit data hdb passes its check, it controls the Turbo decoder to output the current hard bit data hdb as the decoding result to the external device and ends the current decoding process; when the current MAP iteration count has not reached the preset threshold and the hard bit data hdb fails its check, it controls the second interleaver to output the extrinsic information eb to the a priori information buffer as the a priori information ab required for the next MAP iteration, and the next MAP iteration of the current decoding process continues.
During decoding, the Turbo decoder performs repeated MAP iterations on the externally input systematic bits and parity bits; the iteration count is incremented by 1 at each iteration, until the hard bit data hdb output by the Turbo decoder passes its check or the iteration count reaches the specified value.
In practice, the main control module may count the MAP iterations and end the decoding process when the count reaches a preset threshold. Alternatively, the main control module may check the hard bit output hdb and end the decoding process when the check passes.
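The stopping logic above can be sketched as a small control loop. Both callbacks here are assumptions for illustration: `map_iteration` stands in for one full MAP iteration, and `check_passes` for the hard-bit check (e.g. a CRC, which the text does not specify).

```python
# Iteration control: stop early when the hard bits pass their check,
# otherwise stop when the preset iteration cap is reached.
def run_decoder(map_iteration, check_passes, max_iters):
    for i in range(max_iters):
        hdb = map_iteration(i)
        if check_passes(hdb):
            return hdb, i + 1       # early termination on a correct check
    return hdb, max_iters           # iteration cap reached

# Toy stand-ins: the "check" passes once the stand-in output reaches 2.
hdb, used = run_decoder(lambda i: i, lambda h: h == 2, max_iters=8)
assert (hdb, used) == (2, 3)        # stopped after 3 iterations, not 8
```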
For LTE mode, the first interleaver and the second interleaver described above may be identical in both hardware structure and algorithm; the difference is that the first interleaver implements interleaving while the second interleaver implements de-interleaving. For TD/W mode, the first interleaver only interleaves the input systematic bits sb; it applies no processing to the parity bits pb or the a-priori information ab, and directly outputs the input first parity bits p0 or second parity bits p1, together with the a-priori information ab, to the parallel MAP unit. On even MAP iterations, the second interleaver inverse-interleaves the extrinsic information eb output by the parallel MAP unit and buffers the result in the a-priori information buffer; on odd MAP iterations, the second interleaver de-interleaves the extrinsic information eb output by the parallel MAP unit and buffers the result in the a-priori information buffer.
Embodiment 2
This embodiment provides a parallel decoding method and a corresponding Turbo decoder that resolve the memory-conflict problem of parallel decoding in TD/W mode, thereby enabling parallel decoding in TD/W mode.
In this embodiment, the parallel decoding method mainly comprises: after the multiple data paths undergo interleaving, parallel MAP calculation, and de-interleaving and/or inverse interleaving, the de-interleaved and/or inverse-interleaved data paths and their corresponding column addresses are buffered by a buffer device and output to the multiple sub-blocks of the a-priori information buffer.
Specifically, according to the row addresses generated by the second interleaver, the multiple paths of extrinsic information output by the parallel MAP unit and the multiple column addresses generated by the second interleaver are routed to the corresponding first-in-first-out groups (FIFO groups); then, in the order in which the parallel MAP unit outputs the extrinsic information, each path of extrinsic information and its corresponding column address are buffered in the matching first-in-first-out buffer (FIFO) of that FIFO group.
In practical applications, outputting the buffered data paths comprises: each N-to-1 selector in the buffer device compares the amounts of data pending in its N first-in-first-out buffers (FIFOs), and outputs the data buffered in the FIFO of its FIFO group that holds the most pending data, together with the corresponding column address, to the multiple sub-blocks (Banks) of the a-priori information buffer.
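The selection rule above can be sketched in a few lines; this is a behavioral model only (the patent describes a hardware selector), with `deque` standing in for each FIFO:

```python
from collections import deque

def pop_fullest(fifos):
    """Sketch of the N-to-1 selector: each beat, drain one entry
    from whichever FIFO currently holds the most pending data."""
    idx = max(range(len(fifos)), key=lambda i: len(fifos[i]))
    return fifos[idx].popleft() if fifos[idx] else None

# toy FIFO group: entries are (extrinsic value, column address)
group = [deque([('eb0', 5)]), deque([('eb1', 2), ('eb2', 7)]), deque()]
first = pop_fullest(group)   # FIFO 1 is deepest, so its head is drained
```

Draining the deepest FIFO each beat keeps the queue depths balanced, which is what bounds the required FIFO size.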
Correspondingly, the Turbo decoder provided in this embodiment comprises a buffer device, which includes N first-in-first-out groups (FIFO groups) and N N-to-1 selectors, each FIFO group being connected to one N-to-1 selector. Each FIFO group contains N FIFOs, and the N inputs of one N-to-1 selector are connected to the outputs of the N FIFOs of one FIFO group. Here, N denotes the maximum number of parallel decoding data paths and is a power of 2; in this embodiment the value of N ranges over [1, 16] and may be chosen according to the needs of the actual application.
Specifically, the buffer device may be added to the Turbo decoder provided in Embodiment 1 as part of the second interleaver to implement the parallel decoding of this embodiment. The buffer device may be connected between the second TD/W interleaving module and the a-priori information buffer, or connected to the a-priori information buffer as a component inside the second TD/W interleaving module. Correspondingly, the parallel MAP unit includes N MAP sub-units, namely MAP_0, MAP_1, ..., MAP_N-1 as shown in Fig. 3. The a-priori information buffer, the systematic-bit buffer, and the parity-bit buffer are each evenly divided into N Banks, namely Bank_0, Bank_1, ..., Bank_N-1 as shown in Fig. 3, with all Banks of the same size. Correspondingly, a data code block is divided into N sub-blocks of equal length, and each Bank stores one sub-block. If the length of a data code block is K, the length of each sub-block is L = K/N.
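The even Bank partition described above (L = K/N, one sub-block per Bank) can be illustrated as follows; the helper name is hypothetical:

```python
def split_into_banks(block, n):
    """Sketch of the even Bank partition: a length-K code block is
    split into N sub-blocks of length L = K // N, one per Bank."""
    k = len(block)
    assert k % n == 0, "K is assumed to be a multiple of N here"
    L = k // n
    return [block[i * L:(i + 1) * L] for i in range(n)]

# K = 32, N = 4 -> four Banks, each holding a sub-block of length L = 8
banks = split_into_banks(list(range(32)), 4)
```

With this layout, a global address `a` maps to Bank `a // L` at offset `a % L`, which is the address split the interleave-address units rely on.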
In this embodiment, each interleaving module in each interleaver includes an interleave-address calculation unit and a cross unit. The interleave-address calculation unit performs interleave-address calculation according to the corresponding mode to obtain the row addresses and column addresses used for interleaving or de-interleaving the data; the cross unit cross-orders the data selected by the column addresses obtained by the interleave-address calculation unit, according to the row addresses obtained by the interleave-address calculation unit. For example, the interleave-address calculation unit in the LTE interleaving module performs interleave-address calculation according to the interleaving and/or de-interleaving algorithm of LTE mode, while the first interleave-address calculation unit in the TD/W interleaving module performs interleave-address calculation according to the interleaving and/or de-interleaving algorithm of TD/W mode. In Fig. 3, the first cross unit and the first interleave-address calculation unit together implement the interleaving of the data, and the second cross unit and the second interleave-address calculation unit together implement the de-interleaving of the data.
As shown in Fig. 3, when the Turbo decoder performs parallel decoding in LTE mode, one MAP iteration of the Turbo decoder over the externally input systematic bits and parity bits proceeds as follows:
Each beat, the first interleave-address calculation unit outputs N "column addresses" to the N Banks of the systematic-bit buffer and the a-priori information buffer, serving as the "read addresses" for the data they hold, and simultaneously outputs N "row addresses" to the cross unit. On receiving the read command from the main control module, the systematic-bit buffer and the a-priori information buffer each read N systematic bits and N items of a-priori information from their N Banks according to the N "column addresses", thereby implementing column interleaving, and output the column-interleaved N systematic bits or N items of a-priori information to the cross unit. The cross unit cross-orders the input N systematic bits and N items of a-priori information according to the N "row addresses", thereby implementing row interleaving, and sends the row-interleaved N systematic bits and N items of a-priori information to the N MAP sub-units of the parallel MAP unit.
On receiving the read command from the main control module, each beat the N Banks of the first parity-bit buffer output N first parity bits p0 through the parity-bit selector to the N MAP sub-units of the parallel MAP unit, or the N Banks of the second parity-bit buffer output N second parity bits p1 through the parity-bit selector to the N MAP sub-units of the parallel MAP unit. The N MAP sub-units of the parallel MAP unit each perform a Max-Log-MAP calculation on their input systematic bits, a-priori information, and first parity bit p0 or second parity bit p1, obtaining N MAP calculation results, which are output to the second cross unit. The second cross unit cross-orders the input N MAP calculation results according to the N row addresses output by the second interleave-address calculation unit, thereby implementing row de-interleaving, and, when the current MAP iteration count has not reached the preset threshold or the hard-bit data hdb fails its check, writes the N MAP calculation results back, as its extrinsic information, to the N Banks of the a-priori information buffer according to the N column addresses output by the second interleave-address calculation unit, thereby implementing column de-interleaving. One MAP iteration is now complete, and the data buffered in the N Banks of the a-priori information buffer serves as the a-priori information for the next iteration.
For the N systematic bits and N items of a-priori information: when the current MAP iteration count is even, the systematic bits and a-priori information need not be interleaved; accordingly, the N column addresses output by the first interleave-address calculation unit are obtained by taking the natural-order addresses of the N systematic bits and N items of a-priori information modulo L, and the N row addresses output by the first interleave-address calculation unit match the order of the N MAP sub-units in the parallel MAP unit. When the current MAP iteration count is odd, the systematic bits and a-priori information must be interleaved; accordingly, the N column addresses output by the first interleave-address calculation unit are obtained by taking the interleave addresses of the N systematic bits and N items of a-priori information modulo L, and the N row addresses output by the first interleave-address calculation unit are obtained by taking those interleave addresses divided by L (the integer quotient). Here, the natural-order address of a systematic bit or an item of a-priori information is its storage address; the interleave address is obtained by the first interleave-address calculation unit by interleaving the natural-order address. Likewise, the N column addresses and N row addresses output by the second interleave-address calculation unit are obtained in the same manner and are not described again.
For the N first parity bits p0 or second parity bits p1, the read address is obtained by taking the natural-order address modulo the length of the sub-block stored in each Bank. The N column addresses output by the second interleave-address calculation unit are obtained by taking the natural-order addresses of the N first parity bits p0 or second parity bits p1 modulo L, and the N row addresses output by the second interleave-address calculation unit match the order of the N MAP sub-units in the parallel MAP unit.
In LTE mode, the N column addresses output by the first interleave-address calculation unit are identical; in practical applications, a single column address may therefore represent them.
When the Turbo decoder performs parallel decoding in TD/W mode, one MAP iteration over the externally input systematic bits and parity bits is essentially the same as one MAP iteration in LTE mode, with the following difference: to eliminate memory conflicts, after the second cross unit outputs the N items of extrinsic information, they are input to the buffer device. The buffer device buffers the N items of extrinsic information, together with their corresponding column addresses generated by the second interleave-address calculation unit, according to the row addresses generated by the second interleave-address calculation unit and the order in which the parallel MAP unit outputs the extrinsic information. Thereafter, each beat, according to the amount of data pending in each FIFO of each FIFO group, the FIFO currently holding the most data is selected and its buffered data is written into the N Banks of the a-priori information buffer.
Specifically, as shown in Fig. 4, the buffer device comprises N FIFO groups and N N-to-1 selectors, each FIFO group containing N FIFOs, namely FIFO_0, FIFO_1, ..., FIFO_N-1 as shown in Fig. 4. The N row addresses generated by the second interleaver determine the FIFO group into which the N items of extrinsic information output by the parallel MAP unit and the N column addresses generated by the second interleaver should be stored, while the order in which the parallel MAP unit outputs the N items of extrinsic information determines the first-in-first-out buffer (FIFO) within that group into which each item of extrinsic information and its column address should be stored.
As shown in Fig. 4, the buffer device outputs the buffered extrinsic information and column addresses in the above manner and, using the column addresses as write addresses, writes the extrinsic information into the N Banks of the a-priori information buffer. That is, in TD/W mode the second cross unit inputs the N column addresses generated by the second interleave-address calculation unit, together with the N items of extrinsic information output by the parallel MAP unit, into the corresponding FIFO groups of the buffer device according to the N row addresses generated by the second interleave-address calculation unit; then, according to the order of the N items of extrinsic information output by the parallel MAP unit, each item of extrinsic information and its corresponding column address output by the second interleave-address calculation unit are buffered in the matching FIFO. Thereafter, each beat, each N-to-1 selector in the buffer device selects, among the N FIFOs of the FIFO group it is connected to, the FIFO holding the most pending data, and outputs its extrinsic information to the corresponding Bank of the a-priori information buffer. Modeling statistics for the present invention show that the FIFO depths in each FIFO group converge within a bounded range; for N = 16 the maximum FIFO depth does not exceed 8.
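The write-back path just described can be modeled behaviorally as follows. This is a loose single-beat simulation under stated assumptions (one entry produced per lane, one entry drained per group per beat), not the hardware design:

```python
from collections import deque

def buffer_writeback(outputs, rows, cols, n):
    """Toy model of the conflict-avoidance buffer: in one beat, lane j
    produces extrinsic value outputs[j] destined for Bank rows[j] at
    column cols[j]; it is queued in FIFO j of group rows[j], and each
    group's selector then drains one entry from its deepest FIFO
    toward its Bank (None if the group is empty)."""
    groups = [[deque() for _ in range(n)] for _ in range(n)]
    for j, (eb, r, c) in enumerate(zip(outputs, rows, cols)):
        groups[r][j].append((eb, c))   # row address -> group, lane -> FIFO
    emitted = []
    for g in groups:
        depths = [len(f) for f in g]
        if max(depths) == 0:
            emitted.append(None)
            continue
        f = g[depths.index(max(depths))]
        emitted.append(f.popleft())
    return emitted

# two lanes target Bank 1 in the same beat: the FIFOs absorb the conflict
out = buffer_writeback(['a', 'b', 'c'], [1, 1, 0], [4, 5, 6], 3)
```

The example shows why the buffer removes memory conflicts: lanes 0 and 1 both target Bank 1 in the same beat, yet each Bank still receives at most one write per beat, with the surplus entry held in its FIFO.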
Embodiment 3
In this embodiment, the structure of the Turbo decoder is identical to that of Embodiment 1, except that the systematic-bit buffer, the first parity-bit buffer, and the second parity-bit buffer each adopt a ping-pong structure: each buffer contains one ping sub-buffer and one pang sub-buffer.
To satisfy the maximum code-block length of 6144 in LTE mode, the total length of the memory group of each sub-buffer is 6144. In TD/W mode, however, the maximum code-block length is only 5114, so in TD/W mode each sub-buffer has spare capacity beyond what is needed to store one longest code block. In view of this, the systematic-bit buffer, the first parity-bit buffer, and the second parity-bit buffer are each split into two parts: one memory group of length 5120 and another memory group of length 1024.
As shown in Fig. 5(a), in LTE mode both memory-group parts of each buffer are used to store the corresponding data. As shown in Fig. 5(b), the maximum code-block length in TD/W mode is 5114; the memory group of length 5120 in each buffer stores the corresponding data, while the memory groups of length 1024 from the buffers can be combined and used to store the interleaved systematic bits. The combined memory groups of length 1024 have a total length of 6144, whereas the interleaved systematic-bit code block has a length of 5114; they can therefore hold the interleaved systematic bits.
Embodiment 4

In this embodiment, the structure of the Turbo decoder and the parallel decoding process are identical to those of Embodiment 2. In particular, in this embodiment N is 16, i.e. the parallel MAP unit internally contains 16 MAP sub-units; the parallel MAP unit adopts the Max-Log-MAP algorithm, and each MAP sub-unit is a hardware implementation circuit of the Max-Log-MAP algorithm.
As shown in Fig. 6, the data length to be processed by each MAP sub-unit is L, where L is the sub-block length of one Bank. The Max-Log-MAP algorithm implemented by a MAP sub-unit comprises a forward-recursive Alpha calculation and a backward-recursive Beta calculation; the data at corresponding positions of the Alpha sequence and the Beta sequence are then further processed to obtain the final MAP calculation result.
Fig. 6(a) shows the logical structure of one MAP sub-unit of the parallel MAP module in the related art, with three inputs in total, sb, pb, and ab, and with the extrinsic information eb and the hard bits hdb as outputs.
Fig. 6(b) illustrates the conventional recursion under the logical structure of Fig. 6(a). The beta recursion runs backward: an initial value is assigned to the last beta, and the formula beta(i-1) = f1(beta(i), gamma(i)) then yields the second-to-last beta value, the third-to-last beta value, and so on, until the first beta value (in forward order) is obtained. The alpha recursion runs forward: an initial value is assigned to the first alpha, and the formula alpha(i+1) = f2(alpha(i), gamma(i)) then yields the second alpha value, the third alpha value, and so on, until the last alpha value (in forward order) is obtained.
In summary, the alpha recursion outputs its results in the order 0,1,2,3,...,k, while the beta recursion outputs its results in the order k,k-1,k-2,...,1,0, where k denotes the data length. The gamma buffer and the beta buffer bring the alpha and beta recursion results into the same order; eb is then computed from the formula eb(i) = G(alpha(i), beta(i)), yielding the hard decision hdb or the extrinsic output eb.
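The order reconciliation can be sketched as follows; `G` here is an abstract combining function standing in for the Max-Log-MAP extrinsic computation (the example uses simple addition only to make the ordering visible):

```python
def combine_llr(alphas_fwd, betas_rev, G):
    """Sketch of the order reconciliation: beta values arrive in
    reverse order (k..0) and are buffered; once alpha is produced
    in forward order, the buffered betas are read back in forward
    order and combined position-by-position via eb(i) = G(a_i, b_i)."""
    beta_buf = list(betas_rev)   # beta buffer, filled in reverse order
    betas_fwd = beta_buf[::-1]   # read out in forward order
    return [G(a, b) for a, b in zip(alphas_fwd, betas_fwd)]

# alpha arrives as 0..k, beta as k..0; the buffer realigns them
eb = combine_llr([1, 2, 3], [30, 20, 10], lambda a, b: a + b)
```

The buffer-then-reverse step is exactly the role the beta buffer plays in Fig. 6(b).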
Specifically, the three inputs sb, pb, and ab are fed in the reverse order used by the beta recursion, so the resulting gamma values are naturally in reverse order; these reverse-order gamma values can be used directly by the beta backward-recursion module. At the same time, the reverse-order gamma values are buffered in the gamma buffer, and after the beta backward recursion completes, the gamma values are read out of the gamma buffer in forward order and fed to the alpha forward-recursion module for the alpha forward recursion. Moreover, the reverse-order beta values produced by the beta recursion are first buffered in the beta buffer; when the alpha forward recursion outputs alpha values in forward order, the beta values are read out of the beta buffer in forward order and sent, together with the forward-order alpha values, to the eb calculation module, which performs the eb calculation and outputs forward-order eb values, so the hard-decision output hdb is also in forward order. In short, in Fig. 6(b) the reverse-order beta values are computed and stored first; the forward-order alpha values are then computed while the forward-order beta values are read out, completing the calculation of eb and hdb.
To reduce latency and raise throughput, and also to reduce the memory resources needed to hold Beta values while waiting, the sliding-window control shown in Fig. 6(c) is usually adopted. Specifically, a data packet of length k is divided into several sliding windows; as soon as the beta values of one window have been computed, the alpha computation for that window begins, together with the eb and hdb computation for that window. While the alpha and eb/hdb of the current window are being computed, the beta computation of the next window starts; thus, when the alpha and eb/hdb computation of the current window finishes, the beta computation of the next window has also finished, and the alpha and eb/hdb computation of the next window can begin, and so on. In this way, it is unnecessary to buffer all k gamma values and k beta values of the packet; only the gamma and beta values of two sliding windows need to be buffered, reducing the buffer size.
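The window scheduling just described can be laid out as a simple plan; this is an interpretive sketch of the overlap (window i's alpha/eb pass running concurrently with window i+1's beta pass), not the hardware pipeline of Fig. 6(c), and the final drain step is added here for completeness:

```python
def sliding_window_schedule(k, w):
    """Sketch of the sliding-window schedule: a length-k packet is cut
    into windows of length w; while window i's alpha/eb/hdb pass runs,
    window i+1's beta pass runs, so at most two windows' worth of
    gamma/beta values are resident at any time."""
    windows = [(s, min(s + w, k)) for s in range(0, k, w)]
    schedule = []
    for i, win in enumerate(windows):
        step = {'beta': win}
        if i > 0:
            step['alpha_eb'] = windows[i - 1]  # overlaps the previous beta
        schedule.append(step)
    schedule.append({'alpha_eb': windows[-1]})  # drain the last window
    return schedule

plan = sliding_window_schedule(k=12, w=4)
```

Each schedule step touches at most two windows, which is why two windows of gamma/beta storage suffice.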
Fig. 6(d) is a pipeline diagram of the recursion of Fig. 6(c). Each beta sliding window carries an extra overlap length, so the beta window is twice the length of the alpha window, and after each alpha window completes, one window's worth of time must elapse before the computation of the next alpha window can start.
Fig. 6(e) shows the improved logical structure of the MAP sub-unit relative to Fig. 6(a). Each beta sliding window is divided into an "overlap" part and a "window" part of equal length, and a Beta overlap backward-recursion module and a Beta window backward-recursion module are provided: the beta overlap backward-recursion module computes the overlap part of the beta window, and the beta window backward-recursion module computes the window part of the beta window.
Fig. 6(f) shows the pipeline corresponding to Fig. 6(e). In the first beat, the overlap part of the first beta window is computed; in the second beat, the window part of the first beta window and the overlap part of the second beta window are computed; in the third beat, the alpha/eb/hdb of the first window, the window part of the second beta window, and the overlap part of the third beta window are computed; and so on. Evidently, the improved pipeline of Fig. 6(f) is shorter and faster than that of Fig. 6(d), improving throughput and pipeline efficiency.
The above are only preferred embodiments of the present invention and are not intended to limit the protection scope of the present invention.

Claims

1. An interleaver, characterized in that the interleaver comprises an LTE interleaving module, a TD/W interleaving module, and a selector; wherein

the LTE interleaving module is configured to perform LTE-mode interleaving and/or de-interleaving on the input data and the a-priori information;

the TD/W interleaving module is configured to perform TD/W-mode interleaving and/or de-interleaving on the input data and the a-priori information;

the selector is configured to select and output the data obtained by the LTE interleaving module or the data obtained by the TD/W interleaving module.
2. A Turbo decoder, characterized in that the Turbo decoder comprises a first interleaver and a second interleaver; wherein

the first interleaver comprises a first LTE interleaving module, a first TD/W interleaving module, and a first selector; the first LTE interleaving module is configured to perform LTE-mode interleaving on the input data and the extrinsic information obtained from the previous MAP iteration; the first TD/W interleaving module is configured to perform TD/W-mode interleaving on the input data and the extrinsic information obtained from the previous MAP iteration; and the first selector is configured to select and output, according to a preset mode, the interleaved data obtained by the first LTE interleaving module or the interleaved data obtained by the first TD/W interleaving module;

the second interleaver comprises a second LTE interleaving module, a second TD/W interleaving module, and a second selector; the second LTE interleaving module is configured to perform LTE-mode de-interleaving on the extrinsic information output by the parallel MAP unit; the second TD/W interleaving module is configured to perform TD/W-mode inverse interleaving and/or de-interleaving on the extrinsic information output by the parallel MAP unit; and the second selector is configured to select and output, according to the preset mode, the data obtained by the second LTE interleaving module or the data obtained by the second TD/W interleaving module.
3、 根据权利要求 2所述的 Turbo译码器, 其特征在于, 所述 Turbo译 码器还包括: 先验信息緩沖器、 系统比特緩沖器、 校验比特緩沖器、 校验 比特选择器、 并行 MAP单元、 和主控制模块; 3. The turbo decoder according to claim 2, wherein the turbo decoder further comprises: a prior information buffer, a system bit buffer, a parity bit buffer, and a checksum. a bit selector, a parallel MAP unit, and a main control module;
wherein the main control module issues a read command to the a priori information buffer, the systematic bit buffer, and the parity bit buffer; upon receiving the read command issued by the main control module, the a priori information buffer and the systematic bit buffer output the a priori information ab and the systematic bits sb, respectively, to the first interleaver; the first interleaver interleaves the input data according to the preset mode and outputs the interleaved data, or outputs the input data directly to the parallel MAP unit;
upon receiving the read command issued by the main control module, the parity bit buffer outputs the first parity bits p0 or the second parity bits p1 to the parallel MAP unit through the parity bit selector;
the parallel MAP unit performs MAP computation on the data input to it and outputs the resulting MAP computation results to the second interleaver; the second interleaver performs inverse interleaving and/or de-interleaving on the input data according to the preset mode and outputs the inverse-interleaved and/or de-interleaved data, or outputs the input data directly to the a priori information buffer.
4. The turbo decoder according to claim 3, wherein the turbo decoder further comprises: a buffer device connected between the second TD/W interleaving module and the a priori information buffer, or connected to the a priori information buffer as a component of the second TD/W interleaving module; the buffer device comprises N first-in first-out groups (FIFO groups) and N N-to-1 selectors, each FIFO group being connected to one N-to-1 selector, wherein each FIFO group contains N first-in first-out buffers (FIFOs), and the N inputs of each N-to-1 selector are connected to the outputs of the N FIFOs in one FIFO group; wherein N denotes the maximum number of parallel decoding data paths, and N is a power of 2.
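The buffer device of this claim is an N×N grid of FIFOs. A minimal structural sketch, assuming a Python `deque` per FIFO (the function name and representation are illustrative, not from the patent):

```python
from collections import deque

def make_buffer_device(n):
    """Build the buffer device of claim 4: N FIFO groups, each holding N
    FIFOs, with one N-to-1 selector per group reading that group's FIFOs.
    N is the maximum number of parallel decoding paths and must be a
    power of two."""
    assert n > 0 and (n & (n - 1)) == 0, "N must be a power of 2"
    # fifo_groups[row][path] buffers (extrinsic value, column address) pairs
    return [[deque() for _ in range(n)] for _ in range(n)]
```

Claims 10 and 11 below describe, respectively, how this grid is written (routed by row address) and read (fullest FIFO first).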
5. The turbo decoder according to claim 4, wherein:

the parallel MAP unit comprises N MAP subunits;

the a priori information buffer, the systematic bit buffer, and the parity bit buffer are each evenly divided into N sub-blocks (banks).
6. The turbo decoder according to claim 5, wherein:

the MAP subunit comprises a beta overlap backward recursion module and a beta sliding-window backward recursion module, wherein the beta overlap backward recursion module is configured to compute the overlap portion of a beta sliding window, and the beta sliding-window backward recursion module is configured to compute the sliding-window portion of the beta sliding window.
7. The turbo decoder according to any one of claims 3 to 6, wherein the systematic bit buffer, as well as the first parity bit buffer and the second parity bit buffer of the parity bit buffer, each adopt a ping-pong structure, each buffer containing a ping sub-buffer and a pang sub-buffer; the total memory bank length of each sub-buffer is 6144.
8. The turbo decoder according to claim 7, wherein the systematic bit buffer, the first parity bit buffer, and the second parity bit buffer each comprise two parts, one part having a memory bank length of 5120 and the other part having a memory bank length of 1024.
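The ping-pong arrangement of claims 7 and 8 can be sketched as follows: while one sub-buffer receives the next code block, the decoder reads the other, and the two swap roles between blocks. The bank lengths (5120 + 1024 = 6144) come from the claims; the class name and access logic are illustrative assumptions.

```python
class PingPongBuffer:
    """Ping-pong buffer sketch: one sub-buffer is written with the next code
    block while the other is read by the decoder; swap() exchanges roles.
    Each sub-buffer is split into two memory banks of lengths 5120 and 1024
    (total 6144) per claims 7-8. The access logic here is an assumption."""

    BANK_LENGTHS = (5120, 1024)  # total 6144 per sub-buffer

    def __init__(self):
        # sub[0] is the ping sub-buffer, sub[1] the pang sub-buffer
        self.sub = [[0] * sum(self.BANK_LENGTHS) for _ in range(2)]
        self.write_sel = 0  # index of the sub-buffer currently being written

    def swap(self):
        # After a block is fully written, ping and pang exchange roles.
        self.write_sel ^= 1

    def write(self, addr, value):
        self.sub[self.write_sel][addr] = value

    def read(self, addr):
        # Reads always come from the sub-buffer not currently being written.
        return self.sub[self.write_sel ^ 1][addr]
```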
9. A parallel decoding method, the method comprising:
interleaving multiple data paths through the first interleaver in a turbo decoder and performing parallel MAP computation through the parallel MAP unit; then, according to the multiple row addresses generated by the second interleaver and the order in which the parallel MAP unit outputs the multipath data, buffering the inverse-interleaved and/or de-interleaved multipath data and the multiple column addresses generated by the second interleaver into the buffer device of the turbo decoder; and then outputting the multipath data buffered in the buffer device and the corresponding column addresses to the a priori information buffer.
10. The parallel decoding method according to claim 9, wherein buffering the inverse-interleaved and/or de-interleaved multipath data and the multiple column addresses generated by the second interleaver into the buffer device of the turbo decoder, according to the multiple row addresses generated by the second interleaver and the order in which the parallel MAP unit outputs the multipath data, comprises:
inputting the multipath data output by the parallel MAP unit into the corresponding FIFO groups in the buffer device according to the row addresses generated by the second interleaver, and then, in the order in which the parallel MAP unit outputs the multipath data, storing the multiple paths of a priori information output by the MAP unit, together with the corresponding column addresses generated by the second interleaver, into the FIFO buffers in each FIFO group.
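The write step of this claim can be sketched as one beat of routing: each of the N parallel outputs is steered to the FIFO group named by its row address and queued with its column address, with the FIFO index within the group preserving which path produced it. The function name and argument layout are illustrative assumptions.

```python
from collections import deque

def write_beat(fifo_groups, row_addrs, col_addrs, map_outputs):
    """One write beat of the buffer device: path i of the parallel MAP unit
    produced map_outputs[i], and the second interleaver produced the pair
    (row_addrs[i], col_addrs[i]) for it. Each output goes to the FIFO group
    selected by its row address and is stored, with its column address, in
    the FIFO reserved for path i."""
    for path, (row, col, value) in enumerate(
            zip(row_addrs, col_addrs, map_outputs)):
        fifo_groups[row][path].append((value, col))
```

Because each path writes only its own FIFO within a group, simultaneous writes from all N paths never collide, even when several paths target the same row.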
11. The parallel decoding method according to claim 9, wherein outputting the multipath data buffered in the buffer device comprises:
each N-to-1 selector in the buffer device selecting and outputting the data buffered in the FIFO buffer holding the most pending data within its FIFO group, together with the corresponding column address, to the corresponding sub-blocks of the a priori information buffer.
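The read step of this claim — drain the fullest FIFO first, so no single FIFO backs up — can be sketched per selector as follows; the function name is an illustrative assumption.

```python
from collections import deque

def read_beat(fifo_group):
    """One read beat of a single N-to-1 selector: pop from the FIFO in this
    group that currently holds the most pending entries, returning the
    (extrinsic value, column address) pair destined for the corresponding
    sub-block (bank) of the a priori information buffer. Returns None when
    the whole group is empty."""
    fullest = max(fifo_group, key=len)
    return fullest.popleft() if fullest else None
```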
12. The parallel decoding method according to any one of claims 9 to 11, wherein performing parallel MAP computation through the parallel MAP unit comprises:

each MAP subunit in the parallel MAP unit performing MAP computation on the a priori information ab, the systematic bits sb, and the parity bits pb input to it, and outputting the extrinsic information eb and the hard bits hdb, the computation comprising an alpha computation process and a beta computation process;
wherein the alpha computation process and the beta computation process comprise: dividing each beta sliding window into an "overlap" part and a "sliding window" part, the overlap and the sliding window being set to equal lengths; during computation, in the first beat, computing the overlap portion of the first beta sliding window; in the second beat, computing the sliding-window portion of the first beta sliding window while computing the overlap portion of the second beta sliding window; in the third beat, computing the alpha values of the first sliding window and obtaining the extrinsic information eb and the hard bits hdb, while computing the beta sliding-window portion of the second sliding window and the overlap portion of the third sliding window; and so on, until the last beta value and alpha value are obtained.
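The beat schedule in this claim is a three-stage pipeline: in beat t, the overlap recursion of window t, the beta sliding-window recursion of window t-1, and the alpha recursion (with extrinsic/hard-bit output) of window t-2 all run concurrently. The sketch below only enumerates the schedule, not the recursions themselves; the task names are illustrative.

```python
def pipeline_schedule(num_windows):
    """List, beat by beat (1-based), which tasks run concurrently under the
    schedule of claim 12: beta_overlap of window w runs at beat w, the
    beta_window recursion of window w at beat w+1, and the alpha recursion
    (producing eb and hdb) of window w at beat w+2."""
    beats = []
    for beat in range(1, num_windows + 3):
        tasks = []
        if beat <= num_windows:
            tasks.append(("beta_overlap", beat))
        if 1 <= beat - 1 <= num_windows:
            tasks.append(("beta_window", beat - 1))
        if 1 <= beat - 2 <= num_windows:
            tasks.append(("alpha_and_output", beat - 2))
        beats.append(tasks)
    return beats
```

With W windows the pipeline finishes in W+2 beats, at the cost of the two extra recursion modules of claim 6 running in parallel.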
PCT/CN2011/082028 2011-11-10 2011-11-10 Parallel decoding method and turbo decoder WO2013067697A1 (en)

Publications (1)

Publication Number: WO2013067697A1
Publication Date: 2013-05-16

Citations (4)

* Cited by examiner, † Cited by third party

Publication number Priority date Publication date Assignee Title
CN1639986A * 2001-11-19 2005-07-13 NEC Corporation Interleaving order generator, interleaver, Turbo encoder, and Turbo decoder
CN101616115A * 2008-06-27 2009-12-30 Huawei Technologies Co., Ltd. Link adaptation method, data transmission device and communication system
CN101764667A * 2008-12-26 2010-06-30 Fujitsu Limited Wireless transmitting apparatus and method and wireless receiving apparatus and method
CN102130696A * 2010-01-14 2011-07-20 MediaTek Inc. Interleaving/de-interleaving method, soft-in/soft-out decoding method and error correction code encoder and decoder utilizing the same

Legal Events

Code Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (ref document number: 11875430; country of ref document: EP; kind code: A1)
NENP Non-entry into the national phase (ref country code: DE)
122 EP: PCT application non-entry in European phase (ref document number: 11875430; country of ref document: EP; kind code: A1)